Data Scientist Resume
PROFESSIONAL SUMMARY:
- 7+ years of work experience designing, building, and implementing analytical and enterprise applications using machine learning, Python, R, Scala, and Java.
- 5+ years of experience focused on Big Data, Deep Learning, Machine Learning, Image Processing, and AI.
- Extensive experience implementing and managing end-to-end data science products.
- Good experience in periodic model validation and optimization workflows for the data science products developed.
- Collaborated with engineers to deploy successful models and algorithms into production environments.
- Good understanding of model validation processes and optimizations.
- An excellent understanding of both traditional statistical modeling and machine learning techniques and algorithms such as regression, clustering, ensembles (random forest, gradient boosting), and deep learning (neural networks).
- Proficient in understanding and analyzing business requirements, building predictive models, designing experiments, testing hypotheses, and interpreting statistical results into actionable insights and recommendations.
- Fluent in Python with working knowledge of ML and statistical libraries (e.g., Scikit-learn, Pandas).
- Experience in processing real-time data and building end-to-end ML pipelines.
- Very strong in Python, statistical analysis, tools, and modeling.
- Very good hands-on experience working with large data sets and deep learning algorithms using Apache Spark and TensorFlow.
- Good knowledge of recurrent neural networks, LSTM networks and word2vec.
- Good experience in refining and improving image recognition pipelines.
- Deep interest in learning both the theoretical and practical aspects of working with and deriving insights from data.
- Good experience in extracting and analyzing very large volumes of data covering a wide range of information, from user profiles to transaction history, using machine learning tools.
- Built state-of-the-art statistical procedures, algorithms and models to solve a range of problems in diverse domains.
- Proficient in writing code in major programming languages such as Python, R, Java, and Scala.
- Expertise in deep neural network topologies such as convolutional nets and recurrent nets.
- Good experience with deep learning frameworks like Caffe and TensorFlow.
- Experience using Deep Learning to solve problems in Image or Video analysis.
- Good understanding of Apache Spark features and its advantages over MapReduce and traditional systems.
- Very good hands-on experience in Spark Core, Spark SQL, Spark Streaming, and Spark machine learning using the Scala and Python programming languages.
- Solid understanding of RDD operations in Apache Spark, i.e., transformations and actions, persistence (caching), accumulators, and broadcast variables.
- In-depth understanding of Apache Spark job execution components such as the DAG, lineage graph, DAG scheduler, task scheduler, stages, and tasks.
- Good understanding of the driver, executors, and the Spark web UI.
- Developed highly scalable classifiers and tools by leveraging machine learning, Apache Spark, and deep learning.
- Experience in submitting Apache Spark and MapReduce jobs to YARN.
- Able to work effectively in a fast-paced, changing environment.
- Highly organized and detail oriented, with a strong ability to coordinate and track multiple deliverables, tasks and dependencies.
- Proficient in SQL and experienced in working with databases.
- Experience in exposing Apache Spark as web services.
- Worked under the direction of the CSO to develop an efficient solution to a predictive analytics problem, testing a number of potential machine learning algorithms in Apache Spark.
- Experience in real time processing using Apache Spark and Kafka.
- Good working experience with NoSQL databases such as Cassandra and MongoDB.
- Delivered multiple end-to-end Big Data analytics solutions on distributed systems such as Apache Spark.
- Experience leveraging DevOps techniques and practices such as Continuous Integration, Continuous Deployment, Test Automation, and Build Automation.
- Hands on experience leading delivery through Agile methodologies
- Experience in managing code on GitHub.
- Very good knowledge of YARN (Hadoop 2.x) terminology and high-availability Hadoop clusters.
- Experience in analyzing log files for Hadoop and ecosystem services and identifying root causes.
- Proficient in Java, with a good knowledge of its ecosystems.
- Good hands-on experience with the Spring and Hibernate frameworks.
- Solid understanding of object-oriented programming.
- Familiarity with the concepts of MVC, JDBC, and RESTful web services.
- Familiarity with build tools such as Maven and SBT.
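The ensemble techniques named in the summary (random forest, gradient boosting) can be sketched with Scikit-learn; this is a minimal illustration on a synthetic data set, not code from any of the projects above.

```python
# Illustrative only: compare two ensemble classifiers from the summary
# (random forest, gradient boosting) on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic classification data set stands in for real project data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

for model in (RandomForestClassifier(n_estimators=100, random_state=0),
              GradientBoostingClassifier(random_state=0)):
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{type(model).__name__}: {acc:.2f}")
```

Both models share the same fit/predict interface, which is what makes swapping ensemble methods in an experiment loop straightforward.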
TECHNICAL SKILLS
Languages: Python, R, Scala and Java
Machine Learning Libraries: Spark ML, Spark MLlib, Scikit-learn, NLTK, and Stanford NLP
Deep Learning Framework: TensorFlow
Big Data Frameworks: Apache Spark, Apache Hadoop, Kafka, MongoDB, and Cassandra
Machine Learning: Linear Regression, Logistic Regression, Naive Bayes, SVM, Decision Trees, Random Forest, Boosting, k-means, Bagging, etc.
Big Data Distributions: Cloudera and Amazon EMR
Web Technologies: Flask, Django, and Spring MVC
Front End Technologies: JSP, HTML5, Ajax, jQuery, and XML
Servers: Apache2, Nginx, WebSphere, and Tomcat
Visualization Tools: Apache Zeppelin, Matplotlib, and Tableau
Databases: Oracle, MySQL, and PostgreSQL
NoSQL: MongoDB and Cassandra
Operating Systems: Linux and Windows
Scheduling Tools: Airflow and Oozie
PROFESSIONAL EXPERIENCE:
Confidential
Data Scientist
Responsibilities:
- Performed data exploration, data visualization, and feature selection using Python and Apache Spark.
- Scaled Scikit-learn machine learning algorithms using Apache Spark.
- Used techniques such as Fast Fourier Transforms, convolutional neural networks, and deep learning.
- Developed deep convolutional and recurrent neural networks with TensorFlow; significant risk management and quantitative finance experience.
- Used Python, Convolutional Neural Networks (CNN), Deep Belief Networks (DBN), Theano, Caffe, etc.
- Applied unsupervised and supervised learning methods to analyze high-dimensional data; proficient use of the Python scikit-learn, pandas, and NumPy packages.
- Performed data modeling operations using Power BI, Pandas, and SQL.
- Utilized the Python libraries wxPython, NumPy, Twisted, and Matplotlib.
- Used Python libraries such as Beautiful Soup and Matplotlib.
- Developed and implemented predictive models of user behavior data on websites, URL categorization, social network analysis, social mining, and search content based on large-scale machine learning.
- Wrote scripts in Python using Apache Spark and the Elasticsearch engine for use in creating dashboards visualized in Grafana.
- Converted pandas DataFrame data sets to Apache Spark DataFrames.
- Used multiple machine learning algorithms, including random forest, boosted trees, SVM, SGD, neural networks, and deep learning using TensorFlow.
- Collaborated with engineers to deploy successful models and algorithms into production environments.
- Collaborated with a diverse team that included statisticians, the Chief Science Officer, and engineers to build data science project pipelines and algorithms to derive valuable insights from current and new data sets.
- Used PySpark DataFrames to read text, CSV, and image data from HDFS, S3, and Hive.
- Cleaned input text data using the PySpark ML feature extraction API.
- Created features to train algorithms.
- Used various algorithms from the PySpark ML API.
- Trained models using historical data stored in HDFS and Amazon S3.
- Used Spark Streaming to load the trained model and predict on real-time data from Kafka.
- Stored the results in MongoDB.
- Enabled the web application to pick up the data stored in MongoDB.
- Used Apache Zeppelin for Big Data visualization.
- Fully automated job scheduling, monitoring, and cluster management without human intervention using Airflow.
- Exposed Apache Spark as a web service using Flask; worked with input file formats such as ORC, Parquet, JSON, and Avro.
- Developed highly scalable classifiers and tools by leveraging machine learning, Apache Spark, and deep learning.
Environment: Machine learning, Scikit-learn, Pandas, Spark Core, Spark SQL, Spark Streaming, Python, Airflow, Amazon EMR, EC2, S3, NumPy, Matplotlib, TensorFlow, Kafka, Flask, MongoDB, Hive, HDFS, GitHub, and REST.
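The train/persist/predict flow described in this role (train on stored historical data, persist the model, then score new records) can be sketched in miniature; here Scikit-learn stands in for the Spark ML pipeline, an in-memory list stands in for the HDFS/S3/Kafka inputs, and all names and data are invented for illustration.

```python
# Simplified stand-in for the pipeline above: feature extraction + model
# training, persistence, and scoring of a "real-time" record.
import pickle
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy "historical data" (stored in HDFS/S3 in the real pipeline).
texts = ["spark streaming job failed", "model deployed to production",
         "kafka consumer lag detected", "dashboard updated successfully"]
labels = [1, 0, 1, 0]  # 1 = operational alert, 0 = routine event

# Feature extraction chained with a classifier, analogous to the
# PySpark ML feature-extraction API plus an ML algorithm.
pipeline = Pipeline([("tfidf", TfidfVectorizer()),
                     ("clf", LogisticRegression())])
pipeline.fit(texts, labels)

# Persist the trained model so a streaming job could load it later.
blob = pickle.dumps(pipeline)
model = pickle.loads(blob)

# Score a new incoming record, as Spark Streaming would from Kafka.
pred = model.predict(["spark executor failed"])[0]
```

In the real system the persisted model would be loaded inside the streaming job and the prediction written to MongoDB rather than returned in-process.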
Confidential
Data Scientist
Responsibilities:
- Responsible for applying machine learning techniques (regression/classification) to predict outcomes.
- Responsible for the design and development of advanced R/Python programs to prepare, transform, and harmonize data sets in preparation for modeling.
- Designed and automated the process of score cuts that achieved increased close and good rates using advanced R programming.
- Utilized convolutional neural networks to implement a machine learning image recognition component.
- Managed data sets using pandas DataFrames and MySQL; queried the MySQL relational database (RDBMS) from Python using the MySQLdb connector package to retrieve information.
- Utilized standard Python modules such as csv, itertools and pickle for development.
- Tech stack is Python 2.7/PyCharm/Anaconda/pandas/numpy/unittest/R/Oracle.
- Developed large data sets from structured and unstructured data; performed data mining.
- Partnered with modelers to develop data frame requirements for projects.
- Performed ad-hoc reporting, customer profiling, and segmentation using R/Python.
- Tracked various campaigns, generating customer profiling analyses and performing data manipulation.
- Provided Python programming, with detailed direction, in the execution of data analysis that contributed to the final project deliverables; responsible for data mining.
- Analyzed large data sets to answer business questions by generating reports and outcomes.
- Worked in a team of programmers and data analysts to develop insightful deliverables that support data-driven marketing strategies.
- Executed SQL queries from R/Python on complex table configurations.
- Retrieved data from databases through SQL as per business requirements.
- Created, maintained, modified, and optimized SQL Server databases.
- Manipulated data using Python programming.
- Adhered to best practices for project support and documentation.
- Understood the business problem, built hypotheses, and validated them using the data.
- Managed reporting/dashboarding for the key metrics of the business.
- Performed data analysis using different analytic and modeling techniques.
Environment: Python, Oracle, scikit-learn, Pandas, NumPy, SciPy, NLTK, Jupyter Notebook, R, and RStudio.
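The SQL-to-pandas profiling and segmentation workflow described in this role can be sketched briefly; sqlite3 stands in here for the MySQL/Oracle databases, and the table, columns, and segment cut-offs are invented for the example.

```python
# Illustrative sketch: retrieve data via SQL, then profile and segment
# customers in pandas, as described in the responsibilities above.
import sqlite3
import pandas as pd

# In-memory database stands in for the production RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 120.0), (1, 80.0), (2, 30.0), (3, 500.0), (3, 250.0)])

# Pull the raw records into a DataFrame via a SQL query.
df = pd.read_sql_query("SELECT customer_id, amount FROM orders", conn)

# Profile each customer (order count, total spend) and bin into segments.
profile = df.groupby("customer_id")["amount"].agg(["count", "sum"])
profile["segment"] = pd.cut(profile["sum"],
                            bins=[0, 100, 300, float("inf")],
                            labels=["low", "mid", "high"])
print(profile)
```

The groupby/agg plus `pd.cut` pattern is a common way to turn transaction history into the customer segments used for campaign tracking.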
Confidential
Analyst
Responsibilities:
- Developed end to end enterprise Applications using Spring MVC, REST and JDBC Template Modules.
- Wrote well-designed, testable, and efficient Java code.
- Understanding and analyzing complex issues and addressing challenges arising during the software development process, both conceptually and technically.
- Implemented best practices of Automated Build, Test and Deployment.
- Developed design patterns, data structures, and algorithms based on project needs.
- Worked on multiple tools such as Toad, Eclipse, SVN, Apache and Tomcat.
- Deployed models via APIs into applications or workflows
- Worked on User Interface technologies like HTML5, CSS/SCSS.
- Wrote Stored procedure and SQL queries based on project need.
- Deployed built jar into application server.
- Created Automated Unit Tests using Flexible/Open Source Frameworks
- Developed multi-threaded and transaction-handling code (JMS, database).
Environment: Java, Spring MVC, Hibernate, JMS, HTML5, CSS/SCSS, JUnit, Eclipse, Tomcat, and Oracle.
