Data Scientist Resume
PROFESSIONAL SUMMARY:
- 7+ years of work experience designing, building, and implementing analytical and enterprise applications using machine learning, Python, R, Scala, and Java.
- 5+ years of experience focused on Big Data, Deep Learning, Machine Learning, Image Processing, and AI.
- Extensive experience implementing and managing end-to-end data science products.
- Good experience in periodic model validation and optimization workflows for the data science products developed.
- Collaborated with engineers to deploy successful models and algorithms into production environments.
- Good understanding of model validation processes and optimizations.
- An excellent understanding of both traditional statistical modeling and machine learning techniques and algorithms such as regression, clustering, ensembles (random forest, gradient boosting), and deep learning (neural networks).
- Proficient in understanding and analyzing business requirements, building predictive models, designing experiments, testing hypotheses, and interpreting statistical results into actionable insights and recommendations.
- Fluent in Python with working knowledge of ML and statistical libraries (e.g., Scikit-learn, Pandas).
- Experience in processing real-time data and building end-to-end ML pipelines.
- Very strong in Python, statistical analysis, tools, and modeling.
- Very good hands-on experience working with large data sets and deep learning algorithms using Apache Spark and TensorFlow.
- Good knowledge of recurrent neural networks, LSTM networks and word2vec.
- Good experience in refining and improving image recognition pipelines.
- Deep interest in learning both the theoretical and practical aspects of working with and deriving insights from data.
- Good experience in extracting and analyzing very large volumes of data covering a wide range of information, from user profiles to transaction history, using machine learning tools.
- Built state-of-the-art statistical procedures, algorithms and models to solve a range of problems in diverse domains.
- Proficient in writing code in major programming languages such as Python, R, Java, and Scala.
- Expertise in deep neural network topologies such as convolutional nets and recurrent nets.
- Good experience with deep learning frameworks like Caffe and TensorFlow.
- Experience using Deep Learning to solve problems in Image or Video analysis.
- Good understanding of Apache Spark features and its advantages over MapReduce and traditional systems.
- Very good hands-on experience in Spark Core, Spark SQL, Spark Streaming, and Spark machine learning using the Scala and Python programming languages.
- Solid understanding of RDD operations in Apache Spark, i.e., transformations and actions, persistence (caching), accumulators, and broadcast variables.
- In-depth understanding of Apache Spark job execution components such as the DAG, lineage graph, DAG scheduler, task scheduler, stages, and tasks.
- Good understanding of the driver, executors, and the Spark web UI.
- Developed highly scalable classifiers and tools by leveraging machine learning, Apache Spark, and deep learning.
- Experience in submitting Apache Spark and MapReduce jobs to YARN.
- Able to work effectively in a fast-paced, changing environment.
- Highly organized and detail oriented, with a strong ability to coordinate and track multiple deliverables, tasks and dependencies.
- Proficient in SQL and experienced in working with databases.
- Experience in exposing Apache Spark as web services.
- Worked under the direction of the CSO to develop an efficient solution to a predictive analytics problem, testing a number of potential machine learning algorithms in Apache Spark.
- Experience in real time processing using Apache Spark and Kafka.
- Good working experience with NoSQL databases such as Cassandra and MongoDB.
- Delivered multiple end-to-end Big Data analytics solutions on distributed systems such as Apache Spark.
- Experience leveraging DevOps techniques and practices such as Continuous Integration, Continuous Deployment, Test Automation, and Build Automation.
- Hands on experience leading delivery through Agile methodologies
- Experience in managing code on GitHub.
- Very good knowledge of YARN (Hadoop 2.x) terminology and high-availability Hadoop clusters.
- Experience in analyzing log files for Hadoop and ecosystem services and identifying root causes.
- Proficient in Java, with a good knowledge of its ecosystems.
- Good hands-on experience with the Spring and Hibernate frameworks.
- Solid understanding of object-oriented programming.
- Familiarity with the concepts of MVC, JDBC, and RESTful web services.
- Familiarity with build tools such as Maven and SBT.
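The ensemble techniques named in the summary (random forest, gradient boosting) can be sketched with Scikit-learn; this is a minimal illustration on a synthetic data set, not code from any of the projects above.

```python
# Illustrative only: compare two ensemble classifiers from the summary
# (random forest, gradient boosting) on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic classification data set stands in for real project data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

for model in (RandomForestClassifier(n_estimators=100, random_state=0),
              GradientBoostingClassifier(random_state=0)):
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{type(model).__name__}: {acc:.2f}")
```

Both models share the same fit/predict interface, which is what makes swapping ensemble methods in an experiment loop straightforward.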
TECHNICAL SKILLS
Languages: Python, R, Scala and Java
Machine Learning Libraries: Spark ML, Spark MLlib, Scikit-learn, NLTK, and Stanford NLP
Deep Learning Framework: TensorFlow
Big Data Frameworks: Apache Spark, Apache Hadoop, Kafka, MongoDB, and Cassandra
Machine Learning: Linear Regression, Logistic Regression, Naive Bayes, SVM, Decision Trees, Random Forest, Boosting, k-means, Bagging, etc.
Big Data Distributions: Cloudera and Amazon EMR
Web Technologies: Flask, Django, and Spring MVC
Front End Technologies: JSP, HTML5, Ajax, jQuery, and XML
Servers: Apache2, Nginx, WebSphere, and Tomcat
Visualization Tools: Apache Zeppelin, Matplotlib, and Tableau
Databases: Oracle, MySQL, and PostgreSQL
NoSQL: MongoDB and Cassandra
Operating Systems: Linux and Windows
Scheduling Tools: Airflow and Oozie
PROFESSIONAL EXPERIENCE:
Confidential
Data Scientist
Responsibilities:
- Performed data exploration, data visualization, and feature selection using Python and Apache Spark.
- Scaled Scikit-learn machine learning algorithms using Apache Spark.
- Used techniques such as Fast Fourier Transforms, convolutional neural networks, and deep learning.
- Developed deep convolutional and recurrent neural networks with TensorFlow; significant risk management and quantitative finance experience.
- Used Python, Convolutional Neural Networks (CNN), Deep Belief Networks (DBN), Theano, Caffe, etc.
- Applied unsupervised and supervised learning methods to analyze high-dimensional data; proficient use of the Python scikit-learn, pandas, and NumPy packages.
- Performed data modeling operations using Power BI, Pandas, and SQL.
- Utilized the Python libraries wxPython, NumPy, Twisted, and Matplotlib.
- Used Python libraries such as Beautiful Soup and Matplotlib.
- Developed and implemented predictive models of user behavior data on websites, URL categorization, social network analysis, social mining, and search content based on large-scale machine learning.
- Wrote scripts in Python using Apache Spark and the Elasticsearch engine for use in creating dashboards visualized in Grafana.
- Converted pandas DataFrame data sets to Apache Spark DataFrames.
- Used multiple machine learning algorithms, including random forest, boosted trees, SVM, SGD, neural networks, and deep learning using TensorFlow.
- Collaborated with engineers to deploy successful models and algorithms into production environments.
- Collaborated with a diverse team that included statisticians, the Chief Science Officer, and engineers to build data science project pipelines and algorithms to derive valuable insights from current and new data sets.
- Used PySpark DataFrames to read text, CSV, and image data from HDFS, S3, and Hive.
- Cleaned input text data using the PySpark ML feature extraction API.
- Created features to train algorithms.
- Used various algorithms from the PySpark ML API.
- Trained models using historical data stored in HDFS and Amazon S3.
- Used Spark Streaming to load the trained model and predict on real-time data from Kafka.
- Stored the results in MongoDB.
- Enabled the web application to pick up the data stored in MongoDB.
- Used Apache Zeppelin for Big Data visualization.
- Fully automated job scheduling, monitoring, and cluster management without human intervention using Airflow.
- Exposed Apache Spark as a web service using Flask; worked with input file formats such as ORC, Parquet, JSON, and Avro.
- Developed highly scalable classifiers and tools by leveraging machine learning, Apache Spark, and deep learning.
Environment: Machine learning, Scikit-learn, Pandas, Spark Core, Spark SQL, Spark Streaming, Python, Airflow, Amazon EMR, EC2, S3, NumPy, Matplotlib, TensorFlow, Kafka, Flask, MongoDB, Hive, HDFS, GitHub, and REST.
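The train/persist/predict flow described in this role (train on stored historical data, persist the model, then score new records) can be sketched in miniature; here Scikit-learn stands in for the Spark ML pipeline, an in-memory list stands in for the HDFS/S3/Kafka inputs, and all names and data are invented for illustration.

```python
# Simplified stand-in for the pipeline above: feature extraction + model
# training, persistence, and scoring of a "real-time" record.
import pickle
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy "historical data" (stored in HDFS/S3 in the real pipeline).
texts = ["spark streaming job failed", "model deployed to production",
         "kafka consumer lag detected", "dashboard updated successfully"]
labels = [1, 0, 1, 0]  # 1 = operational alert, 0 = routine event

# Feature extraction chained with a classifier, analogous to the
# PySpark ML feature-extraction API plus an ML algorithm.
pipeline = Pipeline([("tfidf", TfidfVectorizer()),
                     ("clf", LogisticRegression())])
pipeline.fit(texts, labels)

# Persist the trained model so a streaming job could load it later.
blob = pickle.dumps(pipeline)
model = pickle.loads(blob)

# Score a new incoming record, as Spark Streaming would from Kafka.
pred = model.predict(["spark executor failed"])[0]
```

In the real system the persisted model would be loaded inside the streaming job and the prediction written to MongoDB rather than returned in-process.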
Confidential
Data Scientist
Responsibilities:
- Responsible for applying machine learning techniques (regression/classification) to predict outcomes.
- Responsible for the design and development of advanced R/Python programs to prepare, transform, and harmonize data sets in preparation for modeling.
- Designed and automated the process of score cuts that achieved increased close and good rates using advanced R programming.
- Utilized convolutional neural networks to implement a machine learning image recognition component.
- Managed data sets using pandas DataFrames and MySQL; queried the MySQL relational database (RDBMS) from Python using the MySQLdb connector package to retrieve information.
- Utilized standard Python modules such as csv, itertools and pickle for development.
- Tech stack is Python 2.7/PyCharm/Anaconda/pandas/numpy/unittest/R/Oracle.
- Developed large data sets from structured and unstructured data; performed data mining.
- Partnered with modelers to develop data frame requirements for projects.
- Performed ad-hoc reporting, customer profiling, and segmentation using R/Python.
- Tracked various campaigns, generating customer profiling analyses and performing data manipulation.
- Provided Python programming, with detailed direction, in the execution of data analysis that contributed to the final project deliverables; responsible for data mining.
- Analyzed large data sets to answer business questions by generating reports and outcomes.
- Worked in a team of programmers and data analysts to develop insightful deliverables that support data-driven marketing strategies.
- Executed SQL queries from R/Python on complex table configurations.
- Retrieved data from databases through SQL as per business requirements.
- Created, maintained, modified, and optimized SQL Server databases.
- Manipulated data using Python programming.
- Adhered to best practices for project support and documentation.
- Understood the business problem, built hypotheses, and validated them using the data.
- Managed reporting/dashboarding for the key metrics of the business.
- Performed data analysis using different analytic and modeling techniques.
Environment: Python, Oracle, scikit-learn, Pandas, NumPy, SciPy, NLTK, Jupyter Notebook, R, and RStudio.
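The SQL-to-pandas profiling and segmentation workflow described in this role can be sketched briefly; sqlite3 stands in here for the MySQL/Oracle databases, and the table, columns, and segment cut-offs are invented for the example.

```python
# Illustrative sketch: retrieve data via SQL, then profile and segment
# customers in pandas, as described in the responsibilities above.
import sqlite3
import pandas as pd

# In-memory database stands in for the production RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 120.0), (1, 80.0), (2, 30.0), (3, 500.0), (3, 250.0)])

# Pull the raw records into a DataFrame via a SQL query.
df = pd.read_sql_query("SELECT customer_id, amount FROM orders", conn)

# Profile each customer (order count, total spend) and bin into segments.
profile = df.groupby("customer_id")["amount"].agg(["count", "sum"])
profile["segment"] = pd.cut(profile["sum"],
                            bins=[0, 100, 300, float("inf")],
                            labels=["low", "mid", "high"])
print(profile)
```

The groupby/agg plus `pd.cut` pattern is a common way to turn transaction history into the customer segments used for campaign tracking.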
Confidential
Analyst
Responsibilities:
- Developed end to end enterprise Applications using Spring MVC, REST and JDBC Template Modules.
- Wrote well-designed, testable, and efficient Java code.
- Understanding and analyzing complex issues and addressing challenges arising during the software development process, both conceptually and technically.
- Implemented best practices of Automated Build, Test and Deployment.
- Developed design patterns, data structures, and algorithms based on project needs.
- Worked on multiple tools such as Toad, Eclipse, SVN, Apache and Tomcat.
- Deployed models via APIs into applications or workflows
- Worked on User Interface technologies like HTML5, CSS/SCSS.
- Wrote Stored procedure and SQL queries based on project need.
- Deployed built jar into application server.
- Created Automated Unit Tests using Flexible/Open Source Frameworks
- Developed multi-threaded and transaction-handling code (JMS, database).
Environment: Java, Spring MVC, Hibernate, JMS, HTML5, CSS/SCSS, JUnit, Eclipse, Tomcat, and Oracle.
