
Data Scientist/Machine Learning Engineer Resume


Newport, NJ

SUMMARY

  • Seasoned technology specialist with 16+ years of experience in all phases of the software development life cycle.
  • Extensive experience creating medium to large enterprise distributed software solutions from incubation to production, with hands-on coding across data science, machine learning, deep learning, DevOps, big data, and cloud technologies.

TECHNICAL SKILLS

Technologies: Python, R, Scala, Java, Mule and C#

Scripting: Unix Shell and Python Scripting

Development Tools: Anaconda, Jupyter, Cloud Jupyter, Sublime Text, Vim, RStudio, PyCharm, IntelliJ, Visual Studio Code, Visual Studio 2017, Eclipse, Dev StatX, TOAD and SQL Developer

Business Intelligence: Big Data, Hadoop, Spark, PySpark, Alteryx, Trifacta, Grafana, Kafka, Elasticsearch (ELK), Hive, Pig and Kibana

Cloud Technologies: Microsoft Azure, Docker, Edge Nodes, Kubernetes and AWS

Database Technologies: HBase, Cassandra, MongoDB, MariaDB, MemSQL, SQL Server 2014, Oracle 12c and DB2

Reporting Tools: Tableau and Power BI

Source Control: Bitbucket, GitHub and TFS

Build Tools: Maven, Jenkins, Jules, Ansible, Netflix Zuul, Eureka, Apigee, TFS Team Build and WiX 3.0

Business Knowledge: Retail Banking, Wealth Management, Financial and Healthcare

SDLC Process: Agile, Scrum, Jira, Kanban and DevOps

PROFESSIONAL EXPERIENCE

Data Scientist/Machine Learning Engineer

Confidential, Newport, NJ

Responsibilities:

  • Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.
  • Analyzed large data sets, applied machine learning techniques, and developed predictive models.
  • Used Pandas, NumPy, Seaborn, SciPy, TensorFlow, PyTorch, Textacy, Keras, NLTK, Matplotlib, scikit-learn, and XGBoost in Python to develop various machine learning algorithms.
  • Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python.
  • Applied machine learning algorithms and statistical models such as XGBoost, random forests, decision trees, and neural networks to identify volume, using the scikit-learn package in Python.
  • Evaluated models using cross-validation, F1 score, ROC curves, and log loss, and used AUC for feature selection (see the evaluation sketch after this list).
  • Worked on NoSQL databases such as Cassandra and MariaDB.
  • Performed data analysis using Hive to retrieve data from the Hadoop cluster.
  • Built a reliable, auditable CI/CD deployment pipeline using Jenkins.
  • Developed Kafka and Hadoop integration for data ingestion, data mapping, and data processing (see the streaming sketch after this list).
  • Worked on distributed computing and microservices.
  • Ensured code paths were unit tested, defect free, and integration tested.
  • Worked with cloud-based applications and container orchestration using Docker and Kubernetes.
  • Followed an Agile development approach.
  • Developed models on GPUs.
  • Handled terabyte-scale datasets.
  • Wrote Python scripts to integrate and productionize the model.
  • Wrote Spark code to productionize the model.
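
A minimal sketch of the evaluation workflow referenced above (cross-validation, ROC/AUC, log loss, and F1 with scikit-learn and XGBoost). The synthetic dataset, feature count, and hyperparameters are illustrative assumptions, not the project's actual configuration.

```python
# Hypothetical sketch: cross-validated evaluation of an XGBoost classifier
# using ROC AUC, log loss, and F1, as described in the bullets above.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate, train_test_split
from sklearn.metrics import roc_auc_score, log_loss, f1_score
from xgboost import XGBClassifier

# Synthetic stand-in for the real (confidential) dataset.
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = XGBClassifier(n_estimators=200, max_depth=4)

# Cross-validated scores on the training split.
cv = cross_validate(model, X_train, y_train, cv=5,
                    scoring=["roc_auc", "neg_log_loss", "f1"])
print("CV ROC AUC :", cv["test_roc_auc"].mean())
print("CV log loss:", -cv["test_neg_log_loss"].mean())
print("CV F1      :", cv["test_f1"].mean())

# Final hold-out evaluation.
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]
print("Hold-out ROC AUC :", roc_auc_score(y_test, proba))
print("Hold-out log loss:", log_loss(y_test, proba))
print("Hold-out F1      :", f1_score(y_test, (proba > 0.5).astype(int)))
```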

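As a rough illustration of the Kafka-to-Hadoop ingestion and Spark productionization items above, a PySpark Structured Streaming job might look like the following. The broker address, topic name, schema, and paths are placeholders, and the spark-sql-kafka connector is assumed to be on the classpath.

```python
# Hypothetical sketch: ingest events from Kafka with PySpark Structured
# Streaming and land them on HDFS for downstream model scoring.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

# Assumed event schema; the real feed and fields were project-specific.
schema = StructType([
    StructField("account_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", StringType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder
       .option("subscribe", "transactions")                # placeholder topic
       .load())

events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(F.from_json("json", schema).alias("e"))
          .select("e.*"))

# Land the parsed events on HDFS as Parquet for batch scoring jobs.
query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/ingest/transactions")        # placeholder
         .option("checkpointLocation", "hdfs:///chk/transactions")  # placeholder
         .outputMode("append")
         .start())

query.awaitTermination()
```
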
Environment: Python, PySpark, AWS, S3, Cloudera, Hortonworks, Kibana, Elasticsearch, Logstash, Netflix Zuul, Eureka, Sidecar, Control-M, Gunicorn, Edge Node, Docker, Kubernetes, Kafka, Alteryx, Git, Jenkins, Bitbucket, DevOps, Scrum and Tableau

Lead Analyst

Confidential, Pennington, NJ

Responsibilities:

  • Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.
  • Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn, TensorFlow, and NLTK in Python to develop various machine learning and NLP algorithms.
  • Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python.
  • Applied machine learning algorithms and statistical models such as linear regression, Naïve Bayes, random forests, decision trees, neural networks, SVM, K-means, and KNN clustering to identify volume, using the scikit-learn package in Python.
  • Analyzed large data sets, applied machine learning techniques, and developed predictive models.
  • Evaluated models using cross-validation, log loss, and ROC curves, and used AUC for feature selection.
  • Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from the Oracle database.
  • Processed data to build a matrix-based collaborative filtering recommendation model with Spark (MLlib ALS) to drive the financial product recommendation web application (see the sketch after this list).
  • Worked with advanced NLP, clustering, classification, and graph analytics algorithms.
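
A minimal sketch, under assumed column names, sample interactions, and hyperparameters, of the matrix-factorization recommendation approach (Spark MLlib ALS) mentioned above.

```python
# Hypothetical sketch: collaborative filtering with Spark MLlib ALS,
# in the spirit of the financial-product recommendation item above.
from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator

spark = SparkSession.builder.appName("product-recs").getOrCreate()

# Placeholder user/product interaction data; the real ratings were derived
# from customer activity and are confidential.
ratings = spark.createDataFrame(
    [(1, 10, 5.0), (1, 11, 3.0), (2, 10, 4.0),
     (2, 12, 1.0), (3, 11, 2.0), (3, 12, 4.0)],
    ["user_id", "product_id", "rating"],
)

als = ALS(userCol="user_id", itemCol="product_id", ratingCol="rating",
          rank=10, regParam=0.1, coldStartStrategy="drop")
model = als.fit(ratings)

# RMSE on the training interactions, only to show the evaluation API.
predictions = model.transform(ratings)
rmse = RegressionEvaluator(metricName="rmse", labelCol="rating",
                           predictionCol="prediction").evaluate(predictions)
print("RMSE:", rmse)

# Top-3 product recommendations per user, to be served by the web application.
model.recommendForAllUsers(3).show(truncate=False)
```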

Environment: Python, PySpark, HAAS, Azure, Java, C#, Hadoop, Tableau, Power BI

Lead Analyst

Confidential, Pennington, NJ

Responsibilities:

  • Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.
  • Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn, TensorFlow, and NLTK in Python to develop various machine learning and NLP algorithms.
  • Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python.
  • Applied machine learning algorithms and statistical models such as linear regression, Naïve Bayes, random forests, decision trees, neural networks, SVM, K-means, and KNN clustering to identify volume, using the scikit-learn package in Python.
  • Analyzed large data sets, applied machine learning techniques, and developed and enhanced predictive and statistical models by leveraging best-in-class modeling techniques.
  • Evaluated models using cross-validation, log loss, and ROC curves, and used AUC for feature selection.
  • Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from the Oracle database.
  • Processed data to build a matrix-based collaborative filtering recommendation model with Spark (MLlib ALS) to drive the financial product recommendation web application.
  • Worked with advanced NLP, clustering, classification, and graph analytics algorithms (see the sketch after this list).
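
As a rough illustration of the NLP and clustering work listed above, the following scikit-learn sketch clusters short documents using TF-IDF features and K-means; the sample texts and cluster count are placeholders.

```python
# Hypothetical sketch: TF-IDF features plus K-means clustering of short texts,
# illustrating the NLP/clustering item above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "mortgage refinance rate inquiry",
    "credit card late fee dispute",
    "retirement account rollover question",
    "dispute a charge on my credit card",
    "best mortgage rates for refinancing",
    "roll over my 401k into an IRA",
]  # placeholder documents

# Vectorize the documents, then cluster them into three assumed groups.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
for doc, label in zip(docs, kmeans.labels_):
    print(label, doc)
```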

Environment: Python, PySpark, HAAS, Azure, Java, C#, Hadoop, Tableau, Power BI
