
Data Scientist/ Machine Learning Resume


CA

SUMMARY:

  • Over 8 years of experience in Machine Learning and Data Mining with large datasets of Structured and Unstructured data, including Data Acquisition, Data Validation, Predictive Modeling, and Data Visualization.
  • Designed the Physical Data Architecture of new system engines.
  • Extensive experience in Text Analytics, developing Statistical Machine Learning and Data Mining solutions to various business problems, and generating data visualizations using R, Python, and Tableau.
  • Good experience in NLP with Apache Hadoop and Python.
  • Hands-on experience with Spark MLlib utilities, including Classification, Regression, Clustering, Collaborative Filtering, and Dimensionality Reduction.
  • Proficient in Statistical Modeling and Machine Learning techniques (Linear and Logistic Regression, Decision Trees, Random Forest, SVM, K-Nearest Neighbors, Bayesian methods, XGBoost) in Forecasting/Predictive Analytics, Segmentation Methodologies, Regression-based models, Hypothesis Testing, Factor Analysis/PCA, and Ensembles (a minimal classification sketch follows this list).
  • Hands-on experience implementing LDA and Naïve Bayes; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, Neural Networks, and Principal Component Analysis, with good knowledge of Recommender Systems.
  • Developing Logical Data Architecture with adherence to Enterprise Architecture.
  • Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data.
  • Adept in statistical programming languages like R and Python, as well as Big Data technologies like Hadoop and Hive.
  • Strong experience in the Software Development Life Cycle (SDLC), including Requirements Analysis, Design Specification, and Testing, in both Waterfall and Agile methodologies.
  • Experience working with data modeling tools like Erwin, Power Designer and ER Studio.
  • Skilled in using dplyr in R and pandas in Python for performing exploratory data analysis.
  • Experience in designing stunning visualizations using Tableau and in publishing and presenting dashboards and storylines on web and desktop platforms.
  • Experience in designing Star schema and Snowflake schema for Data Warehouse and ODS architectures.
  • Experience in designing and developing Tableau dashboards, updating existing desktop workbooks, developing ad-hoc reports, scheduling processes, and administering Tableau Server activities.
  • Experienced in designing customized interactive dashboards in Tableau using Marks, Actions, Filters, Parameters, and Calculations.
  • Good understanding of Teradata SQL Assistant, Teradata Administrator, and data load/export utilities like BTEQ, FastLoad, MultiLoad, and FastExport.
  • Experience and technical proficiency in designing and data modeling online applications; Solution Lead for architecting Data Warehouse/Business Intelligence applications.
  • Experience in maintaining the database architecture and metadata that support the Enterprise Data Warehouse.
  • Experience in prediction of numerical values using Regression or CART.
  • Experience with Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, Pivot Tables and OLAP reporting.
  • Highly skilled in using visualization tools like Tableau, ggplot2 and d3.js for creating dashboards.
  • Highly skilled in using Hadoop (Pig and Hive) for basic analysis and extraction of data in the infrastructure to provide data summarization.
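As context for the modeling bullets above, the following is a minimal, self-contained sketch of the kind of supervised-classification workflow this list summarizes. The dataset (scikit-learn's bundled breast-cancer sample), the specific parameters, and the ROC AUC comparison are illustrative assumptions, not a record of project work.

```python
# Minimal supervised-classification sketch (illustrative only).
# Stand-in dataset: scikit-learn's bundled breast-cancer data.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Fit two of the model families named above and score them on held-out data.
for model in (LogisticRegression(max_iter=5000),
              RandomForestClassifier(n_estimators=200, random_state=42)):
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    print(type(model).__name__, "test ROC AUC:",
          round(roc_auc_score(y_test, proba), 3))
```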

TECHNICAL SKILLS:

Languages: SQL, PL/SQL, T-SQL, ASP, Visual Basic, XML, SAS, Python, C, C++, Java, HTML, Shell Scripting, Perl, R, MATLAB, Scala.

Data Modeling Tools: Erwin r9.6/9.5, ER/Studio 9.7, Star-Schema Modeling, Snowflake-Schema Modeling, FACT and dimension tables, Pivot Tables.

Databases: Oracle 11g/12c, MS Access, SQL Server 2012/2014, Sybase, DB2, Teradata 14/15, Hive.

Big Data Tools: Hadoop, Hive, Spark, Pig, HBase, Sqoop, Flume.

BI Tools: Tableau 7.0/8.2, Tableau Server 8.2, Tableau Reader 8.1, SAP BusinessObjects, Crystal Reports.

Packages: Microsoft Office 2010, Microsoft Project 2010, SAP, Microsoft Visio, SharePoint Portal Server.

Applications: Toad for Oracle, Oracle SQL Developer, MS Word, MS Excel, MS PowerPoint, Teradata, Designer 6i.

Methodologies: RAD, JAD, RUP, UML, System Development Life Cycle (SDLC), Waterfall Model.

Operating Systems: Microsoft Windows, Linux/UNIX.

PROFESSIONAL EXPERIENCE:

Confidential, Cupertino, CA

Data Scientist/ Machine Learning

Responsibilities:

  • Built models using statistical techniques like Bayesian HMM and Machine Learning classification models like XGBoost, SVM, and Random Forest.
  • Completed a highly immersive Data Science program involving Data Manipulation & Visualization, Web Scraping, Machine Learning, Python programming, SQL, Git, Unix commands, NoSQL, MongoDB, and Hadoop.
  • Designed and developed Tableau reports, documents, and dashboards to specified requirements and timelines.
  • Set up storage and data analysis tools in the Amazon Web Services cloud computing infrastructure.
  • Used pandas, numpy, seaborn, scipy, matplotlib, scikit-learn, NLTK in Python for developing various machine learning algorithms.
  • Installed and used the Caffe Deep Learning Framework.
  • Worked on different data formats such as JSON, XML and performed machine learning algorithms in Python.
  • Worked with Data Architects and IT Architects to understand the movement and storage of data, using ER Studio 9.7.
  • Purchased, set up, and configured a Tableau Server and an MS SQL Server 2008 R2 server for data warehouse purposes.
  • Prepared dashboards using calculations and parameters in Tableau.
  • Designed, developed and implemented Tableau Business Intelligence reports.
  • Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization; also performed gap analysis.
  • Performed data manipulation and aggregation from different sources using Nexus, Toad, Business Objects, Power BI, and Smart View.
  • Implemented Agile Methodology for building an internal application.
  • Focused on integration overlap and Informatica's newer commitment to MDM following its acquisition of Identity Systems.
  • Good knowledge of Hadoop architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce concepts.
  • As Architect, delivered various complex OLAP databases/cubes, scorecards, dashboards, and reports.
  • Programmed a utility in Python that used multiple packages (scipy, numpy, pandas).
  • Implemented Classification using supervised algorithms like Logistic Regression, Decision trees, KNN, Naive Bayes.
  • Used Teradata 15 utilities such as FastExport and MultiLoad (MLOAD) for various data migration/ETL tasks from OLTP source systems to OLAP target systems.
  • Experience in Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Flume including their installation and configuration.
  • Updated Python scripts to match training data with our database stored in AWS Cloud Search, so that we would be able to assign each document a response label for further classification.
  • Performed data transformation from various sources, data organization, and feature extraction from raw and stored data.
  • Validated the machine learning classifiers using ROC Curves and Lift Charts (a lift-chart sketch follows this list).
  • Extracted data from HDFS and prepared data for exploratory analysis using data munging.
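As an illustration of the lift-chart validation mentioned above, here is a minimal sketch of computing decile lift from predicted probabilities. The labels and scores below are synthetic stand-ins; in practice they would come from a fitted classifier such as those listed in this section.

```python
# Decile-lift sketch for classifier validation (illustrative only).
# y_true / y_score are synthetic stand-ins for real labels and
# predicted probabilities from a fitted classifier.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_score = np.clip(0.35 * y_true + 0.65 * rng.random(1000), 0.0, 1.0)

df = pd.DataFrame({"y": y_true, "score": y_score})
# Rank-based qcut avoids duplicate-edge errors when scores tie.
df["decile"] = pd.qcut(df["score"].rank(method="first"), 10, labels=False)

base_rate = df["y"].mean()
# Lift per decile = decile response rate / overall response rate;
# decile 9 holds the highest-scored records.
lift = (df.groupby("decile")["y"].mean() / base_rate).sort_index(ascending=False)
print(lift.round(2))
```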

Environment: ER Studio 9.7, Tableau 9.03, AWS, Teradata 15, MDM, Git, Unix, Python 3.5.2, Machine Learning, MLlib, SAS, regression, logistic regression, Hadoop, NoSQL, OLTP, random forest, OLAP, HDFS, ODS, NLTK, SVM, JSON, XML, MapReduce.

Confidential, Palo Alto, CA

Data Architecture/Data Modeler

Responsibilities:
  • Coded R functions to interface with Caffe Deep Learning Framework.
  • Worked in the Amazon Web Services cloud computing environment.
  • Used Tableau to automatically generate reports; worked with partially adjudicated insurance flat files, internal records, third-party data sources, JSON, XML, and more.
  • Interacted with business stakeholders, gathered requirements, and managed the delivery, covering the entire Tableau development life cycle.
  • Created interactive BI dashboards and submitted them to the server using Tableau Publisher.
  • Worked with several R packages including knitr, dplyr, SparkR, CausalInfer, spacetime.
  • Implemented end-to-end systems for Data Analytics, Data Automation and integrated with custom visualization tools using R, Mahout, Hadoop and MongoDB.
  • Gathered all the required data from multiple data sources and created datasets to be used in analysis.
  • Performed Exploratory Data Analysis and data visualization using R and Tableau.
  • Performed thorough EDA, including univariate and bivariate analysis, to understand intrinsic and combined effects.
  • Worked with data governance, data quality, data lineage, and data architecture teams to design various models and processes.
  • Independently coded new programs and designed tables to load and test programs effectively for the given POCs using Big Data/Hadoop.
  • Designed data models and data flow diagrams using Erwin and MS Visio.
  • As an Architect implemented MDM hub to provide clean, consistent data for a SOA implementation.
  • Developed, Implemented & Maintained the Conceptual, Logical & Physical Data Models using Erwin for Forward/Reverse Engineered Databases.
  • Established Data architecture strategy, best practices, standards, and roadmaps.
  • Led the development and presentation of a data analytics data-hub prototype with the help of the other members of the emerging solutions team.
  • Performed data cleaning and imputation of missing values using R (a comparable pandas sketch follows this list).
  • Worked with the Hadoop ecosystem, covering HDFS, HBase, YARN, and MapReduce.
  • Took up ad-hoc requests from different departments and locations.
  • Used Hive to store the data and perform data cleaning steps for huge datasets.
  • Created dashboards and visualizations on a regular basis using ggplot2 and Tableau.
  • Created customized business reports and shared insights with management.
  • Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata.
  • Interacted with other departments to understand and identify data needs and requirements, and worked with other members of the IT organization to deliver data visualization and reporting solutions that address those needs.
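The cleaning-and-imputation bullet above describes work done in R; as a comparable illustration in pandas (the resume's other stated language), here is a minimal sketch. The DataFrame and its column names ("age", "income", "segment") are hypothetical, not project data.

```python
# Comparable missing-value imputation sketch in pandas (illustrative only;
# the bullet above describes this step in R). Columns are hypothetical.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [34, np.nan, 52, 41, np.nan],
    "income": [72000, 58000, np.nan, 61000, 49000],
    "segment": ["a", "b", "b", None, "a"],
})

# Impute numeric columns with the median, categorical with the mode.
for col in df.select_dtypes(include="number").columns:
    df[col] = df[col].fillna(df[col].median())
df["segment"] = df["segment"].fillna(df["segment"].mode()[0])
print(df)
```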

Environment: R 3.0, Erwin 9.5, Tableau 8.0, MDM, QlikView, MLlib, PL/SQL, Teradata 14.1, JSON, Hadoop (HDFS), MapReduce, Pig, Spark, RStudio, Mahout, Java, Hive, AWS.

Confidential, Princeton, NJ

Java Developer

Responsibilities:
  • Designed & developed the application using Spring Framework
  • Developed class diagrams, sequence diagrams, and use case diagrams using UML in Rational Rose.
  • Designed the application with reusable J2EE design patterns
  • Developed test cases for Unit testing using JUnit and performed integration and system testing
  • Involved in coding for the presentation layer using Struts Framework, JSP, AJAX, XML, XSLT and JavaScript
  • Closely worked and supported the creation of database schema objects (tables, stored procedures, and triggers) using Oracle SQL.
  • Designed DAO objects for accessing RDBMS
  • Designed & developed Data Transfer Objects to carry the data between different layers
  • Developed web pages using JSP, HTML, DHTML and JSTL
  • Designed and developed a web-based client using Servlets, JSP, Tag Libraries, JavaScript, HTML and XML using Struts Framework.
  • Developed views and controllers for client and manager modules using Spring MVC and Spring Core.
  • Used Spring Security for securing web-tier access.
  • Implemented business logic using Hibernate.
  • Developed and modified database objects as per the requirements.
  • Involved in unit and integration testing, bug fixing, acceptance testing with test cases, and code reviews.
  • Interacted with customers, identified system requirements, and developed Software Requirement Specifications.
  • Implemented Java design patterns wherever required.
  • Implemented Multi-threading concepts.

Environment: Java, PL/SQL, SQL, HTML, CSS, JavaScript, Hibernate, middleware technologies, AJAX, Servlets, JSP, WebLogic, JBoss, WebSphere, XML, XHTML, Eclipse, JMS, Oracle 11g, EJB.

Confidential

Java Developer

Responsibilities:
  • Participated in system design, planning, estimation, and implementation.
  • Involved in developing Use case diagrams, Class diagrams, Sequence diagrams and process flow diagrams for the modules using UML and Rational Rose.
  • Developed the presentation layer using JSP, AJAX, HTML, XHTML, CSS and client validations using JavaScript.
  • Developed and implemented the MVC Architectural Pattern using Spring Framework.
  • Effective usage of J2EE design patterns, namely Session Facade, Factory Method, Command, and Singleton, to develop various base framework components in the application.
  • Developed various EJBs (session and entity beans) for handling business logic.
  • Developed Session Beans and DAO classes for Accounts and other Modules.
  • Worked on generating the web services classes by using WSDL, UDDI, and SOAP.
  • Consumed Web Services using WSDL, SOAP, and UDDI from the third party for authorizing payments to/from customers.
  • Designed and developed systems based on JEE specifications and used Spring Framework with MVC architecture.
  • Used the Spring Roo framework, Design/Enterprise Integration patterns, and REST architecture compliance for the design and development of applications.
  • Involved in application development using the Spring Core, Spring Roo, Spring JEE, and Spring Aspects modules, along with Java web-based technologies such as web services (REST/SOA/microservices), including microservices implementations, and Hibernate ORM.
  • Used LDAP and Microsoft Active Directory services for authorization and authentication.
  • Implemented design patterns such as Singleton, Session Façade, Factory, Business Delegate, and DAO, within an MVC architecture.
  • Used JPA object mapping for backend data persistence.

Environment: Java, J2EE, Spring (Core, Roo, MVC), Hibernate, JPA, EJB, JSP, AJAX, HTML, XHTML, CSS, JavaScript, WSDL, SOAP, UDDI, REST, LDAP, Microsoft Active Directory, UML, Rational Rose.
