
Machine Learning/Data Scientist Resume


Tempe, AZ

PROFESSIONAL SUMMARY:

  • 8+ years of IT experience, including 4+ years in Machine Learning and Data Mining with large datasets of structured and unstructured data, Data Acquisition, Data Validation, Predictive Modeling, and Data Visualization.
  • Extensive experience in Text Analytics, developing different Statistical Machine Learning, Data Mining solutions to various business problems and generating data visualizations using R, Python.
  • Excellent Knowledge of Relational Database Design, Data Warehouse/OLAP concepts, and methodologies.
  • Expertise in transforming business requirements into analytical models, designing algorithms, building models, developing data mining and reporting solutions that scale across a massive volume of structured and unstructured data.
  • Hands-on experience in importing, cleaning, transforming, and validating data, and drawing conclusions from the data for decision-making purposes.
  • Solid understanding of statistical analysis, predictive analysis, machine learning, data mining, quantitative analytics, multivariate testing, and optimization algorithms.
  • Worked on various databases: Oracle, SQL Server, Teradata, and DB2.
  • Well versed in system analysis, ER/Dimensional Modeling, Database design and implementing RDBMS specific features.
  • Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export using multiple ETL tools such as Informatica PowerCenter. Experienced in testing, data validation, and writing SQL and PL/SQL statements: stored procedures, functions, triggers, and packages.
  • Expertise in OLTP/OLAP System Study, Analysis, and E-R Modeling, developing Database Schemas like Star schema and Snowflake schema used in relational, dimensional and multidimensional Modeling.
  • Highly skilled in Tableau Desktop for data visualization, Reporting and Analysis, Cross Map, Scatter Plots, Geographic Map, Pie Charts and Bar Charts, Page Trails and Density Chart.
  • Experience with data warehousing techniques like Slowly Changing Dimensions, Surrogate key, Snowflaking etc. Worked with Star Schema, Data Models, E-R diagrams and Physical Data Models.
  • Develop, maintain and teach new tools and methodologies related to data science and high-performance computing.
  • Experienced in Machine Learning techniques: ANOVA, PCA, Forecasting, Time Series Regression, Linear/Nonlinear Regression, Logistic Regression, Clustering, and tree-based models (a brief sketch follows this list).
  • Technical proficiency in designing and data modeling for online applications; solution lead for architecting Data Warehouse/Business Intelligence applications.
  • Skilled in Cluster Analysis, Principal Component Analysis (PCA), Association Rules, and Recommender Systems.
  • Hands-on experience in database management and data visualization.
  • Strong experience in Software Development Life Cycle (SDLC) including Requirements Analysis, Design Specification and Testing as per Cycle in both Waterfall and Agile methodologies.
  • Hands on experience with RStudio for doing data pre-processing and building machine learning algorithms on different datasets.
  • Collaborated with the lead Data Architect to model the Data Warehouse in accordance with FSLDM subject areas, 3NF format, and Snowflake schema.
  • Flexible with Unix/Linux and Windows Environments, working with Operating Systems like Centos5/6, Ubuntu13/14, Cosmos.
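
For illustration, a minimal sketch of one such workflow (PCA feeding a logistic regression classifier) in Python with scikit-learn; the synthetic dataset and parameter choices are hypothetical:

```python
# Minimal sketch: PCA for dimensionality reduction feeding a logistic
# regression classifier. Dataset and parameters are illustrative only.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Reduce to 5 principal components, then fit a logistic regression model.
model = Pipeline([
    ("pca", PCA(n_components=5)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
print("Held-out accuracy:", model.score(X_test, y_test))
```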

TECHNICAL SKILLS:

Languages: Python, R, Scala, Java and JavaScript.

Machine Learning Libraries: Spark ML, Spark MLlib, Kafka, Scikit-Learn & NLTK

IDE's: Eclipse, NetBeans, MS Visual Studio, Sublime, SOAP UI

Application Servers: WebLogic, WebSphere, JBoss, Tomcat.

Databases: Microsoft SQL Server 2008, MySQL 4.x/5.x, Oracle 10g/11g/12c, DB2, Teradata.

NOSQL Databases: HBase, Cassandra, MongoDB.

Build Tools: Jenkins, Maven, ANT, Toad, SQL Loader, RTC, RSA, Control-M, Oozie, Hue, SOAP UI

Development Methodologies: Agile/Scrum, Waterfall, UML, Design Patterns

Database Tools: SQL Server Data Tools, Visual Studio, Spotlight, SQL Server Management Studio, Query Analyzer, Enterprise Manager, JIRA, Profiler

ETL Tools: Informatica PowerCenter, SSIS

Operating Systems: UNIX, Linux, Windows, macOS, Sun Solaris

PROFESSIONAL EXPERIENCE:

Confidential, Scottsdale, Arizona

Machine Learning/Data Scientist

Responsibilities:

  • Design and develop state-of-the-art deep learning/machine learning algorithms for analyzing image and video data, among other data types.
  • Develop and implement innovative AI and machine learning tools used in the Risk domain.
  • Experience with TensorFlow, Caffe, and other deep learning frameworks.
  • Followed effective software development processes to customize and extend computer vision and image processing techniques to solve new problems for Automation Anywhere.
  • Develop and implement innovative data quality improvement tools.
  • Involved in Peer Reviews, Functional and Requirement Reviews.
  • Develop project requirements and deliverable timelines; execute efficiently to meet the plan timelines.
  • Created and supported a data management workflow spanning data collection, storage, and analysis through training and validation.
  • Involved with data analysis, primarily identifying data sets, source data, source metadata, data definitions, and data formats.
  • Well experienced in Normalization and De-Normalization techniques for optimum performance in relational and dimensional database environments.
  • Analyzed requirements and the significance of weld-point data and energy efficiency using large datasets.
  • Develop necessary connectors to plug ML software into wider data pipeline architectures.
  • Wrangled data and worked on large datasets (acquired and cleaned the data); analyzed trends with visualizations using Matplotlib and Python.
  • Experience with TensorFlow, Theano, Keras and other Deep Learning Frameworks.
  • Built an Artificial Neural Network using TensorFlow in Python to estimate a customer's probability of cancelling their connection (churn-rate prediction); a minimal sketch appears after this list.
  • Understanding the business problems and analyzing the data by using appropriate Statistical models to generate insights.
  • Knowledge of Information Extraction and NLP algorithms coupled with Deep Learning.
  • Developed NLP models for topic extraction and sentiment analysis.
  • Identify and assess available machine learning and statistical analysis libraries (including regressors, classifiers, statistical tests, and clustering algorithms).
  • Design and build scalable software architecture to enable real-time / big-data processing.
  • Used Teradata utilities such as FastExport and MLOAD for data migration/ETL tasks from OLTP source systems to OLAP target systems.
  • Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from the Oracle database, and used ETL for data transformation.
  • Performed a deep ML analysis of the HTPD/RTPD/LTPD test data to define a model of FBC growth rate across temperature.
  • Built ML models for projecting ICC memory for pre-production SLC, MLC, and TLC single- and multi-die packages.
  • Used the TensorFlow library in a dual-GPU environment for training and testing of the neural networks.
  • Taking responsibility for technical problem solving, creatively meeting product objectives and developing best practices.
  • Have a high sense of urgency to deliver projects as well as troubleshoot and fix data queries/ issues.
  • Work independently with R&D partners to understand requirements.
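
A minimal sketch of the churn-prediction network described above, using TensorFlow/Keras in Python; the feature count, layer sizes, and random training data are hypothetical stand-ins for the actual customer dataset:

```python
# Minimal sketch of a churn-probability ANN in TensorFlow/Keras.
# Feature count, layer sizes, and data are hypothetical placeholders.
import numpy as np
import tensorflow as tf

n_features = 10  # hypothetical number of customer attributes
X = np.random.rand(1000, n_features).astype("float32")
y = np.random.randint(0, 2, size=(1000, 1))  # 1 = cancelled connection

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # churn probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)

# Predicted probability of cancellation for a few customers
print(model.predict(X[:5]))
```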

Environment: R 9.0, RStudio, Machine Learning, Informatica 9.0, Scala, Spark, Cassandra, ML, DL, Scikit-learn, Shogun, Data Warehouse, MLlib, Cloudera Oryx, Apache.

Confidential, Tempe, AZ

Data Scientist

Responsibilities:

  • Provided Configuration Management and Build support for more than 5 different applications, built and deployed to the production and lower environments.
  • Evaluated the performance of Various Classification and Regression algorithms using R language to predict the future power.
  • Worked with several R packages including knitr, dplyr, SparkR, Causal Infer, spacetime.
  • Involved in detecting patterns with unsupervised learning such as K-Means clustering (see the sketch after this list).
  • Implemented end-to-end systems for Data Analytics, Data Automation and integrated with custom visualization tools using R, Mahout, Hadoop, and MongoDB.
  • Gathering all the data that is required from multiple data sources and creating datasets that will be used in the analysis.
  • Performed Exploratory Data Analysis and data visualizations using R and Tableau.
  • Performed thorough EDA, with univariate and bivariate analysis, to understand intrinsic and combined effects.
  • Worked with Data Governance, Data quality, data lineage, Data architect to design various models and processes.
  • Designed and developed Ad-hoc reports as per business analyst, operation analyst, and project management data requests.
  • Worked on data verifications and validations to evaluate whether the data generated met the requirements and was appropriate and consistent.
  • Developed triggers, stored procedures, functions, and packages using cursor and ref cursor concepts associated with the project using PL/SQL.
  • Used Python, R, and SQL to create statistical algorithms involving Multivariate Regression, Linear Regression, Logistic Regression, PCA, Random Forest models, Decision Trees, and Support Vector Machines for estimating the risks of welfare dependency.
  • Applied statistical Modeling like decision trees, regression models, and SVM.
  • Utilized Convolutional Neural Networks to implement a machine learning image recognition component; implemented backpropagation to generate accurate predictions.
  • Performed Information Extraction using NLP algorithms coupled with Deep Learning (ANN and CNN), Keras and TensorFlow.
  • Implemented Apache Spark to speed up Convolutional Neural Network modeling.
  • Analyzed sentimental data and detected patterns in customer usage data sets.
  • Avoided overfitting by following standard practices such as keeping the number of independent parameters in the model smaller than the number of available data points.
  • Used graphical Entity-Relationship diagramming to create new database designs via an easy-to-use graphical interface.
  • Created multiple custom SQL Queries in Teradata SQL Workbench to prepare the right data sets for Tableau dashboards
  • Performed analyses such as regression analysis, logistic regression, discriminant analysis, and cluster analysis using SAS programming.
  • Used Metadata tool for importing metadata from the repository, new job categories and creating new data elements.
  • Scheduled the task for weekly updates and running the model in the workflow. Automated the entire process flow in generating the analysis and reports.
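
A minimal sketch of the K-Means pattern detection mentioned above, using scikit-learn; the synthetic blobs and the choice of k = 3 are illustrative assumptions:

```python
# Minimal sketch: detecting patterns with unsupervised K-Means clustering.
# The synthetic blobs and k=3 are illustrative assumptions.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=3, random_state=42)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("Cluster centers:\n", kmeans.cluster_centers_)
print("First ten assignments:", labels[:10])
```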

Environment: Erwin 8, Teradata 13, SQL Server 2008, Oracle 9i, SQL*Loader, PL/SQL, ODS, OLAP, OLTP, SSAS.

Confidential - Tampa, FL

Data Engineer/ Data Scientist

Responsibilities:

  • Worked with Data Architect and SMEs in identifying the solution for reporting issues found while comparing existing data model in ISR and that developed for ASM.
  • Performed requirements modeling and developed analysis diagrams, activity diagrams, sequence diagrams, state diagrams, data models, and use-case realizations using RUP tools in Agile.
  • Actively involved in analyzing the order management and supply chain management process, managing inventory process and documenting the process.
  • Collected business requirements to set rules for proper data transfer from data source to data target in data mapping, and used ETL tools like Informatica to load data into the staging tables in the database.
  • Involved in designing logical and physical data models, wrote business and technical metadata, and prepared source-to-target data mapping documents in Excel.
  • Defined and created procedures to install Tableau desktop in silent mode.
  • Generated DDLs and made them available to the DBA for execution.
  • Identified Key Data Elements (KDEs) and built business rules to measure KPIs.
  • Facilitated meetings with business users, data architects, data modelers, business analysts, QA and multiple delivery teams to define the data quality and profiling requirements.
  • Experience with creation of users, groups, projects, workbooks and the appropriate permission sets for Tableau server logins and security checks.
  • Generated DDL from Erwin and made it available to the DBA for deployment.
  • Audited data for the identified subject areas and assisted in building DQ dashboards.
  • Established a process for data review and remediation of Data Quality issues
  • Planned and defined system requirements to Wire Frame with Use Case, Use Case Scenario and Use Case Narrative using the UML (Unified Modeling Language) methodologies
  • Identified and documented data sources and transformation rules required to populate and maintain data warehouse content.
  • Provide DBA with DDL for implementation in Development.
  • Maintained benchmark controls to policies, company standards and contracts, performed vendor sourcing, pricing and contract negotiation, performed procurement, ensured compliance with service/joint interest contracts.
  • Analyzed and built proofs of concept to convert SAS reports into Tableau or use SAS datasets in Tableau.
  • Queried database using SQL for backend testing
  • Prepared dashboards using calculations and parameters in Tableau.
  • Used SDLC (System Development Life Cycle) methodologies like RUP and Agile methodology.
  • Created business requirement documents and integrated the requirements with underlying platform functionality; interfaced between business new-product builders and system platform builders.
  • Worked closely with the vendor to define work rules for different types of exempt and non-exempt employees.

Environment: Oracle, Erwin, Informatica, Data Warehousing, SQL, Tableau, SAS

Confidential - Phoenix, Arizona

Data Engineer

Responsibilities:

  • Documented logical, physical, relational and dimensional data models. Designed the Data Marts in dimensional data Modeling using star and snowflake schemas.
  • Prepared documentation for all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, and glossary items, all of which evolved and changed during the project.
  • Coordinated with DBA on database build and table normalizations and de-normalizations
  • Created, documented and maintained logical & physical database models.
  • Identified the entities and relations between the entities to develop Conceptual Model using ERWIN.
  • Creating Data mappings, Tech Design, loading strategies for ETL to load newly created or existing tables.
  • Performed in-depth data analysis, loaded customer details from the data warehouse, and generated comprehensive reports for decision-makers and others affected by the results.
  • Created Schema objects like Indexes, Views, and Sequences, triggers, grants, roles, Snapshots.
  • Developed strategies and loading techniques for better loading and faster query performance.
  • Extensively worked on documentation of Data Model, Mapping, Transformations and Scheduling batch jobs.
  • Developed a dimensional model for Data Warehouse/OLAP applications by identifying required facts and dimensions.
  • Used the Data Warehousing Life Cycle to identify data elements from the source systems, performed data analysis to come up with data cleansing and integration rules for the ETL process.
  • Defined Functional Test Cases, documented, Executed test script in Facets system.
  • Designed a STAR schema for the detailed data marts and plan data marts consisting of conformed dimensions (a sketch appears after this list).
  • Performed data analysis and data profiling using complex SQL on various sources systems including Oracle.
  • Created advanced chart visualizations in Tableau using Dual Axis, Box Plots, Bullet Graphs, Treemaps, Bubble Charts, Pie Chart, Gantt chart, Histograms.
  • Involved in Creation of dashboards, stories, and visualizations in Tableau. Created report schedules on Tableau Server.
  • Worked on SQL queries in a dimensional data warehouse as well as a relational data warehouse.
  • Written SQL scripts to test the mappings and Developed Traceability Matrix of Business Requirements mapped to Test Scripts to ensure any Change Control in requirements leads to test case update.
  • Recommended additional methods to analyze, collect, and manage data to improve data quality and the efficiency of data systems across the application.
  • Perform administrative tasks, including the creation of database objects such as database, tables, and views, using SQL DCL, DDL, and DML requests.
  • Flexible to work late hours to coordinate with offshore team.
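
For illustration of the star-schema design described above, a minimal sketch using Python's built-in sqlite3; the fact and dimension tables (sales, date, product) are hypothetical examples, not the actual data-mart entities:

```python
# Minimal sketch of a star schema: one fact table with foreign keys to
# dimension tables. Table and column names are hypothetical examples.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
CREATE TABLE dim_date (
    date_key    INTEGER PRIMARY KEY,   -- surrogate key
    full_date   TEXT,
    month       INTEGER,
    year        INTEGER
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,   -- surrogate key
    name        TEXT,
    category    TEXT
);
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,
    amount      REAL
);
""")
conn.commit()
print("Star schema tables:", [r[0] for r in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")])
```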

Environment: SQL DCL, DDL, DML, Tableau visualizations, Data Warehouse/OLAP, Bubble Charts, Pie Chart, Gantt chart.

Confidential

Data Analyst

Responsibilities:

  • Documented System Proposal involving system request, feasibility analysis and requirements definition.
  • Performed Analysis Modelling using Use-case diagrams and descriptions, Class Diagrams and Sequence Diagrams.
  • Performed Design Modelling using Package Diagrams, Database design and Data Access and Manipulation Design.
  • Developed test cases, established traceability between requirements and test cases.
  • Identified key performance indicators (KPIs) and created reports.
  • Obtained sample data (signature) from the user and stored it in the form of points in Oracle database.
  • Performed data pre-processing; after cropping the data area, identified the relevant data points to be stored.
  • Extracted data from different sources like Oracle and text files using SAS/Access, SAS SQL procedures and created SAS datasets.
  • Identified critical features in the data, like pressure, velocity, and angle, and stored them in the database.
  • Used SAS Proc SQL pass through facility to connect to Oracle tables and created SAS datasets using various SQL joins such as left join, right join, inner join and full join.
  • Extensively used SAS procedures such as MEANS and FREQ, along with other statistical calculations, for data validation.
  • Performed hypothesis testing using SAS to check whether the difference in population means is significant; an analogous Python sketch appears after this list.
  • Utilized concepts related to statistical analysis, multicollinearity, correlation, and ANOVA.
  • Gathered data from multiple sources and interpreted the data to draw conclusions for managerial actions and strategy
  • Embedded SQL queries in Excel and used Excel functions to calculate parameters like standard deviation, angle, and velocity, which were used to compare new signatures with the stored ones.
  • Generated customized reports using SAS/MACRO facility, PROC REPORT, PROC TABULATE and PROC SQL
  • Created and delivered reports and presentations with key findings and recommendations.
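
The hypothesis testing above was performed in SAS; an analogous minimal sketch in Python with scipy.stats, using synthetic placeholder samples:

```python
# Minimal sketch: two-sample t-test for a difference in population means.
# The samples are synthetic placeholders; the original analysis used SAS.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=10.0, scale=2.0, size=100)
group_b = rng.normal(loc=10.5, scale=2.0, size=100)

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Difference in means is statistically significant at alpha = 0.05.")
else:
    print("No significant difference in means at alpha = 0.05.")
```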

Environment: SAS/MACRO, PROC REPORT, PROC TABULATE, PROC SQL, SAS.
