Data Scientist Resume

SUMMARY:

  • Over 12 years of experience in designing, developing and delivering IT solutions to complex enterprise problems.
  • Knowledge of the CRISP-DM methodology for predictive modeling.
  • Experienced with machine learning algorithms such as logistic regression, random forest, KNN, SVM, neural networks, linear regression, lasso regression and k-means.
  • Implemented bagging and boosting to enhance model performance (see the sketch after this list).
  • Experience working with Python 3.5/2.7 (NumPy, Pandas, Matplotlib, NLTK and scikit-learn).
  • Experienced Tableau developer with expertise in building and publishing customized interactive reports and dashboards with custom parameters and user filters using Tableau 10.1/10.3.
  • Experience implementing data analysis with analytic tools such as Anaconda 4.0, Jupyter Notebook 4.x and Alteryx.
  • Comprehensive knowledge and experience in normalization/de-normalization, data extraction, data cleansing and data manipulation.
  • Solid ability to write and optimize diverse SQL queries, with working knowledge of RDBMSs such as SQL Server 2008, Oracle, Redshift and Netezza.
  • Expert in Informatica PowerCenter 9.x/8.x (Designer, Workflow Manager, Workflow Monitor), PowerConnect and PowerExchange.
  • Experienced in the analysis, design, development, testing and implementation of data warehouse solutions for the financial and retail sectors.
  • Excellent at analyzing and documenting business requirements in functional and technical terminology.
  • Strong functional experience in FMCG sales (direct and retail): knowledge of retail and organized trade for FMCG products, the FMCG sales cycle and EDI interfaces.
  • Experience working with Agile (Scrum) and Waterfall methodologies.
  • Excellent interpersonal and communication skills.
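
A minimal sketch of the kind of bagging/boosting comparison mentioned above, using scikit-learn on synthetic data; the dataset, estimator counts and metric are illustrative assumptions, not details from any engagement.

```python
# Hedged sketch: compare a bagged ensemble and a gradient-boosted ensemble
# on synthetic data. Dataset, parameters and metric are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=5000, n_features=20, n_informative=8,
                           random_state=42)

models = {
    "bagging": BaggingClassifier(n_estimators=100, random_state=42),
    "boosting": GradientBoostingClassifier(n_estimators=100, random_state=42),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean ROC AUC = {scores.mean():.3f}")
```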

TECHNICAL SKILLS:

Machine Learning: Logistic Regression, Naive Bayes, Decision Tree, Random Forest, KNN, Linear Regression, Lasso, Ridge, SVM, Regression Tree, K-means

Analytic/Predictive Modeling Tools: Alteryx, KNIME, Statistica, Toad Data Point

Visualization Tools:  Tableau, Python - Matplotlib, Seaborn

Languages:  Python, R

ETL:  Informatica 9.x/8.x

Tools:  TOAD, PL/SQL Developer, SQL*Plus, SSMS

Infrastructure tools:  Git, Serena Dimensions, VSS

RDBMS:  Oracle 11g/10g/9i/8i, MS SQL Server 2008/2005, Netezza, Redshift

Data Modeling Tools:  Oracle Designer, Erwin r7.1/7.2

PROFESSIONAL EXPERIENCE:

Confidential  

Data Scientist

Responsibilities:

  • Identified patterns of behavior in customer migration to products and services.
  • Collected historical data and third-party data from different data sources
  • Worked on data cleaning and ensured data quality, consistency and integrity using Pandas and NumPy
  • Worked on outlier identification with box plots and K-means clustering using Pandas and NumPy
  • Participated in feature engineering such as feature interaction generation, feature normalization and label encoding with scikit-learn preprocessing
  • Modeled customers to discover untapped business opportunities.
  • Recommended product strategies to minimize attrition of profitable customers, enhance retention of promising customers and encourage migration towards profitable segments.
  • Improved CRM activities using linear and logistic regression, decision trees (random forest), market basket analysis and conjoint analysis
  • Used F-score, precision and recall to evaluate model performance
  • Performed real-time analysis of customers' financial profiles and provided recommendations for the best-suited financial products
  • Collected historical data and third-party data from different data sources and performed data integration using Alteryx
  • Performed data cleansing, data imputation and data preparation using Pandas and NumPy
  • Performed exploratory data analysis using Matplotlib
  • Built machine learning models, including logistic regression and random forest, to score and identify potential new business cases with Python scikit-learn (see the sketch after this list)
  • Used F-score, precision and recall to evaluate model performance
  • Prepared and presented data quality reports to stakeholders to give them an understanding of the data
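
A minimal sketch of the scoring workflow described in this list: a box-plot (IQR) outlier screen, label encoding and scaling with scikit-learn preprocessing, logistic regression and random forest models, and evaluation with precision, recall and F-score. The file name, column names and parameters are hypothetical placeholders, not the actual project data.

```python
# Hedged sketch of the scoring workflow: IQR outlier screen, label encoding,
# scaling, then logistic regression and random forest scored with precision,
# recall and F1. File, columns and parameters are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

df = pd.read_csv("customers.csv")          # hypothetical input extract
numeric_cols = ["balance", "tenure"]       # hypothetical numeric features
categorical_cols = ["segment", "region"]   # hypothetical categorical features
target_col = "converted"                   # hypothetical binary target

# Box-plot (IQR) rule: drop rows with extreme values in numeric features.
for col in numeric_cols:
    q1, q3 = df[col].quantile([0.25, 0.75])
    iqr = q3 - q1
    df = df[df[col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]
df = df.copy()

# Label-encode categoricals and normalize numerics (scikit-learn preprocessing).
for col in categorical_cols:
    df[col] = LabelEncoder().fit_transform(df[col].astype(str))
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])

X = df[numeric_cols + categorical_cols]
y = df[target_col]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

models = {"logistic regression": LogisticRegression(max_iter=1000),
          "random forest": RandomForestClassifier(n_estimators=200,
                                                  random_state=42)}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(f"{name}: precision={precision_score(y_test, pred):.3f} "
          f"recall={recall_score(y_test, pred):.3f} "
          f"f1={f1_score(y_test, pred):.3f}")
```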

Confidential 

Senior Data Integration / ETL Developer

Responsibilities:

  • Working as ETL lead developer on an Agile team for the consumer and partner business groups
  • Build data integration solutions to meet functional and non-functional requirements
  • Work with analysts and business users to translate functional specifications into technical designs
  • Participate in requirement definition, solution development and implementation phases of data warehouse and reporting projects.
  • Define logical and physical data models for the B2C ecommerce business.
  • Analyze and design data analysis and data integration solutions for disparate systems
  • Create new data models for subject areas and data marts with ER/Studio
  • Implement process and logic to extract, transform and distribute data across one or more data stores
  • ETL development using Informatica in Microsoft SQL Server environment and all aspects of data warehouse reporting. The objective is transition of legacy data warehouse to a structure focused on supporting customer behavior analytics.
  • Design and develop ETL specifications, processes, and documentation to produce required data deliverables (data profiling, source to target maps, ETL flows, scripts, etc.).
  • Support of ETL environment including but not limited to automation, job scheduling, monitoring, maintenance, security, and administration.
  • Configure and tune ETL mappings to optimize the data warehousing architecture
  • Troubleshoot and resolve data, system issues and performance issues.
  • Experience working on the Netezza MPP platform
  • Migration from SQL Server to Netezza, converting SQL Server stored procedures to Netezza stored procedures
  • Migrating Informatica mappings from SQL Server to Netezza (see the sketch after this list)
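
A minimal sketch of a watermark-driven incremental extract/load between SQL Server and Netezza over ODBC, to illustrate the kind of migration logic listed above; the DSNs, control table and table/column names are hypothetical assumptions, and the production work was implemented in Informatica mappings and Netezza stored procedures.

```python
# Hedged sketch: watermark-driven incremental extract from SQL Server into a
# Netezza staging table over ODBC. DSNs, control table and table/column names
# are hypothetical placeholders, not the actual project objects.
import pyodbc

SRC_DSN = "DSN=sqlserver_src"   # hypothetical source ODBC DSN
TGT_DSN = "DSN=netezza_tgt"     # hypothetical target ODBC DSN

with pyodbc.connect(SRC_DSN) as src, pyodbc.connect(TGT_DSN) as tgt:
    src_cur, tgt_cur = src.cursor(), tgt.cursor()

    # Last successfully loaded timestamp kept in a (hypothetical) control table.
    tgt_cur.execute("SELECT last_loaded_at FROM etl_control WHERE job_name = ?",
                    "orders_load")
    watermark = tgt_cur.fetchone()[0]

    # CDC-style filter: pull only rows changed since the watermark.
    src_cur.execute(
        "SELECT order_id, customer_id, amount, updated_at "
        "FROM dbo.orders WHERE updated_at > ?", watermark)
    rows = src_cur.fetchall()

    if rows:
        # Bulk insert into the staging table, then advance the watermark.
        tgt_cur.executemany(
            "INSERT INTO stg_orders (order_id, customer_id, amount, updated_at) "
            "VALUES (?, ?, ?, ?)", [tuple(r) for r in rows])
        tgt_cur.execute("UPDATE etl_control SET last_loaded_at = ? "
                        "WHERE job_name = ?",
                        max(r.updated_at for r in rows), "orders_load")
    tgt.commit()
```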

Confidential 

Senior Data Integration / ETL Developer

Responsibilities:

  • Working as ETL lead developer for the consumer and partner business groups
  • Build data integration solutions to meet functional and non-functional requirements
  • Work with analysts and business users to translate functional specifications into technical designs
  • Participate in requirement definition, solution development and implementation phases of data warehouse and reporting projects.
  • Analyze and design data analysis and data integration solutions for disparate systems
  • Create new data models for subject areas and data marts with ER/Studio
  • Implement process and logic to extract, transform and distribute data across one or more data stores
  • ETL development using Informatica in Microsoft SQL Server environment and all aspects of data warehouse reporting.
  • Design and develop ETL specifications, processes, and documentation to produce required data deliverables (data profiling, source to target maps, ETL flows, scripts, etc.).
  • Support of ETL environment including but not limited to automation, job scheduling, monitoring, maintenance, security, and administration.
  • Configure and tune ETL mappings to optimize the data warehousing architecture
  • Troubleshoot and resolve data, system issues and performance issues.

Confidential 

Senior ETL Developer

Responsibilities:

  • Built data integration solutions to meet functional and non-functional requirements
  • Performed data modeling for new facts as per requirements
  • Produced technical designs for new mappings with details about data lineage
  • Defined logical and physical data models for the B2B ecommerce business.
  • Developed and implemented custom SDE mappings for source-dependent extraction of data for dimension and fact tables.
  • Developed and implemented custom SILOS mappings for source-independent loading of data for dimension and fact tables.
  • Developed and implemented custom PLP mappings for post-load processing of data, creating aggregates based on fact tables (see the sketch after this section)
  • Worked with the Data Warehouse Administration Console (DAC) to configure and schedule full and incremental loads
  • Designed, developed, and tested complex Informatica mappings/mapplets to extract data from Oracle tables using Informatica PowerCenter 9.0.1
  • Extensively worked on Mapping Variables, Mapping Parameters, Workflow Variables and Session Parameters for the CDC process during that period.
  • Extensively used debugger in identifying bugs in existing mappings by analyzing data flow and evaluating transformations.
  • Created and implemented mappings and mapplets using transformations such as Source Qualifier, Connected/Unconnected Lookup, Filter and Update Strategy.
  • Performance-tuned the mappings to achieve quicker execution times for multi-terabyte joins/lookups.
  • Responsible for creating complex queries for data retrieval and processing.

Environment: Informatica PowerCenter 9.0.1, Oracle 11g SQL, TOAD, Windows 2008, Sun Solaris, OBIA
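
A minimal pandas sketch of the post-load (PLP-style) aggregation step referenced above: rolling a sales fact up to a month/product grain. The file and column names are hypothetical placeholders; the production logic lived in Informatica PLP mappings.

```python
# Hedged sketch of a post-load (PLP-style) aggregation: roll a sales fact up
# to a month/product grain. File and column names are hypothetical placeholders.
import pandas as pd

fact = pd.read_parquet("w_sales_f.parquet")        # hypothetical fact extract
fact["month_key"] = (pd.to_datetime(fact["order_date"])
                       .dt.to_period("M").astype(str))

agg = (fact.groupby(["month_key", "product_key"], as_index=False)
           .agg(total_amount=("amount", "sum"),
                order_count=("order_id", "nunique")))

agg.to_parquet("w_sales_month_a.parquet")          # hypothetical aggregate table
```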

Confidential 

Technical Lead/Data Modeler

Responsibilities:

  • Technical lead for Commerce applications which includes different sales systems.
  • Conducted meetings and discussions with business and functional teams to understand and gather data warehouse requirements
  • Performed data modeling tasks for new enhancements
  • Created logical and physical data models using Oracle Designer
  • Designed star schemas for data marts by reverse-engineering reports and identifying data elements
  • Responsible for communication with functional and technical teams.
  • Conducted JAD sessions with DBAs and the application development team
  • Developed, implemented and enforced ETL best practices standards.
  • Designed, developed, and tested complex Informatica mappings/mapplets to extract data from external flat files, SQL Server and Oracle tables using Informatica PowerCenter 9.1.
  • Implemented Slowly Changing Dimension Type 1 and Type 2 logic for inserting and updating target tables to maintain history (see the sketch after this section).
  • Extensively worked on Mapping Variables, Mapping Parameters, Workflow Variables and Session Parameters for the CDC process during that period.
  • Extensively used debugger in identifying bugs in existing mappings by analyzing data flow and evaluating transformations.
  • Created and implemented mappings and mapplets using transformations such as Source Qualifier, Connected/Unconnected Lookup, XML, Router, Joiner and Stored Procedure.
  • Performance-tuned the mappings to achieve quicker execution times for multi-terabyte joins/lookups.
  • Generated complex workflows/worklets, taking into consideration the interdependencies between sessions and mappings, using Informatica PowerCenter.
  • Responsible for creating complex queries and PL/SQL procedures for data retrieval and processing.

Environment: Informatica PowerCenter 9.1/8.6, Oracle 10g SQL, Oracle 10g PL/SQL, TOAD, Windows 2008, Sun Solaris, XML, Oracle Designer
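
A minimal pandas sketch of the Slowly Changing Dimension Type 2 logic referenced above (expire the current version of changed keys, insert new versions). The column names (is_current, start_date, end_date), the helper apply_scd2 and the data are hypothetical illustrations; the production implementation used Informatica mappings.

```python
# Hedged sketch of SCD Type 2 maintenance with pandas: expire the current
# version of changed keys and append new versions. All column names and
# data are hypothetical.
import pandas as pd

HIGH_DATE = pd.Timestamp("9999-12-31")

def apply_scd2(dim, incoming, key, tracked, load_date):
    """Return the dimension with changed rows expired and new versions added."""
    dim = dim.copy()
    current = dim[dim["is_current"]]
    merged = incoming.merge(current[[key] + tracked], on=key, how="left",
                            suffixes=("", "_old"), indicator=True)

    is_new = merged["_merge"] == "left_only"
    attr_changed = pd.Series(False, index=merged.index)
    for col in tracked:
        attr_changed |= merged[col].ne(merged[col + "_old"])
    is_changed = (merged["_merge"] == "both") & attr_changed

    # Type 2: close out existing current rows for keys whose attributes changed.
    changed_keys = merged.loc[is_changed, key]
    expire = dim["is_current"] & dim[key].isin(changed_keys)
    dim.loc[expire, "end_date"] = load_date
    dim.loc[expire, "is_current"] = False

    # Insert new versions for changed keys and first versions for new keys.
    inserts = merged.loc[is_new | is_changed, [key] + tracked].copy()
    inserts["start_date"] = load_date
    inserts["end_date"] = HIGH_DATE
    inserts["is_current"] = True
    return pd.concat([dim, inserts], ignore_index=True)

# Example usage with a hypothetical customer dimension:
dim = pd.DataFrame({"customer_id": [1], "city": ["Austin"],
                    "start_date": [pd.Timestamp("2010-01-01")],
                    "end_date": [HIGH_DATE], "is_current": [True]})
incoming = pd.DataFrame({"customer_id": [1, 2], "city": ["Dallas", "Plano"]})
dim = apply_scd2(dim, incoming, "customer_id", ["city"],
                 pd.Timestamp("2012-06-01"))
print(dim)
```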

Confidential 

Technical Lead

Responsibilities:

  • Technical lead for Finance applications which includes suppliers and invoice systems.
  • Held discussions and meetings with the business team to gather requirements for data warehouse applications
  • Created logical and physical data models using Erwin after discussing requirements with business and functional teams
  • Designed star schemas for data marts
  • Involved in business requirement analysis and design
  • Responsible for preparing design document
  • Involved in creating conceptual, logical and physical data models for cleansing model
  • Responsible for developing and enforcing standards and best practices
  • Involved in data profiling, data quality, data cleansing and metadata management (see the sketch after this section)
  • Created complex Informatica mappings using Unconnected Lookup, Joiner, Rank, Source Qualifier, Sorter, Aggregator, dynamic Lookup and Router transformations to extract, transform and load data to the cleansing area
  • Wrote complex PL/SQL functions/procedures/packages
  • Developed Informatica workflows/worklets/sessions associated with the mappings using Workflow Manager
  • Wrote Test cases and executed test scripts in Cleansing area
  • Involved in performance tuning of Informatica mappings and PL/SQL scripts
  • Supported during QA/UAT/PROD deployments
  • Wrote UNIX shell scripts for file transfer/archiving.
  • Scheduled Informatica workflows/sessions using Control-M

Environment: Erwin 4.5, Informatica PowerCenter 8.2.1/8.5.1, Oracle 9i/10g, PL/SQL, SQL Server 2005, XML, Sun Solaris shell scripting, MS Excel, flat files
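
A minimal pandas sketch of the kind of data-profiling pass referenced above: per-column data types, null percentages, distinct counts and basic ranges. The input file is a hypothetical placeholder.

```python
# Hedged sketch of a simple data-profiling pass over a source extract:
# per-column dtype, null percentage, distinct count and numeric range.
# The input file and its columns are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("supplier_invoices.csv")   # hypothetical source extract

profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "non_null": df.notna().sum(),
    "null_pct": (df.isna().mean() * 100).round(2),
    "distinct": df.nunique(),
    "min": df.min(numeric_only=True),
    "max": df.max(numeric_only=True),
})
print(profile)
```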

Confidential 

IT Consultant

Responsibilities:

  • Analyzed Business requirements, prepared the Physical design, High level designs and technical specifications.
  • Involved in extracting data from integrated heterogeneous sources using Pro*C and sorting the extracted data using SyncSort methods.
  • Performed the preparation of technical design docs, source to target mapping.
  • Prepared the strategies for performance tuning and reusable components in the custom PL/SQL scripting.
  • Coded complex mappings using PL/SQL and was responsible for establishing connectivity with various data sources, including SQL Server, flat files, XML and SAP R/3.
  • Created mappings to generate statistics for the data flow and to send mail notifications when a particular load doesn’t meet the SLA.
  • Created UNIX scripts to call other programs and perform file maintenance, as well as Oracle database scripts to compress table partitions, rebuild indexes, exchange partitions and purge partitions.
  • Experienced in writing SQL queries, views and PL/SQL programs such as functions, stored procedures, packages and cursors.
  • Involved in preparing the QA checklist per standards; also prepared test cases based on the business requirements and documented them in the specified manner.
  • Prepared Technical Design Documents, Impact Document for the enhancement required by the client.

Environment: Oracle 8i, SQL Server 2003, Cognos ReportNet, UNIX, Windows, Erwin 4.1 & SQL Navigator.

Confidential 

Senior Developer

Responsibilities:

  • Member of the design team; developed an application to load data coming from different sales systems, validate the data and load it into targets.
  • Gathered requirements and performed data modeling for new requirements
  • Created logical and physical data models using Erwin
  • Designed 3NF schemas for the OLTP applications
  • Prepared Technical Design Documents, Impact Document for the enhancement required by the client.
  • Developed extraction, transformation and loading of data from different source systems using Informatica PowerCenter tools - Mapping Designer, Repository Manager, Workflow Manager and Workflow Monitor
  • Created complex mappings using transformations like Source qualifier, Sequence generator, Lookup, Joiner, filter, Update Strategy, Rank and aggregators.
  • Implemented Slowly Changing Dimension Type 1 and Type 2 for maintaining target tables.
  • Created workflows and worklets, taking into consideration the interdependencies between sessions and mappings, using tasks such as Command, Assignment, Control and Session.
  • Enhanced the performance of mappings and data access
  • Involved in preparing test plan and testing for ETL development

Environment: Informatica 6.2, Oracle 8i, UNIX, Windows, Erwin4.1, PL/SQL, TOAD

Confidential 

Senior Developer

Responsibilities:

  • Member of the design team; developed an application to load data from different sources, validate order data against master data, perform order-specific validation and enter valid orders in the NOTS database. Invalid orders were recorded in invalid-order tables for correction.
  • Developed software to check the validity of orders and populate the Order Management database and, if required, the Erroneous Orders database.
  • Tested the functionality against test cases.
  • Version Control Maintenance.
  • Prepared technical design documents for EDI interface.
  • Responsible for the release of EDI interface

Environment: Oracle 9i, Oracle PLSQL, TOAD

Confidential

Senior Developer

Responsibilities:

  • Member of the design and development teams for the third-party logistics (3PL) Inventory Management interface
  • Tested the functionality against test cases.
  • Version Control Maintenance.
  • Prepared technical design documents for 3PL IM interface.
  • Responsible for the release of EDI interface.

Environment: Oracle 9i, Oracle PLSQL, TOAD

Confidential 

Programmer Analyst

Responsibilities:

  • Involved in analysis and design of the Product Registration and Image Server
  • Developed and implemented an online GPS-based module
  • Tested the functionality against test cases.
  • Responsible for Version Control

Environment: Oracle 8i, Oracle PLSQL, JSP

Confidential 

Programmer Analyst

Responsibilities:

  • Member of development team for Health Care Codes Data Upload System
  • Tested the functionality against test cases.
  • Version Control Maintenance

Environment: Oracle 8i, Oracle PLSQL, TOAD
