We provide IT Staff Augmentation Services!

Data Scientist Resume

New Hyde Park, NY


  • Over 8+ years of experience in system analysis, design, development and testing in various projects, 7+ years of experience in MS SQL Server Database Development, 5+ years of SSIS, SSMS, SSRS and SSAS Microsoft BI Stack, 5+ years of Data warehouse development experience.
  • Experience in working with NoSQL Databases like MongoDB
  • Over 2+ years of experience in IBM DataStage design and development of parallel ETL jobs
  • Over 2+ years of experience in Unix shell scripting, for ETL job run automation and Datastage engine admin
  • Over 2+ years of experience in loading, optimizing and querying Netezza/IBM Pure Data appliance
  • Over 3+ years working with Microsoft SQL Server Change Data Capture, utilized it for incremental loading
  • Over a year working with Reference Data Management (RDM) for Data Warehouse
  • Over 2+ years using R and Python to build Machine Learning Algorithms for predictive models
  • Over 4+ years of experience in design, development in Oracle PL/SQL Development, version 12c/11g
  • Expert level skills in Data Modeling, Data Mapping, Table Normalization, Table Denormalization, Optimization and Tuning, RDBMS concepts and constructs, Kimball and Inmon Data warehousing methodologies.
  • Experience in software Analysis, Design, Development, Testing, Implementation and Production Support of Client/Server and Web based applications.
  • Proficient in Installing SQL Server 2016 - 2008 configuration of their tools.
  • Hands on experience in migration of database from SQL Server 2012 to SQL Server 2016.
  • Experience in various Development Methodology like AGILE, WATERFALL, SCRUM
  • Expert in TSQL & PL/SQL DDL/DML, performed most of the SQL Server Enterprise Manager and Management studio functionality using T-SQL Scripts and Batches .
  • Expert in creating indexed Views, complex Stored Procedures, effective Functions, and appropriate Triggers to facilitate efficient data manipulation and data consistency.
  • Expert in Data Extraction, Transforming and Loading (ETL) using SQL Server Integration Services (SSIS), DTS, Bulk Insert, BCP. From sources like Oracle, Excel, CSV, XML
  • Expert writing SSIS Configuration, Logging, Procedural Error Handling, Custom Logging and Data Error Handling, Master Child Load methods using Control tables.
  • Experience in using tools like index Tuning Wizard, SQL Profiler, and Windows Performance Monitor for Monitoring and Tuning MS SQL Server Performance.
  • Experience in creating Jobs, Alerts, SQL Mail Agent, and scheduled DTS and SSIS Packages.
  • Good knowledge in Star Schema and Snowflake Schema in Data Warehouse in Dimensional Modeling.
  • Extensive experience creating SSRS reports like Parameterized report, Bar Report, Chart Reports, Linked Report, Sub Report, Dashboard, Scorecards reports.
  • Extensive experience creating ROLAP/MOLAP/HOLAP and KPI’s using SSAS cubes, processing cube and deploying cube to Production.
  • Implementation of Master Data Management in Data Warehouse ensuring data governance, removal of duplicate records, data cleansing and profiling.
  • Extensive hands on experience in core .Net technologies like ASP, ASP. NET, ADO.NET, HTML, AJAX, CSS, C#.NET, VB, XML to create web forms .
  • Experience in designing Database Models using MS Visio and ER-Win.
  • Experience writing deployment script using DOCKER, SQLCMD, RS, DTUTIL and release document for release management team.
  • Worked extensively on system analysis, design, development, testing and implementation of projects (Complete SDLC) and capable of handling responsibilities independently as well as a proactive team member.
  • Good SQL Server Administration skills including backup recovery, database maintenance, user authorizations, Database creation, Tables, indexes, Partitions, running Database Consistency Checks using DBCC.
  • Expert in designing complex reports like reports using Cascading parameters, drill-through and drill down Reports, Parameterized Reports and Report Models and ad-hoc reports using SSRS, COGNOS, Qlick View, Tableau, Zoho, Crystal Reports and Excel Pivot table based on OLAP cubes which make use of multiple value selection in parameters pick list, and matrix dynamics reports
  • Good Business and System analysis skills to understand the requirements and customize client


Operating Systems: Windows 7.0/10, Unix

Databases: MS SQL Server 2016-2008 R2,Oracle 12c, My SQL, MS Access,DB2 NetezzaNoSQL DB: Mongo DB

ETL Tools: SQL Server Integration Services(SSIS), IBM DataStage

Database Tools: SQL Profiler, Management studio, Index Analyzer, SQL Agents, SQL Alerts, Visual Source Safe. Microsoft SQL Server CDC, IBM CDC, AWS Ec2, AWS RDS, MapReduce

Languages: R,T-SQL, Visual Basic 6.0, C, C++, C#, JAVA, HTML, PL/SQL,VBA,PYTHON, Hadoop, Spark

Reporting Tools: SQL Server Reporting Services (SSRS), Tableau,, MS Excel

DB Modeling Tools: Erwin. Embarcadero

Frameworks: Docker


Confidential, New Hyde Park, NY

Data Scientist


  • Worked with Perficient Consulting group to build an Enterprise Data warehouse solution for Northwell Health
  • Developed predictive models with Clinical Analysts for LACE score index calculations with R machine learning algorithm
  • Performed Statistical analysis using linear regression to find correlation between data points
  • Inferred model results based on multivariate prediction models like K nearest neighbors, deep neural networks, Random forest, logistic regression, Naïve Bayes etc. with Bayesian inference method.
  • Analyzed and processed complex datasets about patient readmission rates and ways to reduce it.
  • Designed and developed DataStage code for extracting source data into Persistent stage layer then to a Standard Interface gateway using SCDII (type 2 slowly change dimension) methodology and Microsoft Change Data capture
  • Enhanced the vendors ETL framework to ensure a more dynamic and parameterized design to extract data from source tables using ETL configuration tables
  • Maintained over 100 ETL jobs that populate IBM UDMH (Unified Data Model for Healthcare) Atomic Model and Dimensional model
  • Extensive understanding of Linstedt Data Vault modeling and ETL loading implementation and architecture
  • Designed ETL’s to populate Anchor tables, array tables, bridge tables, Detail tables and Reference tables
  • Loaded data into Netezza database, ensuring proper distribution of data across data slices with the most adequate distribution keys
  • Converted data mapping documents to ETL/Datastage jobs loading data into Netezza.
  • Created Unix scripts to execute jobs automatically and in a schedule
  • Debug and monitor parallel job failures and errors that arise from source system anomalies or interepratation of data mapping
  • Implement CDC from Source system to Atomic and Dimensional tables, ensuring hard and soft deletes are flagged at the record level
  • Create Data Validation scripts that would be used for verification correctness of ETL logic and transformations between source system tables and dimension tables.

Confidential, New Hyde Park, NY

Data Engineer


  • Designed and developed a SCD I incremental loading process from a SQL Server Data Source to a Oracle Staging Database using IBM Infosphere Change Data Capture, PLSQL packages and Datastage
  • Set up Infosphere Change Management console subscriptions to a SQL Server database, set bookmarks to start of the Log Reading and configured Oracle Target Tables for record insert
  • Designed and Developed DataStage jobs to process FULL Data loads from SQL Server Source to Oracle Stage
  • Imported Table Definitions using ODBC plug in in DataStage Repository
  • Designed one Parameterized job using Parametersets that can be re used for multiple tables implementing Run Time Column Propagation in DataStage
  • Created Datastage jobs that wrote into Parameter files so that subsequent jobs in the sequence can read it for proper execution
  • Designed Data Stage SEQUENCE Jobs that controlled ETL for multiple tables in a subject area and sends a success email once job is complete
  • Implemented complex Datastage Transformer logic for various business rules in the ETL
  • Designed and developed incremental load logic that used a control table that store the min and max LSN(transaction commit cycle ID) for the successfully loaded transactions.
  • Implemented Error Handling in Datastage and designed Error jobs to notify user and update log table
  • Designed performance boosting jobs that ran the Datastage job on 4 nodes taking advantage of Datastage Parallel execution of different partitions to reduce job run time
  • Implemented Microsoft SQL Server Change Data Capture for SQL Server Data sources, taking advantage of the inbuilt functions like sys.get min LSN, sys.get max LSN, and sys.net change
  • Installed SQL Server 2012 and Management tools using SQL Server Setup Program.
  • Created Unique index on business keys in table enabling data validation and integrity.
  • Designed and Developed an ETL Control Log table that records the number of inserts, update, delete and error message for each running process
  • Created Tables in Oracle specifying the appropriate tablespace, data storage parameters( initial extent, next extent, pct increase)

Confidential, New Hyde Park, NY

Senior Data Warehouse / IBM DataStage Developer


  • Understood business/data transformation rules, business structure and hierarchy, relationships, data transformation through mapping development.
  • Reverse Engineered the Stage Schema to create a Data Model using Embarcadero ER/Studio Architect.
  • Created a Data Lineage from Source Table to Stage Tables showing data mapping rules, transformations and business logic using Embarcadero ER/Studio Architect.
  • Developed SSIS packages to extract data from SQL Server Source and load into Oracle Database using Oracle Attunity Drivers Develop PL/SQL packages, procedures, functions, and views for business requirement
  • Worked extensively on Ref Cursor, External Tables and Collections.
  • Implemented large table partitions using RANGE Partition and LIST Partition techniques
  • Implemented and design procedures using FOR ALL and BULK Collection concepts
  • Experience with Performance Tuning for Oracle RDBMS using Explain Plan and HINTS.
  • Expertise in Dynamic SQL, Collections and Exception handling
  • Implemented Partition Exchange to migrate data for over 2 billion records efficiently from MS SQL Server to Oracle DB
  • Created Stage tables to process Source data per logical partition for ETL Parallelism in SSIS.
  • Created an SSIS Package to efficiently load CLOB columns from SQL Server to Oracle using Attunity drivers and Conditional Split transformation
  • Created Source to Target Mapping Documentation in excel to outlines the data conversion/transformation routines implemented from SQL Server Source tables columns to Oracle target Table columns
  • Created Source analysis documentation to describe the tables in source system with relationships outlining business keys
  • Modified table parameters to enable faster bulk load with NOLOGGING and Parallel hint
  • Enable parallel DML insert (Direct Load INSERT)
  • Developed SSIS packages to extract data from MS SQL DB to Oracle DB for over 2 billion records and 1 TB data volume
  • Developed Oracle Packages with Procedures for Full and Incremental ETL processes using Merge statement
  • Implemented SCD II in incremental loading, keeping the historical track of Inserted, updated and deleted records using SQL MERGE STATEMENTS, ensured to process only changed records
  • Implemented SCD I in incremental loading of huge historical table from source to a stage environment
  • Modified and validated Audit Trail Triggers capturing CDC records.
  • Partitioned tables with size over 100GB to increase availability and efficiency of full and incremental load into stage environment.
  • Created Mapping documents for data analytics vendor for various Source systems
  • Created Source to Target and Source Analysis documentation for various Source Systems imported to Stage Environment
  • Implemented Parallel data reading with separate data flows running in parallel and loading into one table
  • Participated in daily Stand up meeting working on Product Back log for Sprint goals Engaged in 3 week sprint in Scrum Development methodology in fast paced work envir

Environment: MS SQL SERVER 2014/2012, SSIS, SSRS, Business Intelligence Studio, DTS, Oracle12C, T-SQL, VSS 2005, Erwin r7.2, SQL profiler, Embarcadero ER Studio Data Architect, Dot NET framework 3.5, ASP.NET,C#, XML, PL/SQL, Star and Snowflake schema Data Models

Confidential, Murray Hill, NJ

SQL Server Database/ SSIS/TSQL


  • Installed SQL Server 2012 and Management tools using SQL Server Setup Program.
  • Use Tableau Desktop to analyze and obtain insights into large data sets
  • Created primary key and foreign key relationships between tables with appropriate cluster and non cluster indexes
  • Designed, reviewed, and created primary objects (views, indexes, etc.) based on logical design models, user requirements and physical constraints.
  • Partitioned Tables and Indexes to more than one file group in database
  • Developed a logical Data model and Star Schema based on client business process.
  • Wrote T-SQL queries, functions, Stored Procedures and used them to build packages.
  • Used SQL Server profiler for auditing and analyzing the events which occurred during a particular time horizon and stored in script.
  • Performed daily tasks including backup and restore by using SQL Server 2012 tools like SQL Server Management Studio, SQL Server Profiler, SQL Server Agent, and Database Engine Tuning Advisor.
  • Written complex Stored Procedures, Triggers, and Functions for SQL Server 2012.
  • Created traces using SQL server profiler to find long running queries and modify those queries as a part of Performance Tuning operations.
  • Extensively used joins and sub queries to simplify complex queries involving multiple tables.
  • Monitored Strategies, Processes and Procedures to ensure the Data Integrity, Optimized and reduced Data Redundancy, maintained the required level of security for all production and test databases.
  • Ensured Data Recovery, Maintenance and space requirements for physical Database.
  • Re-organized database structures as needed, automate procedures at regular intervals.
  • Optimized long running Stored Procedures & Queries for effective data retrieval.
  • Wrote PL/SQL queries, functions, Stored Procedures and used them against an Oracle Database
  • Tuned SSIS package performance by maximizing parallel processing in both Control flow and Data flow levels
  • Implemented complex C# scripting on ETL tool (SSIS) for error column name identification during data level transformation
  • Translating business requirements (BRD) into technical design specifications (FRD).
  • Utilized Excel Source and Microsoft Access as source for SSIS Extraction needs
  • Identified the Confirmed dimensions, Junk dimensions, SCDs based on client source System analysis.
  • Designed SSIS Packages to transfer data from various sources like Text Files, XML Files, Excel, Flat files to SQL Server2008/2012
  • Implemented default logging, Custom Logging and configured Error Handling to handle various OnError events
  • Scheduling the SSIS packages Jobs and prepared Runbook for the Production Support Team
  • Prepared Release Notes, Test plans and Deployment document for Release Team on various SSIS packages developed
  • Using SQL Server Reporting Services (SSRS) delivering enterprise, Web-enabled reporting to create reports that draw content from a variety of data sources.
  • Responsible for optimizing all indexes, SQL queries, stored procedures to improve the quality of software.
  • Transformations of data such as adding derived column, count of records, merging of data has been done while pulling data from Source to the Target.
  • Created xml configuration files so that packages can be executed on any server/database by changing the configuration path in the XML files.
  • Extensively used SQL Loader to Load Data from Flat Files to Oracle Database.
  • Played Active part in the development of SQL Server Maintenance plan, scheduling of Jobs, Alerts and Troubleshooting
  • Used Version Control software like TFS- Team Foundation Server Version control and STAR Team for package development history and audit.
  • OLAP model based on Dimension and FACTS for efficient loads of data based on Star Schema structure on levels of reports.
  • Implemented reconciliation processes, and identified and resolved inconsistencies

Environment: MS SQL SERVER 2008R2/2012, SSIS, SSRS, Business Intelligence Studio, DTS, Oracle10g, T-SQL, VSS 2005, Erwin r7.2, SQL profiler, Dot NET framework 3.5, ASP.NET,C#, XML, PL/SQL, Star and Snowflake schema Data Models

Confidential, Chicago, IL

SQL Database Developer/SSRS/SSAS Developer


  • Performed System Study and Requirements Analysis, prepared Data Flow Diagrams, Entity Relationship Diagrams, Data Diagrams, Table Structures.
  • Maintained Broker account databases ensuring that business rules are obeyed on database level
  • Created SSIS packages to load daily reports from different trading Platforms for application analysis
  • Design Excel Power pivot for SSAS cubes presentation and visualization of multidimensional data
  • Involved in Creation of Data Marts with Dimensions using STAR and SNOW FLAKE Schema.
  • Worked with offshore Development and Production Support team from Pune India. All daily jobs were supported by Offshore team and when issue arose, they production support team would involve the first level India Development team if it is off hours. If it cannot be resolved then team in US would be involved.
  • Created dynamic package that executes child packages based on expressions in execute package task using environment variables
  • Managed Security Master Table ensuring CUSIP, SEDOL or CINS security identifiers were maintained precisely
  • Identified Master Data for business Account Master, Portfolio Master, Asset Master
  • Wrote TSQL queries to ensure consistency and correctness of Master Data based on Business Requirements
  • Develop the Documents for Logging/Error Handling for SSIS Packages.
  • Published SSRS report to a SharePoint site configured for Report access in different departments
  • Schedule Jobs to run SSIS package in night feed to DSS system for fresh day data using SQL Server Agent.
  • Wrote complex T-SQL Queries to perform data validation and graph validation to make sure test results matched back to expected results based on business requirements.
  • Created pivot table to provide structured to Reports in proper manner like Commission Reports, AUM Reports, Holding Reports, Portfolio Reports, and Trade limit Reports.
  • Scheduled the Reports to run on daily and weekly basis in Report Manager and also email them to director and analysts to review in Excel Sheet using IBM Tivoli and Cron Job.
  • Participated in testing during the UAT (QA testing). Mentored and monitored team development and status.
  • Gathered Requirements from the end user and checked the structure of the schema and data with the Data Modeler.
  • Imported Metadata and created Presentation Layer, Business Layer and DataSource layer, Physical Layer.

Environment: SQL Server 2008/2008R2, SQL Server Integration Services, MS Excel 2012, MS Access 12 Enterprise Manager, Management Studio, SQL Profiler and SSRS, Net 2.5-3.0, Ado.net,Asp.net,HTTP

Confidential, Manhattan, NY

T-SQL/SSIS/SSRS/SQL Server Database Developer


  • Installed SQL Server 2008 and Management tools using SQL Server Setup Program.
  • Created SSRS reports showing various KPI’S like Medical Claims Ratio and Customer Satisfaction
  • Used SSIS packages to roll our data to Live Tables and to Medical Claim Processing Database.
  • Involved in extensive SSRS Fraud, Medical Claims reporting and regular maintenance and verification of reports.
  • Created SSIS packages for the extracting, transforming and loading the data from SQL Server 2008 and flat files.
  • Developed the SQL Server Integration Services (SSIS) packages to transform data from SQL 2005 to MS SQL 2008.
  • Convert all existing DTS package of MS SQL 2005 to 2008 by adding extra SSIS task.
  • Responsible for creating batches & scripts for implementing logical design to T-SQL.
  • Responsible for creating database objects like table, views, store procedure, triggers etc) to provide structure to store data and to maintain database efficiently.
  • Created Views and Index Views to reduce database complexities for the end users.
  • Perform T-SQL tuning and optimizing queries using SQL server 2008.

Environment: SQL Server 2005/2008, SQL Server Integration Services, MS Visio, MS Excel 2008, MS Access 2008, Enterprise Manager, Management Studio, MS SQL Query Analyzer, SQL Profiler and SSRS and Crystal report 9.0, .Net 4.0, WCF,HTML5,Javascript,Asp.net.

Hire Now