Senior Data Scientist and Analyst Resume
Chicago, IL
SUMMARY:
- An accomplished Senior Data Scientist / Analyst with over 15 years' experience working with large data sets, including medical and retail data of up to one trillion records (roughly 1 TB in size). Researches and applies statistical methods through data science, builds machine learning and natural language processing applications, and finds solutions to business problems based on requirements.
- Applies data processing and business logic in big data environments (Hadoop and Apache Spark) using Hive, HBase and Scala. The resulting data feeds data science (R Studio) and machine learning work using k-means, linear regression, multiple regression, decision tree and recommendation-system algorithms; reports on this data are built in Tableau and SSRS.
- Strong experience with machine learning in R and big data products such as Hadoop and Spark; designs, implements and validates solutions in Apache Spark and Apache Hive using Scala on large state-of-the-art clusters.
- Strong experience in data modeling design and architecture: builds conceptual, logical and physical data models, and architects OLTP databases and OLAP (MOLAP, ROLAP) data warehouses and data marts.
- Over 14 years of professional BI/ETL experience leading and managing all phases of DW & BI projects (scoping study, requirements gathering, analysis, planning, design, development, testing and implementation) covering data warehousing, ETL and BI solutions for reputable companies and banks such as GE Financial and Target (retail and financials). Participated in more than 11 DW & BI project implementations. Responsible for managing and delivering all projects in the DW portfolio on time and within budget while ensuring strategic and business requirements are met.
- Specializes in enterprise data warehouse architecture, strategy studies, data warehouse implementation, DW/ETL/BI architecture, technology selection and proofs of concept. Recognized for developing reusable assets in the information management and business intelligence space and for guiding teams to nurture and promote innovation. Specializes in providing architecture solution blueprints that enable utilization of corporate assets for building out DW, BI and ETL solutions, and in managing multi-technology, multi-geography DW & BI implementations and data governance programs.
- Excellent experience designing and implementing DW & BI solutions using industry-leading technologies such as Informatica PowerCenter, MS SQL technologies (SSIS, SSRS, SSAS), Oracle, Teradata, Hadoop, Sybase and SAP BO Data Services; building logical and physical data models in tools such as CA ERwin Data Modeler and Embarcadero ER/Studio; designing ETL solutions in SSIS/Informatica to integrate data from various sources; and delivering end-user business intelligence solutions with best-in-class reporting and dashboard tools such as MicroStrategy and Cognos.
- Hands-on experience in Hadoop processing large sets of structured data using Sqoop, and semi-structured and unstructured data supporting systems application architecture using Flume. Able to assess business rules, collaborate with stakeholders and perform source-to-target data mapping, design and review.
- Expertise in DB2, Hadoop, Oracle 9i/10g, Informatica PowerCenter, SQL Server 2005/2008/2012/2014 Integration Services (SSIS) and Reporting Services (SSRS).
- Strong architecture experience in SQL Server database design, ETL processes, SSAS analysis and SSRS reporting, plus SQL performance tuning, isolation-level maintenance and security.
- Expertise in Data Warehouse/Data Mart, ODS, OLTP and OLAP implementations, combined with project scoping, analysis, requirements gathering, data modeling, effort estimation, ETL design, development, system testing, implementation and production support.
- Extensive ETL testing experience using Informatica 9.1/8.6.1/8.5/8.1/7.1/6.2/5.1 (PowerCenter/PowerMart: Designer, Workflow Manager, Workflow Monitor and Server Manager), Teradata and Business Objects.
- Strong experience in dimensional modeling using star and snowflake schemas, identifying facts and dimensions, and physical and logical data modeling using ERwin and ER/Studio.
- Experience with the Java programming language, writing Java code to perform tasks in Informatica.
- Strong knowledge of data science: extracting knowledge and insights from data in various forms, structured or unstructured; using data and analytical ability to find and interpret rich data sources; managing large amounts of data despite hardware, software and bandwidth constraints; merging data sources and ensuring consistency of datasets; creating visualizations to aid understanding; building mathematical models on the data; and presenting and communicating the resulting insights and findings.
- Professional experience in BI design and implementation using Informatica and SSIS, implementing business intelligence solutions through data warehouse / data mart design, ETL, OLAP, BI and client/server applications.
- Experience with database backup/restore, security, and SSIS/SSRS deployment.
- Strong knowledge of data warehouse and data mart creation, implemented using SSIS and SSAS.
- Ad-hoc reporting using Report Builder 3.0/2.0
- Data modeling using MS Visio/ERWIN.
- Designing data warehouses using star and snowflake schemas.
- Worked with various data sources such as flat file, CSV, Excel, OLE DB and XML in SSIS.
- Designing dimension and fact tables.
- OLAP cube design using MOLAP/ROLAP.
- Strong knowledge of creating database objects such as stored procedures, functions, triggers, views and indexes.
- Performance tuning of SSIS packages, stored procedures and SQL Queries
- DBA activities like database creation, back up, restore, maintenance, SQL Agent, SQL Server Profiler.
- Experience in MS SQL Server installation, configuration, performance tuning, client/server connectivity, query optimization, back-up/recovery, Stored Procedure Tuning and Trigger Implementations.
- Extensively used SQL Server Profiler, Index Tuning Wizard and Showplan to optimize SQL Server performance for poorly performing queries and indexes.
- Good knowledge of Crystal Reports.
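The dimensional-modeling skills above (star schema, facts and dimensions) can be sketched compactly. The following is a minimal illustration in Python with sqlite3; the table names and toy data are hypothetical, not from any actual project schema.

```python
import sqlite3

# In-memory database standing in for a warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables hold descriptive attributes.
cur.execute("CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT)")
cur.execute("CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, city TEXT, state TEXT)")
# The fact table holds measures plus foreign keys to the dimensions.
cur.execute("""CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    store_key   INTEGER REFERENCES dim_store(store_key),
    quantity    INTEGER,
    amount      REAL)""")

cur.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                [(1, "Widget", "Hardware"), (2, "Gadget", "Electronics")])
cur.executemany("INSERT INTO dim_store VALUES (?, ?, ?)",
                [(10, "Chicago", "IL"), (20, "Boston", "MA")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                [(1, 10, 5, 50.0), (2, 10, 2, 80.0), (1, 20, 3, 30.0)])

# A typical star-schema query: join the fact table to its dimensions, then aggregate.
rows = cur.execute("""
    SELECT p.category, s.state, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_product p ON f.product_key = p.product_key
    JOIN dim_store   s ON f.store_key   = s.store_key
    GROUP BY p.category, s.state
    ORDER BY p.category, s.state
""").fetchall()
```

The fact table stays narrow (keys plus measures) while descriptive attributes live in the dimensions; a snowflake schema would further normalize the dimension tables.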
PROFESSIONAL EXPERIENCE:
Confidential, Chicago IL
Senior Data Scientist and Analyst
Responsibilities:
- Designed an ETL pipeline using Hadoop; implemented and validated solutions in Apache Spark and Apache Hive using Scala on a large state-of-the-art cluster. The system reads big data files from various health care companies and pharmacy vendors, and uses Hadoop to ingest the data and save it to the Hadoop file system.
- Using R Studio, applied data science to find the correct model for the business using linear regression, logistic regression and SVM algorithms, and applied machine learning algorithms to support decisions, including decision tree, KNN and k-means.
- Applied machine learning expertise to research and recommend the best approaches to solving technology and business problems.
- Generated reports by company, by product, by geography and by vendor.
- Applied data science to the scenarios above based on business requirements.
- Created an HBase database to store structured data for applying business rules.
- Used Apache Spark with Scala to run the business-logic algorithms and extract the required information; Spark executes these algorithms very quickly using in-memory processing. Saved data in HBase by company and medical product, and by pharmacy vendor and product.
- Used Tableau and MS BI for reporting; depending on the business purpose, reports were created as graph, bar or pie charts.
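The classification work above was done in R Studio; as a language-neutral sketch of the KNN idea mentioned in these bullets, here is a minimal pure-Python version. The toy data and function name are illustrative, not from the project.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training points.
    `train` is a list of ((features...), label) pairs; distance is Euclidean."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy data (illustrative only): two well-separated clusters.
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]

print(knn_predict(train, (1.1, 1.0)))  # a point near the first cluster
```

In practice the choice of k and the distance metric are tuned against held-out data, which is what the model-selection step above refers to.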
Environment: R Studio, Apache Spark, Scala, Hadoop, Hive, HBase, Tableau, MS SSIS, IBM DB2, SQL Server 2014
Confidential, Chicago IL
Senior Data Scientist and Analyst
Responsibilities:
- Designed the database: identified attributes and created master tables, lookup tables and transaction/historical tables. Created table relationships in ERwin and eliminated redundant columns.
- Identified the business needs and architected the ETL process, the analysis/rules-engine stored procedures and the validation-test stored procedures.
- Used Informatica PowerCenter 8.6 for extraction, transformation and loading (ETL) of data into the data warehouse.
- Used Informatica PowerCenter Workflow Manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
- Used PL/SQL procedures in Informatica mappings to truncate data in target tables at run time.
- Extensively used the Informatica debugger to diagnose problems in mappings; also involved in troubleshooting existing ETL bugs.
- Used data science in R Studio to find the correct model and predict product sales, quantity and color for coming years. Applied linear regression, multiple regression and classification algorithms for data analysis, and used a recommendation engine for machine learning; the resulting data was shown as the selected product together with recommendations.
- Worked on extraction, transformation and loading (ETL) of data from various sources into data warehouses and data marts using Informatica PowerCenter (Repository Manager, Designer, Workflow Manager, Workflow Monitor and Metadata Manager), PowerExchange and PowerConnect as the ETL tools on Oracle, DB2 and SQL Server databases.
- Created database table schemas/designs using best practices; implemented table partitioning, page compression and columnstore indexes.
- Created complex SSIS jobs to run the back-end rules engine: extract data from a real-time Oracle database, transform it and load it into SQL Server. Each load runs the business rules and sends mail to business users.
- In SSIS, used variables to pass data between tasks; the config file holds all connection and server credentials.
- Created ad-hoc reports for users in Tableau by connecting various data sources.
- Handled Tableau admin activities: granting access, managing extracts and installations.
- Strong experience with Java OOP concepts; created Java code to run business rules and customized code to read files from a third-party server.
- Used SSRS reports to show the daily vendor order list and items by color/size/cloth type, as well as each item's sales by store/state/country. Used row/column grouping, graphical reports and complex expressions to customize data by column.
- Created complex reports using row/column grouping.
- Improved SQL performance by adding nonclustered indexes on columns used in WHERE clauses and joins; avoided cursors in favor of while loops.
- Used table partitioning to efficiently handle 1.5 trillion transaction records: with partitions based on a unique month-and-year number, queries read one small partition instead of all 1.5 trillion records. Added nonclustered indexes on this table to further increase performance.
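The partition-pruning idea in the last bullet (reading one month-and-year partition instead of scanning the whole table) can be sketched as follows. This is a hypothetical Python illustration of the concept, not SQL Server partitioning syntax; the row layout is made up.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction rows: (transaction_date, amount).
rows = [(date(2015, 1, 5), 10.0), (date(2015, 1, 20), 15.0),
        (date(2015, 2, 3), 7.5),  (date(2016, 1, 9), 20.0)]

def partition_key(d):
    """Unique month-and-year number, e.g. January 2015 -> 201501."""
    return d.year * 100 + d.month

# Loading: each row lands in the bucket for its month/year partition.
partitions = defaultdict(list)
for d, amount in rows:
    partitions[partition_key(d)].append((d, amount))

# Querying: a filter on the partition key reads one small bucket
# instead of scanning every row in the table.
jan_2015_total = sum(amount for _, amount in partitions[201501])
```

SQL Server implements the same idea declaratively with a partition function and partition scheme on the date key; the point is that a filter on the partition key touches only one slice of the data.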
Environment: R Studio, Hadoop, HBase, Hive, Oracle 10g, Informatica PowerCenter 8.1, SQL Server 2014, SSIS, SSRS, Tableau, C# scripting, Java, HTML, SSAS, table partitioning
Confidential, Chicago IL
Senior BI/ ETL Architect
Responsibilities:
- Designed database table schemas and created relationships; avoided data redundancy.
- Used SQL Server isolation levels to keep data accurate and committed.
- Tuned the performance of mappings by following Informatica best practices, and applied several methods to decrease workflow run times.
- Automated the Informatica jobs using UNIX shell scripting.
- Used Informatica Power Center Workflow manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
- Created procedures to truncate data in the target before the session run.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW tables and historical metrics.
- Enabled speedy reviews and first mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and PIG to pre-process the data.
- Worked with Java Spring to create Hadoop MapReduce scripts; implemented Java libraries for use in ETL tools.
- Improved SQL performance using a scale-up and scale-out approach, which helped applications run smoothly without server issues; increased server/memory capacity and kept production applications on independent servers.
- Implemented the data warehouse and data marts. The data marts cover sales, product details and vendor item selection/purchase details, helping the business get its daily reports.
- Created complex SSIS jobs to extract data from IBM DB2 and save it to the local SQL Server; performed data transformations and applied logical conditions to filter out data.
- In SSIS, used variables to pass data between tasks; the config file holds all connection and server credentials.
- Used C# scripts to perform complex business-rule calculations and send mail based on red flags; created mail in HTML format and sent it using the SSIS mail task.
- Created complex SSRS reports to show product info and product allocation, using row/column grouping.
- Used the BTEQ, FastExport (FEXP), FastLoad (FLOAD) and MultiLoad (MLOAD) Teradata utilities to export and load data to/from flat files.
- Improved SQL performance by adding nonclustered indexes on columns used in WHERE clauses and joins; avoided cursors in favor of while loops.
- Parameterized the mappings and increased the re-usability.
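The MapReduce work described above (parsing raw data, then populating aggregated tables in the EDW) follows a map-then-reduce shape that can be sketched in a few lines. The record format and field names here are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw records: "region,product,quantity" (illustrative format).
raw = ["east,widget,3", "west,widget,2", "east,gadget,4", "east,widget,1"]

def map_phase(line):
    """Parse one raw record and emit a (key, value) pair,
    as a MapReduce mapper would."""
    region, product, qty = line.split(",")
    return (region, product), int(qty)

def reduce_phase(pairs):
    """Sum the values for each key, as a MapReduce reducer would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

totals = reduce_phase(map_phase(line) for line in raw)
```

In a real Hadoop job the framework shuffles the mapper output so that all pairs for one key reach the same reducer; the dictionary here plays that grouping role.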
Environment: Oracle 9i, Informatica PowerCenter 8.1, SQL Server 2012, SSIS, SSRS, C# scripting, HTML, SSAS, table partitioning, PL/SQL Developer, Bourne shell, Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Java
Confidential, Minneapolis MN
Senior SQL Server SSIS SSRS / BI Developer
Responsibilities:
- Designed ETL packages using SSIS; based on the business group, the SSIS packages move data from IBM DB2 to SQL Server for analysis.
- Involved in database development using stored procedures, triggers, views and cursors.
- Used the Object Oriented Analysis and Design specification, Unified Modeling Language (UML), for visualizing, specifying, constructing and documenting data.
- Followed the Agile methodology in the SDLC.
- Created databases, tables, clustered/non-clustered indexes, unique/check constraints, views, stored procedures and triggers.
- Involved in migration of catalogs and reports from Cognos Series 7/CRN to Cognos 8 using migration utilities.
- Wrote efficient stored procedures for optimal system performance.
- Monitored performance and optimized SQL queries for maximum efficiency.
- Responsible for creating and modifying T-SQL stored procedures/triggers to validate data integrity.
- Created indexed views and appropriate indexes to reduce the running time of complex queries.
- Kept track of data manipulations using audit functionality.
- Generated custom and parameterized reports using SSRS.
- Created reports that call sub-reports in SSRS.
- Configured and deployed all reports (RDL, RDS) across various SDLC environments.
- Actively involved in developing complex SSRS reports involving sub-reports, matrix/tabular reports, charts and graphs.
- Responsible for creating datasets using T-SQL and stored procedures; also involved in data visualization.
- Created several packages in SSIS.
- Wrote custom components for SSIS packages in VB.NET.
- Scheduled packages to extract data from OLTP systems at specific time intervals.
- Used various transformation tasks to create ETL packages for data conversion.
- Implemented event handlers and error handling in SSIS packages and notified various user communities of process results.
- Assisted in the design of star and snowflake schemas.
- Responsible for rebuilding indexes and tables as part of performance tuning.
- Used SQL Server Agent for scheduling jobs and alerts.
- Took an active part in creating OLAP and ROLAP cubes and used MDX queries to retrieve data from the cubes.
- Performed query optimization and performance tuning.
- Optimized queries using SQL Profiler and Database Engine Tuning Advisor.
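Scheduled OLTP extraction, as in the SSIS packages above, typically keeps a high-water mark so each run pulls only rows added since the previous run. Here is a minimal sketch of that pattern in Python with sqlite3; the table and column names are illustrative, not from the project.

```python
import sqlite3

# Source OLTP table (illustrative): orders with an increasing id.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 10.0), (2, 20.0), (3, 30.0)])

staging = []          # stands in for the warehouse staging area
last_extracted = 0    # high-water mark persisted between scheduled runs

def run_extract():
    """One scheduled run: pull only rows newer than the last watermark."""
    global last_extracted
    rows = src.execute(
        "SELECT order_id, amount FROM orders WHERE order_id > ? ORDER BY order_id",
        (last_extracted,)).fetchall()
    staging.extend(rows)
    if rows:
        last_extracted = rows[-1][0]
    return len(rows)

first = run_extract()                       # picks up all existing rows
src.execute("INSERT INTO orders VALUES (4, 40.0)")
second = run_extract()                      # next run picks up only the new row
```

An SSIS package scheduled by SQL Server Agent does the same thing declaratively, with the watermark stored in a control table or package variable rather than a Python global.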
Environment: SQL Server 2005, SSIS, SSRS, C# scripting, ASP.NET, Oracle, DB2, Cognos 7
Confidential, Boston, MA
SQL Server Developer
Responsibilities:
- Upgrade experience with SQL Server; experience applying service packs at the database and OS level.
- Helped in the installation and configuration of MS SQL Server.
- Responsible for developing stored procedures, triggers, views and user-defined functions.
- Efficient use of joins and sub-queries in queries that involve data from multiple tables.
- Fine-tuned stored procedures to improve performance.
- Performed performance tuning on SQL queries, triggers and stored procedures.
- Used audit functions to keep track of changes made to the database.
- Responsible for design and normalization of the database tables.
- Used exception handling to catch and report errors.
- Created custom and ad-hoc reports using SSRS.
- Generated several drill-down and drill-through reports using SSRS.
- Responsible for creating SSIS packages to extract data from different sources and consolidate and merge it into one single source.
- Used DTS packages to implement Extraction, Transformation and Load (ETL).
Environment: SQL Server 2005, SSIS, SSRS, C# scripting, ASP.NET