Data Scientist Resume
Brentwood, TN
PROFESSIONAL SUMMARY:
- 8+ years of experience in data analysis, statistical analysis, machine learning, and data mining with large structured and unstructured data sets in the banking, travel services, and manufacturing industries, with strong functional knowledge of business processes and the latest market trends.
- Proficient in Predictive Modeling, Data Mining Methods, Factor Analysis, ANOVA, hypothesis testing, normal distribution and other advanced statistical and econometric techniques.
- Developed predictive models using Decision Tree, Random Forest, Naïve Bayes, Logistic Regression, Cluster Analysis, and Neural Networks.
- Experienced across the full software development life cycle (SDLC) under Agile and Scrum methodologies.
- Strong SQL programming skills, with experience in working with functions, packages, and triggers.
- Excellent understanding of machine learning techniques and algorithms such as k-NN, Naive Bayes, SVM, and Decision Forests, as well as natural language processing (NLP).
- Worked with RDBMS including MySQL, DB2 and Oracle SQL.
- Experienced in Data Integration, Validation, and Data Quality controls for ETL processes and Data Warehousing using MS Visual Studio SSIS, SSAS, and SSRS.
- Experience in implementing Stored Procedures, Triggers, and Functions using T-SQL.
- Expert in developing data conversions/migrations from legacy systems of various sources (flat files, Oracle, non-Oracle databases) to Oracle systems using SQL*Loader, external tables, the appropriate interface tables and APIs, and Informatica.
- Experienced in Performance tuning of Informatica (sources, mappings, targets, and sessions)
- Hands-on experience in Teradata SQL analytics and Teradata utilities; familiar with creating secondary indexes and join indexes in Teradata.
- Strong working experience on Teradata query performance tuning by analyzing CPU, AMP Distribution, Table Skewness, and IO metrics.
- Excellent data mining skills: able to sift through large structured and unstructured datasets, identify patterns within the data, analyze the data, and interpret results into actionable insights and business value.
- Well versed in statistical programming languages and adept at writing code in R, Python, and SAS, and at working on cloud platforms such as Azure ML and AWS ML.
- Experience in managing and maintaining IAM policies for organizations in AWS to define groups, create users, assign roles and define rules for role-based access to AWS resources.
- Hands-on experience in setting up databases in AWS using RDS, storage using S3 buckets, and configuring instance backups to S3 to ensure fault tolerance and high availability.
- Maintained and monitored Docker in a cloud-based service in production and set up a system for dynamically adding and removing web services from a server using Docker. Used Kubernetes to manage Docker container clusters.
- Configuration management using AWS CloudFormation, continuous integration with Jenkins, and AWS management (EC2, EBS, RDS, Route 53).
- Transformed traditional environments into virtualized environments with AWS EC2, S3, EBS, and ELB.
- Skills to build a fully automated, highly elastic cloud orchestration framework on AWS.
- Extensively worked with Teradata utility tools such as BTEQ, FastLoad, FastExport, MultiLoad, TPump, and TPT.
- Proficient in Tableau and R Shiny data visualization tools to analyze and obtain insights into large datasets and to create visually powerful, actionable, interactive reports and dashboards.
- Automated recurring reports using SQL and Python and visualized them on BI platforms such as Tableau.
- Worked with development tools such as Git and virtual machines (VMs).
- Experience in developing and analyzing data models and in writing simple and complex SQL queries to extract data from the database for data analysis and testing.
- Strong knowledge of all phases of the SDLC (Software Development Life Cycle): analysis, design, development, testing, implementation, and maintenance, with timely delivery against deadlines.
- Ability to thoroughly analyze a system's functional requirements and prepare BRDs (Business Requirement Documents), use cases, and testing documents.
- Expertise in defining the scope of a project after gathering business requirements, including constraints, assumptions, business impacts, project risks, and scope exclusions, and in conducting gap analysis, User Acceptance Testing (UAT), SWOT analysis, cost-benefit analysis, and ROI analysis.
- Proficient knowledge of statistics, mathematics, machine learning, recommendation algorithms and analytics with an excellent understanding of business operations and analytics tools for effective analysis of data.
- Excellent communication skills. Works successfully in fast-paced, multitasking environments, both independently and in a collaborative team; a self-motivated, enthusiastic learner.
TECHNICAL SKILLS:
Programming & Scripting Languages: C, C++, JAVA, PL/SQL.
Databases: MS Access, Oracle 12c/11g/10g/9i, MySQL, DB2
Statistical Software: SPSS, R, SAS.
ETL/BI Tools: Informatica PowerCenter 9.x, Tableau, Cognos BI 10, MS Excel, SAS, SAS/Macro, SAS/SQL
Cloud: AWS, S3, EC2.
Statistical Methods: Time Series, regression models, splines, confidence intervals, principal component analysis and Dimensionality Reduction, bootstrapping
BI Tools: Microsoft Power BI, Tableau, SSIS, SSRS, SSAS, Business Intelligence Development Studio (BIDS), Visual Studio, Crystal Reports, Informatica 6.1.
Data Modelling: Erwin r9.6, 9.5, 9.1, 8.x, Rational Rose, ER/Studio, MS Visio, SAP PowerDesigner.
Teradata Utilities: BTEQ, FastLoad, FastExport, MultiLoad, TPump and TPT
Database Tools: Toad, SQL Developer, PL/SQL Developer, SQL*Loader, SQL*Plus, Informatica PowerCenter 9.5.1.
Operating Systems: Windows (10, 7, Vista, XP), UNIX, Linux.
PROFESSIONAL EXPERIENCE:
Confidential, Brentwood, TN
Data Scientist
Responsibilities:
- Worked as a Data Modeler/Analyst to generate data models using Erwin and developed a relational database system.
- Analyzed the business requirements of the project by studying the Business Requirement Specification document.
- Worked extensively with the Erwin Data Modeler tool to design the data models.
- Setup storage and data analysis tools in Amazon Web Services cloud computing infrastructure.
- Participated in a highly immersive Data Science program involving Data Manipulation & Visualization, Web Scraping, Machine Learning, SQL, Git, Unix Commands, NoSQL, and MongoDB.
- Transformed the Logical Data Model into the Physical Data Model (PDM) in Erwin, ensuring the Primary Key and Foreign Key relationships in the PDM, consistency of definitions of data attributes, and Primary Index considerations.
- Designed mappings to process the incremental changes that exist in the source table. Whenever source data elements were missing in source tables, these were modified or added in consistency with the third-normal-form-based OLTP source database.
- Performed social network analysis and topic modeling in R on employee chat data and developed Sankey plots to understand the communication paths, the strength of relations between agents, and the topics frequently discussed between them.
- Analyzed employee behavior and performance data and developed Shiny dashboards to evaluate team preparedness through metrics, which helped evaluate leadership skills, agent experiences, agent behavior, and customer sentiment.
- Developed SQL procedures to synchronize the dynamic data generated from GTID systems with the Azure SQL Server.
- Constantly monitored the data and models to identify scope for improvement in processing and in the business. Manipulated and prepared the data for data visualization and report generation. Performed data analysis and statistical analysis, and generated reports, listings, and graphs.
- Designed tables and implemented the naming conventions for Logical and Physical Data Models in Erwin 7.0.
- Provided expertise and recommendations for physical database design, architecture, testing, performance tuning, and implementation.
- Designed logical and physical data models for multiple OLTP and Analytic applications.
- Extensively used the Erwin design tool and Erwin Model Manager to create and maintain the data mart.
- Designed the physical model for implementing the model in the Oracle 9i physical database.
- Involved in data analysis, primarily identifying datasets, source data, source metadata, data definitions, and data formats.
- Performed database performance tuning, including indexes, optimizing SQL statements, and monitoring the server.
- Wrote simple and advanced SQL queries and scripts to create standard and ad hoc reports for senior managers.
- Collaborated on the data mapping document from source to target and on the data quality assessments for the source data.
- Created S3 buckets and managed roles and policies for S3 buckets. Utilized S3 buckets and Glacier for file storage and backup on the AWS cloud. Used DynamoDB to store the data for metrics and backend reports.
- Worked with Elastic Beanstalk for quick deployment of services such as EC2 instances, load balancers, and RDS databases in the AWS cloud environment.
- Used Java code with the AWS SDK to connect to AWS S3 buckets and access media files related to the application.
- Used Amazon Simple Workflow Service (SWF) for data migration in data centers, automating the process and tracking every step, with logs maintained in an S3 bucket.
- Designed and developed user interfaces and customization of Reports using Tableau and OBIEE and designed cubes for data visualization, mobile/web presentation with parameterization and cascading.
- Performed Data Analysis and Data Profiling and worked on data transformations and data quality rules.
- Created SSIS Packages using Pivot Transformation, Execute SQL Task, Data Flow Task, etc. to import data into the data warehouse.
- Developed and implemented SSIS, SSRS and SSAS application solutions for various business units across the organization.
Environment: SQL, Git, Unix Commands, NoSQL, MongoDB, SSIS, SSRS, SSAS, AWS, S3, EC2, RDS, SWF, DynamoDB, Glacier, Erwin, Tableau, OBIEE.
Confidential, Chicago, IL
Data Scientist
Responsibilities:
- Collected, cleaned, filtered, and transformed data to the specified format.
- Created captivating interactive visualizations and presentations to enhance management's decision-making capabilities.
- Developed novel applications of classification, forecasting, simulation, optimization, and summarization techniques to enhance effective decisions.
- Prepared the workspace for Markdown. Performed data analysis and statistical analysis, and generated reports, listings, and graphs.
- Found outliers, anomalies, and trends in given data sets.
- Assisted in migrating data using the Data Pump Export/Import utility.
- Provided daily change management process support, ensuring that all changes to program baselines were properly documented and approved; maintained, managed, and issued change schedules.
- Developed, installed, maintained, and monitored company databases in a high-performance/high-availability environment with supported configuration and performance tuning to ensure optimal resource usage.
- Documented all programs and procedures to ensure an accurate historical record of work completed on the assigned project as well as to improve quality and efficacy.
- Implemented various types of change data captures according to source data behavior and business requirements.
- Implemented various performance tuning techniques in ETL and Teradata BTEQ for efficient development and performance.
- Used Simple Storage Service (S3) for snapshots and configured S3 lifecycle rules for application and database logs, including deleting old logs and archiving logs based on the retention policies of the applications and databases.
- Created and maintained Logical and Physical models for the data mart and created partitions and indexes for the tables in the data mart.
- Performed data profiling and analysis, applied various data cleansing rules, designed data standards and architecture, and designed the relational models.
- Created new data designs and ensured that they fall within the overall Enterprise BI Architecture.
- Built models using statistical techniques such as Bayesian HMM and machine learning classification models such as XGBoost, SVM, and Random Forest.
- Setup storage and data analysis tools in Amazon Web Services cloud computing infrastructure.
- Created the logical data model from the conceptual model and converted it into the physical database design using Erwin 9.6.
- Used the SAS statistical regression method and SAS/REG polynomial simulation in Excel to simulate the anisotropic trend as 1D depth functions. Validated the simulated function by the image quality of depth migration.
- Tested the migrated data processing system on Google Cloud with velocity model updating tasks.
- Designed and Developed Oracle PL/SQL and Shell Scripts, Data Import/Export, Data Conversions and Data Cleansing.
- Responsible for the development of target data architecture, design principles, quality control, and data standards for the organization.
- Worked with DBA to create Best-Fit Physical Data Model from the Logical Data Model using Forward Engineering in Erwin.
- Produced quality reports for management for decision making.
- Participated in all phases of research, including data collection, data cleaning, data mining, developing models, and visualizations.
- Redefined many attributes and relationships and cleansed unwanted tables/columns using SQL queries.
- Utilized Spark SQL API in PySpark to extract and load data and perform SQL queries.
Environment: ETL, Teradata BTEQ, S3, XGBoost, SVM, Random Forest, AWS, Oracle PL/SQL, Erwin 9.6, DBA, SQL, Shell Script, HMM, Spark SQL, PySpark.
Confidential, Houston, TX
Data Analyst/Modeler
Responsibilities:
- Developed the logical data models and physical data models that capture current state/future state data elements and data flows using ER Studio.
- Delivered dimensional data models using ER/Studio to bring in the Employee and Facilities domain data into the Oracle data warehouse.
- Developed the design and process flow to ensure that the process is repeatable.
- Performed analysis of the existing source systems (transaction database).
- Involved in maintaining and updating Metadata Repository with details on the nature and use of applications/data transformations to facilitate impact analysis.
- Created DDL scripts using ER Studio and source to target mappings to bring the data from source to the warehouse.
- Designed the ER diagrams, logical model (relationships, cardinality, attributes, and candidate keys), and physical database (capacity planning, object creation, and aggregation strategies) for Oracle and Teradata.
- Imported and cleansed high-volume data from various sources such as Teradata, Oracle, flat files, and MS SQL Server.
- Designed the Logical and Physical Data Models, metadata, and data dictionary using Erwin for both OLTP- and OLAP-based systems.
- Reverse Engineered DB2 databases and then forward engineered them to Teradata using ER Studio.
- Part of a team conducting logical data analysis and data modeling JAD sessions; communicated data-related standards.
- Involved in meetings with SMEs (subject matter experts) to analyze the multiple sources.
- Wrote and optimized SQL queries in Teradata.
- Identified, assessed, and communicated potential risks associated with testing scope, product quality, and schedule.
- Wrote and executed SQL queries to verify that data has been moved from transactional system to DSS, Data warehouse, data mart reporting system in accordance with requirements.
- Imported and cleansed high-volume data from various sources such as Teradata, Oracle, flat files, and SQL Server 2005.
- Worked extensively with ER Studio for multiple operations across Atlas Copco in both OLAP and OLTP applications.
- Generated comprehensive analytical reports by running SQL queries against current databases to conduct data analysis.
- Wrote PL/SQL statements and stored procedures in DB2 for extracting and writing data.
- Coordinated all teams to centralize metadata management updates and follow the standard naming standards and attribute standards for data and ETL jobs.
- Finalized the naming standards for data elements and ETL jobs and created a data dictionary for metadata management.
Environment: ER Studio, Business Objects XI, Rational Rose, DataStage, MS Office, MS Visio, SQL, SQL Server 2000/2005, Crystal Reports 9, SQL Server 2008, SQL Server Analysis Services, SSIS, Oracle 11g
Confidential, Franklin Lakes, NJ
Data Analyst
Responsibilities:
- Studied the business and its functionalities through communication with Business Analysts.
- Analyzed the existing database for performance and suggested methods to redesign the model for improving the performance of the system.
- Supported ad-hoc, standard reporting and production projects.
- Designed and implemented many standard processes that are maintained and run on a scheduled basis.
- Created reports using MS Access and Excel, applying filters to retrieve the best results.
- Developed stored procedures, SQL joins, and SQL queries to retrieve data for analysis, and exported the data to CSV and Excel files.
- Developed Data mapping specifications to create and execute detailed system test plans. The data mapping specifies what data will be extracted from an internal data warehouse, transformed and sent to an external entity.
- Analyzed business requirements, system requirements, and data mapping requirement specifications and communicated them to developers effectively.
- Documented functional requirements and supplementary requirements in Quality Center.
- Set up the environments to be used for testing and defined the range of functionalities to be tested as per technical specifications.
- Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
- Wrote and executed unit, system, integration, and UAT scripts in data warehouse projects.
- Wrote and executed SQL queries to verify that data has been moved from transactional system to DSS, Data warehouse, data mart reporting system in accordance with requirements.
- Troubleshot test scripts, SQL queries, ETL jobs, and data warehouse/data mart/data store models.
- Responsible for different Data mapping activities from Source systems to Teradata.
- Developed SQL scripts, stored procedures, and views for data processing, maintenance, and other database operations.
- Performed SQL tuning, optimized the database, and created the technical documents.
- Imported Excel sheets, CSV and other delimited data, and ODBC-compliant data sources into the Oracle database for data extraction, data processing, and business needs, using advanced Excel features.
- Designed and optimized SQL queries, pass-through queries, make-table queries, and joins in MS Access 2003 and exported the data to the Oracle database server.
- Compiled sales production and market penetration data for executive management. Data included employee activity, client coverage, and territory alignment analysis.
- Conducted business analysis, project assessment, and feasibility determination.
- Analyzed data feed requirements for Risk Management, Customer Information Management, and Analytic Support.
- Familiar with data and content migration using SAS migration utility for products that rely on metadata.
- Developed CSV files and reported offshore progress to management with the use of Excel Templates, Excel macros, Pivot tables and functions.
- Improved the accuracy and relevance of credit card clients' planning process reports and budget reports used for making high-level decisions.
- Managed all UAT deliverables to completion with overlapping releases.
Environment: SAS Enterprise Guide 4.0, OLAP Cube Studio, Stored Processes, SAS Management Console, Informatica 8.1, MS Excel, MS PowerPoint, MS Visio, MS Project Management, Teradata SQL Assistant, Enterprise Miner, SAS DI Studio, MS Access, SQL, SPSS, VBA, PL/SQL, Shell Scripting, Oracle, Oracle 10g.
Confidential
Data Analyst
Responsibilities:
- Used the SAS PROC SQL pass-through facility to connect to Oracle tables and created SAS datasets using various SQL joins such as left join, right join, inner join, and full join.
- Performed data validation and transformed data from the Oracle RDBMS into SAS datasets.
- Produced quality customized reports using PROC TABULATE, PROC REPORT styles, and ODS RTF, and provided descriptive statistics using PROC MEANS, PROC FREQ, and PROC UNIVARIATE.
- Developed SAS macros for data cleaning, reporting, and to support routine processing.
- Performed advanced querying using SAS Enterprise Guide: calculated computed columns, applied filters, and manipulated and prepared data for reporting, graphing, summarization, and statistical analysis, finally generating SAS datasets.
- Involved in Developing, Debugging, and validating the project-specific SAS programs to generate derived SAS datasets, summary tables, and data listings according to study documents.
- Created datasets as per the approved specification; collaborated with project teams to complete scientific reports and reviewed reports to ensure accuracy and clarity.
- Worked with data modelers to translate business rules/requirements into conceptual/logical dimensional models and worked with complex denormalized and normalized data models.
- Performed different calculations like Quick table calculations, Date Calculations, Aggregate Calculations, String and Number Calculations.
- Good expertise in building dashboards and stories based on the available data points.
- Created action filters, user filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
- Applied Agile Scrum methodology to implement the project life cycles of report design and development.
- Combined Tableau visualizations into Interactive Dashboards using filter actions, highlight actions etc. and published them to the web.
- Gathered business requirements and created business requirement documents (BRD/FRD).
- Worked closely with business leaders and users to define and design the data source requirements and data access. Coded, tested, identified, implemented, and documented technical solutions utilizing JavaScript, PHP, and MySQL.
- Created rich dashboards using Tableau Dashboard and prepared user stories to create compelling dashboards that deliver actionable insights.
- Worked with the manager to prioritize requirements and prepared reports on a weekly and monthly basis.
Environment: SQL Server, Oracle 11g/10g, MS Office Suite, PowerPivot, PowerPoint, SAS Base, SAS Enterprise Guide, SAS/MACRO, SAS/SQL, SAS/ODS, SQL, PL/SQL, Visio
Confidential
PL/SQL Developer
Responsibilities:
- Analyzed, created, and modified forms, reports, scheduled reports, and Oracle PL/SQL stored procedures, functions, packages, triggers, etc.
- Designed, developed and enhanced custom Forms and Reports according to the functional specification.
- Reviewed other developers' code and provided suggestions to improve performance.
- Provided expertise to development managers in designing and preparing proof-of-concept testing.
- Converted Forms & Reports from 6i to 10g.
- Wrote complex stored procedures, functions, database triggers, packages, and shell scripts.
- Handled exceptions through PL/SQL blocks.
- Created various feeds/downloads, nightly batch jobs, and other daily/monthly reports/downloads.
- Developed SQL*Loader scripts to load data from flat files into Oracle 10g database tables.
- Tuned batch jobs and other daily/monthly reports/downloads.
- Used Toad for creating and modifying procedures, functions, and triggers.
- Developed complex Oracle Forms providing extensive GUI features (multi-select drag and drop, graphical charts, automated system alerts and notifications etc.).
- Created the test plan for QA and the implementation plan for production rollout once unit testing was done.
- Performed SQL performance tuning using Explain Plan, the Trace utility, and TKPROF.
- Documented all Oracle Reports, Packages, Procedures, and Functions development specifications.
- Tuned SQL queries and performed refinement of the database design leading to significant improvement in system response time and efficiency.
- Prepared the test plan for QA testing before moving from Pre-Prod to the Production environment.
- Modified forms, reports, stored procedures, packages, triggers, functions, etc. to meet the business requirements.
Environment: Oracle 9i, Windows XP, SQL Developer, Toad, Informatica PowerCenter 9.0.
