Data Scientist Resume

New York

PROFESSIONAL SUMMARY

  • Over 8 years of experience in Data Analysis, Data Integration, Migration, Metadata Management (MDM) and Configuration Management.
  • Developed various Machine Learning applications with the Python Scientific Stack and R.
  • Experienced with Machine Learning and Deep Learning frameworks such as scikit-learn, TensorFlow and Keras.
  • Experienced Data Analyst with a solid understanding of Data Mapping, Data Warehousing (OLTP, OLAP), Data Mining, Data Governance and Data Management services with Quality Assurance.
  • Adept in Statistical Data Analysis, Exploratory Data Analysis, Machine Learning, Data Mining, Java and Data Visualization using R, Python, Base SAS, SAS Enterprise Guide, SAS Enterprise Miner, Tableau and SQL.
  • Strong working experience in the Financial and Insurance industries with substantial knowledge of front-end, back-end and overall end-to-end processes.
  • Proficient in data mining tools such as R, SAS, Python, SQL and Excel, with experience in Java ecosystems, Java development and staff leadership.
  • Proficient in preparing ETL Mappings (Source to Stage, Stage to Integration, ISD), Requirements gathering, Data Reporting, Data Visualization, building advanced business dashboards and presenting to clients.
  • Extensive experience in Object-Oriented Analysis and Design (OOAD) techniques with UML using Flow Charts, Use Cases, Class Diagrams, Sequence Diagrams, Activity Diagrams and State Transition Diagrams.
  • Strong working knowledge of SQL, SQL Server, Oracle, SAS, Tableau and Jupyter, along with handling various MATLAB applications in multiple projects.
  • Expertise in handling various forms of data such as Master Data, Metadata and Source Data, with the ability to provide data analytics using various tools (Access, Excel, reporting tools, etc.) and overall working experience in providing qualitative and quantitative assessment of data.
  • Experience in data scaling, wrangling and data visualization in R, Python, SAS and Tableau.
  • Strong SQL query skills and Spark experience, along with designing and verifying databases using Entity-Relationship Diagrams (ERD) and data profiling using queries, dashboards, macros, etc.
  • Worked closely with the QA team in executing test scenarios and plans, providing test data, creating test cases, issuing STRs upon identification of bugs and collecting test metrics.
  • Experience in performing user acceptance testing (UAT) and end-to-end testing, monitoring test results and escalating based on priorities.
  • Experience working with Waterfall and Agile methodologies, with demonstrated excellent quality in delivering the output.

TECHNICAL SKILLS

Languages: SQL, PL/SQL, Java, C, C++, XML, HTML, MATLAB, Python, R.

Statistical Analysis: R, Python, SAS Enterprise Miner 7.1, SAS Programming, MATLAB, Minitab, Jupyter.

Databases: SQL Server 2014/2012/2008/2005/2000, MS Access, Oracle 11g/10g/9i.

DWH / BI Tools: Microsoft Power BI, Tableau, SSIS, SSRS, SSAS, Pentaho, Kettle, Business Intelligence Development Studio (BIDS), Visual Studio, Crystal Reports, Informatica 6.1, R-Studio.

Database Design Tools and Data Modeling: MS Visio, ERWIN 4.5/4.0, Star Schema/Snowflake Schema modeling, Fact & Dimensions tables, physical & logical data modeling, Normalization and De-normalization techniques.

Tools and Utilities: SQL Server Management Studio, SQL Server Enterprise Manager, SQL Server Profiler, Import & Export Wizard, Microsoft Management Console, Visual SourceSafe 6.0, DTS, Crystal Reports, ProClarity, Microsoft Office, Excel Power Pivot, Excel Data Explorer, Tableau.

PROFESSIONAL EXPERIENCE

Confidential, New York.

Data Scientist

Responsibilities:

  • Performed data extraction, data scaling, data transformations, data modelling and visualizations using R, SQL and Tableau based on requirements.
  • Adept in writing R scripts while working with Oracle R Enterprise (ORE).
  • Performed ad-hoc reporting, customer profiling and segmentation using R/Python.
  • Created different database schemas (Oracle) with several tables containing data related to application details, DB and OS details, asset configuration details and server details, and developed several queries to obtain encryption readiness results.
  • Developed MapReduce and Spark (Python) modules for machine learning and predictive analytics in Hadoop on AWS.
  • Created several corresponding ISDs (Integration Specific Documents), which include the interface list (inbound/outbound), detailed file sizes and production support details (SLAs, servers, etc.).
  • Created the functional specification document for the Phase 3 work, including but not limited to Informatica requirements, architectural references, ETL sequence diagrams, data mappings, quality management, use cases and data reconciliation details.
  • Performed wellness checks (SQL) to make sure both the stage and production environments were receiving data as intended, and addressed all production-related issues.
  • Worked with various LOB Directors to identify third-party hosted applications and prioritized them based on severity of risk and vulnerability using a risk assessment matrix.
  • Worked on the Cyber Security Questionnaire, requesting responses from all the corresponding Application Managers within the due date for each phase, based on risk rating (Critical, High, Medium and Low).
  • Worked closely with ADMs (Application Development Managers) for data-at-rest applications to determine the DB and OS types and versions, then created a GTAC-supported compatibility matrix using Excel to see which applications have a solution and which ones need an upgrade (software/hardware).
  • Designed a machine learning pipeline using Microsoft Azure Machine Learning to predict and prescribe, and implemented a machine learning scenario for a given data problem.
  • Designed and developed NLP models for sentiment analysis using neural networks.
  • Experience with dimensional and relational database design, ETL and lifecycle development using Informatica PowerCenter, Repository Manager, Designer, Workflow Manager and Workflow Monitor.
  • Strong work on Data Mining; responsible for generating reports and dashboards with numbers and graphs to make sure loan processing times and work pipelines are properly routed as needed.
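
A minimal sketch of the kind of SQL wellness check mentioned above, comparing row counts between a stage table and its production counterpart. The table names (`stage_loans`, `prod_loans`), the helper names and the in-memory SQLite database are illustrative assumptions standing in for the Oracle/SQL Server environments; the actual checks are not described in this resume.

```python
import sqlite3


def row_count_mismatches(conn, table_pairs):
    """Return (stage, prod, stage_count, prod_count) for each pair whose row counts differ."""
    mismatches = []
    for stage_table, prod_table in table_pairs:
        # Table names come from a trusted, hard-coded list, so interpolating them is acceptable here.
        stage_count = conn.execute(f"SELECT COUNT(*) FROM {stage_table}").fetchone()[0]
        prod_count = conn.execute(f"SELECT COUNT(*) FROM {prod_table}").fetchone()[0]
        if stage_count != prod_count:
            mismatches.append((stage_table, prod_table, stage_count, prod_count))
    return mismatches


def build_demo_db():
    """Create hypothetical stage and production tables with one missing production row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stage_loans (id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE prod_loans  (id INTEGER PRIMARY KEY, amount REAL);
        INSERT INTO stage_loans VALUES (1, 100.0), (2, 250.0), (3, 75.0);
        INSERT INTO prod_loans  VALUES (1, 100.0), (2, 250.0);
        """
    )
    return conn


conn = build_demo_db()
mismatches = row_count_mismatches(conn, [("stage_loans", "prod_loans")])
for stage_table, prod_table, stage_count, prod_count in mismatches:
    print(f"{stage_table} has {stage_count} rows but {prod_table} has {prod_count}")
```

A row-count comparison is only the simplest form of such a check; comparing checksums or latest load timestamps would follow the same pattern.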

Environment: SQL Server, Oracle 10g/11g, MS Office, Informatica, ER Studio, XML, Hive, HDFS, Flume, Sqoop, R connector, C++, Python, R, Tableau 9.2

Confidential, Jersey City.

Data Analyst/Machine Learning

Responsibilities:

  • Worked on Phase 3 of the overall PPM program, which ensures that data from SOR systems can answer business monitoring questions and that data is available in the interim database in hourly cycles for reporting purposes.
  • Participated in a highly immersive Data Science program involving Data Manipulation and Visualization, Web Scraping, Machine Learning, Python programming, Scala, SQL, Git, MongoDB and Hadoop.
  • Used R and Python, along with HQL, VQL, a Data Lake, AWS Redshift, Oozie and PySpark, for Exploratory Data Analysis, A/B testing, ANOVA and hypothesis testing to compare and identify the effectiveness of creative campaigns.
  • Responsible for creating several ETL mapping documents (Source to Stage, Stage to Integration) for different SORs, making sure the column load rules, data types and TDQ/BDQ rules are all in place for all the required source data fields.
  • Created several corresponding ISDs (Integration Specific Documents), which include the interface list (inbound/outbound), detailed file sizes and production support details (SLAs, servers, etc.).
  • Created the functional specification document for the Phase 3 work, including but not limited to Informatica requirements, architectural references, ETL sequence diagrams, data mappings, quality management, use cases and data reconciliation details.
  • Experience in developing complex Informatica mappings and strong in Data Warehousing concepts, along with an understanding of standard ETL transformation methodologies.
  • Experience with dimensional and relational database design, ETL and lifecycle development using Informatica PowerCenter, Repository Manager, Designer, Workflow Manager and Workflow Monitor.
  • Responsible for handling defects and escalations, making sure they are addressed in time by updating the corresponding documents (DDLs, mappings, model changes, etc.).
  • Created wellness check scripts (SQL) to make sure both the stage and production environments were receiving data as intended, and addressed all production issues.
  • Strong work on data mining; responsible for generating reports and dashboards with numbers and graphs to make sure loan processing times and work pipelines are properly routed as needed.
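
As an illustration of the A/B testing and hypothesis testing mentioned above, the sketch below runs a two-sided two-proportion z-test on conversion counts from two campaign variants. The function name and the sample figures are hypothetical, and this is one common test for such comparisons, not necessarily the method used on the project.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution, written with erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical campaign data: variant A converts 120 of 2400, variant B 156 of 2400.
z, p_value = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=156, n_b=2400)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance alone; comparing more than two campaigns at once is where an ANOVA-style test would be the more natural choice.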

Environment: RStudio, Python, Tableau, C++, Java, Networks, SQL Server 2012/2014, Oracle 10g/11g

Confidential, Grand Rapids, MI

Data Analyst

Responsibilities:

  • Responsible for running daily, weekly and monthly reports from both SQL Server and Oracle data warehouses.
  • Developed various automated and customized reports, along with improved template formats for various data metrics, as part of reporting.
  • Worked with the marketing team and analyzed marketing data using Access/Excel and SQL tools to generate reports, tables, listings and graphs.
  • Strong functional domain knowledge of data governance and data quality.
  • Scrum master for two internal projects, both involving automating various manual processes in Agile mode to improve timely delivery and quality.
  • Responsible for creating sales reporting metrics using Cognos across all markets (11 states) and providing improvement solutions that benefitted individual market sales revenue.
  • Strong working knowledge of Excel for data mining, filtering, pivot tables, formulas, setting up database connections for automatic data refresh and sharing via SharePoint links.
  • Experience conducting loads using Informatica tools and handling performance tuning of Informatica jobs.
  • Extensive experience in data migration, extraction, data cleansing and data staging of operational sources using ETL (Informatica) processes.
  • Analyzed different kinds of data from many systems using ad-hoc queries, SQL scripts and Cognos report designs, and delivered various comparisons, trends, statistics, errors and suggestions.
  • Maintained the change control process, conducted thorough analysis on various parameters, and documented and presented the results to the reporting managers.

Environment: ER Studio, Informatica PowerCenter 8.1/9.1, PowerConnect/PowerExchange, Oracle 11g, Mainframes, DB2, MS SQL Server 2008, SQL, PL/SQL, XML, Windows NT 4.0, Tableau, Workday, SPSS, SAS, Business Objects, Unix Shell Scripting, Networks, Teradata, Netezza.

Confidential

Data Analyst

Responsibilities:

  • Worked as part of a team that developed Machine Learning models using Natural Language Processing with Python scikit-learn to provide insights on fraudulent claims and recommend actionable next steps.
  • Created new database objects such as Procedures, Functions, Packages, Triggers, Indexes and Views using T-SQL in Development and Production environments for SQL Server 2000.
  • Actively participated in gathering requirements and system specifications.
  • Developed SQL queries to fetch complex data from different tables in remote databases using joins and database links, formatted the results into reports and kept logs.
  • Strong understanding of Agile Data Warehouse Development.
  • Worked on complex T-SQL statements, and implemented various codes and functions.
  • Installed, authored and managed reports using SQL Server 2005 Reporting Services.
  • Wrote Transact-SQL utilities to generate table insert and update statements.
  • Developed and optimized database structures, stored procedures, DDL triggers and user-defined functions.
  • Implemented new T-SQL features added in SQL Server 2005, namely error handling through the TRY...CATCH statement and Common Table Expressions (CTEs).
  • Created Stored Procedures to transform the data and worked extensively in T-SQL for various needs of the transformations while loading the data.
  • Participated in developing prototype models and communicated results to the requesting individuals.

Environment: Statistical Modeling, Machine Learning, NLP, Python, Sklearn, Gensim, Pandas, PySpark, R, Matplotlib, SAS, Seaborn, Tableau, Power BI, SQL.

Confidential

Data Analyst

Responsibilities:

  • Developed the ETL (SSIS) pipelines for data extraction.
  • Developed software tools in Python to automatically scrutinize documents and electronic content.
  • Developed .NET-based applications for Microsoft Office document processing.
  • Developed the SQL database schema for the data pipelines.
  • Performed data analysis and produced subsequent reports for QA teams to prioritize issues.
  • Strong functional domain knowledge of data governance and data quality to make sure compliance standards are properly incorporated.
  • Participated in requirements gathering and development of value-adding use cases and applications in close cooperation with other intra-organizational units, product managers and product development teams.
  • Developed the SQL jobs for generating the analytic reports.

Environment: Python, SSIS, SSRS, SQL, Sklearn, C#, Matplotlib, PostgreSQL
