Senior Data Scientist Resume

Chicago, IL

SUMMARY

  • 6+ years of professional experience in Information Technology spanning data gathering, analysis, design, and testing. Excellent skills in state-of-the-art client-server computing, desktop applications, and website development.
  • Over 6 years of experience in Big Data analytics, with hands-on experience writing MapReduce jobs on the Hadoop ecosystem, including Hive and Pig (a minimal Hadoop Streaming sketch follows this summary).
  • Good working knowledge of Hadoop architecture and components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
  • Expertise in Hadoop/Big Data technologies: Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, and Sqoop.
  • In-depth knowledge of the phases of the System Development Life Cycle (SDLC) and of methodologies such as Waterfall, Agile Scrum, SAFe, and the Waterfall-Scrum hybrid.
  • Conducted various analyses, including GAP analysis, feasibility analysis, impact analysis, cost-benefit analysis, risk analysis, root cause analysis, and SWOT analysis.
  • Used elicitation techniques for requirements gathering such as brainstorming sessions, Joint Application Development (JAD), interviews, workshops, focus groups, surveys, questionnaires, document analysis, and observation, working with Subject Matter Experts (SMEs), business users, and business owners.
  • Good knowledge of databases such as MySQL and extensive experience writing SQL queries, stored procedures, triggers, cursors, functions, and packages.
  • Good knowledge of process modeling; created UML diagrams (use case, sequence, activity, and state chart diagrams) and workflow diagrams with the help of MS Visio and the Blueprint tool.
  • Designed prototypes, mock-up screens, and wireframes using MS Visio and the Blueprint tool while developing graphical user interface based applications.
  • Expertise in developing test cases and test plans, including defect logs.
  • Performed UAT, regression, smoke, and SIT testing.
  • Experienced in writing epics, features, and user stories with the help of the Product Owner and the Scrum development team.
  • Prioritized user stories with the Product Owner using prioritization techniques, and assisted the Scrum development team in estimating the effort needed to complete them using estimation techniques.
  • Expertise in developing automation test frameworks and creating function libraries.
  • Expertise in creating BDD-driven test frameworks.
  • Developed and maintained all project-related documentation needed for application implementation.
  • Knowledge of data warehousing concepts: data modeling, data mapping, data mining, data analysis, conceptual/logical/physical data models, star schema, snowflake schema, and normalization.
  • Good knowledge of web services and client- and server-side validations, plus knowledge of the ETL process.
  • Experience in creating Test Readiness Review (TRR) and Requirements Traceability Matrix (RTM) documents.
  • Experience in preparing Test Strategy, developing Test Plan, Detailed Test Cases, writing Test Scripts by decomposing Business Requirements, and developing Test Scenarios to support quality deliverables.
  • Experience in designing and deploying a multitude of applications utilizing almost all of the AWS stack (including EC2, Route 53, S3, RDS, DynamoDB, SNS, SQS, and IAM), focusing on high availability, fault tolerance, and auto-scaling.
  • Extensively used Pentaho, Jasper, and BIRT client tools: Pentaho Report Designer, Pentaho Data Integration, Pentaho Schema Workbench, Pentaho Design Studio, Pentaho Kettle, Pentaho BI Server, Jasper Reports, Jasper Server, and BIRT Report Designer.
  • Performed load operations using a Relational stage to update and delete data in a DB2 table on the mainframe.
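
The MapReduce experience above is best shown with a small example. The following is a minimal, hedged sketch of a Hadoop Streaming word count in Python; the script name and the streaming invocation in the comments are illustrative assumptions, not artifacts of any project listed here.

    # wordcount_streaming.py -- a single script that acts as mapper or reducer
    # in a Hadoop Streaming job, e.g. (paths and jar name are placeholders):
    #   hadoop jar hadoop-streaming.jar \
    #     -mapper "python wordcount_streaming.py map" \
    #     -reducer "python wordcount_streaming.py reduce" \
    #     -input /data/in -output /data/out
    import sys

    def mapper(stream):
        # Emit a (word, 1) pair per token; Streaming delivers input on stdin.
        for line in stream:
            for word in line.split():
                print(f"{word}\t1")

    def reducer(stream):
        # Input arrives sorted by key, so counts can be summed per key run.
        current, count = None, 0
        for line in stream:
            word, value = line.rstrip("\n").split("\t", 1)
            if word != current:
                if current is not None:
                    print(f"{current}\t{count}")
                current, count = word, 0
            count += int(value)
        if current is not None:
            print(f"{current}\t{count}")

    if __name__ == "__main__":
        mapper(sys.stdin) if sys.argv[1] == "map" else reducer(sys.stdin)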

TECHNICAL SKILLS

Programming Languages: Java, C/C++, Assembly Language (8085/8086), Python

Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Sqoop, Flume, Pentaho

Visualization Tools: Tableau, Wolfram Alpha, QlikView

Microsoft Office Tools and Technologies: MS Word, MS Excel, MS PowerPoint, MS Visio, SharePoint, Active Directory (LDAP, security, Group Policy, schema changes), RSAT tools, Azure, MS Dynamics CRM.

RDBMS: SQL Server 2008, MySQL, MS Access.

OS and Cloud Technologies: Windows, UNIX/LINUX, AWS, Ubuntu.

IDEs: Eclipse, JDeveloper, IntelliJ IDEA.

Development Tools: MS Visual Studio, Notepad++

ETL and Testing Tools: Pentaho Data Integration (Kettle) 7.0.1, Informatica PowerCenter 9.5.1/9.1/8.6/8.5/8.1/7.1/6.1, Informatica Data Analyzer 8.5/8.1, Informatica Power Exchange, IDQ, data cleansing, SSIS, QuickTest Professional, Selenium WebDriver, Selenium Grid, Quality Center 10/9.

PROFESSIONAL EXPERIENCE

Senior Data Scientist

Confidential, Chicago, IL

Responsibilities:

  • Worked on data analysis, data governance, data profiling, data migration, data conversion, data quality, and data integration across various subject areas.
  • Developed high-level data modeling diagrams such as star schemas for every subject area as part of the data normalization process, using MS Visio and ERwin.
  • Developed conceptual, logical, and physical data models and created DDL scripts to build the database schema and database objects in Splice Machine, an RDBMS that runs on Hadoop and Spark.
  • Created source-to-target table mappings for ETL (Informatica) developers to move data from numerous source systems to target systems.
  • Worked on different HL7 message types such as Admit Discharge Transfer (ADT), Orders, and Results (General and Pharmacy).
  • Performed data cleansing and data quality checks on source systems using SQL and Paxata, a data preparation tool that runs on Spark.
  • Prepared DDL scripts for tables to be created in target systems.
  • Designed data marts for various data points as part of the DOL Fiduciary Rule effort.
  • Created data marts in Splice Machine and developed ad-hoc reports and daily flash visual reports for executive-level officers using QlikView.
  • Designed and developed data integration process and workflow using Pentaho PDI for marketing, service, and business groups.
  • Designed and developed logical and physical data models that utilize concepts such as star schema, snowflake schema, and slowly changing dimensions.
  • Created message flows for validation, routing, qualification and transformation of HL7 and non-HL7 messages.
  • Created pinboards for executives and SMEs on visualization tools such as ThoughtSpot, a search-driven BI analytics platform.
  • Used data preparation tools such as Paxata to prepare the data for ETL.
  • Worked on Salesforce instance consolidation and created customized data mart solutions in Salesforce, thereby helping the company reduce substantial licensing and renewal costs.
  • Performed bulk loads of JSON data from an S3 bucket into Snowflake (see the Snowflake sketch after this list).
  • Developed a process to call the Marketo REST API from Pentaho PDI to integrate the source data (bulk data) into the ETL flow (also sketched below).
  • Created ETL scripts for data transformation to the data mart using Pentaho PDI (Kettle).
  • Performed data quality analysis using SQL scripts on Splice Machine and DBeaver.
  • Used Snowflake functions to parse semi-structured data entirely with SQL statements (as in the sketch below).
  • Performed ETL development using Pentaho PDI, scripts, and cron jobs.
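
As a hedged illustration of the S3-to-Snowflake bulk load and the semi-structured SQL parsing above: a minimal sketch using the snowflake-connector-python package. The connection parameters, stage, table, and column names are hypothetical placeholders, not the project's actual objects.

    # Hedged sketch: bulk-load JSON from an S3 stage into Snowflake, then
    # parse it with VARIANT-path SQL. All names below are illustrative.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account",          # placeholder credentials
        user="etl_user",
        password="***",
        warehouse="ETL_WH",
        database="ANALYTICS",
        schema="RAW",
    )
    cur = conn.cursor()

    # COPY INTO pulls every JSON file from a named external stage (assumed
    # to already point at the S3 bucket). raw_events is assumed to have a
    # single VARIANT column named payload.
    cur.execute("""
        COPY INTO raw_events
        FROM @s3_events_stage
        FILE_FORMAT = (TYPE = 'JSON')
    """)

    # Semi-structured parsing entirely in SQL: cast VARIANT paths and
    # FLATTEN a nested array into one row per element.
    cur.execute("""
        SELECT payload:userId::STRING    AS user_id,
               item.value:sku::STRING    AS sku,
               item.value:amount::NUMBER AS amount
        FROM raw_events,
             LATERAL FLATTEN(input => payload:items) item
    """)
    for row in cur.fetchall():
        print(row)

    cur.close()
    conn.close()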
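
And for the Marketo REST integration: a hedged Python sketch of the two-step pattern (fetch an OAuth token, then call a REST endpoint) that the PDI job wrapped; the instance URL and credentials are placeholders.

    # Hedged sketch of a Marketo REST call: exchange client credentials for
    # a short-lived token, then query leads. All identifiers are placeholders.
    import requests

    BASE = "https://123-ABC-456.mktorest.com"   # hypothetical Marketo instance

    # Step 1: client-credentials grant returns an access token.
    token = requests.get(
        f"{BASE}/identity/oauth/token",
        params={
            "grant_type": "client_credentials",
            "client_id": "my_client_id",        # placeholder
            "client_secret": "my_client_secret",
        },
        timeout=30,
    ).json()["access_token"]

    # Step 2: pull lead records with the token.
    resp = requests.get(
        f"{BASE}/rest/v1/leads.json",
        params={"access_token": token,
                "filterType": "email",
                "filterValues": "someone@example.com"},
        timeout=30,
    ).json()
    for lead in resp.get("result", []):
        print(lead)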

Environment: Splice Machine, Oracle, ThoughtSpot, QlikView, Informatica, Paxata, Pentaho, SQL, DBeaver, SQL Developer, SSMS, MS Visio, ActiveBatch, Microsoft (Word, Access, Excel, Outlook), JIRA, Salesforce.

Data Scientist

Confidential, Hainesport, NJ

Responsibilities:

  • Developing and maintaining databases assigned/designed for reporting purposes.
  • Utilizing SQL and PL/SQL for test setup and data validation on Access, Oracle, SQL Server, and Sybase databases.
  • Maintaining data and generating reports from the CareLogic and McLean eBasis systems.
  • Conducting data analysis, designing trend and performance benchmarking reports, and functioning as system administrator for software systems, including the agency system.
  • Used Snowflake to create and maintain tables and views.
  • Developed Spark/Scala and Python code for a regular expression (regex) project in the Hadoop/Hive environment, on Linux/Windows big data resources.
  • Developed JavaScript to modify HL7 messages as they pass through the Mirth Connect interface engine (an illustrative Python equivalent follows this list).
  • Formatted and maintained up-to-date approved policies and agency forms, using approved software/network tools to track ongoing updates, reviews, and approvals.
  • Assisted in creating data models and entity relationship diagrams; created use case, activity, and sequence diagrams.
  • Assisted in performing data modeling and data reporting.
  • Participated in maintenance of overall health of all technologies relevant to Active Directory applications.
  • Created Power BI visualizations of dashboards and scorecards (KPIs) for the Finance Department.
  • Designed, developed, and implemented Power BI dashboards, scorecards, and KPI reports.
  • Expertise in Power BI, Power BI Pro, and Power BI Mobile.
  • Segregated business requirements by analyzing them into low-level and high-level requirements, and converted business requirements into a Functional Requirements Document.
  • Worked with the UI team to create user interface screenshots and presented them to the business for approval.
  • Developed wireframes to fully define customer needs and requirements in brainstorming and review sessions.
  • Used MS Visio to create use case and sequence diagrams so that the development team and other stakeholders could understand the business process.
  • Manipulated HL7 data in XML transmissions to match vendor specifications using JavaScript.
  • Created ETL transformations and jobs using Pentaho Data Integration Designer (Kettle/Spoon) and scheduled them using cron jobs.
  • Used various input types in PDI for parallel access.
  • Used various input and output types in Pentaho Kettle, including database tables, text files, Excel files, and CSV files.
  • Installed and upgraded Tableau Server and performed server performance tuning for optimization.
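
As an illustrative counterpart to the Mirth Connect work above (which was done in JavaScript inside the engine), here is a hedged Python sketch that edits one field of a pipe-delimited HL7 v2 message; the sample message and the field chosen are invented for the example.

    # Hedged sketch: tweak a field in an HL7 v2 message with plain string
    # handling. HL7 v2 separates segments with \r and fields with |.
    SAMPLE = (
        "MSH|^~\\&|SENDAPP|SENDFAC|RCVAPP|RCVFAC|20240101||ADT^A01|123|P|2.3\r"
        "PID|1||000123||DOE^JANE||19800101|F\r"
    )

    def set_field(message: str, segment_id: str, index: int, value: str) -> str:
        # Replace field `index` (0 = segment name) in the first matching segment.
        segments = message.split("\r")
        for i, seg in enumerate(segments):
            fields = seg.split("|")
            if fields[0] == segment_id:
                fields[index] = value
                segments[i] = "|".join(fields)
                break
        return "\r".join(segments)

    # e.g. overwrite the patient ID in PID-3 before handing the message on
    print(set_field(SAMPLE, "PID", 3, "999888"))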

Environment: MS Access, SQL, MS Word, MS Excel, MS PowerPoint, HTML, Java, AD, UNIX, Tableau.

Data Scientist

Confidential, Newton, PA

Responsibilities:

  • Captured functional requirements from business clients, alongside IT requirement analysts, by posing suitable questions and analyzing the requirements in collaboration with the team and system architects, following standard templates.
  • Extensive Tableau experience in an enterprise environment, including Tableau administration.
  • Developed entire frontend and backend modules using Python on Django Web Framework.
  • Configured data extraction and scheduled incremental refreshes for data sources on Tableau server to improve the performance of reports.
  • Worked in E-Discovery projects requiring collection of both structured and unstructured data.
  • Delivered Interactive visualizations/dashboards using Tableau to present analysis outcomes in terms of patterns, anomalies, and predictions.
  • Used existing Deal Model in Python to inherit and create object data structure for regulatory reporting.
  • Experience in the popular Python framework (Django).
  • Knowledge of object-relational mapping (ORM).
  • Followed good programming practices and adequately documented programs.
  • Supported the team in planning activities for Data staging, development, and UAT activities.
  • Assisted and supported the QA team in understanding and creating manual and automated test plans, testing efforts, root cause analysis.
  • Produced quality customized reports using PROC REPORT, PROC TABULATE, and PROC SUMMARY, and descriptive statistics using PROC MEANS, PROC FREQ, and PROC UNIVARIATE.
  • Automated RabbitMQ cluster installation and configuration using Python/Bash (sketched after this list).
  • Experience working with T-SQL DDL and DML scripts; established relationships between tables using primary keys and foreign keys.
  • Extensive knowledge in creating joins and sub-queries for complex queries involving multiple tables.
  • Hands-on experience using DDL and DML for writing triggers and stored procedures and for data manipulation.
  • Worked with star schema and snowflake schema dimensions, and with SSRS, to support large reporting needs.
  • Used the Python library BeautifulSoup for web scraping to extract data for building graphs (see the scraping sketch after this list).
  • Experience using Profiler and Windows Performance Monitor to resolve deadlocks, long-running queries, and slow-running servers.
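
For the RabbitMQ automation above, a hedged Python sketch that drives rabbitmqctl through subprocess to join a node to an existing cluster; the node name is a placeholder, and on the project this work was shared between Python and Bash.

    # Hedged sketch: join a RabbitMQ node to a cluster by shelling out to
    # rabbitmqctl. The seed node name is a placeholder.
    import subprocess

    SEED_NODE = "rabbit@node1"   # hypothetical first cluster member

    def run(*cmd: str) -> None:
        # check=True aborts the sequence as soon as any step fails
        subprocess.run(cmd, check=True)

    # Standard join sequence: stop the app, reset, join, restart, verify.
    run("rabbitmqctl", "stop_app")
    run("rabbitmqctl", "reset")
    run("rabbitmqctl", "join_cluster", SEED_NODE)
    run("rabbitmqctl", "start_app")
    run("rabbitmqctl", "cluster_status")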
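
And for the BeautifulSoup work, a hedged sketch: fetch a page, pull values out of a two-column HTML table, and hand them to a plotting call. The URL and CSS selector are placeholders invented for the example.

    # Hedged sketch: scrape a two-column HTML table and chart it.
    # URL and class names are illustrative only.
    import requests
    from bs4 import BeautifulSoup
    import matplotlib.pyplot as plt

    html = requests.get("https://example.com/metrics", timeout=30).text
    soup = BeautifulSoup(html, "html.parser")

    labels, values = [], []
    for row in soup.select("table.metrics tr"):
        cells = row.find_all("td")
        if len(cells) == 2:                 # skip header/blank rows
            labels.append(cells[0].get_text(strip=True))
            values.append(float(cells[1].get_text(strip=True)))

    plt.bar(labels, values)                 # feed the scraped data to a graph
    plt.tight_layout()
    plt.savefig("metrics.png")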

Environment: Azure, MS Office, SQL, Tableau, SharePoint 2010, MS Dynamics CRM, MS Excel.

Data Scientist

Confidential

Responsibilities:

  • Documented the system requirements to meet end-state requirements and compiled the Software Requirement Specification document and the Use Case document.
  • Developed an analysis model that includes Use Case diagrams, and Activity diagrams using UML methodologies in MS Visio, which provided the development team a view of the requirements for design and construction phases.
  • Responsible for maintenance, administration, and support of data elements; ensured metadata, documentation, data files, and data flows were accurate and up to date.
  • Created mappings between fields in the different sources/systems; performed special projects as needed.
  • Worked on SQL queries for data manipulation.
  • Extracted, compiled, and tracked data, and analyzed it to generate reports.
  • Arranged weekly team meetings to assign testing tasks and collect status reports from individual team members.
  • Effectively managed change by deploying change management techniques such as Change Assessment, Impact Analysis and Root cause Analysis.
  • Created sample wireframes for better understanding of the system by business and technology teams.
  • Worked with the UI team to create user interface screenshots and presented them to the business for approval.
  • Expertise in developing automation test frameworks and creating function libraries.
  • Expertise in creating BDD-driven test frameworks.
  • Used advanced Excel functions to generate spreadsheets and pivot tables (a pandas equivalent is sketched after this list).
  • Developed User Interface using HTML, XML and Java.
  • Performed software installation and troubleshooting, including drivers and essential desktop/laptop issues.
  • Created and maintained documentation as it relates to network configuration, network mapping, processes, and service records.
  • Presented solutions in written reports while analyzing, designing, testing and monitoring systems in a waterfall methodology.
  • Primary contact for internal users on how to complete required forms and retrieve related data in SAP.
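
As a small illustration of the pivot-table reporting above (done in Excel on the project; shown here as a hedged Python/pandas equivalent, with invented column names):

    # Hedged sketch: the pandas counterpart of an Excel pivot table.
    # Columns and figures are made up for the example.
    import pandas as pd

    df = pd.DataFrame({
        "region":  ["East", "East", "West", "West"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "sales":   [100.0, 120.0, 90.0, 140.0],
    })

    # Rows = region, columns = quarter, values = summed sales --
    # the same layout an Excel pivot would produce.
    pivot = pd.pivot_table(df, index="region", columns="quarter",
                           values="sales", aggfunc="sum", margins=True)
    print(pivot)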

Environment: Waterfall, MS Project, Windows 8/9, Web Services, MS SQL, MS Visio, MS Excel, HTML, UNIX.

Jr.SQL Developer/ Analyst

Confidential

Responsibilities:

  • Recorded, maintained & tracked defects, assigned type & priority/severity levels.
  • Strong expertise in data processing and flowcharting techniques.
  • Excellent understanding of database structures, principles, theories and practices.
  • Conversant with T-SQL coding.
  • Strong problem-solving and analytical skills; able to manage all stages of software development.
  • Ability to develop applications in a detail-oriented environment.
  • Extensive experience on the Oracle platform, especially with SQL standards and Java.
  • Excellent grasp of data warehousing and of processes such as extract, transform, and load (ETL).
  • Able to communicate clearly and get along well with other coworkers.
  • Able to identify and resolve intricate problems.
  • Worked with business stakeholders in creating user acceptance criteria and conducted/coordinated UAT.

Environment: Waterfall, MS Office, Microsoft Visio, SQL 2008, UML.
