Data Scientist Resume

Washington, DC

SUMMARY

  • 8+ years of experience in IT and 5+ years of experience as a Data Scientist, with strong technical expertise, business experience, and communication skills to drive high-impact business outcomes through data-driven innovations and decisions
  • Hands-on experience with Spark MLlib utilities such as classification, regression, clustering, collaborative filtering, and dimensionality reduction
  • Extensive experience in text analytics, developing statistical machine learning and data mining solutions to various business problems, and generating data visualizations using R, Python, and Tableau
  • Strong knowledge of statistical methods (regression, time series, hypothesis testing, randomized experiments), machine learning, algorithms, data structures, and data infrastructure
  • Proficient in statistical modeling and machine learning techniques (Linear Regression, Logistic Regression, Decision Trees, Random Forest, SVM, K-Nearest Neighbors) in forecasting/predictive analytics, segmentation methodologies, regression-based models, hypothesis testing, factor analysis/PCA, and ensemble methods
  • Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data.
  • Solid team player, team builder, and an excellent communicator.
  • Extensive hands-on experience and high proficiency with structured, semi-structured, and unstructured data, using a broad range of data science programming languages and big data tools including R, Python, Spark, SQL, scikit-learn, and Hadoop MapReduce
  • Technical proficiency in designing and data modeling online applications; solution lead for architecting Data Warehouse/Business Intelligence applications.
  • Expertise in implementing core Java concepts and JEE technologies: JSP, Servlets, JSTL, EJB, JMS, Struts, Spring, Hibernate, JDBC, XML, Web Services, and JNDI.
  • Extensive experience working in Test-Driven Development and Agile Scrum environments.
  • Experience working on Windows, Linux, and UNIX platforms, including programming and debugging in UNIX shell scripting.
  • Flexible with Unix/Linux and Windows environments, working with operating systems like CentOS 5/6, Ubuntu 13/14, Cosmos.
  • Defined job flows in the Hadoop environment using tools like Oozie for data scrubbing and processing.
  • Experience in data migration from existing data stores to Hadoop.
  • Developed MapReduce programs to perform data transformation and analysis.
  • Experience in analyzing data with Hive and Pig using schema-on-read.
  • Created Development Environments in Amazon Web Services using services like VPC, ELB, EC2, ECS and RDS instances
  • Strong experience in the Software Development Life Cycle (SDLC), including requirements analysis, design specification, and testing, in both Waterfall and Agile methodologies
  • Proficient in data science programming using R, Python, and SQL
  • Proficient in SQL, Database, Data Modeling, Data Warehousing, ETL and reporting tools
  • Strong knowledge of NoSQL databases such as HBase, Cassandra, and MongoDB, and their integration with a Hadoop cluster.
  • Proficient in using AJAX for implementing dynamic Web Pages.

TECHNICAL SKILLS

Databases: Oracle, MySQL, MS SQL Server, Sybase, PostgreSQL, MongoDB, NoSQL.

Hadoop/BigData: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, ZooKeeper, Flume, HBase

Programming Languages: Python, SQL, T-SQL, MATLAB, C, C++, HTML, PL/SQL, XML, DHTML, HTTP, Java, Hadoop.

Database Design Tools and Data Modeling: Physical & logical data modeling, dimension tables, Kimball.

IDE: Eclipse, IntelliJ, NetBeans, IBM Rational Application Developer (RAD)

Web Servers: JBoss, WebLogic, WebSphere, Tomcat, Jetty, Apache

Reporting Tools: Shiny, Power BI, Tableau, Jasper Reports, BIRT, Crystal Reports.

Statistical Software: SPSS, R, SAS

Tools and Utilities: Crystal Reports, Power Pivot, DTS, SQL Server Enterprise Manager, SQL Server Profiler, Microsoft Management Console, Visual SourceSafe 6.0, Excel Power Pivot, SQL Server, ProClarity, Microsoft Office 2007/10/13, Visual Studio v14, Excel Data Explorer, Tableau 8/10, Import & Export Wizard, .NET.

Operating Systems: UNIX and Linux, Microsoft Windows 8/7/XP.

PROFESSIONAL EXPERIENCE

Confidential, Washington DC

Data Scientist

Responsibilities:

  • Responsible for design and development of advanced R/Python programs to prepare, transform, and harmonize data sets for modelling.
  • Responsible for applying machine learning techniques (regression/classification) to predict outcomes.
  • Designed the prototype of the data mart and documented possible outcomes from it for end users.
  • Identified and executed process improvements; hands-on with various technologies such as Oracle, Informatica, and BusinessObjects.
  • Developed and maintained a data dictionary to create metadata reports for technical and business purposes.
  • Involved in business process modelling using UML.
  • Interacted with Business Analysts, SMEs, and other Data Architects to understand business needs and functionality for various project solutions.
  • Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Worked closely with business, data governance, SMEs, and vendors to define data requirements.
  • Created SQL tables with referential integrity and developed queries using SQL, SQL*Plus, and PL/SQL.
  • Design, coding, and unit testing of ETL packages for source marts and subject marts using Informatica ETL processes for the Oracle database.
  • Developed large data sets from structured and unstructured data and performed data mining.
  • Designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas.
  • Researched, evaluated, architected, and deployed new tools, frameworks, and patterns to build sustainable Big Data platforms for clients.
  • Partnered with modellers to develop data frame requirements for projects.
  • Tracked various campaigns, performing customer profiling analysis and data manipulation.
  • Performed ad hoc reporting, customer profiling, and segmentation using R/Python.
  • Good exposure to Big Data technologies and the Hadoop ecosystem (HDFS, MapReduce, Pig, Sqoop, and Hive) for scalability, distributed computing, and high-performance computing.

Environment: R/Python, Informatica 9.0, ODS, OLTP, Oracle 11g, Hive, OLAP, DB2, Metadata, MS Excel, Mainframes, MS Visio, Rational Rose, RequisitePro, Hadoop, SQL, PL/SQL, SQL*Loader, UML.

Confidential, Denver, CO

Data Scientist

Responsibilities:

  • Worked as a Data Modeller/Analyst to generate data models using Erwin and developed a relational database system.
  • Responsible for applying machine learning techniques (regression/classification) to predict outcomes.
  • Performance tuning of the database, including indexing, optimizing SQL statements, and monitoring the server.
  • Involved in data analysis, primarily identifying data sets, source data, source metadata, data definitions, and data formats.
  • Collaborated on the data mapping document from source to target and on the data quality assessments for the source data.
  • Wrote simple and advanced SQL queries and scripts to create standard and ad hoc reports for senior managers.
  • Improved the performance of existing data warehouse applications to increase the efficiency of the existing system.
  • Created PL/SQL packages and database triggers, developed user procedures, and prepared user manuals for the new programs.
  • Provided expertise and recommendations for physical database design, architecture, testing, performance tuning, and implementation.
  • Designed and developed use cases, activity diagrams, sequence diagrams, and OOD (Object-Oriented Design) using UML and Visio.
  • Analyzed large datasets to answer business questions by generating reports and outcomes.
  • Provided R/SQL programming, with detailed direction, in the execution of data analysis that contributed to the final project deliverables. Responsible for data mining.
  • Executed SQL queries from R/Python on complex table configurations.
  • Worked in a team of programmers and data analysts to develop insightful deliverables that support data-driven marketing strategies.

Environment: R/Python, Informatica 9.0, ODS, OLTP, SQL Profiler, Query Analyzer, DB2, Metadata, MS Excel, Mainframes, MS Visio, Rational Rose, Oracle 11g/10g.

Confidential, Plano, TX

Data Analyst/Data Modeler

Responsibilities:

  • Worked as a Data Modeler/Analyst to generate data models using Erwin and developed a relational database system.
  • Analyzed the business requirements of the projects by studying the Business Requirement Specification document.
  • Worked extensively with the Erwin Data Modeler tool to design the data models.
  • Designed mappings to process the incremental changes that exist in the source table. Whenever source data elements were missing in source tables, they were modified or added consistently with the 3NF-based OLTP source database.
  • Designed tables and implemented the naming conventions for logical and physical data models in Erwin 7.0.
  • Provided expertise and recommendations for physical database design, architecture, testing, performance tuning, and implementation.
  • Designed logical and physical data models for multiple OLTP and analytic applications.
  • Extensively used the Erwin design tool and Erwin Model Manager to create and maintain the data mart.
  • Designed the physical model for implementation in an Oracle 9i physical database.
  • Involved in data analysis, primarily identifying data sets, data sources, source metadata, data definitions, and data formats.
  • Performance tuning of the database, including indexing, optimizing SQL statements, and monitoring the server.
  • Wrote simple and advanced SQL queries and scripts to create standard and ad hoc reports for senior managers.
  • Collaborated on the data mapping document from source to target and on the data quality assessments for the data source.
  • Used expert-level understanding of different databases in combination for data extraction and loading; integrated data extracted from different databases and loaded it into a specific database.
  • Designed and developed Insurance, Claims, and Financial reports using SSRS.

Environment: SQL Server 2008 R2/2005 Enterprise, SSRS, SSIS, Crystal Reports, Windows Enterprise Server 2000, DTS, SQL Profiler, Query Analyzer, MS Visio, Erwin

Confidential, Buffalo, NY

Hadoop Developer

Responsibilities:

  • Responsible for managing data coming from different data sources.
  • Designed and developed UDFs to extend the functionality of both Pig and Hive.
  • Involved in collecting business requirements from the business partners and subject matter experts.
  • Developed MapReduce programs to perform data filtering for unstructured data.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Created partitioned Hive tables and worked on them using Hive.
  • Imported and exported data between MySQL and HDFS using Sqoop on a regular basis.
  • Used Flume to channel data from different sources to HDFS.
  • Created HBase tables to store data depending on column families.
  • Worked with the administrator to set up and monitor the Hadoop cluster.
  • Supported MapReduce programs running on the cluster.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Provided upper management with daily updates on project progress, including the classification levels that were achieved on the data.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.

Environment: Java, Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Linux, MySQL, Ubuntu.

Confidential

Java Developer

Responsibilities:

  • Designed and developed the front end using JSP, HTML, JavaScript, and jQuery.
  • Designed the application by implementing the Struts framework based on MVC architecture.
  • Developed custom tags in Struts.
  • Developed a framework for data processing using design patterns, Java, and XML.
  • Used Spring IoC for dependency injection with the Hibernate and Spring frameworks.
  • Used the lightweight container of the Spring Framework to provide architectural flexibility through Inversion of Control (IoC).
  • Developed EJB components that are deployed on WebLogic Application Server.
  • Designed and developed session beans to implement the business logic.
  • Designed and developed various configuration files for Hibernate mappings.
  • Wrote unit tests using the JUnit framework and implemented logging using the Log4j framework.
  • Designed and developed SQL queries and stored procedures.
  • Applied CSS (Cascading Style Sheets) across the entire site for standardization.
  • Offshore coordination and user acceptance testing support.
  • Actively involved in code reviews and bug fixing.

Environment: Java 5.0, Struts, Spring 2.0, Hibernate 3.2, WebLogic 7.0, Eclipse 3.3, Oracle 10g, JUnit 4.2, Maven, Windows XP, HTML, CSS, JavaScript, and XML.

Confidential 

Java Developer

Responsibilities:

  • Involved in the analysis and design of the application using Rational Rose.
  • Performed object-oriented analysis and design using UML, including development of class diagrams, sequence diagrams, and state diagrams, and implemented these diagrams in Microsoft Visio.
  • Developed the various action classes to handle the requests and responses.
  • Designed and created Java objects, JSP pages, JSF, JavaBeans, and Servlets to achieve various business functionalities. Created validation methods using JavaScript and backing beans.
  • Involved in writing client-side validations using JavaScript and CSS.
  • Involved in the design of the Referential Data Service module to interface with various databases using JDBC.
  • Used the Hibernate framework to persist the employee work hours to the database.
  • Used Spring Framework features extensively.
  • Developed and configured the application using BEA WebLogic Application Server.
  • Developed the build scripts using Ant.
  • Involved in designing test plans, test cases, and overall unit testing of the system.
  • Developed controllers and actions encapsulating the business logic.
  • Developed classes and interfaces to work with the underlying web services layer.
  • Designed web services for the above modules.
  • Prepared documentation and participated in preparing the user's manual for the application.

Environment: Java, Spring 2.0, Hibernate 3.2, WebLogic, Eclipse, SQL Server 2008, JUnit 4.2, Ant, Windows XP, HTML, CSS, JavaScript, and XML.
