Data Scientist / Data Analyst Resume

Detroit, Michigan

SUMMARY

  • 12+ years of progressive experience in software development, including application architecture, administration, design, and development in the US finance, healthcare, services, public safety, and utility sectors.
  • 3+ years of Big Data/Hadoop experience with Hadoop ecosystem components such as Hive, Pig, Flume, Sqoop, ZooKeeper, Spark, Kafka, Impala, and MapReduce.
  • 1+ year of Data Science (ML) experience building statistical models by applying classification and regression algorithms using ML packages (scikit-learn).
  • Able to assess whether a business problem is viable for a big data solution, define the logical architecture of the solution's layers and components (including data capacity planning and node forecasting), and select the right products to implement it.
  • Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing. Experience optimizing ETL workflows.
  • Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, Oozie, Hive, Sqoop, Pig, Spark, Cassandra, MongoDB, and Flume.
  • Experience architecting, designing, installing, configuring, and managing Apache Hadoop clusters and the Cloudera Hadoop distribution
  • Prior experience working as a Software Developer in Java/J2EE and related technologies such as JSP, Servlets, Hibernate, and JDBC.
  • Designed and implemented a stream-processing pipeline workflow that updates users' data in near real time
  • Good hands-on experience with data visualization tools such as Tableau, Zeppelin, and Power BI
  • Experience in Data Analysis, Data Cleansing, Data Validation and Verification, Data Conversion, Data Migrations and Data Mining
  • Experience working with LDAP user accounts and configuring LDAP on client machines
  • Experience developing applications using object-oriented design and development with Microsoft .NET technologies including ASP.NET, C#.NET, VB.NET, ADO.NET, WPF, WCF, Silverlight, and XML for web and WinForms development
  • Experience using SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server Management Studio, and SQL tools

TECHNICAL SKILLS

Big Data Platform: Hortonworks (HDP 2.2)/AWS (S3, EC2)/Cloudera (CDH3)

Data Science Platform/Tools: Anaconda Continuum, Python, R

OLAP Concepts: Data warehousing, Data mining concepts

Apache Hadoop YARN 2.0: HDFS, Pig, Hive, Sqoop, ZooKeeper, Oozie

Real Time Data Streaming: Kafka, Spark (Streaming)

Source Control: GitHub, VSS, TFS

Databases and NoSQL: MS SQL Server 2012, Oracle 11g (PL/SQL) and MySQL 5.6, MongoDB

Data Visualization Tools: Tableau, Zeppelin, Power BI, Matplotlib, ggplot2

Development Methodologies: Agile and Waterfall

Development Tools: Eclipse, Toad, Visual Studio

Programming Languages: Java, .NET

Scripting Languages: JavaScript, JSP, Python, XML, HTML

PROFESSIONAL EXPERIENCE

Data Scientist / Data Analyst

Confidential, Detroit, Michigan

Responsibilities:

  • Built statistical models around loan origination to detect fraud for financial risk assessment.
  • Analyzed large-scale loan datasets and employed state-of-the-art machine learning methods using scikit-learn (Python) for categorical prediction and continuous regression in risk analysis
  • Reduced key features and proposed derived features on loan dataset(s), then trained an ensemble of Naive Bayes, random forest, and logistic regression models to improve accuracy metrics
  • Analyzed unstructured data from call-center voice recordings, blogs, and customer feedback, and recommended ways to reduce customer churn and to up-sell and cross-sell various financial products.

Sr. Hadoop Developer/Admin

Confidential, Trenton, NJ

Responsibilities:

  • Designed and built a data pipeline that streams data from client apps to a server over WebSockets; from there, a Kafka consumer reads the data and writes it to an HDFS data store. Spark jobs then read this data from HDFS using Spark SQL and process it in streaming and batch jobs
  • Shared responsibility for administration of Hadoop, Hive and Pig
  • Managed, reviewed and interpreted Hadoop log files. Involved with the application teams to install Hadoop updates, patches and version upgrades as required
  • Leadership: Analyzed the Hadoop cluster and various big data analytics tools including Pig, Hive, and Sqoop. Responsible for building scalable distributed data solutions using Hadoop
  • Data Ingestion: Imported and exported data (SQL Server, Oracle, CSV, and text files) from local/external file systems and RDBMS to HDFS. Loaded log data into HDFS using Flume
  • ETL: Performed data cleansing, integration, and transformation using Pig; managed data from disparate sources.
  • Exported analyzed data to the relational databases using Sqoop for visualization & Report generation
  • Data Warehousing: Designed a data warehouse using Hive, created and managed Hive tables in Hadoop
  • Workflow Management: Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig

Hadoop Developer/Admin

Confidential, Atlanta, GA

Responsibilities:

  • Built an ETL workflow to process sales and marketing data in Hive tables
  • Imported data from MySQL and Oracle into HDFS using Sqoop
  • Loaded and transformed large sets of structured and unstructured data
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS
  • Created Hive internal and external tables, loaded data into the tables, and queried the data using HQL
  • Used Hue for UI-based Pig script execution, Oozie scheduling, and creating tables in Hive
  • Deployed applications using AWS EC2
  • Wrote MapReduce Java programs to analyze log data for large-scale data sets
  • Participated in building a CDH4 test cluster for implementing Kerberos authentication. Installed Cloudera Manager and Hue
  • Performed project management activities such as planning, resource allocation, tracking, and people management
  • Created use case, class, package, sequence diagrams using MS Visio.

Sr. Application Developer

Confidential, Huntsville, AL

Responsibilities:

  • Designed and developed the application framework, including database design.
  • Designed and developed a data pipeline using SQL Server 2010, SSIS, SSAS, and related technologies.
  • Designed, developed, and published a web portal that allows Confidential customers to maintain and communicate with several of the hardware devices.
  • Provided data modeling and database design using ER-Studio and Enterprise Architect.
  • Involved in various phases of Software Development Life Cycle (SDLC) of the application
  • Developed user interfaces using JSP, HTML, XML, and JavaScript that allow Confidential customers to maintain and communicate with several of the hardware devices.
  • Designed interface modules with Customer Information Systems (CIS) and Billing Systems.
  • Designed, developed, and supported Prepay Management interface modules.
  • Designed an API for Confidential products that allows third-party vendors to control and manage Confidential hardware devices.
  • Built SQL jobs that check the health of the system daily.
  • Built and automated test procedures for testing new firmware releases.
  • Confidential to build various dashboards.
  • Designed and developed a GIS mapping solution.
  • Implemented security in modules such as Identity, Authentication, and Authorization

Application Developer

Confidential, Huntsville, AL

Responsibilities:

  • Involved in the analysis, design, and development of a dynamic web application.
  • Developed and supported database Schema and stored procedures.
  • Involved in developing uaSecurity and uaPermissions for the end user depending on counties.
  • Developed Server Side Components using C#.
  • Looked after Master Modules and different reports.
  • Involved in documentation of various Modules.
  • Set up the entire project in Subversion and created an Ant build script for compiling and building the application for various environments
  • Configured a Windows Communication Foundation (WCF) service to authenticate clients with Windows credentials for login validation in intranet applications
  • Involved in designing and developing a digital dashboard for the Outage Management System.
  • Calculated reliability indices such as CAIDI, CAIFI, SAIDI, and SAIFI to support a more reliable system.
  • Made changes to uaNV Server to display a map for outage customers.
  • Designed and developed database schema and stored procedures.
  • Set up User Interface, Coding and Design Guidelines.
  • Created Use case, Sequence diagrams, functional specifications and User Interface diagrams using Star UML
  • Involved in designing the system using uaOneLink.
  • Made uaNV Server for display of map for Incident Tracking.
  • Assisted in implementation of the project at client side.
  • Wrote different stored procedures and schema.
  • Created Use case, Sequence diagrams, functional specifications and User Interface diagrams
  • Implemented Model-View-Controller (MVC) software architecture in web applications to render the HTML views.
  • Used Lists, Trees, Toolbars, Menus, and Context Menus for navigating between pages in WPF.
  • Developed a Windows-based GUI using WPF and Expression Blend, with one-way, two-way, and one-way-to-source data binding.
  • Assisted in implementing GIS tracking of vehicles for e-Dispatch.
