Data Scientist/Data Analyst Resume
Detroit, Michigan
SUMMARY
- 12+ years of progressive experience in software development, including application architecture, administration, design, and development in the US finance, healthcare, services, public safety, and utility sectors.
- 3+ years of Big Data/Hadoop experience with Hadoop ecosystem components such as Hive, Pig, Flume, Sqoop, ZooKeeper, Spark, Kafka, Impala, and MapReduce.
- 1+ year of Data Science (ML) experience building statistical models by applying classification and regression algorithms using various ML packages (Scikit-Learn).
- Able to assess the viability of a business problem for a big data solution, define the logical architecture of its layers and components (including data capacity planning and node forecasting), and select the right products to implement it.
- Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing. Experience optimizing ETL workflows.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, Oozie, Hive, Sqoop, Pig, Spark, Flume, Cassandra, and MongoDB.
- Experience in architecture, design, installation, configuration, and management of Apache Hadoop clusters and the Cloudera Hadoop Distribution
- Prior experience working as Software Developer in Java/J2EE and related technologies such as JSP, Servlets, Hibernate, JDBC.
- Designed and implemented a stream-processing pipeline workflow that updates users' data in near real time
- Good hands-on experience with data visualization tools such as Tableau, Zeppelin, and PowerBI
- Experience in Data Analysis, Data Cleansing, Data Validation and Verification, Data Conversion, Data Migrations and Data Mining
- Experience working with LDAP user accounts and configuring LDAP on client machines
- Experience developing applications using object-oriented design and development with Microsoft .NET technologies including ASP.NET, C#.NET, VB.NET, ADO.NET, WPF, WCF, Silverlight, and XML for Web and WinForms development
- Experience using SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server Management Studio, and related SQL tools
TECHNICAL SKILLS
Big Data Platform: Hortonworks (HDP 2.2)/AWS (S3, EC2)/Cloudera (CDH3)
Data Science Platform/Tools: Anaconda Continuum, Python, R
OLAP Concepts: Data warehousing, Data mining concepts
Apache Hadoop Yarn 2.0: HDFS, Pig, Hive, Sqoop, Zookeeper, Oozie
Real Time Data Streaming: Kafka, Spark (Streaming)
Source Control: GitHub, VSS, TFS
Databases and NoSQL: MS SQL Server 2012, Oracle 11g (PL/SQL) and MySQL 5.6, MongoDB
Data Visualization Tools: Tableau, Zeppelin, PowerBI, Matplotlib, ggplot2
Development Methodologies: Agile and Waterfall
Development Tool: Eclipse, Toad, Visual Studio
Programming Languages: Java, .NET
Scripting Languages: JavaScript, JSP, Python, XML, HTML
PROFESSIONAL EXPERIENCE
Data Scientist /Data Analyst
Confidential, Detroit, Michigan
Responsibilities:
- Build statistical models around loan origination to detect fraud for financial risk assessment.
- Analyze large-scale loan datasets and employ state-of-the-art machine learning methods using Scikit-Learn (Python) for categorical prediction and continuous regression in risk analysis
- Reduce key features and propose derived features on loan datasets, then train an ensemble of Naive Bayes, random forest, and logistic regression models for better accuracy metrics
- Analyze unstructured data from call-center voice recordings, blogs, and customer feedback, and recommend ways to reduce customer churn and to up-sell and cross-sell various financial products.
Sr. Hadoop Developer/Admin
Confidential, Trenton, NJ
Responsibilities:
- Designed and built a data pipeline that streams data from client apps to the server over WebSockets; from there, a Kafka consumer reads the data and writes it to the HDFS data store. Spark jobs then read this data from HDFS using Spark SQL and process it in streaming and batch jobs
- Shared responsibility for administration of Hadoop, Hive and Pig
- Managed, reviewed and interpreted Hadoop log files. Involved with the application teams to install Hadoop updates, patches and version upgrades as required
- Leadership: Analyzed the Hadoop cluster and various big data analytics tools including Pig, Hive, and Sqoop. Responsible for building scalable distributed data solutions using Hadoop
- Data Ingestion: Involved in importing and exporting data (SQL Server, Oracle, CSV, and text files) from local/external file systems and RDBMS to HDFS. Loaded log data into HDFS using Flume
- ETL: Data cleansing, integration, and transformation using Pig; managed data from disparate sources.
- Exported analyzed data to relational databases using Sqoop for visualization and report generation
- Data Warehousing: Designed a data warehouse using Hive, created and managed Hive tables in Hadoop
- Workflow Management: Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig
Hadoop Developer/Admin
Confidential, Atlanta, GA
Responsibilities:
- Built an ETL workflow to process sales and marketing data in Hive tables
- Imported data from MySQL and Oracle into HDFS using Sqoop
- Experienced in loading and transforming large sets of structured and unstructured data
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS
- Created Hive internal and external tables, loaded data into the tables, and queried it using HQL
- Used Hue for UI-based Pig script execution, Oozie scheduling, and creating tables in Hive
- Deployed applications using AWS EC2
- Wrote MapReduce Java programs to analyze log data for large-scale data sets
- Participated in building a CDH4 test cluster for implementing Kerberos authentication. Installed Cloudera Manager and Hue
- Performed project management activities such as planning, resource allocation, tracking, and people management
- Created use case, class, package, sequence diagrams using MS Visio.
Sr. Application Developer
Confidential, Huntsville, AL
Responsibilities:
- Designed and developed the application framework, including database design.
- Designed and developed data pipeline using SQL Server 2010, SSIS, SSAS and related technologies.
- Designed, developed, and published a web portal that allows Confidential customers to maintain and communicate with several hardware devices.
- Provided data modeling and database design using ER-Studio and Enterprise Architect.
- Involved in various phases of Software Development Life Cycle (SDLC) of the application
- Developed user interfaces using JSP, HTML, XML, and JavaScript that allow Confidential customers to maintain and communicate with several hardware devices.
- Designed interface modules with Customer Information Systems (CIS) and Billing Systems.
- Designed, developed, and supported Prepay Management interface modules.
- Designed an API for Confidential products that allows third-party vendors to control and manage Confidential hardware devices.
- Built SQL jobs that check the health of the system daily.
- Built and automated test procedures for testing new firmware releases.
- Used Confidential to build various dashboards.
- Designed and developed a GIS mapping solution.
- Implemented security in modules such as Identity, Authentication, and Authorization
Application Developer
Confidential, Huntsville, AL
Responsibilities:
- Involved in analysis, design, and development of a dynamic web application.
- Developed and supported database schema and stored procedures.
- Involved in developing uaSecurity and uaPermissions for the end user depending on counties.
- Developed server-side components using C#.
- Looked after Master Modules and different reports.
- Involved in documentation of various Modules.
- Created entire project in Subversion and created Ant build script for compiling and building the application for various environments
- Configured a Windows Communication Foundation (WCF) service to authenticate clients with Windows credentials for login validation in intranet applications
- Involved in the design and development of a digital dashboard for the Outage Management System.
- Calculated reliability indices such as CAIDI, CAIFI, SAIDI, and SAIFI for a more reliable system.
- Made changes to uaNV Server to display a map for outage customers.
- Designed and developed database schema and stored procedures.
- Set up User Interface, Coding and Design Guidelines.
- Created use case and sequence diagrams, functional specifications, and user interface diagrams using StarUML
- Involved in designing the system using uaOneLink.
- Made uaNV Server for display of map for Incident Tracking.
- Assisted in implementation of the project at client side.
- Wrote different stored procedures and schema.
- Created Use case, Sequence diagrams, functional specifications and User Interface diagrams
- Implemented Model-View-Controller (MVC) software architecture in web applications to render HTML views.
- Used lists, trees, toolbars, menus, and context menus for navigating between pages in WPF.
- Developed Windows-based GUI using WPF and Expression Blend, and implemented one-way, two-way, and one-way-to-source data binding.
- Assisted in implementing GIS tracking of Vehicle for e-Dispatch.
