Data Scientist Resume Pittsburgh, PA - Hire IT People

SUMMARY

12+ Years of Experience in Data Warehouse, Business Intelligence and Data Science - Big Data Analytics technologies
4+ Years of experience in Big Data Analytics with hands-on experience in Data Extraction, Data Analysis, Data Loading and Data Visualization using the Cloudera Platform (Sqoop, Flume, Pig, Hive, HBase, Spark), R and other platforms
Domain experience in Banking, Insurance, Retail, Telecom, Revenue Authority
Experience in data science, including collecting and cleaning data, exploratory data analysis, applying machine learning algorithms to develop predictive models, and creating visualizations to support decision-making
Hands-on experience importing and exporting data between HDFS and relational database systems using Sqoop, and analyzing data using Hive, Impala and Pig Latin scripts
Experience in acquiring structured and unstructured data from a variety of sources, including relational databases and web scraping, and loading it into distributed storage such as HDFS on a Hadoop platform
Proficient in Spark: loading data from the local file system, HDFS, Amazon S3, and relational and NoSQL databases; using Spark SQL; importing data into RDDs; and ingesting data from a range of sources using Spark Streaming
Proficient in the R programming language, including data extraction, data cleaning, data loading, data transformation, predictive modeling and data visualization using R and Tableau
Hands-on experience in R with machine learning algorithms: clustering (including k-means), random forests, decision trees, time-series analysis, regression and association rules
11+ Years of experience in Data Warehouse (DWH) and Business Intelligence implementations using Oracle ETL and BI tools
Designed Enterprise Data Warehouse, Dimensional Models and BI Reporting Solutions
2+ Years of experience in ERP Applications that include SAP FICO & Oracle Financials Modules
11+ Years of experience in retail banking operations

TECHNICAL SKILLS

Big Data - Hadoop Ecosystem: Cloudera Platform - Sqoop, Flume, Pig, Hive, HBase and Spark

RDBMS: Oracle 8.x/9.x/10.x/11.x, MySQL 5.x; NoSQL: MongoDB

ETL & BI Tools: OWB, Oracle BI Tools, Tableau 7.x/8.x

Data Modeling Tools: Oracle Data Modeler, Erwin

Programming Language: R, Python

OS: Windows 7/8/8.1, Linux

Other Tools: MS Project, MS Office Suite, AWS (cloud computing)

PROFESSIONAL EXPERIENCE

Confidential, Pittsburgh, PA

Data Scientist

Responsibilities:

Worked with project team to understand the problem and business requirements
Worked with developers to extract data from HDFS to Spark shell for analysis
Imported data into R for exploring and understanding data
Explored the data and data structures to develop the model
Prepared data for creating training and test sets
Developed a credit risk model to identify risky bank loans using a decision tree algorithm
Communicated results using presentations and visualizations

Environment: Linux, Hadoop, Hive, MySQL, Spark, R, RStudio, Tableau

Confidential, New York

Data Scientist

Responsibilities:

Involved in extracting data from source systems to HDFS
Imported data from HDFS to Hive using Sqoop
Prepared data for exploratory analysis using data munging
Segmented data by implementing the k-means algorithm
Developed and evaluated the model and improved its performance
Deployed the model in the production environment
Created visualizations using R

Environment: Linux, Hadoop, MySQL, R, RStudio

Confidential, Sterling, VA

Data Scientist

Responsibilities:

Gathered requirements for various data mining projects
Worked with other team members on developing Hive/Impala scripts for extraction, transformation and loading of data
Involved in loading data from Hive and importing it into R for data analysis and visualization
Responsible for data preparation and exploratory analysis to develop machine learning models
Created standard data summaries, extracted subsets of data, and split data into partitions
Created various types of data visualizations using R and Tableau

Environment: CDH4, HDFS, Pig, Hive, Impala, Sqoop, Linux, R, Tableau Desktop, Tableau Server

Confidential, Columbus, OH

Data Scientist

Responsibilities:

Involved in loading data from HDFS to Hive using Sqoop for querying with Hive
Performed various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins
Created HBase tables to store data in various formats coming from different portfolios
Used Flume extensively to gather and move log data files from application servers to a central location in the Hadoop Distributed File System (HDFS)
Involved in importing data from Hive to R for data exploration and data cleaning to develop predictive models as per requirements
Developed predictive models in marketing for customer segmentation using algorithms in R

Environment: Hadoop, Java, UNIX, HDFS, Pig, Hive, MapReduce, Sqoop, HBase, Linux, Flume, R
