Data Scientist Resume
Temple Terrace, FL
SUMMARY:
- Over 7+ years of experience in analyzing, designing, developing, testing, maintaining, and supporting applications using R, Python, Big Data, Hadoop, Apache Spark, Scala, Hive, Sqoop, Tableau, and PowerBuilder.
- Drive use case analysis and architectural design around activities focused on meeting business requirements within the tools of the ecosystem.
- Partner with Architecture, Development, and Operations teams to define the architectural vision and direction of a data ecosystem that meets modern data requirements, which may comprise a mix of Big Data storage systems such as Hadoop batch analytics, near-real-time analytics platforms, and NoSQL online application access.
- Design and develop automated test cases to verify solution feasibility and interoperability, including performance assessments.
- Data warehousing and relational database design with MS SQL Server, Oracle, and MySQL.
- Extensive experience in data visualization, including producing tables, graphs, and listings using tools such as Tableau.
- Used advanced analytical techniques to segment customers into actionable segments/micro-segments, enabling a more holistic customer strategy and experience.
- Experience with Hadoop Reference Architectures associated with AWS, Azure, HP, VMWare Infrastructure.
- Proficient in Python, R, and Tableau for data analysis/mining, various analytics, and data visualization implementations.
- Good knowledge in Text Analytics, generating data visualizations using R, Python and creating dashboards using tools like Tableau.
- Experience and knowledge in using various Python packages for data science, such as NumPy, SciPy, pandas, Matplotlib, and scikit-learn.
- Experience in using various R packages for data science, such as ggplot2, tidyr, dplyr, caTools, rpart, and MASS.
- Experience analyzing online user behavior, conversion data (A/B testing), customer journeys, and funnel analysis.
- Excellent Analytical and Communication skills required to effectively work in the field of applications development and maintenance.
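As a rough illustration of the A/B-testing analysis mentioned above, the standard comparison of two conversion rates is a pooled two-proportion z-test. The sketch below is a minimal plain-Python version; the function name and sample figures are invented for illustration and are not from any specific project.

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z statistic for comparing two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Variant A: 120 conversions out of 1000; variant B: 100 out of 1000 (made-up numbers).
z = two_proportion_z(120, 1000, 100, 1000)
```

A |z| above roughly 1.96 would indicate a difference significant at the 5% level; here z is about 1.43, so this made-up example would not reach significance.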
TECHNICAL SKILLS:
RDBMS: SQL Server 2000/2005/2008/2008 R2/2012/2014, Oracle 9i/10g/11g, MySQL, MS Access
Languages: Visual Basic, C, C++, R, Python, Scala
Data Warehousing/BI: Excel, SharePoint, Tableau
Big Data: Hadoop, Spark/Scala, Hive, Sqoop
NOSQL: Cassandra, HBase
Machine Learning: R, Python, Spark MLlib
Operating System: Windows, UNIX, Linux
PROFESSIONAL EXPERIENCE:
Confidential, Temple Terrace, FL
Data Scientist
Responsibilities:
- Evaluated data analytics opportunities to improve the efficiency of the claims-handling process, such as fraud detection.
- Utilized various data analysis and data visualization tools to accomplish data analysis, report design and report delivery.
- Created statistical models based on researched information to provide conclusions that guide the company and the industry into the future.
- Handled missing data after import and encoded categorical variables when needed.
- Split the data into training and test sets, scaling features in both sets when necessary.
- Creatively communicated and presented models to business customers and executives, utilizing a variety of formats and visualization methodologies.
- Modeled the impact of marketing tactics on sales and then forecast the impact of future sets of tactics.
- Developed Scala and SQL code to extract data from various databases
- Used R and python for Exploratory Data Analysis and Hypothesis test to compare and identify the effectiveness of Creative Campaigns.
- Used Scala, Python, R and SQL to create Statistical algorithms involving Linear Regression, Logistic Regression, Random forest, Decision trees, Support Vector Machine for estimating the risks.
- Developed statistical models to forecast inventory and procurement cycles.
- Created and designed reports that use gathered metrics to infer and draw logical conclusions about past and future behavior.
- Created data-ingestion pipelines from various channels using scripts written in Hive and Java.
- Worked with a range of proprietary, industry-standard, and open-source data stores to assemble, organize, and analyze data.
- Mapped customers to revenue to predict the revenue (if any) from a new prospective customer.
- Produced visualizations, summary reports, and presentations using R and Tableau.
- Uploaded data to Hadoop Hive and combined new tables with existing databases.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, and Scala.
- Developed PySpark and Spark SQL/Streaming code for faster testing and processing of data.
- Supported MapReduce programs running on the cluster.
- Created data quality scripts using SQL and Hive to validate successful data loads and the quality of the data.
- Used a workflow scheduler to schedule and manage Hadoop jobs.
- Loaded the aggregated data into Data Mart for reporting, dash boarding and ad-hoc analysis using Tableau and developed a self-service BI solution for quicker turnaround of insights.
- Maintained SQL scripts to create and populate tables in data warehouse for daily reporting across departments.
Environment: R 3.x, Python 2.x, Tableau 9, SQL Server 2012, Spark/Scala, SBT, Hive, Sqoop, Spark ML.
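The train/test split and feature scaling steps described in this role were presumably done with standard R/Python library routines; the dependency-free sketch below only illustrates the mechanic (the split ratio, seed, and min-max scaling choice are illustrative assumptions). Note that the scaler learns its range from the training set alone, so the test set cannot leak into the fit.

```python
import random

def train_test_split(rows, test_ratio=0.25, seed=42):
    """Shuffle rows and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

def min_max_scale(train, test):
    """Scale both sets using the min/max learned from the training set only."""
    lo, hi = min(train), max(train)
    span = (hi - lo) or 1.0          # guard against a constant feature
    scale = lambda xs: [(x - lo) / span for x in xs]
    return scale(train), scale(test)

data = list(range(100))              # stand-in for a single numeric feature
train, test = train_test_split(data)
train_s, test_s = min_max_scale(train, test)
```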
Confidential, Chicago, IL
Data Scientist
Responsibilities:
- Performed data profiling to learn about behavior across features such as traffic pattern, location, date, and time.
- Applied various machine learning algorithms and statistical models such as decision trees, regression models, and K-means using Python and R.
- Developed clinical NLP methods that ingest large unstructured clinical data sets, separate signal from noise, and provide personalized insights at the patient level that directly improve the analytics platform.
- Used NLP methods for information extraction, topic modeling, parsing, and relationship extraction.
- Worked with NLTK library for NLP data processing and finding the patterns.
- Used clustering technique K-Means to identify outliers and to classify unlabeled data.
- Ensured that the model has low False Positive Rate.
- Created and designed reports that use gathered metrics to infer and draw logical conclusions about past and future behavior.
- Worked on Natural Language Processing with NLTK module of python for application development and automated customer response.
- Utilized statistical Natural Language Processing for sentiment analysis, mine unstructured data, and create insights.
- Worked on feature engineering such as feature creating, feature scaling and One-Hot encoding with Scikit-learn.
- Performed Logistic Regression, Random Forest, Decision Tree, and SVM modeling to classify whether a package would be delivered on time on a new route.
- Performed segmentation by implementing the K-means algorithm.
- Implemented a rule-based expert system from the results of exploratory analysis and information gathered from people in different departments.
- Generated detailed report after validating the graphs using R, and adjusting the variables to fit the model.
- Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python.
- Created Data Quality Scripts using SQL and Hive to validate successful data load and quality of the data.
- Wrote MapReduce code to process and parse data from various sources, storing the parsed data in HBase and Hive using HBase-Hive integration.
- Created SQL tables with referential integrity and developed advanced queries using stored procedures and functions using SQL server management studio.
- Used packages such as dplyr, tidyr, and ggplot2 in RStudio for data visualization, generating scatter plots and high-low graphs to identify relationships between variables.
- Created various types of data visualizations using Python and Tableau.
- Communicated results to the operations team to support better decision-making.
- Collected data needs and requirements by interacting with other departments.
Environment: R, Python 2.x, Linux, Tableau Desktop, SQL Server.
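The one-hot encoding mentioned in the feature-engineering bullet above was done with scikit-learn; the dependency-free sketch below shows only the idea behind it (categories become binary indicator columns), with made-up example values.

```python
def one_hot(values):
    """Map each categorical value to a binary indicator vector.

    Returns the encoded rows and the sorted category list that
    defines the column order.
    """
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = [[1 if index[v] == i else 0 for i in range(len(categories))]
            for v in values]
    return rows, categories

rows, cats = one_hot(["red", "green", "red", "blue"])
```

In practice scikit-learn's `OneHotEncoder` also handles unseen categories, sparse output, and multiple feature columns, which this sketch deliberately omits.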
Confidential, Land O Lakes, FL
Data Scientist
Responsibilities:
- Analyzed and prepared data, identifying patterns in datasets by applying historical models; collaborated with senior data scientists to understand the data.
- Performed data manipulation, data preparation, normalization, and predictive modeling; improved efficiency and accuracy by evaluating models in R.
- Focused on customer segmentation: built predictive models through machine learning and statistical modeling and generated data products to support customer segmentation.
- Used R and Python programming to improve the models; upgraded the entire model suite to improve the product.
- Developed a pricing model for various bundled product and service offerings to optimize and predict gross margin.
- Built a price elasticity model for various bundled product and service offerings.
- Performed data transformation methods for rescaling and normalizing variables under the supervision of a Sr. Data Scientist.
- Developed a predictive causal model using annual failure rate and standard cost basis for the new bundled service offering.
- Designed and developed analytics, machine learning models, and visualizations that drive performance and provide insights, from prototyping to production deployment, product recommendation, and allocation planning.
- Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, and R, along with a broad variety of machine learning methods including classification, regression, and dimensionality reduction.
- Partnered with the sales and marketing teams and collaborated with a cross-functional team to frame and answer important data questions.
- Prototyped and experimented with ML/DL algorithms and integrated them into production systems for different business needs.
- Worked on multiple datasets containing two billion values of structured and unstructured data about web application usage and online customer surveys.
- Segmented the customers based on demographics using K-means Clustering
- Explored different regression and ensemble models in machine learning to perform forecasting
- Presented dashboards to senior management for deeper insights using Power BI.
- Used classification techniques including Random Forest and Logistic Regression to quantify the likelihood of each user referring
- Applied boosting methods to the predictive model to improve its efficiency.
- Designed and implemented end-to-end systems for data analytics and automation, integrating custom visualization tools using R, Tableau, and Power BI.
Environment: MS SQL Server, R/RStudio, SQL Enterprise Manager, Python, Redshift, MS Excel, Power BI, Tableau, T-SQL, ETL, MS Access, XML, MS Office 2007, Outlook, AS E-Mine.
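The customer segmentation in this role used K-means (via the listed ML tooling). To illustrate only the mechanic of the algorithm, here is a minimal one-dimensional K-means in plain Python; the data points are invented, and the sketch assumes k >= 2 and skips the refinements (multiple restarts, k-means++ initialization) a real library applies.

```python
def kmeans_1d(points, k=2, iters=20):
    """Minimal 1-D K-means: alternate point assignment and centroid update."""
    lo, hi = min(points), max(points)
    # Spread the initial centroids evenly across the data range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = kmeans_1d([1, 2, 3, 100, 101, 102], k=2)
```

On this toy data the two centroids settle at 2.0 and 101.0, splitting the points into the two obvious groups.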
Confidential
Data Analyst
Responsibilities:
- Extensively worked on Informatica PowerCenter Transformations such as Source Qualifier, Lookup, Filter, Expression, Router, Joiner, Update Strategy, Rank, Aggregator, Sequence Generator etc.
- Designed data conversions from a wide variety of sources using the Informatica PowerCenter tool.
- Used Informatica Workflow Manager and Workflow Monitor to create, schedule, and control workflows, tasks, and sessions.
- Created pivot tables and ran VLOOKUPs in Excel as part of data validation.
- Used Informatica PowerCenter for extraction, transformation, and loading (ETL) of data into the data warehouse.
- Worked on data analysis, data discrepancy reduction in the source and target schemas.
- Designed and developed complex mappings with varied transformation logic such as Unconnected and Connected Lookups, Router, Filter, Expression, Aggregator, Joiner, Update Strategy, and more.
- Prepared system requirements (SRS), database specifications (DBS), and software design documents (SDD).
- Responsible for the maintenance of a few applications in PowerBuilder 10.2.
- Used SQL Server 2005 to fix production issues in the background.
- Coordinated delivery and quality activities.
- Involved in testing and validating all fields, functions, programs, and agents, with front-end and back-end code reviews across the application.
- Prepared program specifications, unit tests, test cases, and user manual documents.
Environment: Informatica 8.x, PowerBuilder 10.2, SQL Server 2005.
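The post-load data validation in this role (pivot tables, VLOOKUPs, discrepancy reduction between source and target schemas) boils down to reconciling source and target row sets. The sketch below shows that kind of check in SQL; Python's in-memory sqlite3 stands in for SQL Server, and the table names and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE source (id INTEGER, amount REAL)")
cur.execute("CREATE TABLE target (id INTEGER, amount REAL)")
cur.executemany("INSERT INTO source VALUES (?, ?)",
                [(1, 10.0), (2, 20.0), (3, 30.0)])
cur.executemany("INSERT INTO target VALUES (?, ?)",
                [(1, 10.0), (2, 20.0)])        # row 3 failed to load

# Reconcile row counts between source and target after the load.
src_count = cur.execute("SELECT COUNT(*) FROM source").fetchone()[0]
tgt_count = cur.execute("SELECT COUNT(*) FROM target").fetchone()[0]

# List the ids present in source but missing from target.
missing = cur.execute(
    "SELECT id FROM source EXCEPT SELECT id FROM target"
).fetchall()
```

The `EXCEPT` query plays the role a VLOOKUP plays in Excel: it surfaces every source row with no matching target row.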
Confidential
Data Analyst
Responsibilities:
- Built time series models with ARIMA in R for budget forecasting.
- Developed risk assessment models by using Decision Trees and Analytic Hierarchy Process
- Designed and maintained comprehensive dashboards and metrics to enable real-time business decisions
- Coded SQL queries to extract data and identify granularity issues and relationships between datasets and recommended solutions
- Manipulated, cleansed, and processed data using Excel, Access, and SQL.
- Compared the source data with historical data to perform statistical analysis
- Performed data preprocessing and data cleaning, collected and organized data
Environment: MS Access, R, MS Excel, ETL.
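The ARIMA budget forecasting above was done in R; purely to illustrate the autoregressive idea at the core of ARIMA, here is an AR(1) model fit by ordinary least squares in Python. Full ARIMA adds differencing and moving-average terms, so treat this as a sketch, with a made-up series.

```python
def fit_ar1(series):
    """Fit y[t] = a + b * y[t-1] by ordinary least squares."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

def forecast(series, steps, a, b):
    """Roll the fitted AR(1) model forward to produce future values."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = a + b * last
        out.append(last)
    return out

series = [2.0 * i for i in range(10)]   # toy budget series: 0, 2, ..., 18
a, b = fit_ar1(series)
preds = forecast(series, 2, a, b)
```

On this exactly linear toy series the fit recovers a = 2, b = 1, so the forecast simply continues the trend.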