Sr. Data Scientist Resume Secaucus, NJ - Hire IT People

SUMMARY

Results-driven IT professional with strong expertise as a Data Scientist and a passion for delivering valuable data through analytical functions and data retrieval methods, and for implementing action-oriented solutions to complex business problems.
Data Professional with 8+ years of experience in Data Analysis, Machine Learning, Artificial Intelligence, Data Visualization, ETL, Data Warehousing, Cloud services and the Big Data Ecosystem.
Demonstrated expertise in all phases of the CRISP-DM methodology, including Business Requirements, Data Collection, Data Modeling, Model Development and Model Deployment of Data Science ML projects.
Able to synthesize quantitative information and communicate it effectively to business stakeholders.
Ability to analyze unstructured data from various sources like Google Maps API, Yelp, ArcGIS and competitor websites using Python web-scraping techniques.
Strong experience in Customer Analytics. Collaborated with Operations, Finance, Marketing, CRM and Web Analytics teams on multiple business initiatives by building Machine Learning models.
Proficient in writing and executing SQL queries in Spark Context and Snowflake.
Hands-on experience in PySpark: creating DataFrames, applying operations such as transformations and actions, and building reports and data mining pipelines. Knowledge of Kafka streaming.
Experience processing Big Data in the Hadoop architecture, leveraging HDFS and components of its ecosystem such as Hive, Spark and Impala.
Hands-on experience joining multiple data sources such as Oracle, Teradata, SQL Server, AWS Redshift and Snowflake during the data collection phase of model building.
In-depth understanding of choosing evaluation metrics for both Classification and Regression algorithms.
Experience in leveraging high-compute AWS EC2 instances to speed up the Feature Selection and Hyper-Parameter Tuning stages of Machine Learning modeling.
Extensive working knowledge of datasets with Linear and Non-Linear relationships. Expert in feature engineering and statistical analysis.
Hands-on experience in building Time Series Forecasting models using the SARIMAX and Prophet algorithms.
Ability to generate insights from data visualizations in Power BI and Tableau and present them to business partners.
Good understanding of Text Processing and Image Processing concepts, and of Computer Vision and Natural Language Processing algorithms. Knowledge of AWS SageMaker and Spark ML services.
Resilient problem-solving mindset and strong research capabilities.

TECHNICAL SKILLS

Methodologies: Waterfall, Agile and CRISP-DM

Machine Learning: Linear Regression, Logistic Regression, Random Forests, Cross Validation, Naïve Bayes, K-Means Clustering, Model Selection, Feature Selection, Constraint Programming, Lookalike Modeling, Churn Prediction, Hyper-Parameter Tuning, NLP, TF-IDF, CNN and LSTM

BI Tools: Jupyter Notebooks, SAP BEx Analyzer, Business Objects, Microsoft Excel, Tableau, Power BI and ESRI ArcGIS

Programming: Python (Pandas, NumPy, Scikit-Learn, SciPy, Matplotlib, BeautifulSoup, Statsmodels, PySpark, Keras, TensorFlow, NLTK, OpenCV, Skimage, PyTorch and Flask), SQL, HTML, XML and CSS

Cloud: AWS EC2, S3, EMR, Lambda, CloudWatch, DynamoDB, IAM, Redshift and Snowflake

Databases: MS SQL Server, Oracle, DB2, 1010data, Teradata, DynamoDB and MongoDB

Big Data Ecosystem: HDFS, Hue, Hive, Spark, Sqoop, Pig and Impala

PROFESSIONAL EXPERIENCE

Confidential - Secaucus, NJ

Sr. Data Scientist

Responsibilities:

Incorporated ESRI ArcGIS data variables to build an XGBoost machine learning model that predicts annual store sales for new prospect locations across the country.
Achieved an R² of 79.05% with a 30% improvement in cross-validation performance over the consultant's solution.
Replaced the consulting firm's solution, which cost the company $100K, with the in-house model.
Engineered walk score, bike score and livability score features for each store location. Used web scraping techniques to extract and analyze competitor data.
Used satellite images from the Google Maps API as source data to develop an alternative computer vision neural network model in Python and Keras to classify worst-, average- and best-performing categories, complementing the current in-house model.

Environment: Python, Scikit-Learn, Pandas, NumPy, Matplotlib, re, BeautifulSoup, ESRI ArcGIS, Keras, OpenCV, Skimage, Google Maps API, SQL, Snowflake, AWS EC2, AWS IAM, Flask, Shell, Linux, MS Excel

Confidential - Secaucus, NJ

Sr. Data Scientist

Responsibilities:

Participated in the organization's quarterly finance sales forecast consensus meetings with executives.
Aggregated terabytes of transaction data from the data lake using the Spark SQL API for each channel.
Developed time-series forecasting machine learning models for every channel (B&M, Web and ADP) using Python and Statsmodels.
Selected a parsimonious model by iterating between the Prophet and SARIMAX algorithms and choosing the one with the lowest information criteria (AIC and BIC scores).

Environment: Python, Pandas, NumPy, Matplotlib, Statsmodels, Prophet, MS Excel, Spark

Confidential - Secaucus, NJ

Sr. Data Scientist

Responsibilities:

Developed critical reports at the Sales, Customer and SKU levels to support informed decisions for the Store-in-Store business initiative. Capitalized on Snowflake for faster data retrieval.
Devised a monthly ADP incremental analysis report tracking metrics such as customer penetration, subscription cancellation and probability of being active in ADP for both the Store and Web channels.
Leveraged AWS EC2 instances to speed up the feature selection process in the Machine Learning model building pipeline.
Generated leads using Google Maps APIs for the Operations team to follow up on a new business initiative targeting small commercial businesses across the country.
Created dashboards to analyze customer shopping habits and sales transfer for closed stores using advanced SQL querying and visualization tools such as Power BI.

Confidential - Secaucus, NJ

Sr. Data Scientist

Responsibilities:

Restructured labor schedules for all B&M stores by combining Constraint Programming with Genetic Algorithms to encode hard and soft constraints in an optimization machine learning model.
Estimated an annual ROI of $16M from placing the right talent in the right selling time intervals and reducing labor at the lowest-performing stores.
Overhauled and automated the end-to-end Payroll process, from schedule generation and organization through emailing schedules to store, district and regional managers, saving hundreds of man-hours.
Deployed the Machine Learning model using Flask on an AWS EC2 instance for the Ops team to create schedules for stores.
Integrated employee data from Kronos and foot-traffic IoT data from RetailNext in Spark using PySpark.
Collaborated with business stakeholders from the Operations, Finance and Business Intelligence teams on the development of the model, which increased productivity and cut unnecessary costs.

Environment: Python, Scikit-Learn, Pandas, NumPy, Matplotlib, PySchedule, SQL, Snowflake, AWS EC2, AWS IAM, Flask, Shell, Linux, MS Excel, Spark
