
Sr. Data/ML Engineer Resume


San Francisco

SUMMARY:

  • Lead Data Engineer with over 8 years of experience in scalable, high-performance Big Data, Machine Learning, OLTP, and OLAP environments
  • Experience in the Hadoop ecosystem; implemented end-to-end solutions on all the major Hadoop distributions, including Hortonworks, MapR, and Cloudera
  • Experience in designing pipelines to bring external data, for example Confidential atmospheric chemistry data and weather forecast data, into Hadoop. Strong experience with both customer-facing and R&D projects
  • Deep, intuitive understanding of core statistical concepts, such as probability, randomness, correlation and sampling distributions
  • Proven experience in service development (REST APIs, microservices)
  • Deep understanding of modern machine learning methods for regression and classification
  • Expertise in architecting data pipelines to load weblog, clickstream, and impression data into Hadoop
  • Experience in building recommender systems using PredictionIO and collaborative filtering (a sketch follows this list)
  • Expertise in installing and creating ETL pipelines using the Confidential data integration tool
  • Experience in working with near real-time data, processing it with Flume and Kafka
  • Experience in data migration, cleaning, transformation and loading from legacy sources to DWH
  • Experience in implementing the Confidential Big Data Integration solution. Implemented solutions for Visa, AMEX, Aldo, Stanford Research, Match.com, Truecar, Home Depot, and TE Connectivity
  • Experience working with the reporting tools SAP Business Objects, SSRS, and Tableau
  • Experience in designing and implementing OLTP databases (RDBMS modeling), ETL, and reporting for big clients like US Mint, Forest Labs, Johnson & Johnson, Activision, and Samsung
  • Helped Confidential grow the business from 3 million to 10 million across various clients
  • Expertise in project management: project scoping, planning, estimating, scheduling, organizing, and budgeting
  • Expertise in defining roadmaps (MRD/PRD) based on product differentiation by target segments and customer requirements
  • Managed cross-functional teams and multi-disciplinary projects across different geographical locations
  • Expertise in negotiating deals of high complexity with creative win-win solutions for both parties
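
A minimal sketch of the item-based collaborative filtering referenced above, assuming a small pandas user-item interaction matrix; the users, items, and helper function are illustrative, not drawn from any client system:

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative user-item interaction matrix (1 = viewed/purchased).
interactions = pd.DataFrame(
    [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [1, 0, 1, 1]],
    index=["user_a", "user_b", "user_c"],
    columns=["item_1", "item_2", "item_3", "item_4"],
)

# Item-item cosine similarity over the interaction columns.
item_sim = pd.DataFrame(
    cosine_similarity(interactions.T),
    index=interactions.columns,
    columns=interactions.columns,
)

def recommend(user, k=2):
    """Score unseen items by their similarity to the user's seen items."""
    seen = interactions.loc[user]
    scores = item_sim[seen[seen > 0].index].sum(axis=1)
    return scores[seen == 0].nlargest(k)

print(recommend("user_a"))
```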

TECHNICAL SKILLS:

Big Data: Hadoop (Cloudera, Hortonworks, Pivotal, MapR distributions), Spark, Spark Streaming

Machine Learning: Apache PredictionIO, Convolutional Neural Networks, NLP, Google Cloud ML, IBM Watson, TensorFlow, Keras, SparkML

Database/MPP: Greenplum, Vertica, HAWQ, Hive, SQL Server, MySQL, Oracle, Spark SQL, MongoDB

Cloud: Google Cloud Platform, AWS

Languages: SQL, Shell scripting, Pig, Python, Scala

ETL Tools: Sqoop, SSIS, DataStage

Reporting Tools: SAP Business Objects, SSRS

SDLC Methodologies: Agile (Scrum), Waterfall

Project Management: MS Project, MS Office, Trello, JIRA

PROFESSIONAL EXPERIENCE:

Confidential, San Francisco

Sr. Data/ML Engineer

Responsibilities:

  • Subject Matter Expert in Big Data, Machine Learning and Natural Language Processing
  • Developed the ETL pipeline and product recommendation algorithms for Macys.com. The recommendation system handles 90 GB of search, view, and user purchase data daily and is projected to bring in $70 million of revenue for 2017
  • Implemented use cases such as boosted product search, price boosting, the prop card model, personalized deals, and new arrivals on PredictionIO
  • Set up PredictionIO in the production environment on a technology stack of HBase, Spark, and Scala, based on the Cross-Occurrence algorithm
  • Working on tuning the existing product recommendation models for better performance.
  • Used the Universal Recommender template for PredictionIO to implement personalized recommendations, “viewed this, bought that”, item-based cross-action recommendations, and complementary purchases based on the product category hypothesis (a sketch of the event and query flow follows this list)
  • Implemented Flume to capture event data in Hadoop
  • Supporting data from various sources: Coremetrics data, product catalog data, store transaction data, user data, Pinterest data, likes data, events data for reporting, and marketing and email assembly data. Source systems are Hadoop, DB2, and Oracle
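
A hedged sketch of how behavioral events and recommendation queries flow through a PredictionIO deployment like the one described above, using the official predictionio Python SDK; the access key, URLs, and entity IDs are placeholders, not production values:

```python
import predictionio

# Send a behavioral event (e.g., a product view) to the Event Server.
event_client = predictionio.EventClient(
    access_key="YOUR_ACCESS_KEY",   # placeholder
    url="http://localhost:7070",    # Event Server endpoint
)
event_client.create_event(
    event="view",
    entity_type="user",
    entity_id="u123",
    target_entity_type="item",
    target_entity_id="sku-456",
)

# Query the deployed recommender engine for personalized results.
engine_client = predictionio.EngineClient(url="http://localhost:8000")
result = engine_client.send_query({"user": "u123", "num": 4})
print(result)
```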

Confidential, San Francisco

Big Data Engineer (Consultant)

Responsibilities:

  • Architected and led the technical solution for QA of the Dredge data platform, involving Pig and Spark
  • Gathered detailed business and technical requirements and participated in the definitions of business rules and data standards.
  • Worked with product stakeholders to create the product roadmap. Coordinated with multiple teams to document use cases for clickstream and impression data
  • Performed sentiment analysis and topic/content classification and categorization with deep learning
  • Processed clickstream data in Hadoop and moved the aggregated data to Vertica (a sketch follows this list)
  • Processed the impressions data in Vertica.
  • Designed the experimentation ETLs using Spark
  • Benchmarked the Hadoop cluster for better performance
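
A minimal PySpark sketch of the clickstream aggregation step described above; the input path, event schema, table name, and Vertica connection details are assumptions, and the Vertica JDBC driver would need to be on the Spark classpath:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream-agg").getOrCreate()

# Assumed layout: one JSON click event per line with user_id, page, ts fields.
clicks = spark.read.json("hdfs:///data/clickstream/2017/05/")

# Daily page-view counts per user -- the kind of aggregate moved to Vertica.
daily = (
    clicks
    .withColumn("day", F.to_date("ts"))
    .groupBy("day", "user_id", "page")
    .agg(F.count("*").alias("views"))
)

# Write the aggregate to Vertica over JDBC (connection details are placeholders).
(daily.write
    .format("jdbc")
    .option("url", "jdbc:vertica://vertica-host:5433/analytics")
    .option("dbtable", "clickstream_daily")
    .option("user", "etl_user")
    .option("password", "REDACTED")
    .mode("append")
    .save())
```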

Confidential, San Ramon, CA

Big Data Evangelist

Responsibilities:

  • Involved in planning, estimation, scheduling, and budgeting of the external data project for the data lake platform
  • Led the team of ETL engineers and architected the data pipelines using Confidential
  • Designed the pipeline to bring external data from the NWS source into the HAWQ database using the Confidential REST client, storing the data with Parquet compression for the windmills (a sketch follows this list)
  • Provided Confidential training to 30 developers
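
A hedged sketch of the external-data pull described above: fetching a forecast from the public NWS REST API and landing it as Parquet, which HAWQ can then read. The coordinates and output path are illustrative, pandas with pyarrow is assumed for the Parquet write, and the production pipeline used the Confidential REST client rather than raw requests:

```python
import pandas as pd
import requests

# NWS API: resolve a point to its forecast endpoint, then fetch the forecast.
# Coordinates and output path are illustrative placeholders.
point = requests.get(
    "https://api.weather.gov/points/39.74,-104.99",
    headers={"User-Agent": "external-data-pipeline"},
    timeout=30,
).json()

forecast = requests.get(
    point["properties"]["forecast"],
    headers={"User-Agent": "external-data-pipeline"},
    timeout=30,
).json()

# Flatten the forecast periods and land them as Parquet for HAWQ to read.
periods = pd.json_normalize(forecast["properties"]["periods"])
periods.to_parquet("/data/external/nws_forecast.parquet", compression="snappy")
```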

Confidential, Redwood City, CA

Data Engineering Manager

Responsibilities:

  • Worked closely with the presales team to deliver demos and POCs. Delivered training and implemented ETL solutions
  • Actively worked with the R&D team to create components in the core Confidential product
