Highly effective data enthusiast with years of proven experience developing end-to-end data engineering and analytics solutions from scratch for giants like Confidential, Confidential and the CDC. Striving to offer the best possible data solutions amid business constraints by combining a diverse data background and cutting-edge technology with technical skills obtained through academia. Looking to advance my career through data engineering, architecture and business intelligence projects.
- R, Python, SQL
- Amazon Web Services (AWS)
- Google Cloud Platform (GCP)
- Machine Learning Regression and Classification Modeling
- Natural Language Processing
- Forecasting/Network Optimization
- Descriptive Analytics
- Migration: on-premises to Google Cloud Platform
- Workflow: cross-platform through ODBC/API
- Dash and Plotly
- R Shiny
- Agile Development:
- Kanban workflow management
- Jira (Confluence Workspace)
Data Scientist/Engineer
Confidential, Atlanta, GA
- Contributed to technical development on the Customer Technology Solutions team, which is responsible for extending the customer delivery functionality of the Confidential One app.
- Developed a customer demand forecast and ETA (machine learning) model, along with current-state visibility metrics, to prioritize Confidential One app feature implementation and development; wrote the scripts in Python and SQL from scratch.
- Assisted in production-level maintenance by identifying and documenting data anomalies in production Amazon Relational Database Service (AWS RDS) tables.
- Collaborated with technical and business partners to identify business requirements and workflow gaps and to propose solutions, following Kanban software development principles.
Confidential, Atlanta, GA
- Applied Big Data Engineering and Advanced Analytics techniques to support multiple Confidential Supply Chain Delivery Analytics and Channel Management teams.
- Collaborated with Sr. Leadership, Principal Engineers and the Google Cloud Analytics Implementation Team to derive best practices for working with Big Data across the entire organization as it transitioned from on-premises data warehousing to Google Cloud Platform.
- Developed pioneering dataflow solutions in R, Python and SQL from scratch across multiple platforms such as GCP, Tableau and MS SharePoint.
- These solutions contributed directly to:
- Performance of www.homedepot.com website.
- Warehouse Inventory Planning calculations.
- Automated delivery of dashboards and metrics to Supply Chain Leadership.
- Data migration tools built to scale up to billions of rows of data (~16TB).
- Worked with Google Cloud Analytics Implementation Team
- Tested Python and R kernels on JupyterHub, Google Datalab, as well as the in-house cloud application (Analytics Workbench).
- Invited as an R ‘super-user’ and SME speaker to present R capabilities at Confidential.
- Recipient of internal ‘Best Business Partner’ award.
- Hosted multiple applied R and Python Workshops (Beginner and Advanced) for Analysts and Managers across all Supply Chain teams.
- Designed and developed an inventory management strategy for 60+ distribution facilities, directly contributing to the Confidential One Supply Chain initiative, which aims to reshape the legacy distribution network.
- Collaborated with Sr. Leadership to identify the current and future state of the inventory management strategy across multiple distribution network channels (E-Commerce and Confidential Pro).
- Leveraged Google Cloud Platform, R and advanced analytics techniques to translate the business-defined inventory strategy into actionable output from scratch.
- Identified opportunities to provide customers with more accurate ETA communication during downstream delivery by designing and developing a new analytical workflow.
- Partnered with cross-functional teams to identify the shortcomings of the existing process and evaluated the limitations of the last mile delivery routing tool (vendor software).
- Developed ensemble learning models (boosting/bagging) in R/SQL from scratch and compared the results to existing software to demonstrate performance advantages of the new model.
- Advised Sr. Leadership on how to leverage current organization partnerships to develop a scalable and robust tool that would improve customer service.
- Applied ML and NLP techniques to generate business insight from data sources containing unstructured text data (customer reviews, delivery notes, etc.) for multiple Delivery Analytics teams that focused on improving customer experience during online purchases.
- Collaborated with MSEs and Sr. Leadership to identify specific delivery exceptions and key operational challenges in order to drive an ongoing action plan for improvement.
- Designed and developed forecasting and network optimization models at multiple levels of granularity to identify opportunities that focused on improving delivery speed to customers while reducing logistics costs.
- Provided multiple ad hoc BI reporting solutions through unsupervised ML (clustering), supervised ML (classification and regression) and other statistical tests (e.g., t-test). Bridged the process implementation gaps between business and IT partners (user auths/IT requirements) to ensure timely execution of new methods.
- Developed multiple R Shiny applications to enhance non-technical user capabilities and create data visualization tools.
Data Analyst/IT Specialist
Confidential, Atlanta, GA
- Supported the Emergency Operations Center during the Zika outbreak at Confidential.
- Provided subject matter expertise on project:
- Collaborated with branch leadership, project SME and functional users to form a clear understanding of project objectives on a weekly basis.
- Identified IT resources available for project development to meet all standards of the federal government organization.
- Obtained the necessary official agency certifications and authentications by working directly with Sr. Database Administrators and Information System Security Officers to launch the project development stage.
- The project was designed using Agile methodologies to allow for flexibility during fluctuating priorities.
- Designed a system that focuses on efficiency and can evolve continuously to accommodate future development stages.
- Developed an RDBMS snowflake schema for an encrypted SQL Server 2012 warehouse using Change Data Capture.
- Used R, SQL, Perl and VBA to implement and automate ETL processes:
- FTP and SFTP
- Extracted structured and unstructured data using R from multiple sources and formats (SAS, .csv, .xlsx).
- Wrangled unstructured data using R as the main programming language.
- Created QC dashboards/reports to provide anomaly feedback to the functional users.
- Implemented automation and script interchangeability to reduce human error, improve ease of use, and protect data integrity.
- Integrated an internal tracking process that gave functional users the ability to manipulate and comment on incoming data at the record level.
- Created MS Access user interface for internal tracking.
- Used natural language processing to aid functional users in their tasks.
- Wrote R scripts to load incoming data into the warehouse (SQL Server 2012).
- Wrote R scripts to query and mine data in the warehouse.
- Created QC dashboard to help identify code errors.
Confidential, Atlanta, GA
- Analyzed, archived and managed DNA and WGS fingerprint data to detect clusters of foodborne outbreaks.
- Assisted epidemiologists and PulseNet Database Administration Team in the investigation and control of foodborne outbreaks.
- Trained state and federal personnel on the use of BioNumerics and on cluster detection.
- Performed QA/QC, produced data summary reports, and handled other administrative duties.
- InFORM 2015 Conference Organizational Committee member.