Data Engineer Resume
SUMMARY
- Around 6 years of IT work experience spanning Data Analytics Engineering and Programmer Analyst roles.
- Experienced with cloud platforms including Amazon Web Services, Azure, and Databricks (on both Azure and AWS).
- Proficient with workflow orchestration and infrastructure tools including Oozie, Airflow, AWS Data Pipeline, Azure Data Factory, CloudFormation, and Terraform.
- Implemented data warehouse solutions consisting of ETL workloads and on-premises-to-cloud migrations, with strong expertise building and deploying batch and streaming data pipelines in cloud environments.
- Worked on Airflow 1.8 (Python 2) and Airflow 1.9 (Python 3) for orchestration; familiar with building custom Airflow operators and orchestrating workflows with cross-cloud dependencies (a minimal operator sketch follows this summary).
- Leveraged Spark as an ETL tool for building data pipelines on cloud platforms such as AWS EMR, Azure HDInsight, and MapR CLDB architectures (see the PySpark ETL sketch after this summary).
- Career interests and future aspirations include, but are not limited to, ML, AI, RPA, and automation.
- Enthusiast of Spark for ETL, Databricks, cloud adoption, and data engineering in the open-source community.
- Proven expertise in deploying major software solutions for high-end clients, meeting business requirements such as big data processing, ingestion, analytics, and on-premises-to-cloud migration.
- Proficient with Azure Data Lake Storage (ADLS), Databricks and IPython notebook formats, Databricks Delta Lake, and Amazon Web Services (AWS).
- Orchestration experience using Azure Data Factory, Airflow 1.8, and Airflow 1.10 on multiple cloud platforms, including leveraging Airflow operators.
- Developed and deployed various AWS Lambda functions using the built-in AWS Lambda libraries, and deployed Lambda functions in Scala with custom libraries.
- Expert understanding of AWS DNS services through Route 53, including Simple, Weighted, Latency, Failover, and Geolocation routing policies.
- Expert understanding of AWS networking and content delivery services through Virtual Private Cloud (VPC).
- Hands-on expertise with IP addressing, access control lists, subnets, NAT instances and gateways, VPC peering, custom VPCs, and bastion hosts.
- Hands-on experience with data analytics services such as Athena, Glue Data Catalog, and QuickSight.
- Worked on ETL migration by developing and deploying AWS Lambda functions to build a serverless data pipeline whose output is registered in the Glue Data Catalog and queried from Athena (a Lambda sketch follows this summary).
- Wrote CloudFormation templates in JSON for networking and content delivery in the AWS cloud environment (a template-deployment sketch follows this summary).
- Addressed complex POCs from the technical end according to business requirements.
- Wrote test cases to achieve unit-test coverage.
- Active Agile team player in production support, hotfix deployment, code reviews, system design and review, test cases, sprint planning, and demos.
- Effectively communicate with business units and stakeholders and provide strategic solutions according to clients' requirements.
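The custom-operator work mentioned above might look roughly like this minimal sketch, assuming Airflow 1.10-style imports; the operator name, connection ID, bucket, and key are hypothetical placeholders rather than details from an actual project.

```python
# Minimal custom-operator sketch (Airflow 1.10-style imports); all names are hypothetical.
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults
from airflow.hooks.S3_hook import S3Hook


class S3KeyToLocalOperator(BaseOperator):
    """Download a single S3 key to a local path so a downstream task can consume it."""

    template_fields = ("s3_key", "local_path")

    @apply_defaults
    def __init__(self, s3_key, bucket_name, local_path,
                 aws_conn_id="aws_default", *args, **kwargs):
        super(S3KeyToLocalOperator, self).__init__(*args, **kwargs)
        self.s3_key = s3_key
        self.bucket_name = bucket_name
        self.local_path = local_path
        self.aws_conn_id = aws_conn_id

    def execute(self, context):
        hook = S3Hook(aws_conn_id=self.aws_conn_id)
        # In Airflow 1.10, get_key returns a boto3 S3.Object.
        obj = hook.get_key(self.s3_key, bucket_name=self.bucket_name)
        obj.download_file(self.local_path)
        self.log.info("Downloaded s3://%s/%s to %s",
                      self.bucket_name, self.s3_key, self.local_path)
```

In a DAG the operator is instantiated like any built-in operator, with the S3 key and local path passed as (templatable) arguments.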
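A minimal sketch of the Spark-as-ETL pattern referenced above, assuming a CSV landing prefix in S3 and a curated Parquet zone; the bucket names and columns are illustrative only.

```python
# Minimal PySpark ETL sketch; buckets, paths, and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV files from the S3 landing prefix.
raw = spark.read.option("header", "true").csv("s3://example-landing/orders/")

# Transform: cast types, derive a date partition column, and drop incomplete rows.
clean = (
    raw.withColumn("order_amount", F.col("order_amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropna(subset=["order_id", "order_amount"])
)

# Load: write partitioned Parquet to the curated zone.
clean.write.mode("overwrite").partitionBy("order_date").parquet("s3://example-curated/orders/")
```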
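The serverless Glue/Athena pipeline mentioned above could be sketched as a Lambda handler along these lines; the crawler, database, table, and result-bucket names are hypothetical, and the crawler/query sequencing is deliberately simplified.

```python
# Sketch of a serverless-pipeline Lambda handler; all resource names are hypothetical.
import boto3

glue = boto3.client("glue")
athena = boto3.client("athena")


def handler(event, context):
    # Register newly landed S3 data in the Glue Data Catalog by running a crawler.
    # (The crawler runs asynchronously; a real pipeline would wait for completion
    # or trigger the query from a separate event.)
    glue.start_crawler(Name="example-orders-crawler")

    # Kick off an Athena query against the cataloged table; results land in S3.
    response = athena.start_query_execution(
        QueryString="SELECT count(*) FROM orders",
        QueryExecutionContext={"Database": "example_db"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )
    return {"query_execution_id": response["QueryExecutionId"]}
```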
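A CloudFormation template of the kind described above, reduced here to a single VPC resource and deployed with boto3; the stack name, CIDR block, and tag values are illustrative assumptions.

```python
# Sketch: a minimal JSON CloudFormation template for a VPC, deployed via boto3.
import json

import boto3

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "ExampleVPC": {
            "Type": "AWS::EC2::VPC",
            "Properties": {
                "CidrBlock": "10.0.0.0/16",
                "EnableDnsSupport": True,
                "EnableDnsHostnames": True,
                "Tags": [{"Key": "Name", "Value": "example-vpc"}],
            },
        }
    },
}

cloudformation = boto3.client("cloudformation")
cloudformation.create_stack(
    StackName="example-network-stack",
    TemplateBody=json.dumps(template),
)
```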
PROFESSIONAL EXPERIENCE
Confidential
Data Engineer
Responsibilities:
- Added value to Digital Manufacturing data products by contributing as an Insights Data Analyst, building business-driven sourcing solutions based on business users' requirements.
- Data stack typically includes AWS, Snowflake, DynamoDB, S3, RDS, AI and ML data exploration, RPA correlation and causation analysis, Spark SQL, SQL, data modeling, Tableau, and Excel.
- Communicated data analytics findings for Digital Manufacturing, audit data, and distribution center analyses.
- Contributed to design reviews with cross-functional technical teams and communicated technical findings to visualization developers.
- Investigated data quality issues and generated presentable narratives about possible biases due to incomplete data.
- Good understanding of data ingestion, Airflow operators for data orchestration, and related Python libraries.
- Made Python API calls and landed data from external sources into S3 (see the sketch at the end of this role).
- Analyzed machine data and created Excel visualization plots as a story narrative for business users and product owners.
- Used Tableau and Excel for visualization charts and regularly communicated findings with product owners.
- Worked with data engineers and data scientists to understand gaps between the Product Integrity datasets and Digital Manufacturing by leveraging advanced analytics.
- Worked on Tableau visualization charts and daily status dashboards.
- Worked in an Agile environment, participated in design reviews and end-to-end UATs, and assisted QA in automating test cases.
- Performed unit testing, UAT, and end-to-end automation design reviews.
- Demonstrated strong communication skills and story narratives during sprint demos to leadership and stakeholders.
Environment: PySpark, AWS, S3, Snowflake, Elastic MapReduce (EMR), Tableau, Airflow, SQL, SSIS, Excel, DynamoDB, Python 3, Spark SQL, NumPy, scikit-learn, Pandas, Boto3, S3cmd.
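A rough sketch of the Python API-to-S3 landing step mentioned in this role; the endpoint URL, bucket, and key are hypothetical.

```python
# Sketch: pull data from an external REST API and land it in the S3 raw zone.
# The URL, bucket, and key below are hypothetical.
import json

import boto3
import requests

s3 = boto3.client("s3")

# Pull a page of records from the external endpoint.
resp = requests.get("https://api.example.com/v1/machines", timeout=30)
resp.raise_for_status()

# Land the raw payload in S3 for downstream processing.
s3.put_object(
    Bucket="example-raw-zone",
    Key="machines/2020-01-01/machines.json",
    Body=json.dumps(resp.json()).encode("utf-8"),
)
```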
Confidential
Data Engineer
Responsibilities:
- Created, developed, and provided production support for AAS models on retail sales, POS, and corporate forecast data by developing and deploying Spark pipelines, ensuring continuous delivery of data from cross-functional teams such as Omni-channel, Demand-Supply, and Global Logistics, and delivering enterprise-cleansed datasets to BI Engineering and Data Science teams.
- Collaborated with solution architects, principal engineers, data scientists, DevOps, data governance, and business analysts to understand the precise business needs behind acceptance criteria, ensuring delivery of the data products that business and product owners leverage for corporate forecasts and forecast-accuracy bias analysis.
- Data stack: Azure Databricks, ADLS, ADF, AAS, DAX, Azure Automation Accounts, Azure Active Directory (AD), Azure IAM security groups, PySpark, Spark SQL, Azure Data Warehouse (ADW), Power BI, MSBI, SSAS, CI/CD, and production support.
- Performed end-to-end delivery of PySpark ETL pipelines on Azure Databricks, transforming data orchestrated via Azure Data Factory (ADF), scheduled through Azure Automation accounts, and triggered using the Tidal scheduler (see the sketch at the end of this role).
- Created Azure AAS models from scratch by designing dimension tables derived from the fact table and normalizing data to create a tabular model, on top of which Power BI reports are generated by business users.
- Created security groups through the CI/CD process and associated object IDs with user groups based on business-domain filters.
- Used GitHub Enterprise and Azure DevOps Repos for version control, with a good understanding of branching strategies while collaborating with peer groups and other teams on shared repositories.
Environment: Azure Databricks, Azure Databases, Azure DevOps, Azure Repos, PySpark, Delta Lake, Azure Data Warehouse, Tidal scheduler, Azure Data Factory (ADF), Data Lake Storage (ADLS), Analysis Services (AAS), Power BI, SQL Server Management Studio (SSMS), Azure Automation Accounts, Runbooks, Webhooks, Spark SQL.
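A minimal sketch of the kind of Databricks PySpark transformation described in this role, writing a curated Delta table for downstream AAS/Power BI models; the ADLS mount paths, table names, and columns are hypothetical.

```python
# Sketch of a Databricks PySpark job writing curated Delta output; paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

# On Databricks, getOrCreate() returns the cluster-provided session.
spark = SparkSession.builder.getOrCreate()

# Read the raw retail-sales feed from an ADLS mount.
sales = spark.read.format("delta").load("/mnt/raw/retail_sales")

# Cleanse and conform the feed before it is modeled in AAS.
curated = (
    sales.dropDuplicates(["order_id"])
         .withColumn("sale_date", F.to_date("sale_ts"))
         .withColumn("net_amount", F.col("gross_amount") - F.col("discount_amount"))
)

# Write a partitioned Delta table to the curated zone for downstream AAS / Power BI models.
(curated.write.format("delta")
        .mode("overwrite")
        .partitionBy("sale_date")
        .save("/mnt/curated/retail_sales"))
```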