Azure Data Engineer Resume

SUMMARY

  • Over 8 years of professional experience in IT, including work in Big Data, the Hadoop ecosystem for data processing, data warehousing, and data pipeline design and implementation.
  • Experienced in improving Spark performance through tuning and optimization.
  • Experience with Azure transformation projects and Azure architecture decision-making; implemented ETL and data movement solutions using Azure Data Factory (ADF).
  • Proficient in data governance; experience in data ingestion projects, loading data into the Data Lake from multiple source systems using Talend Big Data.
  • Experience in creating real-time data streaming solutions using Apache Spark/Spark Streaming, Apache Storm, Kafka, and Flume.
  • Exposure to Data Lake implementations; developed data pipelines and applied business logic using Apache Spark.
  • Hands-on experience with Azure Databricks, Azure Data Factory, Azure SQL, and PySpark.
  • Experience in integrating various data sources with multiple relational databases such as SQL Server, Teradata, and Oracle.
  • Excellent knowledge of data analysis, data validation, data cleansing, and data verification, as well as identifying data mismatches, data quality issues, metadata management, and master data management.
  • Experience in creating ETL specification documents, flowcharts, process workflows, and data flow diagrams.
  • Experience in executing batch jobs against data streams with Spark Streaming.
  • Good knowledge of streaming applications using Apache Kafka.
  • Hands-on experience working with Tableau Desktop, Tableau Server, and Tableau Reader in various versions.
  • Extended Hive and Pig core functionality with custom UDFs.
  • Good knowledge of AWS services such as EC2, EMR, S3, Glue Catalog, and CloudWatch.
  • Experience in using Spark SQL to handle structured data from Hive on the AWS EMR platform (m4.xlarge and m5.12xlarge clusters), as sketched below.
  • Experience in designing both time-driven and data-driven automated workflows using Oozie.
  • Implemented Spark using Python and Spark SQL for faster data processing and real-time analysis.
  • Good communication skills, strong work ethic, and the ability to work efficiently in a team, with good leadership skills.
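
A minimal sketch of the Spark SQL pattern above, assuming a Hive-managed table named sales_db.orders with order_date and amount columns (hypothetical names); it reads the table with Hive support enabled and aggregates it with Spark SQL.

    # Minimal PySpark sketch: read a Hive table on an EMR-style cluster and
    # aggregate it with Spark SQL. Table and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("spark-sql-sketch")
        .enableHiveSupport()   # allows Spark to read Hive-managed tables
        .getOrCreate()
    )

    # Expose the Hive table as a temporary view and aggregate with SQL.
    spark.table("sales_db.orders").createOrReplaceTempView("orders")

    daily_totals = spark.sql("""
        SELECT order_date, SUM(amount) AS total_amount
        FROM orders
        GROUP BY order_date
        ORDER BY order_date
    """)

    daily_totals.show(20)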

TECHNICAL SKILLS

Programming Languages: SQL, PL/SQL, Python, R, Scala, Spark, Shell scripting, BigQuery

Databases: NoSQL, Oracle, DB2, MySQL, SQL Server, MS Access, HBase

OLAP & ETL Tools: Tableau, Spyder, Spark, SSIS, Informatica PowerCenter, Pentaho, Talend, Power BI Desktop/Server

Data Modelling Tools: Microsoft Visio, ER Studio, Erwin

Application Servers: WebLogic, WebSphere, Apache Tomcat, JBOSS

Amazon Web Services: EMR, EC2, S3, RDS, CloudSearch, Redshift, Data Pipeline, Lambda

Reporting Tools: JIRA, MS Excel, Tableau, QlikView, Qlik Sense, SSRS, SSIS

Development Methodologies: Agile, Scrum, Waterfall

PROFESSIONAL EXPERIENCE

Confidential

Azure Data Engineer

Responsibilities:

  • Designed and maintained ADF pipelines using activities such as Copy, Lookup, ForEach, Get Metadata, Execute Pipeline, Stored Procedure, If Condition, Web, Wait, and Delete.
  • Configured Logic Apps to send email notifications to end users and key stakeholders using the Web activity.
  • Created notebooks in Databricks using Scala and Spark, capturing data from Delta tables in the Delta Lake.
  • Created Azure Data Factory instances, managed Data Factory policies, and utilized Blob Storage for storage and backup on Azure.
  • Extensive knowledge of migrating applications from internal data storage to Azure; experience building streaming applications in Azure notebooks using Kafka and Spark.
  • Captured SCD Type 2 changes and applied updates, inserts, and deletes per business requirements using Databricks (see the merge sketch after this list).
  • Developed the framework for creating new snapshots and deleting old snapshots in Azure Blob Storage, and set up lifecycle policies to back up data from the Delta Lake.
  • Built Azure notebook functions using Python, Scala, and Spark.
  • Managed Azure Data Lake Storage (ADLS) and Data Lake Analytics and integrated them with other Azure services.
  • Used Databricks Spark jobs for ETL and business transformations.
  • Coordinated with business analysts and enterprise architects to convert business requirements into technical solutions.
  • Designed and implemented database solutions in Azure SQL Data Warehouse and Azure SQL.
  • Reviewed and audited existing solutions, designs, and system architecture.
  • Led system implementations and detailed functional/technical system design.
  • Used Pig as an ETL tool for transformations, joins, and pre-aggregations before storing the data in HDFS.
  • Created data shares to load tables between two database servers, mostly using truncate-and-load.
  • Created linked services for multiple source systems (Oracle, SQL Server, ADLS, Blob, and File Storage).
  • Wrote MapReduce jobs to generate reports on the number of activities created per day from data dumped from multiple sources, writing the output back to HDFS.
  • Supported deployment activities, ensuring all requested components were migrated into the production environment using Microsoft TFS as the project management tool; migrated components included Windows services, configuration loaders, Azure Blob objects, Data Lake, WebJobs, and Azure Data Warehouse components.
  • Reduced dependency on IT Teams by implementing full-fledged Data Ingestion using ADF, Azure Functions and Web Jobs.
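
A hedged sketch of the Databricks Delta pattern behind the SCD Type 2 bullet above. Table names, key columns, and date columns (staging.customer_updates, warehouse.dim_customer, customer_id, is_current, start_date, end_date) are hypothetical; the shape of the logic, expire the current row and append the new version, is the point.

    # Hedged sketch of an SCD Type 2 load on a Delta table in Databricks.
    # All table and column names are placeholders for illustration.
    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("scd2-merge-sketch").getOrCreate()

    updates = spark.table("staging.customer_updates")            # incoming changes
    target = DeltaTable.forName(spark, "warehouse.dim_customer")

    # Step 1: expire the current version of any customer present in the updates.
    (
        target.alias("t")
        .merge(updates.alias("s"),
               "t.customer_id = s.customer_id AND t.is_current = true")
        .whenMatchedUpdate(set={"is_current": F.lit(False),
                                "end_date": F.current_date()})
        .execute()
    )

    # Step 2: append the incoming records as the new current version.
    new_rows = (
        updates
        .withColumn("is_current", F.lit(True))
        .withColumn("start_date", F.current_date())
        .withColumn("end_date", F.lit(None).cast("date"))
    )
    new_rows.write.format("delta").mode("append").saveAsTable("warehouse.dim_customer")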

Environment: SQL, Oracle 12c, PL/SQL, Big Data, Hadoop, Azure Data Lake, Spark, Scala, APIs, Pig, Python, Kafka, HDFS, ETL, MDM, OLAP, OLTP, SSAS, T-SQL, Hive, SSRS, Tableau, MapReduce, Sqoop, HBase, SSIS.

Confidential

Data Engineer

Responsibilities:

  • Followed test-driven development within the Agile methodology to produce high-quality software.
  • Designed and developed horizontally scalable APIs using Python Flask (see the Flask sketch after this list).
  • Conducted JAD sessions, wrote meeting minutes, and documented the requirements.
  • Worked with cloud provider APIs for Amazon (AWS) EC2, S3, and VPC.
  • Worked with Data ingestion, querying, processing and analysis of big data.
  • Tuned and optimized various complex SQL queries.
  • Developed normalized logical and physical database models to design the OLTP system.
  • Extensively involved in creating PL/SQL objects such as procedures, functions, and packages.
  • Performed bug verification, release testing and provided support for Oracle based applications.
  • Used Erwin Model Mart for effective model management, sharing, dividing, and reusing model information and designs to improve productivity.
  • Extensively used Hive optimization techniques such as partitioning, bucketing, map joins, and parallel execution.
  • Worked with real-time streaming using Kafka and HDFS (see the streaming sketch after this list).
  • Worked with Alteryx, a data analytics tool, to develop workflows for ETL jobs.
  • Designed the data marts in dimensional data modeling using star and snowflake schemas.
  • Wrote, tested, and implemented Teradata FastLoad, MultiLoad, DML, and DDL.
  • Used various OLAP operations such as slice/dice, drill down, and roll up as per business requirements.
  • Handled importing of data from various data sources, performed data control checks using Spark and loaded data into HDFS.
  • Designed and implemented importing data to HDFS using Sqoop from different RDBMS servers.
  • Worked with Sqoop commands to import the data from different databases.
  • Designed and developed MapReduce jobs to process data coming in different file formats such as XML.
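
A minimal sketch of the horizontally scalable Flask API pattern referenced above. Route names and the payload shape are hypothetical; the design point is that each instance stays stateless so more instances can be added behind a load balancer.

    # Minimal Flask service sketch: stateless endpoints that can be replicated
    # behind a load balancer. Route and payload names are illustrative only.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/health", methods=["GET"])
    def health():
        # Lightweight endpoint a load balancer can use for instance checks.
        return jsonify(status="ok")

    @app.route("/records", methods=["POST"])
    def create_record():
        payload = request.get_json(force=True)
        # A real service would persist to a shared store (e.g. S3 or RDS),
        # keeping the API instances themselves stateless.
        return jsonify(received=payload), 201

    if __name__ == "__main__":
        # In production this would typically run under gunicorn or uWSGI.
        app.run(host="0.0.0.0", port=5000)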
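
A hedged sketch of the Kafka-to-HDFS streaming pattern mentioned above, written with Spark Structured Streaming rather than any project-specific code. The broker address, topic name, and HDFS paths are placeholders, and the spark-sql-kafka connector is assumed to be available on the cluster.

    # Hedged sketch: consume a Kafka topic with Spark Structured Streaming and
    # land it on HDFS as Parquet. Broker, topic, and paths are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka-to-hdfs-sketch").getOrCreate()

    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")   # placeholder broker
        .option("subscribe", "activity-events")              # placeholder topic
        .option("startingOffsets", "latest")
        .load()
    )

    # Kafka delivers key/value as binary; cast to string before storing.
    parsed = events.select(
        F.col("key").cast("string"),
        F.col("value").cast("string"),
        "timestamp",
    )

    query = (
        parsed.writeStream
        .format("parquet")
        .option("path", "hdfs:///data/activity/raw")              # placeholder path
        .option("checkpointLocation", "hdfs:///checkpoints/activity")
        .outputMode("append")
        .start()
    )

    query.awaitTermination()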

Environment: Erwin, SQL, PL/SQL, Kafka, Spark, APIs, Agile, ETL, HDFS, OLAP, AWS, EMR, Glue, Teradata, Hive, Sqoop, Tableau, MapReduce.

Confidential

Data Analyst

Responsibilities:

  • As a Data Analyst/Data Modeler, I was responsible for all data-related aspects of the project.
  • Created reports in a cloud-based environment using Amazon Redshift and published them to Tableau.
  • Developed Python APIs to dump the array structures in the processor at the failure point for debugging.
  • Worked extensively with ER/Studio on several projects covering both OLAP and OLTP applications.
  • Created SQL tables with referential integrity and developed queries using SQL, SQL*PLUS and PL/SQL.
  • Performed data analysis and data profiling using complex SQL on various source systems, including Oracle (see the profiling sketch after this list).
  • Developed the required data warehouse model using a star schema for the generalized model.
  • Implemented visualized BI reports with Tableau.
  • Worked on stored procedures for processing business logic in the database.
  • Worked extensively with Teradata Viewpoint for performance monitoring and tuning.
  • Performed Extract, Transform and Load (ETL) solutions to move legacy and ERP data into Oracle data warehouse.
  • Developed and maintained data dictionary to create metadata reports for technical and business purpose.
  • Managed database design and implemented a comprehensive snowflake schema with shared dimensions.
  • Worked with normalization and denormalization concepts and design methodologies.
  • Worked on the reporting requirements for the data warehouse.
  • Worked on SQL Server concepts SSIS (SQL Server Integration Services), SSAS (Analysis Services) and SSRS (Reporting Services).
  • Developed complex T-SQL code such as stored procedures, functions, triggers, indexes, and views for the business application.
  • Wrote complex SQL and PL/SQL procedures, functions, and packages to validate data and support the testing process.
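
An illustrative sketch of the kind of data profiling described above. In practice the frame would come from a SQL query against Oracle or Teradata; a small in-memory sample and hypothetical column names are used here so the sketch stays self-contained.

    # Illustrative data-profiling sketch: null counts, distinct counts, ranges,
    # and duplicate detection on a sample frame (column names are hypothetical).
    import pandas as pd

    df = pd.DataFrame({
        "customer_id": [1, 2, 2, 4, None],
        "state": ["CA", "NY", "NY", None, "TX"],
        "balance": [120.5, 98.0, 98.0, 15.25, None],
    })

    profile = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_count": df.isna().sum(),
        "distinct_count": df.nunique(dropna=True),
        "min": df.min(numeric_only=True),
        "max": df.max(numeric_only=True),
    })

    duplicate_rows = df.duplicated().sum()

    print(profile)
    print(f"duplicate rows: {duplicate_rows}")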

Environment: ER/Studio, SQL, Python, APIs, OLAP, OLTP, PL/SQL, Oracle, Teradata, BI, Tableau, ETL, SSIS, SSAS, SSRS, T-SQL.
