
Azure Data Engineer Resume


SUMMARY

  • 8 years of experience in Information Technology, working on data warehousing and providing solutions to develop, maintain, and support client requirements.
  • 3+ years of experience as an Azure Cloud Data Engineer in Microsoft Azure technologies, including Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Azure Synapse Analytics (SQL Data Warehouse), Azure SQL Database, Azure Analysis Services, PolyBase, Azure Cosmos DB (NoSQL), Azure Key Vault, Azure HDInsight big data technologies such as Hadoop and Apache Spark, and Azure Databricks.
  • 4+ years of experience in Teradata database design, implementation, and maintenance, mainly in large-scale data warehouse environments; experienced with Teradata RDBMS utilities including FastLoad, MultiLoad, TPump, FastExport, Teradata SQL Assistant, Teradata Parallel Transporter (TPT), and BTEQ.
  • Well-versed in designing Azure cloud architecture and implementation plans for hosting complex application workloads on Microsoft Azure.
  • Hands-on experience creating pipelines in Azure Data Factory V2 using activities such as Move & Transform, Copy, Filter, ForEach, Get Metadata, Lookup, and Databricks.
  • Extensive experience reading continuous JSON data from different source systems through Azure Event Hub and feeding it to various downstream systems using Azure Stream Analytics and Apache Spark Structured Streaming on Databricks (see the sketch at the end of this summary).
  • Hands-on experience working with different file formats such as JSON, CSV, Avro, and Parquet using Databricks and Data Factory.
  • In-depth knowledge of Spark architecture, including Spark Core, Spark SQL, DataFrames, and Spark Streaming.
  • Extensive knowledge of and hands-on experience implementing cloud data lakes such as Azure Data Lake Storage Gen1 and Gen2.
  • Expert in coding SQL, stored procedures, macros, functions, and triggers.
  • Extensively worked on various database versions of Teradata 14/15/15.10/16/16.20 and AWS Intellicloud system.
  • Involved in full lifecycle of various projects, including requirement gathering, system designing, application development, enhancement, deployment, maintenance and support.
  • Expertise in OLTP/OLAP System Study, Analysis and E-R modeling, developing Database Schemas like Star schema and Snowflake schema (Fact Tables, Dimension Tables) used in relational, dimensional and multidimensional modeling.
  • Experienced working in various software development methodologies, including Agile/Scrum, Iterative, and Waterfall SDLC.
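
As an illustration of the Event Hub streaming work described above, below is a minimal PySpark Structured Streaming sketch. It is an assumption-laden example rather than project code: it reads the hub through the Event Hubs Kafka-compatible endpoint (the actual work may have used the Event Hubs Spark connector or Stream Analytics instead), it assumes the Spark Kafka connector is available on the cluster (it is bundled with Databricks), and the namespace, hub name, connection string, schema, and storage paths are all placeholders.

```python
# Minimal sketch: stream JSON events from Azure Event Hubs (via its
# Kafka-compatible endpoint) into ADLS Gen2 as Parquet.
# All names, paths, and the schema below are placeholders/assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("eventhub-json-stream").getOrCreate()

EH_NAMESPACE = "<namespace>"          # placeholder Event Hubs namespace
EH_NAME = "<eventhub>"                # placeholder event hub name
EH_CONN_STR = "<connection-string>"   # normally read from Azure Key Vault / a secret scope

# Assumed shape of the incoming JSON events.
event_schema = StructType([
    StructField("deviceId", StringType()),
    StructField("reading", DoubleType()),
    StructField("eventTime", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")  # Event Hubs speaks the Kafka protocol on port 9093
    .option("kafka.bootstrap.servers", f"{EH_NAMESPACE}.servicebus.windows.net:9093")
    .option("subscribe", EH_NAME)
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option(
        "kafka.sasl.jaas.config",
        'org.apache.kafka.common.security.plain.PlainLoginModule required '
        f'username="$ConnectionString" password="{EH_CONN_STR}";',
    )
    .load()
)

# The Kafka source delivers the payload as bytes; parse it into typed columns.
events = (
    raw.select(from_json(col("value").cast("string"), event_schema).alias("e"))
       .select("e.*")
)

# Land the parsed stream in the data lake (paths are placeholders).
query = (
    events.writeStream
    .format("parquet")
    .option("path", "abfss://curated@<account>.dfs.core.windows.net/events/")
    .option("checkpointLocation", "abfss://curated@<account>.dfs.core.windows.net/_chk/events/")
    .start()
)
```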

TECHNICAL SKILLS

  • ADFv2, Blob Storage, ADLS, Azure SQL DB, SQL Server, Azure Synapse, Azure Analysis Services, Databricks, Mapping Data Flow (MDF), Azure Cosmos DB, Azure Stream Analytics, Azure Event Hub, Logic Apps, Event Grid, Azure DevOps, ARM Templates
  • FastLoad, FastExport, MultiLoad, Tpump, TPT, Teradata SQL Assistant, BTEQ
  • Informatica PowerCenter 9.x/8.6/8.5/8.1/7, DataStage 11.x/9.x, SSIS
  • PySpark, Python, U-SQL, T-SQL, Linux shell scripting, Azure PowerShell, Java
  • Azure SQL Data Warehouse, Azure SQL DB, Azure Cosmos DB (NoSQL), Teradata, Vertica, RDBMS, MySQL, Oracle, Microsoft SQL Server
  • Hadoop, HDFS, Hive, Apache Spark, Apache Kafka

PROFESSIONAL EXPERIENCE

Confidential

Azure Data Engineer

Responsibilities:

  • Involved in business requirement gathering, business analysis, design, development, testing, and implementation of business rules.
  • Created pipelines, data flows, and complex data transformations and manipulations using Azure Data Factory (ADF) and PySpark with Databricks.
  • Created and provisioned multiple Databricks clusters needed for batch and continuous streaming data processing, and installed the required libraries for the clusters.
  • Designed and developed Azure Data Factory (ADF) pipelines to extract data from relational sources such as Teradata, Oracle, SQL Server, and DB2, and from non-relational sources such as flat files, JSON files, XML files, and shared folders.
  • Developed streaming pipelines using Apache Spark with Python.
  • Developed Azure Databricks notebooks to apply business transformations and perform data cleansing operations.
  • Developed Databricks Python notebooks to join, filter, pre-aggregate, and process files stored in Azure Data Lake Storage (see the sketch after this list).
  • Ingested a huge volume and variety of data from disparate source systems into Azure Data Lake Storage Gen2 using Azure Data Factory V2.
  • Created reusable pipelines in Data Factory to extract, transform, and load data into Azure SQL DB and SQL Data Warehouse.
  • Implemented both ETL and ELT architectures in Azure using Data Factory, Databricks, SQL DB, and SQL Data Warehouse.
  • Experienced in developing an audit, balance, and control framework using SQL DB audit tables to control the ingestion, transformation, and load processes in Azure.
  • Used Azure Logic Apps to develop workflows that send alerts/notifications on different jobs in Azure.
  • Used Azure DevOps to build and release different versions of code in different environments.
  • Automated jobs using schedule, event-based, and tumbling window triggers in ADF.
  • Created external tables in Azure SQL Database for data visualization and reporting purposes.
  • Created and set up self-hosted integration runtimes on virtual machines to access private networks.
  • Well-versed in Azure authentication mechanisms such as service principals, managed identities, and Key Vault.
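
As a rough illustration of the Databricks notebook work described in this list, the sketch below reads raw JSON from ADLS Gen2, cleanses and pre-aggregates it, and writes curated Parquet back to the lake. It is a minimal example under assumed names: the storage account, containers, columns, and business rules are placeholders, not the actual project objects.

```python
# Illustrative Databricks notebook cell (the `spark` session is provided by
# the Databricks runtime). Reads raw JSON from ADLS Gen2, cleanses and
# pre-aggregates it, then writes curated Parquet back to the lake.
# Storage account, container, and column names are placeholders.
from pyspark.sql import functions as F

raw_path = "abfss://raw@<storageaccount>.dfs.core.windows.net/sales/"
curated_path = "abfss://curated@<storageaccount>.dfs.core.windows.net/sales_daily/"

orders = spark.read.json(raw_path + "orders/")
customers = spark.read.json(raw_path + "customers/")

daily_sales = (
    orders
    .filter(F.col("status") == "COMPLETED")           # drop cancelled/failed orders
    .dropDuplicates(["order_id"])                     # basic cleansing
    .join(customers, on="customer_id", how="left")    # enrich with customer attributes
    .withColumn("order_date", F.to_date("order_timestamp"))
    .groupBy("order_date", "region")                  # pre-aggregate before loading downstream
    .agg(
        F.sum("amount").alias("total_amount"),
        F.countDistinct("order_id").alias("order_count"),
    )
)

daily_sales.write.mode("overwrite").partitionBy("order_date").parquet(curated_path)
```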

Confidential

Data Engineer/Teradata Developer

Responsibilities:

  • Involved in business meetings to gather requirements, and participated in business analysis, design, review, development, and testing.
  • Performed tuning and optimization of complex SQL queries using Teradata Explain. Responsible for Collect Statistics on FACT tables.
  • Developed Python scripts for ETL load jobs using pandas functions (see the sketch after this list).
  • Created appropriate Primary Indexes, taking into consideration both the planned access paths and even distribution of data across all available AMPs.
  • Wrote numerous BTEQ scripts to run complex queries on the Teradata database. Created temporal tables and columnar tables utilizing the advanced features of Teradata 14.0.
  • Created tables, views in Teradata, according to the requirements.
  • Provided architecture/development for initial load programs to migrate production databases from Oracle data marts to Teradata data warehouse, as well as ETL framework to supply continuous engineering and manufacturing updates to the data warehouse (Oracle, Teradata, MQ Series, ODBC, HTTP, and HTML). Performed the ongoing delivery, migrating client mini-data warehouses or functional data-marts from Oracle environment to Teradata.
  • Performed bulk data loads from multiple data sources (Oracle 8i, legacy systems) into the Teradata RDBMS using BTEQ, MultiLoad, and FastLoad.
  • Used various transformations such as Source Qualifier, Aggregator, Lookup, Filter, Sequence Generator, Router, Update Strategy, Expression, Sorter, Normalizer, Stored Procedure, and Union. Used Informatica PowerExchange to handle change data capture (CDC) data from the source and load it into the data mart following the slowly changing dimension (SCD) Type II process.
  • Used PowerCenter Workflow Manager to create workflows and sessions, and used various tasks such as Command, Event Wait, Event Raise, and Email.
  • Designed, created and tuned physical database objects (tables, views, indexes, PPI, UPI, NUPI, and USI) to support normalized and dimensional models.
  • Created a cleanup process for removing all the Intermediate temp files that were used prior to the loading process.
  • Created several Tableau dashboard reports and heat map charts, and supported numerous dashboards, pie charts, and heat map charts built on the Teradata database.
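
As an illustration of the pandas-based ETL load scripts mentioned in this list, below is a minimal sketch. File paths, column names, and cleansing rules are placeholders; in the actual jobs the resulting file would be picked up by a Teradata load utility (FastLoad/MultiLoad/TPT).

```python
# Illustrative pandas ETL step: read a pipe-delimited extract, cleanse and
# derive columns, and emit a load-ready file for a downstream Teradata load.
# File paths, column names, and rules below are placeholders.
import pandas as pd

df = pd.read_csv("extract/orders_20200101.csv", sep="|", dtype=str)

# Basic cleansing: trim whitespace, drop fully empty rows, standardize dates.
df = df.apply(lambda s: s.str.strip())
df = df.dropna(how="all")
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce").dt.strftime("%Y-%m-%d")

# Simple derivation and filtering rules (illustrative only).
df["amount"] = pd.to_numeric(df["amount"], errors="coerce").fillna(0)
df = df[df["amount"] > 0]

# Write the load-ready file that the FastLoad/MultiLoad job picks up.
df.to_csv("loadready/orders_20200101.dat", sep="|", index=False, header=False)
```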

Confidential

ETL Developer

Responsibilities:

  • Designed and developed ETL processes and created UNIX shell scripts to execute Teradata SQL and BTEQ jobs (a rough equivalent is sketched after this list).
  • Performed bulk data loads from multiple data sources (Oracle 8i, legacy systems) into the Teradata RDBMS using BTEQ, TPT, FastLoad, MultiLoad, and TPump.
  • Assisted the DBA in creating tables and views in the Teradata and Oracle databases.
  • Handled both incremental and migration data loads into Teradata.
  • Enhanced queries in other applications to run faster and more efficiently.
  • Used various Teradata Index techniques to improve the query performance.
  • Developed shell scripts to automate the Call Detail Records Informatica processes and the subsequent concatenation of load-ready files.
  • Extensively used Informatica Client tools - Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, and Informatica Workflow Manager.
  • Developed Source to Target Mappings using Informatica PowerCenter Designer from Oracle, Flat files sources to Teradata database, implementing the business rules.
  • Used the BTEQ and SQL Assistant (Queryman) front-end tools to issue SQL commands matching the business requirements to the Teradata RDBMS.
  • Modified BTEQ scripts to load data from Teradata Staging area to Teradata data mart.
  • Updated numerous BTEQ/SQL scripts, making appropriate DDL changes and completed unit and system test.
  • Created a series of macros for various applications in Teradata SQL Assistant.
  • Responsible for loading millions of records into the warehouse from different sources using MultiLoad and FastLoad.
  • Performed tuning and optimization of complex SQL queries using Teradata Explain.
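
In the actual project these wrappers were UNIX shell scripts; to keep a single language across the examples in this document, the sketch below shows a rough Python equivalent that runs a BTEQ script and fails the job on a non-zero return code. It assumes the Teradata `bteq` client is installed and on the PATH, and the script and log paths are placeholders.

```python
# Rough Python stand-in for the shell wrappers described above: run a BTEQ
# script, capture its output, and fail the job on a non-zero return code.
# Assumes the Teradata `bteq` client is on PATH; all paths are placeholders.
import subprocess
import sys

BTEQ_SCRIPT = "scripts/load_stage_to_mart.btq"   # placeholder BTEQ script
LOG_FILE = "logs/load_stage_to_mart.log"         # placeholder log file

with open(BTEQ_SCRIPT) as script, open(LOG_FILE, "w") as log:
    # bteq reads its commands (.LOGON, SQL statements, .QUIT) from stdin.
    result = subprocess.run(["bteq"], stdin=script, stdout=log, stderr=subprocess.STDOUT)

if result.returncode != 0:
    print(f"BTEQ job failed with return code {result.returncode}; see {LOG_FILE}", file=sys.stderr)
    sys.exit(result.returncode)

print("BTEQ job completed successfully")
```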

Confidential

ETL Developer

Responsibilities:

  • Developed scripts for loading data into the base tables in the EDW using the FastLoad, MultiLoad, and BTEQ utilities of Teradata.
  • Extracted data from various source systems such as Oracle, SQL Server, and flat files as per the requirements.
  • Performed tuning and optimization of complex SQL queries using Teradata Explain.
  • Created a BTEQ script for pre-population of the work tables prior to the main load process. Used volatile tables and derived queries to break complex queries into simpler ones.
  • Involved in loading of data into Teradata from legacy systems and flat files using complex MultiLoad scripts and FastLoad scripts.
  • Developed MLOAD scripts to load data from Load Ready Files to Teradata Warehouse.
  • Heavily involved in writing complex SQL queries to pull the required information from the database using Teradata SQL Assistant.
  • Created a shell script that checks the data file for corruption prior to the load (illustrated in the sketch after this list).
  • Loaded data using the Teradata loader connection, wrote Teradata utility scripts (FastLoad, MultiLoad), and worked with loader logs.
  • Created and automated the loading process using shell scripts, MultiLoad, Teradata volatile tables, and complex SQL statements.
  • Created/Enhanced Teradata Stored Procedures to generate automated testing SQLs.
  • Involved in troubleshooting production issues and providing production support. Streamlined the migration process for Teradata scripts and shell scripts on the UNIX box.
  • Involved in analyzing end-user requirements and business rules based on the given documentation, working closely with tech leads and analysts to understand the current system. Created a cleanup process for removing intermediate temp files used prior to the loading process. Developed unit test plans and was involved in system testing.
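
The corruption check mentioned above was implemented as a shell script; the sketch below shows the same kind of pre-load validation in Python (again keeping one language across the examples). The delimiter, expected field count, and file path are illustrative assumptions, not the actual record layout.

```python
# Illustrative pre-load validation, standing in for the shell script described
# above: verify the data file exists, is non-empty, and that every record has
# the expected number of pipe-delimited fields before MultiLoad/FastLoad runs.
# Delimiter, field count, and path are assumptions for the example.
import os
import sys

DATA_FILE = "loadready/orders_20200101.dat"   # placeholder load-ready file
DELIMITER = "|"
EXPECTED_FIELDS = 12                          # assumed record layout

if not os.path.isfile(DATA_FILE) or os.path.getsize(DATA_FILE) == 0:
    sys.exit(f"{DATA_FILE} is missing or empty; aborting load")

bad_records = 0
with open(DATA_FILE, encoding="utf-8", errors="replace") as f:
    for line_no, line in enumerate(f, start=1):
        if line.rstrip("\n").count(DELIMITER) != EXPECTED_FIELDS - 1:
            bad_records += 1
            print(f"Malformed record at line {line_no}", file=sys.stderr)

if bad_records:
    sys.exit(f"{bad_records} malformed record(s) found; file rejected before load")

print(f"{DATA_FILE} passed pre-load validation")
```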
