Azure Data Engineer Resume
New York, NY
SUMMARY
- Data Engineer with 6 years of experience specializing in developing, implementing, and managing Business Intelligence Solutions using SQL Server, Azure Data Factory, Databricks, and SSIS
- Experience in on-premises to cloud migration projects, including migrating SSIS packages to Azure Data Factory
- Well versed in creating pipelines in ADF using activities such as Move & Transform, Copy, Filter, ForEach, and Databricks
- Experience in data analysis, design, and development of databases for different business applications
- Highly proficient in T-SQL for developing complex stored procedures, triggers, tables, user-defined functions, user profiles, relational database models, data integrity constraints, SQL joins, and queries
- Strong development background in creating pipelines, data flows and complex data transformations and manipulations using SSIS and Azure Data Factory
- Practical understanding of data modeling (dimensional and relational) concepts such as star schema modeling, snowflake schema modeling, and fact and dimension tables
- Well versed in data warehouse concepts, normalization/denormalization techniques for optimum performance in relational and dimensional database environments, and building referential integrity constraints
- Experienced in building, deploying, and managing SSIS packages with SQL Server Management Studio, creating and configuring SQL Server Agent jobs, configuring data sources, and scheduling packages through SQL Server Agent jobs
- Experienced with SSIS performance tuning of control flow, data flow, error handling, and event handlers, and with re-running failed SSIS packages
TECHNICAL SKILLS
ETL: Azure Data Factory, Databricks, SSIS, Spark
Databases: SQL Server, Oracle 11g, MongoDB, Cosmos DB
Operating System: Windows, UNIX
Languages: SQL, PL/SQL, T-SQL
Others: SQL Server Data Tools (SSDT), Red Gate
PROFESSIONAL EXPERIENCE
Confidential, New York, NY
Azure Data Engineer
Responsibilities:
- Design and implementation of greenfield cloud data solutions in Microsoft Azure, using Azure Data Lake Storage Gen2, Azure Data Factory, Azure Logic Apps, Azure Functions, Azure Databricks, Python, and Power BI
- Designed and developed a metadata-driven data architecture to load data files from various sources into Azure Data Lake using Azure Data Factory and Databricks with PySpark and Spark SQL.
- Responsible for developing ETL pipelines to meet business use cases using data flows, Azure Data Factory (ADF), Data Lake, and Azure SQL Data Warehouse.
- Developed preprocessing jobs using PySpark and DataFrames to flatten JSON documents into flat files (a PySpark sketch follows this list).
- Built ETL solutions using Databricks by executing code in notebooks against data in Data Lake and Delta Lake and loading data into Azure DW following the bronze, silver, and gold layer architecture.
- Created complex ETL Azure Data Factory pipelines using mapping data flows with multiple transformations, including schema modifier and row modifier transformations.
- Developed dynamic Data Factory pipelines using parameters and triggered them as desired using events such as file availability on Blob Storage, on a schedule, and via Logic Apps.
- Conducted exploratory data analysis in Jupyter notebooks using PySpark and shared the findings.
- Integrated the end-to-end data pipeline to take data from source systems to target data repositories.
- Responsible for managing data from different sources and loading structured and unstructured data.
- Worked with different file formats such as CSV, JSON, flat, and fixed-width files to load data from source to raw tables.
- Implemented Triggers to schedule pipelines.
- Handled version control with Git and Bitbucket for software and documentation.
- Used various types of activities, including data movement, transformation, and control activities: Copy Data, Data Flow, Get Metadata, Lookup, Stored Procedure, and Execute Pipeline.
- Used Python for data cleaning and preparation on structured and unstructured datasets in Databricks.
- Constructed and optimized complex SQL queries and stored procedures.
- Developed custom email notifications in Azure Data Factory pipelines using Logic Apps and standard notifications using Azure Monitor.
- Persisted data in Synapse views and used Power BI to generate reports for decision making.
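Below is a minimal PySpark sketch of the JSON-flattening step described above. The input path, schema, and column names (order, items, sku) are illustrative placeholders rather than the project's actual data model.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode

spark = SparkSession.builder.appName("flatten-json").getOrCreate()

# Read nested JSON documents from the raw zone of the data lake
# (path and structure are illustrative placeholders).
raw = spark.read.json("/mnt/raw/events/", multiLine=True)

# Flatten: promote nested struct fields to top-level columns and
# explode the array of line items into one row per element.
flat = (
    raw.withColumn("item", explode(col("order.items")))
       .select(
           col("order.id").alias("order_id"),
           col("order.customer.name").alias("customer_name"),
           col("item.sku").alias("sku"),
           col("item.qty").alias("qty"),
       )
)

# Persist the flattened result as a delimited flat file for downstream loads.
flat.write.mode("overwrite").option("header", True).csv("/mnt/curated/events_flat")
```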
Confidential, New York, NY
Data Engineer
Responsibilities:
- Used Databricks along with Azure Data Factory (ADF) to process large volumes of data.
- Performed ETL operations in Azure Databricks by connecting to different relational database source systems using JDBC connectors (a sketch follows this list).
- Developed Python scripts to do file validations in Databricks using PySpark and automated the process using ADF.
- Developed an automated process in the Azure cloud to ingest data daily from a web service and load it into ADLS.
- Analyzed data where it lives by mounting Azure Data Lake and Blob Storage to Databricks.
- Used Logic Apps to take decision-based actions driven by the workflow.
- Developed custom alerts using Azure Data Factory, Azure SQL Database, and Logic Apps.
- Developed Databricks ETL pipelines using notebooks, PySpark DataFrames, Spark SQL, and Python scripting.
- Worked with the enterprise data modeling team on the creation of logical models.
- Development-level experience in Microsoft Azure, providing data movement and scheduling functionality for cloud-based technologies such as Azure Blob Storage and Azure SQL Database.
- Developed JSON definitions for deploying data-processing pipelines in Azure Data Factory (ADF).
- Independently managed development of ETL processes, from development to delivery.
- Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data from different sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, and back again (write-back).
- Migrated on-premises data (Oracle, SQL Server, DB2, MongoDB) to Azure Data Lake Store (ADLS) using Azure Data Factory.
- Designed and deployed many ETL workflows via Azure Data Factory (ADF) and SSIS packages to extract, transform, and load data from SQL Server databases, Excel, and flat file sources into the Data Warehouse.
- Recreated existing application logic and functionality in the Azure Data Lake, Data Factory, SQL Database, and SQL Data Warehouse environment.
- Created notebooks in Azure Databricks and integrated them with ADF to automate the workflows.
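A minimal sketch of the JDBC-based extraction described above, assuming a SQL Server source; the server, database, table, and credentials are illustrative placeholders (on Databricks, secrets would normally come from a secret scope via dbutils.secrets.get).

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-extract").getOrCreate()

# JDBC connection details (illustrative placeholders).
jdbc_url = "jdbc:sqlserver://myserver.database.windows.net:1433;databaseName=sales"

# Extract a source table over JDBC into a Spark DataFrame.
orders = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.Orders")
    .option("user", "etl_user")
    .option("password", "<from-secret-scope>")  # placeholder, not a real credential
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)

# Land the extract as a Delta table in the bronze layer for downstream transforms.
orders.write.format("delta").mode("overwrite").save("/mnt/bronze/sales/orders")
```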
Confidential
Data Engineer
Responsibilities:
- Responsible for development of SQL objects: tables, stored procedures, indexes, and triggers
- Created SSIS packages to load data into the Data Warehouse using various SSIS tasks such as Execute SQL Task, Bulk Insert Task, Data Flow Task, File System Task, Send Mail Task, ActiveX Script Task, and XML Task, along with various transformations
- Extensively used different types of transformations such as Lookup, Slowly Changing Dimension (SCD), Multicast, Merge, OLE DB Command, and Derived Column to migrate data from one database to another
- Involved in query optimization using tools such as SQL Profiler, Database Engine Tuning Advisor, execution plans, and STATISTICS IO
- Developed complex SQL queries and performed database optimization and tuning of long-running SQL queries
- Created event handlers for runtime events and a custom log provider in SQL Server to log events for audit purposes
- Implemented Change Data Capture (CDC) to perform incremental data extraction when importing data into the CDW (a hedged extraction sketch follows this list)
- Extensively used SSIS parallelism and multithreading features to increase performance and decrease ETL duration
- Validated and cleaned source data using different SSIS data transformations such as Script Task, Conditional Split, Lookup, Slowly Changing Dimension, Script Component, Data Conversion, Derived Column, and Merge Join
- Created a custom SSIS ETL framework for loading the Data Warehouse with restartability logic, using DQS and MDM utilities to apply business rules
- Implemented package configurations in SSIS packages for cross environment deployment
- Performed a proof of concept to evaluate Azure technologies that could be utilized for data migration from legacy systems to the cloud
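A minimal sketch of the CDC-based incremental extraction described above, written as a Python/pyodbc illustration rather than the SSIS CDC components used on the project; it assumes CDC is already enabled on the source with an illustrative dbo_Orders capture instance, and the connection string is a placeholder.

```python
import pyodbc

# Connect to the CDC-enabled SQL Server source (placeholder connection string).
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=sales;Trusted_Connection=yes;"
)
cur = conn.cursor()

# Determine the LSN window to extract: from the capture instance's minimum LSN
# (in a real run, the high-water mark persisted by the previous load) up to
# the current maximum LSN.
cur.execute("SELECT sys.fn_cdc_get_min_lsn('dbo_Orders'), sys.fn_cdc_get_max_lsn()")
from_lsn, to_lsn = cur.fetchone()

# Pull the net changes (one row per key, final state) for that window;
# the __$operation column indicates delete (1), insert (2), or update (4).
cur.execute(
    "SELECT * FROM cdc.fn_cdc_get_net_changes_dbo_Orders(?, ?, 'all')",
    from_lsn,
    to_lsn,
)
changes = cur.fetchall()
print(f"{len(changes)} net changes to stage into the CDW")

conn.close()
```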