Azure Data Engineer Resume

NY

SUMMARY

  • 13+ years of IT industry experience in Azure cloud services, ETL tools and business intelligence, spanning the analysis, design, development, testing and deployment of various software applications.
  • 4+ years of experience with Azure cloud services, including Azure Databricks, Azure Data Factory v2, Azure cloud storage, Azure Data Lake, Azure Key Vault and Azure DevOps.
  • Good experience developing Spark processing frameworks with PySpark and Spark SQL in Databricks to extract, transform and load data in big data and other file formats (a minimal sketch follows this list).
  • Good experience with Azure SQL DW, Azure Data Lake Storage Gen2 and Azure Blob Storage.
  • Hands-on experience creating mount points to Azure storage services (Azure Data Lake Storage and Blob Storage).
  • Good experience with the Databricks File System (DBFS).
  • Experience developing pipelines and data-driven workflows for data movement and transformation as part of ETL/ELT processes using Azure Data Factory v2.
  • Good experience creating linked services, datasets and integration runtimes.
  • Good experience creating Logic Apps for sending emails.
  • Good experience migrating on-premises data to Azure Data Lake Storage using Azure Data Factory.
  • Good knowledge of Spark architecture and compute, including Spark Core, Spark SQL, DataFrames, driver and worker nodes, stages, executors and tasks.
  • Hands-on experience creating CI/CD pipelines for release management activities.
  • Good experience in repository management using Azure DevOps.
  • Good knowledge of Spark Streaming using file sources, Kafka and Event Hubs.
  • Worked with various business intelligence tools (Qlikview, Tableau) to develop dashboards and reports for data visualization and business analytics.
  • Worked with various ETL tools (Alteryx, Pentaho) to develop workflows that extract, transform and load data into different database systems.
  • Worked extensively with dimensional data modeling, including star schemas, snowflake schemas and slowly changing dimensions, for processing data into data warehouses.
  • Hands-on experience in data analysis, applying data cleansing and data validation techniques to check business rules on data.
  • Experience working with the SDLC and Agile methodologies for more efficient deliveries.
  • Experience in production support activities for business continuity.
  • Ability to work independently and directly with customers.
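
A minimal, illustrative sketch of the Databricks mount and read/write pattern referenced in the summary above; the storage account, container, secret scope and paths are placeholder names, not values from any engagement.

    # Mount an ADLS Gen2 container in a Databricks notebook and load Parquet data
    # into a Delta table. All names below are illustrative placeholders.
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": dbutils.secrets.get("kv-scope", "sp-client-id"),
        "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("kv-scope", "sp-client-secret"),
        "fs.azure.account.oauth2.client.endpoint":
            "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
    }

    # Expose the container under /mnt so notebooks can use ordinary file paths.
    dbutils.fs.mount(
        source="abfss://raw@examplestorage.dfs.core.windows.net/",
        mount_point="/mnt/raw",
        extra_configs=configs,
    )

    # Read raw Parquet, apply a simple cleansing step, and write a Delta table.
    df = spark.read.parquet("/mnt/raw/sales/2023/")
    df_clean = df.dropDuplicates().filter("amount IS NOT NULL")
    df_clean.write.format("delta").mode("overwrite").saveAsTable("curated.sales")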

TECHNICAL SKILLS

Microsoft ETL Cloud Services: Azure Databricks, Azure Data Factory v2

Microsoft Storage Services: ADLS Gen2, Blob Storage, Azure Data Lake (Delta Lake)

Business Intelligence Tools: Tableau, Qlikview

ETL Tools: Alteryx, Pentaho

Programming Languages: Python, PySpark, Spark SQL

Databases: Vertica, Oracle, SQL Server, Kudu, Impala, IBM DB2, Azure SQL

Operating Systems: Windows, Unix

SDLC Methodologies: Agile Methodology

Version Control: Azure DevOps, Git, Perforce

Scheduling Tools: Autosys, ESP

PROFESSIONAL EXPERIENCE

Confidential, NY

Azure Data Engineer

Responsibilities:

  • Developing Spark processing frameworks using PySpark and Spark SQL in Databricks, using the read and write APIs for various file formats.
  • Developing notebooks to extract data in various file formats, transform it, and load detailed and aggregate data into Azure Data Lake, as well as transmitting data to external data warehouses.
  • Developing SCD-type frameworks, pipelines and workflows for incremental loads (illustrated in the sketch after this list).
  • Creating scripts for data manipulation, bad-record checks, null checks and schema enforcement and evaluation.
  • Creating Delta tables from various big data and other file formats, and archiving the data in ADLS and Blob storage.
  • Creating mount points to access Azure storage services, including ADLS Gen2 and Blob Storage.
  • Developing DataFrames and temporary views to create Delta tables with ACID transactions.
  • Working with Azure Data Factory to build pipelines using activities that move data from on-premises sources to Azure storage.
  • Creating data flows with various transformation steps to move data from storage and on-premises sources to data warehouses as part of the ETL/ELT process.
  • Creating pipelines to perform full loads and incremental loads.
  • Creating pipelines to archive data into Azure Blob and ADLS storage.
  • Good understanding of Azure DevOps practices for continuous integration and delivery; creating CI/CD release pipelines to push changes to higher environments.
  • Creating time-based and event-based schedules for pipelines and monitoring them to avoid failures and maintain day-to-day business continuity.
  • Creating linked services and datasets for various storage and on-premises sources to extract data and store it in different file systems and Azure storage.
  • Creating self-hosted integration runtimes to pull data from and push data to on-premises systems.
  • Implemented a Logic App to send a daily email notification with the pipeline status report to the business.
  • Responsible for estimating cluster size and for monitoring and troubleshooting the Databricks Spark clusters.
  • Working in an Agile/DevOps delivery model to deliver proofs of concept and production implementations in iterative sprints.
  • Working on production support activities to ensure business continuity.
  • Interacting closely with business users and providing end-to-end support.
  • Implementing cloud-native ETL processes where needed.
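
A hedged sketch of the incremental (SCD Type 1) Delta load pattern mentioned above; the table names, key column and paths are hypothetical, not taken from the project.

    # Upsert an incremental extract into a Delta table with MERGE (SCD Type 1).
    # Table names, paths and the customer_id key are illustrative only.
    from delta.tables import DeltaTable

    incoming = spark.read.parquet("/mnt/raw/customers/incremental/")
    target = DeltaTable.forName(spark, "curated.customers")

    (target.alias("t")
        .merge(incoming.alias("s"), "t.customer_id = s.customer_id")
        .whenMatchedUpdateAll()      # overwrite changed attributes (Type 1 behaviour)
        .whenNotMatchedInsertAll()   # insert brand-new records
        .execute())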

Environment: Azure Databricks, Azure Data Factory v2, PySpark, Spark SQL, Blob, ADLS, Logic Apps, Azure Key Vault, Azure Data Lake, Databricks File System, Azure DevOps, Vertica, Oracle, SQL Server.

Confidential, NY

Azure Data Engineer

Responsibilities:

  • Developed ingestion pipelines in ADF using various activities, linked services and datasets to extract data from different sources, including on-premises systems, and write it back into Azure cloud storage.
  • Developed pipelines in Azure Data Factory that call notebooks to transform data for reporting and analytics use.
  • Developed pipelines for full loads and incremental loads.
  • Performed unit testing on pipelines using various test scenarios.
  • Developed Spark applications using PySpark and Spark SQL for data manipulation, transformation and aggregation across multiple file formats.
  • Created multiple notebooks in Databricks for reading external data and writing into Delta Lake tables using the read and write APIs.
  • Created mount points to access Azure storage services, including ADLS Gen2 and Blob Storage.
  • Developed DataFrames and temporary views to create Delta tables with ACID transactions.
  • Implemented a Logic App to send a daily email notification with the pipeline status report to the business.
  • Performed data analysis and data quality checks and prepared data quality assessment reports for data integrity (see the sketch after this list).
  • Scheduled jobs in Databricks from the developed notebooks and monitored those jobs.
  • Scheduled ADF pipelines for file-watcher jobs using event triggers, and on a regular cadence using schedule triggers.
  • Created CI/CD release pipelines for release management to push changes to higher environments.
  • Working in an Agile/DevOps delivery model to deliver proofs of concept and production implementations in iterative sprints.
  • Working on production support activities to ensure business continuity.
  • Interacted closely with business users and provided end-to-end support.
  • Implemented cloud-native ETL processes where needed.
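
Below is an illustrative sketch of the kind of data quality check described above, comparing a source extract with its Delta target; all table, path and column names are placeholders.

    # Row-count and null-key checks between a source extract and the Delta target.
    # Table, path and column names are hypothetical.
    from pyspark.sql import functions as F

    source_df = spark.read.parquet("/mnt/raw/orders/2023-06-01/")
    target_df = spark.table("curated.orders").filter("load_date = '2023-06-01'")

    checks = {
        "row_count_match": source_df.count() == target_df.count(),
        "no_null_keys": target_df.filter(F.col("order_id").isNull()).count() == 0,
    }

    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        raise ValueError(f"Data quality checks failed: {failed}")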

Environment: Azure Databricks, Azure Data Factory v2, PySpark, Spark SQL, Blob, ADLS, Logic Apps, Azure Key Vault, Azure Data Lake, Databricks File System, Azure DevOps, Vertica, Oracle, SQL Server.

Confidential, NY

Azure Data Engineer

Responsibilities:

  • Interacting with business users, architects and subject matter experts to gather business requirements and prepare functional and technical specifications.
  • Developed Spark applications using PySpark and Spark SQL in Azure Databricks to perform ETL operations on multiple file formats, transforming and analyzing the data to uncover business insights.
  • Created ADF pipelines to copy data between various sources and destinations.
  • Designed pipelines using Copy Data, Data Flow, Execute Pipeline, Get Metadata, If Condition, Lookup, Set Variable, Databricks Notebook, For Each and Filter activities.
  • Created dynamic pipelines to copy data from on-premises sources to the data lake using parameters and expressions.
  • Implemented pipelines for delta and full loads running daily, weekly and monthly (see the sketch after this list).
  • Created and maintained linked services, datasets, triggers and integration runtimes.
  • Created and managed Azure Blob Storage and Azure Data Lake Storage Gen2 solutions.
  • Creating validation scripts for data moved to the target, validating row counts and data quality.
  • Manually validating the data by pulling source and target data in Databricks and comparing row counts.
  • Developing DataFrames and temporary views to create Delta tables with ACID transactions.
  • Designing and developing Pentaho transformation workflows to extract and transform data from various systems and flat-file sources, apply business rules, and load the data into the data lake for enterprise business operations.
  • Developing Pentaho ETL jobs and transformations, such as metadata injection, transformation executors, database joins and lookups, aggregations, data mappings and data streaming operations, to load data into OLAP systems.
  • Developing Pentaho ETL workflows that perform incremental loads to populate the previous business day's data into the data lake for day-to-day operations.
  • Maintaining version control using Git.
  • Scheduling the workflows in ESP to run jobs on a scheduled basis.
  • Creating release Confluence pages for production deployments.
  • Working on configuration and release management to migrate code to different environments.
  • Working closely with the engineering team to schedule the workflows and monitor them to avoid failures.
  • Working on performance tuning and optimization of jobs, transformations and workflows for greater efficiency and better performance.
  • Handling production support activities and providing optimal solutions for business continuity.
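
A minimal sketch of a parameterised Databricks notebook of the sort an ADF pipeline would call for the delta and full loads noted above; the widget names, paths and table names are assumptions for illustration.

    # Notebook parameters passed by the ADF pipeline (base parameters on the
    # Databricks Notebook activity). Names and defaults are illustrative.
    dbutils.widgets.text("load_type", "delta")   # "full" or "delta"
    dbutils.widgets.text("load_date", "2023-06-01")

    load_type = dbutils.widgets.get("load_type")
    load_date = dbutils.widgets.get("load_date")

    df = spark.read.parquet("/mnt/raw/transactions/")

    if load_type == "delta":
        # Incremental run: keep only the previous business day's records.
        df = df.filter(df.txn_date == load_date)

    (df.write.format("delta")
       .mode("overwrite" if load_type == "full" else "append")
       .saveAsTable("curated.transactions"))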

Environment: Azure Databricks, Azure Data Factory v2, PySpark, Spark SQL, Blob, ADLS, Pentaho, DB2, Oracle, SQL Server, ESP Scheduling, Mesos.

Confidential

Tableau BI Developer

Responsibilities:

  • Interacting with business users and architects to gather business requirements and prepare functional and technical specifications.
  • Analyzing business requirements, design specifications and functional specifications, providing estimates and planning tasks.
  • Designing and developing summary and detail-level dashboards in Tableau using components such as relational data models, data blending, joins and unions, parameters, sets, hierarchy drill-downs, filters, table calculations, calculations, LODs and actions.
  • Creating various charts and maps, applying business rules to build interactive dashboards, scorecards and KPIs for business operations.
  • Developing drill-through dashboards using Tableau actions and object components.
  • Creating incremental refresh schedules to ensure data is up to date in the dashboards.
  • Working on custom SQL to fetch data from different entities and create hybrid dashboards for metrics analysis.
  • Working on metadata management to improve dashboard visualization implementations.
  • Working closely with the engineering team to deploy and publish Tableau dashboards in production environments.
  • Testing new features and implementing them in the dashboards for better visualization and look and feel.
  • Performing unit testing on all dashboards and storyboards during migration from an older Tableau version to a newer one, ensuring no break in existing functionality, documenting all observations and reporting them to SMEs.
  • Performing repository management to maintain multiple versions of Tableau dashboards and merging the latest branch into the master branch to maintain a centralized repository.
  • Monitoring dashboard status to avoid failures and providing production support to ensure all dashboards run correctly and data is loaded correctly for business continuity.
  • Working on POCs for different kinds of dashboards and visualizations, incorporating various charts, demonstrating them to business users and gathering feedback on look and feel.
  • Preparing a navigation user guide for end users detailing the usage of all objects incorporated in the dashboards.

Environment: Tableau BI, Alteryx, IBM DB2, Oracle, Greenplum, Autosys

Confidential

Qlikview BI Developer

Responsibilities:

  • Interacted with business users to gather business requirements and prepare high- and low-level design documents.
  • Expertise in creating physical data models using associations, joins and concatenations for Qlikview dashboards.
  • Expertise in developing and testing complex dashboards using QVDs, and worked on dashboard performance improvements.
  • Created various charts, pivot tables and straight tables per the client's requirements.
  • Implemented incremental loads to populate recent and modified data from the data source.
  • Performed unit and integration testing, validation and sanity checks on the deliverables.
  • Responsible for R&D on new requirements and for writing scripts and functions wherever required.
  • Suggested alternate solutions and workarounds for identified issues and handled multiple end-to-end developments.
  • Involved in the migration from Qlikview 9 to 11.
  • Extracted data from various sources such as DB2, SharePoint, XML, Excel and flat files for data modeling and transformation.
  • Expertise in handling various critical data back-loading requests.
  • Strong understanding of business processes and ability to adapt to new environments quickly.
  • Interacting with clients to understand business processes and requirements and perform gap analysis.
  • Ensured the delivery of daily, weekly and monthly reports.
  • Actively tracked development, enhancement, support, change requests, testing and UAT progress using Quality Center (QC) and JIRA, and maintained the corresponding documents in the repositories.
  • Acted as a mentor and leader for the team to ensure quality deliverables within the allocated timelines.

Environment: Qlikview BI, IBM DB2, Oracle, UNIX, Autosys.

Confidential

Qlikview BI Developer

Responsibilities:

  • Expertise in creating physical data models using associations, joins and concatenations for Qlikview dashboards.
  • Expertise in developing and testing complex dashboards using QVDs, and worked on dashboard performance improvements.
  • Created various charts, pivot tables and straight tables per the client's requirements.
  • Implemented incremental loads to populate recent and modified data from the data source.
  • Performed unit and integration testing, validation and sanity checks on the deliverables.
  • Expertise in handling various critical data back-loading requests.
  • Strong understanding of business processes and ability to adapt to new environments quickly.
  • Interacting with clients to understand business processes and requirements and perform gap analysis.
  • Responsible for front-end and back-end development of the Qlikview application, along with unit testing and defect fixing.
  • Responsible for preparing and delivering design documents, unit test plans, integration test plans and test data for different features of the application.
  • Responsible for releasing the application to the QA team in different cycles and ultimately to the production team.
  • Responsible for interacting with the QA team to discuss issues, their fixes, and updates to the issue-list document.
  • Responsible for interacting with the production team to complete final execution of the system.
  • Responsible for preparing other application documents such as the release document, PTP documents and reverse KT document.

Environment: Qlikview BI, Qlikview Workbench, IBM DB2, SQL Server, UNIX, Quality Center.
