Azure Data Engineer Resume

Columbus, OH

PROFESSIONAL SUMMARY

  • 8+ years of experience in analyzing, designing, and developing Client/Server, Data Warehousing, Data Modeling, and Business Intelligence (BI) stack database applications using MS SQL Server 2017/2016/2014/2012/2008/2005 and SQL Server Integration, Reporting, and Analysis Services (SSIS, SSRS & SSAS).
  • Experience in working on Different Azure services like Azure Data Factory (v1 & v2), Azure Data Lake store (Gen1 & Gen2), Azure Data Lake Analytics, Azure Databricks, Event Hubs, Azure storage accounts (Blob Storage), Logic Apps, Batch Account, Azure Active Directory, Azure Key Vault, Azure Automation.
  • Worked extensively on AWS Cloud services such as S3, EC2, EMR, Athena, Redshift, RDS, and DynamoDB for developing data pipelines and data analysis.
  • Strong experience with T - SQL (DDL & DML) in Implementing & Developing Stored Procedures, Triggers, Nested Queries, Joins, Cursors, Views, User Defined Functions, Indexes, User Profiles, Relational Database Models, Creating & Updating tables and checking the database consistency by executing DBCC Commands.
  • Ability to work in all stages of System Development Life Cycle (SDLC).
  • Wrote U-SQL scripts to retrieve partitioned data from Azure Data Lake Analytics (ADLA).
  • Worked with AWS and GCP, using GCP Cloud Storage, Dataproc, Dataflow, and BigQuery, as well as AWS EMR, S3, Glacier, and EC2 instances with EMR clusters.
  • Set up Databricks environments: managed workspace folder permissions, configured clusters, and transformed data using PySpark, Spark SQL, and Delta Lake.
  • Created Azure Data Factory pipelines to process files from SFTP folders and send out notification emails using Logic Apps.
  • Experience in working on cloud-based technologies like AWS which includes EMR, EC2, S3, Cloud monitoring and different data bases provided by AWS to manage application end to end.
  • Experience in optimizing queries by creating clustered and non-clustered indexes, indexed views, user-defined functions, and Common Table Expressions (CTEs), and in using backup and recovery models in MS SQL Server 2016/2014/2012/2008.
  • Extensive Experience in RDBMS concepts such as Tables, User Defined Data Types, Indexes, Indexed Views, Functions, CTE's, Table Variables and Stored Procedures.
  • Hands-on experience in creating ADF (Azure Data Factory) pipelines for migrating on-premises data from Netezza to Azure SQL Data Warehouse.
  • Developed PowerShell scripts to deploy datasets, linked services, and pipelines to different environments, and created a PowerShell runbook for an Azure Automation account.
  • Experience in Managing Security of SQL Server Databases by creating Database Users, Roles and assigning proper permissions according to the business requirements.
  • Expertise in Performance tuning, Query Optimization and Maintaining data integrity using SQL Profiler and Spotlight.
  • Exposure in designing, developing, and delivering business intelligence solutions using Power BI, SQL Server Integration Services (SSIS), Analysis Services (SSAS), and Reporting Services (SSRS).
  • Exposure in developing different types of reports using Power BI.
  • Experience in analyzing, designing, tuning, and developing business intelligence database applications using MS SQL Server 2008 R2/2012/2014/2016 SSIS, Reporting, and Analysis Services.
  • Extensive experience in data Extraction, Transformation, and Loading (ETL), pulling large volumes of data (VLDB) from various data sources using DTS packages in MS SQL Server 2000 and SQL Server Integration Services (SSIS) in MS SQL Server 2008/2005 with .NET, including import/export of data, BULK INSERT, and BCP.
  • Hands on experience with performing various SSIS data transformation tasks like Lookups, Fuzzy Lookups, Conditional Splits and Event Handlers, Error Handlers etc.
  • Worked extensively on Extraction, Transformation, loading data from Oracle, DB2, Access, Excel, Flat Files and XML using DTS, SSIS.
  • Extensively used Report Wizard, Report Builder and Report Manager for developing and deploying reports in SSRS.
  • Designed and developed many large-scale, batch and real-time big data applications that use Scala, Java, Python, Spark, and other Hadoop ecosystem components.
  • Experience in developing Dashboard, Ad-hoc and Parameterized Reports using SSRS.
  • Experience in configuring and maintaining Report Manager and Report Server for SSRS, and in deploying and scheduling reports in Report Manager.
  • Experience in .NET/C# application development environment to use SQL Server Databases.
  • Good experience in creating OLAP cubes using SQL Server analysis services (SSAS).
  • Excellent technical and analytical skills with a clear understanding of the design goals of ER modeling for OLTP and dimensional modeling for OLAP.
  • Good knowledge of Data Marts, Data warehousing, Operational Data Store (ODS), OLAP, Data Modeling like Dimensional Data Modeling, Star Schema Modeling, Snow-Flake Modeling, FACT and Dimensions Tables using MS Analysis Services.
  • Hands-on experience in developing ETL DTS packages and SSIS packages for integrating data from various sources (Excel, CSV, flat files) over OLE DB connections, using the transformations provided by SSIS and Analysis Services.
  • Experience in handling Master Data Management (MDM) services and setting up the Master Data Services (MDS) to integrate and maintain various entities.
  • Hands-on experience with Visual Studio Online (VSO) and Team Foundation Server (TFS).
  • Hands-on experience with SDL onboarding; created build and release pipelines using Azure DevOps.
  • Skilled in design of logical and physical data modeling using Erwin data modeling tool.
  • Strong background in a disciplined software development life cycle (SDLC) process and has excellent analytical, programming and problem-solving skills.
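
The reusable transformation functions mentioned above can be illustrated with a minimal, hypothetical sketch. In practice this logic would run against Spark DataFrames in Databricks; it is shown here on plain Python rows, and the column names and date format are assumptions for illustration only:

```python
from datetime import datetime

def standardize_row(row):
    """Normalize column names to snake_case, parse ISO dates in
    *_date columns, and drop keys whose values are empty or None."""
    out = {}
    for key, value in row.items():
        if value in (None, ""):
            continue
        name = key.strip().lower().replace(" ", "_")
        if name.endswith("_date") and isinstance(value, str):
            value = datetime.strptime(value, "%Y-%m-%d").date()
        out[name] = value
    return out

rows = [{"Customer ID": "C001", "Load Date": "2021-03-15", "Region": ""}]
clean = [standardize_row(r) for r in rows]
```

In a Databricks notebook the same function would typically be registered in a shared library and applied per-partition, so all pipelines share one cleansing convention.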

TECHNICAL SKILLS:

Big Data Ecosystem: Spark, Kafka, Hive, HBase, Pig, HDFS, MapReduce, Sqoop, Oozie, Tez, Impala, Ambari, Yarn

Cloud Technologies: AWS (Lambda, S3, EMR, RDS, EC2, Athena), Azure (Azure ML, Azure SQL)

Programming Languages: Scala, Python, Java

Operating Systems: Linux, Windows, Centos, Ubuntu, RHEL

SQL Databases: MySQL, Oracle, MS-SQL Server, Teradata

NoSQL DB: HBase, DynamoDB, Bigtable

Web Technologies: Spring, Hibernate, Spring Boot

Tools: Intellij, Eclipse

Scripting Languages: Python, Shell

PROFESSIONAL EXPERIENCE

Azure Data Engineer

Confidential, Columbus, OH

RESPONSIBILITIES:

  • Actively participated in interactions with users, team leads, and technical managers to fully understand the application/system and business requirements.
  • Built new Data Factory pipelines for data ingestion from on-prem FTP servers to Azure Data Lake Store using Azure Data Factory (V2).
  • Created metadata driven Data factory Pipelines to pass the parameters to the pipelines dynamically using pipeline parameters.
  • Designed complex Azure Databricks templates and reusable Python functions.
  • Designed and developed real-time streaming notebooks for data ingestion using Structured Streaming from Event Hubs to Delta tables and Azure Synapse via JDBC push.
  • Developed PySpark scripts, created clusters, used workspaces, ran JAR files, and created jobs in Azure Databricks.
  • Converted existing U-SQL scripts to PySpark.
  • Involved in transforming the data through Azure Data Bricks notebooks and adding them as the activity for data factory workflows.
  • Strong experience leading multiple Azure big data and data transformation implementations in the banking and financial services, high-tech, and utilities industries.
  • Working knowledge of parameterizing PySpark scripts from Data Factory.
  • Transformed Azure Data Lake data using PySpark and Spark SQL.
  • Developed real-time processes using Event Hubs and transformed the data using Scala notebooks.
  • Wrote reusable Python functions and registered them as a library.
  • Managed Azure Databricks workspaces: created mount points using Python and handled cluster configuration and permissions.
  • Good working knowledge of Azure Databricks Delta Lake.
  • Wrote U-SQL scripts for data transformation, making extensive use of table-valued functions to parameterize paths inside the scripts.
  • Worked on developing and using the ABC Framework for Auditing, Logging and Reporting of all the Pipeline executions.
  • Troubleshot failed U-SQL jobs executed through Azure Data Lake Analytics.
  • Used Azure Data Factory for data migration from on-prem systems to Azure SQL Data Warehouse.
  • Created external tables in Azure SQL Data Warehouse to read data from Azure Data Lake Store using PolyBase.
  • Involved in Optimizing Stored Procedures and long running queries using indexing strategies and query-optimization techniques.
  • Developed a reusable pipeline that performs sanity checks on source data files and updates the logs in a SQL table.
  • Created complex Stored Procedures, Triggers, Functions, Indexes, Tables, Views and other T-SQL code and SQL joins for applications following SQL code standards.
  • Enabled auditing and monitoring for all Azure services, created user groups, and sent out email notifications whenever there was a failure.
  • Created external tables in Hive (Ambari) to retrieve data from Azure Data Lake Store using PolyBase.
  • Developed and deployed the code to different environments using Azure DevOps CICD Pipelines.
  • Expertise in using Azure DevOps for code check-in, creating pull request and configuring build and release definitions.
  • Good understanding of the ARM templates used for deploying various Azure resources.
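
As a rough illustration of the ABC (Audit, Balance, Control) framework for auditing, logging, and reporting pipeline executions described above, a minimal in-memory sketch might look like the following. The class and field names are hypothetical; a production version would persist these records to a SQL audit table rather than a Python list:

```python
from datetime import datetime, timezone

class PipelineAudit:
    """Minimal sketch of an ABC (Audit, Balance, Control) log:
    one record per pipeline run, updated on completion, with a
    row-count reconciliation check (the 'Balance' step)."""

    def __init__(self):
        self.runs = []

    def start(self, pipeline, run_id):
        record = {
            "pipeline": pipeline,
            "run_id": run_id,
            "started": datetime.now(timezone.utc),
            "status": "Running",
            "rows_read": 0,
            "rows_written": 0,
        }
        self.runs.append(record)
        return record

    def finish(self, record, rows_read, rows_written):
        # Balance check: read and written counts must reconcile.
        record["status"] = (
            "Succeeded" if rows_read == rows_written else "RowCountMismatch"
        )
        record["rows_read"] = rows_read
        record["rows_written"] = rows_written

audit = PipelineAudit()
run = audit.start("CopySftpToLake", "run-001")
audit.finish(run, rows_read=1000, rows_written=1000)
```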

ENVIRONMENT: SQL Server 2014/2016/2017/2019, Azure SQL Data Warehouse, Azure Data Factory v2, Azure Data Lake Store, Azure Data Lake Analytics, Azure Databricks, Event Hubs, Azure Storage account, Azure Automation, Logic Apps, PowerShell, Azure DevOps, Python, Power BI, MS PowerPoint, MS Project, C#, Visual Studio 2017/2015/2012.

AWS Data Engineer

Confidential, Weehawken, NJ

RESPONSIBILITIES:

  • Understood user requirements and the data model defined and developed by the business analysts and data architects.
  • Involved in migrating the existing Teradata data warehouse to AWS S3-based data lakes.
  • Exposure in developing different types of reports using Power BI.
  • Implemented database management techniques that include backup/restore, import/export and generated scripts.
  • Involved in Query Optimization, Performance Tuning and Rebuilding the Indexes at regular intervals for better performance.
  • Created database objects in SQL Server - tables, indexes, views, user-defined functions, cursors, triggers, stored procedures, and constraints.
  • Developed complex programs in T-SQL, writing Stored Procedures, Triggers, Functions and Queries with best execution plan.
  • Created databases and schema objects including tables, indexes and applied constraints, connected various applications to the database and written functions, stored procedures and triggers.
  • Worked extensively on migrating on-prem workloads to the AWS Cloud.
  • Utilized AWS cloud services such as S3, EMR, Redshift, Athena, and the Glue Metastore.
  • Used broadcast variables in Spark, effective and efficient joins, caching, and other capabilities for data processing.
  • Managed indexes, statistics and tuned queries by using execution plan for optimizing the performance of the databases.
  • Designed and developed end-to-end infrastructure on AWS and data transformation scripts in PySpark to integrate our platform with an enterprise streaming data platform.
  • Maintained the physical database by monitoring and optimizing performance, data integrity and SQL queries for maximum efficiency using SQL Profiler.
  • Exposure in developing Dashboards using Power BI for analyzing MS Product Key Activations, MS Product Key Blocks & MS Product Key distribution shared across different partners.
  • Hands on Experience in deploying the resources into the cloud with ARM Templates.
  • Configured Fortify and cred scan for visual studio solutions in branch using Build definition.
  • Created Entities in MDM and updated the data programmatically. Created ADF Pipelines to Read and write Data from MDM.
  • Created PowerShell scripts to deploy the datasets, pipelines, and linked services, and to clean up the datasets and run the pipelines on demand.
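
The S3-based data lake work above typically relies on Hive-style partition layouts so that Athena and the Glue Metastore can prune partitions on date filters. A minimal sketch of such a key builder (the bucket and table names are made up for illustration):

```python
from datetime import date

def partition_key(prefix, table, d):
    """Build a Hive-style partition key (year=/month=/day=) so
    Athena/Glue can prune partitions when queries filter on date."""
    return (f"{prefix}/{table}/"
            f"year={d.year:04d}/month={d.month:02d}/day={d.day:02d}/")

key = partition_key("s3://data-lake-raw", "orders", date(2020, 7, 4))
```

Zero-padding the month and day keeps partition values lexicographically sortable, which matters for range filters on string-typed partition columns.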

ENVIRONMENT: AWS EMR, Spark, Hive, HDFS, Sqoop, Kafka, Oozie, HBase, Scala, MapReduce.

MSBI/Azure Data Engineer

Confidential, Charlotte, NC

RESPONSIBILITIES:

  • Assist data modelers in designing of dimensional model for Star schema. Involved in database Schema design and development.
  • Worked on different ADF components like Pipelines, Datasets and Linked services. In depth knowledge of pipeline monitoring and scheduling.
  • Expertise in optimizing long-running T-SQL lookup queries to reduce lookup cache size, managing clustered and non-clustered indexes and stored procedures for incremental loads, and handling long data loads by dividing them into batch tables with an optimum batch size using the execution plan.
  • Performed dimensional modeling of the data from scratch: extracted data from heterogeneous sources, performed ETL for incremental data loads using SSIS, and built dimension and fact tables using a star schema.
  • Experience in using T-SQL for creating stored procedures, indexes, cursors, and functions.
  • Created different pipeline activities such as the copy, U-SQL, and stored procedure activities; scheduled pipelines and ran them on demand using the Onetime setting.
  • Generated server-side T-SQL scripts for data manipulation and validation and created various snapshots and materialized views for remote instances.
  • Responsible for building scalable distributed data solutions using Big Data technologies like Apache Hadoop, MapReduce, Shell Scripting, Hive.
  • Creating tables, indexes and designing constraints and wrote T-SQL statements for retrieval of data and involved in performance tuning of T-SQL Queries and Stored Procedures.
  • Executed stored procedures using batch files. Developed stored procedures and functions for designing various reports.
  • Exposure to Excel reports and Power BI, connecting to various SQL Server databases and improving the efficiency of these large reports.
  • Wrote U-SQL scripts based on requirements, with knowledge of testing and monitoring jobs through Azure Data Lake Analytics.
  • Created PowerShell scripts to deploy the datasets, pipelines, and linked services, and to clean up the datasets and run the pipelines on demand.
  • Exposure in developing interactive Dashboards using Power BI for analytics Team.
  • Increased the performance necessary for statistical reporting by 25% after performance monitoring, tuning and optimizing indexes. Used Batch Files to Import Data from other Data Base.
  • Built new MDS (Master data services) and data warehouse application for business users and Migrated data from old data warehouse to new data warehouse. Developed and maintained Crystal reports as per the user requirement.
  • Served in the on-call support rotation, which requires the on-call person to fix job failures immediately, even outside working hours.
  • Extensive hands-on experience with SSIS, building large packages that source data from tab-delimited text files, DB2, XML, Excel files, and flat files, setting up variables and parameters, and mapping the resulting data sets to a mapped network drive using merge scripts.
  • Worked extensively on integrations involving Microsoft Visual C# for use in SSIS script components.
  • Involved in creating measure groups using different fact tables and created calculated members, Advanced KPI's and named set in SSAS.
  • Enhancing and deploying the SSIS Packages from development server to production server.
  • Wrote ETL scripts to load data into the database from various source files.
  • Created and managed Azure SQL databases.
  • Set up connection strings and connected to Azure SQL databases from locally installed SQL Server Management Studio (SSMS) for developers.
  • Experienced in error handling and troubleshooting scripts that failed to load data into the database.
  • Developed DTS packages to copy tables, schemas and views and to extract data from Excel and Oracle using SSIS.
  • Created packages in SSIS with custom error handling and worked with different methods of logging in SSIS.
  • Expert in SSIS deployment and scheduling.
  • Identified and defined Key Performance Indicators in SSAS 2008/2012.
  • Processed SSAS cubes to store data in OLAP databases.
  • Defined report layouts for formatting the report design as per the need.
  • Used Visual Studio Online (VSO) for code check-in and deployment.
  • Developed Sub Reports, Matrix Reports, Charts, and Drill down reports, using SQL Server Reporting Services (SSRS).
  • Generated Reports using Global Variables, Expressions and Functions for the reports.
  • Created reports and Dash boards using Microsoft Performance Point service.
  • Worked on querying data and creating on-demand reports using Report Builder in SSRS reports and send the reports via email.
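
The batch-table strategy for long data loads mentioned above can be sketched as a simple range splitter. The ID-based batching scheme here is an assumed, illustrative variant; each (start, end) pair would drive one smaller transaction instead of a single massive load:

```python
def batch_ranges(min_id, max_id, batch_size):
    """Split an ID range into contiguous (start, end) batches so a
    large incremental load can run as several smaller transactions,
    keeping the transaction log and lock footprint small."""
    batches = []
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        batches.append((start, end))
        start = end + 1
    return batches

ranges = batch_ranges(1, 2500, 1000)
```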

ENVIRONMENT: SQL Server 2008/2012/2014/2016 Enterprise Edition, Netezza, Power BI, SSIS, SSRS, Azure Data Factory, Azure Data Lake Store, Azure Data Lake Analytics, Azure Storage account, Azure Automation, power shell, VSO, Power BI, MS PowerPoint, MS Project, C#, VB.Net, XML, Visual Studio 2017/2015/2012/2008, VSTS.

Data Engineer

Confidential

RESPONSIBILITIES:

  • Made and maintained weekly TRP data files for all DD channels.
  • Analyzed the causes of viewership drops due to time-band changes and competitive market activity, and recommended corrective action to increase viewership.
  • Performed data analysis and visualization using Python libraries (Pandas, Matplotlib, Scikit-learn, NumPy).
  • Identified priority markets for channels and found target audiences through marketing mailer creation and FPC design.
  • Analyzed feedback and provided recommendations on content and program schedules based on BARC viewership data and research into quantitative and qualitative data.
  • Created reporting tables for comparing source and target data and report data discrepancies (mismatch, missing scenarios) found in the data.
  • Responsible for reporting of findings that will use gathered metrics to infer and draw logical conclusions from past and future behavior.
  • Writing Unix Scripts to automate Data collections and processing.
  • Developed reports and dashboards for the CMO & Director of Marketing (using Tableau, Excel and SQL) that measured the effectiveness of inbound marketing campaigns.
  • Followed up on a weekly basis with individual DD broadcasting stations to collect TRP data files.
  • Analyzed the collected TRP data and provided solutions to queries for the betterment of each channel.
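
The viewership-drop analysis described above can be sketched with a simple trailing-average rule. The 10% threshold and the sample TRP values below are illustrative assumptions, not actual BARC data:

```python
from statistics import mean

def flag_drops(weekly_trp, threshold=0.10):
    """Return the indices of weeks where TRP fell more than
    `threshold` (as a fraction) below the trailing average of
    all preceding weeks."""
    flagged = []
    for i in range(1, len(weekly_trp)):
        baseline = mean(weekly_trp[:i])
        if weekly_trp[i] < baseline * (1 - threshold):
            flagged.append(i)
    return flagged

# Week 3 (index 3) drops from a ~4.1 baseline to 3.2: flagged.
drops = flag_drops([4.0, 4.1, 4.2, 3.2, 4.0])
```

A real workflow would group by channel and time band before flagging, so a schedule change in one slot does not mask a drop in another.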

ENVIRONMENT: SQL, Python (Pandas, Matplotlib, Scikit-learn, NumPy), UNIX Shell Scripting, Tableau, Excel.

SQL Server Developer

Confidential

RESPONSIBILITIES:

  • Requirement gathering, Functional & technical specifications for end user and end client applications, Re-Engineering and capacity planning.
  • Analysis, Design and data modeling of Logical and Physical database using E-R diagrams.
  • Document internal and external data sources and data integrations and provide operating support or data integration solutions.
  • Created ETL packages with different Heterogeneous data sources (SQL Server, Flat Files, CSV, Excel source files, and XML files) and then loaded the data into destination tables by performing different kinds of transformations using SSIS/DTS packages.
  • Involved in database Schema design and development.
  • Designed tables, constraints, necessary stored procedures, functions, triggers and packages using T-SQL.
  • Audited the Front-end application calls to SQL Server.
  • Created efficient Queries, Indexes, Views and Functions.
  • Designed ETL package, with ETL Import/Export wizard for transferring data.
  • Enforced Security requirements using triggers to prevent unauthorized access to the database.
  • Optimized the database for best performance using DBCC commands and table partitioning.
  • Designed and developed the payroll module using MS SQL Server 2005.
  • Migrated data from MS Access to SQL Server.
  • Performance Tuning of the client databases for better performance on a regular basis.
  • Responsibilities include writing of scripts for Database tasks, releasing Database objects into production.
  • Running DBCC consistency checks, fixing data corruption on application Databases.
  • Experienced working with star and snowflake schemas; used the fact and dimension tables to build cubes, performed processing, and deployed them to the SSAS database.
  • Created complex SSAS cubes with multiple fact measures groups, and multiple dimension hierarchies based on the OLAP reporting needs.
  • Improved existing reports performance by using execution log2 and optimized the underlying procedure and queries.
  • Worked in establishing connectivity between VB applications and MS SQL Server by using ADO.
  • Supporting ADO and Active-X connectivity to the database.
  • Involved in Report generation using SSRS and Excel services, Power Pivot and deployed them on SharePoint Server.
  • Generated weekly and monthly reports to an online management interface using SSRS.
  • Responsible for post-production support.
  • Understanding the OLAP processing for changing and maintaining the data warehouse.
  • Resolved critical issues in given timeframe.
  • Monitored jobs and created several critical reports.
  • Performed code reviews and unit testing.
  • Held status meetings with the client, conducted reviews of deliverables, and performed defect tracking and analysis.
  • Work with business stakeholders, application developer, and production teams and across functional units to identify business needs and discuss solution options.
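
Bulk loads of the kind scripted in this role were commonly built around the bcp utility. A minimal sketch of assembling such a command line (the server, database, and file names are hypothetical, and flag choices are one illustrative configuration):

```python
def bcp_import_command(table, data_file, server, field_terminator=","):
    """Assemble a bcp bulk-import command as an argument list:
    'in' imports from file, -S names the server, -T uses a trusted
    (Windows) connection, -c uses character mode, -t sets the
    field terminator. Illustrative sketch, not an official wrapper."""
    return ["bcp", table, "in", data_file,
            "-S", server, "-T", "-c", "-t", field_terminator]

cmd = bcp_import_command("Payroll.dbo.Employee", "employee.csv", "SQLPROD01")
```

The list form can be passed straight to `subprocess.run`, avoiding shell-quoting issues with file paths.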

ENVIRONMENT: SQL Server 2005/2012, Oracle 11g, T-SQL, MS Integration, Analysis and Reporting Services, Management Studio, Query Analyzer, Windows 2007 Server, C#, VB, MS Access, Windows 2005 Standard Edition.
