Azure Data Engineer Resume
SUMMARY
- Experienced in transforming heterogeneous data into vital information to meet client/customer needs using big data technologies, Apache Spark and its ecosystem, Python, and SQL.
- Implementation of Azure cloud components: Azure Data Factory, Azure Data Lake Analytics, Azure Data Lake, Azure Databricks, Azure Synapse Analytics, Azure Data Lake Store, Azure SQL DB/DW, Azure Cosmos DB, Power BI, U-SQL, and T-SQL.
- Experience submitting Spark jobs to HDInsight clusters; working knowledge of Ambari and YARN.
- Strong experience migrating databases to Snowflake.
- Expertise in creating Azure Blob storage and SQL DB and launching Windows and Linux virtual machines in Azure.
- Experience in streaming analytics: Spark Streaming, DStreams, and Databricks Delta.
- Experience with Databricks MLflow for running machine learning models on distributed platforms.
- Expertise in creating VMs from disks and basic knowledge of Logic Apps.
- Understanding of Snowflake cloud technology.
- Experience with the Snowflake cloud data warehouse for integrating data from multiple source systems, including loading nested JSON-formatted data into Snowflake tables.
- Automated tasks using Spark clusters in ADF; basic knowledge of exporting data to Power BI to generate reports and dashboards that visualize recently processed data, and of deploying reports.
- Knowledge of API-led connectivity and understanding of RESTful web services integration.
- Good knowledge of data warehousing concepts and data/dimensional modeling, including star/snowflake schemas, dimensions, and fact tables.
- Hands-on experience creating dynamic ADF pipelines for full and incremental loads.
- Very good hands-on experience with Spark Core, Spark SQL, and Spark Streaming.
- In-depth knowledge of PySpark and experience building Spark applications in Python.
- Solid understanding of RDD operations in Apache Spark, i.e., transformations, actions, and persistence (caching); a short illustrative sketch follows this summary.
- Good knowledge of interactive notebooks such as Jupyter and Databricks.
- Experienced with version control and source code management tools such as Git.
- Experienced with IDEs such as PyCharm.
- Hands-on experience setting up Azure Data Factory and creating ingestion pipelines to pull data into Azure Data Lake Store and Azure Blob Storage.
- Experience in migration of on-premises databases to the Microsoft Azure environment (Blobs, Azure Data Warehouse, Azure SQL Server, PowerShell Azure components, SSIS Azure components).
- Proficient in managing Azure Data Lake Storage (ADLS) and Data Lake Analytics, with an understanding of how to integrate them with other Azure services.
- Expert in database concepts, including normalization, indexing, physical and logical modeling, creation of SQL queries, and performance tuning.
- Experienced with SSIS performance tuning on control flow, data flow, error handling, and event handlers, and re-running failed SSIS packages.
- Adept at utilizing BI tools such as Power BI and QlikView for enhancing reporting capabilities and developing BI applications in accordance with client requirements.
- Demonstrated capability to liaise with key stakeholders for delivering compelling business value to senior leadership and clients.
- Extensive experience developing complex stored procedures, functions, triggers, views, cursors, indexes, CTEs, joins, and subqueries with T-SQL.
- Strong development background in creating pipelines, data flows, and complex data transformations and manipulations using ADF with Databricks.
- Experience in on-premises-to-cloud implementation projects using ADF and Python scripting for data extraction from relational databases and files.
- Highly proficient in T-SQL for developing complex stored procedures, triggers, tables, user functions, user profiles, relational database models, data integrity, SQL joins, and query writing.
- Good knowledge of NoSQL databases such as MongoDB (MongoDB University certified), DynamoDB, and Cosmos DB.
- Proficient in logical and physical database design and development (using Erwin, normalization, dimensional modeling, and SQL Server Enterprise Manager).
- Excellent knowledge of designing and developing data warehouses, data marts, and business intelligence solutions using multidimensional models for SSAS cubes.
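A minimal sketch of the RDD concepts noted above (lazy transformations, actions, and caching), assuming a local Spark installation; the dataset and numbers are made up for illustration:

```python
from pyspark import SparkContext

sc = SparkContext(appName="rdd-basics")

rdd = sc.parallelize(range(1, 1001))

# Transformations are lazy: nothing executes yet.
evens = rdd.filter(lambda x: x % 2 == 0)
squared = evens.map(lambda x: x * x)

# Persistence: cache the computed partitions for reuse across actions.
squared.cache()

# Actions trigger the actual computation.
print(squared.count())   # 500
print(squared.take(3))   # [4, 16, 36]

sc.stop()
```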
TECHNICAL SKILLS
Operating Systems: Windows 7/10, Unix
Data Warehousing: Snowflake, Redshift, Teradata
Databases: MS SQL Server, Oracle 12c, MySQL, MS Access, DB2, Netezza, MongoDB (NoSQL), Azure SQL Data Warehouse
Cloud Technologies: Microsoft Azure, Snowflake, SnowSQL
ETL Tools: SQL Server Integration Services (SSIS), IBM DataStage, Azure Data Factory
Database Tools: SQL Profiler, Management Studio, Index Analyzer, SQL Agents, SQL Alerts, Visual SourceSafe, Microsoft SQL Server CDC, IBM CDC, MapReduce
Languages: R, T-SQL, Visual Basic 6.0, C, C++, C#, Java, HTML, PL/SQL, VBA, Python, Hadoop, Spark
Reporting Tools: SQL Server Reporting Services (SSRS), Tableau, MS Excel
BI Modeling Tools: Erwin, Embarcadero
Frameworks: Docker
PROFESSIONAL EXPERIENCE
Confidential
Azure Data Engineer
Responsibilities:
- Worked on the design and development of complex applications using various technologies.
- Built and developed servers through Azure Resource Manager Templates or Azure Portal.
- Scheduled, deployed, and managed container replicas onto a node cluster using Kubernetes.
- Built and managed Docker container clusters managed by Kubernetes, Linux, Bash, and Git on GCP (Google Cloud Platform). Utilized Kubernetes and Docker for the runtime environment of the CI/CD system to build, test, and deploy.
- Created ADF pipelines to move on-prem metric data to Azure DWH to support real-time analytics.
- Migrated an On-premises virtual machine to Azure Resource Manager Subscription with Azure Site Recovery.
- Worked with Microsoft Azure Cloud Services (PaaS & IaaS): Storage, Web Apps, Active Directory, Application Insights, Document DB, Azure Stream Analytics, Visual Studio Online (VSO), and SQL Azure.
- Experience managing Azure Data Lake Storage (ADLS) and Data Lake Analytics, with an understanding of how to integrate with other Azure services; knowledge of U-SQL and how it can be used for data transformation as part of a cloud data integration strategy.
- Worked on Microsoft internal tools such as Cosmos, Kusto, and iScope, which are known for performing ETL operations efficiently.
- Built Azure Data Factory pipelines and data flows for the extraction, transformation, and loading of data from a wide variety of on-premises and cloud data sources.
- Worked on Microsoft Azure Storage: storage accounts, Blob storage, and managed and unmanaged storage.
- Created DataFrames and Datasets for data transformation using the PySpark and Spark SQL APIs in Databricks.
- Worked on various Azure services like Compute (Web Roles, Worker Roles), Azure Websites, Caching, SQL Azure, NoSQL, Storage, Network services, Azure Active Directory, API Management, Scheduling, Auto Scaling, and PowerShell Automation.
- Experience with Power BI in converting data processing into analytics and reports that provide real-time insights.
- Recreated existing application logic and functionality in the Azure Data Lake, Data Factory, SQL Database, and SQL Data Warehouse environment; DWH/BI project implementation experience using Azure Data Factory.
- Used Git repositories and Azure DevOps for deploying pipelines across multiple environments.
- Experience with Azure Microservices, Azure Functions, and Azure solutions.
- Experience working on big data with Azure, connecting HDInsight to Azure and working with big data technologies.
- Created Databricks notebooks and wrote PySpark and Spark SQL scripts in Databricks for data transformation, loading data into data lakes and Blob storage (a short sketch follows this list).
- Developed a web service using Windows Communication Foundation and .NET to receive and process XML files, deployed as a Cloud Service on Microsoft Azure.
- Experience with IIS, IIS roles, SQL, and Active Directory.
- Hands on experience on Site-to-site VPNs, Virtual Networks, Network Security Groups, Load balancers, Storage Accounts.
- Used Kusto Explorer for log analytics and better query response.
- Good knowledge of Operation Management Technologies - Log Aggregation, Server Monitoring, Process Monitoring, Application Monitoring - Splunk, Nagios, New Relic, Logstash, Kibana.
- Hands on experience on Azure VPN-Point to Site, Virtual networks, Azure Custom security, end point security and firewall.
- Designed and developed a new solution to process near-real-time (NRT) data using Azure Stream Analytics, Azure Event Hub, and Service Bus Queue.
- Experience designing and developing Azure Stream Analytics jobs to process real-time data using Azure Event Hubs, Azure IoT Hub, and Service Bus queues.
- Experience in CI/CD using Jenkins for application processes and in production support.
- Was involved in deploying API Management and Application Server resources.
- Experience maintaining high availability production systems in a Public/Private Cloud environment.
- Worked with DevOps practices using AWS, Elastic Beanstalk, and Docker with Kubernetes.
- Worked hands-on with Azure MFA (Multi-Factor Authentication) servers and phone factors for two-step security.
- Collaborated on the development of the main web application that provides invoice issuance services.
- Responsible for web application deployments over cloud services (web and worker roles) on Azure, using Visual Studio and PowerShell.
- Involved in creating Azure services with Azure Virtual Machines and developing Azure solutions and services such as IaaS and PaaS.
- Working experience with TFS/VSTS, Jenkins, Git, Jira, Ansible, Docker, ELK, Nexus, and SonarQube.
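A short sketch of the Databricks notebook pattern referenced above, combining PySpark and Spark SQL; the mount points, column names, and aggregation are illustrative assumptions, not the actual project code:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("metrics-transform").getOrCreate()

# Read raw metric JSON from Blob storage (assumed mount point).
metrics = spark.read.json("/mnt/blob/metrics/raw/")

# Register a temp view so the same data can be shaped with Spark SQL.
metrics.createOrReplaceTempView("metrics")
daily = spark.sql("""
    SELECT metric_name,
           to_date(event_time) AS event_date,
           avg(value)          AS avg_value
    FROM metrics
    GROUP BY metric_name, to_date(event_time)
""")

# Persist the aggregate into the data lake as Parquet for downstream analytics.
daily.write.mode("overwrite").parquet("/mnt/adls/metrics/daily/")
```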
Environment: Azure Data Factory, Spark (Python/Scala), Hive, Jenkins, Kafka, Spark Streaming, Docker Containers, PostgreSQL, RabbitMQ, Celery, Flask, ELK Stack, MS Azure, Azure SQL Database, Azure Function Apps, Azure Data Lake, Blob Storage, SQL Server
Confidential, IL
Sr. Azure Data Engineer
Responsibilities:
- Design and implement database solutions in Azure SQL Data Warehouse and Azure SQL; migrate data from traditional database systems to Azure databases.
- Design and implement migration strategies for traditional systems on Azure (lift-and-shift, Azure Migrate, and other third-party tools).
- Experience in DWH/BI project implementation using Azure Data Factory.
- Interact with Business Analysts, users, and SMEs to elaborate requirements.
- Design and implement end-to-end data solutions (storage, integration, processing, visualization) in Azure.
- Propose architectures considering cost/spend in Azure and develop recommendations to right-size data infrastructure.
- Set up and maintain Azure SQL Database, Azure Analysis Services, Azure SQL Data Warehouse, and Azure Data Factory.
- Develop conceptual solutions & create proofs-of-concept to demonstrate viability of solutions.
- Implement Copy activities and custom Azure Data Factory pipeline activities.
- Primarily involved in data migration using SQL, SQL Azure, Azure Storage, Azure Data Factory, SSIS, and PowerShell.
- Create C# applications to load data from Azure Blob storage and from web APIs into Azure SQL, and schedule WebJobs for daily loads (a hedged Python analogue follows this list).
- Recreate existing application logic and functionality in the Azure Data Lake, Data Factory, SQL Database, and SQL Data Warehouse environment; DWH/BI project implementation experience using Azure Data Factory and Databricks.
- Architect, design, and validate the Azure Infrastructure-as-a-Service (IaaS) environment.
- Develop dashboards and visualizations to help business users analyze data as well as providing data insight to upper management with a focus on Microsoft products like SQL Server Reporting Services (SSRS) and Power BI.
- Responsible for creating Requirements Documentation for various projects.
- Strong analytical skills, proven ability to work well in a multi-disciplined team environment, and adept at learning new tools and processes with ease.
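The load noted above was built in C#; below is a hedged Python analogue of the same web-API-to-Azure-SQL pattern. The endpoint, table, columns, and connection string are illustrative assumptions:

```python
import requests
import pyodbc

API_URL = "https://api.example.com/v1/invoices"  # assumed endpoint
CONN_STR = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:myserver.database.windows.net,1433;"  # assumed server
    "Database=mydb;Uid=loader;Pwd=...;Encrypt=yes;"
)

def load_api_to_sql() -> None:
    # Pull rows from the REST API; expect a JSON array of records.
    rows = requests.get(API_URL, timeout=30).json()
    with pyodbc.connect(CONN_STR) as conn:
        cur = conn.cursor()
        cur.fast_executemany = True  # batch the inserts for speed
        cur.executemany(
            "INSERT INTO dbo.Invoices (Id, Amount, IssuedOn) VALUES (?, ?, ?)",
            [(r["id"], r["amount"], r["issuedOn"]) for r in rows],
        )
        conn.commit()

if __name__ == "__main__":
    load_api_to_sql()
```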
Environment: Azure SQL, Azure Storage Explorer, Azure Storage, Azure Blob Storage, Azure Backup, Azure Files, Azure Data Lake Storage, SQL Server Management Studio 2016, Visual Studio 2015, VSTS, Azure Blob, Power BI, PowerShell, C# .NET, SSIS, DataGrid, ETL (Extract, Transform, Load), Business Intelligence (BI).
Confidential, Westchester, IL
Azure Data Engineer
Responsibilities:
- Built complex ETL jobs that transform data visually with data flows or by using compute services such as Azure Databricks and Azure SQL Database.
- Developed and maintained various data ingestion pipelines per the design architecture and processes: source to landing, landing to curated, and curated to processed.
- Used various activity types (data movement, transformation, and control activities): Copy Data, Data Flow, Get Metadata, Lookup, Stored Procedure, and Execute Pipeline.
- Worked with Data Lake Storage Gen1 and Gen2 for hosting CSV, JSON, and Parquet files and managed access across storage accounts.
- Wrote Databricks notebooks for handling large volumes of data, transformations, and computations.
- Worked with various file formats: flat files (TXT, CSV), Parquet, and other compressed formats.
- Built a Delta Lake for the curated layer, keeping high-quality data available for downstream teams (data science, finance, etc.).
- Used Cosmos DB for storing catalog data and for event sourcing in order processing pipelines.
- Designed and developed user-defined functions, stored procedures, and triggers for Cosmos DB.
- Wrote PySpark and Spark SQL transformations in Azure Databricks to perform complex transformations for business-rule implementation.
- Unit tested the data between Redshift and Snowflake.
- Developed a data warehouse model in Snowflake for over 100 datasets using WhereScape.
- Created reports in Looker based on Snowflake connections.
- Utilized Azure's ETL service, Azure Data Factory (ADF), to ingest data from disparate legacy data stores (SAP HANA, SFTP servers, and Cloudera Hadoop HDFS) into Azure Data Lake Storage Gen2.
- Automated data flows using Logic Apps and Power Automate (Flow), which connect different Azure services, and used Function Apps for customizations.
- Worked on handling/automating data pipeline failures using Logic Apps and configuring SQL logic.
- Optimized all tables and created data models based on business logic.
- Automated data validation for all tables by creating dynamic stored procedures and views.
- Leveraged Azure cloud resources such as Azure Data Lake Storage Gen2, Azure Data Factory, StreamSets (SDC), and Azure SQL Data Warehouse to build and operate a centralized, cross-functional data analytics platform.
- Responsible for managing data coming from different sources and loading structured and unstructured data.
- Worked with different file types (CSV, JSON, flat, fixed-width) to load data from source to raw tables.
- Used Python for data cleaning and preparation on structured and unstructured datasets.
- Created ETL pipelines in Python and PySpark to load data into Hive tables under Databricks (a minimal sketch follows this list).
- Created automation and deployment templates for relational and NoSQL databases, including MSSQL and Cosmos DB, in Azure using Python.
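A minimal PySpark sketch of the curated-layer Delta Lake load described above, assuming a Databricks workspace; the mount paths, key column, and business rule are illustrative assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("curated-load").getOrCreate()

# Landing zone: raw CSV files dropped by the ingestion pipeline (assumed path).
raw = (spark.read
       .option("header", "true")
       .option("inferSchema", "true")
       .csv("/mnt/landing/sales/"))

# Basic cleanup and an assumed business rule before curation.
curated = (raw
           .dropDuplicates(["order_id"])                     # assumed key column
           .withColumn("order_date", F.to_date("order_date"))
           .filter(F.col("amount") > 0))

# Write to the curated layer as a Delta table so downstream teams get
# ACID guarantees and time travel.
(curated.write
 .format("delta")
 .mode("overwrite")
 .save("/mnt/curated/sales/"))
```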
Environment: MS SQL Server 2018, Azure Data Factory v2, Azure Data Lake Storage, Azure Databricks, Snowflake.
Confidential
Azure Data Engineer
Responsibilities:
- Involved in Daily Stand-Up Meetings, Sprint Planning and Backlog Grooming for Agile Scrum Process.
- Respond to local area network (LAN) and wide area network (WAN) user requests for system upgrades and changes.
- Perform software installations and upgrades to operating systems and layered software packages.
- Monitor communications performance using visual and diagnostic equipment, status-indicator checks, and similar methods to locate problems.
- Manage the lifecycle for technical support documentation including Standard Operating Procedures and work instructions.
- Schedule installations and upgrades and maintain them in accordance with established IT policies and procedures.
- Use network management tools such as the Cisco Identity Services Engine (ISE) and Cisco Prime to identify, troubleshoot, and resolve wired and wireless networking issues on LAN equipment and/or PCs and other end-user devices.
- Track and report issues in our ticketing system.
- Participate in statewide projects to plan and implement system operation, optimization, enhancements, and upgrades.
- Set up, configure, and troubleshoot AV collaboration systems and software used by staff.
- Ensure data/media recoverability by implementing a schedule of system backups and database archive operations. Implement and promote standard operating procedures.
- Administer data backup systems at remote offices to ensure data availability, security, and recoverability. Quickly restore data accidentally deleted by customers.
- Knowledge of managing security and metrics of the data presented using the admin portal.
- Strong knowledge of Active Directory (building & maintaining group policies, administrative templates, sites, and services, etc.).
- Experience troubleshooting handheld devices, including Android and iPhone.
- Experience administering Microsoft SQL Server.
- Responsible for installing or upgrading Windows-based systems and servers, managing user access to the servers through RBAC identity systems, and maintaining the security and stability of the Windows servers.
- Perform routine system maintenance and resolve server-side issues as they arise.
- Perform OS and application upgrades and server migrations to the cloud repository, working both independently with little supervision and in a team environment.
- Assist external customers with the development and troubleshooting of IT and software systems; hands-on administration and operation of scalable IT infrastructures.
- Experience with configuration and deployment of common Linux web and application servers (RabbitMQ, Postfix, Apache, Nginx, MySQL).
- Maintain and Support Windows servers and applications.
- Experience with IIS.
- Incident management: logging tickets, triaging tickets, and identifying the issue; if it is a bug in the system, log a defect for development, and if it is a business-process issue, call the business stakeholder and explain the issue.
- Daily checks: check daily logs to see if there are any errors (a small illustrative script follows this list).
- Performance Monitoring: Monitor systems for performance (monitoring, recording, and alerting Memory/CPU usage patterns).
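A hedged sketch of the daily log check mentioned above; the log directory, file naming, and "ERROR" message format are assumptions, not the actual environment:

```python
import datetime
import pathlib

LOG_DIR = pathlib.Path("/var/log/app")  # assumed log location

def check_daily_logs() -> int:
    # Scan the previous day's log files and report any ERROR lines.
    yesterday = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()
    errors = 0
    for log_file in LOG_DIR.glob(f"*{yesterday}*.log"):
        for line in log_file.read_text(errors="replace").splitlines():
            if "ERROR" in line:
                errors += 1
                print(f"{log_file.name}: {line}")
    return errors

if __name__ == "__main__":
    count = check_daily_logs()
    print(f"{count} error line(s) found")
```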
Environment: SQL Server, Windows Server 2008 R2/2000 (64-bit), Unix, Linux, Shell Script
Confidential
SQL BI Developer
Responsibilities:
- Analyzed the requirements and selected or created the appropriate fact tables.
- Migrated DTS packages from SQL Server 2008 to SQL Server 2012 as SSIS Packages.
- Created static reports using MDX queries and created MDX-based report definitions.
- Experience building and maintaining data marts; created roles for using the cubes.
- Wrote DTS packages to transfer data from heterogeneous databases and different file formats (text files, spreadsheets) to SQL Server.
- Created tables for data warehousing applications and populated the tables from the OLTP database using DTS packages.
- Wrote Triggers and Stored Procedures to capture updated and deleted data from OLTP systems.
- Excellent report creation skills using Microsoft SQL Server 2014 Reporting Services (SSRS) with proficiency in using Report Designer as well as Report Builder.
- Created SQL Server reports based on the requirements using SSRS 2012/2014.
- Involved in analysis of Report design requirements and actively participated and interacted with Team Lead, Technical Manager and Lead Business Analyst to understand the Business requirements.
- Extensively involved in designing SSIS packages to export data from flat-file sources to the SQL Server database (a hedged Python analogue follows this list).
- Experience creating XML files by designing SSIS packages and using SQL Server.
- Developed reports and deployed them on server using SQL Server Reporting Services (SSRS).
- Troubleshot performance issues and fine-tuned queries and stored procedures.
- Developed Complex Stored Procedures, Views, and Temporary Tables as per the requirement.
- Used SSRS Reports to write complex formulas and to query the database to generate different types of ad-hoc reports for Business Intelligence.
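The flat-file loads above were implemented in SSIS; below is a hedged Python analogue of the same pattern. The file layout, staging table, and connection details are illustrative assumptions:

```python
import csv
import pyodbc

CONN_STR = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=localhost;Database=StagingDB;Trusted_Connection=yes;"  # assumed
)

def load_flat_file(path: str) -> None:
    # Read the flat file and bulk-insert rows into an assumed staging table,
    # roughly mirroring an SSIS flat-file source -> OLE DB destination flow.
    with open(path, newline="") as f, pyodbc.connect(CONN_STR) as conn:
        reader = csv.DictReader(f)
        cur = conn.cursor()
        cur.fast_executemany = True  # batch the inserts
        cur.executemany(
            "INSERT INTO dbo.StageOrders (OrderId, Customer, Amount) VALUES (?, ?, ?)",
            [(r["OrderId"], r["Customer"], r["Amount"]) for r in reader],
        )
        conn.commit()

if __name__ == "__main__":
    load_flat_file("orders.csv")
```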
Environment: MS SQL Server 2008/2012 Enterprise Edition, SQL BI Suite (SSAS, SSIS, SSRS), Enterprise manager, Crystal Reports, XML, C#.NET, MS PowerPoint, MS Project, Windows Server 2008, Oracle.