
Azure/GCP Data Engineer Resume


Houston, TX

Sr. Data Engineer

Offering 9+ years of experience

Available for Lead-level positions across functional sectors within a reputable IT organization

SUMMARY

  • Experience in data collection, extraction, cleaning, aggregation, mining, verification, analysis, reporting, and data warehousing environments.
  • Experience migrating existing on-premises databases to both Azure and Google Cloud environments for a better reporting experience.
  • Experienced in building modern data warehouses in both Azure Cloud and Google Cloud, including building reporting in Power BI or Data Studio.
  • Experience working on Azure services such as Data Lake, Data Lake Analytics, SQL Database, Synapse, Databricks, Data Factory, Logic Apps, and SQL Data Warehouse, and GCP services such as BigQuery, Dataproc, and Pub/Sub.
  • Experience developing Spark applications using PySpark, Scala, and Spark SQL for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
  • Extensively used the PySpark DataFrame API, Spark SQL, Python, and pandas UDFs to reproduce RDBMS-style transformations and rebuild ETL jobs with maximum cost and time efficiency (see the sketch after this list).
  • Highly experienced in using multi-cloud services to achieve maximum efficiency and performance for data applications; used Azure and Google Cloud Platform to build extensive data pipelines for terabyte-scale analysis.
  • Good experience writing Spark applications, with an in-depth understanding of Spark architecture including Spark Core, Spark SQL, DataFrames, Spark Streaming, and Kafka streaming.
  • Hands-on experience building CI/CD Azure Pipelines using Microsoft-hosted agents and running Python test suites.
  • Excellent communication skills, a strong work ethic, and a proactive team player with a positive attitude. Domain knowledge of finance, logistics, and health insurance.
  • Strong skills in visualization tools: Power BI and Microsoft Excel (formulas, Pivot Tables, charts) and DAX commands.
  • Worked on optimization of critical T-SQL, stored procedures, ETL jobs, and SSAS cubes.
  • Led database administration and performance tuning efforts to provide scalability, 24/7 availability of data, and timely resolution of end-user reporting and access problems.
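
A minimal PySpark sketch of the DataFrame API / pandas UDF pattern mentioned above; the table, columns, and paths are hypothetical.

```python
# Minimal PySpark sketch of an RDBMS-style transformation rebuilt with the
# DataFrame API and a pandas UDF. Table, column, and path names are hypothetical.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.appName("etl-rebuild-sketch").getOrCreate()

# Source extract that a stored procedure might otherwise have produced
orders = spark.read.parquet("/data/raw/orders")  # hypothetical path

@pandas_udf("double")
def net_amount(amount: pd.Series, discount: pd.Series) -> pd.Series:
    # Vectorized per-row computation, analogous to a scalar SQL function
    return amount * (1.0 - discount)

daily_totals = (
    orders
    .withColumn("net_amount", net_amount("amount", "discount"))
    .groupBy("order_date", "region")
    .agg(F.sum("net_amount").alias("daily_net"),
         F.count(F.lit(1)).alias("order_count"))
)

daily_totals.write.mode("overwrite").parquet("/data/curated/daily_totals")
```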

TECHNICAL SKILLS

  • Azure (Data Lake, Data Factory, Databricks)
  • Designing/modifying data warehouses
  • Programming: Scala, Python
  • MSBI (SSIS, SSAS, SSRS)
  • Data Visualization
  • Spark SQL, SQL Server
  • Microsoft Power BI
  • Microsoft Office
  • GCP (BigQuery, Dataproc, Pub/Sub)

PROFESSIONAL EXPERIENCE

Azure/GCP Data Engineer

Confidential, Houston, TX

Responsibilities:

  • Design new applications for high-volume transaction processing and scalability so they seamlessly support future modifications and growing volumes of data processed in the environment.
  • Implement solutions to run effectively in the cloud and improve the performance of big data processing and the high volume of data handled by the system, providing better customer support.
  • Work with business process managers as a subject matter expert for transforming vast amounts of data and creating business intelligence reports using state-of-the-art big data technologies (Hive, Spark, Sqoop, and NiFi for ingestion; Python/Bash scripting and Apache Airflow for scheduling jobs in GCP/Google's cloud environments).
  • Migrated an Oracle SQL ETL to run on Google Cloud Platform using Cloud Dataproc and BigQuery, with Cloud Pub/Sub triggering the Airflow jobs.
  • Worked with Presto, Hive, Spark SQL, and BigQuery through Python client libraries, building interoperable and faster programs for analytics platforms.
  • Hands-on experience with the big data services in Google Cloud Platform.
  • Used Apache Airflow in the GCP Cloud Composer environment to build data pipelines, using operators such as the bash operator, Hadoop/Dataproc operators, Python callables, and branching operators (see the DAG sketch after this list).
  • Moved data between BigQuery and Azure SQL Data Warehouse using ADF, and created cubes on Azure Analysis Services with complex DAX for memory-optimized reporting.
  • Built reports to monitor data loads into GCP and drive reliability at the site level.
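
A minimal sketch of the Composer/Airflow pattern referenced above, assuming Airflow 2.x; the DAG id, bucket, task logic, and branching rule are hypothetical.

```python
# Minimal Airflow DAG sketch of the Composer pipeline pattern described above.
# DAG id, bucket, and the load logic are hypothetical.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator, BranchPythonOperator

def load_to_bigquery(**context):
    # Placeholder for a Python callable that loads staged files into BigQuery
    print("loading staged data into BigQuery")

def choose_path(**context):
    # Branch on whether a full or incremental load is needed (hypothetical rule)
    return "full_load" if context["ds"].endswith("01") else "incremental_load"

with DAG(
    dag_id="gcp_ingest_sketch",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    stage_files = BashOperator(
        task_id="stage_files",
        bash_command="gsutil cp /tmp/extract/*.csv gs://example-staging-bucket/",
    )
    branch = BranchPythonOperator(task_id="choose_path", python_callable=choose_path)
    full_load = PythonOperator(task_id="full_load", python_callable=load_to_bigquery)
    incremental_load = PythonOperator(task_id="incremental_load", python_callable=load_to_bigquery)

    stage_files >> branch >> [full_load, incremental_load]
```
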
Azure Data Engineer

Confidential, Redmond, WA

Responsibilities:

  • Analyze, design, and build modern data solutions using Azure PaaS services to support visualization of data. Understand the current production state of the application and determine the impact of new implementations on existing business processes.
  • Extract data from on-premises and cloud storage and load it into Azure Data Lake using tools such as Databricks and Data Factory.
  • Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data between sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, and to write the results back.
  • Developed Spark applications using Scala and Spark SQL to create dimension and fact tables, with the processed data loaded back into Azure Data Lake (see the sketch after this list).
  • Moved data from Azure Data Lake to Azure SQL Data Warehouse using PolyBase, created external tables in the warehouse with 4 compute nodes, and scheduled the loads.
  • Responsible for estimating cluster size and for monitoring and troubleshooting the Spark cluster in Databricks.
  • Undertake data analysis and collaborate with the downstream analytics team to shape the data to their requirements.
  • Worked on performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory tuning. Wrote UDFs in Scala and stored procedures to meet specific business requirements.
  • Replaced existing MapReduce programs and Hive queries with Spark applications written in Scala and PySpark.
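
A minimal sketch of the dimension/fact build described above, shown in PySpark rather than the Scala used on the project; the storage account, paths, and columns are hypothetical.

```python
# Minimal PySpark sketch of a dimension/fact build written back to the data lake.
# Storage paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dim-fact-sketch").getOrCreate()

sales = spark.read.parquet("abfss://raw@examplelake.dfs.core.windows.net/sales")

# Dimension: one row per customer with a surrogate key
dim_customer = (
    sales.select("customer_id", "customer_name", "region")
    .dropDuplicates(["customer_id"])
    .withColumn("customer_key", F.monotonically_increasing_id())
)

# Fact: measures joined to the dimension's surrogate key
fact_sales = (
    sales.join(dim_customer.select("customer_id", "customer_key"), "customer_id")
    .select("customer_key", "order_date", "quantity", "amount")
)

# Processed data written back to the data lake, partitioned for downstream loads
dim_customer.write.mode("overwrite").parquet(
    "abfss://curated@examplelake.dfs.core.windows.net/dim_customer")
fact_sales.write.mode("overwrite").partitionBy("order_date").parquet(
    "abfss://curated@examplelake.dfs.core.windows.net/fact_sales")
```
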
Data Engineer

Confidential, Richmond, VA

Responsibilities:

  • Designed SSIS packages to transfer data from flat files and Excel into SQL Server using Business Intelligence Development Studio.
  • Extensively used SSIS transformations such as Lookup, Derived Column, Data Conversion, Aggregate, Conditional Split, SQL Task, Script Task, and Send Mail Task.
  • Performed data cleansing, enrichment, and mapping tasks, and automated data validation processes to ensure meaningful and accurate data was reported efficiently (see the sketch after this list).
  • Implemented complex business logic through T-SQL stored procedures, functions, views, and advanced query concepts.
  • Created databases and schema objects including tables, indexes, and constraints; connected various applications to the database; and wrote functions, stored procedures, and triggers.
  • Directly involved in requirement gathering and understanding business functions with the end users.
  • Provided valuable suggestions to users on database design and new application implementations to overcome existing performance and reliability issues.
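
A hypothetical Python/pandas sketch of the kind of automated validation checks described above; the project itself used SSIS and T-SQL, and the file, columns, and rules here are assumptions.

```python
# Hypothetical pandas sketch of automated data validation checks.
# The project used SSIS/T-SQL; file, column, and rule names are assumptions.
import pandas as pd

def validate(df: pd.DataFrame) -> list[str]:
    issues = []
    # Duplicate business keys
    if df["order_id"].duplicated().any():
        issues.append("duplicate order_id values found")
    # Required fields must not be null
    for col in ("order_id", "order_date", "amount"):
        if df[col].isna().any():
            issues.append(f"null values in {col}")
    # Simple sanity check on a measure
    if (df["amount"] < 0).any():
        issues.append("negative amounts found")
    return issues

df = pd.read_csv("orders_extract.csv", parse_dates=["order_date"])
problems = validate(df)
print("\n".join(problems) if problems else "validation passed")
```
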
MS SQL/MSBI Developer

Confidential, Chicago, IL

Responsibilities:

  • Analyzed business requirements, facilitating the planning and implementation phases of the OLAP model in team meetings.
  • Participated in team meetings to ensure a mutual understanding among the business, development, and test teams.
  • Encapsulated frequently executed SQL statements into stored procedures to reduce query execution times (see the sketch after this list).
  • Created SSIS packages to implement error/failure handling with event handlers, row redirects, and logging.
  • Implemented a master-child package model to improve maintenance and performance.
  • Configured packages with parameters to acquire values at runtime.
  • Optimized SSIS packages utilizing parallelism, fast load options, buffer sizes, and checkpoints.
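
A hypothetical sketch of invoking one of the encapsulated stored procedures from Python via pyodbc; the server, database, procedure, and column names are assumptions.

```python
# Hypothetical sketch: calling an encapsulated, parameterized stored procedure
# via pyodbc. Server, database, procedure, and column names are assumptions.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=example-sql;DATABASE=SalesDW;Trusted_Connection=yes;"
)
cursor = conn.cursor()

# A single parameterized call replaces the ad hoc SQL that used to be issued repeatedly
cursor.execute("{CALL dbo.usp_GetDailySales (?, ?)}", ("2024-01-01", "2024-01-31"))
for row in cursor.fetchall():
    print(row.order_date, row.total_amount)

cursor.close()
conn.close()
```
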
Sr MSBI/Power BI Developer

Confidential, Houston, TX

Responsibilities:

  • Obtained user approvals from the client for the collected requirements to ensure a shared understanding between the development team and the business.
  • Extensively utilized SSIS packages to create the complete ETL process and load data into the database used by Reporting Services.
  • Identified the dimension and fact tables and designed the data warehouse using a star schema.
  • Developed multidimensional objects (cubes, dimensions) using SQL Server Analysis Services (SSAS).
  • Designed dimensional models using SSAS for end users, created hierarchies in the dimensional model, and developed aggregations, partitions, and calculated members for the cubes.
  • Utilized multidimensional and tabular models in Analysis Services to pull and generate reports directly in-memory from back-end relational data sources.
  • Created reports that retrieve data using stored procedures that accept parameters.
  • Used SQL Server Profiler for auditing and analyzing the events that occurred during a particular time window and stored them in a script.

ETL BI Developer

Confidential

Responsibilities:

  • Participated in discussions with the team leader, group members, and technical manager regarding technical and business requirement issues.
  • Developed SSIS packages to extract, transform, and load (ETL) data into the data warehouse database from heterogeneous databases/data sources.
  • Developed stored procedures, triggers, functions, and T-SQL queries to capture updated and deleted data.
  • Used data transformations such as Lookup, Derived Column, Conditional Split, and Data Conversion while creating the SSIS packages.
  • Effectively handled data errors during the modification of existing reports and the creation of new reports.
  • Exported and imported data from CSV files, text files, and Excel spreadsheets by creating SSIS packages (see the sketch after this list).
  • Built Reporting Services reports based on existing XML documents.
  • Scheduled jobs and alerts using SQL Server Agent.
  • Created and maintained users and roles and granted privileges.
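
A hypothetical pandas sketch of the CSV/Excel import and export described above; the project itself used SSIS, and the file names, table names, and connection string here are assumptions.

```python
# Hypothetical pandas sketch of moving CSV/Excel data in and out of SQL Server.
# The project used SSIS; files, tables, and the connection string are assumptions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(
    "mssql+pyodbc://example-sql/StagingDB"
    "?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes"
)

# Import: load a flat file and a spreadsheet into staging tables
pd.read_csv("customers.csv").to_sql("stg_customers", engine, if_exists="replace", index=False)
pd.read_excel("orders.xlsx", sheet_name="Orders").to_sql(
    "stg_orders", engine, if_exists="replace", index=False)

# Export: write a query result back out to CSV for downstream consumers
pd.read_sql("SELECT * FROM dbo.DailySummary", engine).to_csv("daily_summary.csv", index=False)
```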

Environment: MS SQL Server 2012, SSIS, SSRS, MS Visual Studio 2012, Microsoft Office

Sr MSBI Developer

Confidential

Responsibilities:

  • Designed and developed various SSIS (ETL) packages to extract and transform data, and was involved in scheduling the SSIS packages.
  • Performed cost-benefit analysis of different ETL (SSIS) packages to determine the optimal process.
  • Created ETL metadata reports using SSRS; the reports included execution times for the SSIS packages and failure reports with error descriptions.
  • Created OLAP applications with OLAP services in SQL Server and built cubes with many dimensions using both star and snowflake schemas.
  • Extracted and transformed data from OLTP databases to the database designed specifically for OLAP services (was involved in creating all objects for that database) during off-peak hours.
  • Created stored procedures and performed index analysis for slow SSRS reports, bringing report rendering time down from 15 minutes to 2-3 minutes.
  • Developed SQL queries/scripts to validate the data, such as checking for duplicates, null values, and truncated values, and ensuring correct data aggregations (see the sketch after this list).
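
A hypothetical sketch of the validation queries described above, issued from Python via pyodbc; the table names and connection string are assumptions.

```python
# Hypothetical sketch of data validation queries (duplicates, nulls, and a
# source-to-target total reconciliation). Tables and connection are assumptions.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=example-sql;DATABASE=ReportingDW;Trusted_Connection=yes;"
)
cursor = conn.cursor()

checks = {
    "duplicate keys": """
        SELECT COUNT(*) FROM (
            SELECT order_id FROM dbo.FactSales
            GROUP BY order_id HAVING COUNT(*) > 1
        ) d""",
    "null order dates": "SELECT COUNT(*) FROM dbo.FactSales WHERE order_date IS NULL",
    "source/target total mismatch": """
        SELECT CASE WHEN
            (SELECT SUM(amount) FROM staging.Sales) =
            (SELECT SUM(amount) FROM dbo.FactSales)
        THEN 0 ELSE 1 END""",
}

for name, sql in checks.items():
    failed = cursor.execute(sql).fetchone()[0]
    print(f"{name}: {'FAIL' if failed else 'OK'}")

conn.close()
```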
