
Senior Big Data Engineer Resume


Chandler, AZ

SUMMARY

  • 12+ years of IT experience as a Senior Big Data Engineer on Azure Cloud. Experience migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises databases to Azure Data Lake Store using Azure Data Factory.
  • Experience developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns (see the sketch after this list).
  • Technical expertise spans Data Warehousing, Enterprise Data Platforms (EDP), and Big Data.
  • Good understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors, and tasks.
  • Experience using Agile methodologies, including Extreme Programming (XP), Scrum, and Test-Driven Development (TDD).
  • Hands-on experience with Spark SQL queries and DataFrames: importing data from data sources, performing transformations and read/write operations, and saving the results to an output directory in HDFS.
  • Hands-on experience creating Delta Lake tables and applying partitions for faster querying.
  • Experience handling large datasets using partitions, Spark in-memory capabilities, broadcast variables, and effective, efficient joins and transformations during the ingestion process itself.
  • Good understanding of HDFS design, daemons, and HDFS High Availability (HA).
  • Hands-on experience with the data ingestion tools Kafka and Flume and the workflow management tool Oozie.
  • Development-level experience in Microsoft Azure, providing data movement and scheduling functionality for cloud-based technologies such as Azure Blob Storage and Azure SQL Database.
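
For illustration, a minimal sketch of the Spark SQL and Delta Lake work described above, as it might look in a Databricks notebook. All paths, column names, and the table name are hypothetical, and the Delta format is assumed to be available as it is on Databricks.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Databricks notebooks provide `spark`; creating it here keeps the sketch standalone.
spark = SparkSession.builder.appName("usage-patterns-etl").getOrCreate()

# Hypothetical inputs: the same events arrive as CSV and JSON extracts.
csv_df = spark.read.option("header", "true").csv("/mnt/raw/events_csv/")
json_df = spark.read.json("/mnt/raw/events_json/")

# Align the two sources on a common set of columns and combine them.
events = (csv_df.select("customer_id", "event_type", "event_ts")
          .unionByName(json_df.select("customer_id", "event_type", "event_ts")))

# Aggregate to surface usage patterns: events per customer per day.
daily_usage = (events
               .withColumn("event_date", F.to_date("event_ts"))
               .groupBy("customer_id", "event_date")
               .agg(F.count("*").alias("event_count")))

# Persist as a Delta Lake table, partitioned by date for faster querying.
(daily_usage.write
 .format("delta")
 .mode("overwrite")
 .partitionBy("event_date")
 .saveAsTable("analytics.daily_usage"))
```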

TECHNICAL SKILLS

Big Data Ecosystems: Spark, Hive, HDFS, MapReduce, Impala, Apache Kafka, Sqoop, Pig, Oozie, Flume, YARN, NiFi

Cloud Technologies: Azure and Big Data Appliance

Programming Languages: Python, SQL, HQL, Visual Basic 6

Databases: MySQL, Oracle 11.2.0.4, SQL Server, DB2, SQL Data Warehouse

Software Package: Azure Cloud, Big Data Appliance, Cloudera, Oracle EBS Financial (Technical)

Design Tools: Hive, Spark, Azure Data Factory, PL/SQL Developer, Databricks, Tableau, Power BI

Reporting Tools: Crystal Reports, OBIEE, Tableau, Jaspersoft, Big Data Discovery

Methodologies: Agile, Waterfall

Version Control: Azure DevOps, Git

PROFESSIONAL EXPERIENCE

Confidential, Chandler, AZ

Senior Big Data Engineer

Responsibilities:

  • Extracted, transformed, and loaded data from various sources to Azure Data Storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics).
  • Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks with PySpark.
  • Migrated data from on-premises SQL Server to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB); see the sketch after this list.
  • Created Azure Data Lake Storage (ADLS) Gen1 and Gen2 accounts and ingested data from flat files, CSV files, JSON files, and on-premises database tables using Azure Data Factory V2 (ADF).
  • Responsible for ingesting data from various source systems (RDBMS, flat files, Big Data) into Azure Blob Storage using a framework model.
  • Moved data from Teradata, Oracle, Snowflake, and SQL Server to ADLS Gen2.
  • Automated Power BI report refreshes using Azure Data Factory (ADF) trigger pipelines that run when source data is updated, based on the change log.
  • Applied a good understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, and Spark Streaming.
  • Developed Talend jobs to populate claims data into the data warehouse using star, snowflake, or hybrid schemas, depending on the use case.
  • Involved in application design and data architecture using cloud and Big Data solutions on Azure.
  • Created Delta Lake tables and applied partitions for faster querying.
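
For illustration, one way the SQL Server-to-ADLS Gen2 migration could be expressed in Databricks PySpark. The JDBC URL, table, partition bounds, columns, and storage path are placeholders; in practice credentials would come from a Databricks secret scope backed by Azure Key Vault rather than appearing inline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("sqlserver-to-adls").getOrCreate()

# Placeholder connection details for the on-premises SQL Server source.
jdbc_url = "jdbc:sqlserver://onprem-host:1433;databaseName=claims"
props = {"user": "etl_user", "password": "<secret>",
         "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"}

# Read the source table over JDBC, split into parallel partitions by key
# so the copy is not bottlenecked on a single connection.
claims = spark.read.jdbc(jdbc_url, "dbo.claims",
                         column="claim_id", lowerBound=1,
                         upperBound=10_000_000, numPartitions=16,
                         properties=props)

# Land the data in ADLS Gen2 as Delta, partitioned for downstream queries.
(claims.withColumn("claim_year", F.year("claim_date"))
 .write.format("delta")
 .mode("overwrite")
 .partitionBy("claim_year")
 .save("abfss://datalake@mystorageacct.dfs.core.windows.net/curated/claims"))
```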

Confidential, MN

Big Data Engineer

Responsibilities:

  • Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data between sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, including write-back to the sources in the reverse direction.
  • Developed Spark applications using Python and Spark SQL in Azure Databricks for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
  • Used Jupyter notebooks and PyCharm to develop, test, and analyze Spark jobs before scheduling them as customized Spark jobs.
  • Rewrote the existing MapReduce programs and Hive queries as Spark applications using Python (see the sketch after this list).
  • Deployed and tested our developed code through CI/CD using Visual Studio Team Services (VSTS).
  • Developed JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process the data using the Cosmos activity.
  • Performed fundamental tasks related to the design, construction, monitoring, and maintenance of Microsoft SQL databases.
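
For illustration, a sketch of what replacing a Hive query with a Spark application can look like; the table, columns, and output path are invented for the example.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("hive-to-spark")
         .enableHiveSupport()   # lets Spark read the existing Hive tables
         .getOrCreate())

# Original Hive query (illustrative):
#   SELECT customer_id, SUM(amount) AS total
#   FROM sales
#   WHERE sale_date >= '2020-01-01'
#   GROUP BY customer_id
#   HAVING SUM(amount) > 1000;

# Equivalent Spark DataFrame pipeline:
totals = (spark.table("sales")
          .filter(F.col("sale_date") >= "2020-01-01")
          .groupBy("customer_id")
          .agg(F.sum("amount").alias("total"))
          .filter(F.col("total") > 1000))

totals.write.mode("overwrite").parquet("/warehouse/output/customer_totals")
```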

Confidential, New York, NY

Data Engineer

Responsibilities:

  • Created Azure Databricks jobs to operationalize Azure data flows.
  • Implemented CI/CD via Azure DevOps, using Jenkins to migrate all the Azure Data Factory pipelines.
  • Developed JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process the data.
  • Used Databricks with Azure Data Factory (ADF) to process large volumes of data.
  • Developed Python scripts to perform file validations in Databricks and automated the process using ADF.
  • Developed streaming pipelines using Azure Event Hubs and Stream Analytics to analyze data (see the sketch after this list).
  • Applied a Lambda architecture in Azure Databricks that combines both batch and real-time processing methods.
  • Used JIRA for issue and project tracking, TFS for version control, and Control-M for scheduling jobs.
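
For illustration, a sketch of a streaming pipeline reading from Azure Event Hubs, here via its Kafka-compatible endpoint so that Spark's built-in Kafka source can consume it. The namespace, hub name, payload schema, and paths are placeholders, and the connection string would normally be stored as a secret.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

# Requires the spark-sql-kafka connector on the classpath (bundled on Databricks).
spark = SparkSession.builder.appName("eventhubs-streaming").getOrCreate()

# Placeholder Event Hubs connection string, passed as the SASL password.
conn = "Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=..."
jaas = ('org.apache.kafka.common.security.plain.PlainLoginModule required '
        f'username="$ConnectionString" password="{conn}";')

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "mynamespace.servicebus.windows.net:9093")
       .option("kafka.security.protocol", "SASL_SSL")
       .option("kafka.sasl.mechanism", "PLAIN")
       .option("kafka.sasl.jaas.config", jaas)
       .option("subscribe", "telemetry")   # the event hub name acts as the Kafka topic
       .load())

# Assumed JSON payload schema for the illustration.
schema = (StructType()
          .add("device_id", StringType())
          .add("reading", DoubleType())
          .add("event_ts", TimestampType()))

# Parse the payload and compute per-device averages over 5-minute windows.
parsed = (raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
             .select("e.*"))
windowed = (parsed
            .withWatermark("event_ts", "10 minutes")
            .groupBy(F.window("event_ts", "5 minutes"), "device_id")
            .agg(F.avg("reading").alias("avg_reading")))

(windowed.writeStream
 .format("delta")
 .outputMode("append")
 .option("checkpointLocation", "/mnt/checkpoints/telemetry")
 .start("/mnt/curated/telemetry_5min"))
```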

Confidential, St. Louis, MO

SQL Developer

Responsibilities:

  • Responsible for gathering requirements from the business to create package specs.
  • Installed and configured SQL Server 2008; created Entity Relationship (ER) diagrams for the proposed database.
  • Extensively used joins and sub-queries in complex queries involving multiple tables from different databases.
  • Responsible for creating databases, tables, clustered/non-clustered indexes, unique/check constraints, views, stored procedures, triggers, rules, and defaults (see the sketch after this list).
  • Built and maintained SQL scripts, indexes, and complex queries for data analysis and extraction.
  • Performed fundamental tasks related to the design, construction, monitoring, and maintenance of Microsoft SQL databases.
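
For illustration, a small sketch of the kind of DDL described above, driven from Python via pyodbc; the driver, server, database, and object names are all hypothetical, and a real deployment would use integrated security or vaulted credentials rather than an inline password.

```python
import pyodbc

# Hypothetical connection details.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sqlhost;DATABASE=OrdersDb;UID=etl_user;PWD=<secret>"
)
conn.autocommit = True
cur = conn.cursor()

# Table with a clustered primary key and unique/check/default constraints.
cur.execute("""
CREATE TABLE dbo.Orders (
    OrderId    INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    CustomerId INT NOT NULL,
    OrderDate  DATETIME NOT NULL DEFAULT GETDATE(),
    Amount     DECIMAL(10,2) NOT NULL CHECK (Amount >= 0),
    OrderCode  VARCHAR(20) NOT NULL UNIQUE
)
""")

# Non-clustered index to support lookups by customer.
cur.execute("CREATE NONCLUSTERED INDEX IX_Orders_CustomerId ON dbo.Orders (CustomerId)")

# Simple stored procedure over the new table.
cur.execute("""
CREATE PROCEDURE dbo.GetCustomerTotal @CustomerId INT AS
BEGIN
    SELECT SUM(Amount) AS Total FROM dbo.Orders WHERE CustomerId = @CustomerId
END
""")
```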
