Senior Data Engineer Resume
Atlanta, GA
SUMMARY
- 7+ years of professional IT experience as a Data Engineer, with expertise in database development, ETL development, data modeling, report development, and Big Data technologies.
- Experience with Microsoft Azure cloud technologies including Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Azure Synapse Analytics (SQL Data Warehouse), Azure SQL Database, Azure Analysis Services, PolyBase, Azure Cosmos DB (NoSQL), Azure Key Vault, Azure DevOps, Azure HDInsight, and Big Data technologies such as Hadoop, Apache Spark, and Azure Databricks.
- Experience developing Spark applications using Spark SQL in Databricks to extract, transform, and aggregate data from multiple file formats, uncovering insights into customer usage patterns.
- Good understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors, and tasks.
- Experience migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse.
- Expertise in building RDD transformations, actions, DataFrames, and case classes for the required input data, as well as converting RDDs to DataFrames (see the sketch after this list).
- Hands-on Bash scripting experience, including building data pipelines on Unix/Linux systems.
- Familiar with file formats including Avro, Parquet, CSV, XML, JSON, and Delta.
- Strong in data warehousing concepts, star and snowflake schema methodologies, and understanding business processes/requirements.
- Experienced in designing, building, deploying, and utilizing most of the AWS stack (including EC2, S3, and EMR), with a focus on scalability, high availability, fault tolerance, and auto-scaling.
- Working knowledge of Tableau, including report creation.
- Expertise in cloud migration of existing applications and data feeds.
- Experienced in projects using JIRA, Maven, and Jenkins build and testing tools.
- Experience with workflow management tools such as Airflow, Databricks Workflows, and Azure Data Factory.
- Involved in the entire software development life cycle (SDLC), using both Agile and Waterfall methodologies.
- Excellent communication and interpersonal skills, and the ability to operate effectively in cross-functional teams.
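
The bullets above mention building Spark SQL applications over multiple file formats and converting RDDs to DataFrames; the following is a minimal PySpark sketch of that pattern. All paths and column names are illustrative assumptions, and the Delta write assumes a Databricks-style environment where Delta Lake is available.

    # Minimal illustrative sketch; paths and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import count

    spark = SparkSession.builder.appName("usage-patterns").getOrCreate()

    # Extract: Spark readers handle multiple file formats uniformly.
    events_json = spark.read.json("/data/raw/events_json/")
    events_parquet = spark.read.parquet("/data/raw/events_parquet/")

    # Transform: union the sources and aggregate usage per customer.
    usage = (
        events_json.select("customer_id", "event_type")
        .unionByName(events_parquet.select("customer_id", "event_type"))
        .groupBy("customer_id", "event_type")
        .agg(count("*").alias("event_count"))
    )

    # An RDD of tuples can be promoted to a DataFrame with named columns.
    rdd = spark.sparkContext.parallelize([("c1", "login"), ("c2", "purchase")])
    df_from_rdd = rdd.toDF(["customer_id", "event_type"])

    # Load: write the aggregate out as Delta (assumes Delta Lake is installed).
    usage.write.mode("overwrite").format("delta").save("/data/curated/usage")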
TECHNICAL SKILLS
Big Data Technologies: HDFS, Hive, Apache Spark, MapReduce, Kafka, Pig, Sqoop, Databricks Delta, Oozie
Languages: Python, Java, C++, C, SQL, Scala
Databases: MySQL, SQL Server, AWS Redshift, AWS DynamoDB, PostgreSQL, MongoDB, Apache HBase, Google Cloud BigQuery
Cloud: Microsoft Azure, AWS, Google Cloud, Snowflake
Cloud Stack: Microsoft Azure (Data Lake, Databricks, Data Storage, Data Factory), AWS (S3, EC2, EMR, Redshift, IAM, CloudWatch, QuickSight)
CI/CD and ETL Tools: Kubernetes, Docker, Jenkins, Informatica
Scheduling Tools: Airflow, Databricks Workflows, Azure Data Factory
PROFESSIONAL EXPERIENCE
Confidential, Atlanta, GA
Senior Data Engineer
Responsibilities:
- Developed multiple optimized PySpark applications using Azure Databricks.
- Developed ETL solutions using SSIS, Azure Data Factory, and Azure Databricks.
- Responsible for Data Ingestion, Data Cleansing, Data Standardization and Data Transformation.
- Built scalable data pipelines on the Azure cloud platform using a variety of tools.
- Converted Hive queries into Spark transformations using Spark RDDs, Python, and Scala (sketched below).
- Wrote complex SQL statements using joins, subqueries, and correlated subqueries.
- Developed data pipelines using Azure Data Factory that process Cosmos DB activity data.
- Created HDInsight clusters and storage accounts for running jobs.
- Migrated the data from Redshift data warehouse to Snowflake database.
- Developed Spark scripts using Scala, Python, and shell commands.
- Developed real-time ingestion pipelines from Azure Event Hubs into downstream tools.
- Built ETL solutions using Hive and Spark with Python and Scala.
- Optimized applications built with tools such as Spark and Hive.
- Automated jobs across clusters using the Airflow scheduler (see the DAG sketch below).
- Set up monitoring for Hadoop production jobs using the ELK stack.
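
A minimal sketch of the Hive-to-Spark conversion mentioned above; the table, columns, and filter are hypothetical examples, not from the actual project.

    # Hypothetical example: rewriting a Hive query as Spark transformations.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import sum as sum_

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Original Hive query:
    #   SELECT region, SUM(amount) AS total
    #   FROM sales
    #   WHERE year = 2020
    #   GROUP BY region;

    # Equivalent Spark DataFrame transformations:
    totals = (
        spark.table("sales")
        .filter("year = 2020")
        .groupBy("region")
        .agg(sum_("amount").alias("total"))
    )
    totals.show()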
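And a bare-bones Airflow DAG of the kind used for the job automation above; the DAG id, schedule, and spark-submit command are placeholders.

    # Placeholder Airflow DAG; id, schedule, and command are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_spark_ingest",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        run_spark_job = BashOperator(
            task_id="run_spark_job",
            bash_command="spark-submit /jobs/ingest.py",
        )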
Confidential
Data Engineer
Responsibilities:
- Gathered requirements from users to develop data pipelines from different sources into Hadoop.
- Designed and maintained data governance and security for data platforms on the AWS cloud.
- Created on-demand tables over S3 files using Lambda functions and AWS Glue with Python and PySpark (see the sketch after this list).
- Used Terraform to provision cloud resources; built a proof of concept on AWS Glue.
- Experience moving high- and low-volume data objects from Teradata and Hadoop to Snowflake.
- Developed a reusable framework, to be leveraged for future migrations, that automates ETL from RDBMS systems to the data lake using Spark Data Sources and Hive data objects.
- Created Databricks notebooks using Python (PySpark), Scala, and Spark SQL to transform data stored in Azure Data Lake Storage Gen2 from Raw to Stage and Curated zones.
- Integrated day-to-day with the database administrators (DBAs) for DB2, SQL Server, and Oracle, and with AWS Cloud teams, to ensure database tables, columns, and metadata were successfully implemented in the DEV, QUAL, and PROD environments in AWS Cloud (Aurora and Snowflake).
- Deployed instances, provisioned EC2 and S3 buckets, and configured security groups and the Hadoop ecosystem for Cloudera on AWS.
- Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
- Used SSIS, Python scripts, and Spark applications for ETL operations to create data flow pipelines, transforming data from legacy tables into Hive and S3 buckets for handoff to business users and data scientists to build analytics on.
- Expertise in migrating existing applications, Informatica data feeds, and ETL pipelines to Hadoop, Snowflake, and AWS.
- Created a Hadoop process to load 1 billion records within one hour, matching the required latency and catching up the backlog for one of Honeywell's key datasets.
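
A minimal sketch of the Lambda-plus-Glue pattern mentioned above for creating on-demand tables over S3 files. The crawler name and event wiring are illustrative assumptions; in practice the handler would be attached to an S3 "ObjectCreated" notification.

    # Hypothetical Lambda handler: when new files land in S3, start a Glue
    # crawler so the objects become queryable as on-demand tables.
    import boto3

    glue = boto3.client("glue")

    def lambda_handler(event, context):
        # Log the S3 objects that triggered this invocation.
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            print(f"New object s3://{bucket}/{key}; starting crawler")

        # The crawler scans the S3 prefix and creates/updates Glue tables.
        glue.start_crawler(Name="s3-on-demand-tables")  # name is a placeholder
        return {"status": "crawler started"}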
Confidential
Data Engineer
Responsibilities:
- Designed and implemented database solutions in Azure SQL Data Warehouse and Azure SQL Database.
- Designed, set up, maintained, and administered Azure SQL Database, Azure Analysis Services, Azure SQL Data Warehouse, and Azure Data Factory.
- Proposed architectures with Azure cost/spend in mind and developed recommendations to right-size data infrastructure.
- Pulled data into Power BI from various sources such as SQL Server, Excel, Oracle, and Azure SQL.
- Developed Spark jobs using Scala in a test environment for faster data processing, and used Spark SQL for querying.
- Collaborated with application architects and DevOps teams.
- Stored Parquet data in Hive with daily date partitions for further queries (see the sketch after this entry).
- Executed Oozie workflows to run multiple Hive and Pig jobs.
Environment: Hadoop, Hive, Spark 1.6, Scala 2.10, Sqoop, Oozie, AutoSys, HBase, Pig.
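
A minimal sketch of the partitioned Parquet-to-Hive load mentioned above, written against the modern SparkSession API (a Spark 1.6 deployment would use HiveContext instead); the database, table, path, and date are hypothetical.

    # Illustrative sketch: store Parquet data in a Hive table with daily
    # date partitions; names and paths are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import lit

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Read the day's Parquet drop and tag it with its partition date.
    daily = spark.read.parquet("/data/incoming/2020-06-01/").withColumn(
        "load_date", lit("2020-06-01")
    )

    # Append into a Hive table partitioned by load_date, so each day's
    # run lands in its own partition for efficient later queries.
    (
        daily.write.mode("append")
        .format("parquet")
        .partitionBy("load_date")
        .saveAsTable("analytics.events_daily")
    )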
Confidential
Assistant System Engineer
Responsibilities:
- Analyzed business requirements and system specifications to understand the application.
- Involved in preparing High-Level Design and Low-Level Design documents.
- Involved in coding and unit testing new code.
- Prepared test plans and test data.
- Tested code changes at the functional and system levels.
- Conducted quality reviews of design documents, code, and test plans.
- Ensured availability of documents/code for review and conducted quality reviews of testing.
- Fixed problems discovered within the existing system functionality.
- Modified code to prevent problems from recurring (preventive maintenance).
- Presented project inductions to new joiners.
Environment: Java, Maven, UNIX, Eclipse, SoapUI, WinSCP, Tomcat, JSP, Quality Center
