Senior Data Engineer Resume
Atlanta, GA
SUMMARY
- Over 7 years of experience in Business Intelligence, Data Engineering, and Data Warehousing
- Hands-on experience in infrastructure development and operations. Designed and deployed applications using AWS services such as EC2, S3, Glue, Lambda, EMR, VPC, RDS, Auto Scaling, CloudFormation, CloudWatch, Redshift, Athena, Kinesis Data Firehose, and Kinesis Data Streams
- Experienced in extract, transform, and load (ETL) processing of large datasets in different forms, including structured, semi-structured, and unstructured data
- Experience in understanding business requirements for analysis, database design, and application development
- Strong SQL development skills, including writing stored procedures, triggers, views, and user-defined functions
- Experience developing real-time processes
- Extensive Shell/Python scripting experience for scheduling and process automation
- Versatile in deploying content to the AWS cloud platform using S3
- Experience configuring AWS IAM and Security Groups in public and private subnets within a VPC
- Expertise in converting existing AWS infrastructure to serverless architecture (AWS Lambda, Kinesis) and deploying it with AWS CloudFormation (an illustrative sketch follows this list)
- Extensively worked with Jenkins: installed, configured, and maintained it for Continuous Integration (CI) and end-to-end automation of all builds and deployments, including implementing CI/CD for databases
- Strong work ethic with a desire to succeed and make significant contributions to the organization
- Experience working both independently and collaboratively to solve problems and deliver high-quality results in a fast-paced, unstructured environment.
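As an illustration of the serverless pattern noted above, a minimal Python sketch of a Lambda handler consuming a Kinesis stream; only the standard Kinesis event layout is assumed, and the downstream processing step is a hypothetical placeholder.

import base64
import json

def lambda_handler(event, context):
    # Minimal sketch of a Lambda handler consuming a Kinesis stream
    processed = 0
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded inside the event record
        payload = base64.b64decode(record["kinesis"]["data"])
        message = json.loads(payload)
        # Hypothetical processing step: a real pipeline might write the
        # message to S3, Redshift, or another downstream store
        print(message)
        processed += 1
    return {"processed_records": processed}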
TECHNICAL SKILLS
Big Data Technologies: AWS, S3, Lambda, Triggers, Glue, EMR, Kinesis, Redshift, Hadoop, HDFS, Hive, MapReduce, Pig, Flume, Oozie, HBase, Spark
Programming Languages: Python, Java, Scala
Databases: MySQL, SQL/PL-SQL, MS-SQL Server 2005, Oracle 9i/10g/11g
Scripting Languages: JavaScript, HTML5, CSS3, XML, SQL, Shell
NoSQL Databases: Cassandra, HBase
Operating Systems: Linux, Windows XP/7/8/10
Software Life Cycle: SDLC, Waterfall, Agile
Office Tools: MS-Office, MS-Project, Risk Analysis tools, Visio
PROFESSIONAL EXPERIENCE
Confidential, Atlanta, GA
Senior Data Engineer
Responsibilities:
- Designed and built a full end-to-end data warehouse infrastructure from the ground up on Redshift, handling thousands of records every day
- Implemented and managed ETL solutions and automated operational processes
- Designed and developed ETL integration patterns using Python on Spark
- Developed a framework for converting existing PowerCenter mappings to PySpark (Python and Spark) jobs (an illustrative sketch follows this list)
- Wrote various data normalization jobs for new data ingested into Redshift
- Worked on optimizing volumes and EC2 instances, created multiple VPCs, and used IAM to create new accounts, roles, and groups
- Implemented Spark RDD transformations to map business analysis and applied actions on top of those transformations
- Built S3 buckets, managed bucket policies, and used S3 and Glacier for storage and backup on AWS
- Optimized and tuned the Redshift environment, enabling queries to perform up to 100x faster for Tableau and SAS Visual Analytics
- Integrated services like GitHub, AWS CodePipeline, Jenkins and AWS Elastic Beanstalk to create a deployment pipeline
- Created monitors, alarms and notifications for EC2 hosts using CloudWatch
- Implemented the build framework for new projects using Jenkins as the build tool.
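A minimal PySpark sketch of the kind of job the PowerCenter mappings above were converted to: a simple normalization pass staged to S3 for loading into Redshift. The bucket paths and column names are hypothetical placeholders.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Minimal sketch only; paths and column names are hypothetical placeholders.
spark = SparkSession.builder.appName("redshift-normalization").getOrCreate()

raw = spark.read.json("s3://example-bucket/incoming/")  # hypothetical source

normalized = (
    raw.withColumn("event_date", F.to_date("event_ts"))       # hypothetical columns
       .withColumn("amount", F.col("amount").cast("double"))
       .dropDuplicates(["event_id"])
)

# Stage the cleaned data back to S3; a Redshift COPY (or the spark-redshift
# connector) would typically load it from there.
normalized.write.mode("overwrite").parquet("s3://example-bucket/normalized/")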
Environment: AWS, S3, Redshift, Kinesis Data Firehose, Kinesis Data Streams, CloudWatch, Git, Apache Spark, Python, PySpark, MySQL, Shell scripts, Lambda, CloudFormation, CloudTrail, CloudFront, Docker
Confidential
Data Engineer
Responsibilities:
- Designed and implemented scalable, secure cloud architecture based on Amazon Web Services. Leveraged AWS services such as EC2, Auto Scaling, and VPC (Virtual Private Cloud) to build secure, highly scalable, and flexible systems that handled expected and unexpected load bursts and could quickly evolve during development iterations
- Designed and deployed AWS solutions using EC2, S3, EBS, Elastic Load Balancer (ELB), Auto Scaling groups, and OpsWorks
- Worked on optimizing volumes and EC2 instances, created multiple VPCs, and worked with IAM to create new accounts, roles, and groups
- Created alarms and notifications for EC2 instances using CloudWatch
- Configured S3 versioning and lifecycle policies to back up files and archive them in Glacier; created a Lambda function to automate snapshot backups on AWS and set up the scheduled backup (an illustrative sketch follows this list)
- Extracted data and sourced it to the data science team.
- Worked closely with the data science team to help them understand the data
- Used the AWS CLI to suspend an AWS Lambda function processing an Amazon Kinesis stream and then resume it
- Enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework
- Designed, installed, and implemented the Ansible configuration management system, wrote Ansible playbooks in YAML, and deployed applications
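A minimal Python sketch of the snapshot-automation Lambda described above, assuming volumes are selected by a hypothetical Backup=true tag; the boto3 calls shown (describe_volumes, create_snapshot) are standard, and a CloudWatch Events schedule would invoke the handler.

import boto3

ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # Select volumes by a hypothetical Backup=true tag convention
    volumes = ec2.describe_volumes(
        Filters=[{"Name": "tag:Backup", "Values": ["true"]}]
    )["Volumes"]

    snapshot_ids = []
    for volume in volumes:
        # Create a point-in-time snapshot of each tagged volume
        snapshot = ec2.create_snapshot(
            VolumeId=volume["VolumeId"],
            Description="Scheduled backup via Lambda",
        )
        snapshot_ids.append(snapshot["SnapshotId"])

    return {"snapshots_created": snapshot_ids}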
Environment: Git, Jenkins, AWS, EC2, VPC, S3, EBS, ELB, OpsWorks, IAM, CloudWatch, Lambda, AWS CLI, Kinesis, SQL
Confidential
Big Data Engineer
Responsibilities:
- Developed simple to complex MapReduce streaming jobs for processing and validating the data
- Developed a data pipeline using MapReduce, Flume, Sqoop, and Pig to ingest customer behavioral data into HDFS for analysis
- Developed MapReduce and Spark jobs to discover trends in data usage by users
- Implemented Spark using Python and Spark SQL for faster processing of data
- Developed Pig Latin scripts to perform MapReduce jobs
- Imported data from different sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the transformed data into HDFS
- Collected and aggregated large amounts of log data using Flume and staged it in HDFS for further analysis
- Wrote automated HBase test cases for data quality checks using HBase command-line tools
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager
- Used the Tez framework to build high-performance jobs in Pig and Hive
- Developed end-to-end data processing pipelines that receive data through the Kafka distributed messaging system and persist it into HBase (an illustrative sketch follows this list)
- Configured Kafka to read and write messages from external programs and to handle real-time data
- Wrote a Storm topology that accepts data from a Kafka producer, processes it, and emits it into Cassandra
- Developed interactive shell scripts for scheduling various data cleansing and data loading processes
- Performed data validation on the ingested data using MapReduce, building a custom model to filter out invalid records and cleanse the data
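A minimal Python sketch of the Kafka-to-HBase pipeline described above, using the kafka-python and happybase client libraries; the topic, broker, table, and column-family names are hypothetical placeholders, and error handling is omitted.

import json

import happybase                 # HBase client (Thrift)
from kafka import KafkaConsumer  # kafka-python

# Consume JSON events from a hypothetical Kafka topic
consumer = KafkaConsumer(
    "customer-events",                      # hypothetical topic
    bootstrap_servers="kafka-broker:9092",  # hypothetical broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

connection = happybase.Connection("hbase-thrift-host")  # hypothetical host
table = connection.table("customer_events")             # hypothetical table

for message in consumer:
    event = message.value
    # Row key and column-family layout are illustrative only
    table.put(
        event["event_id"].encode("utf-8"),
        {b"cf:payload": json.dumps(event).encode("utf-8")},
    )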
Environment: Hadoop, MapReduce, Spark, Pig, Hive, Sqoop, Oozie, HBase, Kafka, Spark Streaming, Flume, Storm, Tez, Impala, Mahout, Cassandra, Cloudera Manager, MySQL, Windows, Unix
Confidential
Java Developer
Responsibilities:
- Worked on front-end applications using HTML, CSS, and JavaScript
- Responsible for developing various modules, front-end and back-end components using several design patterns based on client's business requirements
- Designed and developed application modules using the Spring and Hibernate frameworks
- Designed and developed the front end with Swing and the Spring MVC framework, using tag libraries and custom tag libraries, and developed the presentation tier with JSP pages integrating AJAX, custom tags, JSP tag lists, HTML, JavaScript, and jQuery
- Used Hibernate to develop persistent classes following ORM principles
- Deployed Spring configuration files such as application context, application resources, and application files
- Used Java/J2EE patterns such as Model View Controller (MVC), Business Delegate, Session Facade, Service Locator, Data Transfer Object, Data Access Object, Singleton, and Factory
- Worked with Maven for build scripts and set up the Log4j logging framework
- Managed version control for deliverables by streamlining and rebasing development streams in SVN
Environment: Java/JDK, J2EE, Spring MVC, Hibernate, Eclipse, XML, JavaScript, Maven 2, Web Services, jQuery, SVN, JUnit, Windows, Oracle