Data Engineer Resume
High Point, NC
SUMMARY:
- 4+ years of experience in the IT industry, playing a major role in implementing, developing, and maintaining various web-based applications using Java and the Big Data ecosystem.
- 3+ years of strong end-to-end experience in Hadoop, Spark, and cloud development using a range of Big Data tools.
- Strong knowledge of Hadoop architecture and daemons such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of MapReduce concepts.
- Expertise in importing and exporting data between relational databases and HDFS/Hive using Sqoop.
- Experience in adding and removing nodes in a Hadoop cluster.
- Experience in extracting data from RDBMS into HDFS using Sqoop.
- Experience in collecting logs from log collectors into HDFS using Flume.
- Good understanding of NoSQL databases such as HBase.
- Experience in analyzing data in HDFS through MapReduce, Hive, and Pig.
- Designed, implemented, and reviewed features and enhancements for Cassandra.
- Experience in writing MapReduce programs in Java (a brief sketch follows this summary).
- Experienced in optimizing Hive queries by tuning configuration parameters.
- Involved in designing the Hive data model for migrating the ETL process into Hadoop, and wrote Pig scripts to load data into the Hadoop environment.
- Expert in data cleansing operations using Pig Latin transformations.
- Hands-on experience with NoSQL databases like Cassandra.
- Experience in developing near-real-time workflows using Spark Streaming.
- Experienced in using messaging systems such as Kafka, Event Hubs, and Service Bus.
- Good knowledge of Java, UNIX shell scripting, Linux, and SQL Developer.
- Extensive knowledge of data ingestion, data processing, and batch analytics.
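A minimal sketch of the kind of MapReduce program in Java referenced above: the standard word-count pattern, with hypothetical input and output paths taken from the command line rather than any specific production job.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emit (word, 1) for every token in the input line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reducer: sum the counts emitted for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));     // input directory in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1]));   // output directory in HDFS
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```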
TECHNICAL SKILLS:
Big Data Stack: HDFS, Spark, Hive, Sqoop, Pig, MapReduce, Flume, Oozie, HBase, Kafka.
Programming Languages: Java, Scala, Python
Search Engine: Elasticsearch
Databases and NoSQL DBs: Oracle, MySQL, SQL Server, Cassandra, MongoDB, DynamoDB
Cloud: Azure, AWS
Others: Shell Scripting, SBT, Jenkins
PROFESSIONAL EXPERIENCE:
Confidential, High Point, NC
Data Engineer
Responsibilities:
- Implemented workflows to process around 400 messages per second and push them to DocumentDB as well as Event Hubs.
- Developed a custom message producer capable of publishing about 4,000 messages per second for scalability testing (see the Kafka producer sketch at the end of this section).
- Implemented callback and notification architectures for real-time data.
- Implemented Spark Streaming in Scala to process the JSON messages and push them to a Kafka topic.
- Created custom dashboards in Azure using Application Insights and the Application Insights query language to process the metrics sent to Application Insights.
- Created real-time streaming dashboards in Power BI, using Stream Analytics to push datasets to Power BI.
- Developed a custom message consumer to consume data from the Kafka topic and push the messages to Service Bus and Event Hubs (Azure components).
- Wrote auto-scaling Azure Functions that consume data from Azure Service Bus or Azure Event Hubs and send it to DocumentDB.
- Wrote a Spark application to capture the change feed from DocumentDB using the Java API and write the updates to a new DocumentDB instance.
- Implemented zero-downtime deployment for all production pipelines in Azure.
- Implemented CI/CD pipelines to build and deploy projects in the Hadoop environment.
- Implemented the build and deployment pipelines in Jenkins.
- Used custom receivers, socket streams, file streams, and directory streams in Spark Streaming.
- Used Lambda, Kinesis, DynamoDB, and CloudWatch on AWS.
- Used Application Insights, DocumentDB, Service Bus, Azure Data Lake Store, Azure Blob Storage, Event Hubs, and Azure Functions.
- Developed Python code to gather data from HBase and designed the solution for implementation in PySpark.
- Developed Hadoop Streaming jobs in Python to integrate applications with Python API support.
- Used Python to run the Ansible playbook that deploys the Logic Apps to Azure.
Environment: Hadoop, Hive, HDFS, Azure, AWS, Spark Streaming, Spark SQL, Scala, Python, Java, web servers, Maven, Jenkins, Ansible.
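A minimal sketch of the kind of custom message producer described above for scalability testing, assuming a Kafka broker at localhost:9092; the topic name, payload shape, and loop bound are hypothetical placeholders for the real load profile.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LoadTestProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "1");                              // favor throughput over full durability

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish synthetic JSON messages; the fixed bound stands in for the target rate.
            for (int i = 0; i < 4000; i++) {
                String payload = "{\"id\":" + i + ",\"ts\":" + System.currentTimeMillis() + "}";
                producer.send(new ProducerRecord<>("load-test-topic", Integer.toString(i), payload));
            }
            producer.flush();
        }
    }
}
```

In practice the loop would be driven by a rate limiter or a load-testing harness rather than a fixed count.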
Confidential, DE
Data Engineer
Responsibilities:
- Developed a daily process for incremental imports of data from DB2 and Oracle into HDFS using Sqoop.
- Extensively used Pig for data transformation and data cleansing.
- Converted Hive scripts into Spark using Scala and optimized the resulting Spark jobs.
- Developed Python code to gather data from HBase and designed the solution for implementation in PySpark.
- Wrote stored procedures to transform data in Microsoft SQL Server.
- Scheduled multiple Spark jobs with the Oozie scheduler.
- Developed an error-logging script to load ingestion logs into an HBase table (a minimal sketch follows this section).
- Worked on CI/CD deployments using Jenkins.
- Ingested XML files captured from RabbitMQ and stored them in an HBase table.
- Developed complex Hive queries to analyze data.
Environment: MapR, Spark, Hive, Sqoop, HBase, Oozie, Pig, Scala, Unix, Agile, Code Hub, Splunk, PySpark.
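A minimal sketch of the error-logging write into HBase mentioned above, using the standard HBase Java client; the table name, column family, columns, and row-key scheme are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class IngestionLogWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // picks up hbase-site.xml from the classpath

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("ingestion_log"))) {

            // Row key: source name plus timestamp so log entries stay sorted per source.
            String rowKey = "db2_orders|" + System.currentTimeMillis();
            Put put = new Put(Bytes.toBytes(rowKey));
            put.addColumn(Bytes.toBytes("log"), Bytes.toBytes("status"), Bytes.toBytes("FAILED"));
            put.addColumn(Bytes.toBytes("log"), Bytes.toBytes("message"),
                          Bytes.toBytes("Sqoop incremental import failed: connection timeout"));

            table.put(put);   // write the log record
        }
    }
}
```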
Confidential
Hadoop / Java Developer
Responsibilities:
- Developed Hive queries for the required analytics and report generation.
- Involved in developing Pig scripts to process data coming from different sources.
- Worked on data cleansing using Pig scripts and stored the results in HDFS.
- Worked on Pig user-defined functions (UDFs) written in Java for external functions (see the sketch after this section).
- Developed data requirements, performed database queries to identify test data, and created data procedures with expected results.
- Planned, coordinated, and managed internal process documentation and presentations, describing the identified process improvements along with the workflows and diagrams associated with the process flow.
- Scheduled jobs with Oozie to automate regularly executing processes.
- Moved data into HDFS using Sqoop.
- Expertise in Hive optimization techniques such as partitioning and bucketing on different data formats.
- Worked on UDFs in Hive using Java.
- Expertise in Pig joins to handle data across different data sets.
Environment: Hadoop, Hive, Pig, Sqoop, HBase, MapReduce, Java, Python
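A minimal sketch of a Pig UDF in Java of the kind mentioned above; the class name and the cleansing rule (trimming and upper-casing a chararray field) are hypothetical.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// A simple Pig UDF that trims whitespace and upper-cases a chararray field.
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;   // propagate nulls instead of failing the task
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```

In Pig Latin such a UDF is registered with REGISTER and invoked inside a FOREACH ... GENERATE expression.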
Confidential
Java Developer
Responsibilities:
- Involved in the implementation of service layer and DAO.
- Responsible for designing JSPs as per the requirements.
- Developed Core Java applications.
- Worked on JDBC, collections, multithreading, the Collections API and generics, and file handling (a brief JDBC sketch follows this section).
- Experience working with Java collections.
- Wrote SQL queries to create databases and tables and to load data.
- Worked on Java exception handling.
- Involved in developing and deploying the server-side components.
- Worked on fixed/floating interest rate projections.
- Coding, debugging and bug fixing.
Environment: Core Java, JDBC, Collections, Multithreading, Hibernate, Collection API and Generics, File Handling, SQL Server 2005, Tortoise SVN, J2EE
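A minimal JDBC sketch along the lines of the data-access work described above, assuming a SQL Server instance reachable through the Microsoft JDBC driver; the connection string, credentials, table, and columns are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class RateDao {
    private static final String URL =
            "jdbc:sqlserver://localhost:1433;databaseName=rates";   // hypothetical connection string

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(URL, "appuser", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT rate_type, rate_value FROM interest_rates WHERE rate_type = ?")) {

            ps.setString(1, "FIXED");   // bind the rate type parameter
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Read each projected rate row returned by the query.
                    System.out.println(rs.getString("rate_type") + " = " + rs.getDouble("rate_value"));
                }
            }
        }
    }
}
```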