Hadoop Developer Resume
Austin, TX
SUMMARY:
- Hadoop Developer with 8+ years of IT experience, including 4 years in the Big Data and Analytics field covering storage, querying, processing and analysis for developing end-to-end (E2E) data pipelines. Expertise in designing scalable Big Data solutions and data warehouse models on large-scale distributed data and performing a wide range of analytics.
- Expertise in the components of the Hadoop/Spark ecosystems - Spark, Hive, Pig, Flume, Sqoop, HBase, Kafka, Oozie, Impala, StreamSets, Apache NiFi, Hue, AWS.
- 3+ years of experience programming in Scala and Python.
- Extensive knowledge of data serialization formats such as Avro, SequenceFile, Parquet, JSON and ORC.
- Strong knowledge of Spark architecture and real-time streaming using Spark.
- Hands-on experience with Spark Core, Spark SQL and the DataFrame/Dataset/RDD APIs.
- Good knowledge of Amazon Web Services (AWS) cloud services such as EC2, S3, EMR and VPC.
- Experienced in data ingestion, processing, aggregation and visualization in a Spark environment.
- Hands-on experience working with large volumes of structured and unstructured data.
- Experienced in migrating code components from SVN to Bitbucket repositories.
- Experienced in building Jenkins pipelines for continuous code integration from GitHub onto Linux machines. Experience in Object-Oriented Analysis and Design (OOAD) and development.
- Good understanding of end-to-end web applications and design patterns.
- Hands-on experience in application development using Java, RDBMS and Linux shell scripting.
- Well versed in software development methodologies such as Agile and Waterfall.
- Experienced in handling databases: Netezza, Oracle and Teradata.
- Strong team player with good communication, analytical, presentation and interpersonal skills.
TECHNICAL SKILLS:
Big Data Technologies: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, Scala, Spark, Kafka, Flume, Ambari, Hue
Hadoop Distributions: Cloudera CDH, Hortonworks HDP, MapR
Databases: Oracle 10g/11g, PL/SQL, MySQL, MS SQL Server 2012, DB2
Languages: C, C++, Java, Scala, Python
AWS Components: IAM, S3, EMR, EC2, Lambda, Route 53, CloudWatch, SNS
Methodologies: Agile, Waterfall
Build Tools: Maven, Gradle, Jenkins.
NoSQL Databases: HBase, Cassandra, MongoDB, DynamoDB
IDE Tools: Eclipse, NetBeans, IntelliJ IDEA
Modelling Tools: Rational Rose, StarUML, Visual Paradigm for UML
Architecture: Relational DBMS, Client-Server Architecture
Cloud Platforms: AWS Cloud
BI Tools: Tableau
Operating Systems: Windows 7/8/10, Vista, UNIX, Linux, Ubuntu, Mac OS X
PROFESSIONAL EXPERIENCE:
Confidential, Austin, TX
Hadoop Developer
Responsibilities:
- Worked on the Hortonworks HDP 2.5 distribution.
- Involved in review of functional and non-functional requirements.
- Responsible for designing and implementing the data pipeline using Big Data tools including Hive, Spark, Scala and StreamSets.
- Experience using Apache Storm, Spark Streaming, Apache Spark, Apache NiFi, Kafka and Flume to create data streaming solutions.
- Developed and deployed Apache NiFi flows across various environments and wrote QA scripts in Python for tracking files.
- Involved in importing data from Microsoft SQL Server, MySQL, and Teradata into HDFS using Sqoop.
- Good knowledge of using Apache NiFi to automate data movement.
- Developed Sqoop scripts to import data from relational sources and handled incremental loading.
- Extensively used StreamSets Data Collector to create ETL pipelines for pulling data from RDBMS systems into HDFS.
- Implemented the data processing framework using Scala and Spark SQL.
- Implemented performance optimization methods to reduce data processing time.
- Created shell scripts to automate jobs.
- Worked extensively on DataFrames and Datasets using Spark and Spark SQL.
- Responsible for defining the data flows within the Hadoop ecosystem, directing the team in implementing them, and exporting result sets from Hive to MySQL using shell scripts.
- Worked on Kafka streaming with StreamSets to continuously integrate data from Oracle systems into Hive tables.
- Developed a generic utility in Spark for pulling data from RDBMS systems using multiple parallel connections (a sketch follows this list).
- Integrated existing HiveQL code logic into the Spark application for data processing.
- Extensively used Hive/Spark optimization techniques such as partitioning, bucketing, map joins, parallel execution, broadcast joins and repartitioning (illustrated in the second sketch below, after the Environment line).
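A minimal Spark/Scala sketch of such a parallel JDBC pull, assuming a hypothetical Oracle source; the connection URL, table, partition column, bounds and HDFS path are illustrative placeholders, not the actual project values:

```scala
import org.apache.spark.sql.SparkSession

object JdbcParallelPull {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("jdbc-parallel-pull")
      .enableHiveSupport()
      .getOrCreate()

    // Read the source table over several concurrent JDBC connections by
    // splitting on a numeric partition column (bounds are illustrative).
    val sourceDf = spark.read
      .format("jdbc")
      .option("driver", "oracle.jdbc.OracleDriver")              // driver jar assumed on the classpath
      .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCL")    // hypothetical endpoint
      .option("dbtable", "SALES.TRANSACTIONS")                   // hypothetical table
      .option("user", sys.env("DB_USER"))
      .option("password", sys.env("DB_PASSWORD"))
      .option("partitionColumn", "TXN_ID")
      .option("lowerBound", "1")
      .option("upperBound", "100000000")
      .option("numPartitions", "16")                             // 16 parallel connections
      .load()

    // Land the pulled data as Parquet in HDFS for downstream Hive/Spark jobs.
    sourceDf.write.mode("overwrite").parquet("/data/raw/transactions")

    spark.stop()
  }
}
```

Splitting on a roughly uniform numeric column keeps the parallel connections evenly loaded, and numPartitions caps how many connections are opened against the source at once.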
Environment: Spark, Python, Scala, Hive, Hue, UNIX Scripting, Spark SQL, StreamSets, Kafka, Impala, Beeline, Git, Tidal.
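As a second illustration of the optimization techniques listed above, a hedged Spark/Scala sketch of a broadcast join followed by a repartitioned, partitioned write; the table names and partition column are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object BroadcastJoinExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("broadcast-join")
      .enableHiveSupport()
      .getOrCreate()

    // Large fact table and a small dimension table (hypothetical Hive tables).
    val txns  = spark.table("staging.transactions")
    val codes = spark.table("staging.status_codes")

    // Broadcast the small side so the join avoids shuffling the large fact table.
    val enriched = txns.join(broadcast(codes), Seq("status_code"))

    // Repartition by the write key and store as a partitioned Hive table.
    enriched
      .repartition(enriched.col("txn_date"))
      .write
      .mode("overwrite")
      .partitionBy("txn_date")
      .format("parquet")
      .saveAsTable("curated.transactions_enriched")

    spark.stop()
  }
}
```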
Confidential, Washington, D.C.
Hadoop Developer
Responsibilities:
- Worked on the Hortonworks HDP 2.5 distribution.
- Implemented Scala framework code in IntelliJ and UNIX scripting to implement the workflow for the jobs.
- Involved in gathering business requirements, analyzing the use cases and implementing them end to end.
- Worked closely with the Architect; enhanced and optimized product Spark and Scala code to aggregate, group and run data mining tasks using the Spark framework.
- Loaded the raw data into RDDs and validated the data.
- Converted the validated RDDs into DataFrames for further processing.
- Implemented Spark SQL logic to join multiple DataFrames to generate application-specific aggregated results (see the sketch after this list).
- Fine-tuned jobs for better performance in the production cluster.
- Worked entirely in Agile methodologies and used the Rally scrum tool to track user stories and team performance.
- Worked extensively with Impala in Hue to analyze the processed data and generate the end reports.
- Worked with the Hive database through Beeline.
- Analyzed and resolved production job failures in several scenarios.
- Implemented UNIX scripts to define the use-case workflow, process the data files and automate the jobs.
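A minimal Spark/Scala sketch of that RDD-to-DataFrame flow (load raw records into RDDs, validate them, convert to DataFrames, then join and aggregate with Spark SQL); the file paths, record layouts and join key are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ValidateAndAggregate {
  // Hypothetical record layouts for the raw feeds.
  case class Txn(accountId: String, amount: Double)
  case class Account(accountId: String, region: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("validate-and-aggregate").getOrCreate()
    import spark.implicits._

    // Load the raw delimited files into RDDs and drop malformed rows.
    val txnRdd = spark.sparkContext.textFile("/data/raw/txns")
      .map(_.split("\\|"))
      .filter(f => f.length == 2 && f(1).matches("-?\\d+(\\.\\d+)?"))
      .map(f => Txn(f(0), f(1).toDouble))

    val acctRdd = spark.sparkContext.textFile("/data/raw/accounts")
      .map(_.split("\\|"))
      .filter(_.length == 2)
      .map(f => Account(f(0), f(1)))

    // Convert the validated RDDs to DataFrames, then join and aggregate.
    val txnDf  = txnRdd.toDF()
    val acctDf = acctRdd.toDF()

    val summary = txnDf.join(acctDf, Seq("accountId"))
      .groupBy("region")
      .agg(sum("amount").alias("total_amount"), count("*").alias("txn_count"))

    summary.write.mode("overwrite").parquet("/data/curated/region_summary")
    spark.stop()
  }
}
```

Validating at the RDD stage means only well-formed rows reach the typed DataFrame joins.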
Environment: Spark, Scala, Hive, Sqoop, UNIX Scripting, Spark SQL, IntelliJ, HBase, Kafka, Impala, Hue, Beeline, Git.
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Worked on the Cloudera CDH distribution.
- Hands-on experience with Amazon Web Services (AWS) cloud services.
- Created data pipelines for different events to load data from DynamoDB into an AWS S3 bucket and then into an HDFS location.
- Involved in the complete SDLC - Requirement Analysis, Development, Testing and Deployment into the Cluster.
- Worked hand-in-hand with the Architect; enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework.
- Extracted data from various SQL database sources into HDFS using Sqoop and ran Hive scripts on large volumes of data.
- Implemented a prototype for the complete requirements using Splunk, Python and machine learning concepts.
- Designed and implemented MapReduce logic for natural language processing of free-form text.
- Deployed the project on Amazon EMR with S3 Connectivity.
- Used Amazon EMR to process Big Data across a Hadoop cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).
- Loaded the data into Simple Storage Service (S3) in the AWS Cloud.
- Good knowledge of using Elastic Load Balancing with Auto Scaling for EC2 servers.
- Implemented Spark scripts to migrate MapReduce jobs into Spark RDD transformations and streamed data using Apache Kafka.
- Implemented Spark SQL queries that intermix Hive queries with the programmatic data manipulations supported by RDDs and DataFrames in Scala and Python (a sketch follows this list).
- Involved in deploying code logic and UDFs across the cluster.
- Communicated deliverable status to users, stakeholders and the client, and drove periodic review meetings.
- Worked on data processing using Hive queries on HDFS and shell scripts to wrap the HQL scripts.
- Developed and deployed Oozie workflows for recurring operations on the clusters.
- Experienced in performance tuning of Hadoop jobs: setting the right batch interval, the correct level of parallelism and appropriate memory settings.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Used Tableau reporting tool to generate reports from the outputs stored in HDFS.
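A minimal Spark/Scala sketch of intermixing a plain Hive query with programmatic DataFrame manipulation, as described above; the Hive database, tables, columns and partition value are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object HiveMixedQuery {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-mixed-query")
      .enableHiveSupport()         // lets spark.sql() run against the Hive metastore
      .getOrCreate()

    // Start from a plain HiveQL query...
    val events = spark.sql(
      "SELECT user_id, event_type, event_ts FROM analytics.click_events WHERE dt = '2016-01-01'")

    // ...then continue with programmatic DataFrame manipulation.
    val hourly = events
      .withColumn("event_hour", hour(col("event_ts")))
      .groupBy("event_type", "event_hour")
      .agg(countDistinct("user_id").alias("unique_users"))

    // Persist the aggregate back as a Hive table for reporting (e.g. Tableau).
    hourly.write.mode("overwrite").saveAsTable("analytics.hourly_event_users")

    spark.stop()
  }
}
```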
Environment: Hadoop, Spark, HDFS, Hive, Map Reduce, Sqoop, Oozie, Tableau.
Confidential
Hadoop Developer
Responsibilities:
- Worked on the Cloudera CDH distribution.
- Designed and implemented historical and incremental data ingestion from multiple external systems using Hive, Pig and Sqoop.
- Designed physical data models for structured and semi-structured data to validate the raw data in HDFS.
- Designed MapReduce logic and Hive queries for generating aggregated metrics.
- Involved in the design, implementation, development and testing phases of the project.
- Responsible for monitoring the jobs in the production cluster and tracing the error logs when jobs failed.
- Designed and developed data migration logic for exporting data from MySQL to Hive.
- Designed and developed complex workflows in Oozie for recurrent job execution.
- Used the SSRS reporting tool to generate data analysis reports.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Oozie, Eclipse, Cloudera, Sqoop, SSRS
Confidential
Software Developer
Responsibilities:
- Involved in complete SDLC - Requirement Analysis, Development, Testing and Deployments.
- Involved in resolving critical errors.
- Responsible for successfully deploying the sprint deliverables.
- Involved in capturing the client's requirements and application enhancements, documenting the requirements and communicating them to the associated teams.
- Designed and implemented RESTful services and WSDL in Vordel.
- Implemented complex SQL queries to produce analysis reports.
- Created desktop applications using J2EE and Swing.
- Involved in developing applications using Java, JSP, Servlets and Swing.
- Developed the UI using HTML, CSS, Ajax and jQuery, and developed business logic and interfacing components using business objects, XML and JDBC.
- Created applications and connection pools, and deployed JSPs and Servlets.
- Used Oracle and MySQL databases for storing user information.
- Developed the back end for web applications using PHP.
- Experienced with the Agile Methodologies.
Environment: SOAP, REST, HTML, WSDL, Vordel, SQL Developer