
Hadoop Developer Resume


Philadelphia, PA

SUMMARY

  • 7+ years of professional IT experience as a Hadoop developer and Java developer.
  • Extensive experience working with Big Data technologies such as Hadoop, Spark, Pig, Hive, HBase, Sqoop, Flume, and Kafka.
  • Good knowledge of Hadoop architecture and its components, including HDFS, the MapReduce programming paradigm, JobTracker, TaskTracker, NameNode, and DataNode.
  • Experience in creating and maintaining large data pipelines using Kafka and Akka for handling terabytes of data.
  • Experience in writing complex MapReduce programs that work with different file formats such as Text, Sequence, XML, JSON, and Avro.
  • Good experience working with NoSQL databases Cassandra and HBase.
  • Experience in installing, configuring, managing, supporting, and monitoring AWS EMR, Cloudera (CDH5), MapR, and Hortonworks distributions.
  • Experience in developing Scala applications for loading/streaming data from NoSQL databases (HBase) to HDFS.
  • Good experience working with the Mapper, Reducer, Combiner, and Partitioner stages, the shuffle-and-sort process, and custom partitioning for efficient bucketing.
  • Extensive experience in extending Hive and Pig core functionality by writing UDFs (see the UDF sketch after this summary).
  • Good experience in Java application development with the Struts, Spring, and Hibernate frameworks.
  • Experience working with Apache NiFi to automate the data movement between different Hadoop systems.
  • Experience in creating RDDs in Spark and applying operations on them - transformations and actions (a minimal sketch follows this summary).
  • Experience in developing and consuming web services using REST and SOAP protocols.
  • Experience in performance tuning Hive queries, MapReduce jobs, and Spark jobs.
  • Experience in moving data between HDFS and relational database systems, in both directions, using Sqoop.
  • Experience in working with Flume to load the log data from different sources into HDFS.
  • Experience in working with ZooKeeper to coordinate servers in a cluster and maintain data consistency.
  • Experience in designing both time-driven and data-driven automated workflows with Oozie, using Python.
  • Good experience working with Spark and improving the performance of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, Pair RDDs, and Spark on YARN.
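
The RDD work referenced above follows a standard pattern: lazy transformations chained on an RDD, executed only when an action runs. A minimal Scala sketch (application, data, and names are illustrative, not taken from the resume):

    import org.apache.spark.{SparkConf, SparkContext}

    object RddOpsSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("rdd-ops").setMaster("local[*]"))

        val lines = sc.parallelize(Seq("a b a", "b c"))
        val counts = lines
          .flatMap(_.split("\\s+"))   // transformation: split lines into words
          .map(word => (word, 1))     // transformation: pair each word with 1
          .reduceByKey(_ + _)         // transformation: sum counts per key

        counts.collect().foreach(println) // action: triggers the job and returns results
        sc.stop()
      }
    }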
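
Extending Hive with a UDF, as mentioned above, means subclassing Hive's UDF class and exposing an evaluate method that Hive resolves by reflection. A sketch in Scala using the old-style org.apache.hadoop.hive.ql.exec.UDF API of the CDH5 era; the normalize_zip function is hypothetical:

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Registered in Hive with, e.g.:
    //   ADD JAR udfs.jar;
    //   CREATE TEMPORARY FUNCTION normalize_zip AS 'NormalizeZip';
    class NormalizeZip extends UDF {
      // Hive finds evaluate() by reflection; null in, null out, as Hive expects.
      def evaluate(input: Text): Text =
        if (input == null) null
        else new Text(input.toString.trim.take(5))
    }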

TECHNICAL SKILLS

Big Data Ecosystem: Hadoop, MapReduce, YARN, Pig, Hive, Flume, Sqoop, Impala, Oozie, Zookeeper, Spark, Kafka

Hadoop Distributions: Cloudera (CDH3, CDH4, and CDH5), Hortonworks and MapR.

Tools: Talend, Informatica, Eclipse

Programming Languages: Java, SQL, Python, C#, PHP, Scala

Web Technologies: HTML, CSS, JavaScript

Operating System: Windows, Unix, Linux

Databases: SQL Server, Oracle, DB2, MySQL

NoSQL Databases: Cassandra, HBase

PROFESSIONAL EXPERIENCE

Confidential, Philadelphia, PA

Hadoop Developer

Responsibilities:

  • Developed Kafka producers and consumers, Cassandra clients, and Spark components working with HDFS and Hive (see the producer sketch after this list).
  • Populated HDFS and HBase with large volumes of data and ingested data into the Spark engine using Apache Kafka.
  • Created RDDs and applied operations on them - transformations and actions.
  • Managed and scheduled Spark jobs on a Hadoop cluster using Oozie.
  • Optimized Hive queries by running Hive on top of the Spark engine.
  • Created and maintained data pipelines using Kafka and Akka for handling terabytes of data.
  • Integrated the Hadoop cluster with the Spark engine to perform batch and GraphX operations.
  • Wrote Sqoop scripts to import data from different data sources to Cassandra.
  • Used HUE for running Hive queries and created partitions using Hive to improve performance.
  • Performed cleansing operations using Apache NiFi flow topologies before moving data into HDFS.
  • Worked on creating ETL jobs using Talend.
  • Installed and configured Hadoop cluster using AWS and worked with EMR and EC2 web services for fast and efficient data processing.
  • Developed batch scripts to fetch data from AWS S3 and perform the required transformations in Scala using the Spark framework (sketched after this list).
  • Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs.
  • Responsible for maintaining and expanding the AWS cloud infrastructure.
  • Wrote Python and shell scripts for various deployments and automation processes.
  • Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
  • Developed data pipelines to ingest data into HDFS using Flume, Sqoop and Pig.
  • Worked on performance tuning Hive queries, MapReduce jobs, and Spark jobs.
  • Moved data from RDBMS sources to Hive dynamic-partition tables using Sqoop.
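
A Kafka producer of the kind described in the first bullet boils down to serializer configuration plus send calls. A minimal Scala sketch (broker address, topic, and record contents are placeholders):

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object EventProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092") // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        // Topic name "clickstream" is illustrative.
        producer.send(new ProducerRecord[String, String]("clickstream", "user-42", "page=/home"))
        producer.close() // flushes buffered records before exiting
      }
    }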
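
The S3-to-Spark batch step could look like the following Scala sketch; the bucket, file layout, and transformations are assumptions, and the s3a scheme requires the hadoop-aws jars and AWS credentials to be configured:

    import org.apache.spark.{SparkConf, SparkContext}

    object S3BatchJob {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("s3-batch"))

        val raw = sc.textFile("s3a://example-bucket/logs/2016-01-01/*")

        val cleaned = raw
          .filter(_.nonEmpty)                   // drop blank lines
          .map(_.split('\t'))                   // tab-delimited records
          .filter(_.length >= 3)                // keep well-formed rows only
          .map(f => s"${f(0)},${f(1)},${f(2)}") // project the needed columns

        cleaned.saveAsTextFile("hdfs:///staging/logs/2016-01-01")
        sc.stop()
      }
    }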

Environment: Apache Spark, Kafka, Cassandra, Flume, YARN, Sqoop, Oozie, Hive, Pig, Java, Hadoop distribution of Cloudera 5.4/5.5, Linux, XML, Eclipse, MySQL.

Confidential, Omaha, NE

Hadoop Developer

Responsibilities:

  • Used Sqoop and Java APIs to import data into Cassandra from different relational databases.
  • Created tables in Cassandra and loaded large data sets of structured, semi-structured, and unstructured data from various data sources.
  • Developed MapReduce jobs in Java for cleaning and preprocessing data.
  • Wrote Python scripts for wrapper and utility automation.
  • Implemented Storm builder topologies to perform cleansing operations before moving data into Cassandra.
  • Worked on configuring Hive, Pig, Impala, Sqoop, Flume, and Oozie in Cloudera.
  • Automated data movement between different Hadoop systems using Apache NiFi.
  • Wrote MapReduce programs in Python using the Hadoop Streaming API.
  • Worked on creating Hive tables, loading them with data, and writing Hive queries.
  • Migrated ETL processes from SQL Server to Hadoop, using Pig for data manipulation.
  • Developed Spark jobs in Scala in the test environment and used Spark SQL for querying.
  • Worked on importing data from Oracle tables to HDFS and HBase tables using Sqoop.
  • Wrote scripts to load data into Spark RDDs and perform in-memory computations.
  • Wrote a Spark Streaming script that consumes topics from Kafka and periodically pushes batches of data to Spark for real-time processing (see the streaming sketch after this list).
  • Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs in Scala (also sketched after this list).
  • Worked with Elasticsearch and created custom Solr query components.
  • Implemented custom Kafka encoders for custom input formats to load data into Kafka partitions.
  • Worked on different data sources such as Oracle, Netezza, MySQL, Flat files etc.
  • Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
  • Worked with Flume to load the log data from different sources into HDFS.
  • Developed Talend jobs to move inbound files to an HDFS location based on monthly, weekly, daily, and hourly partitioning.
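
A direct-stream Kafka consumer in Spark Streaming, as described above, typically looks like this Scala sketch (the Spark 1.x spark-streaming-kafka API, matching the CDH era; broker and topic names are placeholders):

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object KafkaStreamJob {
      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(new SparkConf().setAppName("kafka-stream"), Seconds(10))

        val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc, kafkaParams, Set("events"))

        // Each micro-batch arrives as an RDD of (key, value) pairs.
        stream.map(_._2)
          .filter(_.nonEmpty)
          .foreachRDD(rdd => rdd.take(10).foreach(println))

        ssc.start()
        ssc.awaitTermination()
      }
    }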
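
Converting a Hive query into Spark transformations, as mentioned above, usually means reading the Hive table through HiveContext and re-expressing the aggregation on RDDs. A sketch against an assumed table web.sessions (Spark 1.x HiveContext API):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object HiveToSpark {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("hive-to-spark"))
        val hc = new HiveContext(sc)

        // Original HiveQL (table and column names are illustrative):
        //   SELECT country, COUNT(*) FROM web.sessions GROUP BY country
        val viaSql = hc.sql("SELECT country, COUNT(*) AS n FROM web.sessions GROUP BY country")

        // The same aggregation expressed as RDD transformations:
        val viaRdd = hc.table("web.sessions").rdd
          .map(row => (row.getAs[String]("country"), 1L))
          .reduceByKey(_ + _)

        viaSql.show()
        viaRdd.take(10).foreach(println)
        sc.stop()
      }
    }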

Environment: Cloudera, MapReduce, Spark SQL, Spark Streaming, Pig, Hive, Flume, Hue, Oozie, Java, Eclipse, Zookeeper, Cassandra, HBase, Talend, GitHub.

Confidential

Java/Hadoop Developer

Responsibilities:

  • Developed JSPs, JSF pages, and Servlets to dynamically generate HTML and display data on the client side.
  • Used the Hibernate framework for persistence to an Oracle database.
  • Wrote and debugged ANT scripts for building the web application.
  • Developed web services in Java and used WSDL to publish the services to other applications.
  • Wrote SQL commands and stored procedures to retrieve data from the Oracle database, and plugged these procedures into Java classes.
  • Worked on developing UI using HTML, CSS and JavaScript.
  • Involved in writing PL/SQL - stored procedures, functions, triggers, sequences, etc.
  • Implemented Java Message Service (JMS) messaging using the JMS API.
  • Worked on managing and reviewing Hadoop log files.
  • Installed and configured Hadoop, YARN, MapReduce, Flume, and HDFS, and developed MapReduce jobs in Java for data cleaning.
  • Coded Servlets, SOAP clients, and Apache CXF REST APIs to deliver data from the application to internal and external consumers.
  • Worked on the Cloudera distribution for running Hadoop jobs.
  • Wrote Hadoop jobs to analyze data using MapReduce, Hive, Pig, Solr, and Splunk.
  • Created SOAP web services using JAX-WS, enabling clients to consume SOAP web services (see the sketch after this list).
  • Worked on moving the data using Sqoop from HDFS to Relational Database Systems (RDBMS) and vice-versa.
  • Experienced in designing and developing multi-tier scalable applications using Java and J2EE design patterns.
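
Publishing a JAX-WS SOAP service, as in the bullet above, needs only the @WebService annotation and Endpoint.publish. A minimal sketch (shown in Scala like the other examples here, though the resume's services were in Java; GreetingService is hypothetical, and the javax.jws/javax.xml.ws APIs ship with Java 8-era JDKs):

    import javax.jws.{WebMethod, WebService}
    import javax.xml.ws.Endpoint

    @WebService
    class GreetingService {
      @WebMethod
      def greet(name: String): String = s"Hello, $name"
    }

    object PublishGreeting {
      def main(args: Array[String]): Unit = {
        // Publishes the endpoint and serves its WSDL at .../greeting?wsdl
        Endpoint.publish("http://localhost:8080/greeting", new GreetingService)
        Thread.sleep(Long.MaxValue) // keep the JVM alive while serving
      }
    }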

Environment: MapR, Java, HTML, JavaScript, SQL Server, PL/SQL, JSP, Spring, Hibernate, Web Services, SOAP, SOA, JSF, JMS, JUnit, Oracle, Eclipse, SVN, XML, CSS, Log4j, Ant, Apache Tomcat.
