Hadoop Developer Resume
Philadelphia, PA
SUMMARY
- 7+ years of professional IT experience as a Hadoop developer and Java developer.
- Extensive experience working with Big Data technologies: Hadoop, Spark, Pig, Hive, HBase, Sqoop, Flume, and Kafka.
- Good knowledge of Hadoop architecture and its components, including HDFS, the MapReduce programming paradigm, JobTracker, TaskTracker, NameNode, and DataNode.
- Experience in creating and maintaining large data pipelines using Kafka and Akka to handle terabytes of data.
- Experience in writing complex MapReduce programs that work with different file formats such as Text, SequenceFile, XML, JSON, and Avro.
- Good experience working with NoSQL databases Cassandra and HBase.
- Experience in installing, configuring, managing, supporting, and monitoring AWS EMR, Cloudera (CDH5), MapR, and Hortonworks distributions.
- Experience developing Scala applications for loading/streaming data from NoSQL databases (HBase) to HDFS.
- Good experience with the Mapper, Reducer, Combiner, and Partitioner stages and the shuffle-and-sort process, including custom partitioners for efficient bucketing.
- Extensive experience in extending Hive and Pig core functionality by writing UDFs.
- Good experience in Java application development with the Struts, Spring, and Hibernate frameworks.
- Experience working with Apache NiFi to automate the data movement between different Hadoop systems.
- Experience in creating RDDs in Spark and applying transformations and actions.
- Experience in developing and consuming web services using REST and SOAP protocols.
- Experience in performance-tuning Hive queries, MapReduce jobs, and Spark jobs.
- Experience in moving data between HDFS and relational database systems using Sqoop.
- Experience in working with Flume to load the log data from different sources into HDFS.
- Experience in using ZooKeeper to coordinate servers in a cluster and maintain data consistency.
- Experience in building both time-driven and data-driven automated workflows with Oozie, orchestrated from Python.
- Good experience working with Spark and improving the performance of existing Hadoop algorithms using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (a brief sketch follows this list).
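A minimal sketch of the pair-RDD and Spark SQL work described above. This is illustrative only, not code from any listed project: the input path, field layout, and the Spark 2.x SparkSession API are assumptions.

```scala
import org.apache.spark.sql.SparkSession

object RddAndSqlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("rdd-sketch").getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical weblog input; all transformations below are lazy.
    val lines = sc.textFile("hdfs:///data/weblogs/*.log")

    // Pair RDD of (HTTP status, 1), then reduceByKey to aggregate counts.
    val statusCounts = lines
      .filter(_.nonEmpty)
      .map(_.split("\\s+"))
      .filter(_.length > 8)
      .map(f => (f(8), 1L))
      .reduceByKey(_ + _)

    // Actions trigger execution: take the ten most frequent statuses.
    statusCounts.sortBy(-_._2).take(10).foreach(println)

    // The same aggregation expressed through Spark SQL on a DataFrame.
    import spark.implicits._
    statusCounts.toDF("status", "count").createOrReplaceTempView("status_counts")
    spark.sql("SELECT status, count FROM status_counts ORDER BY count DESC LIMIT 10").show()

    spark.stop()
  }
}
```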
TECHNICAL SKILLS
Big Data Ecosystem: Hadoop, MapReduce, YARN, Pig, Hive, Flume, Sqoop, Impala, Oozie, ZooKeeper, Spark, Kafka
Hadoop Distributions: Cloudera (CDH3, CDH4, and CDH5), Hortonworks and MapR.
Tools: Talend, Informatica, Eclipse
Programming Languages: Java, SQL, Python, C#, PHP, Scala
Web Technologies: HTML, CSS, JavaScript
Operating System: Windows, Unix, Linux
Databases: SQL Server, Oracle, DB2, MySQL
NoSQL Databases: Cassandra, HBase
PROFESSIONAL EXPERIENCE
Confidential, Philadelphia, PA
Hadoop Developer
Responsibilities:
- Developed Kafka producers and consumers, Cassandra clients, and Spark applications integrated with HDFS and Hive (a producer sketch follows this list).
- Populated HDFS and HBase with large volumes of data and ingested data into the Spark engine using Apache Kafka.
- Created RDDs and applied transformations and actions.
- Managed and Scheduled Spark Jobs on a Hadoop cluster using Oozie.
- Optimized Hive queries by running Hive on the Spark execution engine.
- Created and maintained data pipelines using Kafka and Akka to handle terabytes of data.
- Integrated Hadoop cluster with Spark engine to perform Batch and GraphX operations.
- Wrote Sqoop scripts to import data from different data sources to Cassandra.
- Used HUE for running Hive queries and created partitions using Hive to improve performance.
- Performed cleansing operations using Apache NiFi flow topologies before moving data into HDFS.
- Worked on creating ETL jobs using Talend.
- Installed and configured a Hadoop cluster on AWS, working with the EMR and EC2 web services for fast and efficient data processing.
- Developed batch scripts to fetch data from AWS S3 and perform the required transformations in Scala using the Spark framework.
- Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs.
- Responsible for maintaining and expanding the AWS cloud infrastructure.
- Wrote Python and shell scripts for various deployment and automation processes.
- Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them there.
- Developed data pipelines to ingest data into HDFS using Flume, Sqoop and Pig.
- Performance-tuned Hive queries, MapReduce jobs, and Spark jobs.
- Moved data from RDBMS sources into Hive dynamic-partition tables using Sqoop.
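A minimal Kafka producer sketch in Scala, as referenced in the first bullet above. The broker address, topic name, key, and payload are placeholder assumptions, not project details.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object EventProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Broker address and topic below are illustrative placeholders.
    props.put("bootstrap.servers", "broker1:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("acks", "all") // wait for acknowledgment from all in-sync replicas

    val producer = new KafkaProducer[String, String](props)
    try {
      // Keyed records so related events land in the same partition.
      producer.send(new ProducerRecord[String, String]("events", "user-42", "page_view"))
      producer.flush()
    } finally {
      producer.close()
    }
  }
}
```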
Environment: Apache Spark, Kafka, Cassandra, Flume, YARN, Sqoop, Oozie, Hive, Pig, Java, Hadoop (Cloudera CDH 5.4/5.5), Linux, XML, Eclipse, MySQL.
Confidential, Omaha, NE
Hadoop Developer
Responsibilities:
- Used Sqoop and Java APIs to import data into Cassandra from different relational databases.
- Created tables in Cassandra and loaded large data sets of structured, semi-structured and unstructured data from various data sources.
- Developed MapReduce jobs in Java for cleaning and preprocessing data.
- Wrote Python scripts for wrapper and utility automation.
- Implemented Storm builder topologies to perform cleansing operations before moving data into Cassandra.
- Configured Hive, Pig, Impala, Sqoop, Flume, and Oozie on Cloudera.
- Automated data movement between different Hadoop systems using Apache NiFi.
- Wrote MapReduce programs in Python using the Hadoop Streaming API.
- Created Hive tables, loaded them with data, and wrote Hive queries.
- Migrated ETL processes from SQL Server to Hadoop, using Pig for data manipulation.
- Developed Spark jobs in Scala in the test environment and used Spark SQL for querying.
- Imported data from Oracle tables into HDFS and HBase tables using Sqoop.
- Wrote scripts to load data into Spark RDDs and perform in-memory computations.
- Wrote a Spark Streaming job that consumes topics from Kafka and periodically pushes micro-batches of data to Spark for near-real-time processing (see the sketch after this list).
- Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs in Scala.
- Worked with Elasticsearch technologies and created custom Solr query components.
- Implemented custom Kafka encoders for a custom input format to load data into Kafka partitions.
- Worked with different data sources such as Oracle, Netezza, MySQL, and flat files.
- Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
- Worked with Flume to load the log data from different sources into HDFS.
- Developed Talend jobs to move inbound files to HDFS file location based on monthly, weekly, daily and hourly partitioning.
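A sketch of the Spark Streaming consumer referenced above, written against the Spark 1.x direct-stream Kafka API of that era. The broker address, topic name, and batch interval are assumptions for illustration.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object KafkaToSparkSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-streaming-sketch")
    // Micro-batches every 10 seconds; the interval is an illustrative choice.
    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // placeholder broker
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("weblogs")) // placeholder topic

    // Count the non-empty messages in each micro-batch and print the result.
    stream.map(_._2)
      .filter(_.nonEmpty)
      .count()
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```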
Environment: Cloudera, MapReduce, Spark SQL, Spark Streaming, Pig, Hive, Flume, Hue, Oozie, Java, Eclipse, ZooKeeper, Cassandra, HBase, Talend, GitHub.
Confidential
Java/Hadoop Developer
Responsibilities:
- Developed JSPs, JSF pages, and servlets to dynamically generate HTML and display data on the client side.
- Used the Hibernate framework for persistence against an Oracle database.
- Wrote and debugged Ant scripts for building the web application.
- Developed web services in Java and used WSDL to publish the services to another application.
- Wrote SQL commands and stored procedures to retrieve data from the Oracle database, and plugged these procedures into Java classes.
- Worked on developing UI using HTML, CSS and JavaScript.
- Involved in writing PL/SQL stored procedures, functions, triggers, and sequences.
- Implemented Java Message Service (JMS) messaging using the JMS API.
- Worked on managing and reviewing Hadoop log files.
- Installed and configured Hadoop, YARN, MapReduce, Flume, and HDFS, and developed MapReduce jobs in Java for data cleaning (a map-only sketch follows this list).
- Coded servlets, SOAP clients, and Apache CXF REST APIs to deliver data from the application to internal and external consumers.
- Ran Hadoop jobs on the Cloudera distribution.
- Wrote Hadoop jobs to analyze data using MapReduce, Hive, Pig, Solr, and Splunk.
- Created a SOAP web service using JAX-WS to enable clients to consume it.
- Moved data between HDFS and relational database systems (RDBMS) using Sqoop.
- Designed and developed multi-tier scalable applications using Java and J2EE design patterns.
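A sketch of a map-only data-cleaning job like the one referenced above. The original jobs were written in Java; this is shown in Scala for consistency with the other sketches, and the comma-delimited record format and field count are assumptions.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, NullWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

// Map-only cleaning job: drops malformed records and trims whitespace.
class CleanMapper extends Mapper[LongWritable, Text, NullWritable, Text] {
  private val expectedFields = 5 // illustrative record width

  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, NullWritable, Text]#Context): Unit = {
    val fields = value.toString.split(",").map(_.trim)
    // Keep only records with the expected number of non-empty fields.
    if (fields.length == expectedFields && fields.forall(_.nonEmpty))
      ctx.write(NullWritable.get, new Text(fields.mkString(",")))
  }
}

object CleanJob {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "data-cleaning-sketch")
    job.setJarByClass(classOf[CleanMapper])
    job.setMapperClass(classOf[CleanMapper])
    job.setNumReduceTasks(0) // map-only: mapper output is the job output
    job.setOutputKeyClass(classOf[NullWritable])
    job.setOutputValueClass(classOf[Text])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```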
Environment: MapR, Java, HTML, JavaScript, SQL Server, PL/SQL, JSP, Spring, Hibernate, Web Services, SOAP, SOA, JSF, JMS, JUnit, Oracle, Eclipse, SVN, XML, CSS, Log4j, Ant, Apache Tomcat.