
Sr. Hadoop Developer Resume


New York

PROFESSIONAL SUMMARY:

  • 8+ years of experience in the IT industry, including Java, SQL, Big Data environments, the Hadoop ecosystem, and the design, development, and maintenance of various applications.
  • Expertise in HDFS, YARN, MapReduce, Spark, Hive, Impala, Pig, Sqoop, HBase, Oozie, Flume, Kafka, Storm and various other ecosystem components
  • Hands-on experience fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka (a brief ingestion sketch follows this summary).
  • Extensive experience in working with Struts and Spring MVC (Model View Controller) architecture for developing applications using various Java/J2EE technologies like Servlets, JSP, JDBC, JSTL.
  • Proficiency in frameworks like Struts, Spring, Hibernate.
  • Expertise in Spark framework for batch and real-time data processing.
  • Good knowledge of Spark and Scala programming.
  • Experience in handling messaging services using Apache Kafka.
  • Used Spark-SQL to perform transformations and actions on data residing in Hive.
  • Data ingestion into Hadoop (HDFS): ingested data into Hadoop from various data sources like Oracle and MySQL using Sqoop; created Sqoop jobs with incremental load to populate Hive external tables.
  • Involved in importing real-time data into Hadoop using Kafka and worked on Flume.
  • Excellent knowledge of Hadoop Architecture and its related components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce programming paradigm.
  • Good understanding of Linux implementation, customization and file recovery.
  • Expertise in writing MapReduce jobs in Java for processing large sets of structured, semi-structured, and unstructured data and storing them in HDFS.
  • Extensive experience in writing Pig and Hive scripts for processing and analyzing large volumes of structured data.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom Map Reduce programs in Java.
  • Loaded local data into HDFS using Apache NiFi.
  • Authentication and authorization management for Hadoop cluster users using Kerberos and Sentry.
  • Experience in supporting data analysis projects using Elastic Map Reduce on the Amazon Web Services (AWS) cloud. Exporting and importing data into S3.
  • Experience working on NoSQL databases like HBase and knowledge in Cassandra, MongoDB.
  • Managing and scheduling jobs to remove duplicate log data files in HDFS using Oozie.
  • Experience in using the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Installed, Configured Talend ETL on single and multi-server environments.
  • Created standard and best practices for Talend ETL components and jobs.
  • Good knowledge of the graph databases JanusGraph and Neo4j.
  • Front-end development with HTML5/4, CSS3, JavaScript, jQuery, AngularJS, AJAX, and Bootstrap.
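
A minimal sketch, in Java, of the Kafka-to-Spark-Streaming ingestion pattern referenced above; the broker address, topic name, and group id are illustrative assumptions, and the HBase write step is omitted for brevity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaIngest {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-ingest");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");       // assumed broker address
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "db2-cdc-ingest");              // assumed consumer group

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Arrays.asList("db2.changes"), kafkaParams)); // assumed topic

        // Each micro-batch would normally be written to HBase; here we only count records.
        stream.foreachRDD(rdd -> System.out.println("records in batch: " + rdd.count()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```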

TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop, MapReduce, Pig, Hive, Impala, Sqoop, Flume, Kafka, NiFi, HBase, Oozie, Zookeeper, Kerberos, Sentry

Programming Languages: C, C++, Java, SQL, Scala

Databases: Oracle 10g, MySQL, SQL Server, Cassandra; JanusGraph and Neo4j (graph databases).

Cloud Platform: Amazon Web Services (EC2, S3, EMR)

IDEs: Eclipse, IntelliJ IDEA.

Collaboration: Git, Jira, Jenkins

Web Development: HTML5/4, CSS3, JavaScript, jQuery, AngularJS, AJAX, Bootstrap.

Java/J2EE Technologies: Servlets, JSP (EL, JSTL, Custom Tags), JSF, Apache Struts, JUnit, Hibernate 3.x, Log4J, Java Beans, EJB 2.0/3.0, JDBC, RMI, JMS, JNDI.

Spark Technologies: Spark Core, Spark SQL, Spark Streaming, Kafka, Storm.

PROFESSIONAL EXPERIENCE:

Confidential, New York

Sr. Hadoop Developer

Responsibilities:

  • Working on stories related to Ingestion, Transformation, and Publication of data on time.
  • Using Spark for real-time data ingestion from web servers (unstructured and structured).
  • Implementing data import and export jobs into HDFS and Hive using Sqoop.
  • Converting unstructured data into a structured format using Pig.
  • Using Hive as a data warehouse on Hadoop and running HQL on the structured data.
  • Using Hive to analyze partitioned and bucketed data with Hive SerDes such as CSV, REGEX, JSON, and Avro.
  • Using Apache NiFi to check whether the data landing on the Hadoop cluster is good data, without any nulls in it.
  • Designing and deploying Hadoop clusters and different Big Data analytic tools, including Pig, Hive, HBase, Oozie, Zookeeper, Sqoop, Apache Spark, and Impala.
  • Creating and transforming RDDs and DataFrames using Spark.
  • Working on converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Working with the Big Data Hadoop application using Talend in the cloud through Amazon Web Services (AWS) EC2 and S3, increasing cluster size in AWS using EMR when needed (for data in the cloud).
  • Developing Spark scripts by using Scala shell commands as per the requirement.
  • Using the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Involving Spark to improve the performance and optimization of the existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (a brief sketch follows this list).
  • Using JanusGraph, a graph database, to store the parent-child relations (ER graph models) between nodes; the data is stored in Cassandra to model the nodes and network relations.
  • Working with Kafka to get real-time weblogs data onto big data cluster.
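
A minimal sketch, in Java, of the Spark SQL / DataFrame work on Hive-resident data described above; the database, table, and column names are illustrative assumptions rather than actual project objects.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

public class WeblogAggregation {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("weblog-aggregation")
                .enableHiveSupport()          // read tables managed by the Hive metastore
                .getOrCreate();

        // Equivalent of: SELECT page, SUM(bytes) FROM web.clickstream
        //                WHERE event_date = '2017-01-01' GROUP BY page
        Dataset<Row> daily = spark.table("web.clickstream")        // hypothetical Hive table
                .filter(col("event_date").equalTo("2017-01-01"))
                .groupBy(col("page"))
                .agg(sum(col("bytes")).alias("total_bytes"));

        // Persist the result back to Hive for downstream consumers
        daily.write().mode("overwrite").saveAsTable("web.clickstream_daily");

        spark.stop();
    }
}
```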

Environment: HDFS, Sqoop, Hive, SerDes, HBase, Sentry, Spark, Spark-SQL, Kafka, Flume, Oozie, JSON, Avro, Talend, EC2, S3, EMR, Zookeeper, Cloudera.

Confidential, Charlotte, NC

Hadoop Developer

Responsibilities:

  • Loaded customer, spending, and credit data from legacy warehouses into HDFS.
  • Exported analyzed data to RDBMS using Sqoop for data visualization.
  • Used Hive queries to analyze the large data sets.
  • Built reusable Hive UDF libraries for business requirements (a sketch follows this list).
  • Implemented Dynamic Partitioning and bucketing in Hive.
  • Implemented scripts to transmit sysprint information from Oracle to HBase using Sqoop.
  • Deployed the Big Data Hadoop application using Talend on cloud AWS (Amazon Web Service).
  • Implemented Map Reduce jobs on XML, JSON, CSV data formats.
  • Developed Map reduce programs which were used to extract and transform the data sets and the resultant dataset is loaded to HBase.
  • Imported the customers log data into HDFS using Flume.
  • Implemented Spark job to improve query performance.
  • Used Impala to handle different file formats
  • Proactively involved in ongoing maintenance, support, and improvements in Hadoop cluster.
  • Used Tableau as a business intelligence tool to visualize the customer information as per the generated records.
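
A minimal sketch of the kind of reusable Hive UDF mentioned above, written in Java against the classic UDF API; the masking logic and function name are hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

@Description(name = "mask_account",
             value = "_FUNC_(str) - masks all but the last four characters of an account number")
public class MaskAccountUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                       // let Hive pass NULLs through
        }
        String s = input.toString();
        if (s.length() <= 4) {
            return new Text(s);
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < s.length() - 4; i++) {
            masked.append('*');                // mask everything except the last four characters
        }
        masked.append(s.substring(s.length() - 4));
        return new Text(masked.toString());
    }
}
```

Such a UDF would be packaged in a jar, added to the Hive session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in HiveQL.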

Environment: Hadoop, MapReduce, HDFS, Hive, Sqoop, Zookeeper, Oozie, Spark, Spark-SQL, Scala, Kafka, Java, Oracle, AWS S3.

Confidential - San Francisco, CA

Hadoop Developer

Responsibilities:

  • Primary responsibilities include building scalable distributed data solutions using Hadoop ecosystem
  • Loaded datasets from two different sources, Oracle and MySQL, into HDFS and Hive respectively on a daily basis.
  • Installed and configured Hive on the Hadoop cluster.
  • Worked on HBase Java API to populate operational HBase table with Key value.
  • Developed multiple MapReduce jobs in Java for data cleaning and pre-processing (a sketch follows this list).
  • Developing and running Map-Reduce jobs on YARN and Hadoop clusters to produce daily and monthly reports as per user's need.
  • Scheduling and managing jobs on a Hadoop cluster using Oozie work flow.
  • Experience in developing multiple MapReduce programs in Java for data extraction, transformation, and aggregation from multiple file formats including XML, JSON, and CSV.
  • Imported data using Sqoop to load data from MySQL to HDFS on regular basis.
  • Integrated Apache Storm with Kafka to perform web analytics. Uploaded click stream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
  • Designed and developed Pig Latin scripts to process data in batch to perform trend analysis.
  • Developed Hive scripts to meet analysts' analysis requirements.
  • Developed java code to generate, compare & merge AVRO schema files.
  • Developed complex MapReduce streaming jobs using Java, complemented by implementations in Hive and Pig.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig Latin scripts to study customer behavior.
  • Worked on NoSQL including MongoDB, Cassandra and HBase.
  • Continuously monitored and managed the Hadoop Cluster using Cloudera Manager.
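
A minimal sketch, in Java, of a map-only data-cleaning MapReduce job of the kind described above; the five-column CSV layout and the cleaning rules are illustrative assumptions.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanRecordsJob {

    public static class CleanMapper extends Mapper<Object, Text, Text, NullWritable> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            // Drop rows that do not have the expected number of columns or lack an id
            if (fields.length != 5 || fields[0].isEmpty()) {
                context.getCounter("clean", "bad_rows").increment(1);
                return;
            }
            // Normalize the id column: trim whitespace and lowercase
            fields[0] = fields[0].trim().toLowerCase();
            context.write(new Text(String.join(",", fields)), NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0);                      // map-only cleaning job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```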

Environment: Hadoop, HDFS, Pig, Pig Latin, Eclipse, Hive, MapReduce, Java, Avro, HBase, Sqoop, Storm, Linux, Cloudera, Big Data, MySQL, NoSQL, MongoDB, Cassandra, JSON, XML, CSV.

Confidential, Newark, CA

Big Data/Hadoop Developer

Responsibilities:

  • Worked on a Hadoop cluster of 83 nodes with 896 terabytes of capacity.
  • Worked on MapReduce jobs, Hive, and Pig.
  • Involved in requirement analysis, design, and development.
  • Imported and exported data into Hive and HBase using Sqoop from an existing SQL Server database.
  • Experience working on processing unstructured data using Pig and Hive.
  • Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
  • Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
  • Developed Hive queries, Pig scripts, and Spark SQL queries to analyze large datasets.
  • Exported the result set from Hive to MySQL using Sqoop.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig Scripts.
  • Worked on debugging, performance tuning of Hive & Pig Jobs.
  • Gained experience in managing and reviewing Hadoop log files.
  • Involved in scheduling Oozie workflow engine to run multiple Hive and pig jobs.
  • Used HBase as the NoSQL database (a brief client API sketch follows this list).
  • Actively involved in code review and bug fixing for improving the performance.
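
A minimal sketch, in Java, of writing and reading a row with the HBase client API, as used in the HBase work above; the table, column family, and row key are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomerHBaseClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();     // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer"))) {

            // Write one cell: rowkey -> cf:balance
            Put put = new Put(Bytes.toBytes("cust#0001"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("balance"), Bytes.toBytes("1250.75"));
            table.put(put);

            // Read the same cell back
            Result result = table.get(new Get(Bytes.toBytes("cust#0001")));
            byte[] balance = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("balance"));
            System.out.println("balance = " + Bytes.toString(balance));
        }
    }
}
```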

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Flume, Linux, HBase, Java, Oozie.

Confidential

Java Developer

Responsibilities:

  • Involved in preparing design, development, and analysis documents shared with clients.
  • Developed web pages using the Struts framework, JSP, XML, JavaScript, Hibernate, Spring, HTML/DHTML, and CSS; configured the Struts application and used tag libraries.
  • Developed the application using Spring, Hibernate, Spring Batch, and web services such as SOAP and RESTful web services.
  • Used the Spring Framework at the business tier and Spring's BeanFactory for initializing services.
  • Used AJAX, JavaScript to create interactive user interface.
  • Implemented client-side validations using JavaScript & server-side validations.
  • Developed a single-page application using AngularJS and Backbone.js.
  • Implemented Hibernate to persist the data into the database and wrote HQL-based queries to implement CRUD operations on the data (a sketch follows this list).
  • Developed an API to write XML documents from a database. Utilized XML and XSL Transformation for dynamic web-content and database connectivity.
  • Database modeling, administration and development using SQL and PL/SQL in Oracle 11g.
  • Coded different deployment descriptors using XML; the generated JAR files were deployed on the Apache Tomcat server.
  • Involved in the development of presentation layer and GUI framework in JSP. Client-Side validations were done using JavaScript.
  • Involved in configuring and deploying the application using WebSphere.
  • Involved in code reviews and mentored the team in resolving issues.
  • Undertook the Integration and testing of the various parts of the application.
  • Developed automated Build files using ANT.
  • Used Subversion for version control and log4j for logging errors.
  • Performed code walkthroughs and prepared test cases and test plans.
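
A minimal sketch, in Java, of Hibernate-based persistence with HQL CRUD queries as described above; Customer is a hypothetical mapped entity (its hbm.xml mapping is assumed) and the query is illustrative.

```java
import java.util.List;

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

public class CustomerDao {
    // Customer is a hypothetical mapped entity (id, name, city); hibernate.cfg.xml is assumed.
    private final SessionFactory sessionFactory =
            new Configuration().configure().buildSessionFactory();

    public void save(Customer customer) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.save(customer);        // CREATE
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }

    @SuppressWarnings("unchecked")
    public List<Customer> findByCity(String city) {
        Session session = sessionFactory.openSession();
        try {
            // READ: HQL query against the mapped entity, not the underlying table
            return session.createQuery("from Customer c where c.city = :city")
                          .setParameter("city", city)
                          .list();
        } finally {
            session.close();
        }
    }
}
```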

Environment: HTML5, JSP, Servlets, JDBC, JavaScript, JSON, jQuery, Spring, SQL, Oracle 11g, Tomcat 5, Eclipse IDE, XML, XSL, ANT.

Confidential

Associate Java Developer

Responsibilities:

  • Involved in the complete software development life cycle (SDLC) of the application, from requirement gathering and analysis to testing and maintenance.
  • Developed the modules based on MVC Architecture.
  • Developed UI using JavaScript, JSP, HTML and CSS for interactive cross browser functionality and complex user interface.
  • Created business logic using servlets and session beans and deployed them on Apache Tomcat server.
  • Created complex SQL Queries, PL/SQL Stored procedures and functions for back end.
  • Prepared the functional, design and test case specifications.
  • Performed unit testing, system testing and integration testing.
  • Developed unit test cases and used JUnit for unit testing of the application (a small example follows this list).
  • Provided technical support for production environments by resolving issues, analyzing defects, and providing and implementing solutions; resolved high-priority defects as per the schedule.
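
A minimal sketch of a JUnit 4 unit test of the kind used here; the InterestCalculator class and its rule are hypothetical stand-ins, inlined so the example is self-contained.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class InterestCalculatorTest {

    // Hypothetical class under test, inlined to keep the example self-contained.
    static class InterestCalculator {
        double monthlyInterest(double principal, double annualRatePercent) {
            return principal * (annualRatePercent / 100.0) / 12.0;
        }
    }

    @Test
    public void monthlyInterestIsOneTwelfthOfAnnual() {
        InterestCalculator calc = new InterestCalculator();
        // 1200 at 10% per year accrues 10 per month
        assertEquals(10.0, calc.monthlyInterest(1200.0, 10.0), 0.0001);
    }
}
```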

Environment: Java, JSP, Servlets, Apache Tomcat, Oracle, JUnit, SQL
