Hadoop Developer Resume

SUMMARY:

  • Proactive IT developer with 8+ years of experience in the design and development of scalable systems using Hadoop technologies across various environments.
  • Experience in installing, configuring, supporting and managing Hadoop clusters using Hortonworks and Cloudera (CDH3, CDH4) distributions on Amazon Web Services (AWS).
  • Strong understanding of Hadoop architecture and hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode, DataNode and the HDFS framework.
  • Extensive experience in analyzing data using Hadoop ecosystem components including HDFS, Hive, Pig, Sqoop, Flume, MapReduce, Spark, Kafka, HBase, Oozie, Solr and ZooKeeper.
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Extensive knowledge of NoSQL databases such as HBase, Cassandra and MongoDB.
  • Configured ZooKeeper, Cassandra and Flume on the existing Hadoop cluster.
  • Experience in importing and exporting data with Sqoop between the Hadoop Distributed File System (HDFS) and relational database systems, in both directions.
  • Expertise in writing Hadoop jobs for analyzing data using HiveQL (queries), Pig Latin (data flow language) and custom MapReduce programs in Java.
  • Experience in creating custom UDFs for Pig and Hive to bring Python/Java methods and functionality into Pig Latin and HQL (HiveQL).
  • Experience in converting Hive queries into Spark transformations using Spark RDDs and Scala.
  • Hands-on experience in troubleshooting errors in the HBase shell, Pig, Hive and MapReduce.
  • Hands-on experience in provisioning and managing multi-tenant Cassandra clusters in public cloud environments: Amazon Web Services (AWS) EC2 and OpenStack.
  • Experience in NoSQL column-oriented databases such as HBase and Cassandra and their integration with Hadoop clusters.
  • Experience in maintaining the big data platform using open source technologies such as Spark and Elasticsearch.
  • Experience in configuring Flume agents for the transfer of data from external systems to HDFS.
  • Good experience with Solr and the NoSQL database HBase.
  • Implemented clusters for the NoSQL tools Cassandra and MongoDB as part of a POC to address HBase limitations.
  • Designed and developed a solution for real-time data ingestion using Kafka, Storm, Spark Streaming and various NoSQL databases.
  • Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation, queries and writing data back into an RDBMS through Sqoop.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
  • Good hands-on experience in creating RDDs and DataFrames for the required input data and performing data transformations using Spark with Scala.
  • Experience in working with various Cloudera distributions (CDH4/CDH5), Hortonworks and Amazon EMR Hadoop distributions.
  • Knowledge of developing a NiFi flow prototype for data ingestion into HDFS.
  • Developed automated scripts using Unix shell for performing RUNSTATS, REORG, REBIND, COPY, LOAD, BACKUP, IMPORT, EXPORT and other database-related activities.
  • Experience in analyzing, designing and developing ETL strategies and processes, writing ETL specifications and Informatica development.
  • Extensive experience working with Oracle, DB2, SQL Server, PL/SQL and MySQL databases and core Java concepts such as OOP, multithreading, collections and I/O.
  • Good working knowledge of object-oriented programming.
  • Experienced in designing web applications using HTML5, CSS3, JavaScript, JSON, jQuery, AngularJS, Bootstrap and Ajax on the Windows operating system.
  • Experience in service-oriented architecture using web services such as SOAP and RESTful.
  • Knowledge of service-oriented architecture (SOA), workflows and web services using XML, SOAP and WSDL.
  • Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate and EJB.
  • Good experience in working with the Tableau visualization tool using Tableau Desktop, Tableau Server and Tableau Reader.
  • Good interpersonal and communication skills, strong problem-solving skills, the ability to pick up new technologies with ease, and a good team member.

TECHNICAL SKILLS:

Big Data Ecosystems: HDFS, MapReduce, Hive, YARN, Pig, Sqoop, Kafka, Storm, Flume, Oozie, ZooKeeper, Apache Spark, Apache Tez, Impala, NiFi, Apache Solr, ActiveMQ, Scala.

NoSQL Databases: HBase, Cassandra, MongoDB

Programming Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, Scala, Python

Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery, AngularJS

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Version control: SVN, CVS

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

Business Intelligence Tools: Tableau, QlikView, Pentaho, IBM Cognos intelligence

Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata

Tools and IDEs: Eclipse, NetBeans, Toad, Maven, ANT, Hudson, Sonar, JDeveloper, Assent PMD, DbVisualizer

Cloud Technologies: Amazon Web Services (AWS), CDH3, CDH4, CDH5, Hortonworks, Mahout, Microsoft Azure Insight, Amazon Redshift

PROFESSIONAL EXPERIENCE:

Hadoop Developer

Confidential

Responsibilities:

  • Involved in managing nodes on the Hadoop cluster and monitoring Hadoop cluster job performance using Cloudera Manager.
  • Developed optimal strategies for distributing web log data over the cluster, importing and exporting the stored web log data into HDFS and Hive using Sqoop.
  • Involved in loading data from the edge node to HDFS using shell scripting.
  • Created MapReduce programs to handle semi-structured and unstructured data such as XML, JSON and Avro data files, and sequence files for log files.
  • Developed Spark scripts using Python shell commands as per requirements.
  • Integrated Elasticsearch and implemented dynamic faceted search.
  • Played a key role in the installation and configuration of various Hadoop ecosystem tools such as Solr, Kafka, Pig, HBase and Cassandra.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Wrote a Storm topology to accept events from a Kafka producer and emit them into Cassandra.
  • Loaded large volumes of data into HDFS using Apache Kafka.
  • Designed and developed Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs.
  • Developed an end-to-end search solution using the web crawler Apache Nutch and the search platform Apache Solr.
  • Developed an ETL job in Talend to load data from ASCII flat files.
  • Used the Pig loader for loading tables from Hadoop to various clusters.
  • Designed Talend jobs for data ingestion, enrichment and provisioning.
  • Designed and developed custom Java components for Talend.
  • Worked on migrating HiveQL queries to Impala to minimize query response time.
  • Created Hive tables, dynamic partitions and buckets for sampling, and worked on them using HQL.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Used Spark stream processing to bring data in-memory and implemented RDD transformations and actions to process it as units.
  • Implemented proofs of concept (POCs) using Kafka, Storm and HBase for processing streaming data.
  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
  • Used MRUnit for unit testing and Continuum for integration testing.
  • Implemented Spark RDD transformations to map business analysis and applied actions on top of the transformations.
  • Used Maven to build and deploy the JARs for MapReduce, Pig and Hive UDFs.
  • Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).

Environment: Hadoop, Scala, MapReduce, HDFS, Spark, Kafka, AWS, Apache Solr, Hive, Cassandra, Maven, Jenkins, Pig, UNIX, Python, MRUnit, Git.

Confidential, Mountain View, CA

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Worked on joining raw data with reference data using Pig scripting.
  • Analyzed data using the Hadoop components Hive and Pig.
  • Implemented DataStax Enterprise Search with Apache Solr.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
  • Implemented a DSE Solr solution to push incremental orders data into the centralized Hadoop cluster.
  • Configured, designed, implemented and monitored the Kafka cluster and connectors.
  • Developed ETL jobs using Spark with Scala to migrate data from Oracle to new Hive tables.
  • Developed and deployed applications using Apache Spark and Scala.
  • Developed Oozie workflows for scheduling and orchestrating the ETL process.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Created a high-level design approach for building a data lake that incorporates the existing historical data and meets the need to process transactional data.
  • Helped in troubleshooting Scala problems while working with MicroStrategy to produce illustrative reports and dashboards along with ad-hoc analysis.
  • Developed Hive queries for the analysts and wrote scripts using Scala.
  • Created and ran Sqoop jobs with incremental load to populate Hive external tables.
  • Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Wrote custom UDFs in Hive and Pig and used scripts written in Scala for performing MapReduce operations.
  • Worked in continuous integration environments following Scrum and Agile methodologies.
  • Extracted data from Teradata into HDFS using Sqoop.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.
  • Managed real-time data processing and real-time data ingestion into HBase and Hive using Storm.

Environment: Hadoop, HDFS, Pig, Hive, Oozie, HBase, Kafka, Apache Solr, MapReduce, Sqoop, Storm, Spark, Scala, Linux, Cloudera, Maven, Jenkins, Java, SQL.

Confidential, Tampa, Florida

Hadoop Developer

Responsibilities:

  • Exported data from DB2 to HDFS using Sqoop and developed MapReduce jobs using the Java API.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Created and maintained technical documentation for launching Cloudera Hadoop clusters and for executing Hive queries and Pig scripts.
  • Developed workflows using Oozie for running MapReduce jobs and Hive queries.
  • Implemented various advanced join operations using Pig Latin.
  • Imported and exported data into and out of HDFS using Sqoop.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Involved in developing monitoring and performance metrics for Hadoop clusters.
  • Worked with both MapReduce 1 (JobTracker) and MapReduce 2 (YARN).
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Configured Hadoop system files to accommodate new data sources and updated the existing Hadoop cluster configuration.

Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, Pig, Eclipse, Spark, MySQL, Ubuntu, ZooKeeper, Maven, Jenkins, Java (JDK 1.6), Oracle 10g.

Confidential, NJ

Java Developer

Responsibilities:

  • Effectively interacted with team members and business users for requirements gathering.
  • Involved in analysis, design and implementation phases of the software development lifecycle (SDLC).
  • Implemented Spring core and J2EE patterns such as MVC, Dependency Injection (DI) and Inversion of Control (IoC).
  • Implemented REST web services with the Jersey API to handle customer requests.
  • Developed test cases using JUnit and used Log4j as the logging framework.
  • Worked with HQL and the Criteria API for retrieving data elements from the database.
  • Developed the user interface using HTML, Spring tags, JavaScript, jQuery and CSS.
  • Developed the application using the Eclipse IDE and worked in an Agile environment.
  • Designed and implemented front-end web pages using CSS, JSP, HTML, JavaScript, Ajax and Struts.
  • Used the Eclipse IDE as the development environment to design, develop and deploy Spring components on WebLogic.

Environment: Java, J2EE, HTML, JavaScript, CSS, jQuery, Spring 3.0, JNDI, Hibernate 3.0, JavaMail, Web Services, REST, Oracle 10g, JUnit, Log4j, Eclipse, WebLogic 10.3.

Java Developer

Confidential

Responsibilities:

  • Involved in various stages of enhancements to the application by performing the required analysis, development and testing.
  • Created use case, class and sequence diagrams for the analysis and design of the application.
  • Developed web-based user interfaces using the Struts framework.
  • Developed and maintained the Java/J2EE code required for the web application.
  • Handled client-side validations using JavaScript and was involved in the integration of various Struts actions in the framework.
  • Involved in the development of the user interfaces using HTML, JSP, CSS and JavaScript.
  • Developed, tested and debugged the Java, JSP and EJB components using Eclipse.

Environment: Java (JDK 1.5), J2EE, Servlets, Struts, JSP, HTML, CSS, JavaScript, EJB, Eclipse, WebLogic 8.1, Windows.
