Hadoop Developer Resume

SUMMARY:

Proactive IT developer with 8+ years of experience designing and developing scalable systems using Hadoop technologies in various environments.
Experience in installing, configuring, supporting and managing Hadoop clusters using Hortonworks and Cloudera (CDH3, CDH4) distributions on Amazon Web Services (AWS).
Strong understanding of Hadoop architecture and hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode, DataNode and the HDFS framework.
Extensive experience in analyzing data using Hadoop ecosystem tools including HDFS, Hive, Pig, Sqoop, Flume, MapReduce, Spark, Kafka, HBase, Oozie, Solr and ZooKeeper.
Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
Extensive knowledge of NoSQL databases such as HBase, Cassandra and MongoDB.
Configured ZooKeeper, Cassandra and Flume on an existing Hadoop cluster.
Experience in importing and exporting data with Sqoop between the Hadoop Distributed File System (HDFS) and relational database systems, in both directions.
Expertise in writing Hadoop jobs for analyzing data using HiveQL (queries), Pig Latin (data flow language) and custom MapReduce programs in Java.
Experience in creating custom UDFs for Pig and Hive to bring Python/Java methods and functionality into Pig Latin and HiveQL (HQL).
Experience in converting Hive queries into Spark transformations using Spark RDDs and Scala.
Hands-on experience in troubleshooting errors in the HBase shell, Pig, Hive and MapReduce.
Hands-on experience in provisioning and managing multi-tenant Cassandra clusters in cloud environments: Amazon Web Services (AWS) EC2 and OpenStack.
Experience with NoSQL column-oriented databases such as HBase and Cassandra and their integration with Hadoop clusters.
Experience in maintaining the big data platform using open source technologies such as Spark and Elasticsearch.
Experience in configuring Flume agents to transfer data from external systems to HDFS.
Good experience with Solr and the NoSQL database HBase.
Implemented clusters for the NoSQL tools Cassandra and MongoDB as part of a POC to address HBase limitations.
Designed and built a solution for real-time data ingestion using Kafka, Storm, Spark Streaming and various NoSQL databases.
Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation, queries, and writing data back into RDBMS through Sqoop.
Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
Good hands-on experience in creating RDDs and DataFrames for the required input data and performing data transformations using Spark with Scala.
Experience in working with various Cloudera distributions (CDH4/CDH5), Hortonworks and Amazon EMR Hadoop Distributions.
Knowledge of developing a NiFi flow prototype for data ingestion into HDFS.
Developed automated scripts using Unix shell for performing RUNSTATS, REORG, REBIND, COPY, LOAD, BACKUP, IMPORT, EXPORT and other database-related activities.
Experience in analyzing, designing and developing ETL strategies and processes, writing ETL specifications, and Informatica development.
Extensive experience working with Oracle, DB2, SQL Server, PL/SQL and MySQL databases and with core Java concepts such as OOP, multithreading, collections and I/O.
Good working knowledge of object-oriented programming.
Experienced in designing web applications using HTML5, CSS3, JavaScript, JSON, jQuery, AngularJS, Bootstrap and Ajax on the Windows operating system.
Experience in service-oriented architecture using web services such as SOAP and RESTful.
Knowledge of service-oriented architecture (SOA), workflows and web services using XML, SOAP and WSDL.
Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate and EJB.
Good experience in working with the Tableau visualization tool using Tableau Desktop, Tableau Server and Tableau Reader.
Good interpersonal and communication skills, strong problem-solving skills, quick to adopt new technologies, and a good team member.

TECHNICAL SKILLS:

Big Data Ecosystems: HDFS, MapReduce, Hive, YARN, Pig, Sqoop, Kafka, Storm, Flume, Oozie, ZooKeeper, Apache Spark, Apache Tez, Impala, NiFi, Apache Solr, ActiveMQ, Scala

NoSQL Databases: HBase, Cassandra, MongoDB

Programming Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, Scala, Python

Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery, AngularJS

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Sun Solaris, HP-UNIX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Version control: SVN, CVS

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

Business Intelligence Tools: Tableau, QlikView, Pentaho, IBM Cognos intelligence

Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata

Tools and IDEs: Eclipse, NetBeans, Toad, Maven, ANT, Hudson, Sonar, JDeveloper, Assent PMD, DB Visualizer

Cloud Technologies: Amazon Web Services (AWS), CDH3, CDH4, CDH5, Hortonworks, Mahout, Microsoft Azure Insight, Amazon Redshift

PROFESSIONAL EXPERIENCE:

Hadoop Developer

Confidential

Responsibilities:

Managed nodes on the Hadoop cluster and monitored cluster job performance using Cloudera Manager.
Developed strategies for distributing web log data over the cluster, importing and exporting the stored web log data into HDFS and Hive using Sqoop.
Loaded data from the edge node to HDFS using shell scripts.
Created MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, Avro data files and sequence files for log files.
Developed Spark scripts using Python shell commands as required.
Integrated Elasticsearch and implemented dynamic faceted search.
Played a key role in installing and configuring various Hadoop ecosystem tools such as Solr, Kafka, Pig, HBase and Cassandra.
Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
Wrote a Storm topology to accept events from a Kafka producer and emit them into Cassandra.
Loaded large volumes of data into HDFS using Apache Kafka.
Designed and developed Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs.
Developed an end-to-end search solution using the web crawler Apache Nutch and the search platform Apache Solr.
Developed an ETL job in Talend to load data from ASCII flat files.
Used Pig loaders to load tables from Hadoop to various clusters.
Designed Talend jobs for data ingestion, enrichment and provisioning.
Designed and developed custom Java components for Talend.
Migrated HiveQL queries to Impala to reduce query response time.
Created Hive tables, dynamic partitions and buckets for sampling, and worked on them using HQL.
Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
Used Spark stream processing to bring data into memory and implemented RDD transformations and actions to process it in units.
Implemented proofs of concept (POCs) using Kafka, Storm and HBase for processing streaming data.
Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
Used MRUnit for unit testing and Continuum for integration testing.
Implemented Spark RDD transformations to map business analysis and applied actions on top of the transformations.
Used Maven to build and deploy the JARs for MapReduce, Pig and Hive UDFs.
Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).

Environment: Hadoop, Scala, MapReduce, HDFS, Spark, Kafka, AWS, Apache Solr, Hive, Cassandra, Maven, Jenkins, Pig, UNIX, Python, MRUnit, Git.

Confidential, Mountain View, CA

Hadoop Developer

Responsibilities:

Responsible for building scalable distributed data solutions using Hadoop.
Joined raw data with reference data using Pig scripts.
Analyzed data using Hadoop components Hive and Pig.
Implemented DataStax Enterprise Search with Apache Solr.
Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
Implemented a DSE Solr solution to push incremental orders data into a centralized Hadoop cluster.
Designed, configured, implemented and monitored the Kafka cluster and connectors.
Developed ETL jobs using Spark with Scala to migrate data from Oracle to new Hive tables.
Developed and deployed applications using Apache Spark and Scala.
Developed an Oozie workflow for scheduling and orchestrating the ETL process.
Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
Created a high-level design approach for building a data lake that incorporates the existing historical data and meets the need to process transactional data.
Helped troubleshoot Scala problems while working with MicroStrategy to produce illustrative reports and dashboards along with ad-hoc analysis.
Developed Hive queries for the analysts and wrote scripts in Scala.
Created and ran Sqoop jobs with incremental load to populate Hive external tables.
Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
Wrote custom UDFs in Hive and Pig. Used scripts written in Scala to perform MapReduce operations.
Worked in continuous integration environments under Scrum and Agile methodologies.
Extracted data from Teradata into HDFS using Sqoop.
Installed the Oozie workflow engine to run multiple Hive and Pig jobs that trigger independently based on time and data availability.
Managed real-time data processing and real-time data ingestion into HBase and Hive using Storm.

Environment: Hadoop, HDFS, Pig, Hive, Oozie, HBase, Kafka, Apache Solr, MapReduce, Sqoop, Storm, Spark, Scala, Linux, Cloudera, Maven, Jenkins, Java, SQL.

Confidential, Tampa, Florida

Hadoop Developer

Responsibilities:

Exported data from DB2 to HDFS using Sqoop and developed MapReduce jobs using the Java API.
Installed and configured Pig and wrote Pig Latin scripts.
Created and maintained technical documentation for launching Cloudera Hadoop clusters and for executing Hive queries and Pig scripts.
Developed workflows using Oozie for running MapReduce jobs and Hive queries.
Implemented various advanced join operations using Pig Latin.
Imported and exported data into and out of HDFS and assisted in exporting analyzed data to relational databases using Sqoop.
Developed monitoring and performance metrics for Hadoop clusters.
Worked with both MapReduce 1 (JobTracker) and MapReduce 2 (YARN).
Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
Configured Hadoop system files to accommodate new sources of data and updated the existing Hadoop cluster configuration.

Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, Pig, Eclipse, Spark, MySQL, Ubuntu, ZooKeeper, Maven, Jenkins, Java (JDK 1.6), Oracle 10g.

Confidential, NJ

Java Developer

Responsibilities:

Effectively interacted with team members and business users for requirements gathering.
Involved in analysis, design and implementation phases of the software development lifecycle (SDLC).
Implemented Spring Core and J2EE patterns such as MVC, Dependency Injection (DI) and Inversion of Control (IoC).
Implemented REST Web Services with Jersey API to deal with customer requests.
Developed test cases using JUnit and used Log4j as the logging framework.
Worked with HQL and the Criteria API for retrieving data elements from the database.
Developed the user interface using HTML, Spring tags, JavaScript, jQuery and CSS.
Developed the application using Eclipse IDE and worked in an Agile environment.
Designed and implemented front-end web pages using CSS, JSP, HTML, JavaScript, Ajax and Struts.
Used Eclipse IDE as the development environment to design, develop and deploy Spring components on WebLogic.

Environment: Java, J2EE, HTML, JavaScript, CSS, jQuery, Spring 3.0, JNDI, Hibernate 3.0, JavaMail, Web Services, REST, Oracle 10g, JUnit, Log4j, Eclipse, WebLogic 10.3.

Java Developer

Confidential

Responsibilities:

Involved in various stages of application enhancements, performing the required analysis, development and testing.
Created use cases, class diagrams and sequence diagrams for analysis and design of the application.
Developed web-based user interfaces using the Struts framework.
Developed and maintained Java/J2EE code required for the web application.
Handled client-side validations using JavaScript and was involved in integrating various Struts actions into the framework.
Developed user interfaces using HTML, JSP, CSS and JavaScript.
Developed, tested and debugged the Java, JSP and EJB components using Eclipse.

Environment: Java (JDK 1.5), J2EE, Servlets, Struts, JSP, HTML, CSS, JavaScript, EJB, Eclipse, WebLogic 8.1, Windows.
