
Hadoop Developer Resume

Kansas City

PROFESSIONAL SUMMARY:

  • 7 years of professional experience in design and development with Java, J2EE and Big Data technologies.
  • 3 years of experience in using Hadoop and its ecosystem components like HDFS, MapReduce, YARN, Spark, Hive, Pig, HBase, NiFi and Kafka.
  • In-depth understanding of the Hadoop distributed architecture and its various components such as Node Manager, Resource Manager, Name Node, Data Node, HiveServer2, HBase Master, Region Server, etc.
  • Strong experience developing end-to-end data transformations using the Spark Core API.
  • Strong experience creating real time data streaming solutions using Spark Streaming and Kafka.
  • Worked extensively on fine-tuning Spark applications and worked with various memory settings in Spark.
  • Strong knowledge of real-time processing using Apache Storm.
  • Developed Simple to complex Map/Reduce jobs using Java.
  • Expertise in writing end to end Data processing Jobs to analyze data using MapReduce, Spark and Hive.
  • Experience in using Kafka for collecting, aggregating and moving large volumes of data from various sources such as web servers and telnet sources.
  • Extensive experience in writing Pig scripts to transform raw data from several data sources into forming baseline data.
  • Developed Hive and Pig scripts for handling business transformations and analyzing data.
  • Developed Sqoop scripts for large dataset transfers between Hadoop and RDBMS.
  • Experience in job workflow scheduling using tools like NiFi.
  • Experience using Hortonworks Distributions to fully implement and leverage new Hadoop features.
  • Proficient in using Cloudera Manager, an end-to-end tool to manage Hadoop operations in Cloudera Cluster.
  • Strong experience in working with UNIX/LINUX environments, writing shell scripts.
  • Experience in optimization of MapReduce algorithms using Combiners and Partitioners to deliver the best results (a minimal sketch follows this list).
  • Very good understanding of Partitions, Bucketing concepts in Hive and designed both Managed and External tables in Hive to optimize performance.
  • Extensive experience in working with semi-structured and unstructured data by implementing complex MapReduce programs using design patterns.
  • Sound knowledge of J2EE architecture, design patterns and object modeling using various J2EE technologies and frameworks.
  • Comprehensive experience in building web-based applications using J2EE frameworks like Spring, Hibernate, Struts and JMS.
  • Good working experience in design and application development using IDEs like Eclipse and NetBeans.
  • Experience in writing test cases in Java Environment using JUnit.
  • Hands on experience in development of logging standards and mechanism based on Log4j.
  • Experience in building, deploying and integrating applications with ANT, Maven.
  • Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
  • Good team player with ability to solve problems, organize and prioritize multiple tasks.
  • Ability to blend technical expertise with strong conceptual, business and analytical skills to provide quality solutions, with result-oriented problem-solving and leadership skills.
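
To illustrate the Combiner-based MapReduce optimization referenced above, the following is a minimal sketch in Java, not project code; the tab-separated event-record input format, field positions and paths are illustrative assumptions.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EventCountJob {

  /** Emits (eventType, 1) for every well-formed input record; hypothetical tab-separated layout. */
  public static class EventMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text eventType = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split("\t");
      if (fields.length >= 2) {                 // skip malformed lines
        eventType.set(fields[1]);
        context.write(eventType, ONE);
      }
    }
  }

  /** Sums counts; reused as a combiner to cut map output before the shuffle. */
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable value : values) {
        sum += value.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "event-count");
    job.setJarByClass(EventCountJob.class);
    job.setMapperClass(EventMapper.class);
    job.setCombinerClass(SumReducer.class);     // combiner reduces shuffle volume
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```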

TECHNICAL SKILLS:

Big Data Ecosystem: Hadoop, MapReduce, YARN, HDFS, HBase, Zookeeper, Hive, NiFi, Zeppelin, Pig, Sqoop, Spark, Storm, Flume, Kafka, Hortonworks clusters.

Languages: Java, Scala and Pig Latin

Java/J2EE & Web Technologies: J2EE, EJB, JSF, Servlets, JSP, JSTL, HTML, XHTML, CSS, XML, AngularJS, AJAX.

Frameworks: Struts, Spring 2.x, ORM (Hibernate), JDBC

Web Services: RESTful, JAX-WS

Web Servers: WebLogic, WebSphere, Apache Tomcat

Scripting Languages: Shell Scripting, JavaScript.

Databases: Oracle 9i/10g, MySQL, RDBMS, MongoDB, HBase

Modeling Tools: UML, Rational Rose, E-R Modeling and UMLet

IDE & Build Tools: Eclipse, NetBeans, ANT and Maven.

Version Control Systems: CVS, SVN, GitHub.

WORK EXPERIENCE:

Confidential, Kansas City

Hadoop Developer

Responsibilities:

  • Implemented best practices for the full software development life cycle including coding standards, code reviews, source control management and build processes.
  • Ingested data into HDFS using NiFi with different processors and developed custom input adaptors to pull data from other sources into HDFS.
  • Analyzed the data using Spark, Hive and produced summary results to downstream systems.
  • Created and modified shell scripts for scheduling data cleansing scripts and the ETL loading process.
  • Developed Spark applications to perform all the data transformations on User behavioral data coming from multiple sources.
  • Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS using Scala (prototype); a minimal sketch follows this list.
  • Used Spark for interactive queries, processing of streaming data and integration with popular NoSQL databases for huge volumes of data.
  • Implemented Spark using Java for faster testing and processing of data.
  • Handled importing of data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and then loaded the data into HDFS.
  • Exported the analyzed data to the relational databases using Sqoop, to further visualize and generate reports for the BI team.
  • Analyzed the data by performing Hive queries (Hive QL) and running Pig scripts (Pig Latin) to study customer behavior.
  • Created components like Hive UDFs for missing functionality in HIVE for analytics.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked on various performance optimizations like using distributed cache for small datasets, Partition, Bucketing in Hive and Map Side joins.
  • Developed Hive scripts in Hive QL to de-normalize and aggregate the data.
  • Created HBase tables and column families to store the user event data.
  • Continuous monitoring and managing the Hadoop cluster using Cloudera Manager.
  • Used NiFi processors to build and deploy end-to-end data processing pipelines and to schedule the workflows.
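
The Kafka-to-HDFS prototype noted above was written in Scala; purely as an illustration, here is a minimal sketch of the same idea using Spark's Java API for the Kafka direct stream. The broker address, consumer group, topic name and HDFS paths are placeholders, not project values.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsStream {
  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf().setAppName("KafkaToHdfsStream");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

    Map<String, Object> kafkaParams = new HashMap<>();
    kafkaParams.put("bootstrap.servers", "broker1:9092");          // placeholder broker list
    kafkaParams.put("key.deserializer", StringDeserializer.class);
    kafkaParams.put("value.deserializer", StringDeserializer.class);
    kafkaParams.put("group.id", "user-behavior-ingest");           // placeholder consumer group
    kafkaParams.put("auto.offset.reset", "latest");

    Collection<String> topics = Collections.singletonList("user-events"); // placeholder topic

    JavaInputDStream<ConsumerRecord<String, String>> stream =
        KafkaUtils.createDirectStream(
            jssc,
            LocationStrategies.PreferConsistent(),
            ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

    // Write each non-empty micro-batch to HDFS under a time-stamped directory.
    stream.map(ConsumerRecord::value)
          .foreachRDD((rdd, time) -> {
            if (!rdd.isEmpty()) {
              rdd.saveAsTextFile("hdfs:///data/raw/user_events/" + time.milliseconds());
            }
          });

    jssc.start();
    jssc.awaitTermination();
  }
}
```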

Environment: HDFS, Hadoop 2.x, Hortonworks (HDP 2.2), Pig, Hive, Sqoop, Kafka, Spark, YARN, UNIX Shell Scripting, Java 8, Agile Methodology and JIRA.

Confidential, Dearborn, MI

Hadoop Developer

Responsibilities:

  • Created Hive Tables, loaded transactional data from Teradata using Sqoop.
  • Developed MapReduce (YARN) jobs for cleaning, accessing and validating the data.
  • Created and worked on Sqoop jobs with incremental load to populate Hive external tables.
  • Developed optimal strategies for distributing the web log data over the cluster; imported and exported the stored web log data into HDFS and Hive using Sqoop.
  • Wrote Pig scripts to transform raw data from several data sources into baseline data.
  • Implemented Hive Generic UDFs to incorporate business logic into Hive queries (a minimal sketch follows this list).
  • Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most visited pages on the website.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
  • Created Hive tables and worked on them using HiveQL.
  • Designed and Implemented Partitioning (Static, Dynamic), Buckets in HIVE.
  • Involved in End-to-End implementation of ETL logic.
  • Worked on Cluster co-ordination services through Zookeeper.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Involved in building applications using Maven and integrating with CI servers like Jenkins to build jobs.
  • Exported the analyzed data to the RDBMS using Sqoop to generate reports for the BI team.
  • Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data based analytical solution from disparate sources.
  • Involved in Agile methodologies, daily scrum meetings and sprint planning.
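
As an illustration of the Hive Generic UDF work mentioned above, the following is a minimal sketch in Java; the function name (normalize_path) and its normalization logic are hypothetical, not the project's actual business logic.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Text;

/** Illustrative UDF: strips the query string and trailing slashes from a URL. */
@Description(name = "normalize_path",
             value = "_FUNC_(url) - returns the URL path without query string or trailing slashes")
public class NormalizePathUDF extends GenericUDF {

  @Override
  public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
    if (arguments.length != 1) {
      throw new UDFArgumentException("normalize_path expects exactly one argument");
    }
    return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] arguments) throws HiveException {
    Object value = arguments[0].get();
    if (value == null) {
      return null;
    }
    String url = value.toString();
    String path = url.split("\\?", 2)[0];        // drop the query string
    path = path.replaceAll("/+$", "");           // drop trailing slashes
    return new Text(path.toLowerCase());
  }

  @Override
  public String getDisplayString(String[] children) {
    return "normalize_path(" + children[0] + ")";
  }
}
```

Such a UDF would be packaged into a jar, added to the session with ADD JAR and registered with CREATE TEMPORARY FUNCTION before being used in HiveQL queries.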

Environment: Hadoop, HDFS, MapReduce, HiveQL, Pig, HBase, Sqoop, Oozie, Maven, Shell Scripting, CDH.

Confidential, CA

Hadoop Developer

Responsibilities:

  • Developed simple to complex Map Reduce jobs using Java language for processing and validating the data.
  • Developed data pipeline using Sqoop, Map Reduce, and Hive to ingest, transform and analyze operational data.
  • Developed MapReduce jobs to summarize and transform raw data.
  • Handled importing of data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and then loaded the data into HDFS.
  • Exported the analyzed data to the relational databases using Sqoop, to further visualize and generate reports for the BI team.
  • Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
  • Analyzed the data by performing Hive queries (Hive QL) and running Pig scripts (Pig Latin) to study customer behavior.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed Hive scripts in Hive QL to de-normalize and aggregate the data.
  • Created HBase tables and column families to store the user event data (see the sketch after this list).
  • Scheduled and executed workflows in Oozie to run Hive and Pig jobs.
  • Used Impala to read, write and query the Hadoop data stored in Hive.
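
The HBase table design mentioned above could be expressed through the HBase Java client's Admin API. The sketch below is illustrative only: the table name and column families are hypothetical, and it assumes the HBase 1.x client API.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateUserEventTable {
  public static void main(String[] args) throws Exception {
    // Picks up hbase-site.xml from the classpath for Zookeeper quorum and other settings.
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      TableName tableName = TableName.valueOf("user_events");      // hypothetical table name
      if (!admin.tableExists(tableName)) {
        HTableDescriptor table = new HTableDescriptor(tableName);
        table.addFamily(new HColumnDescriptor("event"));           // raw event payload
        table.addFamily(new HColumnDescriptor("meta"));            // session and device metadata
        admin.createTable(table);
      }
    }
  }
}
```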

Environment: Hadoop, HDFS, HBase, Pig, Hive, MapReduce, Sqoop, Flume, Impala, ETL, REST, Java, MySQL, Oracle 11g, Unix/Linux.

Confidential

Java Developer

Responsibilities:

  • Actively Participated in JAD (Joint application development) sessions for requirements gathering and documenting business process.
  • Used JSP, Struts, JSTL tags and JavaScript for building dynamic web pages. For more flexible page design, introduced tag libraries such as the Display tag and Validator tags.
  • Incorporated J2EE design Patterns (Business Delegate, Singleton, Data Access Object, Data Transfer Object, MVC) for the Middle Tier development.
  • Used Spring's data access framework with JDBC to automatically acquire and release database resources, handling exceptions through Spring's data access exception hierarchy for better management of database connections (a minimal sketch follows this list).
  • Established communication among external systems using Web Services (REST).
  • Implemented several JUnit test cases.
  • Implemented a web logging application using Log4j to better trace the data flow on the application server.
  • Used ClearCase for version control of the application with development streams.
  • Worked with a team of developers and testers to resolve server timeout and database connection pooling issues. Initiated profiling using RAD to find object memory leaks.
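
To illustrate the data access pattern described above, here is a minimal DAO sketch built on Spring's JdbcTemplate; it is shown against the later, generic form of the API rather than Spring 1.2's exact signatures, and the Account value object, table and column names are hypothetical.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;

/** Illustrative DAO: JdbcTemplate acquires and releases connections and translates SQLExceptions. */
public class AccountDao {

  /** Minimal value object used only for this sketch. */
  public static class Account {
    public final long id;
    public final String name;

    public Account(long id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  private final JdbcTemplate jdbcTemplate;

  public AccountDao(DataSource dataSource) {
    this.jdbcTemplate = new JdbcTemplate(dataSource);
  }

  /** Maps one result-set row to an Account. */
  private static final RowMapper<Account> ACCOUNT_MAPPER = new RowMapper<Account>() {
    @Override
    public Account mapRow(ResultSet rs, int rowNum) throws SQLException {
      return new Account(rs.getLong("id"), rs.getString("name"));
    }
  };

  public List<Account> findActiveAccounts() {
    // JdbcTemplate opens the connection, runs the query, maps the rows and closes all resources,
    // converting any SQLException into Spring's unchecked DataAccessException hierarchy.
    return jdbcTemplate.query(
        "SELECT id, name FROM accounts WHERE status = ?", ACCOUNT_MAPPER, "ACTIVE");
  }
}
```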

Environment: Java 1.4, J2EE 1.3, Struts 1.1, HTML, JavaScript, JSP 1.2, Servlets 2.3, Spring 1.2, ANT, Log4j 1.2.9, PL/SQL, WebSphere Application Server 5.1/6.0, IBM ClearCase.

Confidential

Java Developer

Responsibilities:

  • Gathered and analyzed the requirements and converted them into User Requirement specifications and Functional Requirement Specifications.
  • Involved in Full Software Development Life Cycle (SDLC). Used Agile Methodology to develop the entire application.
  • Designed and implemented the User interface using HTML, CSS, JavaScript and SQL Server.
  • Developed interfaces using JSP based on users, roles and permissions. Screen options were displayed based on user permissions. This was coded using custom tags in JSP with tag libraries.
  • Created web services using Advanced J2EE technologies to communicate with external systems.
  • Involved in the UI development, including layout and front-end coding per the requirements of the client by using JavaScript and Ext JS.
  • Used Hibernate along with Spring Framework to integrate with Oracle database.
  • Built complex SQL queries and ETL scripts for data extraction and analysis to define the application requirements.
  • Used DOM and SAX parsers with the JAXP API (see the sketch after this list).
  • Implemented JUnit test cases to test Java classes.
  • Utilized Rational Clear case for version control of the application. This involved creating development streams and defect streams.
  • Utilized WSAD for developing the application.
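
As an illustration of parsing XML through the JAXP factory API mentioned above, here is a minimal DOM example; the <order> and <amount> element names and attribute are purely hypothetical.

```java
import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class OrderXmlReader {
  public static void main(String[] args) throws Exception {
    // JAXP: obtain a DOM parser through the factory rather than a vendor-specific class.
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document document = builder.parse(new File(args[0]));   // XML file path from the command line

    // Walk every <order> element and print its id attribute and <amount> child text.
    NodeList orders = document.getElementsByTagName("order");
    for (int i = 0; i < orders.getLength(); i++) {
      Element order = (Element) orders.item(i);
      String id = order.getAttribute("id");
      String amount = order.getElementsByTagName("amount").item(0).getTextContent();
      System.out.println("order " + id + " amount " + amount);
    }
  }
}
```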

Environment: JSP, Servlets, Struts, Hibernate, HTML, CSS, JavaScript, JSON, REST, JUnit, XML, SASS, DOM, WebLogic (Oracle App Server), Web Services, Eclipse, Agile.
