
Sr. Hadoop Developer Resume

Pleasanton, CA

SUMMARY:

  • Around 8 years of professional experience in the design and development of Java, J2EE and Big Data technologies.
  • 3.5 years of experience using Hadoop and its ecosystem components such as HDFS, MapReduce, Cloudera, Hortonworks, YARN, Spark, Hive, Pig, HBase, NiFi and Kafka.
  • In-depth understanding of Hadoop's distributed architecture and its various components, such as NodeManager, ResourceManager, NameNode, DataNode, HiveServer2, HBase Master and RegionServer.
  • Strong experience developing end-to-end data transformations using the Spark Core API.
  • Strong experience creating real-time data streaming solutions using Spark Streaming and Kafka.
  • Worked extensively on fine-tuning Spark applications and tuning various Spark memory settings.
  • Strong knowledge of real-time processing using Apache Storm.
  • Developed simple to complex MapReduce jobs using Java.
  • Expertise in writing end-to-end data processing jobs to analyze data using MapReduce, Spark and Hive.
  • Experience using Kafka for collecting, aggregating and moving large volumes of data from various sources such as web servers and telnet sources.
  • Extensive experience writing Pig scripts to transform raw data from several data sources into baseline data.
  • Developed Hive and Pig scripts for handling business transformations and analyzing data.
  • Developed Sqoop scripts for large dataset transfers between Hadoop and RDBMS.
  • Experience in job workflow scheduling and orchestration tools like NiFi.
  • Experience using Hortonworks Distributions to fully implement and leverage new Hadoop features.
  • Proficient in using Cloudera Manager, an end-to-end tool to manage Hadoop operations in Cloudera Cluster.
  • Strong experience in working with UNIX/LINUX environments, writing shell scripts.
  • Experience optimizing MapReduce jobs using Combiners and Partitioners to deliver the best results.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
  • Extensive experience working with semi-structured and unstructured data by implementing complex MapReduce programs using design patterns.
  • Sound knowledge of J2EE architecture, design patterns and object modeling using various J2EE technologies and frameworks.
  • Comprehensive experience building web-based applications using J2EE frameworks like Spring, Hibernate, Struts and JMS.
  • Good working experience in design and application development using IDEs like Eclipse and NetBeans.
  • Experience in writing test cases in Java Environment using JUnit.
  • Hands on experience in development of logging standards and mechanism based on Log4j.
  • Experience in building, deploying and integrating applications with ANT, Maven.
  • Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
  • Good team player with ability to solve problems, organize and prioritize multiple tasks.
  • Ability to blend technical expertise with strong Conceptual, Business and Analytical skills to provide quality solutions and result-oriented problem-solving techniques.
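The Combiner optimization noted above (pre-aggregating map output locally before the shuffle) can be sketched in plain Java without the Hadoop runtime; the class and method names here are illustrative, not from any specific project:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of combiner-style local aggregation: each "mapper" collapses
// its own (word, 1) emissions into partial counts before the shuffle, so the
// "reducer" receives far fewer records.
public class CombinerSketch {
    // Combine step: local pre-aggregation of one mapper's output
    static Map<String, Integer> combine(List<String> words) {
        Map<String, Integer> partial = new HashMap<>();
        for (String w : words) partial.merge(w, 1, Integer::sum);
        return partial;
    }

    // Reduce step: merge partial counts from all mappers into final counts
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map<String, Integer> p : partials)
            p.forEach((w, c) -> totals.merge(w, c, Integer::sum));
        return totals;
    }

    public static void main(String[] args) {
        List<String> split1 = Arrays.asList("spark", "hive", "spark");
        List<String> split2 = Arrays.asList("hive", "spark");
        Map<String, Integer> totals =
                reduce(Arrays.asList(combine(split1), combine(split2)));
        System.out.println(totals.get("spark") + " " + totals.get("hive")); // prints "3 2"
    }
}
```

Without the combine step, five (word, 1) records would cross the shuffle; with it, only four partial counts do — the gap widens dramatically on skewed, high-volume input.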

TECHNICAL SKILLS:

Big Data Ecosystem: Hadoop, Hortonworks, Cloudera, MapReduce, YARN, HDFS, HBase, Zookeeper, Hive, NiFi, Zeppelin, Pig, Sqoop, Spark, Storm, Flume, Kafka

Languages: Java, Scala, Pig Latin

Java/J2EE & Web Technologies: J2EE, EJB, JSF, Servlets, JSP, JSTL, HTML, XHTML, CSS, XML, AngularJS, AJAX

Frameworks: Struts, Spring 2.x, ORM (Hibernate), JDBC

Web Services: RESTful, JAX-WS

Web Servers: WebLogic, WebSphere, Apache Tomcat

Scripting Languages: Shell Scripting, JavaScript

Databases: Oracle 9i/10g, MySQL, RDBMS, MongoDB, HBase

Design: UML, Rational Rose, E-R Modeling, UMLet

IDE & Build Tools: Eclipse, NetBeans, ANT and Maven.

Version Control System: CVS, SVN, GITHUB

PROFESSIONAL EXPERIENCE:

Confidential, Pleasanton, CA

Sr. Hadoop Developer

Responsibilities:
  • Worked on analyzing the Hadoop cluster and different Big Data analytic tools including Pig, Hive, the HBase database and Sqoop.
  • Installed Hadoop, MapReduce and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Coordinated with business customers to gather business requirements, interacted with other technical peers to derive technical requirements, and delivered the BRD and TDD documents.
  • Extensively involved in the Design phase and delivered Design documents.
  • Involved in testing and coordinated with the business on user testing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Experienced in defining job flows.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Experienced in managing and reviewing the Hadoop log files.
  • Used Pig as an ETL tool to perform transformations, joins and some pre-aggregations before storing the data in HDFS.
  • Loaded and transformed large sets of structured data.
  • Responsible for managing data coming from different sources.
  • Involved in creating Hive Tables, loading data and writing Hive queries.
  • Utilized the Apache Hadoop environment from Cloudera.
  • Created the data model for Hive tables.
  • Involved in Unit testing and delivered Unit test plans and results in documents.
  • Exported data from HDFS environment into RDBMS using Sqoop for report generation and visualization purpose.
  • Worked on Oozie workflow engine for job scheduling.
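The Sqoop import/export work above typically looks like the following command-line sketch; the connection string, table names and paths are hypothetical placeholders, but the flags are standard Sqoop options:

```shell
# Import an RDBMS table into Hive (illustrative connection details)
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders \
  --hive-import --hive-table staging.orders \
  --num-mappers 4

# Export aggregated results from HDFS back to the RDBMS for reporting
sqoop export \
  --connect jdbc:mysql://db.example.com/reports \
  --username etl_user -P \
  --table daily_metrics \
  --export-dir /warehouse/daily_metrics
```

The `--num-mappers` value splits the import into parallel map tasks; export reads the files under `--export-dir` and inserts them into the target table.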

Environment: Apache Hadoop 2.0.0, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, MapReduce, JSP, Struts 2.0, NoSQL, HDFS, Teradata, LINUX, Oozie, Cassandra, Hue, HCatalog, Java, IBM Cognos, Oracle 11g/10g, Microsoft SQL Server, Microsoft SSIS, DB2 LUW, TOAD for DB2, IBM Data Studio, AIX 6.1, UNIX Scripting

Confidential, Tampa, FL

Hadoop Developer

Responsibilities:
  • Implemented best practices for the full software development life cycle including coding standards, code reviews, source control management and build processes.
  • Ingested data into HDFS using NiFi with different processors, and developed custom input adaptors to pull data from other sources into HDFS.
  • Analyzed the data using Spark and Hive and produced summary results for downstream systems.
  • Created and modified shell scripts for scheduling data cleansing scripts and the ETL loading process.
  • Developed Spark applications to perform all the data transformations on User behavioral data coming from multiple sources.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Scala (prototype).
  • Used Spark for interactive queries, processing of streaming data and integration with popular NoSQL databases for huge volumes of data.
  • Implemented Spark using Java for faster testing and processing of data.
  • Handled importing data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and then loaded the data into HDFS.
  • Exported the analyzed data to the relational databases using Sqoop, to further visualize and generate reports for the BI team.
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
  • Created components like Hive UDFs for functionality missing in Hive for analytics.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked on various performance optimizations like using distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
  • Created HBase tables and column families to store the user event data.
  • Continuous monitoring and managing the Hadoop cluster using Cloudera Manager.
  • Used NiFi processors to build and deploy end-to-end data processing pipelines and to schedule the workflows.
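The Kafka-to-HDFS streaming described above can be sketched with the Spark Streaming Java API; this is illustrative only — the broker address, topic and output path are placeholders, and the exact Kafka integration API varies by Spark/Kafka version:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

// Sketch: consume events from a Kafka topic and persist micro-batches to HDFS.
public class KafkaToHdfs {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, String> kafkaParams = new HashMap<>();
        kafkaParams.put("metadata.broker.list", "broker1:9092"); // placeholder broker
        Set<String> topics = Collections.singleton("user-events"); // placeholder topic

        // Receiver-less direct stream over the topic's partitions
        JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                jssc, String.class, String.class,
                StringDecoder.class, StringDecoder.class, kafkaParams, topics);

        // Write each 10-second micro-batch of message values out to HDFS
        stream.map(tuple -> tuple._2())
              .dstream()
              .saveAsTextFiles("hdfs:///data/user-events/batch", "txt");

        jssc.start();
        jssc.awaitTermination();
    }
}
```

The direct stream maps Kafka partitions to Spark partitions one-to-one, which makes it easier to reason about throughput when tuning batch interval and executor memory.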

Environment: HDFS, Hadoop 2.x, Hortonworks (HDP 2.2), Pig, Hive, Sqoop, Kafka, Spark, YARN, UNIX Shell Scripting, Java 8, Agile Methodology and JIRA.

Confidential, Cincinnati, OH

Hadoop Developer

Responsibilities:
  • Created Hive Tables, loaded transactional data from Teradata using Sqoop.
  • Developed MapReduce (YARN) jobs for cleaning, accessing and validating the data.
  • Created and ran Sqoop jobs with incremental load to populate Hive external tables.
  • Developed optimal strategies for distributing the web log data over the cluster; imported and exported the stored web log data into HDFS and Hive using Sqoop.
  • Wrote Pig scripts to transform raw data from several data sources into baseline data.
  • Implemented Hive generic UDFs to incorporate business logic into Hive queries.
  • Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most visited pages on the website.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
  • Created Hive tables and worked on them using HiveQL.
  • Designed and implemented static and dynamic partitioning and buckets in Hive.
  • Involved in End-to-End implementation of ETL logic.
  • Worked on Cluster co-ordination services through Zookeeper.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Built applications using Maven and integrated them with CI servers like Jenkins to build jobs.
  • Exported the analyzed data to the RDBMS using Sqoop to generate reports for the BI team.
  • Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data based analytical solution from disparate sources.
  • Involved in Agile methodologies, daily scrum meetings and sprint planning.
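The static/dynamic partitioning and bucketing work above can be sketched in HiveQL; the table and column names are illustrative, not from any actual schema:

```sql
-- Illustrative DDL: daily-partitioned, bucketed web-log table
CREATE TABLE web_logs (
  user_id  STRING,
  url      STRING,
  duration INT
)
PARTITIONED BY (log_date STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS
STORED AS ORC;

-- Dynamic-partition insert from a raw staging table
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE web_logs PARTITION (log_date)
SELECT user_id, url, duration, log_date
FROM web_logs_raw;
```

Partition pruning on `log_date` limits each query to the relevant days, while bucketing by `user_id` enables bucketed map-side joins against other tables clustered the same way.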

Environment: Hadoop, HDFS, MapReduce, HiveQL, Pig, HBase, Sqoop, Oozie, Maven, Shell Scripting, CDH.

Confidential

Hadoop Developer

Responsibilities:
  • Developed simple to complex MapReduce jobs using Java for processing and validating the data.
  • Developed data pipelines using Sqoop, MapReduce and Hive to ingest, transform and analyze operational data.
  • Developed MapReduce jobs to summarize and transform raw data.
  • Handled importing data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and then loaded the data into HDFS.
  • Exported the analyzed data to the relational databases using Sqoop, to further visualize and generate reports for the BI team.
  • Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
  • Created HBase tables and column families to store the user event data.
  • Scheduled and executed workflows in Oozie to run Hive and Pig jobs.
  • Used Impala to read, write and query the Hadoop data in Hive.
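A minimal sketch of the kind of Oozie workflow used above to chain Pig and Hive jobs; the workflow name, script names and property placeholders are illustrative:

```xml
<workflow-app name="daily-etl" xmlns="uri:oozie:workflow:0.4">
  <start to="pig-transform"/>

  <!-- Run the Pig transformation first -->
  <action name="pig-transform">
    <pig>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>transform.pig</script>
    </pig>
    <ok to="hive-aggregate"/>
    <error to="fail"/>
  </action>

  <!-- Then aggregate the transformed data with Hive -->
  <action name="hive-aggregate">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>aggregate.hql</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Daily ETL failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Each action's `ok`/`error` transitions encode the dependency chain, so a failed Pig step never triggers the downstream Hive aggregation.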

Environment: Hadoop, HDFS, HBase, Pig, Hive, MapReduce, Sqoop, Flume, Impala, ETL, REST, Java, MySQL, Oracle 11g, Unix/Linux.

Confidential

Java Developer

Responsibilities:
  • Actively Participated in JAD (Joint application development) sessions for requirements gathering and documenting business process.
  • Used JSP, Struts, JSTL tags and JavaScript for building dynamic web pages. For more flexible page design, introduced tag libraries such as the Display tag and Validator tags.
  • Incorporated J2EE design Patterns (Business Delegate, Singleton, Data Access Object, Data Transfer Object, MVC) for the Middle Tier development.
  • Used Spring's data access framework for automatically acquiring and releasing database resources, and its data access exception hierarchy for better handling of database connections with JDBC.
  • Established communication among external systems using Web Services (REST).
  • Implemented several JUnit test cases.
  • Implemented application logging with Log4j to better trace the data flow on the application server.
  • Used ClearCase for version control of the application, with development streams.
  • Worked with a team of developers and testers to resolve server timeout and database connection pooling issues. Initiated profiling using RAD to find object memory leaks.
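Log4j-based logging standards like those described above are typically driven by a properties file along these lines; the file logger names and paths are illustrative:

```properties
# Illustrative log4j.properties: console plus rolling file appender
log4j.rootLogger=INFO, console, file

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p [%c] %m%n

log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=logs/app.log
log4j.appender.file.MaxFileSize=10MB
log4j.appender.file.MaxBackupIndex=5
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p [%c] %m%n

# Finer-grained tracing for one application package (hypothetical name)
log4j.logger.com.example.app=DEBUG
```

Keeping appender layouts identical across console and file output makes logs grep-able the same way in development and on the application server.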

Environment: Java 1.4, J2EE 1.3, Struts 1.1, HTML, JavaScript, JSP 1.2, Servlets 2.3, Spring 1.2, ANT, Log4j 1.2.9, PL/SQL, WebSphere Application Server 5.1/6.0, IBM ClearCase.

Confidential

Java Developer

Responsibilities:
  • Gathered and analyzed the requirements and converted them into User Requirement specifications and Functional Requirement Specifications.
  • Involved in Full Software Development Life Cycle (SDLC). Used Agile Methodology to develop the entire application.
  • Designed and implemented the user interface using HTML, CSS, JavaScript and SQL Server.
  • Developed interfaces using JSP based on users, roles and permissions; screen options were displayed according to user permissions. This was coded using custom tags in JSP with tag libraries.
  • Created web services using Advanced J2EE technologies to communicate with external systems.
  • Involved in the UI development, including layout and front-end coding per the requirements of the client by using JavaScript and Ext JS.
  • Used Hibernate along with Spring Framework to integrate with Oracle database.
  • Built complex SQL queries and ETL scripts for data extraction and analysis to define the application requirements.
  • Used DOM and SAX parsers with JAXP API.
  • Implemented JUnit test cases to test Java classes.
  • Utilized Rational ClearCase for version control of the application, creating development streams and defect streams.
  • Utilized WSAD for developing the application.

Environment: JSP, Servlets, Struts, Hibernate, HTML, CSS, JavaScript, JSON, REST, JUnit, XML, SASS, DOM, WebLogic (Oracle App server), Web Services, Eclipse, Agile.
