
Hadoop Developer Resume


Plano, TX

SUMMARY

  • Overall 8+ years of professional IT experience, with 5+ years in analysis, architectural design, prototyping, development, integration and testing of applications using Java/J2EE technologies and 3+ years in Big Data analytics as a Hadoop Developer.
  • Developed UML diagrams for object-oriented design (use cases, sequence diagrams and class diagrams) using Visual Paradigm and Visio.
  • Good knowledge of major Hadoop ecosystem projects such as Pig, Hive and HBase, and experience monitoring them with Hortonworks Ambari.
  • Experience in developing MapReduce programs using Apache Hadoop to analyze big data as per requirements.
  • Hands-on experience in developing Pig Latin scripts and using Hive Query Language (HiveQL) for data analytics.
  • Hands-on experience in converting SQL to HiveQL and in performance tuning of HiveQL.
  • Hands-on experience with NoSQL databases, including HBase and MongoDB, and their integration with a Hadoop cluster.
  • Good working experience using Sqoop to import data into HDFS from RDBMS and vice versa.
  • Good knowledge of job scheduling and coordination tools such as Oozie and ZooKeeper.
  • Experience in Hadoop administration activities such as installation and configuration of clusters using Apache, Hortonworks, Cloudera and AWS.
  • Implemented Apache Kafka and Spark Streaming for real-time data processing.
  • Extensive experience in the financial and insurance industries.
  • Working knowledge of databases such as Oracle 9i/10g/11g and Microsoft SQL Server.
  • Experience in writing test cases using the JUnit framework.
  • Strong experience in database design and in writing complex SQL queries.
  • Experience in building, deploying and integrating with Maven.
  • Experience in developing logging standards and mechanisms based on Log4J.
  • Strong work ethic and a desire to make significant contributions to the organization.
  • Worked in various software methodologies such as Agile, Waterfall and prototyping.
  • Motivated to take on independent responsibility while contributing as a productive team member.

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, YARN, Hive, Sqoop, Pig, HBase, Spark, Kafka, Oozie, ZooKeeper

Programming Languages: Java (JDK 1.6/1.7), Pig Latin, Shell Scripting, HTML, SQL

Hadoop Distributions: Hortonworks, Cloudera

Operating Systems: UNIX/LINUX, Windows

Databases: NoSQL (HBase, MongoDB), Oracle 11g, Microsoft SQL Server 2008/2012

Tools: Eclipse (Kepler/Juno), TOAD, SQL Developer, Ant, Maven, Visio

PROFESSIONAL EXPERIENCE

Confidential, Plano, TX

Hadoop Developer

Responsibilities:

  • Responsible for the design and development of Big Data applications using Hortonworks Hadoop.
  • Coordinated with business customers to gather business requirements.
  • Imported and exported data between relational databases and HDFS using Sqoop.
  • Responsible for managing data coming from different sources.
  • Worked on analyzing the Hadoop cluster and different Big Data analytic tools, including Pig, Hive, HBase, Spark and Sqoop.
  • Developed Apache Spark jobs using Scala in the test environment for faster data processing, and used Spark SQL for querying.
  • Migrated HiveQL queries on structured data to Spark SQL to improve performance.
  • Analyzed data using the Hadoop components Hive and Pig, and created Hive tables for end users.
  • Involved in writing Hive queries and Pig scripts for data analysis to meet business requirements.
  • Wrote Oozie workflows and shell scripts to automate the flow.
  • Optimized MapReduce and Hive jobs to use HDFS efficiently through Gzip, LZO and Snappy compression and the ORC storage format.
  • Tuned Hive tables and queries to improve performance.
  • Wrote algorithms to identify the most valuable households based on data provided by external providers.
  • Connected Hive tables to Trifacta for data exploration and wrangling.
  • Involved in the design of the next phase (future household values) of the data analytics effort.
  • Used Spark SQL to query Hive tables for better performance, as sketched below.
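
Illustrative sketch: a minimal Java version of the Spark SQL usage described above (the production jobs were written in Scala; Java is used here to match the rest of this resume's examples). The table and column names are hypothetical stand-ins for the household-value logic.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HouseholdValueQuery {
        public static void main(String[] args) {
            // enableHiveSupport() points Spark SQL at the existing Hive metastore,
            // so tables created through HiveQL are queryable without data movement.
            SparkSession spark = SparkSession.builder()
                    .appName("HouseholdValueQuery")
                    .enableHiveSupport()
                    .getOrCreate();

            // Hypothetical table and columns standing in for the real logic.
            Dataset<Row> topHouseholds = spark.sql(
                    "SELECT household_id, SUM(order_total) AS lifetime_value "
                  + "FROM sales.orders "
                  + "GROUP BY household_id "
                  + "ORDER BY lifetime_value DESC LIMIT 100");
            topHouseholds.show();
            spark.stop();
        }
    }

Submitted to the cluster with spark-submit, a job like this replaces the equivalent HiveQL query while leaving the Hive table definitions unchanged.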

Environment: MapReduce, HDFS, YARN, Hive, HBase, Pig, Sqoop, Spark, Oozie, Java, Python.

Confidential, Brooklyn, NY

Hadoop Developer

Responsibilities:

  • Responsible for the design and development of Big Data applications using Hortonworks Hadoop.
  • Responsible for the overall data migration and data management from the Oracle database to HDFS using Sqoop.
  • Created managed and external tables in Hive and loaded data from HDFS.
  • Optimized Hive tables using techniques like partitioning and bucketing to provide better performance for HiveQL queries.
  • Converted SQL queries to HiveQL without any changes to the business logic.
  • Used different data formats (Text and ORC) while loading the data into HDFS.
  • Developed shell scripts to automate daily Sqoop jobs.
  • Integrated HBase with Hive using the HBase storage handler for analytics.
  • Developed UDFs to provide custom Hive and Pig capabilities.
  • Performed performance tuning of Hive queries and configuration properties.
  • Developed Oozie workflows for scheduling and orchestrating the ETL process.
  • Monitored workload, job performance and capacity planning using Ambari.
  • Interacted directly with the Hortonworks team for issue management and cluster balancing.
  • Involved in developing a Spark Streaming application with Kafka to handle near-real-time data processing, as sketched below.
  • Developed Spark SQL code integrated with Hive to analyze historical data in memory and speed up report generation.
  • Implemented Spark RDD transformations and actions to migrate MapReduce algorithms.
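
Illustrative sketch: the shape of the near-real-time pipeline described above, assuming Spark Streaming with the Kafka 0.10 direct-stream integration. The broker address, consumer group and topic name are placeholders.

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class NearRealTimeIngest {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("NearRealTimeIngest");
            // 10-second micro-batches; interval is an illustrative choice.
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");   // placeholder broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "ingest-group");            // placeholder group id
            kafkaParams.put("auto.offset.reset", "latest");

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("events"), kafkaParams)); // placeholder topic

            // Count records per micro-batch as a stand-in for the real transformations.
            stream.map(ConsumerRecord::value)
                  .count()
                  .print();

            jssc.start();
            jssc.awaitTermination();
        }
    }

The direct stream reads partition offsets straight from Kafka, so no separate receiver process is needed and throughput scales with the number of topic partitions.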

Environment: MapReduce, HDFS, YARN, Hive, Sqoop, Kafka, Spark, Oozie, Java, Python.

Confidential, Bloomington, IL

Hadoop Developer

Responsibilities:

  • Coordinated with business customers to gather business requirements, interacted with technical peers to derive technical requirements, and delivered the BRD and TDD documents.
  • Extensively involved in the design phase and delivered design documents.
  • Worked on analyzing the Hadoop cluster and different Big Data analytic tools, including Pig, Hive, HBase and Sqoop.
  • Installed Hadoop (MapReduce, HDFS) and developed multiple Pig and Hive jobs for data cleaning and pre-processing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Mapped the relational database architecture to Hadoop's file system and built databases on top of it using Cloudera Impala.
  • Migrated large volumes of data from different databases (Netezza, Oracle, SQL Server) to Hadoop.
  • Developed Spark programs in Scala and Java.
  • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Defined job flows.
  • Involved in data migration from the Oracle database to MongoDB.
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting; see the sketch after this list.
  • Managed and reviewed Hadoop log files.
  • Used Pig as an ETL tool for transformations, joins and pre-aggregations before storing data in HDFS.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Responsible for managing data coming from different sources.
  • Utilized the Cloudera Apache Hadoop distribution.
  • Created data models for Hive tables.
  • Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization.
  • Worked on the Oozie workflow engine for job scheduling.
  • Involved in unit testing and delivered unit test plans and results documents.
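
Illustrative sketch: a partition-aware Hive reporting query issued from Java over the HiveServer2 JDBC driver, in the spirit of the reporting work above. The endpoint, credentials, table and columns are hypothetical.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveReportQuery {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // HiveServer2 endpoint and database are placeholders.
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://hiveserver:10000/analytics", "etl_user", "");
                 Statement stmt = conn.createStatement()) {
                // Filtering on the partition column (load_date) lets Hive prune
                // partitions instead of scanning the whole table.
                ResultSet rs = stmt.executeQuery(
                        "SELECT region, COUNT(*) AS txn_count "
                      + "FROM transactions "
                      + "WHERE load_date = '2016-01-15' "
                      + "GROUP BY region");
                while (rs.next()) {
                    System.out.println(rs.getString("region") + "\t" + rs.getLong("txn_count"));
                }
            }
        }
    }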

Environment: Hadoop, MapReduce, YARN, HDFS, Hive, Pig, Impala, Kafka, Java (JDK 1.6), SQL, Oracle, Cloudera Manager, Sqoop, Flume, Oozie, Eclipse.

Confidential, Brooklyn, NY

Java/Hadoop Developer

Responsibilities:

  • Loaded data from the UNIX file system into HDFS.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data from HBase through Sqoop and placed it in HDFS for further processing.
  • Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Involved in creating Hive tables, loading data and running Hive queries on that data.
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties and the Thrift server in Hive.
  • Wrote, optimized and tested Pig Latin scripts.
  • Working knowledge of writing Pig Load and Store functions.
  • Developed Java MapReduce programs to transform raw log data into structured form for deriving user location, age group and time spent; see the mapper sketch after this list.
  • Developed optimal strategies for distributing the web log data over the cluster, and imported and exported the stored web log data into HDFS and Hive using Sqoop.
  • Collected and aggregated large amounts of web log data from different sources such as web servers and mobile and network devices using Apache Flume, and pushed the data to HDFS for analysis.
  • Monitored multiple Hadoop cluster environments using Ganglia.
  • Developed Pig scripts for the analysis of semi-structured data.
  • Developed industry-specific UDFs (user-defined functions).
  • Analyzed the web log data using HiveQL to extract unique visitors per day, page views, visit duration and the most purchased products on the website.
  • Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (MapReduce, Pig, Hive and Sqoop) as well as system-specific jobs (Java programs and shell scripts).
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Managed and scheduled jobs on the Hadoop cluster using Oozie.
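
Illustrative sketch: the mapper half of a Java MapReduce log transformation like the ones described above. The space-separated log layout with the timestamp in field 3 mirrors common access-log formats and is an assumption here.

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Emits (date, 1) for every log line so a summing reducer can produce
    // page views per day.
    public class PageViewMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text day = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(" ");
            if (fields.length > 3 && fields[3].length() > 1) {
                // e.g. "[10/Oct/2016:13:55:36" -> keep only the date part
                day.set(fields[3].substring(1).split(":")[0]);
                context.write(day, ONE);
            }
        }
    }

Paired with the stock org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer, this yields page views per day; a unique-visitors variant would instead emit (date, visitorId) and count distinct values in a custom reducer.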

Environment: Apache Hadoop 1.0.1, MapReduce, HDFS, HBase, Hive, Pig, Oozie, Flume, Java, Eclipse, Sqoop, Ganglia.

Confidential

Java Developer

Responsibilities:

  • Involved in translating business domain concepts into use cases, sequence diagrams, class diagrams, component diagrams and implementation diagrams.
  • Implemented various J2EE design patterns such as Model-View-Controller, Data Access Object, Business Delegate and Transfer Object.
  • Responsible for analysis and design of the application based on MVC architecture, using the open-source Struts framework.
  • Wrote stored procedures and used Java APIs to call them.
  • Developed various test cases, including unit tests, mock tests and integration tests, using JUnit; see the sketch after this list.
  • Used Log4j for logging in the applications.
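
Illustrative sketch: a minimal JUnit 4 unit test in the style described above. InterestCalculator is a hypothetical stand-in for the application's business components, defined inline so the test compiles on its own.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class InterestCalculatorTest {

        // Hypothetical business component, included so the test is self-contained.
        static class InterestCalculator {
            double accrue(double principal, double rate, int years) {
                return principal * Math.pow(1.0 + rate, years);
            }
        }

        @Test
        public void accruesCompoundInterestForOneYear() {
            InterestCalculator calc = new InterestCalculator();
            // 1000 at 5% for one year should accrue to 1050.
            assertEquals(1050.0, calc.accrue(1000.0, 0.05, 1), 0.001);
        }
    }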

Environment: Java, J2EE, Struts MVC, Tiles, JDBC, JSP, JavaScript, HTML, Spring IOC, Spring AOP, JAX-WS, Ant, Web sphere Application Server, Oracle, JUNIT and Log4j, Eclipse

Confidential

Programmer Analyst

Responsibilities:

  • Involved in the design and development phases of the Rational Unified Process (RUP).
  • Involved in creating UML diagrams such as class, activity and sequence diagrams using IBM Rational Rose modeling tools.
  • Involved in fixing bugs in various modules raised by the testing teams during the integration testing phase.
  • Participated in code reviews.
  • Used the Log4J logging framework for logging messages; see the sketch after this list.
  • Used Rational ClearQuest for bug tracking.
  • Involved in deploying the application on IBM WebSphere Application Server.
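
Illustrative sketch: the typical Log4J 1.x usage pattern referenced above; the class name and messages are illustrative.

    import org.apache.log4j.Logger;

    public class PaymentProcessor {
        // One logger per class, named after the class, is the usual convention.
        private static final Logger LOG = Logger.getLogger(PaymentProcessor.class);

        public void process(String paymentId) {
            LOG.debug("Processing payment " + paymentId);
            try {
                // ... business logic ...
                LOG.info("Payment " + paymentId + " processed successfully");
            } catch (RuntimeException e) {
                LOG.error("Payment " + paymentId + " failed", e);
                throw e;
            }
        }
    }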

Environment: Java, J2EE, Hibernate, XML, XML Schemas, JSP, HTML, CSS, PL/SQL, JUnit, Log4j.
