
Hadoop Developer Resume


Kansas, MO

SUMMARY

  • Hadoop Developer with 8 years of overall IT experience, including 4 years in the Hadoop ecosystem.
  • Expertise in Hadoop ecosystem components: HDFS, MapReduce, Hive, Pig, Spark, Sqoop, HBase, Kafka, Oozie, Flume, and Cassandra.
  • Hands-on experience streaming live data from DB2 into HBase tables using Spark Streaming and Apache Kafka (an illustrative sketch follows this summary).
  • Expertise in Hive queries; created user-defined aggregate functions, applied advanced optimization techniques, and have extensive knowledge of joins.
  • Created partitions and buckets keyed on state for further processing with bucketed Hive joins.
  • Experience designing and developing HBase tables and storing aggregated data from Hive tables.
  • Good knowledge of NoSQL databases: Cassandra and HBase.
  • Experience supporting data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud, including exporting and importing data to S3 and Redshift.
  • Knowledge of the Scala programming language for developing Spark applications.
  • Worked with a wide range of file formats, such as Avro, SequenceFile, Parquet, and plain text, for both importing into and exporting from HDFS.
  • Good understanding of classic Hadoop and YARN architecture, along with the various Hadoop daemons: JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, ResourceManager, NodeManager, ApplicationMaster, and containers.
  • Experience importing and exporting data with Sqoop between HDFS and relational database systems/mainframes.
  • Expertise in writing MapReduce jobs in Java to process large structured, semi-structured, and unstructured data sets and store them in HDFS.
  • Deep knowledge of the core concepts of the MapReduce framework and the Hadoop ecosystem.
  • Hands-on experience cleansing semi-structured and unstructured data using Pig Latin scripts.
  • Experience in managing and reviewing Hadoop log files.
  • Tested raw data and executed performance scripts using MRUnit.
  • Worked extensively with data migration, data cleansing, data profiling, and ETL processes for data warehouses.
  • Experience with BI visualization tools such as Tableau and QlikView.
  • Worked on predictive modeling techniques such as neural networks, decision trees, and regression analysis.
  • Experience handling multiple relational databases: MySQL, SQL Server, PostgreSQL, and Oracle.
  • Good knowledge in Software Development Life Cycle (SDLC) and Software Testing Life Cycle (STLC).
  • Experience working both independently and collaboratively to solve problems and deliver high quality results in a fast-paced, unstructured environment.
  • Delivery assurance: quality-focused and process-oriented, with strong analytical and problem-solving skills.
  • Excellent communication and interpersonal skills; a detail-oriented, deadline-driven, responsible team player and quick learner with a high degree of self-motivation and the ability to coordinate in a team environment.
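
The sketch below is a minimal, illustrative example of the kind of Spark Streaming job referenced above: it consumes change records from a Kafka topic and writes them into an HBase table. The broker address, consumer group, topic, table, and column family names are assumed placeholders rather than details of the actual project, which per the summary sourced its stream from DB2.

```java
// Hedged sketch only: broker, group id, topic, HBase table, and column family
// names below are illustrative placeholders, not details from the project.
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHBaseStreamingJob {
    public static void main(String[] args) throws Exception {
        JavaStreamingContext jssc = new JavaStreamingContext(
                new SparkConf().setAppName("kafka-to-hbase"), Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");          // placeholder
        kafkaParams.put("group.id", "db2-change-consumer");            // placeholder
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("db2-change-events"), kafkaParams));

        stream.foreachRDD(rdd -> rdd.foreachPartition(records -> {
            // One HBase connection per partition, reused for every record in it.
            Configuration hbaseConf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(hbaseConf);
                 Table table = conn.getTable(TableName.valueOf("customer_events"))) {
                while (records.hasNext()) {
                    ConsumerRecord<String, String> rec = records.next();
                    String rowKey = rec.key() != null
                            ? rec.key()
                            : rec.topic() + "-" + rec.partition() + "-" + rec.offset();
                    Put put = new Put(Bytes.toBytes(rowKey));
                    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"),
                                  Bytes.toBytes(rec.value()));
                    table.put(put);
                }
            }
        }));

        jssc.start();
        jssc.awaitTermination();
    }
}
```

A job like this would be packaged with the spark-streaming-kafka-0-10 and HBase client dependencies and launched with spark-submit.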

TECHNICAL SKILLS

Java Technologies: JDK, J2EE, Hibernate, Struts, Servlet, JSP, EJB, JDBC, Spring.

Web Technologies: HTML, JavaScript, CSS, AJAX, AngularJS, jQuery.

Databases: Oracle, MS SQL Server, DB2

Big Data Frameworks: Hadoop, Spark, Scala, Hive, Kafka, Cassandra, HBase, Flume, Pig, Sqoop, MapReduce, Oozie, NiFi, MongoDB, Impala, HCatalog.

Big Data Distributions / Cloud: Cloudera, Hortonworks, Amazon EMR, AWS

Messaging Services: ActiveMQ, Kafka

SDLC methodologies: Agile, Waterfall

IDEs and Tools: Eclipse, NetBeans, JBoss Developer Studio, MyEclipse

Build Tools: Apache Ant, Maven

Logging Tools: Log4j

Frameworks: Jakarta Struts 1.x, Spring 2.x

Test Framework: JUnit

Analytics: Tableau, Kibana

Version Control: Git, CVS

Operating Systems: Windows, Linux (Ubuntu, CentOS), iOS

PROFESSIONAL EXPERIENCE

Confidential, Kansas, MO

Hadoop Developer

Responsibilities:

  • Developed efficient MapReduce programs in Java for filtering out unstructured data (see the illustrative sketch at the end of this role).
  • Imported data from various relational data stores into HDFS using Sqoop.
  • Exported the information required by the business to an RDBMS using Sqoop so the BI team could generate reports from the data.
  • Responsible for installing and configuring Hadoop MapReduce and HDFS; also developed various MapReduce jobs for data cleaning.
  • Installed and configured Hive to create tables for the unstructured data in HDFS.
  • Developed strong expertise in major Hadoop ecosystem components, including Hive, Pig, HBase, HBase-Hive integration, and Sqoop.
  • Involved in loading data from the UNIX file system into HDFS.
  • Responsible for managing and scheduling jobs on the Hadoop cluster.
  • Responsible for importing and exporting data into HDFS and Hive using Sqoop.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Experienced in managing Hadoop log files.
  • Worked on managing data coming from different sources.
  • Wrote HiveQL queries to create tables and loaded data from HDFS to give it structure.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Worked extensively with Hive to transform files from various analytical formats into plain text (.txt), making the data easier to view for further analysis.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Provided a step-by-step process using Ambari during cluster installation.
  • Wrote and modified stored procedures to load and modify data per project requirements.
  • Responsible for developing Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Used Flume extensively to collect log files from the web servers and integrate them into HDFS.
  • Responsible for implementing schedulers on the JobTracker so that MapReduce jobs could make effective use of the resources available in the cluster.
  • Continuously tuned the performance of Hive and Pig queries to make data processing and retrieval more efficient.
  • Supported MapReduce programs running on the cluster.
  • Created external tables in Hive and loaded data into them.
  • Hands-on experience in database performance tuning and data modeling.
  • Monitored cluster coordination using ZooKeeper.

Environment: Hadoop v1.2.1, HDFS, MapReduce, Hive, Sqoop, Pig, DB2, Oracle, XML, CDH4.x
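
As an illustration of the filtering work described in this role, the sketch below shows a map-only MapReduce job in Java that keeps pipe-delimited records with the expected number of fields and counts the rest as malformed; the delimiter, field count, and class names are assumptions made for the example.

```java
// Minimal sketch of a filtering job; the record layout and counter names are
// assumed for illustration, not taken from the actual project.
import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class FilterRecordsJob extends Configured implements Tool {

    // Map-only job: keep pipe-delimited records with the expected field count,
    // count and drop everything else.
    public static class FilterMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
        private static final int EXPECTED_FIELDS = 12;   // assumed record width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length == EXPECTED_FIELDS) {
                context.write(NullWritable.get(), value);
            } else {
                context.getCounter("DataQuality", "MALFORMED_RECORDS").increment(1);
            }
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "filter-unstructured-records");
        job.setJarByClass(FilterRecordsJob.class);
        job.setMapperClass(FilterMapper.class);
        job.setNumReduceTasks(0);                        // map-only
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new FilterRecordsJob(), args));
    }
}
```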

Confidential -Tampa, FL

Hadoop/Spark Developer

Responsibilities:

  • As this was a ground-up project, we developed the entire application from scratch; I worked mainly on writing the Kafka producer and Kafka consumer code per our requirements (see the illustrative sketch at the end of this role).
  • After the data is successfully persisted to the Kafka brokers, it is written to a flat file from which we load it into a Hive table.
  • Defined and created the structure of the Hive table on one side and the HBase table on the other.
  • Developed a Spark pipeline to transfer data from the data lake to Cassandra in the cloud, making the data available for the decision engine to publish customized offers in real time.
  • Worked on big data integration and analytics based on Hadoop, Spark, Kafka, and webMethods technologies.
  • Performed complex mathematical and statistical analysis using Spark MLlib, Spark Streaming, and GraphX.
  • Developed data pipelines using Sqoop, Pig, Java MapReduce, and Spark to ingest customer behavioral data and purchase histories into HDFS for analysis.
  • Used Storm to consume events coming through Kafka and generate sessions and publish them back to Kafka.
  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark using Scala.

Environment: Hadoop v2.6.0, HDFS, CDH 5.3.x, MapReduce, HBase, Sqoop, Core Java, Hive, Spark, Oozie, DB2.
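
The sketch below illustrates the kind of Kafka producer and consumer code described for this role: the producer publishes keyed events to a topic, and the consumer appends them to the flat file that feeds the Hive load. The topic, broker, and group names are placeholders, and the modern Kafka client API is shown for brevity even though the original project would likely have used an older client.

```java
// Hedged sketch only: the topic name, broker address, and consumer group id are
// assumed placeholders; serialization is plain strings for simplicity.
import java.io.PrintWriter;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OfferEventPipeline {

    private static final String TOPIC = "offer-events";        // placeholder
    private static final String BROKERS = "broker1:9092";      // placeholder

    // Producer side: publish one event, keyed by customer id so that events for
    // the same customer land on the same partition and stay ordered.
    public static void publish(String customerId, String payload) {
        Properties props = new Properties();
        props.put("bootstrap.servers", BROKERS);
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>(TOPIC, customerId, payload));
        }
    }

    // Consumer side: poll the topic and append each record to the flat file that
    // the downstream Hive load reads.
    public static void consume(PrintWriter flatFile) {
        Properties props = new Properties();
        props.put("bootstrap.servers", BROKERS);
        props.put("group.id", "hive-flat-file-loader");          // placeholder
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList(TOPIC));
            while (!Thread.currentThread().isInterrupted()) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    flatFile.println(record.key() + "\t" + record.value());
                }
                flatFile.flush();
            }
        }
    }
}
```

In practice a single long-lived producer instance would be reused rather than created on every call.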

Confidential - Phoenix, AZ

Hadoop Developer

Responsibilities:

  • Imported large data sets from MySQL into Hive tables using Sqoop.
  • Involved in analyzing user requirements and identifying resources. Developed reusable mapplets and transformations. Worked on data modeling of a snowflake schema in the data mart. Experienced with PowerCenter client tools for data staging and other transformations.
  • Worked on various kinds of transformations, such as expression, aggregator, stored procedure, lookup, filter, join, rank, and router, for ETL from the data warehouse.
  • Worked on sessions for loading data into targets and was involved in writing queries for staging.
  • Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, tuple stores, NoSQL, Hadoop, Pig, MySQL, and Oracle.
  • Worked with Oracle Data Integrator for ELT; experienced with its navigators and with concepts such as sessions, execution, and scheduling.
  • Created a new mapping to pull data into the target using lookup tables, aggregators, and joins.
  • Hands-on experience with data loading and integration strategies with a focus on data quality.
  • Created Hive managed and external tables per the requirements.
  • Designed and developed HBase tables and stored aggregated data from Hive.
  • Developed Hive scripts for extract, transform, and load (ETL) from MySQL.
  • Wrote custom Java UDFs for processing data in Hive (see the illustrative sketch at the end of this role).
  • Developed and maintained workflow scheduling jobs in Oozie for importing data from RDBMS sources into Hive.
  • Defined the Hive tables created per requirements as managed or external tables with appropriate static and dynamic partitions for efficiency.
  • Implemented partitioning and bucketing in Hive for better organization of the data.
  • Optimized Hive queries for performance tuning.
  • Involved with the team in streaming live data from DB2 into HBase tables using Spark Streaming and Apache Kafka.
  • Experienced in using the Avro, Parquet, RCFile, and JSON file formats, and developed UDFs for Hive and Pig.
  • Set up Oozie workflows to run several MapReduce jobs.
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties, and the Thrift server in Hive.
  • Worked with different file formats, such as XML, SequenceFile, JSON, CSV, and MapFile, using MapReduce programs.
  • Continuously monitored and managed Hadoop cluster using Cloudera Manager.
  • Performed POCs using the latest technologies, such as Spark, Kafka, and Scala.
  • Worked on the conversion of existing MapReduce batch applications to Spark for better performance.

Environment: Hadoop v2.4.0, HDFS, MapReduce, Core Java, Oozie, Hive, Sqoop, CDH 4.x
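
The following is a minimal sketch of a custom Java UDF for Hive of the kind mentioned in this role; the function name and normalization rule are invented for illustration.

```java
// Hedged sketch of a Hive UDF; the normalization logic and function name are
// assumed for illustration, not taken from the actual project.
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

@Description(name = "normalize_phone",
             value = "_FUNC_(str) - strips non-digits and returns a 10-digit phone number")
public final class NormalizePhoneUDF extends UDF {

    public Text evaluate(Text input) {
        if (input == null) {
            return null;                       // pass NULLs through
        }
        String digits = input.toString().replaceAll("[^0-9]", "");
        if (digits.length() == 11 && digits.startsWith("1")) {
            digits = digits.substring(1);      // drop a leading country code
        }
        return digits.length() == 10 ? new Text(digits) : null;
    }
}
```

After packaging the class into a jar and adding it to the Hive session, the function would typically be registered with CREATE TEMPORARY FUNCTION before being used in queries.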

Confidential

Java Developer

Responsibilities:

  • Designed classes in UML using OOAD techniques with the help of the Rational Rose tool.
  • Created a user-friendly GUI and web pages using HTML and DHTML embedded in JSP.
  • Used JavaScript for client-side validations.
  • Designed and developed a generic validator framework for the modules and injected these validators using the Hibernate framework.
  • Created Hibernate mapping files for all database tables.
  • Developed GUI screens using JSF (IBM implementation) with Ajax functionality.
  • Developed and deployed EJBs (session and entity beans) to implement the business logic and handle various interactions with the database.
  • Involved in debugging the application.
  • Developed servlets that use JDBC to store and retrieve user data in the SQL database (see the illustrative sketch at the end of this role).
  • Used WebLogic Application Server to deliver high-performance, scalable enterprise applications that enhance business interactions and transactions between the company and its key constituencies.
  • Wrote database objects such as triggers and stored procedures in SQL.
  • Interacted with users and documented the system.
  • Used HP QA to manage defects and issues.

Environment: JSP 2.0, JDBC, HTML, OOAD, Servlets, Web Services, WSAD 5.0, UML, Java 1.6, EJB 2.0, JSF, QA, Hibernate, AJAX, Windows 7/XP, CVS, XML/XSL.
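
As an illustration of the servlet and JDBC work described in this role, the sketch below shows a servlet that repeats the client-side JavaScript checks on the server and persists the submitted data through a pooled JDBC connection; the JNDI name, table, columns, and JSP page names are placeholders.

```java
// Hedged sketch only: the datasource JNDI name, table, columns, and page names
// are assumed placeholders, not details from the actual application.
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.sql.DataSource;

public class RegisterUserServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String email = req.getParameter("email");
        String name  = req.getParameter("name");

        // Server-side validation duplicating the JavaScript checks on the JSP.
        if (email == null || !email.matches("[^@\\s]+@[^@\\s]+\\.[^@\\s]+")
                || name == null || name.trim().isEmpty()) {
            req.setAttribute("error", "Please supply a valid name and email.");
            req.getRequestDispatcher("/register.jsp").forward(req, resp);
            return;
        }

        try {
            // Container-managed connection pool looked up via JNDI (name assumed).
            DataSource ds = (DataSource) new InitialContext()
                    .lookup("java:comp/env/jdbc/AppDB");
            try (Connection con = ds.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "INSERT INTO users (name, email) VALUES (?, ?)")) {
                ps.setString(1, name.trim());
                ps.setString(2, email);
                ps.executeUpdate();
            }
            resp.sendRedirect(req.getContextPath() + "/welcome.jsp");
        } catch (NamingException | SQLException e) {
            throw new ServletException("User registration failed", e);
        }
    }
}
```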

Confidential

Java Developer

Responsibilities:

  • Worked as a software developer for Confidential, building a supply chain management system.
  • The application involved tracking invoices, raw materials, and finished products.
  • Gathered user requirements and specifications.
  • Developed the entire application on Eclipse IDE.
  • Developed and programmed the required classes in Java to support the User account module.
  • Used HTML, JSP and JavaScript for designing the front-end user interface.
  • Implemented error checking/validation on the Java Server Pages using JavaScript.
  • Developed servlets to handle requests, perform server-side validation, and generate results for the user.
  • Used the JDBC interface to connect to the database.
  • Used SQL to access data in the Microsoft SQL Server database.
  • Performed user acceptance testing.
  • Deployed and tested the web application on the WebLogic application server.

Environment: JDK 1.4, Servlet 2.3, JSP 1.2, JavaScript, HTML, JDBC 2.1, SQL, MS SQL Server, UNIX, and BEA WebLogic Application Server.
