
Hadoop Developer Resume


Overland Park, KS

SUMMARY:

  • 8+ years of total software development experience with the Hadoop ecosystem, Big Data and data science analytical platforms, Java/J2EE technologies, database management systems, and enterprise-level cloud-based computing and applications.
  • Around 3 years of experience in design and implementation of Big Data applications using the Hadoop stack: MapReduce, Hive, Pig, Oozie, Sqoop, Flume, HBase, and NoSQL databases.
  • Hands-on experience in writing complex MapReduce jobs and Pig scripts, and in Hive data modeling.
  • Experience creating batch-style distributed computing applications using Apache Spark and Flume.
  • Hands-on experience performing analytics using Spark SQL (a minimal sketch appears after this list).
  • Hands-on experience with, and in-depth understanding of, the Hadoop architecture and its various components.
  • In-depth understanding of and experience in analyzing data using HiveQL and Pig.
  • Worked extensively with Hive DDL and the Hive Query Language (HQL); developed UDF, UDAF, and UDTF functions and used them in Hive queries.
  • Good hands-on experience with Pivotal's SQL-on-Hadoop query engine, HAWQ.
  • In-depth understanding of NoSQL databases such as HBase.
  • Proficient knowledge and hands-on experience in writing shell scripts on Linux.
  • Adequate knowledge and working experience in Agile & Waterfall methodologies.
  • Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
  • Have a good understanding of Kafka.
  • Experienced in job workflow scheduling and monitoring tools such as Oozie and Cisco Tidal.
  • Experience using various Hadoop distributions (Pivotal, Hortonworks, MapR, etc.) to fully implement and leverage new Hadoop features.
  • Expertise in Hadoop ecosystem tools, including HDFS, YARN, MapReduce, Pig, Hive, Sqoop, Flume, Kafka, Spark, ZooKeeper, and Oozie.
  • Experienced in requirement analysis, application development, application migration and maintenance using Software Development Lifecycle (SDLC) and Java/J2EE technologies.
  • Experience in client/server and systems software design and development using Java/JDK, JavaBeans, and J2EE technologies such as Spring, Struts, Hibernate, Servlets, JSP, JBoss, JavaScript, and JDBC, along with web technologies like HTML, CSS, PHP, and XML.
  • Experienced in backend development using SQL and stored procedures on Oracle 9i, 10g, and 11g.
  • Worked on various tools and IDEs, including Eclipse, IBM Rational, Visio, the Apache Ant build tool, MS Office, PL/SQL Developer, and SQL*Plus.
  • Expertise in full life-cycle system development: requirement elicitation and creating use cases, class diagrams, and sequence diagrams.
  • Conscientious team player and motivated to learn and apply new concepts. Always aspires to exceed client expectations and to effectively collaborate with several cross-functional teams.
  • Worked with geographically distributed and culturally diverse teams, including roles involving interaction with clients and team members.
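As a minimal, hedged illustration of the Spark SQL analytics referenced in the summary above, the following PySpark sketch queries a Hive-backed table; the table name (web_events) and its columns are hypothetical and not taken from any of the projects below.

    from pyspark.sql import SparkSession

    # Hive-enabled session so Spark SQL can query existing Hive tables
    spark = (SparkSession.builder
             .appName("spark-sql-analytics")
             .enableHiveSupport()
             .getOrCreate())

    # Hypothetical daily aggregation over a Hive table named web_events
    daily_counts = spark.sql("""
        SELECT event_date, COUNT(*) AS events
        FROM web_events
        GROUP BY event_date
        ORDER BY event_date
    """)
    daily_counts.show()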

TECHNICAL SKILLS:

Big Data/Hadoop Technologies: MapReduce, Pig, Hive, Sqoop, Flume, HDFS, Kafka, Oozie, HAWQ

NoSQL Databases: HBase

Real-Time/Stream Processing: Apache Spark, Apache Kafka

Programming Languages: Java, C++, C, SQL, PL/SQL, Python, Scala

Java Technologies: Servlets, JavaBeans, JDBC, JNDI, JTA, JPA, EJB 3.0

Frameworks: JUnit, JTest, LDAP

Databases: Oracle 8i/9i, MySQL, MS SQL Server, PostgreSQL

IDE's & Utilities: Eclipse, NetBeans

Web Dev. Technologies: HTML, XML

Protocols: TCP/IP, HTTP and HTTPS

Operating Systems: Linux, MacOS, Windows 8, Windows 7, Vista, XP, Windows 95/2000 and MS-DOS

PROFESSIONAL EXPERIENCE:

Confidential, Overland Park, KS

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using a Hadoop cluster environment with the Hortonworks distribution.
  • Worked on Kafka and REST APIs to collect and load data onto the Hadoop file system, and used Sqoop to load data from relational databases.
  • Implemented Talend jobs to load data from Excel sheets and integrated them with Kafka.
  • Used Spark Streaming APIs to perform the necessary transformations and actions on the data received from Kafka and persisted it into the Cassandra database (a minimal sketch appears after this list).
  • Developed Spark scripts in Scala and Python, writing custom RDD transformations and actions.
  • Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala and Python.
  • Worked with Python to develop analytical jobs using Spark's lightweight PySpark API.
  • Worked with Avro, ORC file formats and compression techniques like LZO.
  • Used Hive to form an abstraction layer on top of structured data residing in HDFS and implemented partitions, dynamic partitions, and buckets on Hive tables (see the partitioning sketch below).
  • Used Spark API over Hadoop YARN as execution engine for data analytics using Hive.
  • Worked on migrating MapReduce programs into Spark transformations using Scala.
  • Designed and developed data integration programs in a Hadoop environment with the NoSQL data store Cassandra for data access and analysis.
  • Used the job management scheduler Apache Oozie to execute workflows.
  • Used Ambari to monitor node health and job status and to run analytics jobs on the Hadoop clusters.
  • Worked on Tableau to build customized interactive reports, worksheets and dashboards.
  • Implemented Kerberos for strong authentication to provide data security.
  • Involved in performance tuning of Spark jobs using caching and by taking full advantage of the cluster environment.
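The Kafka-to-Spark-Streaming-to-Cassandra flow described above can be summarized in a minimal PySpark Streaming sketch. It is illustrative only: the topic name ("events"), broker and Cassandra hosts, keyspace, table, and JSON fields are assumptions rather than project details.

    import json
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils
    from cassandra.cluster import Cluster  # DataStax Python driver

    sc = SparkContext(appName="kafka-to-cassandra")
    ssc = StreamingContext(sc, 10)  # 10-second micro-batches

    # Direct Kafka stream of (key, value) pairs; topic and broker are hypothetical
    stream = KafkaUtils.createDirectStream(
        ssc, ["events"], {"metadata.broker.list": "broker1:9092"})

    def save_partition(records):
        # One Cassandra session per partition, not per record
        session = Cluster(["cassandra-host"]).connect("analytics")
        for _, value in records:
            event = json.loads(value)
            session.execute(
                "INSERT INTO events (id, payload) VALUES (%s, %s)",
                (event["id"], value))

    # Simple transformation (drop empty messages) followed by persistence into Cassandra
    stream.filter(lambda kv: kv[1]) \
          .foreachRDD(lambda rdd: rdd.foreachPartition(save_partition))

    ssc.start()
    ssc.awaitTermination()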

Environment: Hadoop, HDP, Spark, Scala, Python, Kafka, Hive, Sqoop, Ambari, Mesos, Talend, Oozie, Cassandra, Tableau, Jenkins, Hortonworks, Amazon AWS and Red Hat Linux.
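As a hedged sketch of the Hive partitioning and dynamic-partition loading mentioned in the responsibilities above, the statements below could be submitted from PySpark via spark.sql; the table names (events_part, staging_events) and columns are hypothetical. Bucketing would be added in the Hive DDL with a CLUSTERED BY ... INTO n BUCKETS clause.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-partitioning")
             .enableHiveSupport()
             .getOrCreate())

    # Hypothetical partitioned Hive table, stored as ORC
    spark.sql("""
        CREATE TABLE IF NOT EXISTS events_part (
            event_id STRING,
            cust_id  STRING,
            payload  STRING
        )
        PARTITIONED BY (load_date STRING)
        STORED AS ORC
    """)

    # Allow dynamic partitioning so load_date is derived from the data itself
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
        INSERT OVERWRITE TABLE events_part PARTITION (load_date)
        SELECT event_id, cust_id, payload, load_date FROM staging_events
    """)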

Confidential, KS

Hadoop Developer

Responsibilities:

  • Built a Python script to extract data from Hawq tables and generate a ".dat" file for the downstream application.
  • Built a generic Python framework to parse fixed-length raw data, driven by a JSON layout that specifies the fixed field positions, and load the data into Hawq tables (a minimal sketch appears after this list).
  • Built a generic Python framework that transforms two or more data sets in HDFS.
  • Built generic Python frameworks around Sqoop and Hawq to load data from SQL Server to HDFS and from HDFS into Hawq.
  • Performed extensive data validation using Hawq partitions for efficient data access.
  • Built a generic Python framework that allows us to update data in Hawq tables.
  • Coordinated all testing phases and worked closely with the performance testing team to create a baseline for the new application.
  • Created automated workflows that schedule daily jobs for loading data and other transformation jobs using Cisco Tidal.
  • Created PostgreSQL functions (stored procedures) to populate data into the tables on a daily basis.
  • Developed functions using PL/Python for various use cases.
  • Wrote Scala programs to support the Play framework and act as code-behind for the front-end application.
  • Developed multiple Kafka topics/queues and produced 20 million records using the producer.
  • Wrote various data types such as complex JSON, canonical JSON, and XML data to Kafka topics.
  • Developed the data ingestion and acquisition code using Spring XD streams to Kafka.
  • The source data is JSON, which is converted into Avro byte format and then published to Kafka (see the producer sketch below).
  • Documented technical design documents and production support documents.
  • Worked on SSIS and SSRS tools to aid in decommissioning data from SQL Server to the distributed environment.
  • Wrote Python scripts to create automated workflows.
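A minimal sketch of the JSON-layout-driven fixed-width parser described above. The layout format, field names, and file paths are illustrative assumptions, not the project's actual layout.

    import json

    def load_layout(path):
        # Layout maps each field name to its [start, end) positions in the record,
        # e.g. {"acct_id": [0, 10], "amount": [10, 18]}
        with open(path) as f:
            return json.load(f)

    def parse_fixed_width(data_path, layout):
        # Yield one dict per fixed-length record, sliced by the layout positions
        with open(data_path) as f:
            for line in f:
                yield {field: line[start:end].strip()
                       for field, (start, end) in layout.items()}

    if __name__ == "__main__":
        layout = load_layout("layout.json")
        for row in parse_fixed_width("raw_feed.dat", layout):
            print("|".join(row.values()))  # delimited output suitable for a Hawq load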

Environment: PHD-2.0, HAWQ 1.2, SQOOP 1.4, Python 2.6, SQL, Apache Kafka
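The JSON-to-Avro-to-Kafka publishing step above was driven by Spring XD streams; purely as an illustrative, standalone counterpart in Python (kafka-python plus the avro package), it might look like the following. The schema file, topic, and broker address are hypothetical.

    import io
    import json
    import avro.schema
    from avro.io import DatumWriter, BinaryEncoder
    from kafka import KafkaProducer

    # Hypothetical Avro schema describing the event records
    schema = avro.schema.parse(open("event.avsc").read())
    writer = DatumWriter(schema)

    def to_avro_bytes(record):
        # Serialize one dict into Avro binary format
        buf = io.BytesIO()
        writer.write(record, BinaryEncoder(buf))
        return buf.getvalue()

    producer = KafkaProducer(bootstrap_servers="broker1:9092")
    with open("events.json") as f:          # one JSON document per line
        for line in f:
            producer.send("events", to_avro_bytes(json.loads(line)))
    producer.flush()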

Confidential, Philadelphia, PA

Hadoop Developer

Responsibilities:

  • Pulled data from the data warehouse using Sqoop and placed it in HDFS.
  • Wrote MapReduce jobs to join data from multiple tables and convert it to CSV files.
  • Worked with the Play framework to design the front end of the application.
  • Wrote Scala programs to support the Play framework and act as code-behind for the front-end application.
  • Wrote programs in Java, and at times Scala, to implement intermediate functionality such as event and record counts from HBase.
  • Configured multiple remote Akka worker nodes and master nodes from scratch as per the software requirement specifications.
  • Also wrote Pig scripts to perform ETL transformations on the MapReduce-processed data.
  • Involved in review of functional and non-functional requirements.
  • Responsible for managing data coming from different sources.
  • Wrote shell scripts to pull the necessary fields from huge files generated by MapReduce jobs.
  • Converted ORC data from Hive into flat files using MapReduce jobs (an illustrative sketch follows this list).
  • Created Hive tables and worked with them using HiveQL.
  • Supported the existing MapReduce programs running on the cluster.
  • Followed agile methodology for the entire project.
  • Prepared technical design documents and detailed design documents.
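The ORC-to-flat-file conversion above was implemented with MapReduce jobs; as an illustrative equivalent only, a PySpark sketch of the same idea is shown below. The input and output paths and the delimiter are assumptions.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("orc-to-flat-file")
             .enableHiveSupport()
             .getOrCreate())

    # Read the ORC data backing a Hive table and write it out as a pipe-delimited flat file
    df = spark.read.orc("/apps/hive/warehouse/events_orc")
    df.write.mode("overwrite").option("sep", "|").csv("/data/exports/events_flat")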

Environment: Linux (Ubuntu), Hadoop 1.2.1 (pseudo-distributed mode), HDFS, Hive, Hortonworks, Flume.

Confidential, Los Angeles, CA

Hadoop Developer

Responsibilities:

  • Converted the existing relational database model to the Hadoop ecosystem.
  • Generated datasets and loaded them into the Hadoop ecosystem.
  • Worked with Linux systems and RDBMS databases on a regular basis to ingest data using Sqoop.
  • Worked with Spark to create structured data from the pool of unstructured data received.
  • Managed and reviewed Hadoop and HBase log files.
  • Involved in review of functional and non-functional requirements.
  • Responsible for managing data coming from various sources.
  • Implemented Kafka Java producers, created custom partitions, configured brokers, and implemented high-level consumers for the data platform.
  • Loaded CDRs from the relational DB into the Hadoop cluster using Sqoop, and from other sources using Flume.
  • Involved in loading data from UNIX file system and FTP to HDFS.
  • Designed and implemented HIVE queries and functions for evaluation, filtering, loading and storing of data.
  • Created Hive tables and worked with them using HiveQL.
  • Wrote Spark code to convert unstructured data to structured data.
  • Developed Hive queries to analyze the output data.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Developed Kafka consumer APIs in Scala for consuming data from Kafka topics (an illustrative sketch follows this list).
  • Handled cluster coordination services through ZooKeeper.
  • Collected log data from web servers and integrated it into HDFS using Flume.
  • Used HIVE to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
  • Design and implement Spark jobs to support distributed data processing.
  • Supported the existing MapReduce programs running on the cluster.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
  • Involved in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
  • Followed agile methodology for the entire project.
  • Installed and configured the Apache Hadoop, Hive, and Pig environment.
  • Prepared technical design documents and detailed design documents.
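The Kafka consumers above were written in Scala; as an illustrative Python counterpart (kafka-python), a minimal consumer might look like this. The topic, group id, and broker address are assumptions.

    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "cdr-events",                          # hypothetical topic name
        bootstrap_servers=["broker1:9092"],
        group_id="cdr-consumers",
        auto_offset_reset="earliest",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")))

    for message in consumer:
        record = message.value
        # Downstream handling (e.g. writing to an HDFS staging area) would go here
        print(message.topic, message.partition, message.offset, record)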

Environment: Linux (Ubuntu), Hadoop 1.2.1 (pseudo-distributed mode), HDFS, Hive 0.12, Flume, Kafka, Hortonworks, Spark.

Confidential, Atlanta, GA

Hadoop Developer

Responsibilities:

  • Utilized Flume to filter the input data so that only the data needed for analytics was retained, by implementing Flume interceptors.
  • Used Flume to transport logs to HDFS
  • Worked on a Pig script to count the number of times a URL was opened within a given duration; comparing these counts across URLs shows the relative popularity of each website among employees (an illustrative sketch follows this list).
  • Hive was used to pull out additional analytical information.
  • Worked on partitioning and bucketing concepts in Hive and designed both managed and external tables in Hive for optimized performance.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Worked on Hue interface for querying the data.
  • Involved in writing MapReduce programs for analytics.
  • Also used MapReduce to structure the data coming from Flume sinks.
  • Managed and scheduled jobs on a Hadoop cluster using Oozie.
  • Generated datasets and loaded them into the Hadoop ecosystem.
  • Performed installation and configuration, and used Hadoop ecosystem components such as MapReduce, HDFS, Pig, Hive, Sqoop, Flume, and HBase.
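The URL-popularity count above was written in Pig; purely as an illustration of the same counting idea, a small plain-Python sketch is shown below. The log path and the position of the URL field are assumptions about the log format.

    from collections import Counter

    counts = Counter()
    with open("/var/log/proxy/access.log") as f:   # hypothetical access log
        for line in f:
            fields = line.split()
            if len(fields) > 6:
                counts[fields[6]] += 1             # assume the URL is the 7th field

    # The most frequently opened URLs indicate relative popularity among employees
    for url, n in counts.most_common(10):
        print("%d\t%s" % (n, url))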

Environment: Hadoop, Cloudera Manager, Map Reduce, Hive, Flume, Pig.

Confidential, Houston, TX

Java Developer

Responsibilities:

  • Worked with several clients with day-to-day requests and responsibilities.
  • Involved in analyzing system failures, identifying root causes and recommended course of actions.
  • Integrated Struts, Hibernate, and JBoss Application Server to provide efficient data access.
  • Involved in HTML page Development using CSS and JavaScript.
  • Developed the presentation layer with JSF, JSP, and JavaScript technologies.
  • Designed table structure and coded scripts to create tables, indexes, views, sequence, synonyms and database triggers. Involved in writing Database procedures, Triggers, PL/SQL statements for data retrieval.
  • Developed the UI components using jQuery and JavaScript functionality.
  • Designed database and coded PL/SQL stored Procedures, triggers required for the project.
  • Used JSF Session and FacesContext objects to pass content from one bean to another.
  • Designed and developed Session Beans to implement business logic.
  • Tuned SQL statements and the WebSphere application server to improve performance and consequently met the SLAs.
  • Created the EAR and WAR files and deployed the application in different environments.
  • Engaged in analyzing requirements, identifying various individual logical components, expressing the system design through UML diagrams using Rational Rose.
  • Involved in running shell scripts for regression testing.
  • Extensively used HTML and CSS in developing the front-end.
  • Designed and Developed JSP pages to store and retrieve information.

Environment: Java, J2EE, JSP, JavaScript, JSF, Spring, XML, XHTML, Oracle 9i, PL/SQL, SOAP web services, WebSphere, JUnit, SVN.

Confidential

Graduate Trainee/Programmer Analyst

Responsibilities:

  • Prepared program Specification for the development of PL/SQL procedures and functions.
  • Created Custom Staging Tables to handle import data.
  • Created custom triggers, stored procedures, packages, and functions to populate different databases.
  • Developed SQL*Loader scripts to load data into the custom tables.
  • Ran batch files for loading database tables from flat files using SQL*Loader.
  • Created UNIX Shell Scripts for automating the execution process.
  • Developed PL/SQL code for updating payment terms.
  • Created indexes on tables and optimized stored procedure queries.
  • Designed, developed, and tested reports using SQL*Plus.
  • Modified existing code and developed PL/SQL packages to perform certain specialized functions/enhancements on Oracle Applications.
  • Created Indexes and partitioned the tables to improve the performance of the query.
  • Involved in preparing documentation and user support documents.
  • Involved in preparing test plans, unit testing, System integration testing, implementation and maintenance.

Environment: Oracle 9i/10g, PL/SQL, SQL*Loader, SQL Navigator, SQL*Plus, UNIX, Windows NT, Windows 2000.
