Sr. Hadoop Developer Resume

Pittsburgh, PA

PROFESSIONAL SUMMARY:

  • 8 years of professional IT experience, including more than 3 years in the Big Data ecosystem and related technologies.
  • Experience working with the Hadoop ecosystem (Pig, Hive, Oozie, HBase, Flume, Zookeeper, Sqoop).
  • Worked with NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Developed and implemented MapReduce programs for analyzing Big Data in different forms, including structured and unstructured data.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and vice versa.
  • Knowledge of tools such as Spark, Kafka, Storm, Hue, Zookeeper, and the Solr search engine.
  • Expertise in cleansing and analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Involved in importing streaming logs and aggregating the data to HDFS through Flume.
  • Experience in writing custom UDFs for extending Hive functionalities.
  • Working knowledge of the basics of R programming.
  • Scheduled workflows using Oozie with the help of Oozie coordinators and Zookeeper.
  • Ability to develop Pig UDFs to pre-process data for analysis.
  • Knowledge of different file formats such as Avro, Parquet, and JSON.
  • Worked extensively with data migration, data cleansing, and ETL processes for data warehouses.
  • Good experience working with the Cloudera distribution in standalone, pseudo-distributed, and fully distributed modes.
  • Used Agile, Waterfall, and Scrum methodologies to develop applications.
  • Good experience with Core Java, including OOP concepts, multithreading, collections, and exception handling.
  • Good work experience developing web applications, covering the front end/UI, using web technologies such as HTML, XHTML, CSS, JavaScript, jQuery, JSON, XML, and AJAX.
  • Designed and developed a web-based client using Servlets, JSP, JavaScript, and XML with the Spring MVC framework.
  • Involved in the ETL process using the Ab Initio tool to set up data extraction from several databases.
  • Experience in developing and deploying applications using WebSphere Application Server, Tomcat, and WebLogic.
  • Expertise in debugging applications and unit testing them using JUnit.
  • Implemented web services using SOAP and REST.
  • Experienced in analyzing business requirements and translating requirements into functional and technical design specifications using UML.
  • Strong and effective problem-solving, analytical and interpersonal skills, besides being a valuable team player.

TECHNICAL SKILLS:

Big Data Technologies: HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Zookeeper, Oozie, YARN, Spark, Avro, Impala, Parquet, HBase, Kafka, Tableau, Cascading

Programming Languages: C, C++, Java, R, Scala

Scripting Languages: Python and Shell

Front End Technologies: HTML, J2EE, JavaScript, CSS, JSP, Servlets, XML.

Web Services: RESTful, SOAP

Application Server: WebSphere, WebLogic Server, Apache Tomcat.

Java Frameworks: Spring, Hibernate, Struts

Development Methodologies: Agile, Waterfall and Scrum

Databases: Oracle, MySQL, HBase, Cassandra, MongoDB

PROFESSIONAL EXPERIENCE:

Confidential, Pittsburgh, PA

Sr. Hadoop Developer

Responsibilities:

  • Involved in creating Hive tables, and loading and analyzing data using Hive queries
  • Analyzed large data sets by running Hive queries and Pig scripts
  • Developed simple to complex MapReduce jobs using Hive and Pig
  • Involved in running Hadoop jobs for processing millions of records of text data
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
  • Involved in loading data from the Linux file system to HDFS
  • Responsible for managing data from multiple sources
  • Extracted files from relational databases through Sqoop, placed them in HDFS, and processed them
  • Ran Hadoop streaming jobs to process terabytes of XML-format data
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data
  • Responsible for managing data coming from different sources
  • Assisted in exporting analyzed data to relational databases using Sqoop
  • Managed and reviewed Hadoop log files
  • Loaded log data into HDFS using Flume
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS
  • Used JDBC for database connectivity with MySQL Server
  • Worked extensively on ETL processes consisting of data transformation, data sourcing, mapping, conversion, and loading using Talend
  • Experience in working with Spark Streaming.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Deep JVM knowledge and extensive experience with functional programming languages such as Scala
  • Involved in converting Hive/SQL queries into Spark transformations and actions using Spark SQL (RDDs and DataFrames) in Python and Scala
  • Implemented Spark SQL queries with Scala for faster testing and processing of data
  • Implemented Spark Streaming to read real-time data from Kafka in parallel, process it in parallel, and save the results in Parquet format in Hive
  • Built an analytics POC to analyze outpatient details with R and SparkR (using a logistic regression algorithm)
  • Installed Zeppelin in the Cloudera dev environment and executed Spark programs
  • Developed applications using Eclipse
  • Used Hadoop Streaming to write jobs in Python

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Sqoop, Flume, Java, Python, Oracle 10g, MySQL, Ubuntu, Agile, XML, SQL Server, YARN, Cloudera, Teradata, Talend, UNIX Shell Scripting, Oozie, Scala, Spark, R, Maven, SBT, Zeppelin, Eclipse, IntelliJ.

Confidential - New York, NY

Hadoop Developer

Responsibilities:

  • Processed data into HDFS and analyzed the customers' claims data using MapReduce to produce summary results from Hadoop for downstream systems
  • Developed MapReduce pipeline jobs to process the data, create the necessary files, and load them into HBase for faster access
  • Created components such as Hive UDFs to supply functionality missing in Hive for analytics.
  • Involved in implementing the Pig scripts during the ETL process.
  • Imported the log data from different servers into HDFS using Flume and developed MapReduce programs for analyzing the data
  • Imported the claims data of the customer from various systems/sources (like MySQL) into HDFS using Sqoop.
  • Implemented Hive queries for data cleaning and sorting in HDFS.
  • Scheduled a workflow using Oozie to import the revenue department's weekly transactions from an RDBMS
  • Applied Hive queries to perform data analysis on HBase, estimating the per-annum claims by customers
  • Refactored the existing Hive and Pig scripts.
  • Hands-on experience with NoSQL databases like Cassandra for a POC (proof of concept) in storing URLs and images.
  • Used different file formats such as text files, SequenceFiles, and Avro
  • Provided cluster coordination services through Zookeeper
  • Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts

Environment: MapReduce, HDFS, Sqoop, Flume, Linux, Oozie, Hadoop, Pig, Hive, HBase, Cassandra, Hadoop Cluster

Confidential, Maine, ME

Hadoop Developer

Responsibilities:

  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required
  • Manipulated, transformed, and analyzed data from various types of databases
  • Worked extensively on creating MapReduce jobs to power data for search and aggregation
  • Extensively used Pig for data cleansing with Tez
  • Created HBase tables to store various data formats coming from different applications
  • Designed a data warehouse using Hive
  • Implemented counters on HBase data to count total records on different tables.
  • Experienced in handling Avro data files by passing schema into HDFS using Avro tools and MapReduce.
  • Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
  • Implemented secondary sorting to sort reducer output globally in MapReduce.
  • Implemented a data pipeline by chaining multiple mappers using ChainMapper
  • Experience with Hortonworks distribution of Hadoop.
  • Worked on a Hadoop cluster that ranged from 5 to 8 nodes during the pre-production stage and was sometimes extended to 26 nodes in production
  • Experience in importing and exporting data into HDFS and Hive using Sqoop.
  • Developed a Pig program for loading and filtering streaming data into HDFS using Flume.
  • Experienced in handling data from different data sets, joining and pre-processing them using Pig join operations.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Created tables, partitions, and buckets, and performed analytics using Hive ad-hoc queries.
  • Provided a batch processing solution for large volumes of unstructured data using the Hadoop MapReduce framework
  • Experience with configuration of Hadoop ecosystem components: MapReduce, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Flume, Storm, Spark, YARN, Tez.
  • Mentored the analyst and test teams in writing Hive queries
  • Used R for analytics, predictive modeling and regression analysis
  • Implemented test scripts to support test driven development and continuous integration

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, HBase, Flume, Java, Oracle 10g, Netezza, MySQL, Ubuntu, Agile, Cloudera, UNIX Shell Scripting, Oozie, Maven, Eclipse.

Confidential, Dallas, TX

Java/Hadoop Developer

Responsibilities:

  • Involved in review of functional and non-functional requirements.
  • Installed and configured Hadoop MapReduce and HDFS.
  • Acquired good understanding and experience of NoSQL databases such as HBase and Cassandra.
  • Installed and configured Hive and implemented various business requirements by writing Hive UDFs.
  • Loaded home mortgage data from the existing DWH tables (SQL Server) to HDFS using Sqoop.
  • Wrote Hive queries to provide a consolidated view of the mortgage and retail data.
  • Worked extensively on the user interface for a few modules using HTML, JSPs, and JavaScript.
  • Developed business logic using servlets and session beans and deployed them on WebLogic Server.
  • Created complex SQL queries and stored procedures.
  • Developed the XML schema and Web services for the data support and structures.
  • Implemented the Web service client for login verification, credit reports and applicant information using web service.
  • Responsible for managing data coming from different sources.
  • Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management.
  • Wrote test cases in JUnit for unit testing of classes.
  • Extensively worked with JDBC programs using Oracle and MySQL databases and developed SQL and PL/SQL for Oracle to process the data.
  • Provided technical support for production environments by resolving issues, analyzing defects, and providing and implementing solutions.
  • Built and deployed Java applications to multiple Unix-based environments and produced both unit and functional test results along with release notes.
  • Analyzed the banking and existing system requirements and validated them to suit J2EE architecture.
  • Designed the process flow between front-end and server-side components.

Environment: JDK, J2EE, JMS, XML, Servlets, JSP, CVS, HTML, Oracle 8i, UML, and Tomcat.

Confidential

Java Developer

Responsibilities:

  • Developed the Admission & Census module, which monitors a wide range of detailed information for each resident upon pre-admission or admission to the facility.
  • Involved in development of the Care Plans module, which provides a comprehensive library of problems, goals, and approaches. Users can tailor these libraries (adding, deleting, or editing problems, goals, and approaches) and the disciplines used for care plans.
  • Involved in development of the General Ledger module, which streamlines analysis, reporting, and recording of accounting information. General Ledger automatically integrates with a spreadsheet solution for budgeting, comparative analysis, and tracking facility information for flexible reporting.
  • Extensively used Core Java, Servlets, JSP and XML.
  • Used Struts 1.2 in presentation tier.
  • Involved in writing JSP and JSF components. Used JSTL and Struts tag libraries (Core, Logic, Nested, Bean, and HTML taglibs) to create standard dynamic web pages.
  • Application was based on MVC, with JSP serving as the presentation layer, Servlets as the controller, and Hibernate in the business layer to access the Oracle database.
  • Developed the DAO layer for the application using Spring's HibernateTemplate support.
  • Developed UI using HTML, JavaScript, and JSP, and developed Business Logic and Interfacing components using Business Objects, XML, and JDBC.
  • Designed the user interface and implemented validation checks using JavaScript.
  • Managed connectivity using JDBC for querying, inserting, and data management, including triggers and stored procedures.
  • Developed various EJBs for handling business logic and data manipulation against the database.
  • Involved in design of JSPs and Servlets for navigation among the modules.
  • Designed the cascading style sheets and XML parts of the Order Entry and Product Search modules and implemented client-side validation with JavaScript.

Environment: J2EE, Java/JDK, JDBC, JSP, Servlets, JavaScript, EJB, JNDI, JavaBeans, XML, XSLT, Oracle 9i, Eclipse, HTML/DHTML, SVN.
