Hadoop Developer Resume

Dallas, TX

SUMMARY

  • Over 8 years of IT experience across a variety of industries, including hands-on experience with Big Data technologies.
  • 4 years of comprehensive experience as a Hadoop/Big Data developer.
  • Passionate about working in Big Data and analytics environments.
  • Hands-on experience in developing and deploying enterprise applications using major components of the Hadoop ecosystem, such as Hadoop 2.x, YARN, Hive, Pig, MapReduce, HBase, Flume, Sqoop, Spark, Storm, Kafka, Oozie, and ZooKeeper.
  • Experience in converting Hive/SQL queries into Spark transformations using Java (a minimal sketch follows this summary).
  • Experience in installation, configuration, management, and deployment of Hadoop clusters, HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Oozie, HBase, and ZooKeeper.
  • Experience in Extraction, Transformation, and Loading (ETL) of data from multiple sources such as flat files, XML files, and databases; used Informatica for ETL processing based on business requirements.
  • Worked on complex mappings in Talend 6.0.1/5.5 using tMap, tJoin, tReplicate, tParallelize, tJava, tJavaRow, tJavaFlex, tAggregateRow, tDie, tWarn, tLogCatcher, etc.
  • Expertise in writing Hadoop Jobs for analyzing data using Hive and Pig.
  • Experience in developing MapReduce programs on Apache Hadoop to work with Big Data.
  • Performed data transformation, file processing, and user-behavior analysis by running Pig Latin scripts; expertise in creating Hive internal/external tables and views using a shared metastore.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
  • Diverse knowledge of Spark and Scala.
  • Experience with NoSQL databases such as HBase, Cassandra, and MongoDB for extracting and storing huge volumes of data.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Extensive experience with SQL, PL/SQL and database concepts.
  • Good understanding of Storm and Kafka for ingesting and processing streaming data.
  • Knowledge of job workflow scheduling and monitoring with Oozie and of cluster coordination with ZooKeeper.
  • Experience with databases like DB2, Oracle 11g, MySQL, SQL Server and MS Access.
  • Strong programming skills in Core Java, J2EE, and Python technologies.
  • Experience in Data Warehouse life cycle, methodologies and its tools for reporting and data analysis.
  • Extensive use of Open Source Software such as Web/Application Servers like Apache Tomcat 6.0 and Eclipse 3.x IDE.
  • Expertise in Java multithreading, object-oriented design patterns, exception handling, garbage collection, Servlets, JSP, HTML, Struts, Hibernate, Enterprise JavaBeans, JDBC, RMI, JNDI, and XML-related technologies.
  • Experienced in documenting software requirements specifications, including functional, data, and performance requirements.
  • Strong technical background and excellent analytical ability; a goal-oriented team player committed to excellence.
  • Highly organized with the ability to manage multiple projects and meet deadlines.
  • Have the motivation to take independent responsibility as well as ability to contribute and be a productive team member.
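
As an illustration of the Hive-to-Spark conversion work summarized above, the following is a minimal Java sketch using Spark SQL with Hive support; the table and column names (sales, region, amount) are hypothetical placeholders, not taken from any actual engagement.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.sum;

    public class HiveToSparkSketch {
        public static void main(String[] args) {
            // Hive support makes existing Hive tables visible to Spark.
            SparkSession spark = SparkSession.builder()
                    .appName("HiveToSparkSketch")
                    .enableHiveSupport()
                    .getOrCreate();

            // The original HiveQL query, runnable unchanged through Spark SQL:
            Dataset<Row> viaSql = spark.sql(
                    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region");
            viaSql.show();

            // The same logic expressed as Spark transformations via the Dataset API:
            Dataset<Row> viaApi = spark.table("sales")
                    .groupBy(col("region"))
                    .agg(sum(col("amount")).alias("total"));
            viaApi.show();

            spark.stop();
        }
    }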

TECHNICAL SKILLS

Languages: C, C++, Java, J2EE, Python, PL/SQL

Big Data Ecosystem: Apache Hadoop, HDFS, HBase, Pig, Hive, Sqoop, ZooKeeper, Oozie, Spark, Kafka

NoSQL Technologies: MongoDB, Cassandra, Neo4J

Databases: Oracle 11g/10g/9i/8.x, MySQL, MS SQL Server 2000

Web technologies: Core Java, J2EE, JSP, Servlets, EJB, JNDI, JDBC, XML, HTML, JavaScript, Web Services

Frameworks: Spring 3.2/3.0/2.5/2.0, Struts 2.0/1.0, Hibernate 4.0/3.0, Groovy, Camel

Application Servers: WebLogic 12c/11g/10.1/9.0, WebSphere 8.0/7.0

Web Server: Apache Tomcat 7.0/6.0/5.5

IDEs: Eclipse 3.7/3.5/3.2/3.1 (incl. Kepler), NetBeans, EditPlus 2

Tools: Talend Open Studio, SQL Developer, SoapUI

Testing: JUnit, JMock

Operating System: Linux, UNIX and Windows 2000/NT/XP/Vista/7/8/10

Methodologies: Agile, Unified Modeling Language (UML), Design Patterns (Core Java and J2EE)

System Design Development: Requirement gathering and analysis, design, development, testing, delivery

PROFESSIONAL EXPERIENCE

Hadoop Developer

Confidential, Dallas, TX

Responsibilities:

  • Interacted with solution architects and business analysts to gather requirements and update the solution architecture document.
  • Analyzed requirements and created the low-level design (LLD) and mapping documents.
  • Performed analysis, design, development, testing, and deployment for ingestion, integration, and provisioning using Agile methodology.
  • Attended daily Scrum meetings to update the Scrum Master on the progress of user stories in Rally and to flag any blockers and dependencies.
  • Created generic schemas, plus context groups and variables, to run jobs against different environments such as Dev, Test, and Prod.
  • Created Talend Mappings to populate the data into dimensions and fact tables.
  • Gained broad design, development, and testing experience with Talend Integration Suite, including performance tuning of mappings.
  • Experienced in Talend Data Integration, Talend Platform Setup on Windows and UNIX systems.
  • Created complex mappings in Talend 6.0.1/5.5 using tMap, tJoin, tReplicate, tParallelize, tJava, tJavaRow, tJavaFlex, tAggregateRow, tDie, tWarn, tLogCatcher, etc.
  • Created joblets in Talend for processes reused across most jobs in a project, such as Start Job and Commit Job.
  • Developed jobs to move inbound files to vendor server location based on monthly, weekly and daily frequency.
  • Implemented Change Data Capture technology in Talend to load deltas into the data warehouse.
  • Created jobs to perform record count validation and schema validation.
  • Created contexts so that values could be passed between parent and child jobs, in both directions, throughout the process.
  • Developed joblets that are reused in different processes in the flow.
  • Developed an error-logging module that captures both system and logical errors, sends email notifications, and moves failed files to error directories.
  • Provided production support by running the jobs and fixing bugs.
  • Experienced in using Talend database, file, and processing components based on requirements.
  • Responsible for developing, supporting, and maintaining the ETL (Extract, Transform, Load) processes using Talend Integration Suite.
  • Worked on improving the performance of Talend jobs.
  • Analyzed large data sets by running Hive queries and Pig scripts (a Hive-over-JDBC sketch follows this list).
  • Worked with the Data Science team to gather requirements for various data mining projects.
  • Involved in creating Hive tables and loading and analyzing data using hive queries.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Teradata into HDFS using Sqoop.
  • Created Hive tables and loaded retail transactional data from Teradata using Sqoop.
  • Managed and reviewed Hadoop log files; used Scala to integrate Spark with Hadoop.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Designed a conceptual model with Spark for performance optimization.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Used Spark to analyze point-of-sale data and coupon usage.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Performed unit and integration testing after development and had the code reviewed.
  • Involved in migrating objects from DEV to QA and then promoting to Production.
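
As referenced above, a minimal sketch of running an analysis query against Hive from Java over JDBC; the HiveServer2 host, credentials, table, and query here are hypothetical placeholders, and the JDBC route is only one of several ways such queries can be run.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQueryRunner {
        public static void main(String[] args) throws Exception {
            // HiveServer2 JDBC driver; host, port, and user are placeholders.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://hive-host:10000/default", "etl_user", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT store_id, COUNT(*) AS txns FROM transactions GROUP BY store_id")) {
                while (rs.next()) {
                    System.out.println(rs.getString("store_id") + "\t" + rs.getLong("txns"));
                }
            }
        }
    }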

Environment: Talend Studio 6.0.1/5.5, Cloudera, XML files, flat files, HL7 files, JSON, TWS, Hadoop 2.4.1, HDFS, Hive 0.13, HBase 0.94.21, Talend Administrator Console, IMS, Agile methodology, HPSM

Hadoop Developer

Confidential, Columbus, IN

Responsibilities:

  • Designed techniques and wrote effective programs in Java and Linux shell script to migrate large data sets, including text and byte data, to NoSQL stores, using various data-parsing techniques in addition to MapReduce jobs.
  • Tuned and monitored the Hadoop clusters for memory management and MapReduce job health, enabling the jobs that push data from SQL to NoSQL stores to run reliably.
  • Designed ETL jobs/packages using Talend Open Studio (TOS).
  • Used Talend components (tMap, tDie, tConvertType, tFlowMeter, tLogCatcher).
  • Worked hands-on with the ETL process; handled importing data from various sources and performed transformations.
  • Managed work on the Solr search engine, including indexing data, tuning relevance, and developing custom tokenizers and filters, and added functionality such as playlists, custom sorting, and regionalization.
  • Provided continuous monitoring and management of the Hadoop cluster through Cloudera Manager.
  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
  • Used Pig as an ETL tool for transformations, event joins, bot-traffic filtering, and pre-aggregations before storing the data in HDFS.
  • Designed prototypes of the cloud architecture in the Amazon EC2 environment and delivered end-to-end prototype software on an aggressive schedule.
  • Delivered working widget software using Ext JS 4, HTML5, RESTful web services, JSON Store, Linux, Hadoop, ZooKeeper, NoSQL databases, Java, Spring Security, and JBoss Application Server for Big Data analytics.
  • Designed and developed business intelligence and reporting software using BIRT, with automated reporting capabilities to extract data from the SQL stores.
  • Developed Java-based programs using both RESTful services (RESTEasy) with a resource-oriented model and SOAP-based web services, exposing resources to graphical user interfaces such as Ext JS 4.
  • Integrated Central Authentication Service (CAS) with web applications.
  • Designed, developed, and delivered JIRA graphical user interface screens to enable integration with the web applications.
  • Implemented Agile and Scrum practices, including sprint planning and sprint delivery of the developed software, and participated in continuous improvement through sprint retrospectives.
  • Built data platforms, pipelines, and storage systems using Apache Kafka, Apache Storm, and search technologies such as Elasticsearch (a minimal Kafka producer sketch follows this list).
  • Reviewed ETL application use cases before onboarding them to Hadoop.
  • Developed use cases and technical prototypes for implementing Pig, Hive, and HBase.
  • Analyzed alternatives for NoSQL data stores and wrote intensive documentation comparing HBase and Accumulo.
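
As referenced above, a minimal sketch of publishing an event to Kafka from Java; the broker address, topic name, key, and payload are hypothetical placeholders, and the sketch assumes the standard Kafka Java client.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class EventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder broker
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            // Publish one keyed JSON event onto a hypothetical "events" topic;
            // closing the producer flushes any buffered records.
            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("events", "device-42", "{\"temp\":21.5}"));
            }
        }
    }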

Environment: Hadoop, Talend Integration Suite 5.x, HDFS, Hive, Pig, ZooKeeper, Oozie, Flume, HBase, ETL, Java (JDK 1.6), Spring, RESTful, HTML5, Ext JS 4, JSON, Linux shell scripting, SQL, Cloudera

Hadoop Developer

Confidential, Redmond, WA

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.
  • Extracted, transformed, and loaded (ETL) data from multiple sources such as flat files, XML files, and databases.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Managed and scheduled jobs on a Hadoop cluster.
  • Designed a data warehouse using Hive.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed Hive queries for the analysts.
  • Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
  • Provided cluster coordination services through ZooKeeper.
  • Collected log data from web servers and integrated it into HDFS using Flume.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Worked with application teams to install operating-system updates, Hadoop updates, patches, and version upgrades as required.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a minimal sketch follows this list).
  • Involved in loading data from the Linux file system into HDFS.
  • Responsible for managing data from multiple sources.
  • Installed and configured Hive and wrote Hive UDFs.
  • Extracted files from Couch using Sqoop, placed them in HDFS, and processed them.
  • Provided NoSQL solutions in MongoDB and Cassandra for extracting and storing huge amounts of data.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
  • Worked with highly engaged Informatics, Scientific Information Management and enterprise IT teams.
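
As referenced above, a minimal sketch of the kind of map-only Java MapReduce job used for data cleaning; the delimiter, expected field count, and paths are hypothetical assumptions, not the actual production schema.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanRecordsJob {

        // Map-only job: keep well-formed records, silently drop the rest.
        public static class CleanMapper extends Mapper<Object, Text, NullWritable, Text> {
            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\|", -1); // assumed pipe-delimited input
                if (fields.length == 12 && !fields[0].isEmpty()) {   // assumed 12-column schema
                    context.write(NullWritable.get(), value);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "clean-records");
            job.setJarByClass(CleanRecordsJob.class);
            job.setMapperClass(CleanMapper.class);
            job.setNumReduceTasks(0); // map-only: no shuffle or reduce phase
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }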

Environment: Hadoop, HBase, HDFS, Hive, Java (JDK 1.6), Pig, ZooKeeper, Oozie, Flume

Java Developer

Confidential, Chicago, IL

Responsibilities:

  • Involved in the Software Development Life Cycle (SDLC), including requirements gathering, design, coding, and testing.
  • Designed and developed business services using the Spring Framework (dependency injection) and DAO patterns.
  • Implemented the business layer using Hibernate with Spring DAO and developed mapping files and POJO Java classes using the ORM tool (a minimal DAO sketch follows this list).
  • Implemented the web services and the integration of associated business modules.
  • Worked on generating web service classes using Service-Oriented Architecture (SOA) and SOAP.
  • Developed and implemented the MVC architectural pattern using JSP, Servlets, and Action classes.
  • Specified the default initialization file through the log4j.configuration system property and loaded log4j.properties from the WebLogic classpath.
  • Used SOAP UI to test the different methods in the web services.
  • Made effective use of J2EE design patterns, namely Session Facade, Factory Method, Command, and Singleton, to develop various base framework components in the application.
  • Involved in unit and integration testing, bug fixing, and user acceptance testing with test cases.
  • Used Stateless Session Beans to implement business processes and interact with the data-access layer for DB access.
  • Developed the presentation layer using JSP, HTML, XHTML, CSS and client validations using JavaScript.
  • Used the Spring MVC and AngularJS frameworks for configuring the application.
  • Used SQL and PL/SQL programming extensively to talk to the Oracle database.
  • Responsible as CVS administrator and for deploying the web application on the Oracle Application Server.
  • Used ANT as the build tool; worked in an Agile environment.
  • Used Log4j for logging errors, messages, and performance logs.
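
As referenced above, a minimal sketch of a Hibernate-backed Spring DAO in this style; the Account entity, its fields, and the query are hypothetical placeholders, and the Hibernate mapping configuration is assumed to exist elsewhere.

    import java.util.List;
    import org.hibernate.SessionFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Repository;
    import org.springframework.transaction.annotation.Transactional;

    // Hypothetical entity; the Hibernate mapping (annotations or hbm.xml) is omitted.
    class Account {
        private Long id;
        private String owner;
        public Long getId() { return id; }
        public String getOwner() { return owner; }
        public void setOwner(String owner) { this.owner = owner; }
    }

    @Repository
    public class AccountDao {

        private final SessionFactory sessionFactory;

        @Autowired
        public AccountDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        @Transactional
        public void save(Account account) {
            sessionFactory.getCurrentSession().save(account);
        }

        @Transactional(readOnly = true)
        @SuppressWarnings("unchecked")
        public List<Account> findByOwner(String owner) {
            return sessionFactory.getCurrentSession()
                    .createQuery("from Account a where a.owner = :owner")
                    .setParameter("owner", owner)
                    .list();
        }
    }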

Environment: Windows XP, JDK 1.6, Oracle 10g, WebSphere, CVS, Rational ClearQuest, Servlets 3.0, JSP 2.2, HTML, XHTML, XSLT, JDBC, JMS, EJB, SOAP, WSDL, Web Services, Eclipse 3.2, Ant 1.6.5, Maven, Agile development process, PL/SQL, JUnit, JMock, and Log4j.

Java Developer

Confidential, Columbus, OH

Responsibilities:

  • Designed, configured, and developed the web application using JSP, Jasper Reports, the Barbecue barcode library, JavaScript, and HTML.
  • Developed Session Beans for JSP clients.
  • Configured and Deployed EAR & WAR files on WebSphere Application Server.
  • Defined and designed the layers and modules of the project using OOAD methodologies and standard J2EE design patterns and guidelines.
  • Designed and developed all the user interfaces using JSP, Servlets, and the Spring framework.
  • Developed the DAO layer using Hibernate with a caching system for real-time performance; designed the application so that all users could use core functionality, plus business-specific functionality based on logon ID.
  • Developed web service provider methods (bottom-up approach) using WSDL, XML, and SOAP for transferring data between the applications (a minimal JAX-WS sketch follows this list).
  • Configured the Java Message Service (JMS) on WebSphere Server using the Eclipse IDE.
  • Used AJAX to develop asynchronous web application behavior on the client side.
  • Used JDBC to access the database that tracks all credit aspects of accounts, including financial review details, security held, actuarial exposure data, and receivables.
  • Designed various application components using multithreading concepts, mostly to perform time-consuming tasks in the background.
  • Wrote JSP and Servlet classes to generate dynamic HTML pages.
  • Designed class and sequence diagrams for the Modify and Add modules.
  • Designed and developed XML processing components for dynamic menus in the application.
  • Adopted the Spring framework for the development of the project.
  • Developed the user interface presentation screens using HTML.
  • Coordinated with the QA lead on the test plan, test cases, test code, and actual testing; responsible for defect allocation and the resolution of those defects.
  • Performed all coding and testing using Eclipse.
  • Maintained the existing codebase, developed with the Spring and Hibernate frameworks, by incorporating new features and fixing bugs.
  • Involved in fixing bugs and in unit testing with test cases using the JUnit framework.
  • Developed build and deployment scripts using Apache ANT to customize WAR and EAR files.
  • Developed stored procedures and triggers using PL/SQL to calculate and update tables, implementing business logic in the Oracle database.
  • Used the Spring ORM module for integration with Hibernate in the persistence layer.
  • Involved in writing Hibernate Query Language (HQL) for the persistence layer.
  • Used Log4j for application logging and debugging.
  • Coordinated with the offshore team on requirement transition and provided the inputs required for successful execution of the project.
  • Involved in post-production support and maintenance of the application.
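
As referenced above, a minimal sketch of a bottom-up JAX-WS web service, where the WSDL is generated from the annotated class; the service name, operation, and endpoint URL are hypothetical placeholders, and the Endpoint.publish call is only for local testing (deployment here was on WebSphere).

    import javax.jws.WebMethod;
    import javax.jws.WebService;
    import javax.xml.ws.Endpoint;

    // Bottom-up approach: annotate the Java class and let the runtime derive the WSDL.
    @WebService
    public class AccountStatusService {

        @WebMethod
        public String getStatus(String accountId) {
            // Placeholder logic; a real implementation would consult the credit data.
            return "ACTIVE";
        }

        public static void main(String[] args) {
            // Lightweight local endpoint for exercising the generated WSDL and SOAP calls.
            Endpoint.publish("http://localhost:8080/accountStatus", new AccountStatusService());
        }
    }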

Environment: Java SE 7, Java EE 6, JSP 2.1, Servlets 3.0, HTML, JDBC 4.0, IBM WebSphere 8.0, PL/SQL, XML, Spring 3.0, Hibernate 4.0, Oracle 12c, ANT, JavaScript & jQuery, JUnit, Windows 7, and Eclipse 3.7.

Java Developer

Confidential

Responsibilities:

  • Developed web pages using the JSF framework, establishing communication between various pages in the application.
  • Designed and developed JSP pages using the Struts framework.
  • Utilized the Tiles framework for page layouts.
  • Involved in writing client-side validations using JavaScript.
  • Used the Hibernate framework to persist the employer work hours to the database.
  • Extensively used the Spring framework's AOP features.
  • Followed the use case design specification and developed class and sequence diagrams using RAD and MS Visio.
  • Used JavaScript and AJAX to make calls to controllers that fetch a file from the server and display it in a popup without losing the attributes of the page.
  • Coded test cases, created mock objects using JMock, and ran the tests with JUnit (a minimal sketch follows this list).
  • Configured Hudson and integrated it with CVS to automatically run test cases with every build and generate code coverage report.
  • Configured data sources on the WebLogic application server for connecting to Oracle and DB2 databases.
  • Wrote complex SQL statements and used PL/SQL to perform database operations with the help of TOAD.
  • Created a user interface for the testing team that helped them test executables efficiently.
  • Mentored co-developers on new technologies and participated in code reviews.
  • Worked on a DataStage project that generates automated daily reports after performing various validations.
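
As referenced above, a minimal JUnit test sketch using JMock 2 to mock a collaborator; the HoursRepository interface and ReportService class are hypothetical stand-ins for the kind of code under test, not actual project classes.

    import static org.junit.Assert.assertEquals;
    import org.jmock.Expectations;
    import org.jmock.Mockery;
    import org.junit.Test;

    public class ReportServiceTest {

        // Hypothetical collaborator to be mocked.
        public interface HoursRepository {
            int hoursFor(String employeeId);
        }

        // Hypothetical class under test: reports hours fetched from the repository.
        public static class ReportService {
            private final HoursRepository repo;
            public ReportService(HoursRepository repo) { this.repo = repo; }
            public int weeklyHours(String employeeId) { return repo.hoursFor(employeeId); }
        }

        private final Mockery context = new Mockery();

        @Test
        public void reportsHoursFromRepository() {
            final HoursRepository repo = context.mock(HoursRepository.class);
            context.checking(new Expectations() {{
                oneOf(repo).hoursFor("emp-1");
                will(returnValue(40));
            }});

            assertEquals(40, new ReportService(repo).weeklyHours("emp-1"));
            context.assertIsSatisfied(); // verify the expected call actually happened
        }
    }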

Environment: UNIX, RAD 6.0, WebLogic, Oracle, Maven, JavaScript, JSF, JSP, Servlets, Log4J, Spring, pureQuery, JMock, JUnit, TOAD, MS Visio, DataStage, CVS, SVN, UML, and SoapUI
