
Hadoop Java Developer Resume

Austin, TX

PROFESSIONAL SUMMARY:

  • 8+ years of overall IT experience across a variety of industries, including 3+ years of hands-on experience in Big Data analytics and development
  • Expertise with tools in the Hadoop ecosystem including Pig, Hive, HDFS, MapReduce, Sqoop, Storm, Spark, Kafka, YARN, Oozie, and Zookeeper.
  • Excellent knowledge of Hadoop architecture components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Experience in analyzing large datasets and finding patterns and insights within structured and unstructured data.
  • Strong experience with Hadoop distributions such as Cloudera, MapR and Hortonworks.
  • Good understanding of NoSQL databases and hands-on experience writing applications on HBase, Cassandra and MongoDB.
  • Experience in migrating data between HDFS and relational database systems in both directions using Sqoop, according to client requirements.
  • Extensive experience importing and exporting data using stream-processing platforms such as Flume and Kafka.
  • Experienced in writing complex MapReduce programs that work with different file formats such as Text, SequenceFile, XML and JSON.
  • Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Experienced in working with Amazon Web Services (AWS) using EC2 for computing and S3 as storage mechanism.
  • Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
  • Key participant in all phases of the software development life cycle: analysis, design, development, integration, implementation, debugging, and testing of client-server, object-oriented, and web-based applications.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC, SOAP and RESTful web services.
  • Excellent implementation knowledge of enterprise, web and client-server applications using Java and J2EE.
  • Strong experience with data warehousing and ETL concepts using Informatica PowerCenter, OLAP, OLTP and AutoSys.
  • Experience in database design using PL/SQL to write Stored Procedures, Functions, Triggers and strong experience in writing complex queries for Oracle.
  • Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
  • Worked in large and small teams on systems requirements, design and development.
  • Experience with IDEs such as Eclipse and IntelliJ, and with SVN and Git repositories.
  • Experience with build tools such as Ant and Maven.
  • Prepared standard coding guidelines as well as analysis and testing documentation.

TECHNICAL SKILLS:

BigData/Hadoop Technologies: HDFS, YARN, MapReduce, Hive, Pig, Impala, Sqoop, Flume, Spark, Kafka, Storm, Drill, Zookeeper and Oozie

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: C, Java, Scala, Python, SQL, PL/SQL, Pig Latin, HiveQL, Shell Scripting

Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts, JMS, EJB, RESTful

Application Servers: WebLogic, WebSphere

Cloud Computing Tools: Amazon AWS

Databases: Microsoft SQL Server, MySQL, Oracle, DB2

Operating Systems: UNIX, Windows, LINUX

Build Tools: Jenkins, Maven, ANT

Business Intelligence Tools: Tableau, Splunk, QlikView

Development Tools: Microsoft SQL Studio, Eclipse, IntelliJ

Development Methodologies: Agile/Scrum, Waterfall

Version Control Tools: Git, SVN

WORK EXPERIENCE:

Confidential - Washington, DC

Spark Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Managed jobs using the Fair Scheduler and developed job-processing scripts using Oozie workflows.
  • Used Spark Streaming APIs to perform on-the-fly transformations and actions for building the common learner data model, which consumes data from Kafka in near real time and persists it into HBase (a sketch follows this list).
  • Developed Spark applications in Scala and built them using Maven.
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Scala scripts and UDAFs using DataFrames/SQL/Datasets and RDDs/MapReduce in Spark 1.6 for data aggregation and queries, and wrote data back into the OLTP system through Sqoop.
  • Experienced in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism and tuning memory.
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames and pair RDDs.
  • Handled large datasets using partitions, Spark's in-memory capabilities, broadcast variables, efficient joins and transformations during the ingestion process itself.
  • Designed, developed and maintained data pipelines in a Hadoop and RDBMS environment, with both traditional and non-traditional source systems, using RDBMS and NoSQL data stores for data access and analysis.
  • Worked on a POC comparing the processing time of Impala with Apache Hive for batch applications, in order to adopt Impala in the project.
  • Worked on a cluster of 400 nodes.
  • Worked extensively with Sqoop for importing metadata from Oracle.
  • Experience in installation & configuration of Apache Hadoop on Amazon AWS (EC2) system.
  • Involved in creating Hive tables and in loading and analyzing data using Hive queries.
  • Implemented schema extraction for Parquet and Avro file Formats in Hive.
  • Implemented Partitioning, Dynamic Partitions, Buckets in HIVE.
  • Good experience with continuous Integration of application using Jenkins.
  • Used reporting tools such as Tableau, connected to Hive, to generate daily data reports.
  • Collaborated with the infrastructure, network, database, application and BA teams to ensure data quality and availability.
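
The streaming ingestion described in this list could look roughly like the sketch below. It is a minimal illustration using Spark's Java API with the Spark 1.6-era direct Kafka stream (the applications themselves were written in Scala); the broker address, topic, HBase table and column family are hypothetical placeholders.

    import java.util.*;
    import kafka.serializer.StringDecoder;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.function.VoidFunction;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.*;
    import org.apache.spark.streaming.kafka.KafkaUtils;
    import scala.Tuple2;

    public class LearnerEventStream {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("LearnerEventStream");
            JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, String> kafkaParams = new HashMap<>();
            kafkaParams.put("metadata.broker.list", "broker1:9092");       // hypothetical broker
            Set<String> topics = Collections.singleton("learner-events");  // hypothetical topic

            // Direct (receiver-less) stream of <key, value> records from Kafka
            JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                    ssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
                    kafkaParams, topics);

            stream.foreachRDD((VoidFunction<JavaPairRDD<String, String>>) rdd ->
                rdd.foreachPartition(records -> {
                    // One HBase connection per partition, not per record
                    try (Connection hbase = ConnectionFactory.createConnection(HBaseConfiguration.create());
                         Table table = hbase.getTable(TableName.valueOf("learner_model"))) {  // hypothetical table
                        while (records.hasNext()) {
                            Tuple2<String, String> record = records.next();
                            Put put = new Put(Bytes.toBytes(record._1()));
                            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("event"), Bytes.toBytes(record._2()));
                            table.put(put);
                        }
                    }
                }));

            ssc.start();
            ssc.awaitTermination();
        }
    }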

Environment: Hadoop YARN, Spark Streaming, Spark SQL, Scala, Kafka, Hive, Sqoop, Amazon AWS, Impala, HBase, Tableau, Oozie, Jenkins, Cloudera, Oracle 12c, Linux

Confidential -Atlanta, GA

Hadoop Developer

Responsibilities:

  • Worked on analyzing data using different big data analytic tools including Pig, Hive and MapReduce.
  • Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
  • Executed Hive queries on tables stored in Avro format to perform data analysis to meet the business requirements.
  • Implemented Partitioning, Dynamic Partitions, Buckets in Hive
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs) and User Defined Aggregate Functions (UDAFs) written in Python.
  • Pioneered the design, development and implementation of the entire data transformation process, moving from Python scripts to HQL scripts for application performance tuning.
  • Implemented a Kafka event-log producer to publish logs to a Kafka topic, which is consumed by the ELK (Elasticsearch, Logstash, Kibana) stack to analyze the logs produced by the Hadoop cluster (a sketch follows this list).
  • Implemented Kafka consumers to store the data in HDFS and queried it by creating Hive tables on top of it.
  • Implemented Data Integrity and Data Quality checks in Hadoop using Hive and Linux scripts
  • Used the Snappy compression technique to compress files before loading them into Hive.
  • Experienced in performing CRUD operations in HBase.
  • Created HBase tables to store the Avro data from different portfolios and queried on top of them.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Experienced in running Hadoop Streaming jobs to process terabytes of data.
  • Worked with MySQL for storing metadata and serving lookup requests.
  • Worked on automation of the process for efficient testing of the application.
  • Performed a POC with Apache Spark to evaluate the case for adopting it in the project.
  • Handled platform-related Hadoop production-support tasks by analyzing job logs.
  • Responsible for continuous Build/Integration with Jenkins and deploying the applications into production using XL Deploy.
  • Actively involved in code review and bug fixing for improving the performance.
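
The event-log producer mentioned in this list could be sketched as below, using the Kafka Java client; the broker list, topic name and log-file path are hypothetical placeholders, and the consuming side (Logstash/ELK and the HDFS consumers) is omitted.

    import java.io.BufferedReader;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class JobLogProducer {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // hypothetical broker list
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props);
                 BufferedReader logs = Files.newBufferedReader(Paths.get("/var/log/hadoop/job.log"))) {  // hypothetical log file
                String line;
                while ((line = logs.readLine()) != null) {
                    // Each log line becomes one record on the topic read by Logstash and the HDFS consumers
                    producer.send(new ProducerRecord<>("hadoop-job-logs", line));  // hypothetical topic
                }
                producer.flush();
            }
        }
    }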

Environment: Hadoop YARN, Hive, Pig, Python, Apache Kafka, HBase, Shell Scripting, Java, MySQL, ELK stack, MapR, PyCharm, XL Deploy, Git, Jenkins.

Confidential - Austin, TX

Hadoop Java Developer

Responsibilities:

  • Developed MapReduce jobs in Cascading for data cleansing and processing of flat files (a sketch follows this list).
  • Responsible for importing the flat files from external environments to ACC in Hadoop.
  • Designed, developed and implemented the main flow component for the end-to-end data flow process within the platform.
  • Responsible for creating SOAP clients for consuming web service requests.
  • Involved in Analysis and design for setting up edge node as per the client requirement.
  • Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
  • Wrote Hive scripts for comparing large data sets.
  • Performed optimization and memory tuning of MapReduce applications.
  • Used MySQL for auditing purposes on the cluster.
  • Responsible for the MapR upgrade in both production and non-production environments.
  • Wrote shell scripts for data extraction and data cleansing to support member-specific analytics.
  • Coordinated with the administrator team to analyze MapReduce job performance and resolve cluster-related issues.
  • Handled platform-related Hadoop production-support tasks by analyzing job logs.
  • Coordinated with different teams to determine root causes and take steps to resolve issues.
  • Responsible for continuous Integration with Jenkins and deploying the applications into production using XL Deploy.
  • Performed POC in installation & configuration of Apache Hadoop on Amazon AWS (EC2) system.
  • Managed and reviewed Hadoop log files to identify issues when jobs fail and to find the root cause.
  • Used ServiceNow to provide application support for existing clients.
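
A Cascading-based cleansing flow like the one described in this list might look roughly as follows; the input/output paths, field name and filter rule are hypothetical placeholders, and the classes assume the Cascading 2.x Hadoop planner.

    import java.util.Properties;
    import cascading.flow.FlowDef;
    import cascading.flow.hadoop.HadoopFlowConnector;
    import cascading.operation.regex.RegexFilter;
    import cascading.pipe.Each;
    import cascading.pipe.Pipe;
    import cascading.property.AppProps;
    import cascading.scheme.hadoop.TextDelimited;
    import cascading.tap.SinkMode;
    import cascading.tap.Tap;
    import cascading.tap.hadoop.Hfs;
    import cascading.tuple.Fields;

    public class FlatFileCleanseFlow {
        public static void main(String[] args) {
            // Source and sink taps over tab-delimited flat files (paths are placeholders)
            Tap source = new Hfs(new TextDelimited(true, "\t"), "/data/raw/members");
            Tap sink = new Hfs(new TextDelimited(true, "\t"), "/data/clean/members", SinkMode.REPLACE);

            // Keep only rows whose member_id field is purely numeric (hypothetical cleansing rule)
            Pipe cleanse = new Pipe("cleanse");
            cleanse = new Each(cleanse, new Fields("member_id"), new RegexFilter("^\\d+$"));

            Properties properties = new Properties();
            AppProps.setApplicationJarClass(properties, FlatFileCleanseFlow.class);

            FlowDef flowDef = FlowDef.flowDef()
                    .addSource(cleanse, source)
                    .addTailSink(cleanse, sink);

            // Cascading plans the pipe assembly into MapReduce jobs and runs them on the cluster
            new HadoopFlowConnector(properties).connect(flowDef).complete();
        }
    }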

Environment: Hadoop YARN, Cascading, Hive, Pig, Java, Shell Scripting, SOAP web services, MySQL, MapR, Mule ESB, Agile, Git, Jenkins, ServiceNow.

Confidential, Dayton, OH

Sr. Java/J2EE Developer

Responsibilities:

  • Involved in Requirement Analysis, Design, Development and Testing of the risk workflow system.
  • Involved in the implementation of design using vital phases of the Software development life cycle (SDLC) that includes Development, Testing, Implementation and Maintenance Support.
  • Applied OOAD principles for the analysis and design of the system.
  • Implemented XML Schema as part of the XQuery query language.
  • Applied J2EE design patterns such as Singleton, Business Delegate, Service Locator, Data Transfer Object (DTO), Data Access Object (DAO) and Adapter during the development of components (a DAO/DTO sketch follows this list).
  • Used RAD for the development, testing and debugging of the application.
  • Used WebSphere Application Server to deploy the builds.
  • Developed front-end screens using Struts, JSP, HTML, AJAX, jQuery, JavaScript, JSON and CSS.
  • Used J2EE for the development of business layer services.
  • Developed Struts Action Forms, Action classes and performed action mapping using Struts.
  • Performed data validation in Struts Form beans and Action Classes.
  • Developed a POJO-based programming model using the Spring framework.
  • Used the Inversion of Control (IoC) pattern and dependency injection of the Spring framework for wiring and managing business objects.
  • Used web services to connect to the mainframe for data validation.
  • SOAP was used as the protocol to send requests and responses in the form of XML messages.
  • Used JDBC to connect the application to the database.
  • Used Eclipse for the Development, Testing and Debugging of the application.
  • Log4j framework has been used for logging debug, info & error data.
  • Used the Hibernate framework for object-relational mapping.
  • Used Oracle 10g database for data persistence and SQL Developer was used as a database client.
  • Extensively worked on Windows and UNIX operating systems.
  • Used SecureCRT to transfer file from local system to UNIX system.
  • Performed Test Driven Development (TDD) using JUnit.
  • Used Ant script for build automation.
  • Used the SVN version control system to check in and check out developed artifacts; the version control system was integrated with the Eclipse IDE.
  • Used Rational ClearQuest for defect logging and issue tracking.
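
The DAO/DTO layering referenced in this list can be illustrated with the compact sketch below, backed by plain JDBC; the connection details, table and column names are hypothetical placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Data Transfer Object: a plain holder passed between application layers
    class RiskItemDTO {
        private final long id;
        private final String status;
        RiskItemDTO(long id, String status) { this.id = id; this.status = status; }
        public long getId() { return id; }
        public String getStatus() { return status; }
    }

    // Data Access Object: hides the JDBC details from the business layer
    class RiskItemDAO {
        private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/RISKDB";  // hypothetical datasource
        private static final String USER = "app";
        private static final String PASSWORD = "secret";

        public RiskItemDTO findById(long id) throws SQLException {
            String sql = "SELECT id, status FROM risk_item WHERE id = ?";            // hypothetical table
            try (Connection con = DriverManager.getConnection(URL, USER, PASSWORD);
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setLong(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? new RiskItemDTO(rs.getLong("id"), rs.getString("status")) : null;
                }
            }
        }
    }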

Environment: Windows XP, RAD 7.0, Core Java, J2EE, Struts, Spring, Hibernate, Web Services, Design Patterns, WebSphere, Ant, Servlets, JSP, HTML, AJAX, JavaScript, CSS, jQuery, JSON, SOAP, WSDL, XML, Eclipse, Agile, Jira, Oracle 10g, WinSCP, Log4j, JUnit.

Confidential - Lewisville

Java/J2EE Developer

Responsibilities:

  • Designed and developed the application using agile methodology.
  • Implemented new modules and change requests and fixed code defects identified in pre-production and production environments.
  • Wrote technical design document with class, sequence, and activity diagrams in each use case.
  • Created Wiki pages using Confluence Documentation.
  • Developed various reusable helper and utility classes which were used across all modules of the application.
  • Involved in developing XML compilers using XQuery.
  • Developed the application using the Spring MVC framework, implementing Controller and Service classes (a sketch follows this list).
  • Involved in writing the Spring configuration XML file containing bean declarations and their dependent object declarations.
  • Used Hibernate as the persistence framework, creating DAOs and using Hibernate for ORM mapping.
  • Wrote Java classes to test the UI and web services through JUnit.
  • Performed functional and integration testing and was extensively involved in release/deployment activities. Responsible for designing rich user interface applications using JSP, JSP tag libraries, Spring tag libraries, JavaScript, CSS and HTML.
  • Used SVN for version control. Log4J was used to log both User Interface and Domain Level Messages.
  • Used Soap UI for testing the Web Services.
  • Used Maven for dependency management and project structure.
  • Created deployment documents for various environments such as Test, QC and UAT.
  • Involved in system wide enhancements supporting the entire system and fixing reported bugs.
  • Explored Spring MVC, Spring IOC, Spring AOP, and Hibernate in creating the POC.
  • Performed front-end data manipulation using JavaScript and JSON.
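
The Controller/Service/DAO layering described in this list could be sketched as below, using Spring MVC annotations and a Hibernate SessionFactory; the entity name, URL mapping, view name and bean wiring are hypothetical placeholders.

    import java.util.List;
    import org.hibernate.SessionFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Controller;
    import org.springframework.stereotype.Repository;
    import org.springframework.stereotype.Service;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    @Repository
    class OrderDao {
        @Autowired
        private SessionFactory sessionFactory;   // wired through the Spring configuration XML

        @SuppressWarnings("unchecked")
        public List<Object> findAll() {
            // HQL query against a hypothetical mapped entity named Order
            return sessionFactory.getCurrentSession().createQuery("from Order").list();
        }
    }

    @Service
    class OrderService {
        @Autowired
        private OrderDao orderDao;

        public List<Object> listOrders() {
            return orderDao.findAll();
        }
    }

    @Controller
    class OrderController {
        @Autowired
        private OrderService orderService;

        @RequestMapping(value = "/orders", method = RequestMethod.GET)   // hypothetical URL mapping
        public String listOrders(Model model) {
            model.addAttribute("orders", orderService.listOrders());
            return "orderList";   // resolved by the view resolver to a JSP page
        }
    }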

Environment: Java, J2EE, JSP, Spring, Hibernate, CSS, JavaScript, Oracle, JBoss, Maven, Eclipse, JUnit, Log4J, AJAX, Web services, JNDI, JMS, HTML, XML, XSD, XML Schema, SVN, Git.
