Senior Hadoop Developer Resume

Richmond, TX

SUMMARY:

  • 7+ years of professional IT experience analysing requirements and designing and testing highly distributed, mission-critical applications.
  • Strong working experience in Big Data and Hadoop ecosystems.
  • Expertise in using various tools in Hadoop ecosystem including MapReduce, HBase, Hive, Pig, Oozie, Sqoop, Flume, Spark, Storm, Kafka, and Zookeeper.
  • Good knowledge of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and MapReduce programs.
  • Proficient in using HDFS commands, writing MapReduce jobs, and managing and reviewing Hadoop log files using Hadoop Log Analyzer.
  • Good understanding of security requirements for Hadoop clusters and integration with the Kerberos Key Distribution Centre.
  • Strong skills in querying using Hive, Pig, HBase, Spark SQL, and MongoDB.
  • Extensive experience working with Oracle, MySQL, and Microsoft SQL Server databases.
  • Good knowledge of Amazon Web Services such as Amazon EC2, IAM, EMR, S3, Redshift, DynamoDB, Aurora, and AWS security compliance programs.
  • Experienced in constructing Apache Storm and Apache Kafka pipelines and implementing controlled data flow procedures using workflow tools like Oozie.
  • Hands-on experience in designing ETL operations including data extraction, data cleansing, data transformation, and data loading.
  • Good knowledge of developing data ingestion processes using Flume agents and Spark Streaming for real-time and near-real-time data analysis.
  • Experience in developing workflows using Flume agents with multiple sources like web server logs and REST APIs, and multiple sinks like the HDFS sink and Kafka sink.
  • Experience in implementing a unified data platform to collect data from different data sources using Apache Kafka brokers and various producers and consumers.
  • Proficient in Java, Scala, and Python.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Experience in working on various operating systems like UNIX, Linux, and Windows.
  • Excellent understanding of the Software Development Life Cycle (SDLC) and strong knowledge of various project implementation methodologies including Waterfall and Agile.
  • Worked in an Agile environment with active Scrum participation.
  • Strong communication and interpersonal skills. Able to quickly master and apply new concepts and to adapt to different project environments and applications with minimal supervision.

TECHNICAL SKILLS:

Programming Languages: Core Java, Scala, Python, C, C++, Unix Shell Scripting

Big Data Technologies: Hadoop and MapReduce, HDFS, YARN, Pig, Hive, Sqoop, Oozie, HBase, Spark, Flume, Kafka, ZooKeeper

Databases: MySQL, Oracle, Microsoft SQL Server, Microsoft Azure SQL, PostgreSQL

NoSQL Databases: MongoDB, HBase, DynamoDB, Oracle NoSQL Database

Operating Systems: Windows, Linux, macOS

Methodologies: Agile, Waterfall, Scrum

Network Protocols: TCP/IP, HTTP, HTTPS, UDP, DNS

Integration Tools: Jenkins and Hudson

Build Tools: Ant, Maven, Gradle

Version Control: SVN, TortoiseGit, GitHub, TFS

IDEs: Eclipse, NetBeans, IntelliJ IDEA, Notepad++, Visual Studio

Cloud Services: Amazon Web Services, Microsoft Azure

PROFESSIONAL EXPERIENCE:

Senior Hadoop Developer

Confidential, Richmond, TX

Responsibilities:
  • Participated in SDLC requirements gathering, analysis, design, development, and testing of the application using Agile and Scrum methodology.
  • Developed a detailed understanding of the existing build system and the related tools that track information on products, releases, and test results.
  • Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
  • Consumed web services for transferring data between different applications using RESTful APIs.
  • Built a Talend-based mechanism that automatically moves existing proprietary binary-format data files to HDFS through a service called the Ingestion service.
  • Implemented a prototype to integrate PDF documents into a web application using GitHub.
  • Actively participated in process improvement, normalization/de-normalization, data extraction, data cleansing, and data manipulation within Scrum.
  • Performed data transformations in Scala and Hive, using partitions and buckets to improve performance.
  • Wrote custom InputFormat and RecordReader classes for reading and processing the binary format in MapReduce.
  • Used the Mockito framework for mocking in unit tests.
  • Involved in Test Driven Development (TDD) and Acceptance Test Driven Development (ATDD).
  • Managed and deployed Amazon Web Services Elastic MapReduce (AWS EMR) clusters.
  • Built cloud-native applications using Amazon Web Services, specifically Elastic MapReduce (EMR), Lambda, DynamoDB, and Elastic Beanstalk.
  • Managed data schema versions across various microservices.
  • Developed and tested the enterprise application with JUnit.
  • Wrote custom Writable classes for Hadoop serialization and deserialization of time-series tuples.
  • Implemented a custom file loader for Pig to query large data files, such as build logs, directly.
  • Used Python for pattern matching in build logs to format errors and warnings.
  • Developed Pig Latin scripts and shell scripts for validating the different query modes in Historian.
  • Created Hive external tables on the MapReduce output, then applied partitioning and bucketing on top of them.
  • Improved performance by tuning Hive and MapReduce jobs, working with Scala, Talend, ActiveMQ, and JBoss.
  • Developed a daily test engine using Python for continuous tests.
  • Developed rich interactive visualizations integrating various reporting components from multiple data sources.
  • Used shell scripting for Jenkins job automation with Talend.
  • Built a custom calculation engine that can be programmed according to user needs.
  • Ingested data into Hadoop using shell scripting and Sqoop, and applied data transformations using Pig and Hive.
  • Handled performance improvements to the Pre-Ingestion service, which generates the Big Data Format binary files from older versions of Historian.
  • Worked with support teams and resolved operational and performance issues.

Environment: Apache Hadoop, Hive, Scala, Pig, HDFS, Cloudera, Java MapReduce, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, JDK 1.8/1.7, Agile and Scrum development process, NoSQL, JBoss, Flink, JavaScript, and Mockito

Senior Hadoop Developer

Confidential, Boca Raton, FL

Responsibilities:
  • Involved in all phases of development activities from requirements collection to production support.
  • Developed a detailed understanding of the current system and identified the different sources of data for EMR.
  • Involved in cluster setup.
  • Performed batch processing of logs from various data sources using MapReduce.
  • Submitted automated jobs on Cloudera via Jenkins scripts.
  • Contributed to predictive analytics to monitor inventory levels and ensure product availability.
  • Analysed customers' purchasing behaviours in JMS.
  • Supported value-added services based on clients' profiles and purchasing habits.
  • Worked on gathering and refining requirements, interviewing business users to understand and document data requirements including elements, entities and relationships, in addition to visualization and report specifications.
  • Defined UDFs using Pig and Hive to capture customer behaviour.
  • Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, Spark SQL, Apache Pig, and Oozie.
  • Integrated Apache Kafka for data ingestion.
  • Created Hive external tables on the MapReduce output, using Scala and Hive, before partitioning and bucketing.
  • Maintained data and imported scripts using HBase, Hive, and MapReduce jobs.
  • Developed and maintained several batch jobs that run automatically depending on business requirements.
  • Imported and exported data between environments such as MySQL and HDFS.
  • Performed unit testing and deployment for internal use, and monitored the performance of the solution.

Environment: Apache Hadoop, Cloudera Distribution, RHEL, Hive, HBase, Pig, HDFS, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, and Teradata.

Hadoop Developer

Confidential, Winston Salem, NC

Responsibilities:
  • Involved in system analysis and design as well as object-oriented design and development, using OOAD methodology to capture and model business requirements.
  • Worked with several clients on day-to-day requests and responsibilities.
  • Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Involved in analysing system failures, identifying root causes, and recommending courses of action.
  • Worked on Hive for exposing data for further analysis and for generating and transforming files from different analytical formats to text files.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Managed and scheduled jobs on a Hadoop cluster.
  • Implemented and maintained various projects in Java.
  • Used Java and MySQL day to day to debug and fix issues with client processes.
  • Developed, tested, and implemented a financial-services application to bring multiple clients into a standard database format.
  • Assisted in designing, building, and maintaining a database to analyse the life cycle of checking and debit transactions.
  • Developed Java and J2EE applications using object-oriented analysis, with extensive involvement throughout the Software Development Life Cycle (SDLC).
  • Worked with J2SE, XML, Web Services, WSDL, SOAP, UDDI, and TCP/IP.
  • Used JSP, Servlets, JavaServer Faces, EJB, JDBC, JNDI, Struts, Maven, Trac, Subversion, and JUnit as part of development.
  • Used database technologies such as Oracle 8i and Oracle 9i, DB2, and PL/SQL.
  • Worked with Sun ONE Application Server, WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server, and J2EE application deployment technology.

Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Cloudera, Java, JDBC, JNDI, Struts, Maven, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, PuTTY, and Eclipse.

Java/Big Data Developer

Confidential, Cleveland, OH

Responsibilities:
  • Designed, implemented, and tested clustered, multi-tiered e-commerce products. Core technologies used include IIS, SQL Server, ASP, XML/XSLT, JSP, Tomcat, JavaBeans, and Java Servlets.
  • Developed the XML Schema and Web services for the data maintenance and structures.
  • Implemented the Web Service client for the login authentication, credit reports and applicant information using Apache Axis 2 Web Service.
  • Followed Agile Methodology (TDD, SCRUM) to satisfy the customers and wrote JUnit test cases for unit testing the integration layer.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
  • Analysed the data by running Hive queries and Pig scripts to understand user behaviour.
  • Developed scripts to extract data from MySQL into HDFS.
  • Worked on different file formats like Sequence files, XML files and Map files using MapReduce Programs.
  • Used Hibernate ORM framework with Spring framework for data persistence and transaction management.
  • Loaded the aggregated data into DB2 for reporting on the dashboard.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Strong expertise in the MapReduce programming model with XML, JSON, and CSV file formats.
  • Experience in managing and reviewing Hadoop log files.
  • Involved in loading data from Linux file system to HDFS.
  • Implemented test scripts to support test driven development and continuous integration.
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties, and the Thrift server in Hive.
  • Worked with the Data Science team to gather requirements for various data mining projects.

Environment: HDFS, Hadoop 2.2.0 (YARN), Flume 1.5.2, Eclipse, SQL Server, MapReduce, Hive 1.1.0, Pig Latin 0.14.0, JavaBeans, SQL, Sqoop 1.4.6, Oozie, CentOS, ZooKeeper 3.5.0, and NoSQL database.

Java Developer

Confidential

Responsibilities:
  • Involved in analysis, design and development of Expense Processing systems.
  • Created user interfaces using JSP.
  • Developed the Web Interface using Servlets, Java Server Pages, HTML and CSS.
  • Developed the DAO objects using JDBC.
  • Developed business services using Servlets and Java.
  • Designed and developed user interfaces and menus using HTML5, JSP, and JavaScript, with client-side and server-side validations.
  • Developed the GUI using JSP and the Struts framework.
  • Involved in developing the presentation layer using Spring MVC, AngularJS, and jQuery.
  • Involved in designing the user interfaces using Struts Tiles Framework.
  • Used Spring 2.0 Framework for Dependency injection and integrated with the Struts Framework and Hibernate.
  • Used Hibernate 3.0 in data access layer to access and update information in the database.
  • Applied SOA (Service Oriented Architecture) by creating web services with SOAP and WSDL.
  • Developed JUnit test cases for all the developed modules.
  • Used Log4J to capture logs, including runtime exceptions; monitored error logs and fixed the problems.
  • Used RESTful services to interact with the client by providing the RESTful URL mapping.
  • Used CVS for version control across common source code used by developers.
  • Used Ant scripts to build the application and deployed on Oracle WebLogic Server 10.0.

Environment: Struts 1.2, Hibernate 3.0, Spring 2.5, JSP, Servlets, XML, SOAP, WSDL, JDBC, JavaScript, HTML, CVS, Log4J, JUnit, WebLogic application server, Eclipse, Oracle, RESTful services.

Java Developer

Confidential

Responsibilities:
  • Wrote SQL queries, stored procedures, and triggers to perform back-end database operations.
  • Developed nightly batch jobs which involved interfacing with external third-party state agencies.
  • Implemented a JMS producer and consumer using Mule ESB.
  • Gathered business requirements and wrote functional specifications and detailed design documents.
  • Extensively used Core Java, Servlets, JSP and XML.
  • Wrote AngularJS controllers, views, and services.
  • Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 9i database.
  • Implemented Enterprise Logging service using JMS and Apache CXF.
  • Developed unit test cases and used JUnit for unit testing of the application.
  • Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user requirements.

Environment: Java, Spring Core, JMS, web services, JDK, SVN, Maven, Mule ESB, JUnit, WAS 7, jQuery, Ajax, SAX.
