Sr. Hadoop Developer Resume

Atlanta, GA

SUMMARY:

  • Around 9 years of professional IT experience in all phases of the Software Development Life Cycle, including hands-on experience in Java/J2EE technologies and Big Data Analytics.
  • Substantial experience writing MapReduce jobs in Java and working with Pig, Flume, Tez, Zookeeper, Hive and Storm.
  • Hands-on experience in installing, configuring and using ecosystem components like Hadoop MapReduce, HDFS, HBase, Avro, Zookeeper, Oozie, Hive, HDP, Cassandra, Sqoop, Pig and Flume.
  • Extensive knowledge of automation tools such as Puppet and Chef.
  • Experience in web-based languages such as HTML, CSS, PHP, XML and other web methodologies including Web Services and SOAP.
  • Good experience in importing and exporting data between HDFS and Relational Database Management Systems using Sqoop.
  • Extensive knowledge of NoSQL databases such as HBase.
  • Worked on multi-cluster environments and set up the Cloudera Hadoop ecosystem.
  • Background with traditional databases such as Oracle, Teradata, Netezza and SQL Server, ETL tools/processes and data warehousing architectures.
  • Proficient in working with MapReduce programs on Apache Hadoop for processing Big Data.
  • Experienced in working with different Hadoop ecosystem components such as HDFS, MapReduce, HBase, Spark, Yarn, Kafka, Zookeeper, Pig, Hive, Sqoop, Storm, Oozie, Impala and Flume.
  • Experience in transferring streaming data from different data sources into HDFS and HBase using Apache Flume.
  • Experienced in using Zookeeper and Oozie operational services for coordinating the cluster and scheduling workflows.
  • Experience working with Cloudera and Hortonworks distributions of Hadoop.
  • Extensive experience in Java and J2EE technologies like Servlets, JSP, Enterprise Java Beans (EJB), JDBC.
  • Experienced in importing data from various data sources, performing transformations using Hive and MapReduce, loading data into HDFS, and extracting data from relational databases like Oracle, MySQL and Teradata into HDFS and Hive using Sqoop.
  • Expertise in writing Hive queries, Pig and MapReduce scripts and loading large volumes of data from the local file system and HDFS into Hive.
  • Hands-on experience in fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka.
  • Good experience in working with cloud environments like Amazon Web Services EC2 and S3.
  • Hands-on experience working with the Amazon EMR framework, transferring data to an EC2 server.
  • Good knowledge in Software Development Life Cycle (SDLC) and Software Testing Life Cycle (STLC).
  • Expertise in writing MapReduce jobs in Java for processing large sets of structured, semi-structured and unstructured data and storing them in HDFS.
  • Experienced in Application Development using Java, Hadoop, RDBMS and Linux shell scripting and performance tuning.
  • Experienced in loading data to Hive partitions and creating buckets in Hive.
  • Experienced in relational databases like MySQL, Oracle and NoSQL databases like HBase and Cassandra.
  • Hands-on experience working on NoSQL databases including HBase and Cassandra and their integration with Hadoop clusters.
  • Hands-on experience in developing Hadoop clusters on public and private cloud environments like Amazon AWS and OpenStack.

TECHNICAL SKILLS:

Java/J2EE Technologies: JSP, Servlets, jQuery, JDBC, JavaScript

Hadoop/Big Data: HDFS, Hive, Pig, HBase, Map Reduce, Zookeeper, Spark, Scala, Akka, Kafka, Sqoop, Oozie, Flume, Storm

Programming Languages: Java, J2EE, HQL, R, Python, XPath, PL/SQL, Pig Latin.

Operating Systems: UNIX, Linux, Windows

Web Technologies: HTML, XML, DHTML, XHTML, CSS, XSLT.

Web/Application servers: Apache HTTP server, Apache Tomcat, JBoss.

Frameworks: MVC, Struts, Spring, Hibernate

Databases: Microsoft Access, MongoDB, Cassandra, MS SQL, Oracle.

PROFESSIONAL EXPERIENCE:

Confidential - Atlanta, GA

Sr. Hadoop Developer

Responsibilities:

  • Hands-on experience in developing and deploying enterprise-based applications using major components in the Hadoop ecosystem like Hadoop2, YARN, Hive, Pig, MapReduce, HBase, Flume, Sqoop, Spark, Storm, Kafka, Oozie and Zookeeper.
  • Experience in installation, configuration, supporting and managing Hadoop clusters using Hortonworks and Cloudera (CDH3, CDH4) distributions on Amazon Web Services (AWS).
  • Excellent Programming skills at a higher level of abstraction using Scala and Spark.
  • Good understanding in processing of real-time data using Spark.
  • Hands on experience in Importing and exporting data from different databases like MySQL, Oracle, Teradata into HDFS using Sqoop.
  • Strong experience working with real time streaming applications and batch style large scale distributed computing applications using tools like Spark Streaming, Kafka, Flume, MapReduce, and Hive.
  • Involved in implementing and integrating NoSQL databases like HBase and Apache Cassandra.
  • Managing and scheduling batch Jobs on a Hadoop Cluster using Oozie.
  • Experience in managing and reviewing Hadoop Log files.
  • Used Zookeeper to provide coordination services to the cluster.
  • Used Microsoft Azure for building, testing and deploying the applications.
  • Experienced using Sqoop to import data into HDFS from RDBMS and vice-versa.
  • Experience and understanding in Spark and Storm.
  • Hands-on experience extracting data from log files and copying it into HDFS using Flume.
  • Experience in analysing data using Hive, Pig Latin, and custom MR programs in Java.
  • Hands on experience in Analysis, Design, Coding and testing phases of Software Development Life Cycle (SDLC).
  • Experience with multiple databases and tools, SQL analytical functions, Oracle PL/SQL, SQL Server and DB2.
  • Experience in Creating ETL/Talend jobs both design and code to process data to target databases.
  • Worked on different file formats like Avro, Parquet, RC file format, JSON format.
  • Involved in writing Python scripts for building disaster recovery process for current processing data into data center by providing current static location.
  • Hands-on experience working on NoSQL databases like MongoDB, HBase and Cassandra and their integration with Hadoop clusters.
  • Experience in ingesting data into Cassandra and consuming the ingested data from Cassandra to HDFS.
  • Used Apache Nifi for loading PDF Documents from Microsoft SharePoint to HDFS.
  • Used Avro serialization technique to serialize data for handling schema evolution.
  • Experience in designing and coding web applications using Core Java and web technologies (JSP, Servlets and JDBC); full understanding of the J2EE technology stack, including Java-related frameworks like Spring and ORM frameworks (Hibernate).
  • Experience in designing the User Interfaces using HTML, CSS, JavaScript and JSP.
  • Developed web application in open source java framework Spring. Utilized Spring MVC framework.
  • Experienced front-end development using EXT-JS, jQuery, JavaScript, HTML, Ajax and CSS.
  • Good interpersonal, communication and problem-solving skills; explore and adapt to new technologies with ease and work well as a team member.

Environment: Hadoop, YARN, HBase, Azure, SDLC, MVC, NoSQL, Kafka, Python, Zookeeper, Oozie, jQuery, JavaScript, HTML, Ajax and CSS.

Confidential, Louisville, KY

Sr. Hadoop Developer

Responsibilities:

  • Involved in complete SDLC - Requirement Analysis, Development, System Integration Testing and Performance Testing.
  • Involved in architecture and design of distributed time-series database platform using NoSQL technologies like Hadoop/HBase, Zookeeper.
  • Good understanding of Spark Algorithms such as Classification, Clustering, and Regression.
  • Good understanding on Spark Streaming with Kafka for real-time processing.
  • Extensive experience working with Spark tools like RDD transformations, Spark MLlib and Spark SQL.
  • Experienced in moving data from different sources using Kafka producers, consumers and preprocess data using Storm topologies.
  • Experience in working with Amazon Web Services EC2 instance and S3 buckets.
  • Responsible for configuring deployment environment to handle the application using Jetty server and WebLogic 10 and Postgres database at the back-end.
  • Involved in the implementation of Spring MVC Pattern and developed persistence layer using Hibernate framework.
  • Implemented ORM through Hibernate and involved in preparing the Database Model for the project.
  • Followed Scrum methodology for the application development.
  • Supported MapReduce programs running on the cluster and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Developed various helper classes needed following Core Java multi-threaded programming and Collection classes.
  • Extracted data from Netezza databases to Hadoop framework.
  • Extracted the data from various sources into HDFS using Sqoop and ran Pig scripts on the huge chunks of data.
  • Further used Pig with the Elephant Bird API to perform transformations, event joins and pre-aggregations before loading JSON files onto HDFS.
  • Involved in resolving performance issues in Pig and Hive with understanding of Map Reduce physical plan execution and using debugging commands to run code in optimized way.
  • Good understanding of Partitions, Bucketing concepts in Hive and designed both Managed and External tables in Hive to optimize performance.

Environment: HDFS, Spark, Pig, Sqoop, MapR, HBase, Zookeeper, Kafka, AWS, Netezza, Core Java.

Confidential - Columbus, OH

Sr. Hadoop/Spark Developer

Responsibilities:

  • Developed Spark applications to perform all the data transformations on User behavioral data coming from multiple sources.
  • Configured Spark streaming to receive real time data from the Kafka and store the stream data to HDFS using Scala.
  • Responsible for managing data coming from different sources.
  • Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
  • Performed Filesystem management and monitoring on Hadoop log files.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Developed Spark code using Scala and Spark SQL for faster testing and data processing.
  • Performed masking on customer sensitive data using Flume interceptors.
  • Used Oozie and Oozie coordinators to deploy end to end data processing pipelines and scheduling the work flows.
  • Involved in migration of data from existing RDBMS (Oracle and SQL Server) to Hadoop using Sqoop for processing data.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Worked on large sets of structured, semi-structured and unstructured data.

Environment: Apache Hadoop, HDFS, MapReduce, Hive, HBase, Sqoop, Oozie, Maven, Shell Scripting, Spark, Scala, Cloudera Manager.

Confidential - Rensselaer, NY

Spark/Hadoop Consultant

Responsibilities:

  • Designing the entire architecture of the data pipeline for analysis.
  • Worked on Sqoop jobs to import data from Oracle and bring into HDFS.
  • Wrote Scala script to load processed data into DataStax Cassandra.
  • Performance tuning of Spark and Sqoop jobs.
  • Developed parser and loader MapReduce application to retrieve data from HDFS and store it in HBase and Hive.
  • Wrote MapReduce job to compare two TSV files and save the processed output into Oracle.
  • Hands on design and development of an application using Hive (UDF).
  • Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query
  • Provided support to data analysts in running Pig and Hive queries.
  • Transformed the Ab Initio process into Hadoop using Pig and Hive.
  • Created partitioned tables in Hive.
  • Created Reports using Tableau on HiveServer2.
  • Worked on Data Modelling for Dimension and Fact tables in Hive Warehouse.
  • Scheduling the jobs through Walgreens EBS internal Scheduling System.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.

Environment: Hortonworks Data Platform, Hadoop, Spark, Scala, SBT, Sqoop, MapReduce, HDFS, Pig, Hive, Java, Oracle, DataStax Cassandra, CentOS, Windows, Python.

Confidential, Edwardsville, IL

Hadoop Developer

Responsibilities:

  • Developed simple to complex MapReduce jobs using Java language for processing and validating the data.
  • Developed data pipeline using Sqoop, Spark, MapReduce, and Hive to ingest, transform and analyze customer behavioral data.
  • Exported analyzed data to relational databases using Sqoop for visualization to generate reports for the BI team.
  • Implemented Spark using Python and Spark SQL for faster processing of data, and algorithms for real-time analysis in Spark.
  • Used Spark for interactive queries, processing of streaming data and integration with popular NoSQL database for huge volume of data.
  • Used the Spark-Cassandra Connector to load data to and from Cassandra. Streamed data in real time using Spark with Kafka.
  • Developed Kafka producers and consumers in Java, integrated them with Apache Storm, and ingested data into HDFS and HBase by implementing the rules in Storm.
  • Built a prototype for real time analysis using Spark streaming and Kafka.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Involved in creating Hive tables and working on them using HiveQL and perform data analysis using Hive and Pig.
  • Developed workflow in Oozie to manage and schedule jobs on Hadoop cluster to trigger daily, weekly and monthly batch cycles.
  • Experience in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
  • Expertise in extending Hive and Pig core functionalities by writing custom User Defined Functions (UDF).
  • Used IMPALA to pull the data from Hive tables.
  • Worked on Apache Flume for collecting and aggregating huge amount of log data and stored it on HDFS for doing further analysis.
  • Created and developed an end-to-end data ingestion process onto Hadoop.
  • Involved in architecture and design of distributed time-series database platform using NoSQL technologies like Hadoop/HBase, Zookeeper.
  • Integrated NoSQL databases like HBase with MapReduce to move bulk amounts of data into HBase.
  • Efficiently put and fetched data to/from HBase by writing MapReduce jobs.

Environment: Hadoop, Kafka, Spark, Sqoop, Hive, Pig, NoSQL, Impala, Oozie, HBase, Zookeeper.

Confidential, Peapack, NJ

Java/J2EE Developer

Responsibilities:

  • Understood and analyzed business requirements. Participated in all phases of SDLC.
  • Involved in designing Use Case diagrams, Class diagrams and Sequence diagrams as a part of design phase.
  • Configured Spring framework using the Spring core module to inject dependencies and Spring ORM module to use Hibernate to persist data into Oracle database.
  • Developed RESTful Web Services using Jersey, JAX-RS to perform CRUD operations on the database server over HTTP and to consume web services for transferring data between different applications.
  • Used Spring Boot for developing microservices and used REST to retrieve data from the client side using a microservice architecture.
  • Used Singleton, Session Facade, and DAO patterns in implementing the application.
  • Used SAX parser for parsing the XML documents that are retrieved upon consuming the Web services.
  • Extensively worked with XML Schemas (XSD) for defining XML elements and attributes.
  • Deployed web components, presentation components and business components in IBM WebSphere Application Server.
  • Used RabbitMQ as the message broker to convert the entire flow as a SOA based architecture.
  • Involved in developing UI components using Angular JS and JSON to interact with RESTful web services.
  • Utilized JavaScript/jQuery libraries like Bootstrap and AJAX for form validations and other interactive features.
  • Created build environment for Java using Git and Maven.
  • Used Log4J to write log messages with various levels.
  • Developed the test cases with JUnit for Unit testing of the built components.
  • Worked on enhancements, change requests and defect fixing. Interacted with product owner and testers.
  • Contributed to standardizing project coding, code review guidelines and checklist.
  • Used Jenkins for Continuous Integration.
  • Used JIRA to keep track of the project, bugs and issues.
  • Followed Agile/ Scrum methodology to track project progress and participated in Scrum meetings.

Environment: Java, J2EE, Hibernate, Spring, Eclipse, IBM WebSphere, REST (JAX-RS), XML, JSON, CSS, JUnit, RabbitMQ, Maven, Oracle, Angular JS, JavaScript/jQuery, AJAX, JIRA, Jenkins.

Confidential

Java Developer

Responsibilities:

  • Involved in Architecture and System Design and development process.
  • Worked with off-site (USA based) resources for successful implementation of the Workflow module.
  • Created UI screens using Struts MVC for logging into the system and performing various operations on network elements.
  • Classified users into various organizations to differentiate the privileges between them in accessing the system.
  • Developed Use Cases, Business Logic and Unit Testing of Struts Based Application.
  • Developed JSP pages using Custom tags and Tiles framework and Struts framework.
  • Developed UI Screens for presentation logic using JSP, Struts Tiles, and HTML.
  • Used display tag to render large volumes of data.
  • Used Bean, HTML and Logic tags to avoid Java expressions and scriptlets in JSP.
  • Implemented Design patterns like Session Façade, Command, Singleton and DAO in business layer.
  • Created EJBs for Backend operations. Also used Hibernate for Database persistence.
  • Sent message objects using JMS to client queues and topics.
  • Created Unit test cases for unit testing.
  • Used Log4j for logging purposes and defined debug levels to control the log.
  • Built Application EAR using ANT.
  • Included Hibernate 3.0 annotations for Oracle DB.

Environment: Java 1.5, JavaScript, CSS, AJAX, J2EE, JSP, EJB, Struts 1.2, WebSphere 5.0, Apache Tomcat, Web Services, Hibernate, JMS, XML, XSL, HTML.
