Sr. Hadoop Developer Resume Atlanta, GA - Hire IT People

SUMMARY:

Around 9 years of professional IT experience in all phases of the Software Development Life Cycle, including hands-on experience in Java/J2EE technologies and Big Data analytics.
Substantial experience writing MapReduce jobs in Java and working with Pig, Flume, Tez, ZooKeeper, Hive and Storm.
Hands-on experience in installing, configuring and using ecosystem components like Hadoop MapReduce, HDFS, HBase, Avro, ZooKeeper, Oozie, Hive, HDP, Cassandra, Sqoop, Pig and Flume.
Extensive knowledge of automation tools such as Puppet and Chef.
Experience in web-based languages such as HTML, CSS, PHP and XML, and other web methodologies including Web Services and SOAP.
Good experience in importing and exporting data between HDFS and Relational Database Management Systems using Sqoop.
Extensive knowledge of NoSQL databases such as HBase.
Worked on multi-cluster environments and set up the Cloudera Hadoop ecosystem.
Background with traditional databases such as Oracle, Teradata, Netezza and SQL Server, ETL tools and processes, and data warehousing architectures.
Proficient in working with MapReduce programs using Apache Hadoop for working with Big Data.
Experienced in working with different Hadoop ecosystem components such as HDFS, MapReduce, HBase, Spark, YARN, Kafka, ZooKeeper, Pig, Hive, Sqoop, Storm, Oozie, Impala and Flume.
Experience in transferring Streaming data from different data sources into HDFS and HBase using Apache Flume.
Experienced in using ZooKeeper and Oozie operational services for coordinating the cluster and scheduling workflows.
Experience working with the Cloudera and Hortonworks distributions of Hadoop.
Extensive experience in Java and J2EE technologies like Servlets, JSP, Enterprise Java Beans (EJB), JDBC.
Experienced in importing data from various sources, performing transformations using Hive and MapReduce, loading data into HDFS, and extracting data from relational databases like Oracle, MySQL and Teradata into HDFS and Hive using Sqoop.
Expertise in writing Hive queries and Pig and MapReduce scripts, and in loading large volumes of data from the local file system and HDFS into Hive.
Hands-on experience fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka.
Good experience in working with cloud environments like Amazon Web Services EC2 and S3.
Hands-on experience working with the Amazon EMR framework and transferring data to EC2 servers.
Good knowledge in Software Development Life Cycle (SDLC) and Software Testing Life Cycle (STLC).
Expertise in writing MapReduce jobs in Java for processing large sets of structured, semi-structured and unstructured data and storing them in HDFS.
Experienced in Application Development using Java, Hadoop, RDBMS and Linux shell scripting and performance tuning.
Experienced in loading data into Hive partitions and creating buckets in Hive.
Experienced in relational databases like MySQL, Oracle and NoSQL databases like HBase and Cassandra.
Hands-on experience working on NoSQL databases including HBase and Cassandra and their integration with Hadoop clusters.
Hands-on experience in developing Hadoop clusters on public and private cloud environments like Amazon AWS and OpenStack.

TECHNICAL SKILLS:

Java/J2EE Technologies: JSP, Servlets, jQuery, JDBC, JavaScript

Hadoop/Big Data: HDFS, Hive, Pig, HBase, MapReduce, ZooKeeper, Spark, Scala, Akka, Kafka, Sqoop, Oozie, Flume, Storm

Programming Languages: Java, J2EE, HQL, R, Python, XPath, PL/SQL, Pig Latin.

Operating Systems: UNIX, Linux, Windows

Web Technologies: HTML, XML, DHTML, XHTML, CSS, XSLT.

Web/Application servers: Apache HTTP Server, Apache Tomcat, JBoss.

Frameworks: MVC, Struts, Spring, Hibernate

Databases: Microsoft Access, MongoDB, Cassandra, MS SQL, Oracle.

PROFESSIONAL EXPERIENCE:

Confidential - Atlanta, GA

Sr. Hadoop Developer

Responsibilities:

Hands-on experience in developing and deploying enterprise applications using major components of the Hadoop ecosystem like Hadoop 2, YARN, Hive, Pig, MapReduce, HBase, Flume, Sqoop, Spark, Storm, Kafka, Oozie and ZooKeeper.
Experience in installation, configuration, supporting and managing Hadoop Clusters using Hortonworks and Cloudera (CDH3, CDH4) distributions on Amazon web services (AWS).
Excellent Programming skills at a higher level of abstraction using Scala and Spark.
Good understanding in processing of real-time data using Spark.
Hands on experience in Importing and exporting data from different databases like MySQL, Oracle, Teradata into HDFS using Sqoop.
Strong experience working with real time streaming applications and batch style large scale distributed computing applications using tools like Spark Streaming, Kafka, Flume, MapReduce, and Hive.
Involved in implementing and integrating NoSQL databases like HBase and Apache Cassandra.
Managing and scheduling batch Jobs on a Hadoop Cluster using Oozie.
Experience in managing and reviewing Hadoop Log files.
Used Zookeeper to provide coordination services to the cluster.
Used Microsoft Azure for building, testing and deploying the applications.
Experienced using Sqoop to import data into HDFS from RDBMS and vice-versa.
Experience and understanding in Spark and Storm.
Hands-on experience extracting data from log files and copying it into HDFS using Flume.
Experience in analysing data using Hive, Pig Latin, and custom MR programs in Java.
Hands on experience in Analysis, Design, Coding and testing phases of Software Development Life Cycle (SDLC).
Experience with multiple databases and tools, SQL analytical functions, Oracle PL/SQL, SQL Server and DB2.
Experience in creating ETL/Talend jobs, covering both design and code, to process data into target databases.
Worked on different file formats like Avro, Parquet, RC file format, JSON format.
Involved in writing Python scripts to build a disaster recovery process that moves currently processed data into the data center by providing the current static location.
Hands-on experience working on NoSQL databases like MongoDB, HBase and Cassandra and their integration with Hadoop clusters.
Experience in ingesting data into Cassandra and consuming the ingested data from Cassandra to HDFS.
Used Apache Nifi for loading PDF Documents from Microsoft SharePoint to HDFS.
Used Avro serialization technique to serialize data for handling schema evolution.
Experience in designing and coding web applications using Core Java and web technologies (JSP, Servlets and JDBC); full understanding of the J2EE technology stack, including Java-related frameworks like Spring and ORM frameworks (Hibernate).
Experience in designing the User Interfaces using HTML, CSS, JavaScript and JSP.
Developed a web application in the open-source Java framework Spring, utilizing the Spring MVC framework.
Experienced in front-end development using Ext JS, jQuery, JavaScript, HTML, Ajax and CSS.
Good interpersonal and communication skills, strong problem-solving skills, the ability to explore and adapt to new technologies with ease, and a good team member.

Environment: Hadoop, YARN, HBase, Azure, SDLC, MVC, NoSQL, Kafka, Python, Zookeeper, Oozie, jQuery, JavaScript, HTML, Ajax and CSS.

Confidential, Louisville, KY

Sr. Hadoop Developer

Responsibilities:

Involved in complete SDLC - Requirement Analysis, Development, System Integration Testing and Performance Testing.
Involved in architecture and design of distributed time-series database platform using NoSQL technologies like Hadoop/HBase, Zookeeper.
Good understanding of Spark Algorithms such as Classification, Clustering, and Regression.
Good understanding on Spark Streaming with Kafka for real-time processing.
Extensive experience working with Spark tools like RDD transformations, Spark MLlib and Spark SQL.
Experienced in moving data from different sources using Kafka producers and consumers and preprocessing data using Storm topologies.
Experience in working with Amazon Web Services EC2 instance and S3 buckets.
Responsible for configuring the deployment environment to handle the application using Jetty server and WebLogic 10, with a Postgres database at the back end.
Involved in the implementation of Spring MVC Pattern and developed persistence layer using Hibernate framework.
Implemented ORM through Hibernate and involved in preparing the Database Model for the project.
Followed Scrum methodology for the application development.
Supported MapReduce programs running on the cluster and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
Developed various helper classes needed following Core Java multi-threaded programming and Collection classes.
Extracted data from Netezza databases to Hadoop framework.
Extracted the data from various sources into HDFS using Sqoop and ran Pig scripts on the huge chunks of data.
Further used Pig with the Elephant Bird API to perform transformations, event joins and pre-aggregations before loading JSON-format files onto HDFS.
Involved in resolving performance issues in Pig and Hive with understanding of Map Reduce physical plan execution and using debugging commands to run code in optimized way.
Good understanding of Partitions, Bucketing concepts in Hive and designed both Managed and External tables in Hive to optimize performance.

Environment: HDFS, Spark, Pig, Sqoop, MapR, HBase, ZooKeeper, Kafka, AWS, Netezza, Core Java.

Confidential - Columbus, OH

Sr. Hadoop/Spark Developer

Responsibilities:

Developed Spark applications to perform all the data transformations on User behavioral data coming from multiple sources.
Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Scala.
Responsible for managing data coming from different sources.
Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
Performed Filesystem management and monitoring on Hadoop log files.
Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
Developed Spark code using Scala and Spark SQL for faster testing and data processing.
Performed masking on customer sensitive data using Flume interceptors.
Used Oozie and Oozie coordinators to deploy end-to-end data processing pipelines and schedule the workflows.
Involved in migration of data from existing RDBMS (Oracle and SQL Server) to Hadoop using Sqoop for processing data.
Monitored workload, job performance and capacity planning using Cloudera Manager.
Worked on large sets of structured, semi-structured and unstructured data.

Environment: Apache Hadoop, HDFS, MapReduce, Hive, HBase, Sqoop, Oozie, Maven, Shell Scripting, Spark, Scala, Cloudera Manager.

Confidential - Rensselaer, NY

Spark/Hadoop Consultant

Responsibilities:

Designing the entire architecture of the data pipeline for analysis.
Worked on Sqoop jobs to import data from Oracle and bring into HDFS.
Wrote a Scala script to load processed data into DataStax Cassandra.
Performance tuning of Spark and Sqoop jobs.
Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
Wrote a MapReduce job to compare two TSV files and save the processed output into Oracle.
Hands on design and development of an application using Hive (UDF).
Responsible for writing Hive queries for analyzing data in the Hive warehouse using Hive Query Language (HQL).
Provided support to data analysts in running Pig and Hive queries.
Transformed the Ab Initio process into Hadoop using Pig and Hive.
Created partitioned tables in Hive.
Created Reports using Tableau on HiveServer2.
Worked on Data Modelling for Dimension and Fact tables in Hive Warehouse.
Scheduling the jobs through Walgreens EBS internal Scheduling System.
Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.

Environment: Hortonworks Data Platform, Hadoop, Spark, Scala, SBT, Sqoop, MapReduce, HDFS, Pig, Hive, Java, Oracle, DataStax Cassandra, CentOS, Windows, Python.

Confidential, Edwardsville, IL

Hadoop Developer

Responsibilities:

Developed simple to complex MapReduce jobs using Java language for processing and validating the data.
Developed a data pipeline using Sqoop, Spark, MapReduce and Hive to ingest, transform and analyze customer behavioral data.
Exported analyzed data to relational databases using Sqoop for visualization to generate reports for the BI team.
Implemented Spark using Python and Spark SQL for faster processing of data, and implemented algorithms for real-time analysis in Spark.
Used Spark for interactive queries, processing of streaming data and integration with popular NoSQL database for huge volume of data.
Used the Spark-Cassandra Connector to load data to and from Cassandra. Streamed data in real time using Spark with Kafka.
Developed Kafka producers and consumers in Java, integrated them with Apache Storm, and ingested data into HDFS and HBase by implementing the rules in Storm.
Built a prototype for real time analysis using Spark streaming and Kafka.
Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
Involved in creating Hive tables and working on them using HiveQL and perform data analysis using Hive and Pig.
Developed workflow in Oozie to manage and schedule jobs on Hadoop cluster to trigger daily, weekly and monthly batch cycles.
Experience in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
Expertise in extending Hive and Pig core functionalities by writing custom User Defined Functions (UDF).
Used Impala to pull the data from Hive tables.
Worked on Apache Flume for collecting and aggregating huge amount of log data and stored it on HDFS for doing further analysis.
Created and developed end-to-end data ingestion onto Hadoop.
Involved in architecture and design of a distributed time-series database platform using NoSQL technologies like Hadoop/HBase and ZooKeeper.
Integrated NoSQL databases like HBase with MapReduce to move bulk amounts of data into HBase.
Efficiently put and fetched data to and from HBase by writing MapReduce jobs.

Environment: Hadoop, Kafka, Spark, Sqoop, Hive, Pig, NoSQL, Impala, Oozie, HBase, ZooKeeper.

Confidential, Peapack, NJ

Java/J2EE Developer

Responsibilities:

Understood and analyzed business requirements. Participated in all phases of the SDLC.
Involved in designing Use Case diagrams, Class diagrams and Sequence diagrams as part of the design phase.
Configured spring framework using the Spring core module to inject dependencies and Spring ORM module to use Hibernate to persist data into Oracle database.
Developed RESTful Web Services using Jersey, JAX-RS to perform CRUD operations on the database server over HTTP and to consume web services for transferring data between different applications.
Used Spring Boot for developing microservices and used REST to retrieve data from the client side in a microservice architecture.
Used Singleton, Session Facade, and DAO patterns in implementing the application.
Used SAX parser for parsing the XML documents retrieved when consuming the Web services.
Extensively worked with XML Schemas (XSD) for defining XML elements and attributes.
Deployed web components, presentation components and business components in IBM WebSphere Application Server.
Used RabbitMQ as the message broker to convert the entire flow to an SOA-based architecture.
Involved in developing UI components using AngularJS and JSON to interact with RESTful web services.
Utilized JavaScript/jQuery libraries such as Bootstrap, along with AJAX, for form validations and other interactive features.
Created build environment for Java using Git and Maven.
Used Log4J to write log messages with various levels.
Developed the test cases with JUnit for Unit testing of the built components.
Worked on enhancements, change requests and defect fixing. Interacted with product owner and testers.
Contributed to standardizing project coding, code review guidelines and checklist.
Used Jenkins for Continuous Integration.
Used JIRA to keep track of the project, bugs and issues.
Followed Agile/ Scrum methodology to track project progress and participated in Scrum meetings.

Environment: Java, J2EE, Hibernate, Spring, Eclipse, IBM WebSphere, REST (JAX-RS), XML, JSON, CSS, JUnit, RabbitMQ, Maven, Oracle, AngularJS, JavaScript/jQuery, AJAX, JIRA, Jenkins.

Confidential

Java Developer

Responsibilities:

Involved in Architecture and System Design and development process.
Worked with off-site (USA based) resources for successful implementation of the Workflow module.
Created UI screens using Struts MVC for logging into the system and performing various operations on network elements.
Classified users into various organizations to differentiate the privileges between them in accessing the system.
Developed Use Cases, Business Logic and Unit Testing of Struts Based Application.
Developed JSP pages using Custom tags and Tiles framework and Struts framework.
Developed UI Screens for presentation logic using JSP, Struts Tiles, and HTML.
Used display tag to render large volumes of data.
Used Bean, HTML and Logic tags to avoid Java expressions and scriptlets in JSP.
Implemented Design patterns like Session Façade, Command, Singleton and DAO in business layer.
Created EJBs for Backend operations. Also used Hibernate for Database persistence.
Sent message objects using JMS to client queues and topics.
Created Unit test cases for unit testing.
Used Log4j for logging purposes and defined debug levels to control the log.
Built Application EAR using ANT.
Included Hibernate 3.0 annotations for Oracle DB.

Environment: Java 1.5, JavaScript, CSS, AJAX, J2EE, JSP, EJB, Struts 1.2, WebSphere 5.0, Apache Tomcat, Web Services, Hibernate, JMS, XML, XSL, HTML.
