Sr. Hadoop Developer Resume
Atlanta, GA
SUMMARY:
- Around 9 years of professional IT experience in all phases of the Software Development Life Cycle, including hands-on experience in Java/J2EE technologies and Big Data analytics.
- Substantial experience writing MapReduce jobs in Java and working with Pig, Flume, Tez, Zookeeper, Hive and Storm.
- Hands-on experience in installing, configuring and using ecosystem components like Hadoop MapReduce, HDFS, HBase, Avro, Zookeeper, Oozie, Hive, HDP, Cassandra, Sqoop, Pig and Flume.
- Extensive knowledge of automation tools such as Puppet and Chef.
- Experience in web-based languages such as HTML, CSS, PHP and XML, and other web methodologies including Web Services and SOAP.
- Good experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
- Extensive knowledge of NoSQL databases such as HBase.
- Worked in multi-cluster environments and set up the Cloudera Hadoop ecosystem.
- Background with traditional databases such as Oracle, Teradata, Netezza and SQL Server, ETL tools/processes and data warehousing architectures.
- Proficient in working with MapReduce programs using Apache Hadoop for working with Big Data.
- Experienced in working with different Hadoop ecosystem components such as HDFS, MapReduce, HBase, Spark, YARN, Kafka, Zookeeper, Pig, Hive, Sqoop, Storm, Oozie, Impala and Flume.
- Experience in transferring Streaming data from different data sources into HDFS and HBase using Apache Flume.
- Experienced in using Zookeeper and Oozie operational services for coordinating the cluster and scheduling workflows.
- Experience working with Cloudera and Hortonworks distributions of Hadoop.
- Extensive experience in Java and J2EE technologies like Servlets, JSP, Enterprise Java Beans (EJB), JDBC.
- Experienced in importing data from various data sources, performing transformations using Hive and MapReduce, loading data into HDFS, and extracting data from relational databases like Oracle, MySQL and Teradata into HDFS and Hive using Sqoop.
- Expertise in writing Hive queries and Pig and MapReduce scripts, and in loading large volumes of data from the local file system and HDFS into Hive.
- Hands-on experience in fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka.
- Good experience in working with cloud environments like Amazon Web Services EC2 and S3.
- Hands on experience on working with Amazon EMR framework transferring data to EC2 server.
- Good knowledge in Software Development Life Cycle (SDLC) and Software Testing Life Cycle (STLC).
- Expertise in writing MapReduce jobs in Java for processing large sets of structured, semi-structured and unstructured data and storing them in HDFS.
- Experienced in Application Development using Java, Hadoop, RDBMS and Linux shell scripting and performance tuning.
- Experienced in loading data to hive partitions and creating buckets in Hive.
- Experienced in relational databases like MySQL, Oracle and NoSQL databases like HBase and Cassandra.
- Hands-on experience working on NoSQL databases including HBase and Cassandra and their integration with Hadoop clusters.
- Hands-on experience in developing Hadoop clusters on public and private cloud environments like Amazon AWS and OpenStack.
TECHNICAL SKILLS:
Java/J2EE Technologies: JSP, Servlets, jQuery, JDBC, JavaScript
Hadoop/Big Data: HDFS, Hive, Pig, HBase, Map Reduce, Zookeeper, Spark, Scala, Akka, Kafka, Sqoop, Oozie, Flume, Storm
Programming Languages: Java, J2EE, HQL, R, Python, XPath, PL/SQL, Pig Latin.
Operating Systems: UNIX, Linux, Windows
Web Technologies: HTML, XML, DHTML, XHTML, CSS, XSLT.
Web/Application servers: Apache HTTP server, Apache Tomcat, JBoss.
Frameworks: MVC, Struts, Spring, Hibernate
Databases: Microsoft Access, MongoDB, Cassandra, MS SQL, Oracle.
PROFESSIONAL EXPERIENCE:
Confidential - Atlanta, GA
Sr. Hadoop Developer
Responsibilities:
- Hands-on experience in developing and deploying enterprise applications using major components of the Hadoop ecosystem like Hadoop 2, YARN, Hive, Pig, MapReduce, HBase, Flume, Sqoop, Spark, Storm, Kafka, Oozie and Zookeeper.
- Experience in installation, configuration, supporting and managing Hadoop Clusters using Hortonworks and Cloudera (CDH3, CDH4) distributions on Amazon web services (AWS).
- Excellent Programming skills at a higher level of abstraction using Scala and Spark.
- Good understanding in processing of real-time data using Spark.
- Hands on experience in Importing and exporting data from different databases like MySQL, Oracle, Teradata into HDFS using Sqoop.
- Strong experience working with real time streaming applications and batch style large scale distributed computing applications using tools like Spark Streaming, Kafka, Flume, MapReduce, and Hive.
- Involved in implementing and integrating NoSQL databases like HBase and Apache Cassandra.
- Managing and scheduling batch Jobs on a Hadoop Cluster using Oozie.
- Experience in managing and reviewing Hadoop Log files.
- Used Zookeeper to provide coordination services to the cluster.
- Used Microsoft Azure for building, testing and deploying the applications.
- Experienced using Sqoop to import data into HDFS from RDBMS and vice-versa.
- Experience and understanding in Spark and Storm.
- Hands-on experience in extracting data from log files and copying it into HDFS using Flume.
- Experience in analysing data using Hive, Pig Latin, and custom MR programs in Java.
- Hands on experience in Analysis, Design, Coding and testing phases of Software Development Life Cycle (SDLC).
- Experience with multiple databases and tools, including SQL analytical functions, Oracle PL/SQL, SQL Server and DB2.
- Experience in Creating ETL/Talend jobs both design and code to process data to target databases.
- Worked on different file formats like Avro, Parquet, RC file format, JSON format.
- Involved in writing Python scripts to build a disaster recovery process that moves currently processed data into the data center, given the current static location.
- Hands-on experience working on NoSQL databases like MongoDB, HBase and Cassandra and their integration with Hadoop clusters.
- Experience in ingesting data into Cassandra and consuming the ingested data from Cassandra to HDFS.
- Used Apache Nifi for loading PDF Documents from Microsoft SharePoint to HDFS.
- Used Avro serialization technique to serialize data for handling schema evolution.
- Experience in designing and coding web applications using Core Java and web technologies (JSP, Servlets and JDBC); full understanding of the J2EE technology stack, including Java-related frameworks like Spring and ORM frameworks (Hibernate).
- Experience in designing the User Interfaces using HTML, CSS, JavaScript and JSP.
- Developed web application in open source java framework Spring. Utilized Spring MVC framework.
- Experienced front-end development using EXT-JS, jQuery, JavaScript, HTML, Ajax and CSS.
- Good interpersonal and communication skills and strong problem-solving skills; explore and adapt to new technologies with ease; a good team member.
Environment: Hadoop, YARN, HBase, Azure, SDLC, MVC, NoSQL, Kafka, Python, Zookeeper, Oozie, jQuery, JavaScript, HTML, Ajax and CSS.
Confidential, Louisville, KY
Sr. Hadoop Developer
Responsibilities:
- Involved in complete SDLC - Requirement Analysis, Development, System Integration Testing and Performance Testing.
- Involved in architecture and design of distributed time-series database platform using NoSQL technologies like Hadoop/HBase, Zookeeper.
- Good understanding of Spark Algorithms such as Classification, Clustering, and Regression.
- Good understanding on Spark Streaming with Kafka for real-time processing.
- Extensive experience working with Spark tools like RDD transformations, Spark MLlib and Spark SQL.
- Experienced in moving data from different sources using Kafka producers, consumers and preprocess data using Storm topologies.
- Experience in working with Amazon Web Services EC2 instance and S3 buckets.
- Responsible for configuring the deployment environment to handle the application using Jetty server and WebLogic 10, with a Postgres database at the back end.
- Involved in the implementation of Spring MVC Pattern and developed persistence layer using Hibernate framework.
- Implemented ORM through Hibernate and involved in preparing the Database Model for the project.
- Followed Scrum methodology for the application development.
- Supported MapReduce programs running on the cluster and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Developed various helper classes needed following Core Java multi-threaded programming and Collection classes.
- Extracted data from Netezza databases to Hadoop framework.
- Extracted the data from various sources into HDFS using Sqoop and ran Pig scripts on the huge chunks of data.
- Further used Pig, along with the Elephant Bird API, to perform transformations, event joins and pre-aggregations before loading JSON-format files onto HDFS.
- Involved in resolving performance issues in Pig and Hive with understanding of Map Reduce physical plan execution and using debugging commands to run code in optimized way.
- Good understanding of Partitions, Bucketing concepts in Hive and designed both Managed and External tables in Hive to optimize performance.
Environment: HDFS, Spark, Pig, Sqoop, MapR, HBase, Zookeeper, Kafka, AWS, Netezza, Core Java.
Confidential - Columbus, OH
Sr. Hadoop/Spark Developer
Responsibilities:
- Developed Spark applications to perform all the data transformations on User behavioral data coming from multiple sources.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala.
- Responsible for managing data coming from different sources.
- Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
- Performed Filesystem management and monitoring on Hadoop log files.
- Implemented Spark using Scala and SparkSQL for faster testing and processing of data.
- Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Developed Spark code using Scala and Spark SQL for faster testing and data processing.
- Performed masking on customer sensitive data using Flume interceptors.
- Used Oozie and Oozie coordinators to deploy end-to-end data processing pipelines and schedule the workflows.
- Involved in migration of data from existing RDBMS (Oracle and SQL Server) to Hadoop using Sqoop for processing data.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Worked on large sets of structured, semi-structured and unstructured data.
Environment: Apache Hadoop, HDFS, MapReduce, Hive, HBase, Sqoop, Oozie, Maven, Shell Scripting, Spark, Scala, Cloudera Manager.
Confidential - Rensselaer, NY
Spark/Hadoop Consultant
Responsibilities:
- Designing the entire architecture of the data pipeline for analysis.
- Worked on Sqoop jobs to import data from Oracle and bring into HDFS.
- Wrote a Scala script to load processed data into DataStax Cassandra.
- Performance tuning of Spark and Sqoop jobs.
- Developing parser and loader map reduce application to retrieve data from HDFS and store to HBase and Hive.
- Wrote a MapReduce job to compare two TSV files and save the processed output into Oracle.
- Hands on design and development of an application using Hive (UDF).
- Responsible for writing Hive queries to analyze data in the Hive warehouse.
- Provided support to data analysts in running Pig and Hive queries.
- Transformed the Ab Initio process into Hadoop using Pig and Hive.
- Created partitioned tables in Hive
- Created Reports using Tableau on HiveServer2.
- Worked on Data Modelling for Dimension and Fact tables in Hive Warehouse.
- Scheduling the jobs through Walgreens EBS internal Scheduling System.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Environment: Hortonworks Data Platform, Hadoop, Spark, Scala, SBT, Sqoop, MapReduce, HDFS, Pig, Hive, Java, Oracle, DataStax Cassandra, CentOS, Windows, Python.
Confidential, Edwardsville, IL
Hadoop Developer
Responsibilities:
- Developed simple to complex MapReduce jobs using Java language for processing and validating the data.
- Developed a data pipeline using Sqoop, Spark, MapReduce and Hive to ingest, transform and analyze customer behavioral data.
- Exported analyzed data to relational databases using Sqoop for visualization to generate reports for the BI team.
- Implemented Spark using Python and Spark SQL for faster processing of data and algorithms for real-time analysis in Spark.
- Used Spark for interactive queries, processing of streaming data and integration with popular NoSQL database for huge volume of data.
- Used the Spark-Cassandra Connector to load data to and from Cassandra. Streamed data in real time using Spark with Kafka.
- Developed Kafka producers and consumers in Java, integrated them with Apache Storm, and ingested data into HDFS and HBase by implementing the rules in Storm.
- Built a prototype for real time analysis using Spark streaming and Kafka.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
- Involved in creating Hive tables and working on them using HiveQL and perform data analysis using Hive and Pig.
- Developed workflow in Oozie to manage and schedule jobs on Hadoop cluster to trigger daily, weekly and monthly batch cycles.
- Experience in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
- Expertise in extending Hive and Pig core functionalities by writing custom User Defined Functions (UDF).
- Used IMPALA to pull the data from Hive tables.
- Worked on Apache Flume for collecting and aggregating huge amount of log data and stored it on HDFS for doing further analysis.
- Created and developed end-to-end data ingestion into Hadoop.
- Involved in architecture and design of distributed time-series database platform using NoSQL technologies like Hadoop/HBase, Zookeeper.
- Integrated NoSQL database like HBase with Map Reduce to move bulk amount of data into HBase.
- Efficiently put and fetched data to/from HBase by writing MapReduce jobs.
Environment: Hadoop, Kafka, Spark, Sqoop, Hive, Pig, NoSQL, Impala, Oozie, HBase, Zookeeper.
Confidential, Peapack, NJ
Java/J2EE Developer
Responsibilities:
- Understanding and analyzing business requirements. Participated in all phases of SDLC
- Involved in designing Use Case diagrams, Class diagrams and Sequence diagrams as a part of design phase
- Configured spring framework using the Spring core module to inject dependencies and Spring ORM module to use Hibernate to persist data into Oracle database.
- Developed RESTful Web Services using Jersey, JAX-RS to perform CRUD operations on the database server over HTTP and to consume web services for transferring data between different applications.
- Used Spring Boot for developing microservices and used REST to retrieve data from the client side in a microservice architecture.
- Used Singleton, Session Facade, and DAO patterns in implementing the application.
- Used SAX parser for parsing the XML documents that are retrieved upon consuming the Web services
- Extensively worked with XML Schemas (XSD) for defining XML elements and attributes
- Deployed web components, presentation components and business components in IBM WebSphere Application Server.
- Used RabbitMQ as the message broker to convert the entire flow as a SOA based architecture.
- Involved in developing UI components using Angular JS and JSON to interact with RESTful web services.
- Utilized JavaScript/jQuery libraries like Bootstrap and AJAX for form validations and other interactive features.
- Created build environment for Java using Git and Maven.
- Used Log4J to write log messages with various levels.
- Developed the test cases with JUnit for Unit testing of the built components.
- Worked on enhancements, change requests and defect fixing. Interacted with product owner and testers.
- Contributed to standardizing project coding, code review guidelines and checklist.
- Used Jenkins for Continuous Integration.
- Used JIRA to keep track of the project, bugs and issues.
- Followed Agile/ Scrum methodology to track project progress and participated in Scrum meetings.
Environment: Java, J2EE, Hibernate, Spring, Eclipse, IBM WebSphere, REST (JAX-RS), XML, JSON, CSS, JUnit, RabbitMQ, Maven, Oracle, Angular JS, JavaScript/jQuery, AJAX, JIRA, Jenkins.
Confidential
Java Developer
Responsibilities:
- Involved in Architecture and System Design and development process.
- Worked with off-site (USA based) resources for successful implementation of the Workflow module.
- Created UI screens using Struts MVC for logging into the system and performing various operations on network elements.
- Classified users into various organizations to differentiate the privileges between them in accessing the system.
- Developed Use Cases, Business Logic and Unit Testing of Struts Based Application.
- Developed JSP pages using Custom tags and Tiles framework and Struts framework.
- Developed UI Screens for presentation logic using JSP, Struts Tiles, and HTML.
- Used display tag to render large volumes of data.
- Used Bean, HTML and Logic tags to avoid Java expressions and scriptlets in JSP.
- Implemented Design patterns like Session Façade, Command, Singleton and DAO in business layer.
- Created EJBs for Backend operations. Also used Hibernate for Database persistence.
- Sent message objects using JMS to client queues and topics.
- Created Unit test cases for unit testing.
- Used Log4j for logging purposes and defined debug levels to control the log.
- Built Application EAR using ANT.
- Included Hibernate 3.0 annotations for Oracle DB.
Environment: Java 1.5, JavaScript, CSS, AJAX, J2EE, JSP, EJB, Struts 1.2, WebSphere 5.0, Apache Tomcat, Web Services, Hibernate, JMS, XML, XSL, HTML.
