Hadoop Engineer Resume
MN
EXECUTIVE SUMMARY:
- Apache Spark & Hadoop Developer with around 9 years of IT industry experience in Big Data analytics and development, with a strong emphasis on Apache Hadoop, Apache Spark, and other ecosystem components.
- Hands-on experience with data ingestion, storage, querying, processing, and analysis of large volumes of data.
- Quick learner with a thirst for new knowledge and the ability to adapt to new environments at a fast pace.
- Strong knowledge of the complete Software Development Life Cycle (software analysis, design, architecture, development, and maintenance) using Agile methodology.
TECHNICAL SKILLS:
Big Data Analytics (Descriptive Analysis): Apache Hadoop, Apache Spark, Kafka, Hive, Sqoop, Pig, Flume, Oozie, Spark MLlib
Lightweight Data Analytics: Elasticsearch, Logstash, Kibana (ELK Stack)
Programming / Web tech: C, C++, Core Java, Scala, Python, J2EE, JSP, JSF, Maven
RDBMS: MySQL, MS SQL Server, Oracle 10g Express
NoSQL Databases: MongoDB (Basics - POC Level), Apache HBase
Tools / IDEs / Version Control: Eclipse, IntelliJ, SVN, Git
PROFESSIONAL EXPERIENCE:
Confidential, MN
Hadoop Engineer
Skill Set: Hadoop - HDFS, Spark - SQL & streaming, Kafka, Sqoop, Hive, Core Java, Scala, Unix Shell Scripting, Oozie workflows, Ambari - Hortonworks.
Responsibilities:
- Working on end-to-end data ingestion pipelines, using Sqoop to ingest data from RDBMS into HDFS and Kafka to ingest streaming data into HDFS for analysis.
- Developing a Spark Streaming application to receive data streams from Kafka, process the continuous streams, and trigger actions based on fixed events.
- Using Hive to analyze the structured data and compute various metrics for creating dashboards in Tableau for data visualization.
- Writing Unix scripts to run the HDFS commands and Sqoop jobs on demand.
- Involved in managing the Kafka cluster to ensure high-throughput message delivery to Spark.
Confidential, VA
Spark Developer
Skill Set: Hadoop - HDFS, HBase, Spark core - RDDs, Spark Streaming, Kafka, Sqoop, Hive, Core Java, Scala, Amazon Web Services - EC2, Lambda services, S3 storage.
Responsibilities:
- Worked on creating data ingestion pipelines to ingest large amounts of clickstream and custom application data from the cash back rewards application and application home web application into Hadoop in various file formats such as raw text, CSV, and ORC.
- Involved in the design and development of various custom data processing modules to define pre-qualified offers for visitors and existing customers.
- Worked extensively on integrating Kafka (data ingestion) with Spark Streaming to achieve a high-performance, real-time processing system.
- Worked on improving the in-memory computing performance of Spark applications by optimizing Spark core RDD transformations based on requirements.
Confidential - Chicago, IL
Hadoop Developer
Skill Set: Hadoop - HDFS, Sqoop, Hive, Core Java, Python, Unix Shell Scripting, Oozie workflows
Responsibilities:
- Involved in moving data from Oracle to HDFS and vice versa using Sqoop.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Involved in writing Flume configuration files to stream the log data from web applications.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
Confidential
Java Developer
Skill Set: Java, Struts, Spring 2.0, Hibernate 3.2, WebLogic 7.0, Eclipse, Oracle, JUnit 4.2, Maven, Windows XP, HTML, CSS, JavaScript, and XML.
Responsibilities:
- Involved in developing various data flow diagrams, use case diagrams and sequence diagrams.
- Worked on various phases of the project, as well as on improving the reporting module.
- Worked extensively in JSP, HTML, JavaScript, and CSS to create the UI pages for the project.
- Created JUnit test cases for unit testing and developed generic JS functions for validations.
- Gathered requirements for migrating from ICD-9 to ICD-10 codes.
Confidential
Java Developer
Skill Set: Core Java, J2EE, Web Services, XML, XSD, WSDL, SOAP, SOAP UI, ANT, SQL, JSP, JSTL, JUnit, JBoss, Spring MVC, Hibernate, Oracle SQL Developer, PL/SQL, ClearQuest.
Responsibilities:
- Involved in the entire SDLC for the client’s business process, working from their current system and continuous client feedback.
- Developed programs that access the database through the JDBC thin driver to execute queries, prepared statements, and stored procedures, and to manipulate data in the database.
- Developed the CSS sheets for the front ends of the Gate Way interface.
- Developed code for displaying the Log reports that are generated when clients access the Gate Way interface.
- Mapped fields between the client’s XML and the Remedy incident management system.
- Involved in the testing process with the clients for all the phases of the project.
- Involved in maintenance work and fixed some of the bugs during the testing process.
- Prepared documentation for change requests and system requirement specifications of the project.
- Designed and implemented business components for applications using J2EE technologies such as JDBC and JMS.
- Developed custom JSP tags for role-based sorting and filtering, which were used in the front-end of the presentation layer.
- Implemented JPA to make the application ORM-independent.
