Sr. Hadoop Developer Resume
Charlotte, NC
PROFESSIONAL SUMMARY:
- 8+ years of experience in the IT industry spanning Java, SQL, and Big Data environments, including the Hadoop ecosystem, with design, development, and maintenance of various applications.
- Expertise in HDFS, YARN, MapReduce, Spark, Hive, Impala, Pig, Sqoop, HBase, Oozie, Flume, Kafka, Storm, and various other ecosystem components.
- Hands-on experience fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka (see the sketch after this summary).
- Extensive experience working with Struts and Spring MVC (Model-View-Controller) architecture for developing applications using Java/J2EE technologies such as Servlets, JSP, JDBC, and JSTL.
- Proficiency in frameworks like Struts, Spring, Hibernate.
- Expertise in Spark framework for batch and real-time data processing.
- Good knowledge of Spark and Scala programming.
- Experience in handling messaging services using Apache Kafka.
- Used Spark-SQL to perform transformations and actions on data residing in Hive.
- Data ingestion into Hadoop (HDFS): ingested data from various sources such as Oracle and MySQL using Sqoop, created Sqoop jobs with incremental load to populate Hive external tables, and imported real-time data into Hadoop using Kafka and Flume.
- Excellent knowledge of Hadoop Architecture and its related components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce programming paradigm.
- Good understanding of Linux implementation, customization and file recovery.
- Expertise in writing MapReduce jobs in Java for processing large structured, semi-structured, and unstructured data sets and storing them in HDFS.
- Extensive experience in writing Pig and Hive scripts for processing and analyzing large volumes of structured data.
- Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Loaded local data into HDFS using Apache NiFi.
- Authentication and authorization management for Hadoop cluster users using Kerberos and Sentry.
- Experience in supporting data analysis projects using Elastic Map Reduce on the Amazon Web Services (AWS) cloud. Exporting and importing data into S3.
- Experience working on NoSQL databases like HBase and knowledge in Cassandra, MongoDB.
- Managing and scheduling jobs to remove duplicate log data files in HDFS using Oozie.
- Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
- Installed and configured Talend ETL in single- and multi-server environments.
- Created standards and best practices for Talend ETL components and jobs.
- Good knowledge of graph databases JanusGraph and Neo4j.
- Front-end development with HTML5/4, CSS3, JavaScript, jQuery, AngularJS, AJAX, and Bootstrap.
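Below is a minimal, illustrative Java sketch of the Kafka-to-HBase streaming flow mentioned above. The broker address, topic, table, and column family are hypothetical placeholders, and it assumes the Spark Streaming Kafka 0-10 integration and the HBase client APIs on the classpath.

```java
import java.util.*;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHBaseStream {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("db2-change-feed-to-hbase");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");        // placeholder broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "db2-to-hbase");

        // "db2.changes" is a hypothetical topic carrying DB2 change records as key/value strings.
        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("db2.changes"), kafkaParams));

        // Write each micro-batch to HBase; one connection per partition to avoid per-record overhead.
        stream.foreachRDD(rdd -> rdd.foreachPartition(records -> {
            try (Connection hbase = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = hbase.getTable(TableName.valueOf("customer_events"))) {   // placeholder table
                while (records.hasNext()) {
                    ConsumerRecord<String, String> record = records.next();
                    if (record.key() == null) {
                        continue;                                   // skip records without a usable row key
                    }
                    Put put = new Put(Bytes.toBytes(record.key()));
                    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("payload"), Bytes.toBytes(record.value()));
                    table.put(put);
                }
            }
        }));

        jssc.start();
        jssc.awaitTermination();
    }
}
```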
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop, MapReduce, Pig, Hive, Impala, Sqoop, Flume, Kafka, NiFi, HBase, Oozie, ZooKeeper, Kerberos, Sentry
Programming Languages: C, C++, Java, SQL, Scala
Databases: Oracle 10g, MySQL, SQL Server, Cassandra, JanusGraph, Neo4j (graph databases).
Cloud Platform: Amazon Web Services (EC2, S3, EMR)
IDEs: Eclipse, IntelliJ IDEA.
Collaboration: Git, Jira, Jenkins
Web Development: HTML5/4, CSS3, JavaScript, jQuery, AngularJS, AJAX, Bootstrap.
Java/J2EE Technologies: Servlets, JSP (EL, JSTL, Custom Tags), JSF, Apache Struts, JUnit, Hibernate 3.x, Log4J, Java Beans, EJB 2.0/3.0, JDBC, RMI, JMS, JNDI.
Spark Technologies: Spark Core, Spark SQL, Spark Streaming, Kafka, Storm.
PROFESSIONAL EXPERIENCE
Confidential, Charlotte, NC
Sr. Hadoop Developer
Responsibilities:
- Working on stories related to the ingestion, transformation, and publication of data on time.
- Using Spark for real-time data ingestion from web servers (unstructured and structured).
- Implementing data import and export jobs into HDFS and Hive using Sqoop.
- Converting unstructured data into a structured format using Pig.
- Using Hive as a data warehouse on Hadoop and running HQL queries on the structured data.
- Using Hive to analyze partitioned and bucketed data with Hive SerDes such as CSV, REGEX, JSON, and Avro.
- Using Apache NiFi to validate that the data landing on the Hadoop cluster is good data without any nulls in it.
- Designing and deploying the Hadoop cluster and various Big Data analytic tools, including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Apache Spark, and Impala.
- Creating and transforming RDDs and DataFrames using Spark.
- Working on converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Working with the Big Data Hadoop application using Talend in the cloud through Amazon Web Services (AWS) EC2 and S3, and increasing cluster size as needed in AWS using EMR for data in the cloud.
- Developing Spark scripts by using Scala shell commands as per the requirement.
- Using the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive (see the sketch at the end of this section).
- Using Spark to improve the performance and optimization of existing algorithms in Hadoop with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Using JanusGraph, a graph database, to store parent-child relations (ER graph models) between nodes, with the data stored in Cassandra to understand node and network relations.
- Working with Kafka to get real-time weblog data onto the big data cluster.
Environment: HDFS, Sqoop, Hive, SerDes, HBase, Sentry, Spark, Spark SQL, Kafka, Flume, Oozie, JSON, Avro, Talend, EC2, S3, EMR, ZooKeeper, Cloudera.
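Below is a minimal, illustrative Java sketch of the Spark-on-Hive analytics described in this section. The database, table, and column names are hypothetical, and it assumes a SparkSession with Hive support available on the cluster.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

public class HiveAnalytics {
    public static void main(String[] args) {
        // enableHiveSupport() lets Spark SQL read tables registered in the Hive metastore.
        SparkSession spark = SparkSession.builder()
                .appName("hive-analytics")
                .enableHiveSupport()
                .getOrCreate();

        // "web.transactions" is a hypothetical partitioned Hive table.
        Dataset<Row> txns = spark.sql(
                "SELECT customer_id, amount, txn_date FROM web.transactions WHERE txn_date >= '2017-01-01'");

        // The equivalent of a Hive GROUP BY, expressed as DataFrame transformations.
        Dataset<Row> spendByCustomer = txns
                .groupBy(col("customer_id"))
                .agg(sum(col("amount")).alias("total_spend"))
                .orderBy(col("total_spend").desc());

        // Persist the result back to Hive for downstream publication jobs.
        spendByCustomer.write().mode("overwrite").saveAsTable("web.customer_spend");

        spark.stop();
    }
}
```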
Confidential, Columbus, OH
Hadoop Developer
Responsibilities:
- Loaded customer, spending, and credit data from legacy warehouses to HDFS.
- Exported analyzed data to RDBMS using Sqoop for data visualization.
- Used Hive queries to analyze the large data sets.
- Built reusable Hive UDF libraries for business requirements (see the sketch at the end of this section).
- Implemented dynamic partitioning and bucketing in Hive.
- Implemented a script to transmit sys print information from Oracle to HBase using Sqoop.
- Deployed the Big Data Hadoop application using Talend on cloud AWS (Amazon Web Service).
- Implemented MapReduce jobs on XML, JSON, and CSV data formats.
- Developed MapReduce programs to extract and transform data sets, loading the resulting data set into HBase.
- Imported customer log data into HDFS using Flume.
- Implemented Spark job to improve query performance.
- Used Impala to handle different file formats.
- Proactively involved in ongoing maintenance, support, and improvements in Hadoop cluster.
- Used Tableau as a business intelligence tool to visualize the customer information as per the generated records.
Environment: Hadoop, Map Reduce, HDFS, Hive, Sqoop, ZooKeeper, Oozie, Spark, Spark-SQL, Scala, Kafka, Java, Oracle, AWS S3.
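Below is a minimal, illustrative Java sketch of a reusable Hive UDF of the kind referenced in this section; the masking rule, class name, and registration jar path are hypothetical.

```java
package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Hypothetical reusable UDF: masks all but the last four characters of an
 * account number, e.g. mask_account('1234567890') -> '******7890'.
 *
 * Registered in Hive with:
 *   ADD JAR hdfs:///libs/custom-udfs.jar;
 *   CREATE TEMPORARY FUNCTION mask_account AS 'com.example.hive.udf.MaskAccount';
 */
@Description(name = "mask_account", value = "_FUNC_(str) - masks all but the last 4 characters")
public class MaskAccount extends UDF {

    public Text evaluate(Text input) {
        if (input == null) {
            return null;                          // pass Hive NULLs through unchanged
        }
        String value = input.toString();
        if (value.length() <= 4) {
            return new Text(value);
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - 4; i++) {
            masked.append('*');
        }
        masked.append(value.substring(value.length() - 4));
        return new Text(masked.toString());
    }
}
```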
Confidential, Newark, CA
Big Data/Hadoop Developer
Responsibilities:
- Worked on a Hadoop cluster of 83 nodes with 896 terabytes of capacity.
- Worked on MapReduce jobs, Hive, and Pig (see the sketch at the end of this section).
- Involved in requirement analysis, design, and development.
- Imported and exported data into Hive and HBase using Sqoop from an existing SQL Server database.
- Processed unstructured data using Pig and Hive.
- Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Developed Hive queries, Pig scripts, and Spark SQL queries to analyze large datasets.
- Exported the result set from Hive to MySQL using Sqoop.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig Scripts.
- Worked on debugging, performance tuning of Hive & Pig Jobs.
- Gained experience in managing and reviewing Hadoop log files.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Used HBase as the NoSQL database.
- Actively involved in code review and bug fixing to improve performance.
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Flume, Linux, HBase, Java, Oozie.
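Below is a minimal, illustrative Java MapReduce sketch in the spirit of the log-processing jobs in this section; the input layout and field positions are hypothetical assumptions.

```java
package com.example.logs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Counts hits per URL from web server access logs (hypothetical space-delimited layout). */
public class PageHitCount {

    public static class HitMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text url = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(" ");
            if (fields.length > 6) {              // skip malformed lines
                url.set(fields[6]);               // assumes the request path is the 7th field
                context.write(url, ONE);
            }
        }
    }

    public static class HitReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable v : values) {
                total += v.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "page-hit-count");
        job.setJarByClass(PageHitCount.class);
        job.setMapperClass(HitMapper.class);
        job.setReducerClass(HitReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```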
Confidential
Software Developer
Responsibilities:
- Interacted with business managers to transform requirements into technical solutions.
- Followed Agile software development with Scrum methodology.
- Worked with Java, J2EE, Struts, Spring, Web Services, and Hibernate in a fast-paced development environment.
- Performed server-side coding and development using Spring, exception handling, Java Collections (Set, List, Map), Hibernate, web services, etc., in Windows and Linux environments.
- Involved in defect tracking as well as planning using JIRA.
- Resolved a complicated production issue for business managers where the number of records was displaying incorrectly.
- Created and modified Struts actions and worked with Struts validations.
- Worked on Spring framework features such as the IoC container and AOP, and integrated Spring with Hibernate using HibernateTemplate.
- Developed an enterprise inter-process communication framework using Spring RESTful web services; developed SOAP and REST web services (JAXB, JSON, JAX-RS, JAX-WS); and developed the Hibernate persistence layer.
- Implemented the Spring MVC framework in the presentation tier for all essential control flows (see the sketch at the end of this section). Used the Log4j utility to generate run-time logs.
- Prepared Unit and System Testing Specification documents and performed Unit and System testing of the application.
- Reviewed code to ensure adherence to Java coding standards.
- Developed the Functional Requirement Document based on user requirements.
Environment: Core Java, Servlets, Spring 3.0, Spring MVC, Hibernate, REST Web Services, SQL Developer, Apache Tomcat 7.0, MongoDB, Multi-Threading, WebSphere, Agile Methodology, Design Patterns, Apache Maven, JUnit.
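Below is a small, illustrative Java sketch of a Spring MVC REST endpoint in the Spring 3.x style listed in this section's environment; the URL and response fields are hypothetical.

```java
package com.example.web;

import java.util.LinkedHashMap;
import java.util.Map;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

/**
 * Hypothetical Spring 3.x MVC REST endpoint: @Controller plus @ResponseBody,
 * since @RestController only arrived in Spring 4. JSON rendering assumes a
 * Jackson message converter registered via <mvc:annotation-driven/>.
 */
@Controller
@RequestMapping("/accounts")
public class AccountController {

    /** Returns account details as JSON for GET /accounts/{id}. */
    @RequestMapping(value = "/{id}", method = RequestMethod.GET)
    @ResponseBody
    public Map<String, Object> getAccount(@PathVariable("id") long id) {
        Map<String, Object> account = new LinkedHashMap<String, Object>();
        account.put("id", id);                 // placeholder data; a real service/DAO lookup goes here
        account.put("status", "ACTIVE");
        return account;
    }
}
```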
Confidential
Java Developer
Responsibilities:
- Involved in the analysis, design, development, and testing phases of the Software Development Life Cycle (SDLC).
- Designed and developed framework components and was involved in designing the MVC pattern using the Struts and Spring frameworks.
- Responsible for developing Use case, Class diagrams and Sequence diagrams for the modules using UML and Rational Rose.
- Developed Action classes and ActionForm classes, created JSPs using Struts tag libraries, and configured them in the struts-config.xml and web.xml files (see the sketch at the end of this section).
- Involved in deploying and configuring applications in WebLogic Server.
- Used SOAP for exchanging XML based messages.
- Used Microsoft VISIO for developing Use Case Diagrams, Sequence Diagrams and Class Diagrams in the design phase.
- Developed Custom Tags to simplify the JSP code. Designed UI screens using JSP and HTML.
- Actively involved in designing and implementing Factory method, Singleton, MVC and Data Access Object design patterns.
- Used web services to send and receive data from different applications via SOAP messages, then used a DOM XML parser for data retrieval.
- Wrote JUnit test cases for the controller, service, and DAO layers using Mockito and DBUnit.
- Developed unit test cases using a proprietary framework similar to JUnit.
- Used the JUnit framework for unit testing of the application and ANT to build and deploy it on WebLogic Server.
Environment: Java, J2EE, JDK 1.7, JSP, Oracle, VSAM, Eclipse, HTML, JUnit, MVC, ANT, WebLogic.
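Below is a small, illustrative Java sketch of the Struts 1.x Action-class pattern described in this section; the action path, form fields, and forwards are hypothetical and would be declared in struts-config.xml as noted above.

```java
package com.example.struts;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;
import org.apache.struts.action.DynaActionForm;

/**
 * Hypothetical Struts 1.x Action mapped in struts-config.xml, e.g.:
 *   <action path="/login" type="com.example.struts.LoginAction"
 *           name="loginForm" scope="request" input="/login.jsp">
 *     <forward name="success" path="/home.jsp"/>
 *     <forward name="failure" path="/login.jsp"/>
 *   </action>
 */
public class LoginAction extends Action {

    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        DynaActionForm loginForm = (DynaActionForm) form;
        String user = (String) loginForm.get("username");
        String password = (String) loginForm.get("password");

        // Placeholder check; a real implementation would delegate to a service/DAO.
        if ("admin".equals(user) && "secret".equals(password)) {
            request.getSession().setAttribute("user", user);
            return mapping.findForward("success");
        }
        return mapping.findForward("failure");
    }
}
```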
