Hadoop Developer Resume
Palo Alto, CA
SUMMARY
- 7 years of IT experience, including 4 years with Big Data ecosystem technologies.
- Proficient in installing, configuring, supporting and managing Big Data platforms and the underlying Hadoop cluster infrastructure.
- Hands-on exposure to major components of the Hadoop ecosystem such as Hadoop MapReduce, HDFS, Hive, Pig, Pentaho, HBase, ZooKeeper, Sqoop, Oozie, Cassandra, Flume and Avro.
- Good experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs.
- Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
- Expert in importing and exporting data between HDFS and relational database systems/mainframe using Sqoop.
- Good experience in Apache Flume for efficiently collecting, aggregating, and moving large amounts of log data.
- Experience and in-depth understanding of analyzing data using HiveQL and Pig Latin.
- Experience in managing Hadoop clusters using Cloudera Manager Tool.
- Hands-on experience with sequence files, combiners, counters, dynamic partitions and bucketing.
- Hands-on experience in performing analytics using Apache Spark.
- Good knowledge of MapReduce design patterns.
- Responsible for setting up processes for Hadoop based application design and implementation.
- Developed GUI applications by using HTML, CSS, AJAX, JSP and JavaScript libraries to implement user interface screens.
- Experience in Core Java technologies (JDBC, multi-threading, networking, generics, OOP concepts, collections, exception handling).
- Strong experience in the design, development, testing and deployment of object-oriented and web-based enterprise applications using Java/J2EE technologies.
- Experienced in developing enterprise applications using MVC frameworks such as Struts and Spring MVC.
- Proficiency in multiple databases like MongoDB, Cassandra, MySQL, Oracle 9i/10g/11g and MS SQL Server.
- Experienced in developing applications using application/web servers such as WebSphere, WebLogic and Tomcat.
- Experienced in using Hibernate interfaces, annotations, JDBC and SQL to implement DAO layers.
- Extensive knowledge of Agile and Scrum methodologies to develop best practices for software development and implementation.
- Worked with and learned a great deal from AWS cloud services like EC2, S3, EBS, RDS and VPC.
- Worked with various tools and IDEs such as Eclipse, IBM Rational, Apache Ant, MS Office, PL/SQL Developer and SQL*Plus.
- Expertise in using the Java API and Sqoop to export data from RDBMS into a DataStax Cassandra cluster.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Pentaho, HBase, Storm, ZooKeeper, Sqoop, Oozie, Flume, Avro and YARN
Web Technologies: Core Java, J2EE, Servlet, JSP, JDBC, XML, AJAX, SOAP, WSDL, REST.
Methodologies: Agile, UML, Waterfall and Test Driven
Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x, Struts 1.x/2.x and JPA
Programming Languages: Java, Unix Shell scripting, HTML, C, C++, SQL, PL/SQL
Databases: Oracle 11g, DB2, MS SQL Server, MySQL, MS Access, MongoDB, Cassandra.
Application Servers: WebLogic, WebSphere, Apache Tomcat.
Web Design Tools: HTML, DHTML, AJAX, JavaScript, jQuery, CSS, AngularJS, Ext JS and JSON
Operating Systems: Windows, Linux, Unix
Development / Build Tools: TOAD, SQL Developer, SoapUI, Eclipse, Ant, Maven, IntelliJ, JUnit and Log4j.
ETL Tools: Talend, Informatica, Pentaho
PROFESSIONAL EXPERIENCE
Confidential, Palo Alto, CA
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop, MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Involved in creating Hive tables, loading the data and writing Hive queries that run internally as MapReduce jobs.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Analyzed the Hadoop cluster and Big Data analytic tools including Pig, Hive, HBase and Sqoop.
- Developed Spark scripts using Scala shell commands as per the requirements.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing.
- Handled data from RDBMS and streaming sources, using Apache Spark for data processing.
- Responsible for developing Kafka-based components as per the software requirement specifications.
- Involved in monitoring and filtering data for high-speed data handling using Kafka.
- Worked with Spark Streaming to consume ongoing data from Kafka and store the stream data in HDFS.
- Implemented UNIX shell scripts to perform cluster admin operations and used Zookeeper for Cluster coordination services.
- Wrote a Storm topology to accept events from a Kafka producer and emit them into Cassandra.
- Imported data from various sources into the Cassandra cluster using Java APIs.
- Configured, documented and demonstrated inter-node communication between Cassandra nodes and clients using SSL encryption.
- Created Cassandra tables to load large sets of semi-structured data coming from various sources.
- Analyzed the data to be exported to Cassandra using Sqoop and generated reports for the BI team.
- Ingested data from relational databases into HDFS using Sqoop.
- Configured the servers for auto scaling and elastic load balancing in AWS EC2 and used AWS services like EC2 and S3 for small data sets.
- Developed Pig Latin scripts to migrate the existing legacy process to Hadoop, with the output data fed to AWS S3.
- Strong experience in working with Elastic MapReduce and setting up environments on Amazon AWS EC2 instances.
- Created S3 buckets, managed their policies, and used S3 and Glacier for storage and backup on AWS.
- Developed a columnar data warehouse using Redshift for analyzing 4 TB of customer data.
- Used Pig as an ETL tool to do joins and some pre-aggregations before storing the data onto HDFS.
- Analyzed the data by performing Hive queries and running Pig scripts as per customer requirements.
- Used Spark SQL to load tables into HDFS and run select queries on top of them.
- Involved in loading data from the UNIX file system to HDFS.
- Experienced in handling Avro data files in MapReduce programs using the Avro data serialization system.
Environment: HDFS, Hadoop, AWS, Elastic MapReduce, Spark, Pig, HBase, SQL, UNIX, Hive, Sqoop, Kafka, Spark Streaming, Linux shell scripting, ETL, ZooKeeper, Eclipse and Cassandra.
Confidential, New Jersey, NY
Hadoop Developer
Responsibilities:
- Used the HBase Java API to populate an operational HBase table with key-value data.
- Installed and configured Hive, Pig and Sqoop on the HDP 2.0 cluster.
- Performed data analysis with HBase using Hive External tables.
- Used MapReduce to move bulk amounts of data into HBase for integration with the NoSQL database.
- Performed real-time analytics on HBase using the Java API and moved data to and from HBase by writing MapReduce jobs.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing.
- Developed MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports.
- Supported MapReduce programs running on the cluster.
- Implemented intermediate functionality such as event or record counts from the Flume sinks.
- Designed workflows by scheduling Hive processes for Log file data, which is streamed into HDFS using Flume.
- Imported data from MySQL into HDFS on a regular basis using Sqoop.
- Extracted the data from MySQL into HDFS using Sqoop.
- Loaded CDRs into the Hadoop cluster from relational databases using Sqoop and from other sources using Flume.
- Automated and scheduled the Sqoop jobs using UNIX shell scripts.
- Installed and configured Hive on the Hadoop cluster.
- Handled importing of data from various data sources, performed transformations using Hive, Map Reduce, and loaded data into HDFS.
- Analyzed the data by performing Hive queries (HiveQL) and running Pig Latin scripts to study customer behaviour.
- Wrote Java code to format XML documents and upload them to the Solr server for indexing.
- Used Hive join queries to join multiple tables of a source system and load them into Elasticsearch.
- Wrote various key queries in Elasticsearch for effective data retrieval.
- Worked extensively on Elasticsearch querying and indexing to retrieve documents at high speed.
- Wrote an Elasticsearch template for the index patterns.
- Loaded datasets from two different sources, Oracle and MySQL, into HDFS and Hive respectively on a daily basis.
- Optimized Hive queries to extract the customer information from HDFS or HBase.
- Developed Pig Latin scripts to aggregate the log files of the business clients.
- Designed and developed Pig Latin Scripts to process data in a batch to perform trend analysis.
- Used Hive, Pig and Java to develop complex MapReduce streaming jobs.
- Loaded and transformed large sets of structured and semi-structured data using Pig scripts.
- Wrote shell scripts for scheduling and automating tasks.
- Managed and reviewed Hadoop log files to identify issues when jobs fail.
- Used Teradata Data Mover to copy data and objects such as tables and statistics from one system to another.
- Worked with the Teradata MultiLoad and FastLoad utilities to load data from Oracle and SQL Server to Teradata.
Environment: HDP 2.0, Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Java, Flume, Oracle, MySQL, HBase, Elasticsearch, Linux Shell Scripting and Big Data.
Confidential, Houston, TX
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Continuously monitored and managed the Hadoop Cluster using Cloudera Manager.
- Used Cloudera Manager to pull metrics on various cluster features such as the JVM and running map and reduce tasks.
- Used the Apache Hadoop environment provided by Cloudera.
- Developed a workflow using Oozie to automate the tasks of loading data into HDFS and analyzing it.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box, and analyzed the weblog data using HiveQL.
- Used Oozie workflows for scheduling and managing jobs on a Hadoop cluster.
- Developed complex queries using Hive and Impala.
- Worked with Impala for the data retrieval process.
- Experienced in migrating HiveQL queries to Impala to minimize query response time.
- Performed transformations using MapReduce and Hive to load data into HDFS and worked on importing data from various sources.
- Developed several new MapReduce programs to analyze and transform the data to uncover insights into customer usage patterns.
- Developed simple and complex MapReduce programs in Java for data analysis.
- Wrote various Hive and Pig scripts and was responsible for building scalable distributed data solutions using Hadoop.
- Developed Hive UDFs to validate data against business rules before it is moved into Hive tables.
- Used Pig scripts to join different data sets to perform queries.
- Experienced in writing Pig UDFs to perform analytics with Pig Latin operations.
- Effectively used Sqoop to transfer data from databases (MySQL, Oracle) to HDFS and Hive.
- Analyzed relational databases using Sqoop to export data and generate reports.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Used MFT for the ingestion of files into HDFS from remote systems.
- Worked on Avro Data files using Avro Serialization system.
- Experienced with scripting languages such as Python and shell.
- Implemented frameworks using Java and Python to automate the ingestion flow.
- Designed and implemented database cloning using Python and built backend support for applications using shell scripts.
Environment: HDFS, MapReduce, Pig, Hive, Sqoop, Python, Maven, Avro, Impala, Oozie, Eclipse and Shell Scripting.
Confidential
Java Developer
Responsibilities:
- Implemented business rules by using Java & J2EE technology to modify the components.
- Followed Agile methodology, including daily Scrum meetings, iteration planning, etc.
- Designed different use cases by using UML, Class and Activity diagrams.
- Developed a GUI application using HTML, JavaScript and JSP to view pages in the desktop portal.
- Applied Core Java features in the application (multi-threading, collections, OOP, exception handling, JDBC).
- Developed Spring ORM/JDBC layers to integrate Hibernate for regular JDBC calls.
- Implemented the navigation layer using Spring MVC components such as DispatcherServlet, controllers and view resolvers.
- Implemented business components as Spring Beans and configured using Dependency Injection and annotations.
- Consumed REST web services from third parties and implemented them using JAX-RS.
- Used Hibernate in the persistence layer and developed POJOs and Data Access Objects (DAOs) to handle all database operations.
- Packaged and deployed builds through Maven scripts.
- Deployed the application on WebSphere Application Server and used Message Driven Beans with MQ Series.
- Implemented a logger for debugging using Log4j and used Git as the version management tool.
- Performed code reviews and used JIRA to track the information.
- Performed unit testing using the JUnit framework to verify the code.
- Created and modified Stored Procedures, Functions, Triggers and Complex SQL Commands using Oracle 11g as a database.
Environment: Java SE/J2EE, Core Java, JSP, JSTL, XML, HTML, Query, Spring MVC/DI/ORM/JDBC, Log4j, JUnit, JavaScript, JSF, Ajax, Hibernate, WebSphere, Eclipse, Unix, Ant, REST, Git, Maven, MQ, JIRA, Oracle 11g.
Confidential
Java Developer
Responsibilities:
- Coordinated in Object Oriented Analysis & Design of the project.
- Developed an HTML navigation menu that is derived from the database in the form of XML.
- Performed client-side validation for the application using JavaScript and jQuery.
- Extensively involved in developing the web interface using JSP and JSP Standard Tag Libraries (JSTL) with the Struts framework.
- Used the Struts framework, which is based on the MVC design pattern, in the application.
- Developed a client-side AJAX application that uses XSLT, XPath and object-oriented JavaScript and retrieves resources via the JNDI interface.
- Implemented database communication using JDBC and worked on stored procedures.
- Used REST web services (JAX-RS) with XML responses.
- Implemented Log4j for error logging, debugging and tracking.
- Performed unit testing using the JUnit framework to verify the code.
- Developed Ant scripts to build and deploy the application onto WebLogic Application Server, ran UNIX shell scripts and implemented an auto-deployment process.
- Developed and tested the application using Eclipse.
Environment: Java, XML, JSP, HTML, CSS, JavaScript, Struts MVC, MySQL, JSTL, Servlet, JAX-RS, REST, Eclipse, JDBC, Log4j, JUnit, WebLogic, Ant.
