Sr. Hadoop Developer Resume
Phoenix, AZ
SUMMARY
- 7+ years of IT experience with multinational clients, including 3 years of Hadoop architecture experience developing Big Data / Hadoop applications.
- Hands-on experience with the Hadoop stack (MapReduce, HDFS, Sqoop, Pig, Hive, HBase, Flume, Oozie and Zookeeper)
- Well versed in configuring and administering the Hadoop cluster using major Hadoop distributions such as Apache Hadoop and Cloudera
- Proven expertise in performing analytics on Big Data using MapReduce, Hive and Pig.
- Experienced in performing real-time analytics on NoSQL databases like HBase and Cassandra.
- Worked with the Oozie workflow engine to schedule time-based jobs that perform multiple actions.
- Hands-on experience importing and exporting data between relational databases and HDFS, Hive and HBase using Sqoop.
- Analyzed large data sets by writing Pig scripts and Hive queries
- Experienced in writing MapReduce programs & UDFs for both Hive & Pig in Java
- Used Flume to channel data from different sources to HDFS.
- Experience with configuration of Hadoop Ecosystem components: Hive, HBase, Pig, Sqoop, Mahout, Zookeeper and Flume.
- Supported MapReduce programs running on the cluster and wrote custom MapReduce scripts for data processing in Java
- Experience testing MapReduce programs using MRUnit, JUnit and EasyMock.
- Experienced in implementing web-based, enterprise-level applications using Java and J2EE frameworks such as Spring, Hibernate, EJB, JMS and JSF.
- Experienced in implementing and consuming SOAP web services using CXF with Spring, and consuming REST web services using HTTP clients.
- Experienced in writing functions, stored procedures, and triggers using PL/SQL.
- Experienced with build tools such as Ant and Maven, and with continuous integration tools like Jenkins.
- Experienced in all facets of the Software Development Life Cycle (Analysis, Design, Development, Testing and Maintenance) using Waterfall and Agile methodologies
- Motivated team player with excellent communication, interpersonal, analytical and problem-solving skills
- Highly adept at promptly and thoroughly mastering new technologies, with a keen awareness of new industry developments and the evolution of next-generation programming solutions.
TECHNICAL SKILLS
Methodologies: Agile Scrum, Waterfall, Design patterns
Big Data Technologies: Pig, Hive, HBase, Sqoop, Hadoop, MapReduce, HDFS, Hortonworks, Cloudera Manager
Languages: Pig Latin, NoSQL, Java, XML, XSL, SQL, PL/SQL, HTML, JavaScript, C, C++
J2EE Technologies: JSP, Servlet, Spring, Hibernate, Web services
Web Technologies: HTML, CSS, JavaScript
Databases: HDFS, DB2, Oracle and SQL Server
Application Servers: WebSphere, WebSphere Portal Server, WebLogic, Tomcat, Apache, Web Aware
Business Intelligence Tools: Tableau, Qlik, Alteryx, Splunk.
Operating Systems: Windows 7, 8, UNIX and DOS
PROFESSIONAL EXPERIENCE
Confidential, Phoenix, AZ
Sr. Hadoop Developer
Responsibilities:
- Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
- Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
- Integrated the scheduler with Oozie workflows to pull data from multiple data sources in parallel using fork actions.
- Created a data pipeline of MapReduce programs using chained mappers.
- Implemented optimized joins across different data sets in MapReduce to get the top claims by state.
- Implemented complex MapReduce programs in Java to perform map-side joins using the distributed cache (see the illustrative sketch after this section).
- Created the high-level design for the data ingestion and data extraction module, and enhanced the Hadoop MapReduce job that joins the incoming slices of data and picks only the fields needed for further processing, all in a distributed environment.
- Developed several advanced MapReduce programs to process data files received.
- Used Sqoop to import data into HDFS from a MySQL database and vice versa.
- Responsible for importing log files from various sources into HDFS using Flume.
- Created a customized BI tool for the management team that performs query analytics using HiveQL.
- Used Sqoop to load data from MySQL into HDFS on a regular basis.
- Created partitions and buckets based on state for further processing using bucket-based Hive joins.
- Created Hive generic UDFs to process business logic that varies by policy.
- Moved relational database data into Hive dynamic-partition tables using Sqoop and staging tables.
- Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
- Worked on custom Pig loaders and storage classes to handle a variety of data formats such as JSON and XML.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig and Sqoop.
- Developed unit test cases using the JUnit and MRUnit testing frameworks.
- Monitored the cluster using Cloudera Manager.
- Helped the business intelligence team design dashboards and workbooks.
Environment: Hadoop, HDFS, HBase, MapReduce, Java, Hive, Pig, Sqoop, Flume, Oozie, Hue, SQL, ETL, Cloudera Manager, MySQL.
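Illustrative sketch (not the original production code) of the map-side join pattern referenced above, assuming the small state-lookup file is shipped with the job via the distributed cache (e.g. job.addCacheFile(new URI("/lookups/states.txt#states.txt")) on CDH4 / Hadoop 2); the class name, file name and field positions are hypothetical:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ClaimsMapJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

        private final Map<String, String> stateLookup = new HashMap<String, String>();

        @Override
        protected void setup(Context context) throws IOException {
            // The small lookup file is available locally through the distributed
            // cache symlink; load it into memory once per mapper.
            BufferedReader reader = new BufferedReader(new FileReader("states.txt"));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split(",");
                    if (parts.length == 2) {
                        stateLookup.put(parts[0], parts[1]);   // state code -> state name
                    }
                }
            } finally {
                reader.close();
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Hypothetical claim record layout: claimId,amount,stateCode,...
            String[] fields = value.toString().split(",");
            if (fields.length > 2) {
                String stateName = stateLookup.get(fields[2]);
                if (stateName != null) {
                    // Key by state so downstream steps can rank the top claims per state.
                    context.write(new Text(stateName), new Text(fields[0] + "," + fields[1]));
                }
            }
        }
    }

Because the lookup table fits in memory, the join happens entirely in the mappers, so the large claims data set never has to be shuffled to reducers.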
Confidential, Danbury, CT
Hadoop Developer
Responsibilities:
- Plan the data ingestion and integration process from the EDW environment into a data lake in HDFS
- Install, configure, and administer Pivotal Hadoop distribution 1.0.1 for both Dev and Prod clusters
- Upgrade the Pivotal Hadoop distribution from 1.0.1 to 1.1.1 and then to the 2.0.1 stable version
- Optimize Hadoop performance by reconfiguring the YARN ResourceManager with increased heap memory
- Configure Namenode High Availability for Namenode failovers
- Configure Pivotal HAWQ database for Namenode failovers
- Support and troubleshoot jobs and MapReduce programs running in the production cluster
- Resolve issues, answer questions, and provide day-to-day support for users and clients on Hadoop and its ecosystem, including the HAWQ database
- Upload data from the Linux/UNIX file system into HDFS for data manipulation in HAWQ and Hadoop
- Install, configure, and operate the Apache stack: Hive, HBase, Pig, Sqoop, Zookeeper and Mahout
- Create tables, load data, and write queries/UDFs for jobs running in Hive
- Create Pig UDFs for jobs running in the production environment (see the illustrative sketch after this section)
- Develop scripts to automate routine DBA tasks (e.g. refreshes, backups, vacuuming)
- Develop cron jobs in Linux (RHEL) to monitor the system and system resources
- Maintain Hadoop, Hadoop ecosystems, third-party software, and databases with updates/upgrades, performance tuning and monitoring.
- Install, configure, and operate data integration and analytic tools such as Informatica, Chorus, SQLFire and GemFireXD for business needs
- Manage and implement LFS and HDFS security using tools such as the Kerberos authentication protocol
- Configure IDM access and disaster recovery for HDFS using tools such as DistCp and WANdisco
Environment: Pivotal Hadoop (HDFS), MapReduce, AWS, Hive, HBase, Pig, Sqoop, Zookeeper, Mahout, Cloudera Manager
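Illustrative sketch (not the original production code) of a Java Pig UDF of the kind described above; the class name and the cleansing rule it applies are hypothetical:

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical UDF that normalizes a free-text column before it is stored.
    public class NormalizeText extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            // Trim, collapse repeated whitespace and upper-case the value.
            return input.get(0).toString().trim().replaceAll("\\s+", " ").toUpperCase();
        }
    }

In a Pig script such a UDF would be registered with REGISTER and applied inside a FOREACH ... GENERATE statement.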
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster and different Big Data analytic tools including Pig, Hive, the HBase database and Sqoop.
- Installed Hadoop, MapReduce and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Coordinated with business customers to gather business requirements, interacted with technical peers to derive technical requirements, and delivered the BRD and TDD documents.
- Extensively involved in Design phase and delivered Design documents.
- Involved in testing and coordinated with the business on user testing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Wrote Hive jobs to parse the logs and structure them in a tabular format to facilitate effective querying of the log data (see the illustrative sketch after this section).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Experienced in defining job flows.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Experienced in managing and reviewing the Hadoop log files.
- Used Pig as an ETL tool to perform transformations, event joins and some pre-aggregations before storing the data onto HDFS.
- Loaded and transformed large sets of structured and semi-structured data.
- Responsible for managing data coming from different sources.
- Involved in creating Hive Tables, loading data and writing Hive queries.
- Utilized the Apache Hadoop environment provided by Cloudera.
- Created Data model for Hive tables.
- Involved in Unit testing and delivered Unit test plans and results documents.
- Exported data from the HDFS environment into an RDBMS using Sqoop for report generation and visualization purposes.
- Worked on the Oozie workflow engine for job scheduling.
Environment: Hadoop, Hive, MapReduce, Pig, Sqoop.
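Illustrative sketch of the kind of log-structuring and cleaning step described above, expressed here as a plain Java map-only MapReduce job rather than the Hive jobs actually used; the delimiter, field count and class name are hypothetical:

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class LogCleanMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

        // Hypothetical layout: a valid log line has exactly seven pipe-delimited fields.
        private static final int EXPECTED_FIELDS = 7;

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != EXPECTED_FIELDS) {
                // Count and drop malformed records instead of failing the job.
                context.getCounter("cleaning", "malformed").increment(1);
                return;
            }
            // Re-emit the record tab-delimited so a Hive external table can read it.
            StringBuilder out = new StringBuilder(fields[0]);
            for (int i = 1; i < fields.length; i++) {
                out.append('\t').append(fields[i]);
            }
            context.write(NullWritable.get(), new Text(out.toString()));
        }
    }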
Confidential
Java Developer
Responsibilities:
- Designed the user-facing screens using JSP, Servlets, jQuery, AJAX, JavaScript and CSS.
- Developed the application using Spring MVC, JSP, JSTL (tag libraries) and AJAX on the presentation layer; the business layer is built using Spring and the persistence layer uses Hibernate (see the illustrative sketch after this section).
- Utilized Agile Scrum to manage full life-cycle development of the project.
- Developed web services for consuming stock details and transaction rates using Spring-WS and its WebServiceTemplate.
- Performed data operations using Spring ORM wired with the Criteria API for querying the database.
- Implemented the Criteria API and native queries in the business manager layer.
- Involved in writing Stored Procedures, Triggers and Cursors.
- Worked on the styles (CSS) and images for the web application.
- Worked with Agile Methodology.
- Created build scripts using Maven.
Environment: Java, Spring, JSP, AJAX, XML, JavaScript, Maven, Eclipse, HTML
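Illustrative sketch (not the original application code) of a Spring MVC controller on the presentation layer described above; the URL, request parameter and view name are hypothetical, and the real controller would delegate to the Spring/Hibernate business layer rather than passing the value straight through:

    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RequestParam;

    @Controller
    public class StockController {

        // Hypothetical endpoint: renders a stock-details JSP for the requested symbol.
        @RequestMapping(value = "/stocks", method = RequestMethod.GET)
        public String showStock(@RequestParam("symbol") String symbol, Model model) {
            model.addAttribute("symbol", symbol);
            return "stockDetails";   // resolved by the ViewResolver to a JSP view
        }
    }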
Confidential
Member Technical Staff
Responsibilities:
- Design the dashboard using Java Swing (see the illustrative sketch after this section)
- Write unit test cases
- Prepare documentation for the full development life cycle
- Write manual test cases in a Word document for GUI testing.
Environment: Java Swing, Core Java
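Illustrative sketch (not the original code) of a minimal Java Swing dashboard frame; the window title and panel labels are hypothetical:

    import java.awt.GridLayout;

    import javax.swing.JFrame;
    import javax.swing.JLabel;
    import javax.swing.JPanel;
    import javax.swing.SwingUtilities;

    public class DashboardFrame extends JFrame {

        public DashboardFrame() {
            super("Monitoring Dashboard");                 // hypothetical title
            setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            JPanel grid = new JPanel(new GridLayout(2, 2, 10, 10));
            grid.add(new JLabel("Status"));
            grid.add(new JLabel("Alerts"));
            grid.add(new JLabel("Throughput"));
            grid.add(new JLabel("Errors"));
            add(grid);
            pack();
            setLocationRelativeTo(null);
        }

        public static void main(String[] args) {
            // Swing components must be created on the Event Dispatch Thread.
            SwingUtilities.invokeLater(new Runnable() {
                public void run() {
                    new DashboardFrame().setVisible(true);
                }
            });
        }
    }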