- Over 6 years of IT experience, including solid experience in the Hadoop ecosystem.
- Experienced Hadoop developer with a strong background in distributed file systems in the big data arena; understands the complex processing needs of big data and has experience developing code and modules to address those needs.
- Experience working with the Cloudera CDH3 and CDH4 distributions and Hortonworks.
- Strong expertise in big data modeling techniques with Hive and HBase.
- Experience developing Pig Latin scripts and using Hive Query Language (HiveQL).
- Expertise with Hive SerDes for unstructured data analysis.
- Experience developing custom UDFs in Java to extend Hive and Pig Latin functionality.
- Developed Java MapReduce programs that transform raw log data into structured form to derive user location, age group, and time spent.
- Capable of processing large sets of structured, semi-structured, and unstructured data, and of supporting systems application architecture.
- Experience working with the HBase NoSQL database and data warehouses.
- Experience using Sqoop to import data from RDBMSs into HDFS and vice versa.
- Proficiency working with various data sources: RDBMSs, web services, and others.
- Good knowledge of Java topics such as generics, collections, and multithreading.
- Advanced experience with HDFS, MapReduce, Hive, HBase, ZooKeeper, Impala, Pig, Flume, Oozie, and Storm.
- Understanding of and ability to use SQL, XML, JSON, and UNIX.
- Design, develop, and maintain solutions for terabyte-scale data analytics.
- Prepared test case scenarios and internal documentation for validation and reporting.
- Experience designing, implementing, and supporting Java/J2EE application modules.
- Experience with a variety of Java frameworks (Spring, Struts, etc.).
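The log-processing work described above can be sketched with a minimal parser. This is an illustrative example only: the pipe-delimited log format and field layout are hypothetical, and in a real MapReduce job this logic would run inside the Mapper.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of log parsing as done inside a Mapper; the
// log format and field meanings here are hypothetical.
public class LogParser {
    // e.g. "192.168.1.5|2014-03-02 10:15:22|Dallas|25-34|137"
    private static final Pattern LOG = Pattern.compile(
        "([^|]+)\\|([^|]+)\\|([^|]+)\\|([^|]+)\\|(\\d+)");

    public static String[] parse(String line) {
        Matcher m = LOG.matcher(line);
        if (!m.matches()) {
            return null; // malformed record; a real job would count and skip it
        }
        // ip, timestamp, location, ageGroup, secondsSpent
        return new String[] {
            m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)
        };
    }
}
```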
Data Management Databases: Oracle 10G/11G
Hadoop/Big Data: Hadoop 0.20.2-cdh3u3, HDFS 0.20.2, MapReduce, HBase 0.90.4, Pig 0.8.1, Hive 0.7.1, Impala 1.2, Sqoop 1.3.0, Flume 0.9.4, Spark, Scala, Kafka, Oozie 2.3.2, Hue, ZooKeeper, YARN, cluster builds, MySQL, Informatica, Datameer 3.7.x, R-Analytics, Cascalog, Cloudera Manager 4.7.x/4.8.2, CDH 4.6, Hortonworks.
Methodologies & Standards: Software Development Lifecycle (SDLC)
Programming Languages: Java, C, PL/SQL, Shell Scripting, Pig Latin, HiveQL
Operating Systems: Windows XP, Windows 2000 Server, UNIX, Linux 5.6
Confidential, Fort Worth, Texas
- Prepared a vendor questionnaire to capture vendor product features and advantages with respect to the Hadoop cluster.
- Involved in the design and implementation of a proof of concept for the system to be developed on Hadoop with HBase, Hive, Pig, and Flume.
- Used HBase for real-time searching of log data, and Pig, Hive, and MapReduce for analysis.
- Managed and reviewed Hadoop log files.
- Used Flume to publish logs to Hadoop in real time.
- Worked with business teams and created Hive queries for ad hoc access.
- Involved in loading data from the UNIX file system into HDFS; automated the steps to load log files into Hive.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Used Hue to save Hive queries for each required report and to download query results as CSV or Excel.
- Conducted interviews with subject matter experts and documented the features to be included in the system.
- Used Pig for data cleansing.
- Created partitioned tables in Hive.
- Developed a Hive SerDe for parsing sent-email logs.
- Wrote a Hive UDF to extract the date from a time given in seconds.
- Involved in installing Hive, HBase, Pig, Flume, and other Hadoop ecosystem software.
- Involved in creating a 12-data-node Hadoop cluster for the POC.
Environment: Java 6, Eclipse, Linux 5.x, CDH3, CDH4.x, Sqoop, Pig, Hive 0.7.1, Flume, UNIX shell scripting, Hue, WinSCP, MySQL 5.5, Scala.
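As an illustration of the date-extraction UDF mentioned above, a minimal sketch of its core logic in plain Java 6-era code. The Hive UDF wrapper (a class extending org.apache.hadoop.hive.ql.exec.UDF with an evaluate() method) is omitted, and the yyyy-MM-dd output format and UTC timezone are assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Core logic of a "date from epoch seconds" Hive UDF, as plain Java.
// Output format and timezone are assumptions for this sketch.
public class EpochToDate {
    public static String toDate(long epochSeconds) {
        // SimpleDateFormat is not thread-safe; a real UDF would reuse
        // one instance per task rather than per call.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochSeconds * 1000L));
    }
}
```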
Confidential, Atlanta, GA
- Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
- Responsible for installing, configuring, and managing a Hadoop cluster spanning multiple racks.
- Developed Pig Latin scripts to analyze large data sets where extensive hand-written code needed to be reduced.
- Used Sqoop to extract data from relational databases into Hadoop.
- Design, develop and maintain services and interfaces to allow for cross product communication, data analytics, reporting, and management.
- Involved in performance enhancements of the code and optimization by writing custom comparators and combiner logic.
- Worked closely with data warehouse architect and business intelligence analyst to develop solutions.
- Good understanding of job schedulers such as the Fair Scheduler, which assigns resources so that all jobs receive, on average, an equal share of resources over time, and familiarity with the Capacity Scheduler.
- Responsible for performing peer code reviews, troubleshooting issues and maintaining status report.
- Involved in creating Hive Tables, loading with data and writing Hive queries, which will invoke and run MapReduce jobs in the backend.
- Involved in identifying possible ways to improve the efficiency of the system; involved in requirement analysis, design, development, and unit testing using MRUnit and JUnit.
- Prepared daily and weekly project status reports and shared them with the client.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
Environment: Apache Hadoop, Java (JDK 1.6), Oracle, MySQL, Hive, Pig, Sqoop, Linux, CentOS, JUnit, MRUnit, HBase.
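The custom-comparator work mentioned above can be illustrated with a small sketch. The composite "userId&lt;TAB&gt;timestamp" key layout is hypothetical, and in the actual job this logic would live in a subclass of Hadoop's WritableComparator rather than java.util.Comparator.

```java
import java.util.Comparator;

// Sketch of a secondary-sort comparator over composite keys of the
// form "userId\ttimestamp" (key layout is a hypothetical example).
public class CompositeKeyComparator implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        String[] pa = a.split("\t", 2);
        String[] pb = b.split("\t", 2);
        int byUser = pa[0].compareTo(pb[0]);
        if (byUser != 0) {
            return byUser; // primary order: user id
        }
        // secondary order: timestamp compared numerically, not lexically
        return Long.compare(Long.parseLong(pa[1]), Long.parseLong(pb[1]));
    }
}
```

Comparing timestamps numerically matters here: a plain string comparison would order "9" after "10".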
Confidential, Los Angeles, CA
ETL Hadoop Developer
- Created Hive tables and loaded retail transactional data from Teradata using Sqoop.
- Loaded customer data from the existing DWH tables (SQL Server) into HDFS using Sqoop.
- Wrote Hive queries to provide a consolidated view of the mortgage and retail data.
- Loaded data back into Teradata for Basel reporting and for business users to analyze and visualize using Datameer.
- Orchestrated hundreds of Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
- Loaded load-ready files from mainframes into Hadoop, converting the files to ASCII format.
- Developed Pig scripts to replace the existing home-loans legacy process on Hadoop, with the resulting data fed back to the retail legacy mainframe systems.
- Developed MapReduce programs to write data with headers and footers, and shell scripts to convert the data to a fixed-length format suitable for mainframe CICS consumption.
- Used Maven for continuous build integration and deployment.
- Agile methodology with XP practices (TDD, continuous integration) was used for development.
- Participated in daily scrum meetings and iterative development.
- Exposure to burn-up and burn-down charts, dashboards, and velocity reporting of sprint and release progress.
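The fixed-length conversion described above can be sketched as follows; the field widths and record layout are hypothetical, chosen only to illustrate the space-padding and zero-padding a mainframe copybook typically expects.

```java
// Sketch of fixed-length record formatting for mainframe consumption.
// Field widths and layout here are hypothetical.
public class FixedWidthFormatter {
    // Right-pad with spaces to the given width, truncating if too long.
    static String pad(String value, int width) {
        if (value.length() >= width) {
            return value.substring(0, width);
        }
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < width) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // e.g. account number (10), name (20), balance in cents zero-padded (12)
    public static String toRecord(String account, String name, long cents) {
        return pad(account, 10) + pad(name, 20) + String.format("%012d", cents);
    }
}
```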
Environment: Java 1.3, EJB, JavaScript, HTML, XML, Rational Rose, Microsoft Visio, Swing, JSP, Servlets, JNDI, JDBC, SQL, Oracle 8i, Tomcat 3.1.
- Assisted in designing the application using the MVC design pattern.
- Developed front-end user interface modules using HTML, XML, Java AWT, and Swing.
- Carried out front-end validation of user requests using JavaScript.
- Designed and developed the interacting JSPs and Servlets for modules like User Authentication and Summary Display.
- Used JConsole for memory management.
- Developed Action, ActionForm, Front Controller, and Singleton classes, along with Transfer Objects (TO), Business Delegates (BD), Session Facades, Data Access Objects (DAO), and business validators.
- Analyzed, designed, implemented and integrated product in existing application.
- Used Quartz on JBoss to schedule jobs.
- Communicated with the other components using JMS within the system.
- Designed and developed Web services implementing an SOA architecture using SOAP and XML, and published (exposed) the Web services.
- Designed and developed Entity/Session EJB components for the primary modules.
- Used JavaMail to notify the user of the status and completion of the request.
- Developed Stored Procedures on Oracle 8i.
- Implemented queries using SQL, along with database triggers and functions.
- JDBC was used to interface the web-tier components on the J2EE server with the relational database.
Java Developer
- Gathered requirements for the project and involved in analysis phase.
- Implemented applications using Bootstrap framework.
- Worked on developing internal customer service representative (CSR) tools.
- Redesigned the service plan page to dynamically display service products based on user selection.
- Debugged the application using Firebug, traversing documents and manipulating nodes using the DOM and DOM functions.
- Developed front-end reporting screens using AngularJS, making wide use of AngularJS UI components such as route providers, pagination, ng-grid, ng directives, and session-timeout pop-ups.
- Worked on minor enhancements using core Java.
- Involved in writing SQL queries.
- Used stored procedures, triggers, cursors, packages, and anonymous PL/SQL blocks to store, retrieve, delete, and update database tables.
- Used technologies such as JDBC to access related data from the database.
Environment: Java, Oracle, PL/SQL, Informatica (ETL).