Hadoop Developer Resume
Columbus, IN
SUMMARY
- More than seven years of professional IT experience, with hands-on development experience in Big Data as well as Java and J2EE technologies.
- Experienced Hadoop developer with good knowledge of the Hadoop framework, MapReduce, and Hadoop ecosystem components: Hive, Pig, Oozie, Sqoop, and HBase.
- Analyzed system and business requirement specifications and prepared low-level design documents.
- Experienced in writing custom MapReduce I/O formats and key/value types (a sketch follows this summary).
- Coded MapReduce jobs that meet business requirements.
- Expert in working with the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
- Involved in developing Pig Latin scripts and HiveQL queries for processing data.
- Hands-on experience importing and exporting data with Sqoop, Hadoop's data transfer tool.
- Responsible for preparing action plans for data analysis.
- Reviewed code for logic and adherence to coding standards.
- Created and executed unit test cases per requirements.
- Discussed pre- and post-implementation processes with the business.
- Trained the team on installing and configuring a Hadoop cluster in distributed mode and on the Cloudera Distribution of Hadoop (CDH); mentored team members and gave talks on the Hadoop ecosystem.
- Good knowledge of ZooKeeper and of HBase, a column-oriented database supporting random reads and writes; stored data using various HBase shell commands.
- Used Sqoop to import data into HDFS and Hive. Used Pig Latin extensively to load and transform (ETL) supplier data for internal customer requirements.
- An excellent team player and self-starter with effective communication skills.
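As an illustration of the custom key/value formats noted in this summary, below is a minimal sketch of a composite WritableComparable key; the class and field names (AccountEventKey, accountId, eventTime) are hypothetical, not taken from any specific project.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical composite key pairing an account id with an event timestamp,
// so records sort by account first and by time second during the shuffle.
public class AccountEventKey implements WritableComparable<AccountEventKey> {
    private final Text accountId = new Text();
    private long eventTime;

    public void set(String account, long time) {
        accountId.set(account);
        eventTime = time;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        accountId.write(out);          // serialize fields in a fixed order
        out.writeLong(eventTime);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        accountId.readFields(in);      // deserialize in the same order
        eventTime = in.readLong();
    }

    @Override
    public int compareTo(AccountEventKey other) {
        int cmp = accountId.compareTo(other.accountId);
        return cmp != 0 ? cmp : Long.compare(eventTime, other.eventTime);
    }

    @Override
    public int hashCode() {
        // Used by the default HashPartitioner to route keys to reducers.
        return accountId.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof AccountEventKey && compareTo((AccountEventKey) o) == 0;
    }
}
```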
TECHNICAL SKILLS
Hadoop/Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Zookeeper, HBase, Flume
Programming Languages: Java, C, C++, SQL
IDE Tools: Eclipse, Rational Team Concert, NetBeans
Application Servers: JBoss, Tomcat, GlassFish
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server
Operating Systems: Linux, UNIX, Windows
PROFESSIONAL EXPERIENCE
Confidential, Columbus IN
Hadoop Developer
Responsibilities:
- Implemented a Hadoop framework to capture user navigation across the application, to validate the user interface and provide analytic feedback/results to the UI team.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Analyzed unused user navigation data by loading it into HDFS and writing MapReduce jobs; the analysis provided inputs to the new APM front-end developers and the Lucent team.
- Wrote MapReduce jobs using the Java API and Pig Latin (see the sketch after this list).
- Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
- Used Flume to collect, aggregate and store the web log data onto HDFS.
- Wrote Pig scripts to run ETL jobs on the data in HDFS.
- Used Hive to do analysis on the data and identify different correlations.
- Wrote ad-hoc HiveQL queries to process data and generate reports.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
- Worked on HBase; configured a MySQL database to store Hive metadata.
- Used Sqoop to load data from MySQL into HDFS on a regular basis.
- Wrote Hive queries for data analysis to meet business requirements.
- Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
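A minimal sketch of the kind of Java-API MapReduce job described above, counting page views per user from web log data; the tab-separated log layout and all class names are assumptions for illustration.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

public class NavigationCount {
    // Assumes tab-separated web log lines: userId <TAB> pageUrl <TAB> timestamp.
    public static class NavMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text outKey = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length < 2) return;           // skip malformed records
            outKey.set(fields[0] + ":" + fields[1]); // user:page
            ctx.write(outKey, ONE);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "navigation-count");
        job.setJarByClass(NavigationCount.class);
        job.setMapperClass(NavMapper.class);
        job.setCombinerClass(IntSumReducer.class); // pre-sum counts map-side
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```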
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Sqoop, Cloudera, ETL, MySQL, Agile, Windows
Confidential, Princeton NJ
Hadoop Developer
Responsibilities:
- Developed simple and complex MapReduce programs in Java for data analysis on different data formats
- Developed MapReduce programs to filter bad and unnecessary claim records and identify unique records based on account type
- Processed semi-structured and unstructured data using MapReduce programs
- Implemented daily cron jobs, using Oozie coordinator jobs, that automate the parallel tasks of loading data into HDFS and pre-processing it with Pig
- Implemented custom data types, InputFormat, RecordReader, OutputFormat, and RecordWriter classes for MapReduce computations
- Worked on CDH4 cluster on CentOS.
- Successfully migrated a legacy application to a Big Data application using Hive, Pig, and HBase at the production level
- Transformed date-related data into an application-compatible format by developing Apache Pig UDFs (a sketch follows this list)
- Developed a MapReduce pipeline for feature extraction and tested the modules using MRUnit
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs
- Responsible for performing extensive data validation using Hive
- Implemented Partitioning, Dynamic Partitions and Bucketing in Hive for efficient data access
- Worked with different table types, such as external tables and managed tables
- Used Oozie workflow engine to run multiple Hive and Pig jobs
- Involved in installing and configuring Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Involved in designing and developing nontrivial ETL processes within Hadoop using tools like Pig, Sqoop, Flume, and Oozie
- Developed Hive queries for creating foundation tables from stage data
- Used Pig as an ETL tool to perform transformations, event joins, filtering, and some pre-aggregations
- Analysed the data by performing Hive queries and running Pig scripts to study customer behaviour
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources
- Worked with Sqoop to export analysed data from HDFS into an RDBMS for report generation and visualization purposes
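As a sketch of the date-transformation Pig UDFs mentioned above (the class name and the source/target date formats are hypothetical):

```java
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical UDF normalizing source dates (MM/dd/yyyy) to an assumed
// application format (yyyy-MM-dd); bad values yield null so they can be
// filtered downstream in the Pig script.
public class NormalizeDate extends EvalFunc<String> {
    private final SimpleDateFormat source = new SimpleDateFormat("MM/dd/yyyy");
    private final SimpleDateFormat target = new SimpleDateFormat("yyyy-MM-dd");

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        try {
            return target.format(source.parse(input.get(0).toString()));
        } catch (ParseException e) {
            return null; // unparseable date: emit null for downstream filtering
        }
    }
}
```

In the Pig script, the jar containing the UDF would be REGISTERed and the function applied inside a FOREACH ... GENERATE statement.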
Confidential
Hadoop Developer
Responsibilities:
- Worked with Hadoop and the Hadoop ecosystem.
- Worked with the Big Data technologies Hive, Pig, and MapReduce.
- Wrote Hive queries and Pig scripts for data analysis to meet business requirements (see the sketch after this list).
- Involved in extracting and loading data from Hive to an RDBMS using Sqoop.
- Involved in transforming data within a Hadoop cluster using Pig scripts.
- Involved in loading log file data into Hive using Flume.
- Involved in importing and exporting job entries using Sqoop.
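A minimal sketch of issuing one of the Hive analysis queries above from Java via the HiveServer2 JDBC driver; the host, credentials, table, and column names are placeholders, and the JDBC approach itself is an assumption about how the queries were submitted.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Runs an aggregate query against HiveServer2 and prints the results;
// all connection details and identifiers below are hypothetical.
public class HiveReport {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hiveserver:10000/default", "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT log_date, COUNT(*) FROM web_logs GROUP BY log_date")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}
```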
Environment: Apache Hadoop, HDFS, MapReduce, MySQL, DB Visualizer, Linux, Apache Hive, Apache Pig, Sqoop
Confidential, Boston, MA
Java Developer (Hadoop)
Responsibilities:
- Created the web application using the Struts MVC framework (see the sketch after this list).
- Created user-friendly GUI interfaces and web pages using HTML and DHTML embedded in JSP.
- Developed the web layer using the Struts framework to manage the project in the MVC pattern.
- Used Eclipse to develop J2EE components, including a lightweight JSP front end; CSS and scripts were also part of front-end development.
- Designed the J2EE project with the Front Controller pattern.
- Designed CSS and tag libraries for the front end.
- Made extensive use of JavaScript and AJAX to control user-facing functions.
- Developed front-end screens using jQuery, JSP, JavaScript, Java, and CSS.
- Attended business and requirements meetings.
- Used Ant to create build scripts for deployment and to run the JUnit test cases.
- Used VSS extensively to check in, check out, and version code, and to maintain production, test, and development views appropriately.
- Analyzed the sources of data and organized them into a structured table setup.
- Delivered daily reports and data sheets to clients for their business meetings.
- Performed code reviews, unit testing, and local integration testing.
- Integrated application modules and components and deployed them to the target platform.
- Involved in the requirements study and in preparing the detailed software requirements specification.
- Involved in low-level and high-level design and prepared HLD and LLD documents in Visio. Provided testing support during integration and production.
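A minimal sketch of a Struts (1.x) action of the kind used in the MVC web layer described above; the class name, request parameter, and forward name are hypothetical.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

// Hypothetical Struts action: the controller half of the MVC setup,
// pulling a request parameter and forwarding to a JSP view.
public class ViewReportAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request,
                                 HttpServletResponse response) throws Exception {
        String reportId = request.getParameter("reportId");
        request.setAttribute("reportId", reportId);
        // "success" would be mapped to a JSP view in struts-config.xml.
        return mapping.findForward("success");
    }
}
```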
Environment: C, Java, Hadoop, Hive, Sqoop, Zookeeper, MapReduce, JSP, J2EE, Struts, JDBC, Oracle, SQL, Log4j, JUnit, VSS, Ant, Shell script, Visio.
Confidential
Java Developer
Responsibilities:
- Used Eclipse to develop J2EE components, including a lightweight JSP front end; CSS and scripts were also part of front-end development.
- Designed the J2EE project with the Front Controller pattern.
- Developed front-end screens using jQuery, JavaScript, Java, and CSS.
- Attended business and requirements meetings.
- Designed data structures, the database, and tables in Oracle.
- Used Ant to create build scripts for deployment and to run the JUnit test cases.
- Used VSS extensively to check in, check out, and version code, and to maintain production, test, and development views appropriately.
- Analyzed the sources of data and organized them into a structured table setup.
- Delivered daily reports and data sheets to clients for their business meetings.
- Performed code reviews, unit testing, and local integration testing.
- Integrated application modules and components and deployed them to the target platform.
- Involved in the requirements study and in preparing the detailed software requirements specification.
- Involved in low-level and high-level design and prepared HLD and LLD documents in Visio.
- Provided testing support during integration and production.
Environment: JSP, J2EE, Servlets, Struts, JDBC, Oracle, SQL, Log4j, JUnit, VSS, Ant, Shell script, Visio.