Hadoop Developer Resume
SUMMARY
- Around 4 years of extensive experience in the IT industry, involved in analysis, design, development, testing, documentation and maintenance of Client/Server and Web applications. Extensive knowledge of project life cycle and related processes.
- Good experience in Big Data using Hadoop, Hive, Pig, Sqoop, and MapReduce programming.
- Strong development skills with Java, JSP, JDBC, SQL, Servlets, and J2EE.
- Strong background in object-oriented analysis and design (OOA&D) and programming concepts.
- Working experience with the MapReduce programming model and the Hadoop Distributed File System (HDFS).
- Hands-on experience with major components of the Hadoop ecosystem such as Flume, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, and YARN.
- Excellent understanding of Hadoop architecture and the components of a Hadoop cluster, including JobTracker, TaskTracker, NameNode, and DataNode.
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Imported and exported data between databases such as Oracle and Teradata and HDFS and Hive using Sqoop.
- Wrote Hive and Pig queries for data analysis to meet business requirements.
- Involved in creating, partitioning, and bucketing tables and in writing UDFs in Hive.
- Experience with performance tuning of Hive queries.
- Experienced in implementing join operations in Pig Latin.
- Involved in writing data transformations and data cleansing using Pig operations.
- Extensive knowledge of NoSQL databases such as HBase and Cassandra.
- Experienced in performing CRUD operations using the HBase Java client API and REST API.
- Experience using the Oozie workflow engine to automate and parallelize Hadoop MapReduce, Hive, and Pig jobs.
- Experienced in processing file formats such as Avro, XML, JSON, and SequenceFile using MapReduce programs.
- Strong knowledge of SOA concepts and experience in developing web services based on SOAP and REST protocols.
- Developed web applications using Java and J2EE, including the design and implementation of custom web parts and other programs that leverage the object model.
- Experienced in preparing and executing Unit Test Plan and Unit Test Cases using JUnit.
- Knowledge of enterprise content management, collaboration, and portals.
- Worked with version control systems such as CVS, SVN, and Git.
- Familiar with the software development life cycle, Agile, Scrum, and other project methodologies.
- Strong technical, analytical, and logical skills; highly adaptable to new environments and eager to learn new technologies.
- Committed to professionalism and highly organized, with the ability to work under strict deadlines, attention to detail, and excellent written and verbal communication skills.
- Ability to work effectively in a multi-cultural environment with a team and individually as per the project requirement.
TECHNICAL SKILLS
Hadoop Ecosystem: Hadoop, MapReduce, Sqoop, Hive, Oozie, Pig, HDFS, YARN, ZooKeeper, Flume, Spark, Cloudera Distribution for Hadoop (CDH)
NoSQL Databases: HBase, Cassandra, MongoDB
Languages: Core Java, JDBC, SQL, C, C#, Pig Latin, HiveQL, Scala, J2EE, R, Python
Web Technologies: JSP, Servlets, HTML, CSS, XML
Scripting: JavaScript, Python, Unix Shell Scripting
Databases: MySQL, MS SQL Server, PostgreSQL
Tools: Quality Center, SQL Query Analyzer, JUnit, Eclipse, Teradata
Web Services: WSDL, SOAP, REST
Methodologies: Scrum, Agile, Waterfall
Platforms: Windows, UNIX, Linux
PROFESSIONAL EXPERIENCE
Confidential
Hadoop Developer
Responsibilities:
- Involved in running Hadoop jobs for processing millions of records of text data.
- Validated data between the existing system and the new cluster.
- Wrote Apache Pig and Python scripts to process data in HDFS. Developed a MapReduce program to parse information and load it into HDFS.
- Developed Pig UDFs to manipulate data according to business requirements and also developed custom Pig loaders.
- Developed Pig Latin scripts for transformations, sorting, grouping, event joins, and filtering.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Involved in analysis, ETL design, and development for extracting data from different interfaces such as SQL and flat files.
- Involved in creating Hive tables, designing patterns, and loading and analyzing data using Hive queries.
- Involved in designing and developing the Hive data model, loading it with data, and writing Java UDFs for Hive. Used Hive windowing and analytical functions for data analysis.
- Worked with text, Avro, and JSON file formats and loaded data from the Linux file system into HDFS.
- Loaded data through Hive and HBase and performed ETL through Talend.
- Hands-on experience writing Spark SQL scripts.
- Worked on the deployment plan across the different applications (products).
- Analyzed business requirements and cross-verified them against the functionality and features of NoSQL databases such as HBase.
- Created HBase tables to store data in variable formats coming from different applications.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Worked on transferring data from HBase to Hive using MapReduce and Hive-HBase storage handlers.
- Identified Hadoop configuration changes. Integrated the test plan across the applications.
- Used Flume to move files generated from various sources into HDFS for further processing.
- Performed peer code reviews and assisted the team in full project life cycle implementation (requirement gathering to production). Worked on DataStage to fetch various reports.
- Set up monthly calls with the development, support, and test teams to discuss production issues.
- Participated in test case review calls with the test team and suggested different kinds of test scenarios.
- Researched and troubleshot technical problems and provided business logic consultation to the development team.
- Provided data flow after merging various workflows.
- Delivered complex PL/SQL queries to the testing team to test data flow across different systems.
- Provided production support for the applications. Created, maintained, and developed documentation.
- Provided SQL scripts to the support and test teams to capture data issues in the production database.
Environment: Hadoop, Apache Pig, Hive, Sqoop, MapReduce, Python, HBase, Talend, Toad for Oracle, SQL*Plus, Linux.
