Hadoop Developer Resume
Boston, MA
SUMMARY
- Experienced in analysis, design, development, testing, implementation, maintenance and enhancement of Big Data (Hadoop) and Java applications in various IT projects.
- 8 years of experience in the IT industry, including around 3 years of experience in Big Data implementing complete Hadoop solutions and 5 years of experience in Java.
- Good working experience in using Apache Hadoop ecosystem components like MapReduce, HDFS, Hive, Sqoop, Pig, Oozie, Flume, HBase, Spark, Storm, Kafka, Scala and ZooKeeper.
- Writing UDFs and integrating with Hive and Pig.
- Experience with Sequence files, Avro and ORC file formats, and compression.
- Experience in Hadoop distributions: Cloudera and Hortonworks.
- Performed importing and exporting data into HDFS and Hive using Sqoop.
- Experience in job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
- Extensive knowledge in using SQL Queries for backend database analysis.
- Used different Hive SerDes like Regex SerDe and HBase SerDe.
- Strong knowledge of NoSQL databases like HBase, Cassandra and MongoDB, and their integration with Hadoop clusters.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and vice versa.
- Led many data analysis and integration efforts involving Hadoop along with ETL.
- Extensive experience with SQL, PL/SQL and database concepts.
- Transferred bulk data from RDBMS systems like Teradata into HDFS using Sqoop.
- Experience in analyzing data using Hive QL, Pig Latin, and custom MapReduce programs in Java.
- In-depth knowledge of writing Map Reduce code in Java as per the business requirements.
- Experience in developing multi-tier Java-based web applications.
- Expertise in Core Java, J2EE, Multithreading, JDBC and Shell Scripting, and proficient in using Java APIs for application development.
- Good experience in developing applications using Java EE technologies including Servlets, Struts, JSP, and JDBC.
- Well-versed in Agile and other SDLC methodologies; able to coordinate with owners and SMEs.
- Worked on different operating systems like UNIX/Linux, Windows XP, and Windows 2K.
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, Map Reduce, HDFS, Hive, Pig, Sqoop, Spark, Scala, HBase, Oozie, Flume
Programming Languages: Core Java, JSP, JDBC, Linux
Scripting Languages: JSP & Servlets, JavaScript, Python, XML, and HTML
Databases: Oracle 11g/10g, MySQL, Teradata, HBase, Cassandra
Tools: Eclipse, JDeveloper, ETL, JUnit, MS Visual Studio
Application Servers: Apache Tomcat
Testing Tools: NetBeans, Eclipse, JUnit, MR Unit
Methodologies: Agile, Scrum and Waterfall
PROFESSIONAL EXPERIENCE
Hadoop Developer
Confidential, Boston, MA
Responsibilities:
- Worked with the Business Analyst and helped represent the business domain details.
- Hands-on experience in gathering information from different nodes into the Greenplum database and then loading it incrementally into HDFS using Sqoop.
- Involved in loading data from the Linux file system to HDFS.
- Wrote Map Reduce jobs for text mining and worked with the predictive analysis team to check the output against the requirements.
- Hands-on experience in writing Hive UDFs to meet the requirements and to handle different schemas and XML data.
- Used Pig as an ETL tool to do transformations, event joins, bot traffic filtering and some pre-aggregations before storing the data onto HDFS.
- Wrote Hive and Pig scripts for joining the raw data with the lookup data and for some aggregative operations as per the business requirement.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
- Involved in writing Flume and Hive scripts to extract, transform and load the data into the database.
- Implemented Partitioning and bucketing in Hive based on the requirement.
- Connected Tableau from the client end using AWS IP addresses to view the end results.
- Developed Oozie coordinators and workflows to automate Hive, Map Reduce, Pig and other jobs.
- Created test cases as part of enhancement rollouts and was involved in unit-level and integration-level testing.
- Hands-on experience in working with Snappy compression and different file formats.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Environment: Hadoop, Map Reduce, Hortonworks, HDFS, Hive, Pig, Sqoop, Spark, Kafka, Oozie, Tableau, Impala, Greenplum, SQL, Java (JDK 1.6), Eclipse.
Hadoop Developer
Confidential, Dallas, TX
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Imported data from various sources, including MySQL, into HDFS on a regular basis using Sqoop.
- Wrote multiple MapReduce programs for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats.
- Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
- Involved in loading data from LINUX file system to HDFS.
- Loaded and transformed large sets of structured, semi-structured and unstructured data in various formats like text, zip, XML and JSON.
- Defined job flows and developed simple to complex Map Reduce jobs as per the requirement.
- Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
- Developed PIG UDFs for manipulating the data according to Business Requirements and also worked on developing custom PIG Loaders.
- Hands-on experience in setting up an HBase column-based storage repository for archival and retro data.
- Responsible for creating Hive tables based on business requirements.
- Along with the infrastructure team, designed and developed a Kafka- and Storm-based data pipeline.
- Implemented Partitioning, Dynamic Partitions and Buckets in HIVE for efficient data access.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Involved in data modeling, sharding and replication strategies in Cassandra.
- Loaded the data into Spark RDDs and performed in-memory computation to generate the output response.
- Knowledge of handling Hive queries using Spark SQL, which integrates with the Spark environment.
- Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team.
- Utilized Agile Scrum methodology to help manage and organize a team of 4 developers with regular code review sessions.
Hadoop Developer
Confidential, Bloomington, IL
Responsibilities:
- Understood business needs, analyzed functional specifications, and mapped them to the design and development of MapReduce programs and algorithms.
- Optimized Hadoop MapReduce code, Hive and Pig scripts for better scalability, reliability and performance.
- Developed the Oozie workflows for application execution.
- Performed data migration from legacy RDBMS databases to HDFS using Sqoop.
- Wrote Pig scripts for data processing.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Implemented Hive tables and HQL Queries for the reports.
- Imported data from Cassandra into HDFS using the Mongo export utility.
- Involved in developing shell scripts and automating data management as part of end-to-end integration work.
- Experience in performing data validation using HIVE dynamic partitioning and bucketing.
- Used complex data types to store and retrieve data using HQL in Hive.
- Developed Hive queries to analyze reducer output data.
- Implemented ETL code to load data from multiple sources into HDFS using pig scripts.
- Highly involved in designing the next generation data architecture for the Unstructured data.
- Developed PIG Latin scripts to extract data from source system.
- Created and maintained technical documentation for executing Hive queries and Pig scripts.
- Involved in extracting data from Hive and loading it into an RDBMS using Sqoop.
Environment: HDFS, Map Reduce, MySQL, Cassandra, Hive, HBase, Oozie, PIG, ETL, Hortonworks (HDP 2.0), Shell Scripting, Linux, Sqoop, Flume and Oracle 11g.
Java/J2EE Developer
Confidential, Dallas, TX
Responsibilities:
- Developed a parameterizable technique to recommend indexes based on index types.
- Reduced the time taken to answer the query.
- Designed and documented RESTful/HTTP APIs, including JSON data formats and API versioning strategy.
- Involved in code review, User Acceptance and Use case Testing.
- Developed application code for Java programs.
- Developed and consumed REST-based web services using the JAX-RS specification.
- Developed all logical and physical models, deployed all applications, and provided documentation for all processes.
- Used Spring Framework for Dependency injection and integrated with Hibernate.
- Involved in development of the application using Spring Web MVC and other components of the Spring Framework such as Spring Context, Spring ORM.
- Created SQL queries, sequences and views for the back-end Oracle database.
- Prepared all project-standards documents and maintained their accuracy, managed technical resources to meet requirements, and tested various processes in coordination with development teams.
- Involved in unit testing and resolving test defects.
Environment: Java, JSP, RESTful Services, Spring, Servlets, Tomcat server 5.0, SQL Server 2000, Web Services, Data mining.
Java/J2EE Developer
Confidential
Responsibilities:
- Involved in various stages of Enhancements in the Application by doing the required analysis, development, and testing.
- Prepared the high-level and low-level design documents and worked on digital signature generation.
- Created use case, class and sequence diagrams for analysis and design of the application.
- Developed logic and code for the registration and validation of enrolling customers.
- Developed web-based user interfaces using the Struts framework.
- Handled client-side validations using JavaScript.
- Wrote SQL queries and stored procedures, and enhanced performance by running explain plans.
- Involved in integration of various Struts actions in the framework.
- Used the Validation Framework for server-side validations.
- Created test cases for the Unit and Integration testing.
- Integrated the front end with the Oracle database using the JDBC API through the JDBC-ODBC Bridge driver at the server side.
Environment: Java Servlets, JSP, JavaScript, Web Services, XML, HTML, UML, Apache Tomcat, JDBC, Oracle, SQL.
Java Trainee/Developer
Confidential
Responsibilities:
- Involved in Analysis of the requirements.
- Prepared the High and Low level design document.
- Created UML artifacts: use case, class and sequence diagrams.
- Implemented Connection pool object for database connectivity.
- Wrote hbm files and BO classes using Hibernate 3.3.1.
- Used XML parsers to parse incoming data and populate the database with it.
- Designed the GUI screens using Struts and configured log4j to debug the application.
- Performed end-to-end integration testing of online scenarios and unit testing using the JUnit testing framework.
Environment: Java, JSP, Struts, SQL, JDBC, JavaScript, CSS
