Hadoop Developer Resume
Beaverton, Oregon
SUMMARY
- Over 7 years of total IT experience, including 2 years and 8 months in Hadoop and Big Data technologies.
- Strong experience in using Hadoop ecosystem components such as HDFS, MapReduce, Oozie, Pig, Hive, Sqoop, Flume, Kafka, Impala, HBase and ZooKeeper.
- Excellent understanding of both classic MapReduce and YARN and their applications in Big Data analytics.
- Experience in working with Spark and Storm.
- Experience in installing, configuring and maintaining Hadoop clusters, including YARN configuration, using Cloudera and Hortonworks distributions.
- Hands-on experience in implementing secure authentication for Hadoop clusters using Kerberos.
- Experience in benchmarking Hadoop clusters to tune them and obtain the best performance.
- Familiar with all stages of Software Development Life Cycle, Issue Tracking, Version Control and Deployment.
- Extensively worked on writing, tuning and profiling MapReduce and advanced MapReduce jobs using Java.
- Experience in writing MRUnit tests to verify the correctness of MapReduce programs.
- Expertise in writing Shell-Scripts, Cron Automation and Regular Expressions.
- Hands on experience in dealing with Compression Codecs like Snappy, BZIP2.
- Implemented workflows in Oozie using Sqoop, MapReduce, Hive and other Java and Shell actions.
- In-depth knowledge of working with Avro and Parquet formats.
- Excellent knowledge of Data Flow Lifecycle and implementing transformations and analytic solutions.
- Experience in extending Hive and Pig core functionality by writing custom UDFs.
- Excellent knowledge in NoSQL databases like HBase, Cassandra and MongoDB.
- Working knowledge of data warehousing with ETL tools like IBM DB2 Warehouse Edition.
- Extensively worked on Database Applications using DB2, Oracle, MySQL, PL/SQL.
- Hands on experience in application development using Java, RDBMS.
- Strong experience as a senior Java Developer in Web/intranet, Client/Server technologies using Java, J2EE, Servlets, JSP, EJB, JDBC.
- Expertise in implementing database projects, including analysis, design, development, testing and implementation of end-to-end IT solutions.
- Worked on end-to-end implementations with data warehousing teams; strong understanding of data warehousing concepts and exposure to data modeling, normalization and business process analysis.
- Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and core Java design patterns.
- Excellent working knowledge of popular frameworks like Struts, Hibernate, and Spring MVC.
- Proven ability to work with senior-level business managers and understand the key business drivers that impact their satisfaction.
- Experience in Agile engineering practices. Excellent interpersonal and communication skills; creative, research-minded, technically competent and results-oriented, with problem-solving and leadership skills.
TECHNICAL SKILLS
- Big Data Ecosystem: Hadoop, MapReduce, YARN, Pig, Hive, HBase, Flume, Storm, Kafka, Sqoop, Impala, Oozie, ZooKeeper, Spark, Ambari, MongoDB, Cassandra, Elasticsearch and Kibana, Avro, Parquet, Maven, Ant, Snappy, Bzip2.
- Hadoop Distributions and Cloud: Cloudera, HDP, Amazon Web Services.
- Java Technologies: Java, JavaBeans, Struts, Hibernate, J2EE (JSP, Servlets, EJB), SOA, JDBC, Spring.
- NoSQL Databases: Cassandra, MongoDB, HBase.
- Relational Databases: MySQL, PL/SQL, Oracle, DB2.
- Operating Systems: UNIX/Linux, MS-DOS, Windows XP/Vista/7/8, Mac.
- Languages and Other Technologies: C, C++, C# (.NET), R, Scala, Python, HTML5, JavaScript, JSON, PHP, AJAX, XML, Visual Studio 2010, D2L (LMS).
- Methodologies: Agile, UML, Design Patterns.
- Application Servers: Apache Tomcat 5.x/6.0, GlassFish v3.1.2.2.
- Reporting Tools: Tableau.
- IDEs: NetBeans, Eclipse.
- Testing Frameworks: JUnit, MRUnit.
- Other Tools: Microsoft Office, JIRA and Prezi.
PROFESSIONAL EXPERIENCE
Confidential, Beaverton, Oregon
Hadoop Developer
Responsibilities:
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Involved in setting up the Hadoop cluster along with the Hadoop administrator.
- Worked in an OpenStack environment.
- Worked on a 40-node Hadoop cluster in production.
- Installed and configured Hadoop Ecosystem components.
- Imported the data from Oracle source and populated it into HDFS using Sqoop.
- Developed a data pipeline using Kafka and Storm to store data into HDFS and Cassandra.
- Performed real time analysis on the incoming data.
- Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
- Performed transformations such as event joins, bot-traffic filtering and pre-aggregations using Pig.
- Developed MapReduce jobs to convert data files into Parquet file format.
- Used MRUnit to test the correctness of MapReduce programs.
- Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business requirements.
- Developed business-specific custom UDFs in Hive and Pig (a sketch follows at the end of this section).
- Configured Oozie workflows to run multiple Hive and Pig jobs that run independently, triggered by time and data availability.
- Optimized MapReduce code and Pig scripts, and performed performance tuning and analysis.
- Implemented a POC with Spark SQL to interpret JSON records.
- Created table definitions and made the contents available as a SchemaRDD.
- Implemented advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
- Developed different kinds of custom filters and handled pre-defined filters on HBase data using the Java API.
- Exported the aggregated data onto Oracle using Sqoop for reporting on the Tableau dashboard.
- Involved in the design, development and testing phases of the Software Development Life Cycle.
- Performed Hadoop installation, updates, patches and version upgrades when required.
- Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
Environment: CDH4, CDH5, Eclipse, CentOS Linux, HDFS, MapReduce, Kafka, Storm, Parquet, Pig, Hive, Sqoop, Spark, HBase, Spark-SQL, Oracle, Oozie, RedHat Linux, Tableau.
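A minimal sketch of a business-specific Hive UDF of the kind described above; the class name and the normalization rule are hypothetical and shown only to illustrate the approach:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: normalizes raw event codes so that joins and
// group-bys behave consistently across data sources.
public class NormalizeEventCode extends UDF {
    public Text evaluate(Text raw) {
        if (raw == null) {
            return null; // pass NULLs through unchanged
        }
        return new Text(raw.toString().trim().toUpperCase());
    }
}
```

Once packaged into a jar, a UDF like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.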
Confidential, Detroit, MI
Hadoop Developer
Responsibilities:
- Involved in installing the cluster and configuring Hadoop ecosystem components.
- Worked with the Hadoop administrator on rebalancing blocks and decommissioning nodes in the cluster.
- Responsible for managing data coming from different sources.
- Extracted the data from web servers onto HDFS using Flume.
- Imported and exported structured data using Sqoop to move data between RDBMS and HDFS on a regular basis.
- Developed, Monitored and Optimized MapReduce jobs for data cleaning and preprocessing.
- Built data pipeline using Pig and MapReduce through Java API.
- Implemented MapReduce jobs to write data into Avro format.
- Exported the data from Avro files and indexed the documents in sequence file format.
- Implemented various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and using compression codecs wherever necessary.
- Automated all the jobs to pull the data and load into Hive tables, using Oozie workflows.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Configured scheduled jobs using Oozie workflows that integrate Hadoop actions such as MapReduce, Hive, Pig and Sqoop.
- Implemented Pattern matching algorithms with Regular Expressions, built profiles using Hive and stored the results in HBase.
- Performed CRUD operations using the HBase Java client API and REST API, as sketched below.
- Used SVN for version control.
- Used Maven to build the application.
- Implemented unit testing using MRUnit, as sketched below.
Environment: HDP, HDFS, Flume, Sqoop, Pig, Hive, MapReduce, HBase, Oozie, MRUnit, Maven, Avro, RedHat Linux, SVN, RDBMS.
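A minimal sketch of the HBase CRUD operations mentioned above, using the HBase 1.x Java client API; the table, column family, qualifier and row key names are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical CRUD example against an HBase table ("user_profiles").
public class HBaseCrudSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("user_profiles"))) {

            // Create/update: write one cell for a row key.
            Put put = new Put(Bytes.toBytes("user-123"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("city"), Bytes.toBytes("Detroit"));
            table.put(put);

            // Read: fetch the row back and extract the cell value.
            Result result = table.get(new Get(Bytes.toBytes("user-123")));
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("info"), Bytes.toBytes("city"))));

            // Delete: remove the row.
            table.delete(new Delete(Bytes.toBytes("user-123")));
        }
    }
}
```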
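And a minimal sketch of an MRUnit test of the kind mentioned above; the mapper and its input layout are hypothetical and included only to keep the example self-contained:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

// Hypothetical MRUnit test: verifies that a simple mapper emits (eventCode, 1)
// for each "timestamp,eventCode" input line.
public class EventMapperTest {

    // Minimal mapper under test.
    public static class EventMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            context.write(new Text(fields[1]), ONE);
        }
    }

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new EventMapper());
    }

    @Test
    public void emitsOneCountPerEventCode() throws IOException {
        mapDriver.withInput(new LongWritable(0), new Text("2014-01-01,CLICK"))
                 .withOutput(new Text("CLICK"), new IntWritable(1))
                 .runTest();
    }
}
```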
Confidential, Denver, CO
Java/J2EE Developer
Responsibilities:
- Responsible for Analysis, Design, Development and Integration of UI components with backend using J2EE technologies such as Servlets, Java Beans, JSP, JDBC.
- Used Spring Framework 3.2.2 for transaction management and Hibernate3 to persist the data into the database.
- Developed JSPs for user interfaces; the JSPs use JavaBeans objects to produce responses.
- Created controller servlets for handling HTTP requests from JSP pages (a sketch follows at the end of this section).
- Wrote JavaScript functions for various validation purposes.
- Implemented the presentation layer using Struts2 MVC framework.
- Designed HTML Web pages utilizing JavaScript and CSS.
- Involved in developing distributed, transactional, secure and portable applications based on Java using EJB technology.
- Deployed web applications on the WebLogic server by creating data sources and uploading jars.
- Created connection pools and configured deployment descriptors specifying the data environment.
- Implemented multithreading concepts in Java classes to avoid deadlocks.
- Involved in High Level Design and prepared Logical view of the application.
- Involved in object-oriented design and development using UML; created use case, class and sequence diagrams, and participated in the complete development, testing and maintenance process of the application.
- Created core Java interfaces and abstract classes for different functionalities.
Environment: Java/J2EE, CSS, AJAX, XML, JSP, JS, Struts2, Hibernate3, Spring Framework 3.2, Web Services, EJB3, Oracle, JUnit, Windows XP, WebLogic Application Server, Ant 1.8.2, Eclipse 3.x, SOA tool.
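A minimal sketch of a controller servlet of the kind described above; the class name, request parameter and JSP name are hypothetical:

```java
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical controller servlet: reads a parameter posted from a JSP page,
// places a result attribute in request scope and forwards to a response JSP.
public class AccountController extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String accountId = request.getParameter("accountId");

        // Business/data-access calls would go here; a simple message stands in.
        request.setAttribute("message", "Processed account " + accountId);

        // Forward to the JSP that renders the response using the request attribute.
        request.getRequestDispatcher("/accountResult.jsp").forward(request, response);
    }
}
```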
Confidential
Java Developer
Responsibilities:
- Extensively involved in the design and development of JSP screens to suit specific modules.
- Converted the application's console printing of process information to proper logging using Log4j (a sketch follows at the end of this section).
- Developed the business components (in core Java) used in the JSP screens.
- Involved in the implementation of logical and physical database design by creating suitable tables, views and triggers.
- Developed related procedures and functions used by JDBC calls in the above components.
- Extensively involved in performance tuning of Oracle queries.
- Created components to extract application messages stored in XML files.
- Executed UNIX shell scripts for command-line administrative access to the Oracle database and for scheduling backup jobs.
- Created WAR files and deployed them on the web server.
- Performed source and version control using VSS.
- Involved in maintenance support.
Environment: JDK, HTML, JavaScript, XML, JSP, Servlets, JDBC, Oracle 9i, Eclipse, Toad, Unix Shell Scripting, MS Visual SourceSafe, Windows 2000.
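A minimal sketch of the Log4j conversion described above; the class name and log messages are hypothetical:

```java
import org.apache.log4j.Logger;

// Hypothetical example of replacing System.out console printing with Log4j.
public class OrderProcessor {
    private static final Logger LOG = Logger.getLogger(OrderProcessor.class);

    public void process(String orderId) {
        // Previously: System.out.println("Processing order " + orderId);
        LOG.info("Processing order " + orderId);
        try {
            // ... business logic ...
        } catch (Exception e) {
            // Errors are logged with a stack trace instead of printed to the console.
            LOG.error("Failed to process order " + orderId, e);
        }
    }
}
```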