Sr Hadoop Developer Resume
Minneapolis, MN
SUMMARY
- 7+ years of IT experience, including 3+ years working with Apache Hadoop components such as HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, and Solr, and with Big Data analytics.
- Expertise in Big Data technologies as a consultant, with proven capability both in project-based teamwork and as an individual developer, and good communication skills.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of the MapReduce programming paradigm.
- Experience working with Hadoop clusters using AWS EMR, Cloudera (CDH5), MapR, and Hortonworks distributions.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Kafka, Solr, Pig, Flume, and Avro.
- Hands-on development and implementation experience with a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, Apache Kite, and other Hadoop ecosystem components as data storage and retrieval systems.
- Imported and exported data to and from HDFS and Hive using Sqoop.
- Experience in managing and reviewing Hadoop log files.
- Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Worked on performance tuning of Hadoop jobs by applying techniques such as map-side joins, partitioning, and bucketing.
- Knowledge of recent additions to the Hadoop ecosystem, such as Apache Spark integration with Hadoop.
- Extended Hive and Pig core functionality by writing UDFs.
- Customized batch Java programs and developed shell scripts.
- Good experience installing, configuring, and testing Hadoop ecosystem components.
- Highly knowledgeable in the WritableComparable and Writable interfaces, the Mapper and Reducer classes, and Hadoop data objects such as IntWritable, ByteWritable, and Text.
- Provided reporting solutions through ETL using Ab Initio.
- Good experience writing Pig and Hive UDFs to serve as utility classes.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH4 and CDH5 clusters.
- Hands-on experience in Agile and Scrum methodologies.
- Worked on multiple stages of the Software Development Life Cycle, including Development, Component Integration, Performance Testing, Deployment, and Support and Maintenance.
- Provided support for the Azure cloud server environment for project code deployments and Oracle DB.
- Proven experience in ETL (PDI / Kettle (Pentaho Data Integration), Ab Initio).
TECHNICAL SKILLS
Languages: C, C++, Java, SQL and PL/SQL
Big Data Framework and Eco Systems: Hadoop, MapReduce, Hive, Pig, HDFS, ZooKeeper, Sqoop, Solr, Oozie, YARN, Flume, Spark, Avro.
NoSQL: HBase, Cassandra, MongoDB and Membase
Web Technologies: JavaScript, CSS, HTML, XHTML, AJAX, XML, XSLT.
Databases: Oracle 8i/9i/10g/11g, MySQL, PostgreSQL and MS Access
Operating Systems: Windows XP/2000/NT, Linux, UNIX
Tools: Ant, Maven, TOAD, ArgoUML, WinSCP, PuTTY, Lucene
IDE Tools: Eclipse 4.x, Eclipse RCP, NetBeans 6, EditPlus
Version Control Tools: CVS, SVN
ETL Tools: PDI / Kettle (Pentaho Data Integration), Ab Initio
PROFESSIONAL EXPERIENCE
Confidential, Minneapolis, MN
Sr Hadoop Developer
Responsibilities:
- Developed MapReduce jobs in Java for data cleansing and preprocessing.
- Moved data from DB2 and Oracle Exadata to HDFS and vice versa using Sqoop.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Worked with different file formats and compression techniques to determine standards.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS.
- Leveraged Flume to stream data from a Spooling Directory source to an HDFS sink using the Avro protocol.
- Developed Hive queries and UDFs to analyze and transform the data in HDFS.
- Developed Hive scripts for implementing control table logic in HDFS.
- Designed and implemented partitioning (static and dynamic) and bucketing in Hive.
- Developed Pig scripts and UDFs according to the business logic.
- Developed user-defined functions in Pig using Python.
- Analyzed and transformed data with Hive and Pig.
- Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate to MapReduce jobs.
- Developed Oozie workflows that are run by a scheduler on a monthly basis.
- Designed and developed read lock capability in HDFS.
- Implemented a Hadoop Float equivalent of the DB2 Decimal type.
- Interacted with business users on a daily basis to understand ETL requirements.
- Used the data integration tool Pentaho to design ETL jobs in the process of building data warehouses and data marts.
- Involved in the end-to-end implementation of ETL logic.
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Worked on QA support activities, test data creation, and unit testing activities.
Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, Spark, XML, ETL, DB2, QA, Python and Pentaho (PDI/Kettle)
Confidential, Atlanta, GA
Hadoop Developer/Admin
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Designed and developed Oozie workflows for sequencing job execution.
- Worked mainly on Big Data analytics and on Hadoop and MapReduce infrastructure.
- Gained good experience with NoSQL databases.
- Ran MapReduce programs on the cluster.
- Installed and configured Hive and wrote Hive UDFs.
- Implemented a CDH3 Hadoop cluster.
- Handled cluster installation, monitoring and administration of cluster recovery, capacity planning, and slot configuration.
- Developed complex mappings using Ab Initio to load data from various sources into the Data Warehouse, using transformations such as Transform, Database, De-partition, Partition, Sort, and miscellaneous components.
- Performed attribute level analysis on source and target and prepared the data mapping document.
- Responsible for development of Ab Initio graphs.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Implemented best income logic using Pig scripts.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the Business Intelligence (BI) team.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Wrote Hadoop MapReduce programs to collect logs and feed them into Cassandra for analytics.
- Built, packaged, and deployed the code to the Hadoop servers.
- Developed a Python script to run the nightly batch process.
- Wrote Unix scripts to manage Hadoop operations.
- Wrote Puppet code for installing and configuring Cloudera Hadoop CDH3u1.
Environment: JDK 1.6, Hadoop, MapReduce, HDFS, Hive, Java, SQL, Datameter, Pig, Sqoop, CentOS, Cloudera, Python.
Confidential
Java Developer
Responsibilities:
- Involved in system analysis, including high-level and low-level design, and contributed to the technical architecture of the system.
- Involved in drawing UML diagrams such as class, package, sequence, and activity diagrams.
- Used the Spring framework to implement MVC architecture.
- Developed UI components using JSP, JSTL, JavaScript, jQuery, and AJAX.
- Configured applicationContext.xml to integrate Hibernate with Spring.
- Wrote named queries using Hibernate Query Language (HQL).
- Implemented listener classes and configured them in web.xml.
- Developed scripts for making asynchronous calls to update the combo boxes across the project using AJAX.
- Involved in setting coding standards and writing related documentation.
- Developed web tier using Struts tag libraries, CSS, HTML, XML, JSP and Servlets.
- Expertise in core Java, multithreading, JDBC, and shell scripting; proficient in using Java APIs for application development.
- Involved in the DB design phase of the project.
- Involved in bug fixing and tracking.
- Prepared unit level test cases and tested using JUnit.
- Implemented e-mail notification using JavaMail.
- Used Maven for dependency management.
Environment: Java/J2EE, JSP, Servlets, JDBC, Spring 2.5, EJB, JavaMail, Web Trends, AJAX, HTML, XML, SQL, PL/SQL, Oracle 9i, JavaScript, shell scripting, Ant, Oracle 8i, WSAD 5.1, VSS, Unix, Log4j, jQuery.
Confidential
Java Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) of the application, from requirement analysis to testing.
- Developed the modules based on Struts MVC architecture.
- Developed the UI using JavaScript, JSP, HTML, and CSS for interactive cross-browser functionality and a complex user interface.
- Created business logic using Servlets and session beans and deployed them on WebLogic server.
- Used the Struts MVC framework for application design.
- Created complex SQL queries and PL/SQL stored procedures and functions for the back end.
- Prepared the functional, design, and test case specifications.
- Involved in writing stored procedures in Oracle to perform database-side validations.
- Performed unit testing, system testing, and integration testing.
- Developed unit test cases. Used JUnit for unit testing of the application.
- Provided technical support for production environments: resolved issues, analyzed defects, and provided and implemented solutions for them.
- Resolved higher-priority defects on schedule.
Environment: Java, HTML, JavaScript, CSS, Oracle, JDBC, Swing and Eclipse.
