Sr Hadoop Developer Resume

MN

SUMMARY

  • 7+ years of IT experience, including 3.5 years of experience with Apache Hadoop components such as HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie and Solr, and with Big Data analytics.
  • Expertise in Big Data technologies as a consultant, with proven capability in project-based teamwork as well as individual development, and good communication skills.
  • Hands on experience in designing and implementing Big Data projects using components of Hadoop ecosystem including Hive, Pig, Sqoop, Oozie, Flume, Spark.
  • Experience in working with Hadoop clusters using AWS EMR, Cloudera (CDH5), MapR and HortonWorks Distributions.
  • Hands-on experience in installing, configuring and using Hadoop ecosystem components like Hadoop MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Kafka, Solr, Pig and Flume.
  • Hands-on development and implementation experience in a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, Apache Kite and other Hadoop-related ecosystem components as data storage and retrieval systems.
  • Imported and exported data into HDFS and Hive using Sqoop and Flume.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom Map Reduce programs in Java.
  • Worked on performance tuning of Hadoop jobs by applying techniques such as map-side joins, partitioning and bucketing.
  • Working experience with, and understanding of, the latest Hadoop ecosystem developments such as Apache Spark integration with Hadoop.
  • Extending Hive and Pig core functionality by writing UDFs.
  • Customizing batch Java programs & Shell script development.
  • Good experience installing, configuring, testing Hadoop ecosystem components.
  • Good experience in writing Pig and Hive UDFs that serve as reusable utility classes.
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Experience in installation, configuration, support and management of Cloudera's Hadoop platform, including CDH4 and CDH5 clusters.
  • Hands on experience in Agile and Scrum methodologies.
  • Worked on multiple stages of Software Development Life Cycle including Development, Component Integration, Performance Testing, Deployment and Support Maintenance.
  • Proven experience in ETL (PDI / Kettle (Pentaho Data Integration), Ab Initio).
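
The map-side join mentioned above as a performance-tuning technique can be illustrated in plain Python (a hypothetical sketch with made-up table data, not code from any project below): the small table is loaded into memory on each mapper so the join happens without a shuffle or reduce phase.

```python
# Sketch of a map-side (replicated) join: the small dimension table is
# loaded into an in-memory dict, and the large fact table is streamed
# through it, avoiding the shuffle a reduce-side join would require.
def map_side_join(small_table, large_table):
    # small_table: iterable of (key, dim_value); assumed to fit in memory
    lookup = dict(small_table)
    # large_table: iterable of (key, fact_value); streamed row by row
    for key, fact in large_table:
        if key in lookup:            # inner-join semantics
            yield key, fact, lookup[key]

# Hypothetical sample data for illustration only
dims = [("MN", "Minnesota"), ("MI", "Michigan")]
facts = [("MN", 101), ("GA", 202), ("MI", 303)]
joined = list(map_side_join(dims, facts))
```

Hive applies the same idea automatically when one side of a join is small enough to broadcast (e.g. via hive.auto.convert.join).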

TECHNICAL SKILLS:

Languages: C, C++, Java, Scala, SQL and PL/SQL

Big Data Framework and Eco Systems: Hadoop, MapReduce, Hive, Pig, HDFS, ZooKeeper, Sqoop, Solr, Oozie, YARN, Flume, Spark, Storm, Shark.

NoSQL: HBase, Cassandra, MongoDB and Membase

Web Technologies: JavaScript, CSS, HTML, XHTML, AJAX, XML, XSLT.

Databases: Oracle 8i/9i/10g/11g, MySQL, PostgreSQL and MS Access

Operating Systems: Windows XP/2000/NT, Linux, UNIX

Tools: Ant, Maven, TOAD, ArgoUML, WinSCP, PuTTY, Lucene

IDE Tools: Eclipse 4.x, Eclipse RCP, NetBeans 6, EditPlus

Version Control Tools: CVS, SVN

ETL Tools: PDI / Kettle (Pentaho Data Integration), Ab Initio

PROFESSIONAL EXPERIENCE:

Confidential, MN

Sr Hadoop Developer

Responsibilities:

  • Developed MapReduce jobs in Java for data cleansing and preprocessing.
  • Moved data from DB2 and Oracle Exadata to HDFS and vice versa using Sqoop.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Worked with different file formats and compression techniques to determine standards.
  • Configured Spark Streaming to receive real time data from the Kafka and store the stream data to HDFS.
  • Developed and implemented data integration (ETL) using Sqoop and MapReduce to automate processes in the Cloudera environment, and built Oozie workflows.
  • Responsible for generating GIS domain specific reports for various clients in the organization
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Implemented a POC, writing programs in Scala and processing data using Spark SQL.
  • Developed Hive queries and UDFs to analyze/transform the data in HDFS.
  • Developed Hive scripts for implementing control-table logic in HDFS.
  • Designed and Implemented Partitioning (Static, Dynamic), Buckets in HIVE.
  • Developed Pig scripts and UDFs as per the business logic.
  • Developed user-defined functions in Pig using Python.
  • Analyzing/Transforming data with Hive and Pig.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Conducted POC for Hadoop and Spark as part of NextGen platform implementation. Implemented recommendation engine using Scala.
  • Developed Oozie workflows, which are scheduled through a scheduler on a monthly basis.
  • Designed and developed read lock capability in HDFS.
  • Implemented Hadoop Float equivalent to the DB2 Decimal.
  • Using the data Integration tool Pentaho for designing ETL jobs in the process of building Data warehouses and Data Marts.
  • Involved in End to End implementation of ETL logic.
  • Effective coordination with offshore team and managed project deliverable on time.
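
The data-cleansing MapReduce work described above can be sketched as a Hadoop Streaming-style mapper in Python (a hypothetical illustration; the original jobs were written in Java, and the delimiter and field layout here are assumptions, not taken from the actual project):

```python
import sys

def clean_record(line, delimiter="\t", expected_fields=3):
    """Mapper logic: trim whitespace, drop malformed rows, normalize the key."""
    fields = [f.strip() for f in line.rstrip("\n").split(delimiter)]
    if len(fields) != expected_fields or any(f == "" for f in fields):
        return None                   # malformed row, filtered out
    fields[0] = fields[0].lower()     # normalize a hypothetical key column
    return delimiter.join(fields)

def stream_mapper(lines, out=sys.stdout):
    """Hadoop Streaming would drive this loop with sys.stdin as `lines`."""
    for line in lines:
        cleaned = clean_record(line)
        if cleaned is not None:
            print(cleaned, file=out)
```

Under Hadoop Streaming this file would be passed via the -mapper option, with stream_mapper(sys.stdin) as the entry point; the reducer side is omitted from the sketch.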

Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, Scala, Spark, XML, ETL, DB2, Linux, QA, Python and Pentaho (PDI/Kettle)

Confidential, Dearborn, MI

Hadoop Developer

Responsibilities:

  • Led the team in making valuable decisions by performing analytical operations
  • Predicted consumer behavior, such as what products a particular user has bought, and made predictions/recommendations based on recognizing patterns using Hadoop, Hive and Pig queries.
  • Installed and configured Hadoop, MapReduce, and HDFS.
  • Developed multiple MapReduce jobs using Java API for data cleaning and preprocessing.
  • Imported and exported data into HDFS and Hive from an Oracle 11g database using Sqoop.
  • Responsible for managing data coming from different sources.
  • Monitoring the running MapReduce programs on the cluster.
  • Responsible for loading data from UNIX file systems into HDFS.
  • Installed and configured Hive.
  • Worked with application teams to install Hadoop updates, patches and version upgrades as required.
  • Installed and configured Hive, Pig, Sqoop and Oozie on the HDP 2.0 cluster.
  • Involved in implementing High Availability and automatic failover infrastructure, using ZooKeeper services, to overcome the NameNode single point of failure.
  • Implemented HDFS snapshot feature.
  • Experience in migrating business reports to Spark, Hive, Pig and MapReduce.
  • Performed a Major upgrade in production environment from HDP 1.3 to HDP 2.0.
  • Worked with big data developers, designers and scientists in troubleshooting map reduce job failures and issues with Hive, Pig and Flume.
  • Involved in Installation and configurations of patches and version upgrades.
  • Involved in Hadoop cluster environment administration, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring and troubleshooting.
  • Supported MapReduce programs running on the cluster.
  • Involved in HDFS maintenance and administered it through the Hadoop Java API.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that invoke and run MapReduce jobs in the backend.
  • Installed and configured Pig.
  • Wrote Pig scripts to process unstructured data and create structured data for use with Hive.
  • Developed Sqoop scripts to enable interaction between Pig and the MySQL database.
  • Developed scripts and automated data management end to end, including synchronization between all the clusters.
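
A Pig Python UDF of the kind used to turn raw fields into structured data might look like the sketch below (hypothetical function and schema names; pig_util is only importable when the script runs under Pig, hence the fallback):

```python
try:
    from pig_util import outputSchema   # provided by Pig's Python UDF runtime
except ImportError:                     # fallback so the module runs standalone
    def outputSchema(schema):
        def decorator(func):
            return func
        return decorator

@outputSchema("event:chararray")
def normalize_event(raw):
    """Turn a raw log token into a clean, lower-case event name."""
    if raw is None:
        return None
    return raw.strip().lower().replace(" ", "_")
```

In a Pig script this would be registered with REGISTER 'udfs.py' USING jython AS udfs; and invoked as udfs.normalize_event(field).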

Environment: Apache Hadoop 2.0.0, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, MapReduce, HDFS, LINUX, Oozie, Cassandra, Hue, HCatalog, Java, Eclipse, VSS, Red Hat Linux.

Confidential, Atlanta, GA

Hadoop/ Lead Java Developer

Responsibilities:

  • Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Designed and developed Oozie workflows for sequencing the flow of job execution.
  • Mainly worked on Big Data analytics and Hadoop/MapReduce infrastructure.
  • Gained good experience with NoSQL databases.
  • Ran and supported MapReduce programs on the cluster.
  • Installed and configured Hive and wrote Hive UDFs.
  • Implemented CDH3 Hadoop cluster.
  • Installed the cluster and handled monitoring/administration of cluster recovery, capacity planning and slots configuration.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Implemented best income logic using Pig scripts.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the Business Intelligence (BI) team.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Wrote Hadoop MR programs to collect logs and feed them into Cassandra for analytics purposes.
  • Building, packaging and deploying the code to the Hadoop servers.
  • Developed a script to run the nightly batch process using Python.
  • Wrote Unix scripts to manage Hadoop operations.
  • Wrote Puppet programs for the installation and configuration of Cloudera Hadoop CDH3u1.
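
The nightly Python batch driver mentioned above might be sketched as follows (a hypothetical illustration: the job names, jar, class and HDFS paths are assumptions, not details from the actual project):

```python
import datetime
import subprocess

def build_batch_command(job_jar, job_class, run_date):
    """Assemble the `hadoop jar` invocation for one day's partition."""
    partition = run_date.strftime("%Y-%m-%d")
    return [
        "hadoop", "jar", job_jar, job_class,
        "-D", f"mapreduce.job.name=nightly-{partition}",
        f"/data/raw/dt={partition}",      # hypothetical input path
        f"/data/clean/dt={partition}",    # hypothetical output path
    ]

def run_nightly(job_jar="etl.jar", job_class="com.example.CleanseJob"):
    cmd = build_batch_command(job_jar, job_class, datetime.date.today())
    # In production a cron entry would invoke this; a non-zero exit
    # status from the Hadoop job aborts the batch.
    subprocess.run(cmd, check=True)
```

Separating command construction from execution keeps the date/path logic unit-testable without a Hadoop cluster.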

Environment: JDK 1.6, Hadoop, MapReduce, HDFS, Hive, Java, SQL, Datameer, Pig, Sqoop, CentOS, Cloudera, Python.

Confidential

Java Developer

Responsibilities:

  • Involved in System Analysis that included the high-level design, low-level design, and contributed to the technical architecture of the system.
  • Involved in drawing UML diagrams like class diagram, package diagrams, sequence diagrams, activity diagrams.
  • Used the Spring framework to implement the MVC architecture.
  • Developed UI components using JSP, JSTL, JavaScript, jQuery and Ajax.
  • Configured applicationContext.xml to integrate Hibernate with Spring.
  • Wrote named queries using Hibernate Query Language (HQL).
  • Implemented listener classes and configured them in web.xml.
  • Developed scripts for making asynchronous calls to update the combo boxes across the project using AJAX.
  • Involved in setting coding standards and writing related documentation.
  • Developed web tier using Struts tag libraries, CSS, HTML, XML, JSP and Servlets.
  • Applied expertise in core Java, multithreading, JDBC and shell scripting, with proficient use of Java APIs for application development.
  • Involved in DB design phase of the project.
  • Involved in Bug fixing and tracking.
  • Prepared unit level test cases and tested using JUnit.
  • Implemented e-mail notification using JavaMail.
  • Used Maven for dependency management.

Environment: Java/J2EE, JSP, Servlets, JDBC, Scala, Spring 2.5, EJB, JavaMail, WebTrends, AJAX, HTML, XML, SQL, PL/SQL, Oracle 8i/9i, JavaScript, shell scripting, ANT, WSAD 5.1, VSS, Unix, Log4j, jQuery.

Confidential

Java Developer

Responsibilities:

  • Involved in the complete software development life cycle (SDLC) of the application, from requirement analysis to testing.
  • Developed the modules based on the Struts MVC architecture.
  • Developed the UI using JavaScript, JSP, HTML and CSS for interactive cross-browser functionality and a complex user interface.
  • Created business logic using Servlets and Session Beans and deployed them on the WebLogic server.
  • Used MVC struts framework for application design.
  • Created complex SQL queries and PL/SQL stored procedures and functions for the back end.
  • Prepared the Functional, Design and Test case specifications.
  • Involved in writing Stored Procedures in Oracle to do some database side validations.
  • Performed unit testing, system testing and integration testing.
  • Developed Unit Test Cases. Used JUNIT for unit testing of the application.
  • Provided technical support for production environments: resolved issues, analyzed defects, and provided and implemented solutions, addressing higher-priority defects on schedule.

Environment: Java, HTML, JavaScript, Scala, CSS, Oracle, JDBC, Swing and Eclipse.
