Sr Hadoop Developer Resume
MN
SUMMARY:
- 7+ years of IT experience, including 3.5 years of experience with Apache Hadoop components such as HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, and Solr, and with Big Data analytics.
- Expertise in Big Data technologies as a consultant, with proven capability in project-based teamwork and as an individual developer, along with good communication skills.
- Hands-on experience in designing and implementing Big Data projects using components of the Hadoop ecosystem, including Hive, Pig, Sqoop, Oozie, Flume, and Spark.
- Experience working with Hadoop clusters using AWS EMR, Cloudera (CDH5), MapR, and Hortonworks distributions.
- Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Kafka, Solr, Pig, and Flume.
- Hands-on development and implementation experience on a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, Apache Kite, and other Hadoop-related ecosystem components as data storage and retrieval systems.
- Imported and exported data into HDFS and Hive using Sqoop and Flume.
- Experience in managing and reviewing Hadoop log files.
- Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Worked on performance tuning of Hadoop jobs by applying techniques such as map-side joins, partitioning, and bucketing.
- Working experience with, and understanding of, the latest Hadoop ecosystem developments, such as Apache Spark integration with Hadoop.
- Extended Hive and Pig core functionality by writing UDFs.
- Customized batch Java programs and developed shell scripts.
- Good experience installing, configuring, and testing Hadoop ecosystem components.
- Good experience writing Pig and Hive UDFs that serve as reusable utility functions.
- Experience in designing both time driven and data driven automated workflows using Oozie.
- Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH4 and CDH5 clusters.
- Hands on experience in Agile and Scrum methodologies.
- Worked on multiple stages of the Software Development Life Cycle, including development, component integration, performance testing, deployment, and support/maintenance.
- Proven experience in ETL with PDI/Kettle (Pentaho Data Integration) and Ab Initio.
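The map-side join technique mentioned above can be sketched in plain Python. This is a minimal simulation, not production Hadoop code: it assumes a small dimension table that fits in each mapper's memory (as it would when distributed via Hadoop's DistributedCache), and the table contents are illustrative only.

```python
# Minimal simulation of a Hadoop map-side (broadcast) join: the small
# dimension table is loaded into memory on each mapper, so the join happens
# entirely in the map phase with no shuffle/reduce step.
# Keys and values here are hypothetical.

def map_side_join(fact_rows, dim_table):
    """Join each (key, value) fact row against an in-memory dimension lookup."""
    dim = dict(dim_table)              # small table: key -> attribute
    joined = []
    for key, value in fact_rows:
        if key in dim:                 # inner-join semantics: drop unmatched keys
            joined.append((key, value, dim[key]))
    return joined

if __name__ == "__main__":
    facts = [("u1", 100), ("u2", 250), ("u3", 75)]   # large "fact" side
    dims = [("u1", "MN"), ("u3", "MI")]              # small "dimension" side
    print(map_side_join(facts, dims))
```

Because no reducer is involved, this pattern avoids the shuffle cost that a reduce-side join would pay, which is why it is a common Hadoop tuning technique when one input is small.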
TECHNICAL SKILLS:
Languages: C, C++, Java, Scala, SQL, and PL/SQL
Big Data Framework and Eco Systems: Hadoop, MapReduce, Hive, Pig, HDFS, ZooKeeper, Sqoop, Solr, Oozie, YARN, Flume, Spark, Storm, Shark
NoSQL: HBase, Cassandra, MongoDB, and Membase
Web Technologies: JavaScript, CSS, HTML, XHTML, AJAX, XML, XSLT
Databases: Oracle 8i/9i/10g/11g, MySQL, PostgreSQL, and MS Access
Operating Systems: Windows XP/2000/NT, Linux, UNIX
Tools: Ant, Maven, TOAD, ArgoUML, WinSCP, PuTTY, Lucene
IDE Tools: Eclipse 4.x, Eclipse RCP, NetBeans 6, EditPlus
Version Control Tools: CVS, SVN
ETL Tools: PDI / Kettle (Pentaho Data Integration), Ab Initio
PROFESSIONAL EXPERIENCE:
Confidential, MN
Sr Hadoop Developer
Responsibilities:
- Developed MapReduce jobs in Java for data cleansing and preprocessing.
- Moved data between DB2/Oracle Exadata and HDFS in both directions using Sqoop.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Worked with different file formats and compression techniques to determine standards.
- Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS.
- Developed and implemented data integration (ETL) using Sqoop and MapReduce to automate processes in the Cloudera environment, and built Oozie workflows.
- Responsible for generating GIS domain-specific reports for various clients in the organization.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Implemented a POC, writing programs in Scala and processing data using Spark SQL.
- Developed Hive queries and UDFs to analyze/transform the data in HDFS.
- Developed Hive scripts to implement control-table logic in HDFS.
- Designed and implemented static and dynamic partitioning and bucketing in Hive.
- Developed Pig scripts and UDFs per the business logic.
- Developed user-defined functions in Pig using Python.
- Analyzed/transformed data with Hive and Pig.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
- Conducted POC for Hadoop and Spark as part of NextGen platform implementation. Implemented recommendation engine using Scala.
- Developed Oozie workflows scheduled to run on a monthly basis.
- Designed and developed read-lock capability in HDFS.
- Implemented a Hadoop float type equivalent to the DB2 decimal type.
- Used the data integration tool Pentaho to design ETL jobs for building data warehouses and data marts.
- Involved in end-to-end implementation of ETL logic.
- Coordinated effectively with the offshore team and managed project deliverables on time.
Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, Scala, Spark, XML, ETL, DB2, Linux, QA, Python, and Pentaho (PDI/Kettle)
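The dynamic partitioning work described above can be illustrated with a small Python sketch. Real Hive partitioning is declared in HiveQL with PARTITIONED BY and writes one HDFS directory per partition value; this simulation only shows the routing logic, and the column names are hypothetical.

```python
from collections import defaultdict

# Sketch of Hive-style dynamic partitioning: each row is routed into a
# bucket keyed by the partition column's value, the way Hive writes a
# separate directory (e.g. country=US/) per partition value.
# Column names are illustrative only.

def partition_rows(rows, part_col):
    """Group dict rows into partitions keyed by the partition column's value."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[part_col]].append(row)
    return dict(partitions)

if __name__ == "__main__":
    data = [
        {"id": 1, "country": "US"},
        {"id": 2, "country": "IN"},
        {"id": 3, "country": "US"},
    ]
    print(partition_rows(data, "country"))
```

Pruning at read time is the payoff: a query filtered on the partition column touches only the matching bucket, which is why partitioning (optionally combined with bucketing) is a standard Hive tuning step.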
Confidential, Dearborn, MI
Hadoop Developer
Responsibilities:
- Led the team in making valuable decisions by performing analytical operations.
- Predicted consumer behavior, such as what products a particular user has bought, and made predictions/recommendations based on recognized patterns using Hadoop, Hive, and Pig queries.
- Installed and configured Hadoop, MapReduce, and HDFS.
- Developed multiple MapReduce jobs using Java API for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive from an Oracle 11g database using Sqoop.
- Responsible for managing data coming from different sources.
- Monitoring the running MapReduce programs on the cluster.
- Responsible for loading data from UNIX file systems into HDFS.
- Installed and configured Hive.
- Worked with application teams to install Hadoop updates, patches, and version upgrades as required.
- Installed and configured Hive, Pig, Sqoop and Oozie on the HDP 2.0 cluster.
- Involved in implementing high availability and automatic failover infrastructure to overcome the NameNode single point of failure, using ZooKeeper services.
- Implemented HDFS snapshot feature.
- Experience migrating business reports to Spark, Hive, Pig, and MapReduce.
- Performed a Major upgrade in production environment from HDP 1.3 to HDP 2.0.
- Worked with big data developers, designers and scientists in troubleshooting map reduce job failures and issues with Hive, Pig and Flume.
- Involved in Installation and configurations of patches and version upgrades.
- Involved in Hadoop cluster administration, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting.
- Supported MapReduce programs running on the cluster.
- Involved in HDFS maintenance, administering it through the Hadoop Java API.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.
- Installed and configured Pig.
- Wrote Pig scripts to process unstructured data and create structure data for use with Hive.
- Developed Sqoop scripts to handle the interaction between Pig and the MySQL database.
- Developed scripts to automate end-to-end data management and synchronization between all the clusters.
Environment: Apache Hadoop 2.0.0, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, MapReduce, HDFS, Oozie, Cassandra, Hue, HCatalog, Java, Eclipse, VSS, Red Hat Linux.
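The "unstructured data to structured data" Pig work described above can be sketched in Python. The real scripts were Pig Latin; this sketch assumes a hypothetical `timestamp level message` log line format purely for illustration.

```python
import re

# Sketch of turning unstructured log lines into structured records suitable
# for loading into a Hive table, analogous to the Pig scripts described
# above. The log format (date, level, free-text message) is hypothetical.

LOG_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\w+) (.*)$")

def structure_logs(lines):
    """Parse raw lines into (date, level, message) tuples; skip malformed lines."""
    records = []
    for line in lines:
        m = LOG_PATTERN.match(line.strip())
        if m:                          # malformed lines are filtered out
            records.append(m.groups())
    return records

if __name__ == "__main__":
    raw = [
        "2014-03-01 ERROR disk full on datanode-7",
        "garbage line without structure",
        "2014-03-02 INFO block replication complete",
    ]
    print(structure_logs(raw))
```

In Pig Latin the same shape is a LOAD of raw text, a FILTER/FOREACH with a regex extract, and a STORE of the resulting tuples where Hive can read them.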
Confidential, Atlanta, GA
Hadoop/ Lead Java Developer
Responsibilities:
- Developed multiple MapReduce jobs in java for data cleaning and pre-processing.
- Designed and developed Oozie workflows for the sequence flow of job execution.
- Mainly worked on Big Data analytics and Hadoop/MapReduce infrastructure.
- Gained good experience with NoSQL databases.
- Ran MapReduce programs on the cluster.
- Installed and configured Hive, and wrote Hive UDFs.
- Implemented a CDH3 Hadoop cluster.
- Installed the cluster and handled monitoring/administration, cluster recovery, capacity planning, and slot configuration.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Implemented best income logic using Pig scripts.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the Business Intelligence (BI) team.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Wrote Hadoop MR programs to collect logs and feed them into Cassandra for analytics.
- Building, packaging and deploying the code to the Hadoop servers.
- Developed a script to run the nightly batch process using Python.
- Wrote UNIX scripts to manage Hadoop operations.
- Wrote Puppet programs for the installation and configuration of Cloudera Hadoop CDH3u1.
Environment: JDK 1.6, Hadoop, MapReduce, HDFS, Hive, Java, SQL, Datameer, Pig, Sqoop, CentOS, Cloudera, Python.
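The Python nightly batch script mentioned above might look roughly like this. The step names, ordering, and fail-fast policy are hypothetical; this is a sketch of the driver pattern, not the actual script.

```python
# Rough sketch of a nightly batch driver: run a fixed sequence of steps in
# order, stop at the first failure, and report which steps completed so the
# on-call engineer knows where to resume. Step names are hypothetical.

def run_batch(steps):
    """Run (name, fn) steps in order; return (completed_names, failed_name)."""
    completed = []
    for name, fn in steps:
        try:
            fn()
        except Exception:
            return completed, name     # abort the batch on first failure
        completed.append(name)
    return completed, None             # None means the whole batch succeeded

if __name__ == "__main__":
    steps = [
        ("extract_logs", lambda: None),
        ("load_to_cassandra", lambda: None),
        ("generate_report", lambda: None),
    ]
    print(run_batch(steps))
```

In practice each step would shell out to a Hadoop, Pig, or Sqoop command and the result would be written to a log for the morning review.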
Confidential
Java Developer
Responsibilities:
- Involved in System Analysis that included the high-level design, low-level design, and contributed to the technical architecture of the system.
- Involved in drawing UML diagrams like class diagram, package diagrams, sequence diagrams, activity diagrams.
- Used the Spring framework to implement MVC architecture.
- Developed UI components using JSP, JSTL, JavaScript, jQuery, and Ajax.
- Configured applicationContext.xml to integrate Hibernate with Spring.
- Wrote named queries using Hibernate Query Language (HQL).
- Implemented Listener classes and configured in web.xml.
- Developed scripts for making asynchronous calls to update the combo boxes across the project using AJAX.
- Involved in setting coding standards and writing related documentation.
- Developed web tier using Struts tag libraries, CSS, HTML, XML, JSP and Servlets.
- Expertise in core Java, Multithreading, JDBC, Shell Scripting and proficient in using Java API's for application development.
- Involved in DB design phase of the project.
- Involved in Bug fixing and tracking.
- Prepared unit level test cases and tested using JUnit.
- Implemented e-mail notification using JavaMail.
- Used Maven for dependency management.
Environment: Java/J2EE, JSP, Servlets, JDBC, Scala, Spring 2.5, EJB, JavaMail, WebTrends, AJAX, HTML, XML, SQL, PL/SQL, Oracle 8i/9i, JavaScript, shell scripting, Ant, WSAD 5.1, VSS, UNIX, Log4j, jQuery.
Confidential
Java Developer
Responsibilities:
- Involved in the complete SDLC software development life cycle of the application from requirement analysis to testing.
- Developed the modules based on the Struts MVC architecture.
- Developed the UI using JavaScript, JSP, HTML, and CSS for interactive cross-browser functionality and a complex user interface.
- Created business logic using Servlets and session beans, and deployed them on the WebLogic server.
- Used the Struts MVC framework for application design.
- Created complex SQL Queries, PL/SQL Stored procedures, Functions for back end.
- Prepared the Functional, Design and Test case specifications.
- Involved in writing Stored Procedures in Oracle to do some database side validations.
- Performed unit testing, system testing and integration testing
- Developed Unit Test Cases. Used JUNIT for unit testing of the application.
- Provided technical support for production environments: resolved issues, analyzed defects, and implemented solutions for them; resolved higher-priority defects per the schedule.
Environment: Java, HTML, JavaScript, Scala, CSS, Oracle, JDBC, Swing, and Eclipse.
