Hadoop Developer Resume
Pasadena, CA
SUMMARY
- Over 7 years of professional IT experience in Big Data, Hadoop, and Java/J2EE technologies across the Banking, Retail, Insurance, and Telecom sectors.
- Experience working with Big Data and the Hadoop ecosystem, with expertise in HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Oozie, ZooKeeper, and Flume.
- Hands-on experience in writing MapReduce jobs, Pig scripts, and Hive data models.
- Experience in creating real-time data streaming solutions and batch-style, large-scale distributed computing applications using Apache Spark, Spark Streaming, Kafka, and Flume.
- Hands-on experience in performing analytics using Apache Spark.
- Experience in developing customized UDFs in Java to extend Hive and Pig Latin functionality.
- Experience in setting up Kafka brokers and topics and in developing Kafka producers and consumers.
- Experience using various Hadoop distributions (Cloudera, Hortonworks, and MapR) to fully implement and leverage new Hadoop features.
- Hands-on experience with SequenceFiles, Combiners, Counters, dynamic partitions, and bucketing.
- Experience and in-depth understanding of analyzing data using HiveQL and Pig Latin.
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems.
- In-depth knowledge and expertise in map-side joins, reduce-side joins, Shuffle & Sort, Distributed Cache, compression techniques, and Multiple Inputs & Outputs.
- Excellent understanding of Hadoop architecture, including MapReduce MRv1 and MapReduce MRv2 (YARN).
- Experienced in job workflow scheduling and monitoring tools like Oozie.
- Experience in Apache Flume for efficiently collecting, aggregating, and moving large amounts of log data.
- Hands-on experience in the requirements gathering, design and development, application migration, and maintenance phases of the Software Development Life Cycle (SDLC).
- Adequate knowledge of Agile & Waterfall methodologies.
- Experience in developing applications using Java, J2EE, JSP, MVC, Servlets, Struts, Hibernate, JDBC, JSF, EJB, XML, AJAX, and web-based development tools.
- Expertise in web technologies such as HTML, CSS, PHP, and XML.
- Experienced in backend development using SQL and stored procedures on Oracle 9i, 10g, and 11g.
- Strong programming skills in XML-related technologies: XML, XSL, and XSLT; parsers such as SAX, DOM, and JAXP; and schemas such as DTD and XSD (XML Schema).
- Worked with various tools and IDEs, including Eclipse, IBM Rational, Apache Ant (build tool), MS Office, PL/SQL Developer, and SQL*Plus.
- Highly motivated, with the ability to work independently or as an integral part of a team; committed to the highest levels of professionalism.
TECHNICAL SKILLS
Big Data/Hadoop: HDFS, Hadoop MapReduce, Zookeeper, Hive, Pig, HBase, Sqoop, Flume, Oozie.
Real-time/Stream Processing: Apache Storm, Apache Spark
Distributed Message broker: Apache Kafka
Languages: Java, C, C++, SQL, PL/SQL
Databases/NoSQL: HBase, Oracle 9i, 10g, MySQL
Java Technologies: HTML, JavaScript, XML, ODBC, JDBC, JavaBeans, EJB, MVC, Ajax, JSP, Servlets, Struts, JUnit, Spring, Hibernate
Web Dev. Technologies: HTML, XML
IDE's: Eclipse, NetBeans
Operating Systems: Linux, Unix, Windows 8, Windows 7, Windows XP
PROFESSIONAL EXPERIENCE
Confidential, Pasadena, CA
Hadoop Developer
Environment: Hadoop, Hortonworks HDP, HDFS, Pig, Hive, MapReduce, Sqoop, Linux, Unix, Flume, Oozie, Spark, Kafka.
Responsibilities:
- Experience in installing, configuring, and testing Hadoop ecosystem components such as MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, and HBase.
- Involved in developing a MapReduce framework that filters out bad and unnecessary records.
- Implemented additional features such as row counts, bad-record counts, and data profiling using MapReduce jobs.
- Implemented map-side joins, Distributed Cache, and Multiple Inputs & Outputs to handle custom business requirements.
- Involved in designing the data architecture for semi-structured and unstructured data.
- Experience in managing data coming from different sources.
- Experience in writing Spark code to convert unstructured data to structured data (see the Spark sketch after this list).
- Responsible for creating Hive tables and loading the structured data resulting from MapReduce jobs into those tables.
- Implemented MapReduce counters to gather metrics on good and bad records (see the counter sketch after this list).
- Built reusable Hive UDFs for business requirements, enabling users to apply them in Hive queries (see the UDF sketch after this list).
- Used Pig to perform transformations, joins, filters, and some pre-aggregations.
- Developed customized UDFs in Java to extend Hive and Pig functionality.
- Worked with the data science team to optimize Pig scripts and MapReduce jobs.
- Responsible for developing multiple Kafka producers and consumers from scratch per the software requirement specifications (see the Kafka sketch after this list).
- Created, altered, and deleted Kafka topics as needed, with varying replication factors, partition counts, and retention (TTL) settings.
- Worked on Hue interface for querying the data.
- Experience in writing shell scripts and implementing cron jobs.
- Used ZooKeeper for cluster coordination services.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Used Flume to filter the incoming data so that only the data needed for the required analytics was retained.
- Involved in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs.
- Used the file system check utility (fsck) to verify the health of files in HDFS.
- Generated reports and assisted the BI team in data analysis.
- Created Technical documentation for launching Hadoop clusters and for executing Hive and Pig queries.
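Below are brief, hedged sketches of the techniques referenced above. First, converting unstructured log lines into structured records with Spark's Java API; the input and output paths, the regular expression, and the tab-separated output format are illustrative assumptions, not the original job.

```java
// Hypothetical sketch: structuring raw log lines with Spark's Java API.
// Paths, regex, and output format are assumptions, not the original code.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LogStructurer {
    // Matches "<host> ... \"<method> <path> ...\"" access-log lines (illustrative).
    private static final Pattern LINE = Pattern.compile("^(\\S+) .*\"(\\S+) (\\S+).*");

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("LogStructurer");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> raw = sc.textFile("hdfs:///logs/raw");   // placeholder path
            JavaRDD<String> structured = raw
                    .map(line -> {
                        Matcher m = LINE.matcher(line);
                        // Emit tab-separated host, method, path; null for unparsable lines.
                        return m.matches()
                                ? m.group(1) + "\t" + m.group(2) + "\t" + m.group(3)
                                : null;
                    })
                    .filter(rec -> rec != null);
            structured.saveAsTextFile("hdfs:///logs/structured");    // placeholder path
        }
    }
}
```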
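Next, a minimal sketch of a filtering mapper with good/bad record counters, in the spirit of the MapReduce framework and counters described above; the class name, field delimiter, and expected field count are assumptions.

```java
// Hypothetical filtering mapper with custom counters; names are illustrative.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class RecordFilterMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    // Custom counter group: good vs. bad records, surfaced in job metrics.
    public enum Records { GOOD, BAD }

    private static final int EXPECTED_FIELDS = 12; // illustrative schema width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",", -1);
        if (fields.length == EXPECTED_FIELDS) {
            context.getCounter(Records.GOOD).increment(1);
            context.write(NullWritable.get(), value); // pass good records through
        } else {
            context.getCounter(Records.BAD).increment(1); // drop and count bad records
        }
    }
}
```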
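A hedged sketch of a custom Hive UDF in Java; the function name and the normalization behavior are assumed for illustration, not the original business logic.

```java
// Illustrative Hive UDF; trimming and upper-casing a string is an assumption.
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

@Description(name = "normalize_str", value = "_FUNC_(str) - trims and upper-cases str")
public class NormalizeString extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve NULLs as Hive expects
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

A UDF like this would typically be packaged into a JAR, added to the session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in queries.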
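Finally, a minimal Kafka producer sketch; the broker address, topic name, key, and payload are placeholders.

```java
// Minimal Kafka producer sketch; broker, topic, and message are placeholders.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder broker
        props.put("acks", "all");                       // wait for full replication
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Topic, key, and value are illustrative stand-ins for the real feed.
            producer.send(new ProducerRecord<>("web-logs", "host1", "GET /index.html 200"));
        }
    }
}
```

Setting acks=all waits for the full in-sync replica set to acknowledge each write, trading some latency for durability.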
Confidential, Spring, TX
Hadoop Developer
Environment: Hadoop, Java, UNIX, HDFS, Pig, Hive, MapReduce, Sqoop, NoSQL databases, HBase, Linux, Flume, Oozie.
Responsibilities:
- Involved in writing custom MapReduce, Pig and Hive programs.
- Experience in developing customized UDFs in Java to extend Hive and Pig Latin functionality.
- Worked on partitioning and bucketing concepts in Hive and designed both managed and external tables in Hive for optimized performance.
- Created HBase tables to store data in various formats coming from different portfolios (see the HBase sketch after this list).
- Managed and scheduled jobs to remove duplicate log files in HDFS using Oozie.
- Exported the analyzed data to relational databases using Sqoop.
- Experience in using Distributed Cache for small datasets, partitioning and bucketing in Hive, map-side joins, and Multiple Inputs & Outputs.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Developed UNIX shell scripts for creating the reports from Hive data.
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions.
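A brief, hedged sketch of writing a record into an HBase table like those described above; the table name, column family, qualifier, and row key are hypothetical.

```java
// Hypothetical HBase write; table, family, qualifier, and row key are placeholders.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PortfolioWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("portfolio_events"))) {
            Put put = new Put(Bytes.toBytes("row-0001"));  // row key
            put.addColumn(Bytes.toBytes("d"),              // column family
                          Bytes.toBytes("source"),         // qualifier
                          Bytes.toBytes("web"));           // value
            table.put(put);
        }
    }
}
```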
Confidential, Akron, OH
Hadoop Developer
Environment: Linux, Hadoop, Hive, HBase, Pig, Flume, MapReduce, Sqoop, Kerberos, SQL, and Oracle.
Responsibilities:
- Explored and used Hadoop ecosystem features and architectures.
- Responsible for building scalable distributed data solutions using Hadoop.
- Developed MapReduce pipeline jobs to process the data and create the necessary files.
- Extracted data into HDFS using Sqoop.
- Involved in writing Pig Scripts for data cleansing.
- Configured the Hive tables to store Pig script output.
- In-depth understanding of partitioning and bucketing concepts in Hive.
- Responsible for designing both Managed and External tables in Hive to optimize performance.
- Used the Oozie scheduler to automate the pipeline workflow and orchestrate the MapReduce jobs that extract data in a timely manner.
- Used ZooKeeper to provide coordination services to the cluster.
- Made changes to the frontend using JSP, DHTML, CSS, AJAX, JavaScript, Struts, Spring, Java and XML.
Confidential, Beverly Hills, CA
Sr. Java Developer
Environment: Java 1.7, J2EE, Spring, Hibernate, JSP, JDBC, HTML, JavaScript, AJAX, CSS, Oracle 11g, Eclipse, Unix Shell Scripting, CVS, WebSphere Application Server.
Responsibilities:
- Involved in all phases of the Software Development Life Cycle (SDLC).
- Interacted constantly with clients on business requirements.
- Developed the presentation layer using HTML, CSS, JavaScript, and JSPs.
- Developed class, sequence, and activity diagrams to understand the system's architecture.
- Implemented object/relational persistence (Hibernate) for the domain model.
- Configured the Hibernate configuration files to persist the data to the Oracle 11g Database.
- Developed DAOs using the DAO design pattern to insert and update data for the Policy module (see the DAO sketch after this list).
- Performed server-side validations using AJAX.
- Deployed applications on IBM WebSphere Application Server.
- Responsible for troubleshooting defects, identifying their source, and fixing them.
- Documented common and recurring issues.
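An illustrative DAO in the pattern described above; Policy stands in for the real Hibernate-mapped entity of the Policy module, and the method shown is an assumption rather than the original code.

```java
// Illustrative Hibernate DAO; Policy is assumed to be a mapped entity
// defined elsewhere, and the method is a sketch of the described pattern.
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class PolicyDao {
    private final SessionFactory sessionFactory;

    public PolicyDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    /** Inserts a new policy or updates an existing one. */
    public void saveOrUpdate(Policy policy) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.saveOrUpdate(policy); // Hibernate decides insert vs. update
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();                // keep the database consistent on failure
            throw e;
        } finally {
            session.close();
        }
    }
}
```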
Confidential, Falls Church, VA
Java Developer
Environment: Java, J2EE, WebLogic 9.0, Spring 2.0, EJB 3.0, Hibernate 3.0, Eclipse, AJAX/DOJO, JSP 2.2, XML, XSLT, XSD, HTML, Maven, Log4j, JIRA, CVS, Oracle 10g and JUnit.
Responsibilities:
- Involved in the Requirements gathering, design and development phases of the SDLC.
- Used Spring MVC and AJAX/DOJO for client-side dynamic functionality.
- Extensively used Hibernate 3.0 in data access layer to access and update the database.
- Responsible for designing XSDs, XML Schema for data transmission.
- Used JAX-WS (with SOAP, WSDL, and UDDI) and JAX-RS services to interact with external systems (see the JAX-RS sketch after this list).
- Wrote SQL Queries and PL/SQL Stored procedures.
- Used Log4j for logging.
- Developed JUnit test cases for unit testing and used JIRA for tracking bugs.
- Involved in deployment, support and migration of production system.
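A minimal JAX-RS resource sketch of the kind referenced above; the path, resource name, and XML payload are placeholders, not the actual external-system contract.

```java
// Minimal JAX-RS resource; path and payload are illustrative placeholders.
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/accounts")
public class AccountResource {

    // GET /accounts/{id} returns a simple XML representation.
    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_XML)
    public String getAccount(@PathParam("id") String id) {
        return "<account><id>" + id + "</id></account>";
    }
}
```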
Confidential, Schaumburg, IL
Java Developer
Environment: Java, J2EE, JSP 2.0, Struts, Oracle, HTML, XML, DOM, SAX, Hibernate 2.0, Spring.
Responsibilities:
- Involved in various phases of Software Development Life Cycle (SDLC) from requirements gathering, planning, design & development for the project.
- Interacted frequently with clients to gather requirements and with business analysts to finalize the business requirements.
- Responsible for designing the pipeline for the part lifecycle (pick, fulfill, ship, receive).
- Designed and developed the user interface of application modules using HTML, JSP, CSS, JavaScript (client-side validations), jQuery, and AJAX.
- Developed the application using the Struts framework, which leverages the MVC (Model View Controller) architecture (see the Action sketch after this list).
- Implemented the persistence layer using Hibernate, with POJOs representing the persisted database tables.
- Implemented the Spring framework to support Hibernate and Struts.
- Extensively used Eclipse as an IDE for building, developing and integrating the application.
- Provided SQL scripts and PL/SQL stored procedures for querying the database.
- Developed JUnit test cases for all the modules.
- Provided on call support for production environment and end-to-end post implementation support and maintenance of the application.
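A short Struts 1 Action sketch matching the MVC arrangement described above; the class name, request parameter, and forward name are illustrative assumptions.

```java
// Hypothetical Struts 1 Action; parameter and forward names are placeholders.
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

public class PartLookupAction extends Action {

    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request,
                                 HttpServletResponse response) throws Exception {
        // Read the submitted part number from the request (illustrative parameter).
        String partNumber = request.getParameter("partNumber");
        request.setAttribute("partNumber", partNumber);
        // "success" must match a <forward> entry in struts-config.xml.
        return mapping.findForward("success");
    }
}
```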
Confidential
Java Developer
Environment: Core Java, Servlets, Struts, JSP, XML, XSLT, JavaScript.
Responsibilities:
- Performed Requirements gathering and analysis and prepared Requirements Specifications document.
- Provided high-level system design, specifying class, sequence, and activity diagrams.
- Developed a web-based system with HTML5, XHTML, JSTL, custom tags and Tiles using Struts framework.
- Involved in implementation of presentation layer logic using HTML, CSS, JavaScript and XHTML.
- Used Asynchronous JavaScript and XML (AJAX) for a more interactive front end.
- Responsible for development of the core backend business logic using Java.
- Developed Servlets, Action classes, and ActionForm classes, and configured the struts-config.xml file.
- Performed client-side validations using the Struts Validator framework.
- Provided on call support based on the priority of the issues.