Hadoop Developer Resume
Alpharetta, GA
SUMMARY
- Experience extending Hive and Pig core functionality by writing custom UDFs.
- 7+ years of experience across the software development life cycle: design, development and support of systems and application architecture.
- 3+ years of Big Data / Hadoop ecosystem experience in ingestion, storage, querying, processing and analysis of big data.
- 3 years of experience with the Cloudera distribution.
- In-depth knowledge and hands-on experience with Hadoop (HDFS, MapReduce, Pig, Hive, Sqoop, Flume, HBase, Oozie).
- Hands-on experience installing, configuring, monitoring and integrating Hadoop ecosystem components such as MapReduce, HDFS, HBase, Hive, Oozie, Sqoop, Pig and Flume.
- Good experience with the Cloudera, Hortonworks and Apache Hadoop distributions.
- Extensively worked on both MRv1 and MRv2 (YARN) Hadoop architectures.
- Hands-on experience writing MapReduce programs and Pig and Hive scripts.
- Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning and bucketing.
- Experience building bi-directional data pipelines between HDFS and relational databases with Sqoop.
- Expertise in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java (a minimal MapReduce sketch follows this summary).
- Extensively used Kafka to load log data from multiple sources directly into HDFS.
- Experience designing both time-driven and data-driven automated workflows using Oozie.
- Good experience using Teradata and SQL (Structured Query Language).
- Good knowledge of Teradata utilities such as FastExport, FastLoad and BTEQ.
- Experience setting up Nagios and Ganglia monitoring tools for Hadoop clusters.
- Good knowledge of authentication management tools such as Kerberos.
- Excellent understanding and knowledge of NoSQL databases such as MongoDB, HBase and Cassandra.
- Developed a free-text search solution with Hadoop and Solr to analyze emails.
- Familiar with the Java Virtual Machine (JVM) and multi-threaded processing.
- Expertise in database design; creation and management of schemas; writing stored procedures, functions, DDL, DML and SQL queries; and data modeling.
- Proficient in applying RDBMS concepts with Oracle, SQL Server and MySQL.
- Developed UML diagrams for object-oriented design (use cases, sequence diagrams and class diagrams) using Rational Rose, Visual Paradigm and Visio.
- Good working experience with Spring modules such as the Core Container, Application Context, Spring MVC and Spring ORM modules in web applications.
- Used Hibernate and JPA as the persistence layer for object-relational mapping; configured XML mapping files and integrated them with frameworks such as Spring and Struts.
- Working knowledge of databases such as Oracle 8i/9i/10g.
- Strong experience in database design, writing complex SQL Queries and Stored Procedures.
- Extensive experience building and deploying applications on web/application servers such as WebLogic, WebSphere and Tomcat.
- Worked with structured, semi-structured and unstructured data formats.
- Strong work ethic with a desire to succeed and make significant contributions to the organization.
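Illustrative only — a minimal Java MapReduce program of the kind referenced in the summary above (word count over text files). It uses the standard Hadoop 2.x MapReduce API; the class, job and path names are placeholders rather than project code.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in an input line
  public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts emitted for each word
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);   // reducer doubles as a combiner
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input dir in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output dir in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```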
TECHNICAL SKILLS
Hadoop/Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Oozie, Zookeeper, Apache Kafka and Kerberos
Programming Languages: Java (JDK 1.4/1.5/1.6/1.8), C/C++, HTML, SQL, PL/SQL, AVS & JVS
Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x, Struts 1.x/2.x
Web Services: WSDL, SOAP, Apache CXF/XFire, Apache Axis, REST, Jersey
Client Technologies: jQuery, JavaScript, AJAX, CSS, HTML5, XHTML
Operating Systems: UNIX, Windows, Linux
Application Servers: IBM WebSphere, Apache Tomcat, WebLogic
Web Technologies: JSP, Servlets, JNDI, JDBC, JavaBeans, JavaScript, Web Services (JAX-WS)
Databases: Oracle 8i/9i/10g & MySQL 4.x/5.x
Java IDEs: Eclipse 3.x, IBM WebSphere Application Developer, IBM RAD 7.0
Tools: TOAD, SQL Developer, SOAP UI, Visio, Rational Rose, Endur 8.x/10.x/11.x
PROFESSIONAL EXPERIENCE
Confidential, Alpharetta, GA
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data pipelines using Hadoop.
- Used Apache Kafka to track data ingestion into the Hadoop cluster (see the producer sketch at the end of this role).
- Wrote Pig scripts to de-duplicate hourly Kafka data and perform daily roll-ups.
- Migrated data from existing Teradata systems to HDFS and built datasets on top of it.
- Built a shell-script framework to automate Hive registration, handling dynamic table creation and automatically adding new partitions to tables.
- Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning and bucketing.
- Responsible for managing the entire Hive warehouse.
- Set up and benchmarked Hadoop/HBase clusters for internal use; developed simple to complex MapReduce programs.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Imported/exported analyzed and aggregated data to Teradata and MySQL databases using Sqoop.
- Developed Oozie workflows that chain Hive/MapReduce modules for ingesting periodic/hourly input data.
- Wrote Pig & Hive scripts to analyse customer data and detect user patterns.
- Continuous monitoring and managing the Hadoop cluster by using Cloudera Manager.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Developed ETL pipelines to source data to Business intelligence teams to build visualizations.
- Involved in unit testing, interface testing, system testing and user acceptance testing of the workflow
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Cloudera Manager, Sqoop, Flume, Oozie, Java (JDK 1.6), Eclipse.
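Illustrative only — a minimal sketch of how ingestion-tracking events could be published to Kafka, as mentioned in the Kafka bullet above. The broker address, topic name and payload are hypothetical, and the example uses the newer Kafka Java producer client rather than the client available at the time of this project.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class IngestionAuditProducer {

  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
    props.put("acks", "all");
    props.put("key.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");

    // Topic and payload are placeholders: one audit event per file landed in HDFS
    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      String fileName = args.length > 0 ? args[0] : "web_logs_2015-06-01.gz";
      producer.send(new ProducerRecord<>("hdfs-ingestion-audit", fileName, "LANDED_IN_HDFS"));
    }
  }
}
```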
Confidential, Atlanta, GA
Hadoop developer
Responsibilities:
- Handled importing of data from various data sources, performed transformations using Pig and loaded data into HDFS.
- Loaded data from the UNIX file system into HDFS.
- Loaded and transformed large sets of structured, semi-structured and unstructured data from HBase through Sqoop and placed them in HDFS for further processing.
- Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
- Built and maintained scalable data pipelines using the Hadoop ecosystem and other open-source components such as Hive and Cassandra (in place of HBase).
- Managed and scheduled jobs on the Hadoop cluster using Oozie.
- Created Hive tables and ran queries in HiveQL, which Hive compiles and executes as MapReduce jobs automatically (see the JDBC sketch at the end of this role).
- Involved in creating Hive tables, loading data and running Hive queries against that data.
- Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties and the Thrift server in Hive.
- Wrote optimized Pig scripts and was involved in developing and testing Pig Latin scripts.
- Working knowledge of writing Pig Load and Store functions.
- Assisted application teams in installing Hadoop updates, operating system patches and version upgrades when required.
Environment: Amazon EC2, Apache Hadoop 1.0.1, MapReduce, HDFS, CentOS 6.4, HBase, Hive, Pig, Oozie, Flume, Java (JDK 1.6), Eclipse
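Illustrative only — a minimal sketch of creating a partitioned Hive external table and running a HiveQL query from Java over the HiveServer2 JDBC driver, in the spirit of the Hive bullets above. The connection URL, credentials, table and columns are placeholders; Hive turns the query into MapReduce jobs behind the scenes.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryRunner {

  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://hiveserver2-host:10000/default", "hadoop", "");
         Statement stmt = conn.createStatement()) {

      // DDL: partitioned external table over files already landed in HDFS
      stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS click_events ("
          + " user_id STRING, url STRING, ts BIGINT)"
          + " PARTITIONED BY (event_date STRING)"
          + " STORED AS TEXTFILE"
          + " LOCATION '/data/raw/click_events'");

      // HiveQL aggregation; Hive compiles this into MapReduce jobs automatically
      try (ResultSet rs = stmt.executeQuery(
          "SELECT event_date, COUNT(*) FROM click_events GROUP BY event_date")) {
        while (rs.next()) {
          System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
        }
      }
    }
  }
}
```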
Confidential, Houston, TX
Hadoop Developer
Responsibilities:
- Migrated the required data from MySQL into HDFS using Sqoop and imported flat files of various formats into HDFS.
- Worked mainly on Hive queries to categorize data from different claims.
- Integrated the Hive warehouse with HBase.
- Involved in loading data from the Linux file system into HDFS.
- Wrote custom Hive UDFs in Java where the required functionality was too complex for built-in functions (see the UDF sketch at the end of this role).
- Implemented partitioning, dynamic partitions and buckets in Hive.
- Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning and bucketing.
- Generated final reporting data in Tableau for testing by connecting to the corresponding Hive tables through the Hive ODBC connector.
- Responsible for managing test data coming from different sources.
- Reviewed peers' Hive table creation, data loading and queries.
- Weekly meetings with technical collaborators and active participation in code review sessions with senior and junior developers.
- Monitored system health and logs and responded to any warning or failure conditions.
- Gained experience in managing and reviewing Hadoop log files.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Involved in unit testing, interface testing, system testing and user acceptance testing of the workflow tool.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Environment: Apache Hadoop, HDFS, Hive, MapReduce, Core Java, Pig, Sqoop, Cloudera CDH4, Oracle, MySQL, Tableau.
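Illustrative only — a minimal sketch of a custom Hive UDF in Java of the kind mentioned above. The class name, jar name and normalization rule are hypothetical stand-ins, not actual project code.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical registration in Hive:
//   ADD JAR claims-udfs.jar;
//   CREATE TEMPORARY FUNCTION normalize_claim_id AS 'NormalizeClaimId';
public class NormalizeClaimId extends UDF {

  // Strips non-alphanumeric characters and upper-cases a claim identifier
  public Text evaluate(Text input) {
    if (input == null) {
      return null;
    }
    String cleaned = input.toString().replaceAll("[^A-Za-z0-9]", "").toUpperCase();
    return new Text(cleaned);
  }
}
```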
Confidential, Buffalo, NY
Java & J2EE developer
Responsibilities:
- Involved in translating business domain concepts into use cases, sequence diagrams, class diagrams, component diagrams and implementation diagrams.
- Implemented various J2EE Design Patterns such as Model-View-Controller, Data Access Object, Business Delegate and Transfer Object.
- Responsible for analysis and design of the application based on MVC Architecture, using open source Struts Framework.
- Involved in configuring Struts, Tiles and developing the configuration files.
- Developed Struts Action classes and Validation classes using Struts controller component and Struts validation framework.
- Developed and deployed UI layer logic using JSP, XML, JavaScript and HTML/DHTML.
- Used Spring Framework and integrated it with Struts.
- Involved in Configuring web.xml and struts-config.xml according to the Struts framework.
- Provided connections using JDBC to the database and developed SQL queries to manipulate the data.
- Developed DAOs using the Spring JdbcTemplate to run performance-intensive queries (see the DAO sketch at the end of this role).
- Developed Ant scripts for auto-generation and deployment of the web service; used Log4j for application logging.
Environment: Java, J2EE, Struts MVC, Tiles, JDBC, JSP, JavaScript, HTML, Spring IOC, Spring AOP, JAX-WS, Ant, WebSphere Application Server, Oracle, JUnit, Log4j, Eclipse.
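Illustrative only — a minimal sketch of a DAO built on Spring's JdbcTemplate, as mentioned in the DAO bullet above. The table, column and class names are placeholders.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;

public class AccountDao {

  private final JdbcTemplate jdbcTemplate;

  public AccountDao(DataSource dataSource) {
    this.jdbcTemplate = new JdbcTemplate(dataSource);
  }

  // Single-value lookup
  public int countActiveAccounts() {
    return jdbcTemplate.queryForObject(
        "SELECT COUNT(*) FROM accounts WHERE status = 'ACTIVE'", Integer.class);
  }

  // Parameterized query mapped row-by-row with a RowMapper
  public List<String> findAccountNamesByBranch(String branchCode) {
    return jdbcTemplate.query(
        "SELECT name FROM accounts WHERE branch_code = ?",
        new Object[] { branchCode },
        new RowMapper<String>() {
          public String mapRow(ResultSet rs, int rowNum) throws SQLException {
            return rs.getString("name");
          }
        });
  }
}
```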
Confidential
Java /J2EE Developer
Responsibilities:
- Responsible for gathering and analyzing requirements and converting them into technical specifications.
- Used Rational Rose for creating sequence and class diagrams. Developed presentation layer using JSP, Java, HTML and JavaScript.
- Designed and developed a convention-based coding approach utilizing Hibernate's persistence framework and O-R mapping capability to enable dynamic fetching and display of table data with JSF tag libraries.
- Designed and developed the Hibernate configuration and a session-per-request design pattern for database connectivity and for accessing the session during database transactions; used HQL and SQL for fetching and storing data.
- Participated in the design and development of database schema and Entity-Relationship diagrams of the backend Oracle database tables for the application.
- Used Spring core annotations for dependency injection (see the sketch at the end of this role); implemented web services with Apache Axis.
- Designed and developed stored procedures and triggers in Oracle to cater to the needs of the entire application; developed complex SQL queries for extracting data from the database.
- Designed and built SOAP web service interfaces implemented in Java.
- Used Apache Ant for the build process.
Environment: Java (JDK 1.5), Servlets, Hibernate, Ajax, Oracle 10g, Eclipse, Apache Ant, Web Services (SOAP), Apache Axis, WebLogic Server, JavaScript, HTML, CSS, XML.
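Illustrative only — a minimal sketch of Spring core-annotation dependency injection (constructor injection with @Autowired and an annotation-driven context), as mentioned above. All class names are placeholders.

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;

@Repository
class CustomerRepository {
  public String findName(long id) {
    return "customer-" + id; // stand-in for a real Hibernate/JDBC lookup
  }
}

@Service
class CustomerService {

  private final CustomerRepository repository;

  @Autowired // constructor injection via Spring core annotations
  public CustomerService(CustomerRepository repository) {
    this.repository = repository;
  }

  public String greet(long id) {
    return "Hello, " + repository.findName(id);
  }
}

public class DiDemo {
  public static void main(String[] args) {
    // Register the annotated classes directly with an annotation-driven context
    AnnotationConfigApplicationContext ctx =
        new AnnotationConfigApplicationContext(CustomerRepository.class, CustomerService.class);
    System.out.println(ctx.getBean(CustomerService.class).greet(42L));
    ctx.close();
  }
}
```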