Hadoop Developer/Administrator Resume
CA
SUMMARY:
- Hadoop Developer/Administrator with 8.5 years of IT experience, including 3.5 years as a Hadoop Developer/Administrator. Proficient in Hadoop cluster installation and configuration, data migration and ingestion to NoSQL databases, and data processing and representation, with experience deploying Hadoop ecosystem components such as MapReduce, YARN, Sqoop, Flume, Pig, Hive, HBase, Cassandra, ZooKeeper, Oozie, Spark, Storm, Impala, AWS, and Kafka. Expertise in object-oriented design using Java/J2EE technologies in client/server, web, and enterprise architectures.
- Extensive experience in the Software Development Life Cycle (SDLC), including requirements and system analysis, design, programming, testing, implementation, and application maintenance.
- Experience in Big Data analytics and the Hadoop ecosystem, including MapReduce programming, Spark, Hive, Impala, Pig, Sqoop, HBase, Oozie, and Kafka.
- Experience in writing Hive queries and Pig scripts for processing and analyzing large volumes of data.
- Knowledge of the MapReduce programming model for analyzing data stored in HDFS, with experience writing MapReduce code in Java per business requirements.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
- Experience using Storm for real-time data processing on Hadoop.
- Expert in creating Pig and Hive UDFs in Java to analyze data sets.
- Used HBase alongside Pig/Hive where real-time, low-latency queries were required.
- Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, Hive, Sqoop, Pig, HDFS, HBase, ZooKeeper, Oozie, and Flume.
- Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH4 and CDH5 clusters.
- Good knowledge of NoSQL databases such as HBase; also used Spark for real-time streaming of data into the cluster.
- Extensively worked on extraction, transformation, and loading (ETL) of data from various sources such as Oracle, SQL Server, and flat files.
- Hands-on experience with SequenceFile, Avro, and HAR file formats and compression.
- Worked with ZooKeeper, Oozie, AppWorx, and data pipeline operational services for cluster coordination and workflow scheduling.
- Used Sqoop to perform incremental (lastmodified and append) imports from Oracle to HDFS.
- Proficient in object-oriented programming and in integrating and testing software implementations: collecting business specifications and user requirements, confirming designs, developing, and documenting the entire software development life cycle and QA.
- Strong programming skills in design and implementation using Core Java, Collections, JDBC, Multithreading, J2EE, Servlets, EJB, JSP, Struts, JNDI and JMS.
- Expertise in open-source frameworks such as Spring MVC, Struts, and JSF; web development technologies such as HTML, CSS, JavaScript, jQuery, and AJAX; and object-relational mapping (ORM) technology such as Hibernate.
- Developed distributed applications using SOAP, RESTful web services, HTTP, and JMS.
- Extensively used IDEs such as Eclipse 3.2, MyEclipse, NetBeans, and RAD for development activities.
- Experience working with Tomcat 5.0, JBoss, and WebLogic servers, and with Jenkins for continuous integration.
- Good hands-on experience with UNIX and with writing SQL and PL/SQL scripts and queries for RDBMSs such as Oracle and MySQL.
- Experience in building projects and managing software using Apache Maven.
- Extensive experience with XML, JSP, HTML, and DHTML in web page design.
- Developed and automated UNIX shell scripts to execute Java programs in production.
- Extensive experience in working with Agile and Scrum development methodologies.
- Proficient in developing database objects: packages, stored procedures, triggers, views, tables, synonyms, and sequences.
- Developed efficient SQL queries to improve the performance of the application.
TECHNICAL SKILLS:
Programming Languages: C, Java (JDK 1.7), J2EE, XML, Shell Programming, Python, PL/SQL
Big Data Ecosystem: Hadoop, HDFS, MapReduce, Pig 0.8, Hive 0.13, Sqoop 1.4.4, ZooKeeper 3.4.5, YARN, Spark, Impala, Kafka, Tableau, HBase, Cassandra
UI Design/Scripting: JavaScript, jQuery, AJAX, HTML5, CSS3, AngularJS, Bootstrap, Perl scripting, CQ5
RDBMS: Oracle 10g/9i/8i, MS SQL Server 7.x/2000/2003, DB2, Teradata, Netezza, MySQL
Markup Languages: HTML, XML, XSL, XSLT, DHTML.
Middleware: Web services, JavaBeans, EJB, Servlets, JSP, RMI, ESB
Web services: SOAP, RESTful, XML-RPC, WSDL, UDDI, AXIS 1.3/1.4
Application Servers: WebSphere Application Server (WAS), WebSphere Process Server (WPS), WebSphere Portal Server
Query Tools: SQL Developer, TOAD, ETL tools (Pentaho, Embarcadero), Lotus Notes, PuTTY, Microsoft Visio, TSOD
IDE: Eclipse 3.2, WSAD 5.1
Version Control & Issue Tracking: SVN, CVS, IBM ClearCase, GitHub, JIRA
PROFESSIONAL EXPERIENCE:
Confidential, CA
Hadoop Developer/Administrator
Responsibilities:
- Responsible for building, developing, and testing shared components used across many modules.
- Imported customer-specific personal data into Hadoop using Sqoop from relational databases such as Netezza and Oracle.
- Created tasks for incremental loads into staging tables and scheduled them to run.
- Used Apache Kafka as the messaging system to load log data and data from UI applications into HDFS.
- Wrote MapReduce code that takes customer-related flat files as input and parses them to extract meaningful (domain-specific) information for further processing.
- Created partitioned Hive external tables to store the data processed by MapReduce.
- Implemented various analytical algorithms as MapReduce programs run over data in HDFS.
- Implemented optimized Hive joins to gather data from different sources and ran ad-hoc queries on top of them.
- Wrote Hive generic UDFs to perform business-logic operations at the record level (see the sketch after this list).
- Implemented data classification algorithms using MapReduce design patterns.
- Involved in upgrading MapR from version 3.1.0 to 4.1.0.
- Extensively worked with combiners, partitioners, and the distributed cache to improve the performance of MapReduce jobs.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig, and used ZooKeeper to coordinate the clusters.
- Involved in loading data from the Linux file system to HDFS.
- Worked with various file formats (text, Avro, Parquet) and compression codecs (Snappy, bzip2, gzip).
- Installed and configured multi-node Hadoop clusters on AWS EC2 and the Cloudera platform.
- Used Pig for data transformations, event joins, filtering, and pre-aggregations before storing the data in HDFS.
- Implemented test scripts to support test driven development and continuous integration.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Exported the aggregated data to an RDBMS using Sqoop for building Tableau dashboards, and developed trend analyses using statistical features.
- Configured Flume sources, channels, and sinks and ingested data from Twitter into HDFS.
- Proactively monitored systems and services; worked on architecture design and implementation of Hadoop deployments, configuration management, and backup and DR systems.
- Involved in analyzing system failures, identifying root causes, and recommending remediation actions; documented issues and solutions for future reference.
- Worked with the systems engineering team to plan new Hadoop environment deployments and expand existing Hadoop clusters.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload and job performance and performed capacity planning using Cloudera Manager.
- Worked with application teams to install OS-level updates, patches, and version upgrades required for Hadoop cluster environments.
- Created a local YUM repository for installing and updating packages.
- Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
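By way of illustration, a minimal sketch of the kind of record-level Hive generic UDF described above, assuming a hypothetical requirement to mask all but the last four characters of a sensitive string value; the class name MaskValueUDF and the masking rule are illustrative, not taken from the original project:

    import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
    import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
    import org.apache.hadoop.hive.ql.metadata.HiveException;
    import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
    import org.apache.hadoop.io.Text;

    /** Hypothetical record-level UDF: masks all but the last four characters of a string. */
    public class MaskValueUDF extends GenericUDF {

        private StringObjectInspector inputOI;

        @Override
        public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
            if (arguments.length != 1) {
                throw new UDFArgumentLengthException("mask_value() takes exactly one argument");
            }
            if (!(arguments[0] instanceof StringObjectInspector)) {
                throw new UDFArgumentException("mask_value() expects a string argument");
            }
            inputOI = (StringObjectInspector) arguments[0];
            return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
        }

        @Override
        public Object evaluate(DeferredObject[] arguments) throws HiveException {
            Object raw = arguments[0].get();
            if (raw == null) {
                return null; // pass NULLs through unchanged
            }
            String value = inputOI.getPrimitiveJavaObject(raw);
            int keep = Math.min(4, value.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < value.length() - keep; i++) {
                masked.append('*');
            }
            masked.append(value.substring(value.length() - keep));
            return new Text(masked.toString());
        }

        @Override
        public String getDisplayString(String[] children) {
            return "mask_value(" + children[0] + ")";
        }
    }

Once packaged into a JAR, such a UDF would typically be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION mask_value AS 'MaskValueUDF'; and then called like any built-in function.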
Environment: Cloudera, Hadoop, MapReduce, HDFS, Hive, Pig, Oozie, Sqoop, Apache Kafka, Flume, Linux, Oracle, Netezza.
Confidential, Minneapolis, MN
Hadoop Developer/Administrator
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Involved in data extraction from distributed RDBMSs such as Teradata and Oracle.
- Involved in loading data from UNIX file system to HDFS.
- Wrote MapReduce jobs to discover trends in data usage by users (see the sketch after this list).
- Used JUnit for unit testing of MapReduce jobs.
- Worked with BI tools like Tableau for report creation and further analysis from the front end.
- Used Impala to read, write, and query Hadoop data in HDFS and HBase.
- Troubleshot the cluster by managing and reviewing Hadoop log files.
- Troubleshot single points of failure (SPOFs) in Hadoop daemons and applied recovery procedures.
- Installed and configured Pig for ETL jobs.
- Used Storm real-time streaming to check each message against a set of regular expressions.
- Used Oozie to manage the Hadoop jobs.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Used CDH3 and CDH4 distributions for development and deployment.
- Imported data from Teradata using Sqoop with the Teradata connector.
- Implemented partitioning, dynamic partitioning, and bucketing in Hive.
- Exported result sets from Hive to MySQL using shell scripts.
- Used Zookeeper for various types of centralized configurations.
- Involved in maintaining various UNIX shell scripts.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Automated all jobs, from pulling data from sources such as MySQL to pushing result sets into HDFS using Sqoop.
- Used SVN for version control.
- Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase, and Flume).
- Monitored system health and logs and responded to any warning or failure conditions.
- Proactively monitored systems and services; worked on architecture design and implementation of Hadoop deployments, configuration management, and backup and DR systems.
- Involved in analyzing system failures, identifying root causes, and recommending remediation actions; documented issues and solutions for future reference.
- Worked with the systems engineering team to plan new Hadoop environment deployments and expand existing Hadoop clusters.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload and job performance and performed capacity planning using Cloudera Manager.
- Worked with application teams to install OS-level updates, patches, and version upgrades required for Hadoop cluster environments.
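As a sketch of the usage-trend MapReduce jobs mentioned above, the job below counts records per user, assuming the Hadoop 2.x mapreduce API and a hypothetical tab-delimited log whose first field is a user ID; all class names and the log layout are illustrative:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    /** Hypothetical trend job: counts records per user from tab-delimited usage logs. */
    public class UsageTrendJob {

        public static class UsageMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text user = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Assumed log layout: userId<TAB>action<TAB>timestamp...
                String[] fields = value.toString().split("\t");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    user.set(fields[0]);
                    context.write(user, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "usage-trend");
            job.setJarByClass(UsageTrendJob.class);
            job.setMapperClass(UsageMapper.class);
            job.setCombinerClass(SumReducer.class); // combiner cuts shuffle volume
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }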
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, NoSQL, Java 1.6, UNIX shell scripting, Teradata, Oracle.
Confidential, Dallas
Sr. Java/J2EE Developer
Responsibilities:
- Involved in analysis and design phases of Software Development Life Cycle (SDLC).
- Created detailed design documents, use cases, and class diagrams using UML and Rational Rose.
- Implemented core J2EE patterns (MVC, DI/IoC, DAO, Interceptor, Business Delegate, Service Locator, and Singleton) for enterprise applications.
- Used the Spring Framework to implement the MVC design pattern in the application.
- Front tier: primarily focused on Spring components such as the DispatcherServlet, controllers, model and view objects, and view resolvers.
- Configured Struts Tiles in Spring (applicationContext.xml) for a common look and feel.
- Involved in writing the application context XML file (applicationContext.xml), which contains bean declarations and their dependencies.
- Implemented the Spring Validator interface for enterprise-level validations (see the sketch after this list) and developed code for obtaining bean references in the Spring IoC framework, implementing Dependency Injection (DI/IoC).
- Middle tier: primarily focused on business logic using EJB components such as JavaBeans, Business Delegates, MDBs, JMS, DAOs, and Hibernate; used stateless session beans (EJB) to implement the business logic.
- Implemented the proxy design pattern using a BusinessDelegateProxy and Remote Method Invocation (RMI).
- Achieved asynchronous communication using JMS message listeners and configured the JMS environment by setting up Queue and Topic connection factories.
- Applied annotations for running XDoclet and transforming POJOs into EJBs.
- Used web services (WSDL and SOAP) with Apache Axis to communicate between systems.
- Implemented object-relational mapping (ORM) between Java classes and database tables.
- Used entity beans and Java annotations to maintain the database schema.
- Used container-managed persistence and bean-managed persistence to query entities stored in a relational database.
- Involved in writing complex SQL queries using JDBC and stored procedures for the application in Oracle 10g.
- Used the MyEclipse IDE as the development environment; designed, developed, and deployed EJB components (EAR) on WebSphere.
- Built the application using Apache Ant.
- Developed test cases using JUnit and tested the application.
- Used Log4j as the logging framework.
- Installed and set up classpaths, and performed maintenance and troubleshooting during software deployment in the development and QA phases.
- Involved in unit and integration testing, bug fixing, design reviews, code walkthroughs, client interaction, and team support.
- Used CVS and SVN as version control systems for labeling versions and checking in changes.
- Used Rational ClearCase for software configuration management and version control.
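A minimal sketch of the Spring Validator usage described above, assuming a hypothetical Customer form-backing object; the class names, field names, and error codes are illustrative, not from the original application:

    import org.springframework.validation.Errors;
    import org.springframework.validation.ValidationUtils;
    import org.springframework.validation.Validator;

    /** Hypothetical form-backing object (package-private so both classes fit one source file). */
    class Customer {
        private String name;
        private int age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    /** Enterprise-level validation via Spring's Validator interface. */
    public class CustomerValidator implements Validator {

        public boolean supports(Class clazz) {
            return Customer.class.isAssignableFrom(clazz);
        }

        public void validate(Object target, Errors errors) {
            // Error codes resolve against a message resource bundle at render time.
            ValidationUtils.rejectIfEmptyOrWhitespace(errors, "name", "customer.name.required");
            Customer customer = (Customer) target;
            if (customer.getAge() < 18) {
                errors.rejectValue("age", "customer.age.underage");
            }
        }
    }

Such a validator would typically be declared as a bean in applicationContext.xml and invoked by the controller before the binding result is rendered back to the form view.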
Environment: J2EE 1.5, Rational Rose, UML 2.0, JSP, Struts 1.2, Spring 2.0, EJB 3.0, MDB, JNDI, JMS, entity beans, SOAP, WSDL, HTML, JavaScript 1.7, XML, Oracle 10g, JIRA, JUnit, Ant 1.7, Log4j, ClearCase, Eclipse, WebSphere 6.1, Red Hat Linux.
Confidential, Santa Fe
Java/J2EE Developer
Responsibilities:
- Worked in the EDBC team, taking responsibility for implementing business policies and fixing bugs in Java by debugging.
- Actively participated in daily team discussions to plan and complete the tasks for every release.
- Developed web services using the JAX-RS and JAX-WS models on Mule ESB.
- Responsible for implementing business policies using the Corticon Business Rule Management System (BRMS) and coding them in Java.
- Developed web services that add, update, and retrieve information for any individual in ASPEN.
- Configured and deployed the entire application in IBM WebSphere Application Server in different environments.
- Responsible for coding Java batch jobs and RESTful web services.
- Modified and developed DAOs for database access.
- Developed and implemented the business-logic component in the middle tier using the EJB framework, including stateless session bean and entity bean classes.
- Extensively used the Eclipse IDE for developing project code and debugging issues raised in work requests.
- Extensively used JSPs and JavaScript to make the necessary changes to application functionality when handling new CRs.
- Extensively used SQL queries in SQL Developer to retrieve details from the database and resolve functional issues in the application.
- Extensively used Fast4j, an MVC framework developed by Deloitte, throughout the development process.
- Used IBM ClearCase as the code repository and version control system for checking code in and out of the development environment.
- Extensively used IBM ClearQuest for defect tracking, bug fixing, and assigning work requests.
- Responsible for code fixes, unit testing, and integration testing throughout the development process.
- Developed stored procedures and triggers to create batch jobs and trigger events according to business requirements.
- Consistently maintained the coding standards throughout the development process.
- Interacted with clients and QA to discuss specific functionality and guided them in testing the application with test scripts before moving fixes to production.
Environment: Java/J2EE, EJB, Fast4j MVC Framework, Corticon (BRMS), Oracle 11g, SQL Developer, Eclipse IDE, JSP, JavaScript, IBM ClearCase, IBM ClearQuest, SOAP web services.
Confidential
Java Developer
Responsibilities:
- Involved in designing and developing the application architecture using the MVC Model 2 design pattern with JSP and Servlets.
- Designed and developed interactive static HTML screens as screen level prototype.
- Developed JavaScript for client-side validation and developed Cascading Style Sheets (CSS).
- Involved in design and development of JSP based presentation layer for web based account inquiry using Struts custom tags, DHTML, HTML, and JavaScript.
- Used Servlets 2.3 for processing business rules.
- Developed the server-side application that handles database manipulation against the back-end Oracle database using JDBC (see the sketch after this list).
- Deployed the application components into Apache Tomcat web server.
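A minimal sketch of the server-side JDBC access described above, assuming a hypothetical ACCOUNTS table and illustrative connection settings; production code of this era would obtain a pooled connection from a DataSource rather than DriverManager:

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    /** Hypothetical account-inquiry servlet backed by an Oracle database via JDBC. */
    public class AccountInquiryServlet extends HttpServlet {

        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            String accountId = request.getParameter("accountId");
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();

            Connection conn = null;
            PreparedStatement stmt = null;
            ResultSet rs = null;
            try {
                // Connection details are illustrative only.
                Class.forName("oracle.jdbc.driver.OracleDriver");
                conn = DriverManager.getConnection("jdbc:oracle:thin:@dbhost:1521:ORCL", "user", "password");
                stmt = conn.prepareStatement("SELECT balance FROM accounts WHERE account_id = ?");
                stmt.setString(1, accountId);
                rs = stmt.executeQuery();
                if (rs.next()) {
                    out.println("<p>Balance: " + rs.getBigDecimal("balance") + "</p>");
                } else {
                    out.println("<p>Account not found.</p>");
                }
            } catch (ClassNotFoundException e) {
                throw new ServletException("JDBC driver not found", e);
            } catch (SQLException e) {
                throw new ServletException("Database error", e);
            } finally {
                // JDK 1.4-style cleanup (no try-with-resources available).
                try { if (rs != null) rs.close(); } catch (SQLException ignored) { }
                try { if (stmt != null) stmt.close(); } catch (SQLException ignored) { }
                try { if (conn != null) conn.close(); } catch (SQLException ignored) { }
            }
        }
    }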
Environment: JDK 1.4, Servlets 2.3, JSP 1.2, JDBC, JavaScript, CSS, HTML, DHTML, Ant, Log4j, JUnit, Apache Tomcat web server, Oracle 8i.
Confidential
Java Developer
Responsibilities:
- Used WebSphere, a high-performance, fully integrated Java platform for enterprise applications.
- Actively involved in component development, deployment for the application interface.
- Strictly followed coding standards and implemented the MVC design pattern.
- Involved in creating EJBs that handle business logic and persistence of data.
- Involved in impact analysis of Change requests and Bug fixes.
- Performed unit and integration testing of the modules.
- Integrated the modules with the other modules of the system.
- Used the Java Naming and Directory Interface (JNDI) to support transparent access to distributed components.
Environment: Sybase, WebSphere Studio Application Developer (WSAD), Enterprise JavaBeans (EJB), Struts, WebSphere Application Server, HTML, Java.