Sr. Hadoop Developer Resume
Lynchburg, VA
SUMMARY
- 7+ years of IT experience as a developer, designer, and quality reviewer, with cross-platform integration experience using Hadoop, Java, J2EE, and SOA.
- Strong experience with Hadoop components: Hive, Pig, HBase, Zookeeper, Sqoop and Flume.
- Hands-on experience using Hadoop components such as Hadoop MapReduce (MR1), YARN (MR2), HDFS, Hive, Pig, Avro, Deflate, Flume, and Sqoop.
- Hands-on experience setting up Apache Hadoop and Confidential CDH clusters on Ubuntu, Fedora, and other Linux distributions.
- Experience with HDFS Federation and High Availability.
- Experience working with NoSQL databases including HBase, and with data access using Hive.
- Experienced in performing analytics on structured data using Hive queries and operations.
- Experience handling different file formats such as text files, sequence files, and Avro data files using different SerDes in Hive.
- Worked on Apache Spark.
- Experience with a variety of data formats and protocols such as JSON and Avro.
- Hands-on experience with compression codecs such as Snappy and Gzip.
- Good working experience using Sqoop to import data into HDFS from RDBMS.
- Hands-on experience with Apache, Confidential, and Hortonworks Hadoop environments.
- Strong understanding of Hadoop daemons and MapReduce concepts.
- Experienced in importing and exporting data to and from HDFS.
- Experienced in analyzing big data in Hadoop environments.
- Experience in troubleshooting errors in HBase Shell/API, Pig, Hive and MapReduce.
- Experienced in loading and transforming large sets of semi-structured and unstructured data using Pig Latin operations.
- Extensive experience developing Pig Latin scripts and using Hive Query Language (HQL) for data analytics.
- Experienced in developing UDFs for Hive using Java (a minimal sketch follows this summary).
- Experienced in using Flume to transfer log data files to the Hadoop Distributed File System (HDFS).
- Experience with multiple relational databases such as Oracle 10g and the NoSQL database HBase.
- Strong understanding of NoSQL databases such as HBase, MongoDB, and Cassandra.
- Extensive experience in MVC (Model View Controller) architecture and in the design and development of multi-tier enterprise applications for the J2EE platform/SOA using Java, JDBC, Servlets, EJB, Struts, tag libraries, Hibernate, and XML.
- Strong front-end UI development skills using JSP, HTML, JavaScript, jQuery, and CSS.
- Extensive experience in the design, development, and support of Model View Controller applications using the Struts and Spring frameworks.
- Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases.
- Developed reusable solutions to maintain consistent coding standards across different Java projects.
- Proficient with application servers such as WebSphere, WebLogic, JBoss, and Tomcat.
- Expertise in debugging and performance tuning of Oracle and Java applications, with strong knowledge of Oracle 11g and SQL.
- Ability to work effectively in cross-functional team environments, with experience providing training to business users.
- Effective leadership qualities with strong skills in strategy, business development, client management, and project management.
- Excellent global exposure to various work cultures and client interaction with diverse teams.
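As an illustration of the Hive UDF work noted above, below is a minimal sketch of a Java UDF; the class name and the masking rule are hypothetical examples rather than code from any specific engagement.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example: masks an account number, keeping only the last four characters visible.
public final class MaskAccountNumber extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String value = input.toString();
        int visible = Math.min(4, value.length());
        String maskedPrefix = value.substring(0, value.length() - visible).replaceAll(".", "*");
        return new Text(maskedPrefix + value.substring(value.length() - visible));
    }
}
```

Once packaged into a JAR, such a UDF would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.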
TECHNICAL SKILLS
Languages/Tools: Java, C, C++, SQL/PLSQL, Python, Shell Scripting.
Hadoop: HDFS, MapReduce, Confidential, Hive, Pig, HBase, Sqoop, Oozie, ZooKeeper, Spark, Kafka, Storm, MongoDB
J2EE Standards: JDBC, JNDI, JMS, JavaMail & XML Deployment Descriptors
Web/Distributed Technologies: J2EE, Servlets, JSP, Struts, Hibernate, EJB, XML, MVC, Spring.
Operating System: Windows, UNIX, multiple flavors of Linux.
Databases / NoSQL: Oracle 10g, MS SQL Server 2000, DB2, MS Access, MySQL, Cassandra, MongoDB
App/Web Servers: IBM WebSphere 5.1.2/5.0/4.0/3.5, BEA WebLogic 5.1/7.0, JDeveloper, Apache Tomcat, JBoss.
Testing & Case Tools: JUnit, Log4j, Rational ClearCase, CVS, Ant, JBuilder.
Version Control Systems: Git, SVN, CVS
PROFESSIONAL EXPERIENCE
Confidential, Lynchburg, VA
Sr. Hadoop Developer
Responsibilities:
- Worked on analyzing data and writing Hadoop MapReduce jobs using the Java API, Pig, and Hive.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from edge nodes to HDFS using shell scripting.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slot configuration.
- Created HBase tables to store data arriving in variable formats from different portfolios.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Used Sqoop jobs to import data from RDBMS using incremental imports.
- Customized Avro tools used in MapReduce, Pig, and Hive for deserialization and to work with the Avro ingestion framework.
- Worked with different compression techniques such as LZO and Snappy to save storage and optimize data transfer over the network (a minimal sketch follows this list).
- Analyzed large and critical datasets using Confidential, HDFS, HBase, MapReduce, Hive, Hive UDFs, Pig, Sqoop, Zookeeper, and Spark.
- Customized Flume interceptors to encrypt and mask customer-sensitive data as per requirements.
- Continuously monitored and managed the Hadoop cluster using Confidential Manager.
- Worked with the NoSQL database HBase to create tables and store data.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Used Pig to store the data into HBase.
- Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL.
- Used Pig to parse the data and store it in Avro format.
- Stored the data in tabular formats using Hive tables and Hive SerDes.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
- Implemented a script to transmit sysprint information from Oracle to HBase using Sqoop.
- Implemented best income logic using Pig scripts and UDFs.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Involved in writing shell scripts for exporting log files to the Hadoop cluster through an automated process.
- Implemented MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, and Avro data files, and sequence files for log files.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
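As an illustration of the compression work referenced above, below is a minimal sketch of a MapReduce driver that enables Snappy compression for map output and job output; the class name, placeholder mapper/reducer, and paths are illustrative assumptions, not project code.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CompressedJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Compress intermediate map output with Snappy to reduce shuffle traffic.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.set("mapreduce.map.output.compress.codec", SnappyCodec.class.getName());

        Job job = Job.getInstance(conf, "compressed-aggregation");
        job.setJarByClass(CompressedJobDriver.class);
        job.setMapperClass(Mapper.class);   // identity mapper used as a placeholder
        job.setReducerClass(Reducer.class); // identity reducer used as a placeholder
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Compress the final job output with Snappy as well.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```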
Environment: Hadoop, HDFS, Pig, Sqoop, Spark, MapReduce, Confidential, Avro, Snappy, Zookeeper, NoSQL, HBase, Shell Scripting, Ubuntu, Linux Red Hat.
Confidential, Princeton, NJ
Sr. Hadoop Developer
Responsibilities:
- Defined, designed, and developed Java applications, especially using Hadoop MapReduce, by leveraging frameworks such as Cascading and Hive.
- Developed workflows using Oozie for running MapReduce jobs and Hive queries.
- Worked on loading log data directly into HDFS using Flume.
- Worked on Confidential to analyze data present on top of HDFS.
- Responsible for managing data from multiple sources.
- Loaded data from various data sources into HDFS using Flume.
- Created and maintained technical documentation for launching Confidential Hadoop clusters and for executing Hive queries and Pig scripts.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Successfully loaded files into Hive and HDFS from MongoDB/Solr.
- Familiarity with NoSQL databases such as MongoDB and Solr.
- Extracted data from MySQL through Sqoop, placed it in HDFS, and processed it.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
- Built reusable Hive UDF libraries for business requirements, enabling users to use these UDFs in Hive queries.
- Worked on debugging and performance tuning of Hive and Pig jobs.
- Created HBase tables to store PII data in various formats coming from different portfolios.
- Developed Pig scripts, Pig UDFs, Hive scripts, and Hive UDFs to load data files into Hadoop (a minimal Pig UDF sketch follows this list).
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Involved in loading data from the Linux file system to HDFS.
- Imported and exported data into HDFS and Hive using Sqoop.
- Worked on processing unstructured data using Pig and Hive.
- Supported MapReduce programs running on the cluster.
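As an illustration of the Pig UDF work referenced above, below is a minimal sketch of a Java EvalFunc; the class name and the URL-normalization rule are hypothetical examples.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical example: strips the query string from a URL field and lowercases the result.
public class NormalizeUrl extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        String url = input.get(0).toString();
        int queryStart = url.indexOf('?');
        String withoutQuery = queryStart >= 0 ? url.substring(0, queryStart) : url;
        return withoutQuery.toLowerCase();
    }
}
```

In a Pig script, the compiled JAR would be loaded with REGISTER and the function invoked inside a FOREACH ... GENERATE statement.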
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Maven, Hudson/Jenkins, Ubuntu, Linux Red Hat, Mongo DB.
Confidential, Ashburn, VA
Hadoop Developer
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology.
- Worked on analyzing Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce.
- Developed data pipelines using Flume and Sqoop to ingest customer behavioral data and purchase histories into HDFS for analysis.
- Continuously monitored and managed the Hadoop cluster using Confidential Manager.
- Used Pig to perform data validation on the data ingested using Sqoop and Flume, and pushed the cleansed data set into HBase.
- Participated in development/implementation of Confidential Hadoop environment.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Worked with Zookeeper, Oozie, AppWorx and Data Pipeline Operational Services for coordinating the cluster and scheduling workflows.
- Designed and built the reporting application, which uses Spark SQL to fetch and generate reports on HBase table data (a minimal sketch follows this list).
- Extracted the needed data from the server into HDFS and Bulk Loaded the cleaned data into HBase.
- Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into those tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
- Involved in running MapReduce jobs for processing millions of records.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Hive queries and Pig scripts to analyze large datasets.
- Involved in importing and exporting the data from RDBMS to HDFS and vice versa using Sqoop.
- Involved in generating ad hoc reports using Pig and Hive queries.
- Used Hive to analyze data ingested into HBase via Hive-HBase integration and computed various metrics for reporting on the dashboard.
- Developed job flows in Oozie to automate the workflow for Pig and Hive jobs.
- Loaded the aggregated data into Oracle from the Hadoop environment using Sqoop for reporting on the dashboard.
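As an illustration of the Spark SQL reporting described above, below is a minimal Java sketch that queries a Hive table (customer_events, an illustrative name) which could be mapped onto HBase via the Hive-HBase storage handler; the actual application and Spark version may have differed.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HBaseReportJob {
    public static void main(String[] args) {
        // Hive-enabled session so Spark SQL can read tables registered in the Hive metastore,
        // including external tables backed by HBase.
        SparkSession spark = SparkSession.builder()
                .appName("hbase-report")
                .enableHiveSupport()
                .getOrCreate();

        // customer_events and its columns are illustrative, not the real schema.
        Dataset<Row> report = spark.sql(
                "SELECT region, COUNT(*) AS event_count "
                        + "FROM customer_events GROUP BY region ORDER BY event_count DESC");

        report.show(20, false);
        spark.stop();
    }
}
```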
Environment: Red Hat Linux, HDFS, MapReduce, Hive, Java JDK 1.6, Pig, Sqoop, Flume, Zookeeper, Oozie, Oracle, HBase.
Confidential, Pasadena, CA
Java Developer
Responsibilities:
- As part of lifecycle development, prepared class models, sequence models, and flow diagrams by analyzing use cases using Rational tools.
- Reviewed and analyzed the data model for developing the presentation layer and value objects.
- Involved in developing Database access components using Spring DAO integrated with Hibernate for accessing the data.
- Extensive use of the Struts framework for controller and view components.
- Involved in writing the exception and validation classes using Struts validation rules.
- Involved in writing validation rule classes for general server-side validations, implementing validation rules as part of the Observer J2EE design pattern.
- Used Hibernate for the persistence layer of the project.
- Used Spring AOP and dependency injection across various modules of the project.
- Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
- Used the Spring framework for dependency injection and integrated it with frameworks such as Struts and Hibernate.
- Developed various plain old Java objects (POJOs) as part of persistence classes for O/R mapping.
- Developed web services using SOAP and WSDL with Axis.
- Implemented EJB message-driven beans (MDBs) in the service layer.
- Worked with JMS MQ queues (producers/consumers) to send and receive asynchronous messages via MDBs (a minimal sketch follows this list).
- Developed, implemented, and maintained an asynchronous, AJAX-based rich client for improved customer experience using XML data and XSLT templates.
- Involved in writing parsers for building and parsing XML documents using SAX and DOM.
- Developed SQL stored procedures and prepared statements for updating and accessing data from database.
- Used JBoss for deploying various application components, used Maven as the build tool, and developed build files for compiling the code and creating WAR files.
- Used CVS for version control.
- Performed Unit testing and rigorous integration testing of the whole application.
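As an illustration of the asynchronous messaging work referenced above, below is a minimal sketch of a JMS message-driven bean; the destination name and payload handling are hypothetical.

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

@MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/orderQueue") // hypothetical queue
})
public class OrderMessageBean implements MessageListener {
    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String payload = ((TextMessage) message).getText();
                // Process the asynchronous payload here (parsing and persistence omitted).
            }
        } catch (JMSException e) {
            throw new RuntimeException("Failed to read JMS message", e);
        }
    }
}
```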
Environment: Java, J2EE, EJB, JMS, Struts, JBoss, Hibernate, JSP, JSTL, AJAX, CVS, JavaScript, HTML, XML, Maven, SQL, Oracle, SOA, SAX and DOM Parsers, Web Services (SOAP, WSDL), Spring, Windows.
Confidential
Java Developer
Responsibilities:
- Involved in the design and development phases of the Software Development Life Cycle (SDLC).
- Involved in designing UML Use case diagrams, Class diagrams, and Sequence diagrams using Rational Rose.
- Followed Agile methodology and Scrum meetings to track, optimize, and tailor features to customer needs.
- Developed the user interface using JSP, JSP tag libraries, and JavaScript to simplify the complexities of the application.
- Implemented Model View Controller (MVC) architecture using the Jakarta Struts framework at the presentation tier.
- Developed a Dojo-based front end, including forms and controls, and programmed event handling.
- Implemented SOA with web services using JAX-RS (REST) and JAX-WS (SOAP); a minimal JAX-RS sketch follows this list.
- Developed various Enterprise Java Bean components to fulfill the business functionality.
- Created Action classes that route submissions to the appropriate EJB components and render the retrieved information.
- Validated all forms using Struts validation framework and implemented Tiles framework in the presentation layer.
- Used core Java and object-oriented concepts.
- Extensively used Hibernate in data access layer to access and update information in the database.
- Used Spring Framework for Dependency injection and integrated it with the Struts Framework and Hibernate.
- Used JDBC to connect to backend databases, Oracle and SQL Server 2005.
- Proficient in writing SQL queries and stored procedures for multiple databases, Oracle and SQL Server 2005.
- Wrote stored procedures using PL/SQL and performed query optimization to achieve faster indexing and make the system more scalable.
- Deployed the application on Windows using IBM WebSphere Application Server.
- Used Java Messaging Services (JMS) for reliable and asynchronous exchange of important information such as payment status report.
- Used web services (WSDL and REST) to get credit card information from a third party, and used SAX and DOM XML parsers for data retrieval.
- Used Ant scripts to build the application and deployed it on WebSphere Application Server.
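As an illustration of the JAX-RS work referenced above, below is a minimal sketch of a REST resource; the path, class name, and response payload are hypothetical.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

// Hypothetical resource exposing a payment-status lookup over REST.
@Path("/payments")
public class PaymentStatusResource {

    @GET
    @Path("/{orderId}/status")
    @Produces(MediaType.APPLICATION_JSON)
    public Response getStatus(@PathParam("orderId") String orderId) {
        // The lookup is stubbed; a real implementation would query the payment backend.
        String json = "{\"orderId\":\"" + orderId + "\",\"status\":\"PENDING\"}";
        return Response.ok(json).build();
    }
}
```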
Environment: Core Java, J2EE, Oracle, SQL Server, JSP, Struts, Spring, JDK, Hibernate, JavaScript, HTML, CSS, AJAX, Junit, Log4j, Web Services, Windows.