Hadoop/Big Data Developer Resume
Atlanta, GA
SUMMARY:
- Around 8 years of experience across Hadoop, Java, and ETL, including extensive experience with Big Data technologies and the development of standalone and web applications in multi-tiered environments using Java, Hadoop, Hive, HBase, Impala, Pig, Sqoop, J2EE technologies (Spring, Hibernate), Oracle, HTML, and JavaScript.
- Extensive experience in Big Data analytics, with hands-on experience writing MapReduce jobs across the Hadoop ecosystem, including Hive and Pig.
- Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
- Experience using Cloudera Manager for installation and management of single-node and multi-node Hadoop clusters (CDH3, CDH4, and CDH5).
- Excellent knowledge of Hadoop architecture, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Built systems consisting of several applications, highly distributed, scalable, and large in nature, using Cloudera Hadoop.
- Experience with distributed systems, large-scale non-relational data stores, MapReduce systems, data modeling, and big data systems
- Experience with Apache, Cloudera, and Hortonworks Hadoop distributions
- Involved in developing solutions to analyze large data sets efficiently
- Excellent hands-on experience importing and exporting data between relational database systems such as MySQL and Oracle and HDFS/Hive (and vice versa) using Sqoop
- Hands-on experience writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie
- Experience analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java
- Experience with web-based UI development using jQuery, Ext JS, CSS, HTML, HTML5, XHTML, and JavaScript
- Good knowledge of Amazon AWS services such as EMR and EC2, which provide fast, efficient processing of Big Data
- Efficient in packaging and deploying J2EE applications using Ant, Maven, and CruiseControl on WebLogic, WebSphere, and JBoss; worked with performance and load-testing tools such as JProfiler and JMeter.
- Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper
- Experience with databases like DB2, Oracle 9i/10g, MySQL, SQL Server, and MS Access
- Experience creating complex SQL queries, SQL tuning, and writing PL/SQL blocks such as stored procedures, functions, cursors, indexes, triggers, and packages
- Very good understanding of NoSQL databases like MongoDB and HBase
- Good knowledge of ETL and hands-on experience with Informatica ETL
- Extensive experience in creating Class Diagrams, Activity Diagrams, Sequence Diagrams using Unified Modeling Language (UML)
- Developed applications using Java/J2EE technologies such as Servlets, JSP, EJB, JDBC, JNDI, and JMS
- Experienced in the SDLC, Agile (Scrum) methodology, and iterative Waterfall
- Experience developing test cases and performing unit and integration testing; QA experience with test methodologies and manual/automated testing using tools like WinRunner and JUnit
- Experience with various version control systems: ClearCase, CVS, SVN
- Expertise in extending Hive and Pig core functionality by writing custom UDFs (see the Hive UDF sketch after this list)
- Development experience with all aspects of software engineering and the development life cycle
- Strong desire to work in a fast-paced, flexible environment
- Proactive problem-solving mentality that thrives in an agile work environment
- Good experience with the SDLC (Software Development Life Cycle)
- Exceptional ability to learn new technologies and deliver outputs on short deadlines
- Worked with developers, DBAs, and systems support personnel in elevating and automating successful code to production
- Authorized to work in the United States for any employer
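A minimal sketch of the kind of custom Hive UDF referenced in the bullet above, using the classic org.apache.hadoop.hive.ql.exec.UDF API from the CDH3-CDH5 era; the MaskAccountId name and its masking behavior are hypothetical illustrations, not code from an actual engagement:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: masks all but the last four characters of an account id.
// Registered in Hive with, e.g.:
//   ADD JAR mask-udf.jar;
//   CREATE TEMPORARY FUNCTION mask_account AS 'MaskAccountId';
public final class MaskAccountId extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null;
        }
        String s = input.toString();
        if (s.length() <= 4) {
            return new Text(s);
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < s.length() - 4; i++) {
            masked.append('*');
        }
        masked.append(s.substring(s.length() - 4));
        return new Text(masked.toString());
    }
}
```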
AREAS OF EXPERTISE:
- Apache
- Eclipse
- J2EE
- Java
- REST
- MapReduce
- HDFS
- Hive
- Pig
- Hue
- Oozie
- Core Java
- Perl/Shell scripts
- HBase
- Flume
- Spark
- Kafka
- Cloudera Manager
- Cassandra
- REST API
- Python
- Greenplum DB
- IDMS
- VSAM
- SQL*PLUS
- Toad
- Putty
- Windows NT
- UNIX Shell Scripting
- Pentaho
- Talend
- Big Data
- YARN
TECHNICAL SKILLS:
Languages: C, C++, Java/J2EE, Python, SQL, HiveQL, Pig Latin
Hadoop Ecosystem: HDFS, MapReduce, MRUnit, YARN, Hive, Pig, HBase, Impala, ZooKeeper, Sqoop, Oozie, Apache Cassandra, Scala, Flume, Spark, Apache Ignite, Avro, AWS
Framework: Core Spring, Spring DAO, Spring MVC, Hibernate
Web/Application Servers: Jetty, Apache Tomcat
Web Technologies: JavaScript, jQuery, AJAX, JSTL, CSS, XML, DOM, SAX, DTD, XSD, SOAP, REST, JAXB, XSL, XSLT
Databases: Oracle, MySQL, MS SQL Server 2005, Derby, MS Access
OS: MS-Windows 95/98/NT/2000/XP/7, Linux, Unix, Solaris 5.1
Version Control Tools: SVN, CVS, Git
Tools: Eclipse, Maven, Ant, JUnit, TestNG, Jenkins, SoapUI, PuTTY, Log4j
PROFESSIONAL EXPERIENCE:
Hadoop/Big Data Developer
Confidential, Atlanta, GA
Responsibilities:
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in writing MapReduce jobs.
- Used Sqoop and the HDFS put/copyFromLocal commands to ingest data.
- Used Pig for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data in HDFS.
- Developed Pig UDFs for needed functionality that is not available out of the box in Apache Pig (see the Pig UDF sketch after this list).
- Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
- Developed Hive DDLs to create, alter, and drop Hive tables.
- Managed work including indexing data, tuning relevance, developing custom tokenizers and filters, and adding functionality such as playlists, custom sorting, and regionalization with the Solr search engine.
- Developed and maintained operational best practices for the smooth operation of large Hadoop clusters.
- Loaded data from the UNIX file system into HDFS; installed and configured Hive and wrote Hive UDFs; handled cluster coordination services through ZooKeeper.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Knowledge of performance troubleshooting and tuning of Hadoop clusters.
- Developed Hive UDFs for needed functionality that is not available out of the box in Apache Hive.
- Used Java MapReduce to compute metrics that define user experience, revenue, etc. (see the MapReduce sketch after this list).
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Extracted and updated data in MongoDB using the mongoimport and mongoexport command-line utilities.
- Used Sqoop for importing and exporting data into HDFS.
- Used Eclipse and Ant to build the application; proficient work experience with NoSQL MongoDB databases. Also pivoted HDFS data from rows to columns and columns to rows.
- Developed shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and to move data files within and outside of HDFS.
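A minimal sketch of a Pig UDF like those referenced above, built on the standard org.apache.pig.EvalFunc API; the NormalizeUrl name and the weblog use case are hypothetical illustrations:

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical UDF: lower-cases a URL and strips its query string so
// weblog events can be joined on a canonical page key. Used from Pig
// Latin as, e.g.:
//   REGISTER url-udf.jar;
//   pages = FOREACH logs GENERATE NormalizeUrl(url) AS page;
public class NormalizeUrl extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        String url = input.get(0).toString().trim().toLowerCase();
        int query = url.indexOf('?');
        return query >= 0 ? url.substring(0, query) : url;
    }
}
```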
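And a minimal sketch of a metric-style Java MapReduce job as referenced above; the events-per-user metric and the tab-separated weblog layout are assumptions for illustration, using the standard org.apache.hadoop.mapreduce API:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical job: counts events per user from tab-separated weblog lines,
// assuming the first column holds the user id.
public class EventsPerUser {

    public static class EventMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text userId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                userId.set(fields[0]); // assumed: first column is the user id
                context.write(userId, ONE);
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values,
                Context context) throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "events-per-user");
        job.setJarByClass(EventsPerUser.class);
        job.setMapperClass(EventMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```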
Environment: MapReduce, HDFS, Hive, Pig, Hue, Oozie, Core Java, Perl/shell scripts, Eclipse, HBase, Flume, Spark, Kafka, Cloudera Manager, Cassandra, REST API, Python, Greenplum DB, IDMS, VSAM, SQL*Plus, Toad, PuTTY, Windows NT, UNIX shell scripting, Pentaho, Talend, Big Data, YARN.
Sr. Hadoop/Big Data Developer
Confidential, San Jose, CA
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology
- Developed data pipeline using Flume, Sqoop, Pig and MapReduce to ingest customer behavioral data and purchase histories into HDFS for analysis
- Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs
- Used Pig as an ETL tool for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data in HDFS
- Wrote Hive queries to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Developed Hadoop Streaming MapReduce jobs using Python.
- Installation, configuration, and administration experience with Big Data platforms (Cloudera CDH, Hortonworks Ambari, Apache Hadoop) on Red Hat and CentOS as data storage, retrieval, and processing systems
- Real-time stream processing with Apache Kafka and Apache Storm
- Wrote Apache Spark Streaming applications on Big Data distributions in an active cluster environment (see the Spark Streaming sketch after this list).
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs
- Worked in AWS environment for development and deployment of Custom Hadoop Applications.
- Worked on installing and configuring EC2 instances on Amazon Web Services (AWS) for establishing clusters on cloud
- As a Hadoop developer, responsibilities included software installation and configuration, software upgrades, backup and recovery, commissioning and decommissioning DataNodes, cluster setup, daily cluster performance monitoring, and keeping clusters healthy on different Hadoop distributions (Hortonworks and Cloudera)
- Experienced in managing and reviewing the Hadoop log files
- Worked with the Apache Crunch library to write, test, and run Hadoop MapReduce pipeline jobs
- Performed joins and data aggregation using Apache Crunch
- Used Pig as an ETL tool for transformations, event joins, and some pre-aggregations before storing the data in HDFS; worked with the Oozie workflow engine for job scheduling
- Installed and configured Storm, Spark, Kafka, Solr, Flume, Sqoop, Pig, Hive, and HBase on Hadoop clusters; loaded the aggregated data into DB2 for reporting on the dashboard
- The project also involved building analytical reports using SQL Server and Excel
- Monitored and debugged Hadoop jobs/applications running in production
- Provided user support and application support on the Hadoop infrastructure
- Reviewed ETL application use cases before onboarding them to Hadoop
- Evaluated and compared different tools for test data management with Hadoop
- Helped and directed the testing team to get up to speed on Hadoop application testing
- Installed a 20-node UAT Hadoop cluster
- Created ETL jobs to generate and distribute reports from a MySQL database using Pentaho Data Integration
- Responsible for analyzing multi-platform applications using Python
- Created ETL jobs using Pentaho Data Integration to handle the maintenance and processing of data.
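A minimal sketch of a Spark Streaming job of the kind referenced above; the socket source, 10-second batch interval, and ERROR filter are illustrative assumptions (the pipelines described here read from Kafka and other production sources):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

// Hypothetical streaming job: counts ERROR lines in 10-second micro-batches.
public class LogErrorCounter {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("LogErrorCounter");
        JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.seconds(10));

        // Illustrative source; a production job would read from Kafka instead.
        JavaDStream<String> lines = jssc.socketTextStream("localhost", 9999);
        JavaDStream<String> errors =
                lines.filter(line -> line.contains("ERROR"));
        errors.count().print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```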
Environment: JDK 1.6, Red Hat Linux, HDFS, CDH 5.3, Maven, Impala, Storm, Python, Mahout, Kafka, AWS, MapReduce, Apache Crunch, Hive, Pig, Sqoop, SQL Server, Flume, Spark, Lambda, ZooKeeper, Oozie, DB2, HBase, and Pentaho.
Java/J2EE Developer
Confidential, Austin, TX
Responsibilities:
- Team lead and Java application developer for an application called ORION, serving Discover Card members, and for Orion Admin and Work Load Manager (WLM) for admin users.
- Web application developer for ORION Admin, an application used by business partners at Discover Financial Services, a leading premium credit card company.
- Java application developer for Interactive Collections Environment (ICE), an application used by collection agents at Discover Financial Services.
- Java developer and support analyst for the Incubator project, part of the HSBC credit card domain application.
- The main functionality of this application is to load credit card offers on the site, check whether a credit card is valid, and add the offer to that card; if the offer is not available on the specific site, it redirects to a domain matching the client's requirements.
Software Engineer
Confidential, Sunnyvale, CA
Responsibilities:
- Involved in all the phases of the life cycle of the project from requirements gathering to quality assurance testing.
- Developed class diagrams and sequence diagrams using Visual Paradigm.
- Developed modules of the application in core Java and EJB.
- Produced web services using WSDL/SOAP/REST standards.
- Built and deployed the application using Maven.
- Extensively used Log4j for logging throughout the application.
- Implemented J2EE design patterns such as the Singleton and Factory patterns (see the sketch after this list)
- Developed the UI pages using HTML, CSS, JavaScript, jQuery, JSP, and tag libraries.
- Designed Java Servlets and Objects using J2EE standards.
- Coded an internal application in HTML, JSP, and Servlets using AngularJS.
- Implemented the business logic using Java with Spring transactions and Spring AOP.
- Created new connections through application code for better access to the DB2 database, and was involved in writing SQL and PL/SQL: stored procedures, functions, sequences, triggers, cursors, object types, etc.
- Implemented application using Struts MVC framework for maintainability.
- Involved in testing and deploying in the development server.
- Involved in designing the database tables in Oracle.
- Worked on mainframes and have hands-on experience on Endeavor tool.
- Worked with Confidential to familiarize associates with the domain knowledge and technicalities involved in the IT industry.
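A minimal sketch of the Singleton and Factory patterns referenced above; all names (AppConfig, AccountDao, DaoFactory) are hypothetical illustrations, not code from the project:

```java
// Singleton: one shared configuration holder for the application.
public final class AppConfig {
    private static final AppConfig INSTANCE = new AppConfig();

    private AppConfig() {
        // load properties, connection settings, etc.
    }

    public static AppConfig getInstance() {
        return INSTANCE;
    }
}

// Factory: a central place to construct DAO implementations, so callers
// depend only on the interface.
interface AccountDao {
    void save(String accountId);
}

class OracleAccountDao implements AccountDao {
    public void save(String accountId) {
        // JDBC insert would go here
    }
}

final class DaoFactory {
    static AccountDao createAccountDao() {
        return new OracleAccountDao(); // choice could be driven by config
    }
}
```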
Environment: Java 1.5, MVC, Web Services, SOAP, XSLT, XML, XSD, JSTL, JSP, Adobe Flex 4.5, EJB 2.0, Eclipse 3.6, Subversion, WebLogic 10.3 Application Server, HTML5, JBoss 5.0 Runtime Server, Apache Tomcat 7.0.12, Log4j, Jasper Reports, JUnit, SQL Server, JProfiler, XPath, and Maven.
Software Developer
Confidential, Orlando, FL
Responsibilities:
- Wrote Ant scripts to deploy the application to the Tomcat application server for the dev environment.
- Monitored UNIX Switches utilizing Exceed, IBM Tivoli Netcool, Dotcom Monitor, WebStats and Nagios. Monitored network status via HP OpenView.
- Developed integrated systems by implementing Dozer mapping with Java, Spring, and JAXB; diagnosed and solved application performance and stability issues.
- Facilitated Daily Scrum, Sprint Planning, Sprint Review, & Sprint Retrospectives
- Developed the application using J2EE Design Patterns like Delegate, Singleton, and DAO
- Consumed web services from different applications within the network.
- Developed custom tags to simplify the JSP 2.0 code. Designed UI screens using JSP 2.0, CSS, XML 1.1, and HTML. Used JavaScript for client-side validation.
- Used the Spring 2.5 Framework for dependency injection and integrated it with the Hibernate and Struts frameworks. Developed and implemented UI controls and APIs with Ext JS.
- Created shell and Perl scripts required for project maintenance and software migration.
- Built a framework for Agile Project and Program management office and aligned processes and tools. Implemented scalable server code and conducted unit testing.
- Configured Hibernate's second-level cache using EhCache to reduce the number of hits to the configuration table data.
- Designed and developed a utility class that consumed messages from the Java message queue and generated emails to be sent to customers; used the JavaMail API for sending emails (see the sketch after this list).
- Used the JUnit framework for unit testing of the application, and Log4j 1.2 to capture logs including runtime exceptions.
- Used CVS for version control and IBM RAD 6.0 as the IDE for implementing the application.
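A minimal sketch of the email utility class described above, assuming the standard JavaMail (javax.mail) API; the class name, SMTP host, and addresses are hypothetical placeholders:

```java
import java.util.Properties;

import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.Session;
import javax.mail.Transport;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;

// Hypothetical utility: builds and sends a plain-text customer email.
public final class EmailNotifier {

    public static void send(String to, String subject, String body)
            throws MessagingException {
        Properties props = new Properties();
        props.put("mail.smtp.host", "smtp.example.com"); // hypothetical host

        Session session = Session.getInstance(props);
        MimeMessage message = new MimeMessage(session);
        message.setFrom(new InternetAddress("noreply@example.com")); // illustrative
        message.setRecipient(Message.RecipientType.TO, new InternetAddress(to));
        message.setSubject(subject);
        message.setText(body);
        Transport.send(message);
    }
}
```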
