Hadoop Consultant Resume
Richardson, TX
SUMMARY:
- Over 7 years of professional IT experience, including 3+ years with Hadoop and Big Data ecosystem technologies.
- Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, NodeManager, and the MapReduce programming paradigm.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, ZooKeeper, Flume, and Kafka.
- Experience with Apache Hadoop technologies: HDFS, the MapReduce framework, YARN, Pig, Hive, HCatalog, Sqoop, Flume, and Kafka.
- Extended Hive and Pig core functionality by writing custom UDFs.
- Led multiple data analysis and integration efforts involving Hadoop and ETL.
- Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS. Good knowledge of Hadoop cluster administration, including monitoring and managing clusters with Cloudera Manager.
- In-depth understanding of data structures and algorithms.
- Experience in managing and reviewing Hadoop log files.
- Experience in NoSQL database HBase.
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs written in Java.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and core Java design patterns.
- Knowledge of job workflow scheduling and cluster coordination tools such as Oozie and ZooKeeper.
- Strong experience across the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Experience in Java, JSP, Servlets, WebLogic, WebSphere, JDBC, XML, and HTML
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
TECHNICAL SKILLS:
Programming/Scripting Languages: Java, C, J2EE, Unix shell, Python
Web/XML Technologies: HTML, CSS, JavaScript, AJAX, Servlets, JSP, XML, XSLT, JAXB 2.0
Hadoop-Big Data: Apache Hadoop, MapReduce, Pig, HDFS, Hive, Sqoop, Oozie, ZooKeeper, Flume, Kafka
NoSQL Database: HBase, DynamoDB
RDBMS: Oracle 9i, MS SQL Server
Development / Build Tools: Eclipse, Ant, Maven
Operating Systems: Windows, Linux, Unix
WORK EXPERIENCE:
Hadoop Consultant
Confidential, Richardson, TX
Responsibilities:
- Designed and developed Hadoop system to analyze the SIEM (Security Information and Event Management) data using MapReduce, HBase, Hive, Sqoop and Flume.
- Migrated data from SQL Server to HBase using Sqoop.
- Developed MapReduce programs in Java with custom Writable types to load web server logs into HBase via Flume (see the loader sketch after this list).
- Processed and analyzed log data stored in HBase, then imported it into the Hive warehouse so that business analysts could write HiveQL queries.
- Built reusable Hive UDF libraries that business analysts could use in their Hive queries (a minimal UDF sketch follows this list).
- Developed various workflows using custom MapReduce, Pig, and Hive jobs and scheduled them with Oozie.
- Generated reports with Pentaho for consumption by business analysts.
- Extensive experience troubleshooting code-related issues.
- Configured various big data workflows to run on top of Hadoop, composed of heterogeneous jobs such as Pig, Hive, Sqoop, and MapReduce.
- Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MRUnit testing library (see the test sketch after this list).
- Integrated Kafka with Flume in a sandbox environment using the Kafka source and Kafka sink.
- Configured a Flume agent with a syslog source to receive data from syslog servers.
- Auto-populated HBase tables with data arriving from the Kafka sink.
- Designed and coded application components in an agile environment using a test-driven development approach.
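
A minimal sketch of the kind of HBase log loader described above: a map-only MapReduce job that parses web server log lines and writes Puts to an HBase table. The class, table, and column family names (WebLogLoader, weblogs, cf) and the log line layout are illustrative assumptions; the sketch uses built-in writables and the pre-1.0 HBase Put.add API rather than the original custom Writable types.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class WebLogLoader {

        // One log line in, one HBase Put out; row key = client IP + timestamp.
        public static class LogMapper
                extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(" ");
                if (fields.length < 3) {
                    return; // skip malformed lines
                }
                byte[] rowKey = Bytes.toBytes(fields[0] + "_" + fields[1]);
                Put put = new Put(rowKey);
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("request"), Bytes.toBytes(fields[2]));
                context.write(new ImmutableBytesWritable(rowKey), put);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = Job.getInstance(conf, "weblog-to-hbase");
            job.setJarByClass(WebLogLoader.class);
            job.setMapperClass(LogMapper.class);
            job.setNumReduceTasks(0); // map-only load
            FileInputFormat.addInputPath(job, new Path(args[0]));
            // Wires up TableOutputFormat so the Puts go straight to the "weblogs" table.
            TableMapReduceUtil.initTableReducerJob("weblogs", null, job);
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }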
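
A minimal sketch of a reusable Hive UDF of the sort collected in those libraries; the class name MaskUDF and its masking behavior are illustrative only, built on the classic org.apache.hadoop.hive.ql.exec.UDF API.

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative UDF: masks all but the last four characters of a string,
    // e.g. to hide account numbers or IPs when analysts query SIEM data.
    @Description(name = "mask", value = "_FUNC_(str) - masks all but the last 4 characters")
    public final class MaskUDF extends UDF {

        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String s = input.toString();
            int keep = Math.min(4, s.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < s.length() - keep; i++) {
                masked.append('*');
            }
            masked.append(s.substring(s.length() - keep));
            return new Text(masked.toString());
        }
    }

Analysts would then register the packaged JAR and call the function from Hive, e.g. CREATE TEMPORARY FUNCTION mask AS 'MaskUDF', followed by SELECT mask(account_no) FROM events.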
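
A minimal sketch of an MRUnit-style mapper test in the spirit of that test suite; the toy mapper, its log line format, and the expected output are hypothetical.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    // MRUnit test for a toy mapper that emits (eventType, 1) per SIEM log line.
    public class EventCountMapperTest {

        // Mapper under test: the second whitespace-separated field is the event type.
        public static class EventCountMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws java.io.IOException, InterruptedException {
                String[] fields = value.toString().split("\\s+");
                if (fields.length > 1) {
                    context.write(new Text(fields[1]), ONE);
                }
            }
        }

        private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new EventCountMapper());
        }

        @Test
        public void emitsEventTypeWithCountOne() throws Exception {
            mapDriver
                .withInput(new LongWritable(0), new Text("10.0.0.1 LOGIN_FAILURE user=alice"))
                .withOutput(new Text("LOGIN_FAILURE"), new IntWritable(1))
                .runTest();
        }
    }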
Environment: MapReduce, YARN 2.0, HBase, Hive, Java, SQL, Pig, Sqoop, Oozie, Flume, Pentaho.
Hadoop Admin/ Developer
Confidential, Englewood, CO
Responsibilities:
- Implemented a 100-node CDH4 Hadoop cluster on Red Hat Linux using Cloudera Manager.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms (a configuration sketch follows this list).
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and migrated data from MySQL to HDFS using Sqoop.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Set up Amazon Web Services (AWS) to evaluate whether Hadoop was a feasible solution.
- Set up a Hadoop cluster on Amazon's managed Hadoop framework, EMR (Elastic MapReduce).
- Used Maven extensively to build MapReduce JAR files and deployed them to Amazon Web Services (AWS) EC2 virtual servers in the cloud.
- Used S3 buckets to store the JARs and input datasets, and DynamoDB to store the processed output.
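
A minimal sketch of the compression tuning mentioned above, assuming Snappy is available on the cluster and using the standard MRv2 property names; the job name is a placeholder.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;

    public class CompressedJobConfig {

        public static Job newCompressedJob(Configuration conf) throws Exception {
            // Compress intermediate map output to cut shuffle I/O.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.setClass("mapreduce.map.output.compress.codec",
                    SnappyCodec.class, CompressionCodec.class);

            // Compress the final job output written to HDFS.
            conf.setBoolean("mapreduce.output.fileoutputformat.compress", true);
            conf.setClass("mapreduce.output.fileoutputformat.compress.codec",
                    SnappyCodec.class, CompressionCodec.class);

            return Job.getInstance(conf, "compressed-job");
        }
    }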
Environment: CDH4, Cloudera Manager, MapReduce, HDFS, Hive, Pig, HBase, Flume, MySQL, Sqoop, Oozie, AWS.
Hadoop Administrator
Confidential, San Jose, CA
Responsibilities:
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Installed and configured Hadoop MapReduce and HDFS (Hadoop Distributed File System); developed multiple MapReduce jobs for data cleaning.
- Implemented NameNode metadata backup to NFS for high availability.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
- Responsible for developing a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Created Hive tables, loaded data into them, and wrote Hive UDFs.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
- Involved in migrating ETL processes from Oracle to Hive to simplify data manipulation.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting (a query sketch follows this list).
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
- Wrote shell scripts to automate rolling day-to-day processes.
- Automated workflows with shell scripts that pull data from various databases into Hadoop.
- Supported setting up the QA environment and updating configurations to run Pig and Sqoop scripts.
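
A minimal sketch of querying the partitioned, bucketed Hive data mentioned above from Java over JDBC, assuming HiveServer2 is available; the connection URL, credentials, and the table and column names (page_views, dt, page) are placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveMetricsReport {

        public static void main(String[] args) throws Exception {
            // HiveServer2 JDBC driver; host, port, and credentials are placeholders.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://localhost:10000/default", "hive", "");
            Statement stmt = conn.createStatement();
            try {
                // Filtering on the partition column dt limits the scan to one day's data.
                ResultSet rs = stmt.executeQuery(
                        "SELECT page, COUNT(*) AS views "
                        + "FROM page_views WHERE dt = '2014-01-15' "
                        + "GROUP BY page ORDER BY views DESC LIMIT 10");
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            } finally {
                stmt.close();
                conn.close();
            }
        }
    }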
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL, and Unix/Linux
Java Developer
Confidential, Buffalo, NY
Responsibilities:
- Designed and developed various modules of the application with J2EE design architecture and frameworks such as Spring MVC and the Spring BeanFactory, using IoC and AOP concepts.
- Followed agile software development with Scrum methodology.
- Wrote the application front end with HTML, JSP, JSF, Ajax/jQuery, Spring Web Flow, and XHTML.
- Used jQuery for UI-centric Ajax behavior.
- Implemented Java/J2EE design patterns such as Factory, DAO, Session Facade, and Singleton.
- Used Hibernate in the persistence layer and developed POJOs and Data Access Objects (DAOs) to handle all database operations (a minimal DAO sketch follows this list).
- Implemented features such as logging and user session validation using the Spring AOP module.
- Developed server-side services using Java, Spring, and Web Services (SOAP, WSDL, JAXB, JAX-RPC).
- Worked on Oracle as the backend database.
- Used JMS for messaging.
- Used Log4j for logging to track and audit issues in the application.
- Developed and executed unit test plans using JUnit, ensuring that results were documented and reviewed with the Quality Assurance teams responsible for integrated testing.
- Worked in a deadline-driven environment with immediate feature release cycles.
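
A minimal sketch of the Hibernate-backed DAO pattern mentioned above; the Customer entity and CustomerDao names are illustrative, and a Spring-managed SessionFactory with an active transaction is assumed.

    import java.util.List;

    import org.hibernate.Session;
    import org.hibernate.SessionFactory;

    // Illustrative DAO: the service layer calls these methods instead of touching Hibernate directly.
    public class CustomerDao {

        private final SessionFactory sessionFactory; // injected by Spring in practice

        public CustomerDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        public void save(Customer customer) {
            Session session = sessionFactory.getCurrentSession();
            session.saveOrUpdate(customer);
        }

        public Customer findById(Long id) {
            Session session = sessionFactory.getCurrentSession();
            return (Customer) session.get(Customer.class, id);
        }

        @SuppressWarnings("unchecked")
        public List<Customer> findByLastName(String lastName) {
            Session session = sessionFactory.getCurrentSession();
            return session.createQuery("from Customer c where c.lastName = :name")
                          .setParameter("name", lastName)
                          .list();
        }
    }

    // Minimal POJO entity; the Hibernate mapping (hbm.xml or annotations) is omitted for brevity.
    class Customer {
        private Long id;
        private String lastName;

        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }
        public String getLastName() { return lastName; }
        public void setLastName(String lastName) { this.lastName = lastName; }
    }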
Environment: Java, J2EE, JSP, Servlets, Hibernate, Spring, Web Services, SOAP, WSDL, UML, HTML, XHTML, DHTML, JavaScript, jQuery, CSS, XML, JBoss, Log4j, Oracle, JUnit, Eclipse.
Java Developer
Confidential
Responsibilities:
- Worked in a deadline-driven environment with immediate feature release cycles.
- Involved in Analysis, Design, Coding and Development of custom Interfaces.
- Involved in the feasibility study of the project.
- Gathered requirements from the client for designing the Web Pages.
- Participated in designing the user interface for the application using HTML, DHTML, and Java Server Pages (JSP).
- Involved in writing client-side scripts using JavaScript and server-side scripts using JavaBeans, and used Servlets to handle the business logic.
- Developed the Form Beans and Data Access Layer classes.
- XML was used to transfer the data between different layers.
- Involved in writing complex sub-queries and used Oracle for generating on-screen reports.
- Worked on database interaction layer for insertions, updating and retrieval operations on data.
- Deployed EJB Components on WebLogic.
- Involved in deploying the application in test environment using Tomcat.
- Identified System Requirements and Developed System Specifications, responsible for high-level design and development of use cases.
- Involved in designing database connections using JDBC (a minimal connection sketch follows this list).
- Involved in the design and development of the UI using HTML, JavaScript, and CSS.
- Developed, coded, tested, debugged, and deployed JSPs and Servlets for the input and output forms on the web browsers.
- Created Java Beans accessed from JSPs to transfer data across tiers.
- Modified the database using SQL, PL/SQL, stored procedures, triggers, and views in Oracle 9i.
- Worked through the bug queue, analyzing, fixing, and escalating bugs.
- Involved in significant customer interaction, resulting in stronger customer relationships.
- Responsible for working with other developers across the globe on implementation of common solutions.
- Involved in Unit Testing.
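
A minimal sketch of the JDBC connection handling mentioned above, using the Oracle thin driver and plain try/finally cleanup as on JDK 1.6; the connection URL, credentials, and the orders table are placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class OrderLookup {

        public static void main(String[] args) throws Exception {
            // Oracle thin driver; URL and credentials are placeholders only.
            Class.forName("oracle.jdbc.driver.OracleDriver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@localhost:1521:ORCL", "scott", "tiger");
            try {
                PreparedStatement ps = conn.prepareStatement(
                        "SELECT order_id, status FROM orders WHERE customer_id = ?");
                ps.setLong(1, 42L);
                ResultSet rs = ps.executeQuery();
                while (rs.next()) {
                    System.out.println(rs.getLong("order_id") + " " + rs.getString("status"));
                }
                rs.close();
                ps.close();
            } finally {
                conn.close(); // always release the connection
            }
        }
    }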
Environment: Java, JSP, Servlets, EJB, JavaBeans, JavaScript, JDBC, WebLogic Server, Oracle, HTML, DHTML, XML, CSS, Eclipse, JDK 1.6, CVS, Tomcat Web Server, Windows