Hadoop/Big Data Consultant Resume
El Segundo, CA
PROFESSIONAL SUMMARY:
- Software professional with over 6 years of IT experience as a Big Data/Oracle PL/SQL Technical Consultant, including 3+ years in Big Data/Hadoop and 2 years in Oracle PL/SQL.
- In-depth understanding of Hadoop architecture and its components, such as HDFS and YARN.
- Expertise in writing Hadoop jobs for analyzing data using MapReduce, Hive, Pig, and Spark.
- Experienced in administrative tasks such as installing and configuring clusters, commissioning and decommissioning nodes, troubleshooting, and backup and recovery of Hadoop and its ecosystem components such as Hive, Pig, Sqoop, HBase, and Spark.
- Worked on real-time, in-memory processing engines such as Spark and Impala, and on their integration with BI tools such as Tableau and OBIEE.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- 2+ years of experience in NoSQL databases like HBase.
- Extensive experience in developing MySQL, DB2, and Oracle database triggers, stored procedures, and packages to quality standards using SQL and PL/SQL.
- Experienced in extending Hive and Pig core functionality by writing custom UDFs using Java.
- Comprehensive knowledge in debugging, optimizing, and performance tuning of DB2, Oracle, and MySQL databases.
- Assisted the testing team with test planning, test conditions, and obtaining test data related to breakpoints.
- Expertise in Unit Testing, Functional Testing, System Testing, Integration Testing, Performance Testing, and Web Application Testing.
- Experience with Agile methodology; implemented and documented project roadmaps and strategy.
- Expertise in working with BI tools such as OBIEE and QlikView to present data graphically.
- Very good experience in knowledge transfer (KT) to support teams, providing pre- and post-deployment support.
- Designed use case diagrams, class diagrams, sequence diagrams, flow charts, and deployment diagrams using MS Visio and Rational Rose (UML).
- Committed to excellence; a self-motivated, far-sighted developer and team player with strong problem-solving skills and a zeal to learn new technologies.
- Strengths include excellent communication, interpersonal, and analytical skills and the ability to work effectively in a fast-paced, high-volume, deadline-driven environment.
TECHNICAL SKILLS:
Languages: C, Core Java, SQL, PL/SQL
Functional Programming: Scala (learning stage)
Technologies: Hadoop, HDFS, MapReduce, Hive, Pig, Oozie, Hue, Sqoop, Spark, Cloudera Manager and Navigator
Database: Oracle 10g, 11g, MySQL
NoSQL Database: Apache HBase
Development Tools: Eclipse, PuTTY
Database Tools: Oracle SQL Developer, TOAD and PL/SQL Developer
Reporting Tools: OBIEE 10g/11g, BI Publisher
Methodologies & Standards: Software Development Lifecycle (SDLC), RUP, Waterfall Model and Agile
Operating Systems: Linux, Unix and Windows 8/7/XP
Others: MS Office, Apache OpenOffice, WinSCP, MS Visio
PROFESSIONAL EXPERIENCE:
Confidential, El Segundo, CA
Hadoop/Big Data Consultant
Responsibilities:
- Helped business processes by developing, installing and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH4 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Worked with Apache Hadoop, Spark and Scala.
- Supported code/design analysis, strategy development and project planning.
- Created reports for the BI team, using Sqoop to move data into HDFS and Hive.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
- Assisted with data capacity planning and node forecasting.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Acted as administrator for Pig, Hive, and HBase, installing updates, patches, and upgrades.
- Handled structured and unstructured data and applied ETL processes.
- Developed workflows in Oozie to automate loading data into HDFS and pre-processing it with Pig.
- Coded complex Oracle stored procedures, functions, packages, and cursors for client-specific applications.
- Provided production rollout support, monitoring the solution post go-live and resolving issues discovered by the client and client-services teams.
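As a sketch of the data-cleaning MapReduce work described above: a minimal map-only Java job that drops malformed delimited records and normalizes the surviving fields. The record layout, field count, and class names are illustrative assumptions, not the project's actual code.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Map-only job: drops malformed CSV records and normalizes the rest. */
public class CleanRecordsJob {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        private static final int EXPECTED_FIELDS = 5; // assumed record width
        private final Text out = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", -1);
            if (fields.length != EXPECTED_FIELDS) {
                // Count and skip records with the wrong number of fields.
                context.getCounter("clean", "malformed").increment(1);
                return;
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) sb.append(',');
                sb.append(fields[i].trim().toLowerCase());
            }
            out.set(sb.toString());
            context.write(NullWritable.get(), out);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: no shuffle or reduce needed
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

It would run with the standard `hadoop jar` launcher, taking an input and an output HDFS path as arguments.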
Environment: Java (JDK 1.6), Eclipse, Linux, CDH4.x, Sqoop, Pig, Hive, Oozie, Unix Shell Scripting, Hue, WinSCP, MySQL.
Confidential, Minnesota, MN
Hadoop Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Facilitated knowledge transfer sessions.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in managing and reviewing Hadoop log files.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs in Java (see the sketch after this list).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Gained very good business knowledge of health insurance, claim processing, fraud suspect identification, the appeals process, etc.
- Developed a custom file system plug-in for Hadoop to give it access to files on the data platform; the plug-in allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Apache Whirr for a POC.
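As a sketch of the Hive UDF work mentioned above: a minimal Java UDF in the classic org.apache.hadoop.hive.ql.exec.UDF style of that era. The claim-ID scenario and class name are illustrative assumptions, not the project's actual code.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/** Hive UDF that strips non-digit characters from a claim ID. */
public final class NormalizeClaimId extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // propagate SQL NULL through the UDF
        }
        // Keep digits only, e.g. "CLM-2013/0042" -> "20130042"
        return new Text(input.toString().replaceAll("[^0-9]", ""));
    }
}
```

After ADD JAR, a UDF like this is registered with CREATE TEMPORARY FUNCTION and then called in HiveQL like any built-in function.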
Environment: Eclipse, Subversion, Hadoop, Hive, Pig, HBase, Linux, MapReduce, HDFS, Java (JDK 1.6), Cloudera, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6.
Confidential, Indianapolis, IN
Hadoop/Java Developer
Responsibilities:
- Worked on capacity planning and management of HDFS clusters to meet project needs.
- Involved in the installation and configuration of a growing 10-node Hadoop cluster, with one master node and the rest as slaves.
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MS SQL Server 2008 and Oracle 11g into HDFS using Sqoop.
- Ingested data using Sqoop and HDFS put/copyFromLocal.
- Used Pig for transformations, event joins, filtering bot traffic, and pre-aggregations before storing the data on HDFS.
- Developed Pig UDFs for functionality not available out of the box in Apache Pig (see the sketch after this list).
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Involved in processing ingested raw data using MapReduce, Apache Pig and Hive.
- Exported processed data from Hadoop to relational databases or external file systems using Sqoop and HDFS get/copyToLocal.
- Wrote Hive queries to analyze the structured data in the HDFS output folder.
- Worked closely with the ETL lead, solution architect, data modeler, and business analysts to understand business requirements, providing expert knowledge and solutions on data warehousing and ensuring delivery of business needs in a timely, cost-effective manner.
- Responsible for data analysis, requirements gathering, source-to-target mapping, process flow diagrams, and documentation.
- Stayed in constant touch with the business line to verify requirements and create mapping documents.
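As a sketch of the Pig UDF work referenced above (for example, filtering bot traffic): a minimal Java EvalFunc. The user-agent heuristic and class name are illustrative assumptions, not the project's actual code.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

/** Pig EvalFunc that flags likely bot traffic by user-agent substring. */
public class IsBotTraffic extends EvalFunc<Boolean> {
    @Override
    public Boolean exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // no user-agent field to inspect
        }
        // Assumes the first field passed in is the user-agent string.
        String userAgent = input.get(0).toString().toLowerCase();
        return Boolean.valueOf(userAgent.contains("bot")
                || userAgent.contains("crawler")
                || userAgent.contains("spider"));
    }
}
```

In a Pig script, after REGISTER-ing the jar, it could back a filter such as `clean = FILTER logs BY NOT IsBotTraffic(user_agent);`.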
Environment: Java 1.7, Hadoop 1.x, Hive, HDFS, Pig, Sqoop, Linux, SQL 2005/2008, Oracle 11g, MS Visio.
Confidential, North Hollywood, CA
Oracle/Java Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) of the application, from requirement analysis to testing.
- Developed the modules based on the Struts MVC architecture.
- Developed the UI using JavaScript, JSP, HTML, and CSS for interactive, cross-browser functionality and a complex user interface.
- Created business logic using servlets and session beans and deployed them on WebLogic Server.
- Used the Struts MVC framework for application design.
- Created complex SQL queries and PL/SQL stored procedures and functions for the back end.
- Prepared the Functional, Design and Test case specifications.
- Involved in writing stored procedures in Oracle to perform database-side validations (see the JDBC sketch after this list).
- Performed unit testing, system testing, and integration testing.
- Developed Unit Test Cases. Used JUNIT for unit testing of the application.
- Provided technical support for production environments: resolved issues, analyzed defects, and provided and implemented solutions. Resolved high-priority defects per the schedule.
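As a sketch of how the Oracle validation procedures above would be invoked from the Java side: a minimal JDBC CallableStatement example. The procedure name validate_order and its signature are hypothetical, not the project's actual code.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

/** Invokes a hypothetical PL/SQL validation procedure over JDBC. */
public class OrderValidator {
    /** Returns the status string produced by validate_order. */
    public static String validateOrder(String url, String user, String pass,
                                       long orderId) throws Exception {
        Connection conn = DriverManager.getConnection(url, user, pass);
        try {
            // Assumed signature: validate_order(p_order_id IN NUMBER,
            //                                   p_status OUT VARCHAR2)
            CallableStatement cs = conn.prepareCall("{call validate_order(?, ?)}");
            try {
                cs.setLong(1, orderId);
                cs.registerOutParameter(2, Types.VARCHAR);
                cs.execute();
                return cs.getString(2);
            } finally {
                cs.close();
            }
        } finally {
            conn.close();
        }
    }
}
```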
Environment: Java, HTML, JavaScript, CSS, Oracle, JDBC, Swing and Eclipse.
Confidential, Mumbai, MH
Oracle Developer
Responsibilities:
- Involved in all phases of software development, including gathering requirements, creating specs, developing various database objects, and developing and validating code.
- Involved in the development of user interface objects and testing of the entire module.
- Involved in the creation of database objects like tables, views, stored procedures, functions, packages, DB triggers, Indexes and Collections.
- Exported the reports into the required formats.
- Ran conversion, extension, and installation scripts.
- Extensively used PL/SQL Developer for creating database objects, running the command scripts for inserting the configuration data items.
- Wrote many database triggers to automatically update tables and views.
- Responsible for back-end stored procedure development using PL/SQL predefined procedures.
- Used EXPLAIN PLAN and hints to tune SQL.
- Implemented proactive database administration to avoid problems such as running out of free space, insufficient space for temporary segments, fragmentation of data segments, etc.
- Used UNIX Shell scripts to deploy the Oracle forms and reports to production servers.
- Involved in loading the data from flat files to Oracle tables using SQL*Loader and C.
- Involved in creating user documentation and providing End user training.
- Responsible for technical documentation and reports related to the application.
Environment: Oracle 9i, Sun Solaris 2.5, Server Manager, SQL*Plus, SQL*Loader, Windows NT and Developer 2000