- Over 8 years of experience in the IT industry, including 3+ years in Big Data implementing complete Hadoop solutions.
- Working experience with Apache Hadoop ecosystem components such as MapReduce, HDFS, Hive, Sqoop, Pig, Oozie, Flume, HBase, and ZooKeeper.
- Strong experience in data analytics using Hive and Pig, including writing custom UDFs.
- Imported and exported data to and from HDFS and Hive using Sqoop.
- Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Knowledge of writing MapReduce programs in Java to meet business requirements.
- Extensive knowledge of SQL queries for backend database analysis.
- Expertise in Core Java and Product Lifecycle Management tools.
- Experience in developing multi-tier Java-based web applications.
- Good experience in developing applications using Java/J2EE technologies, including Servlets, Struts, JSP, and JDBC.
- Well-versed in Agile and other SDLC methodologies; able to coordinate with owners and SMEs.
- Experienced in creating and analyzing Software Requirement Specifications (SRS) and Functional Specification Documents (FSD).
- Strong knowledge of Software Development Life Cycle (SDLC).
- Experienced in preparing and executing Unit Test Plan and Unit Test Cases after software development.
- Worked extensively in the health and automotive insurance domains.
- Experienced in working in multicultural environments, both within a team and independently, as the project requires.
Big Data Ecosystem: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Oozie
Programming Languages: C, C++, Java, SQL, COBOL, REXX, CICS
Database: DB2, IMS DB, VSAM
Operating Systems: Linux, Windows
Methodologies: Agile, Waterfall
Tools: ChangeMan, Endevor, ServiceNow, File-AID, NDM, IBM utilities (SYNCSORT, IEBGENER, IDCAMS, ISPF, FTP, etc.), Eclipse, JUnit, MUnit, QMF
- Expertise in designing and deploying Hadoop clusters and Big Data analytics tools, including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Spark, and Cassandra, on Hortonworks and Cloudera distributions.
- Set up Hadoop, MapReduce, HDFS, and AWS environments, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and preprocessing.
- Analyzed business needs and functional specifications and mapped them to the design and development of MapReduce programs and algorithms.
- Wrote Pig and Hive jobs to parse logs and structure them in tabular format for effective querying of the log data. Also have hands-on experience with Pig and Hive User-Defined Functions (UDFs).
- Ran Hadoop ecosystem tools and applications through Apache Hue.
- Optimized Hadoop MapReduce code and Hive/Pig scripts for better scalability, reliability, and performance.
- Developed Oozie workflows for application execution.
- Performed feasibility analysis for the deliverables, evaluating requirements against complexity and timelines.
- Migrated data from legacy RDBMS databases to HDFS using Sqoop.
- Wrote Pig scripts for data processing.
- Implemented Hive tables and HQL queries for reports. Wrote and used complex data types in Hive. Stored and retrieved data using HQL. Developed Hive queries to analyze reducer output data.
- Closely involved in designing the next-generation data architecture for unstructured data.
- Managed a 4-node Hadoop cluster for a client conducting a Hadoop proof of concept. The cluster had 12 cores and 3 TB of installed storage.
- Developed Pig Latin scripts to extract data from the source system.
- Extracted data from Hive and loaded it into an RDBMS using Sqoop.
- Documented operational problems, following standards and procedures, using JIRA as the reporting tool.
Environment: CDH4, HDFS, MapReduce, Hive, Oozie, Java, Pig, Shell Scripting, Linux, Hue, Sqoop, Flume, DB2, and Oracle 11g
Confidential, NYC, NY
- Installed and configured Apache Hadoop, Hive and Pig environment on the prototype server
- Configured MySQL database to store Hive metadata
- Responsible for loading unstructured data into the Hadoop Distributed File System (HDFS)
- Created POC to store Server Log data in MongoDB to identify System Alert Metrics
- Created Reports and Dashboards of Server Alert Data
- Created MapReduce jobs using Pig Latin and Hive queries
- Collected data from Teradata and pushed it into Hadoop using Sqoop
- Used Sqoop to load data from RDBMS into HDFS
- Handled cluster coordination services through ZooKeeper
- Automated all jobs that pull data from the FTP server and load it into Hive tables, using Oozie workflows
- Created Reports and Dashboards using structured and unstructured data
- Maintained documentation for corporate Data Dictionary with attributes, table names and constraints.
- Worked extensively with SQL scripts to validate data before and after loads.
- Created unit test plans, test cases, and reports on various test cases for testing the data loads.
- Worked on integration testing to verify load order and time windows.
- Performed unit testing to validate that data is processed correctly, providing a qualitative check that the overall data flow is deposited correctly into targets.
- Responsible for post-production support and served as SME for the project.
- Involved in the System and User Acceptance Testing.
Environment: Hadoop, Pig, Hive, Java, Sqoop, HBase, NoSQL, Oracle 10g, PL/SQL, SQL Server, SQL Developer, Toad, Windows NT, Stored Procedures.
Confidential, Bloomfield, CT
- Responsible for understanding the scope of the project and requirement gathering.
- Reviewed and analyzed the design and implementation of software components/applications and outlined the development process strategies.
- Coordinated with project managers and the development and QA teams throughout the project.
- Used Spring JDBC to write DAO classes that access account information in the database.
- Used the Tomcat web server for development.
- Involved in creating test cases for JUnit testing.
- Used Oracle as the database and Toad for query execution; also wrote SQL scripts and PL/SQL code for procedures and functions.
- Used CVS and Perforce as configuration management tools for code versioning and release.
- Developed the application in Eclipse, using Maven as the build and deployment tool.
- Used Log4j to print debug, warning, and info log messages to the server console.
- Extensively used Core Java, Servlets, JSP, and XML.
Environment: Java 1.5, J2EE, XML, Spring 3.0, Design Patterns, Log4j, CVS, Maven, Eclipse, Apache Tomcat 6, and Oracle 11g.
- Interacted with the client on a regular basis to gather requirements.
- Analyzed the business, technical, and functional requirements.
- Tracked timely delivery of various milestones.
- Developed web services using the Spring Framework and Axis, including design of the XML request/response structure.
- Implemented the database and business layers using the Hibernate and Spring frameworks.
- Configured Oracle with Hibernate and wrote Hibernate mapping and configuration files for database operations (create, update, select).
- Involved in creating Oracle stored procedures for data/business logic.
- Created PL/SQL stored procedures for the contract generation module.
- Involved in configuring and deploying code to different environments: Integration, QA, and UAT.
- Designed, prepared, and executed system and acceptance test cases.
- Created an Ant build script to build artifacts.
- Worked on fine-tuning the response time of Web Service components.
Environment: Java, JSP, EJB, Servlets, Struts, Tomcat, WebLogic, Oracle 10g