- More than 8 years of experience in the IT industry, including Big Data environments, the Hadoop ecosystem, and the design, development, and maintenance of various applications.
- Hands-on experience in writing MapReduce jobs on the Hadoop ecosystem, including major components like Hive, Pig, MongoDB, Sqoop, Flume, and HBase.
- Expertise in writing UDFs in Pig; wrote Pig Latin scripts to analyze raw data.
- Proficient in writing UDFs in HiveQL to transform data.
- Professional understanding of the various phases of the SDLC, including requirements analysis, development, maintenance, and testing of client/server and web applications.
- Along with developing on the Hadoop ecosystem, also have some experience installing and configuring the Hortonworks distribution (HDP 2.2) and Cloudera distributions (CDH3 and CDH4).
- Experience in the NoSQL databases HBase, MongoDB, and Cassandra.
- Good understanding of Hadoop architecture and hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode, DataNode, and MapReduce programming.
- Experience in importing and exporting data between HDFS and RDBMSs using Sqoop.
- Extracted and processed streaming log data from various sources and integrated it into HDFS using Flume.
- Extensively worked with different data sources: non-relational data such as XML files (using parsers like SAX and DOM) and relational databases such as Oracle and MySQL.
- Expertise in writing Oozie workflows to schedule Hadoop jobs.
- Used compression techniques with file formats to conserve storage in HDFS.
- Experience working on operating systems like Ubuntu, CentOS, Windows 7, and Windows 10.
- Involved in loading data from the Linux file system to HDFS.
- Extensive experience using Maven and Ant as build tools for producing deployable artifacts (WAR and EAR) from source code.
- Used Jenkins CI: a distributed build farm that supports all of the environments used to run builds.
- Expert in deploying code through web application servers like WebSphere, WebLogic, and Apache Tomcat in the AWS cloud.
- Delivered project needs on time and within the agreed acceptance criteria in a hybrid-methodology environment as the team transitioned to an Agile methodology.
- Able to develop and execute automation tools and framework designs using Chef, shell, and Python scripts.
- Core responsibilities include migrating data from Oracle databases to Hive, ingesting application logs using Flume, and performing ETL operations using Pig. Also handling incremental data with Sqoop incremental imports.
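The incremental Sqoop ingestion mentioned above can be sketched with a command of the following shape; the JDBC URL, credentials, table, and column names are hypothetical placeholders, not production values:

```shell
# Illustrative incremental Sqoop import from Oracle into Hive.
# All connection details and identifiers below are assumptions.
sqoop import \
  --connect jdbc:oracle:thin:@//db-host:1521/ORCL \
  --username etl_user -P \
  --table ORDERS \
  --hive-import --hive-table staging.orders \
  --incremental append \
  --check-column ORDER_ID \
  --last-value 150000
```

With `--incremental append`, only rows whose `ORDER_ID` exceeds the recorded `--last-value` are pulled on each run, which is what makes repeated ingestion cheap compared with full-table reloads.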
Big Data Ecosystem: MapReduce, HDFS, Hive, Pig, Sqoop, Flume, HDP, Oozie, ZooKeeper, Spark, Kafka, Storm, Hue
Hadoop Distributions: Cloudera (CDH3, CDH4, CDH5), Hortonworks HDP 2.2
Databases/ETL: Oracle 11g/10g, MySQL, MS SQL Server, PL/SQL
NoSQL Databases: HBase and MongoDB
Programming: C, C++, Java, Unix shell scripting, R
Version Control: Git, GitLab, SVN
Application Server: Apache Tomcat, WebLogic Server
Operating Systems: Linux, Windows
Web Technologies: HTML, DHTML, XML
Cloud Computing Services: AWS: IAM, EC2, S3, Elastic Beanstalk, EBS, VPC, EC2 instances, OpsWorks, Elastic Load Balancer (ELB), RDS (MySQL), AMI, SQS, SNS, SWF, data security, troubleshooting, DynamoDB, API Gateway, Direct Connect, CloudFront, CloudWatch, CloudTrail, Route 53
Big Data/Hadoop Developer
- Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
- Involved in Agile methodologies of project development.
- Imported unstructured data into HDFS using Flume.
- Used Oozie to orchestrate the MapReduce jobs that extract data on a scheduled basis.
- Wrote MapReduce Java programs to analyze log data for large-scale data sets.
- Used the HBase Java API in a Java application.
- Automated all the jobs for extracting data from different data sources such as MySQL and pushing the result sets to the Hadoop Distributed File System using the Oozie workflow scheduler.
- Implemented MapReduce jobs using the Java API and in Python using Spark.
- Participated in the setup and deployment of the Hadoop cluster.
- Hands-on design and development of an application using Hive UDFs.
- Responsible for writing Hive queries for analyzing data in the Hive warehouse using Hive Query Language (HQL).
- Supported data analysts in running Pig and Hive queries.
- Worked extensively with HiveQL and Pig Latin.
- Imported and exported data between MySQL/Oracle and Hive using Sqoop.
- Configured an HA cluster for both manual and automatic failover.
- Designed and built many applications to handle vast amounts of data flowing through multiple Hadoop clusters, using Pig Latin and Java-based MapReduce.
- Specified the cluster size and allocated resource pools for the Hadoop distribution by writing specifications in JSON format.
- Experience in writing Solr queries for various search documents.
- Responsible for defining the data flow within the Hadoop ecosystem and directing the team in implementing it.
Environment: Linux, HDP 2.2, Apache Hadoop, Hive, Hue, ZooKeeper, MapReduce, Sqoop, Crunch API, Pig 0.10 and 0.11, HCatalog, Unix, Java, JSP, Eclipse, Maven, Oracle, SQL Server, MySQL, Oozie, Python 3.4.
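A scheduled extract-and-transform workflow of the kind orchestrated in this role could be defined in Oozie roughly as follows; the workflow name, action names, properties, and paths are illustrative placeholders rather than details from the actual project:

```xml
<!-- Illustrative Oozie workflow: a Sqoop extraction followed by a MapReduce step.
     Job-tracker/name-node addresses, directories, and names are placeholders. -->
<workflow-app name="extract-and-load" xmlns="uri:oozie:workflow:0.4">
    <start to="sqoop-extract"/>
    <action name="sqoop-extract">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect ${jdbcUrl} --table ORDERS --target-dir ${stagingDir}</command>
        </sqoop>
        <ok to="mr-transform"/>
        <error to="fail"/>
    </action>
    <action name="mr-transform">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${stagingDir}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Chaining actions through `ok`/`error` transitions is what lets Oozie run the extract and the transform "as per their dependencies" rather than on independent timers.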
Big Data/Hadoop Developer
- Processed Big Data using a Hadoop cluster consisting of 40 nodes.
- Designed and configured Flume servers to collect data from the network proxy servers and store it in HDFS.
- Loaded customer profile, customer spending, and credit data from legacy warehouses onto HDFS using Sqoop.
- Built a data pipeline using Pig and Java/Scala MapReduce to store data on HDFS.
- Applied transformations and filtered traffic using Pig.
- Used pattern-matching algorithms to recognize customers across different sources, built risk profiles for each customer using Hive, and stored the results in HBase.
- Performed unit testing using MRUnit.
- Responsible for building scalable distributed data solutions using Hadoop.
- Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Developed simple to complex map/reduce jobs using Scala and Java in Spark.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Analyzed data by performing Hive queries and running Pig scripts to study employee behavior.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Involved in managing and reviewing the Hadoop log files.
Environment: Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Pig 0.10 and 0.11, JDK 1.6, HDFS, Flume, Oozie, DB2, HBase, Mahout, Scala.
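The log-analysis MapReduce jobs described in these roles can be sketched in pure Python in the Hadoop Streaming style. The log line format (space-separated, status code in the third field) is an assumption for illustration, not the actual production schema:

```python
from itertools import groupby
from operator import itemgetter

def mapper(lines):
    """Map phase: emit (status_code, 1) for each log line.
    Assumes space-separated records with the code in field 3."""
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:
            yield fields[2], 1

def reducer(pairs):
    """Reduce phase: sum counts per key. Sorting here stands in for
    the shuffle/sort that the Hadoop framework performs between phases."""
    for key, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield key, sum(count for _, count in group)

if __name__ == "__main__":
    logs = [
        "2016-01-01T00:00:01 GET 200",
        "2016-01-01T00:00:02 GET 404",
        "2016-01-01T00:00:03 POST 200",
    ]
    for code, total in reducer(mapper(logs)):
        print(code, total)
```

Because the mapper and reducer are plain generator functions, they can be unit-tested in isolation on small in-memory inputs, which is the same idea MRUnit applies to Java MapReduce code.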
Big Data/Hadoop Developer
- Acted as a resource and built the entire Hadoop platform from scratch.
- Evaluated the suitability of Hadoop and its ecosystem for the project by implementing and validating various proof-of-concept (POC) applications before adopting them as part of the Big Data Hadoop initiative.
- Estimated the software and hardware requirements for the NameNode and DataNodes in the cluster.
- Extracted the needed data from the server into HDFS and bulk-loaded the cleaned data into HBase using MapReduce.
- Responsible for writing programs for testing data as well as developing a business enterprise data warehouse application.
- Wrote MapReduce programs and Hive UDFs.
- Used MapReduce JUnit (MRUnit) for unit testing.
- Developed Hive queries for the analysts.
- Executed parameterized Pig, Hive, Impala, and UNIX batches in production.
- Created an e-mail notification service that notifies the requesting team upon completion of a job.
- Defined job workflows according to their dependencies in Oozie.
- Performed data cleansing, data-quality tracking, and process-balancing checkpoints.
- Created flexible data model designs that are scalable and reusable, while emphasizing performance, data validation, and business needs.
- Played a key role in productionizing the application after testing by BI analysts.
- Maintained system integrity of all sub-components related to Hadoop.
Environment: Apache Hadoop, HDFS, Hive, MapReduce, Java, Cloudera CDH4, Oozie, Oracle, MySQL, Amazon S3.
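Analyst-facing Hive queries of the sort developed in these roles typically look like the following HQL sketch; the database, table, and column names are hypothetical, not taken from any real schema:

```sql
-- Hypothetical HQL: daily spend per customer from a warehouse table.
-- All identifiers below are illustrative placeholders.
SELECT customer_id,
       to_date(txn_ts) AS txn_day,
       SUM(amount)     AS daily_spend
FROM   warehouse.transactions
WHERE  txn_ts >= '2014-01-01'
GROUP BY customer_id, to_date(txn_ts)
ORDER BY daily_spend DESC
LIMIT  100;
```

Queries like this run as MapReduce jobs under the hood, which is why partitioning and compressed file formats (mentioned earlier) matter for their performance.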
- Involved in Analysis, Design, Coding and Development of custom Interfaces.
- Involved in the feasibility study of the project.
- Gathered requirements from the client for designing the Web Pages.
- Gathered specifications for the Library site from different departments and users of the services.
- Assisted in proposing suitable UML class diagrams for the project.
- Wrote SQL scripts to create and maintain the database, roles, users, tables, views, procedures and triggers in Oracle .
- Designed and implemented the UI using HTML and Java .
- Worked on the database interaction layer for insertion, update, and retrieval operations on data.
- Coordinated and communicated with onsite resources regarding issues raised in the production environment and fixed day-to-day issues.
- Handled release management and code reviews.
- Partly used Hibernate, EJB, and Web Services.
- Involved in developing build file for the project.
- Involved in all Payday transaction issue fixes and enhancements.
- Supported UAT, pre-prod, and production build management.
- Involved in the analysis of the Safe/Drawer Transactions and Loan Deposit modules and the development of Collection Letters.
- Coordinated with the team on fixes and releases.
- Involved in all Title transactions and printing of the documents.
- Involved in the program setup, program profile, fees, and card settings modules.
- Developed Action classes, business classes, helper classes, and Hibernate POJO classes.
- Developed Spring DAO classes and stored-procedure classes to connect to the DB through Spring JDBC.
- Developed Action Forms, Form Beans, and Java Action classes using the Struts framework.
- Participated in code reviews and ensured compliance with standards
- Involved in preparing database scripts and deployment process
- Used JDBC API to connect to the database and carry out database operations.
- Involved in the design and implementation of the web tier using Servlets and JSP.
- Used Apache POI for reading Excel files.
- Used JSP and JSTL Tag Libraries for developing User Interface components.
- Involved in developing UML Diagrams like Use Case, Class, Sequence diagrams.
- Handled the session management to switch from classic application to new wizard and vice versa.
Jr. Software Engineer
- Gathered and analyzed user/business requirements and developed System test plans.
- Managed the project using Test Director, added test categories and test details.
- Involved in various phases of Software Development Life Cycle (SDLC) of the application like Requirement gathering, Design, Analysis and Code development.
- Developed and implemented the MVC Architectural Pattern using Struts Framework including JSP, Servlets, EJB, Form Bean and Action classes .
- Implemented server side tasks using Servlets and XML .
- Implemented Struts Validation Framework for Server side validation.
- Developed JSPs with custom tag libraries for control of the business processes in the middle tier and was involved in their integration.
- Implemented Struts Action classes using Struts controller component.
- Involved in creating scripts and PL/SQL programs for a data integration project.
- Involved in deploying the application in WebLogic.
- Worked on databases like Oracle and MySQL using Struts, JSP, JDBC, and Servlets.
Environment: Windows XP, Java 1.6, JSP, Servlets, J2EE, JDBC, HTML, Oracle, and MySQL in Eclipse and Apache Tomcat Server.