- Overall 8 years of experience in the design and deployment of enterprise applications, web applications, client-server technologies, and web programming using Java and Big Data technologies.
- Possesses 5+ years of comprehensive experience as a Hadoop, Big Data & Analytics Developer.
- Expertise in Hadoop architecture and ecosystem components such as HDFS, MapReduce, Pig, Hive, Sqoop, Flume, and Oozie.
- Complete understanding of Hadoop daemons such as JobTracker, TaskTracker, NameNode, and DataNode, and of the MRv1 and YARN architectures.
- Experience in installing, configuring, managing, supporting, and monitoring Hadoop clusters using various distributions such as Apache, Cloudera, and AWS.
- Experience in installing and configuring Hadoop stack elements: MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Oozie, and ZooKeeper.
- Experience in data processing and analysis using MapReduce, HiveQL, and Pig Latin.
- Extensive experience in writing user-defined functions (UDFs) in Hive and Pig.
- Worked with Apache Sqoop to import and export data between HDFS and RDBMS/NoSQL databases.
- Worked with NoSQL databases such as HBase.
- Exposure to search, cache, and analytics data solutions such as Hive.
- Experience in job workflow scheduling and design using Oozie.
- Good knowledge of Amazon AWS services such as EMR and EC2, which provide fast and efficient processing of Big Data, and of machine learning concepts.
- Worked extensively with semi-structured data (fixed-length and delimited files) for data sanitization, report generation, and standardization.
- Experienced in monitoring Hadoop clusters using Cloudera Manager and the web UI.
- Extensive experience working with web technologies such as HTML, CSS, XML, JSON, and jQuery.
- Extensive experience in documenting requirements, functional specifications and technical specifications.
- Extensive experience with SQL, PL/SQL and database concepts.
- Strong database background with Oracle, PL/SQL, stored procedures, triggers, SQL Server, MySQL, and DB2.
- Strong problem-solving and analytical skills, with the ability to make balanced, independent decisions.
- Good team player with strong interpersonal, organizational, and communication skills, combined with self-motivation, initiative, and project-management abilities.
- Able to handle multiple priorities and heavy workloads, and to understand and adapt quickly to new technologies and environments.
Hadoop Core Services: HDFS, MapReduce, Spark, YARN.
Hadoop Distributions: Cloudera, Apache, Hortonworks.
NoSQL Databases: HBase, Cassandra.
Hadoop Data Services: Hive, Pig, Impala, Sqoop, Flume, Kafka (beginner).
Hadoop Operational Services: ZooKeeper, Oozie.
Monitoring Tools: Cloudera Manager.
Cloud Computing Tools: Amazon AWS.
Languages: C, Java, Scala, Python, SQL, PL/SQL, Pig Latin, HiveQL, JavaScript, UNIX shell scripting.
Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts, JMS, EJB.
Application Servers: WebLogic, WebSphere, JBoss, Tomcat.
Databases: Oracle, MySQL, PostgreSQL, Teradata.
Operating Systems: UNIX, Windows, Linux.
Build Tools: Jenkins, Maven, Ant.
Development Tools: Eclipse, NetBeans, Microsoft SQL Studio, Toad.
Confidential, Jacksonville FL
Sr. Hadoop Developer
- Developed simple and complex MapReduce programs in Java for data analysis on different data formats.
- Developed MapReduce programs that filter out bad and unnecessary records and identify unique records based on different criteria.
- Developed a secondary-sort implementation to obtain sorted values on the reduce side and improve MapReduce performance.
- Implemented custom data types, InputFormat, RecordReader, OutputFormat, and RecordWriter classes for MapReduce computations to handle custom business requirements.
- Implemented MapReduce programs to classify data into different categories based on record type.
- Worked with SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning to improve Hive performance and storage efficiency.
- Implemented daily cron jobs that automate parallel tasks of loading data into HDFS and pre-processing it with Pig, using Oozie coordinator jobs.
- Responsible for performing extensive data validation using Hive.
- Worked with Sqoop import and export functionality to transfer large data sets between an Oracle database and HDFS.
- Worked on tuning Hive and Pig scripts to improve performance.
- Involved in submitting and tracking MapReduce jobs using JobTracker.
- Involved in creating Oozie workflow and coordinator jobs that launch jobs based on schedule and data availability.
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
- Involved in loading generated HFiles into HBase for fast access to a large customer base without a performance hit.
- Implemented Hive generic UDFs to encapsulate business logic.
- Coordinated with end users on the design and implementation of analytics solutions for user-based recommendations using R, as per project proposals.
- Worked on a research team that developed Scala, a programming language with full Java interoperability and a strong type system.
- Improved stability and performance of the Scala plug-in for Eclipse, using product feedback from customers and internal users.
- Redesigned and implemented the Scala REPL (read-eval-print loop) to integrate tightly with other IDE features in Eclipse.
- Assisted in monitoring the Hadoop cluster using Ganglia.
- Knowledge of running Hive queries through Spark SQL within the Spark environment.
- Implemented test scripts to support test-driven development and continuous integration.
- Used the JUnit framework to perform unit and integration testing.
- Configured build scripts for multi module projects with Maven and Jenkins CI.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
Environment: Hadoop, CDH4, MapReduce, HDFS, Pig, Hive, Impala, Oozie, Java, Kafka, Linux, Scala, Maven, JavaScript, Oracle 11g/10g, SVN, Ganglia.
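The record-filtering and deduplication bullets above follow the classic MapReduce shape. A minimal illustrative sketch (in Python rather than the Java used on the project; field layout and names are hypothetical) that drops malformed records in the map phase and emits unique records in the reduce phase:

```python
def mapper(lines):
    """Drop records that do not have exactly 3 comma-separated fields,
    and key each surviving record by its ID (the first field)."""
    for line in lines:
        fields = line.rstrip("\n").split(",")
        if len(fields) != 3 or not fields[0]:
            continue  # filter out bad / unnecessary records
        yield fields[0], line.rstrip("\n")

def reducer(pairs):
    """Emit one record per key; pairs arrive sorted by key,
    mirroring the ordering Hadoop guarantees at the reducer."""
    last_key = None
    for key, record in pairs:
        if key != last_key:
            yield record
            last_key = key

if __name__ == "__main__":
    raw = ["1,alice,NY", "bad-record", "1,alice,NY", "2,bob,TX"]
    print(list(reducer(sorted(mapper(raw)))))
```

On a real cluster the sort between the two phases is done by the Hadoop shuffle; `sorted()` stands in for it here.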
Confidential, Columbus, OH
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with major Hadoop ecosystem components: Hive, Pig, HBase, Sqoop, Flume, Oozie, and ZooKeeper.
- Implemented a six-node CDH4 Hadoop cluster on CentOS.
- Imported and exported data between HDFS/Hive and different RDBMSs using Sqoop.
- Experienced in defining job flows to run multiple MapReduce and Pig jobs using Oozie.
- Imported log files into HDFS using Flume and loaded them into Hive tables for querying.
- Monitored running MapReduce programs on the cluster.
- Responsible for loading data from UNIX file systems to HDFS.
- Used HBase-Hive integration and wrote multiple Hive UDFs for complex queries.
- Wrote APIs to read HBase tables, cleanse the data, and write the results to another HBase table.
- Created multiple Hive tables and implemented partitioning, dynamic partitioning, and bucketing in Hive for efficient data access.
- Wrote multiple MapReduce programs in Java for data extraction, transformation, and aggregation from multiple file formats, including XML, JSON, CSV, and other compressed formats.
- Experienced in running batch processes using Pig scripts and developed Pig UDFs for data manipulation according to business requirements.
- Experienced in writing programs using the HBase client API.
- Involved in loading data into HBase using the HBase shell, the HBase client API, Pig, and Sqoop.
- Experienced in the design, development, tuning, and maintenance of NoSQL databases.
- Wrote MapReduce programs in Python using the Hadoop Streaming API.
- Developed unit test cases for Hadoop MapReduce jobs with MRUnit.
- Excellent experience in analyzing, designing, developing, testing, and implementing ETL processes, including performance tuning and database query optimization.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager and the web UI.
- Worked with application teams to install operating-system and Hadoop updates, patches, and version upgrades as required.
- Used Maven as the build tool and SVN for code management.
- Worked on writing RESTful web services for the application.
- Implemented testing scripts to support test-driven development and continuous integration.
Environment: Hadoop, MapReduce, HDFS, HBase, Hive, Impala, Pig, Java, SQL, Ganglia, Sqoop, Flume, Oozie, Unix, JavaScript, Maven, Eclipse.
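The Hadoop Streaming bullet above refers to plain-Python map and reduce functions wired together by Hadoop's sort-and-shuffle. A minimal word-count sketch under that model (function names are illustrative, not from the project):

```python
from itertools import groupby

def map_words(lines):
    # Map phase: emit (word, 1) for every word in the input
    for line in lines:
        for word in line.split():
            yield word, 1

def reduce_counts(pairs):
    # Hadoop delivers mapper output sorted by key; groupby mirrors that contract
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    pairs = sorted(map_words(["to be or not to be"]))  # sorted() plays the shuffle
    print(dict(reduce_counts(pairs)))
```

In actual streaming jobs the mapper and reducer read stdin and write tab-separated stdout; the generator structure above is the same logic in testable form.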
- Imported data from different relational data sources, such as RDBMSs and Teradata, into HDFS using Sqoop.
- Worked on writing transformer/mapping MapReduce pipelines using Apache Crunch and Java.
- Imported bulk data into the Cassandra file system using the Thrift API.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
- Performed analytics on time-series data stored in Cassandra using the Java API.
- Designed and implemented incremental imports into Hive tables.
- Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.
- Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
- Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
- Experienced in managing and reviewing the Hadoop log files.
- Migrated ETL jobs to Pig scripts to perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
- Implemented workflows using the Apache Oozie framework to automate tasks.
- Worked with the Avro data serialization system to handle JSON data formats.
- Worked with different file formats, such as SequenceFiles, XML files, and MapFiles, using MapReduce programs.
- Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
- Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization purposes.
- Developed scripts that automated end-to-end data management and synchronization across all clusters.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Pig scripts.
Environment: Hadoop, HDFS, Hortonworks (HDP 2.1), MapReduce, Hive, Oozie, Sqoop, Pig, MySQL, Java, REST API, Maven, MRUnit, JUnit.
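The log-parsing Hive jobs mentioned above depend on first structuring raw log lines into delimited columns that a table can query. A hedged sketch of that step in Python (the regex and field names are assumptions, not the project's actual log format):

```python
import re

# Assumed access-log layout: ip, two dash fields, [timestamp], "METHOD path proto", status
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d+)'
)

def to_tsv(line):
    """Return a tab-separated row for a matching line, or None otherwise."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None  # malformed lines are skipped, not loaded
    return "\t".join(m.group("ip", "ts", "method", "path", "status"))

if __name__ == "__main__":
    sample = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200'
    print(to_tsv(sample))
```

Rows in this shape can be landed in HDFS and exposed through a Hive table whose fields are terminated by tabs.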
Confidential, Springfield, IL
Sr. Java Developer
- Designed, developed, maintained, tested, and troubleshot Java and PL/SQL programs in support of payroll employees.
- Developed documentation for new and existing programs and designed specific enhancements to the application.
- Implemented the web layer using JSF and ICEfaces.
- Implemented business layer using Spring MVC.
- Implemented report retrieval based on start date using HQL.
- Implemented session management using SessionFactory in Hibernate.
- Developed the DOs and DAOs using Hibernate.
- Implemented a SOAP web service to validate ZIP codes using Apache Axis.
- Wrote complex queries, PL/SQL Stored Procedures, Functions and Packages to implement Business Rules.
- Wrote a PL/SQL program to send email to a group from the backend.
- Developed scripts triggered monthly to produce the current monthly analysis.
- Scheduled Jobs to be triggered on a specific day and time.
- Modified SQL statements to increase the overall performance as a part of basic performance tuning and exception handling.
- Used cursors, arrays, tables, and BULK COLLECT concepts.
- Extensively used Log4j for logging.
- Performed unit testing in all the environments.
- Used Subversion as the version control system.
- Involved in all the phases of the life cycle of the project from requirements gathering to quality assurance testing.
- Developed Class diagrams, Sequence diagrams using Rational Rose.
- Developed the presentation layer using the Struts framework and performed validations using the Struts Validator plug-in.
- Created SQL scripts for the Oracle database.
- Implemented the business logic using Spring transactions and Spring AOP.
- Implemented persistence layer using Spring JDBC to store and update data in database.
- Produced a web service using the WSDL/SOAP standard.
- Implemented J2EE design patterns such as the Singleton pattern combined with the Factory pattern.
- Extensively involved in creating session beans and MDBs using EJB 3.0.
- Used Hibernate framework for Persistence layer.
- Extensively involved in writing stored procedures for data retrieval, storage, and updates in the Oracle database, accessed via Hibernate.
- Deployed and built the application using Maven.
- Performed testing using JUnit.
- Used JIRA to track bugs.
- Extensively used Log4j for logging throughout the application.
- Produced a web service using REST with the Jersey implementation to provide customer information.
- Used SVN for source code versioning and code repository.
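The Singleton-plus-Factory combination noted in the design-patterns bullet is language-agnostic; a compact Python rendering of the idea (the original work was in J2EE/Java, and these class and product names are purely illustrative):

```python
class ConnectionFactory:
    """Hypothetical factory whose single shared instance hands out
    connection-class names keyed by database type (illustrative only)."""
    _instance = None  # the one Singleton instance

    def __new__(cls):
        # Singleton: every construction returns the same instance
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def create(self, kind):
        # Factory method: map a requested type to a product
        products = {"oracle": "OracleConnection", "mysql": "MySQLConnection"}
        if kind not in products:
            raise ValueError(f"unknown connection type: {kind}")
        return products[kind]

if __name__ == "__main__":
    a, b = ConnectionFactory(), ConnectionFactory()
    print(a is b)              # both references share one instance
    print(a.create("oracle"))
```

Combining the two patterns gives one well-known access point (the Singleton) while keeping object creation behind a single method (the Factory).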