- Around 9 years of experience spanning Big Data, Business Intelligence and Analytics, Java, SAP HCM ABAP, ETL, ELT, UNIX, Linux and Tableau.
- Experienced in developing on Hadoop and its services across major distributions, including Cloudera, Hortonworks, Pivotal and MapR.
- Worked on multiple projects involving MapReduce, HDFS, Oozie, Pig, Hive, Sqoop, Zookeeper, Flume, Python, Storm, Kafka, Solr, Elasticsearch, Spark SQL, Spark Streaming, Spark and Scala, AWS, YARN and NoSQL (HBase, MongoDB and Cassandra).
- Well experienced in Core Java programming and shell scripting.
- Imported data from relational databases into Hive and exported data from Hive back to relational databases using Sqoop.
- Experienced in the enterprise search platforms Solr and Elasticsearch.
- Solid understanding of relational database management systems and of ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes.
- Strong understanding of Linux, specifically CentOS.
- Experienced in BI tools, including Tableau, for data visualization.
- Experienced in preparing and publishing Dashboards for better visualization of data using Tableau.
- Experienced in MVC (Model View Controller) architecture.
- Excellent analytical, problem-solving, communication and interpersonal skills; able to work both as part of a team and independently.
- A strong team player with good analytical skills to identify key issues, provide solutions and design Technical Specification documents on schedule.
Big Data: MapReduce, HDFS, Hive with Tez, Pig, Sqoop, Oozie, Zookeeper, Flume, AWS, YARN, Storm, Kafka, Spark, Python, Solr, HBase, MongoDB, Cassandra, Cloudera Manager.
Languages: C, C++, Java, Python, SQL, PL/SQL, ABAP.
Databases: Oracle, SQL Server, DB2, MS Access
Technologies: Java, Log4j, Java Help API, JDBC, Java Beans
Methodologies: Agile Software development, Six Sigma, Project Management.
Framework: Ajax, Struts 2.0, JUnit, Log4j, Spring, Hibernate
Application Server: Apache Tomcat, JBoss
Tools: HTML, CSS, JavaScript, XML, jQuery
Testing Tools: NetBeans, Eclipse, HP Quality Center
Operating System: UNIX, Mac OS X, Windows
Control Tools: CVS, TortoiseSVN, Git
Others: MS Office
Confidential, Bellevue, WA
- Gathered requirements from the client and coordinated with onsite, offshore and client teams.
- Worked with Hortonworks Data Platform 2.2, MapReduce, Pig, Hive, Sqoop, Control-M, HBase and Storm.
- Worked with large data sets on a large cluster.
- Strong knowledge of data mining and data warehousing.
- Worked with RabbitMQ as the messaging system.
- Prepared and processed data to be loaded into HBase.
- Loaded data into Hive and retrieved data from Hive tables using HiveQL.
- Developed Spark code using Scala and Spark SQL for faster testing and processing of data.
- Worked on loading the raw data extracts into Hive tables.
- Worked on creating external and managed tables in Hive.
- Designed HBase Schema, created HBase tables and loaded the historical data into HBase tables.
- Loaded data into HBase tables using the HBase Put method and HBase bulk loading.
- Updated the HBase tables daily using Oozie.
- Used Spark SQL to process large amounts of structured data.
- Worked on HBase and Hive integration and loaded the data into HBase tables.
- Built Tableau dashboards to visualize the data for higher-level management.
- Worked on project related documentation in Confluence.
- Experience in offshore and onsite coordination.
Environment: Hadoop (HDP 2.2), MapReduce, HBase, SolrCloud, Pig, Hive, Ambari, Oozie, Sqoop, Spark, Kafka, Storm, YARN, Maven, Zookeeper, JDBC, RabbitMQ and File Watcher
Confidential, Charleston, IL
IT Programmer (Hadoop)
- Experience in developing POCs and highly distributed applications.
- Worked with Data Scientists, Big Data Analysts and Business Analysts from ITS teams to gather requirements for implementing the HBase and Solr project.
- Involved in analyzing, planning, coding, debugging, testing and go-live phases of Big Data applications.
- Primarily worked on the HBase and Solr project.
- Implemented enterprise search with SolrCloud.
- Responsible for writing MapReduce programs.
- Wrote and integrated MapReduce jobs in order to load crawl data into HBase and index it into Solr.
- Responsible for loading raw data into HBase tables and indexing updates on top of the HBase tables using NRT indexing.
- Worked with the Morphlines ETL tool for batch indexing from HBase tables into SolrCloud.
- Worked on Near Real Time indexing to update the index for existing HBase tables.
- Experienced in Hadoop with Spark and Scala.
- Worked with the MapReduce Indexer Tool to create a set of Solr index shards from the input files and merge the output shards into SolrCloud.
- Wrote shell scripts to automate document indexing to SolrCloud in production.
- Implemented MapReduce jobs that call Solr for fuzzy matching of data, with each mapper calling the Solr service.
- Worked on indexing data using Elasticsearch and CloudSearch managed on AWS Cloud.
- Developed multiple MapReduce jobs in Java for data analysis.
- Used Oozie to automate the daily loading of raw data from multiple sources into HBase tables and set up email notifications for failed Oozie jobs.
- Worked on Hive UDFs for smoothing the data.
- Worked with AWS services such as Elastic MapReduce, S3 and EC2.
- Involved in checking security measures for loading the Arrow data on AWS Cloud.
- Involved in loading data from MySQL and DB2 to HDFS using Sqoop.
- Experienced in querying with Impala and Hive.
- Converted Avro-formatted data into JSON, loaded it into HDFS and then into HBase tables via bulk loading with HFiles.
- Developed Hive queries to process the data and visualize data.
- Designed and developed dashboards in Tableau and QlikView for visualizing the data and presented them to higher-level management.
- Communicated with the BI team when building dashboards in Tableau, QlikView and Qlik Sense.
- Gave knowledge transfer (KT) sessions on HBase, Solr and Tableau to team members.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented custom MapReduce jobs, Impala queries, Pig scripts and Oozie workflows.
- Used Maven for project management and to integrate the build process.
- Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Attended the Hadoop Essentials workshop from Cloudera on implementing Cloudera services and their benefits.
- Attended workshops and training sessions from Cloudera, GitHub, Qlik, Tableau and MicroStrategy.
- Responded quickly to ad hoc internal and external client requests for data and created ad hoc reports.
- Prepared technical specs, functional specs and implementation plans for documentation purposes.
- Involved in code review sessions with team.
- Used Bitbucket for sharing code, Confluence for knowledge transfer and JIRA for project tracking and updates.
- Followed agile best practices.
Environment: Hadoop (CDH 5.2, CDH 5.3, CDH 5.4), MapReduce, HBase, SolrCloud (Cloudera Search), Pig, Hive, Impala, Oozie, Sqoop, Spark, Key-Value Store Indexer, Kafka, Storm, YARN, AWS (EC2, S3, Elasticsearch), Maven, Zookeeper, JDBC, Log4j
- Involved in various phases of Software Development Life Cycle.
- Interacted with all modules of the project, gathered the batch-related requirements and designed accordingly.
- Used Eclipse as IDE for application development.
- Created and maintained the configuration of the Spring Application Framework.
- Wrote Spring configuration XML files containing declarations of objects and their dependencies.
- Designed and developed GUI using JSP, HTML, DHTML and CSS.
- Worked with JMS for messaging interface.
- Developed the UI in Java with Oracle 10g as the backend, accessed through TOAD.
- Used Log4j extensively for logging.
- Used Subversion as the version control system.
- Responsible for understanding the scope of the project and requirement gathering.
- Used the Tomcat web server for development.
- Involved in creation of Test Cases for JUnit Testing.
- Used Oracle as the database and TOAD for query execution; wrote SQL scripts and PL/SQL code for procedures and functions.
- Used CVS as configuration management tool for code versioning and release.
- Developed the application in Eclipse and used Maven as the build and deployment tool.
- Used Log4j to print debug, warning and info log messages to the server console.
- Performed unit testing using JUnit.
- Involved in scheduling all the batch tasks to run in different environments.
- Used JMS to send and receive messages in XML format.
- Configured the data source to access the Oracle database using the JDBC provider for Oracle on the application server.
- Involved in the maintenance and production support.
Environment: Java, XML, Spring, Hibernate, Design Patterns, WebLogic, Log4j, CVS, Maven, Eclipse, Apache, Oracle, Grails and Groovy.
Confidential, Warrenville, IL
SAP HCM ABAP Developer
- Involved in various phases of Software Development Life Cycle.
- Interacted with all modules of the project, gathered requirements and designed accordingly.