- 8 years of professional IT experience in Big Data, Hadoop, Java/J2EE, and Cloud technologies across the Financial, Retail, and Healthcare domains.
- Over 4 years of experience on the Big Data platform as both a developer and an administrator.
- Experience in building high-performance, scalable solutions using Hadoop ecosystem tools such as Pig, Hive, Sqoop, Impala, Spark, Solr, and Kafka.
- Responsible for designing and building a Data Lake using Hadoop and its ecosystem components.
- Handled data movement, transformation, analysis, and visualization across the lake by integrating it with various tools.
- Defined extract-transform-load (ETL) and extract-load-transform (ELT) processes for the Data Lake.
- Experienced in working with ETL tools such as Talend and Informatica.
- Extensively worked on Spark and its components, including Spark SQL, SparkR, and Spark Streaming, for data manipulation, preparation, and cleansing.
- Good knowledge of analyzing streaming data; defined real-time data streaming solutions across the cluster using Spark Streaming, Apache Storm, Kafka, NiFi, and Flume.
- Experienced in writing Spark SQL scripts and implementing Spark transformations and actions using Python (see the sketch following this summary).
- Very good understanding and working knowledge of object-oriented programming (OOP) and Python.
- Good expertise in planning, installing, and configuring Hadoop clusters based on business needs.
- Installed and configured multiple Hadoop clusters of different sizes, with ecosystem components such as Pig, Hive, Sqoop, Flume, HBase, Oozie, and Zookeeper.
- Worked on all major Hadoop distributions: Cloudera (CDH4, CDH5), Hortonworks (HDP 2.2, 2.4), and MapR.
- Experience in implementing failover mechanisms for the NameNode, ResourceManager, and Hive.
- Configured AWS EC2 instances, S3 buckets, and cloud services, and architected the flow of data to and from AWS.
- Transformed and aggregated data for analysis by implementing workflow management of Sqoop, Hive, and Pig scripts.
- Experience working with file formats such as Avro, Parquet, ORC, and Sequence, and compression techniques such as Gzip, LZO, and Snappy in Hadoop.
- In-depth knowledge of Scala and experience building Spark applications using Scala.
- Good experience working with Tableau and Spotfire; enabled JDBC/ODBC connectivity from those tools to Hive tables.
- Well versed in SQL/PL-SQL and Oracle databases: writing queries, stored procedures, triggers, and functions.
- Experience in developing applications using Java, J2EE, JSP, MVC, Servlets, Struts, Hibernate, JDBC, JSF, EJB, XML, AJAX, and web-based development tools.
- Expertise in web technologies such as HTML, CSS, PHP, and XML.
- Worked with various tools and IDEs, including Eclipse, IBM Rational, the Apache Ant build tool, MS Office, PL/SQL Developer, and SQL*Plus.
- Highly motivated, with the ability to work independently or as an integral part of a team, and committed to the highest levels of professionalism.
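A minimal sketch of the kind of Spark SQL work summarized above. The summary cites Python; Java is used for all sketches in this document for consistency, so this is a swapped-in rendering. The table, columns, and output path are hypothetical placeholders.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSqlSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("spark-sql-sketch")
                .enableHiveSupport()
                .getOrCreate();

        // Run a Spark SQL aggregation over a Hive table; the database,
        // table, and column names are hypothetical.
        Dataset<Row> totals = spark.sql(
                "SELECT region, SUM(amount) AS total FROM sales.orders GROUP BY region");

        // Write the result as Snappy-compressed Parquet, one of the
        // format/codec combinations cited in the summary.
        totals.write()
              .option("compression", "snappy")
              .parquet("/data/lake/curated/orders_by_region");

        spark.stop();
    }
}
```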
Big Data / Hadoop: HDFS, MapReduce, HBase, Kafka, Pig, Hive, Sqoop, Impala, Flume, Talend
Real time/Stream Processing: Apache Storm, Apache Spark
Cloud Technologies: Amazon Web Services (EC2, S3, EMR, Redshift)
Operating Systems: Windows, Unix and Linux
Programming Languages: C, Java, J2EE, SQL
Databases: Oracle 9i/10g, SQL Server, MS Access
IDE/Development Tools: Eclipse, NetBeans
Methodologies: Agile, Scrum and Waterfall
Confidential, Sunnyvale, CA
Sr. Hadoop Consultant
- Installed, configured, and tested Hadoop ecosystem components such as MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Hue, and HBase.
- Imported data from various sources into HDFS and Hive using Sqoop.
- Involved in writing custom MapReduce, Pig, and Hive programs.
- Experience in writing custom UDFs in Java to extend Hive and Pig Latin functionality (see the sketch at the end of this section).
- Created partitions and buckets in Hive for both managed and external tables to optimize performance.
- Worked on several PoCs involving NoSQL databases such as HBase, MongoDB, and Cassandra.
- Configured Tez as the execution engine for Hive queries to improve performance.
- Developed a data pipeline using Kafka and Storm to store data in HDFS and performed real-time analytics on the incoming data.
- Hands-on experience in Spark and Spark Streaming, creating RDDs and applying transformations and actions on them.
- Developed Spark code using Python and Spark SQL for faster data processing.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala.
- In-depth knowledge of Scala and experience building Spark applications using Scala.
- Configured Flume to stream data into HDFS and Hive using HDFS and Hive sinks.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Used the Oozie workflow engine to schedule and run multiple Hive, Pig, and Spark jobs.
- Worked with application teams to install operating-system and Hadoop updates, patches, and version upgrades as required.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and tuning servers for optimal cluster performance.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, AWS, Linux, Spark, Kafka, HBase, Solr, UNIX
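A hypothetical illustration of the custom Hive UDFs in Java mentioned above; the class name and normalization logic are illustrative only, not taken from the actual project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative Hive UDF: normalizes a free-text column to a trimmed,
// lower-case form so values can be grouped reliably in Hive queries.
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Such a UDF is packaged in a jar, added to the Hive session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before it is used in queries.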
Confidential, Atlanta, GA
Sr. Hadoop Developer
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, and HDFS.
- Designed and implemented a semi-structured data analytics platform leveraging Hadoop.
- Worked on performance analysis and improvements for Hive and Pig scripts at the MapReduce job-tuning level.
- Used Sqoop to load data from RDBMS into HDFS.
- Implemented several POCs to validate and fit Hadoop ecosystem tools on the CDH and Hortonworks distributions.
- Involved in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs or data.
- Designed and implemented error-free data warehouse ETL and Hadoop integration.
- Proficient in data modeling with Hive partitioning, bucketing, and other Hive optimization techniques.
- Developed Python scripts to automate and provide control flow to Pig scripts for extracting the data and loading it into HDFS.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
- Worked on Spark Streaming to receive ongoing information from Kafka and store the stream data in HDFS (see the sketch at the end of this section).
- Set up standards and processes for Hadoop-based application design and implementation.
- Wrote shell scripts for several day-to-day processes and worked on their automation.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Worked on establishing connectivity between Tableau and Spotfire.
Environment: Hadoop, HDFS, MapReduce, Spark, Java, Hive, Pig, HBase, Sqoop, Flume, Linux, UNIX.
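A minimal Java sketch of the Kafka-to-HDFS streaming flow described above, using the spark-streaming-kafka-0-10 integration. The broker address, topic, consumer group, and output path are hypothetical placeholders.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsSketch {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-hdfs-sketch");
        // 30-second micro-batches.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "hdfs-loader");           // hypothetical group
        kafkaParams.put("auto.offset.reset", "latest");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("events"), kafkaParams));

        // Persist each micro-batch of message values to HDFS under a
        // time-stamped directory.
        stream.map(ConsumerRecord::value)
              .foreachRDD((rdd, time) ->
                      rdd.saveAsTextFile("hdfs:///data/raw/events/" + time.milliseconds()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```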
Confidential, King of Prussia, PA
- Responsible for building scalable distributed data solutions using Hadoop.
- Collected and downloaded sensor data generated by patients' body activities into HDFS.
- Performed the necessary transformations and aggregations to build the common learner data model in a NoSQL store (HBase).
- Used Pig, Hive, and MapReduce to analyze the health insurance data and patient information.
- Developed workflows in Oozie to orchestrate a series of Pig scripts that remove, merge, and compress files using Pig pipelines in the data-preparation stage.
- Wrote Pig UDFs in Python and Java and used sampling of large data sets (see the sketch at the end of this section).
- Moved all log files generated from various sources to HDFS for further processing through Flume.
- Extensively used Pig to communicate with Hive and HBase using HCatalog and storage handlers.
- Involved in transforming data from legacy tables to HDFS and HBase tables using Sqoop.
- Implemented test scripts to support test-driven development and continuous integration.
- Exported analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Good understanding of ETL tools and their application to the Big Data environment.
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Oozie, Java, HBase, Flume, Oracle 10g, UNIX Shell Scripting.
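A hypothetical example of the Pig UDFs in Java referenced above. Masking a patient identifier fits the health-care context of this project, but the class name and logic are illustrative only.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Illustrative Pig UDF: masks an identifier so downstream analysis
// never sees the raw value, keeping only the last four characters.
public class MaskId extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        String id = input.get(0).toString();
        int keep = Math.min(4, id.length());
        return "****" + id.substring(id.length() - keep);
    }
}
```

In a Pig script, the jar is loaded with REGISTER and the function is invoked inside a FOREACH ... GENERATE statement.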
Confidential, St Petersburg, FL
- Designed the application in a J2EE architecture and developed dynamic, browser-compatible user interfaces for online account management and order and payment processing.
- Used Hibernate object-relational mapping (ORM) to achieve data persistence.
- Developed Servlets and JSPs based on the MVC pattern using the Spring Framework.
- Developed the required helper classes following Core Java multi-threaded programming.
- Developed Hibernate DAO classes using the Spring JdbcTemplate, with methods in the DAO layer to persist the POJOs in the database (see the sketch at the end of this section).
- Designed and developed web services based on SOAP and WSDL for handling transaction history.
- Involved in designing and developing JSON and XML objects with MySQL.
- Developed web applications using Spring MVC and jQuery and implemented the Spring dependency-injection mechanism.
- Integrated the user interface, server layer, and persistence layer using Spring IoC, AOP, and Spring MVC integration with OBPM and Hibernate.
- Developed data access classes using JDBC, created SQL queries, and used PL/SQL procedures with an Oracle database.
- Used Log4j and JUnit for debugging, testing, and maintaining the system state, and tested the website with older and newer versions/releases on multiple browsers.
- Implemented test cases for unit testing of modules using JUnit and used Ant for building the project.
- Provided production support for two of the applications involving the Swing and Struts frameworks.
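A minimal sketch of the DAO-layer pattern described above, using the Spring JdbcTemplate. The table and column names are hypothetical placeholders.

```java
import java.util.List;

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;

// Illustrative DAO: wraps a Spring JdbcTemplate to persist and load
// simple account records.
public class AccountDao {
    private final JdbcTemplate jdbc;

    public AccountDao(DataSource dataSource) {
        this.jdbc = new JdbcTemplate(dataSource);
    }

    public void save(String owner, double balance) {
        jdbc.update("INSERT INTO accounts (owner, balance) VALUES (?, ?)",
                owner, balance);
    }

    public List<String> findOwners() {
        return jdbc.queryForList("SELECT owner FROM accounts", String.class);
    }
}
```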
Confidential, Charlotte, NC
- Worked with business analysts and product owners to analyze and understand the requirements and provide estimates.
- Implemented J2EE design patterns such as Singleton, DAO, DTO, and MVC (see the sketch at the end of this section).
- Developed this web application to store all system information in a central location using Spring MVC, JSP, Servlets, and HTML.
- Used the Spring AOP module to handle transaction-management services for objects in any Spring-based application.
- Implemented Spring DI and Spring transactions in the business layer.
- Developed data access components using JDBC, DAOs, and beans for data manipulation.
- Designed and developed database objects such as tables, views, stored procedures, and user functions using PL/SQL and SQL Developer, and used them in web components.
- Used iBATIS to dynamically build SQL queries based on parameters.
- Developed JUnit test cases for unit testing and used Maven as the build and configuration tool.
- Used shell scripting to create jobs that run on a daily basis.
- Debugged the application using Firebug and traversed the nodes of the DOM tree using DOM functions.
- Monitored the error logs using Log4j and fixed the problems.
- Used the Eclipse IDE and deployed the application on the WebLogic server.
- Responsible for configuring and deploying the builds on the WebSphere application server.
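A minimal example of one of the J2EE design patterns cited above (Singleton): a thread-safe, lazily initialized holder. The class name is illustrative.

```java
// Illustrative Singleton: double-checked locking with a volatile field
// guarantees a single, safely published instance across threads.
public final class AppConfig {
    private static volatile AppConfig instance;

    private AppConfig() {
        // Load configuration here.
    }

    public static AppConfig getInstance() {
        if (instance == null) {
            synchronized (AppConfig.class) {
                if (instance == null) {
                    instance = new AppConfig();
                }
            }
        }
        return instance;
    }
}
```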
Confidential, Plano, TX
- Designed and developed Java classes using object-oriented methodology.
- Worked on the system using Java, JSP, and Servlets.
- Developed Java classes and methods for handling data from the database.
- Used JDBC/jConnect for Oracle (see the sketch at the end of this section).
- Created SQL scripts to create/drop database objects such as tables, views, indexes, constraints, sequences, and synonyms.
- Developed SQL*Loader scripts to load the data from external files exported from SQL Server.
- Created complex PL/SQL packages incorporating multi-org functionality with many modules merged together, working with complex queries, joins, and conditions.
- Developed efficient queries and views to the customers' delight.
- Created Servlets and JSPs for the administration module.
- Created Unix shell scripts for the sequential execution of Java jobs, including data extraction, loading, and Oracle stored-procedure execution.
- Developed many ksh scripts for data-file movement and scheduling.
- Attended and conducted user meetings for requirements analysis and project reporting.
- Performed testing and bug fixing and provided production support.
- Collected and understood the user requirements and functional specifications.
- Created components for isolated business logic.
- Deployed the application in a J2EE architecture.
- Used Oracle 8i as the database server.
- Designed EJB 2.0 components with design patterns such as Service Locator and Business Delegate.
- Finalized the design specifications for the new system.
- Involved in the design, development, and maintenance of the application.
- Performed unit, integration, and performance testing, with continuous interaction with the Quality Assurance group.
- Provided on-call support based on the priority of the issues.
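A minimal sketch of the JDBC access pattern described above. The connection URL, credentials, and table are hypothetical placeholders; modern JDBC drivers register themselves, so no explicit Class.forName call is shown.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Illustrative JDBC access to an Oracle database using the thin driver.
public class OracleJdbcSketch {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:oracle:thin:@dbhost:1521:ORCL"; // hypothetical host/SID
        try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT order_id, status FROM orders WHERE status = ?")) {
            ps.setString(1, "OPEN");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("order_id") + " " + rs.getString("status"));
                }
            }
        }
    }
}
```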