- Around 9 years of professional Information Technology experience in Hadoop and Java administration activities such as installation, configuration, and maintenance of systems/clusters.
- Hands-on experience with Hadoop clusters on the Hortonworks (HDP) and Cloudera (CDH5) distribution platforms with YARN.
- Skilled in Apache Hadoop, MapReduce, Pig, Impala, Hive, Platfora, HBase, Zookeeper, Sqoop, Flume, Oozie, and Kafka.
- Experience in deploying and managing multi-node development and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, Zookeeper) using Hortonworks Ambari.
- Good experience in creating various database objects such as tables, stored procedures, functions, and triggers using SQL, PL/SQL, and DB2.
- Experience in configuring NameNode High Availability and NameNode Federation, with in-depth knowledge of Zookeeper for cluster coordination services.
- Experience in designing, configuring, and managing backup and disaster recovery for Hadoop data.
- Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
- Extensive knowledge of Tableau in enterprise environments, with Tableau administration experience including technical support, troubleshooting, reporting, and monitoring of system usage.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance.
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes.
- Worked on NoSQL databases including HBase and Cassandra.
- Designed and implemented security for Hadoop clusters with Kerberos secure authentication.
- Hands-on experience with the Nagios and Ganglia tools for cluster monitoring.
- Experience in scheduling Hadoop/Hive/Sqoop/HBase jobs using Oozie.
- Knowledge of data warehousing concepts, the Cognos 8 BI suite, and Business Objects.
- Experience in HDFS data storage and support for running MapReduce jobs.
- Working knowledge of installing and maintaining Cassandra, configuring the cassandra.yaml file per business requirements, and performing reads/writes using Java JDBC connectivity.
- Comprehensive knowledge of Linux kernel tuning and patching, and extensive knowledge of Linux system imaging/mirroring using SystemImager.
- Hands-on experience with Zookeeper and ZKFC in managing and configuring NameNode failover scenarios.
- Team player with good communication and interpersonal skills and a goal-oriented approach to problem solving.
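The node commissioning/decommissioning and balancing workflow mentioned above can be sketched as follows. This is a minimal sketch assuming an HDP-style configuration layout; the hostname, file paths, and the 10% threshold are illustrative placeholders, and it assumes dfs.hosts.exclude in hdfs-site.xml already points at the exclude file.

```shell
# Decommission a DataNode: list it in the HDFS exclude file
# (assumes dfs.hosts.exclude in hdfs-site.xml points at this path)
echo "worker-node-07.example.com" >> /etc/hadoop/conf/dfs.exclude

# Tell the NameNode to re-read the include/exclude lists; the node
# moves to "Decommission In Progress" while its blocks re-replicate
hdfs dfsadmin -refreshNodes

# Watch progress until the node reports "Decommissioned"
hdfs dfsadmin -report | grep -A 3 "worker-node-07"

# After adding or removing nodes, rebalance so that no DataNode's
# utilization deviates more than 10% from the cluster average
hdfs balancer -threshold 10
```

These commands require a live cluster and HDFS superuser privileges, so they are shown as a command sketch rather than a runnable script.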
Big Data Technologies: Hadoop, HDFS, MapReduce, YARN, Hive, Pig, Sqoop, HBase, Flume, Oozie, Spark, Zookeeper.
Hadoop Platforms: Hortonworks, Cloudera, and Apache Hadoop.
- As a Hadoop admin, worked on a huge cluster, maintaining nodes in a High Availability environment using Hortonworks Ambari and Cloudera Manager.
- Involved in Hadoop cluster installation, configuration, and maintenance; cluster monitoring and troubleshooting; transforming data from RDBMS to HDFS; and following proper backup and recovery strategies.
- Analyzed data by running Hive queries and Pig scripts to understand user behavior, such as call frequency and top calling customers, and designed and implemented a service layer over the HBase database.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
- Provided Business Intelligence support using Tableau, implementing effective business dashboards and visualizations of data.
- Configured, implemented, and supported High Availability (replication) with load balancing (sharding) for a MongoDB cluster holding terabytes of data.
- Monitored the Hadoop cluster and troubleshot Hive, Datameer, Platfora, and Flume.
- Experience with securing Hadoop clusters, including Kerberos KDC installation, OpenLDAP installation, and data transport encryption with TLS.
- Implemented a distributed messaging queue to integrate with Cassandra using Apache Kafka and Zookeeper.
- Used Cassandra in multiple virtual and physical data centers to ensure the system was highly redundant and scalable.
- Exported the analyzed data from HDFS to MySQL using Sqoop for visualization and to generate reports for the BI team.
- Imported data from various sources, such as Oracle and Comptel servers, into HDFS using Sqoop and MapReduce.
- Designed and developed scalable and custom Hadoop solutions as per dynamic data needs and coordinated with technical team for production deployment of software applications for maintenance.
- Involved in loading data from the UNIX file system to HDFS.
- Processed real-time streaming data using Spark with Kafka.
- Worked with ETL team to load data into Data Warehouse/Data Marts using Informatica.
- Experience in supporting data analysts in running Pig and Hive queries.
- Experience in administering Linux systems to deploy Hadoop clusters and monitoring them using Nagios and Ganglia; reviewed log files and resolved errors.
- Involved in importing real-time data into Hadoop using Kafka. Expert in setting up SSH, SCP, and SFTP connectivity between UNIX hosts.
- Developed custom Process chains to support master data and transaction data loads from BI to BPC.
- Involved in various POC activities using technologies such as MapReduce, Hive, Pig, and Oozie.
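The Sqoop transfers described above (Oracle into HDFS, and results back out to MySQL for the BI team) typically take the following shape. This is a hedged sketch: the connection strings, usernames, table names, and directory paths are illustrative placeholders, not the actual systems.

```shell
# Import an Oracle table into HDFS
# (host, service name, user, and table are placeholders)
sqoop import \
  --connect jdbc:oracle:thin:@//oracle-host:1521/ORCL \
  --username etl_user -P \
  --table CDR_EVENTS \
  --target-dir /data/raw/cdr_events \
  --num-mappers 4

# Export aggregated results back to MySQL for BI reporting
sqoop export \
  --connect jdbc:mysql://mysql-host:3306/reporting \
  --username etl_user -P \
  --table call_summary \
  --export-dir /data/curated/call_summary \
  --input-fields-terminated-by '\t'
```

Both commands run against live database and cluster endpoints, so they are shown as a command sketch.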
Environment: Hadoop, HDFS, Hive, Sqoop, Flume, Hortonworks, Cassandra, Java, Impala, Talend, Tableau, Kafka, Storm, Zookeeper, HBase, YARN, Oracle 9i/10g/11 RAC with Solaris/RedHat, MongoDB, Kerberos, SQL*Plus, PHP, Shell Scripting, ETL/BI architectures, SQL, RedHat/SUSE Linux, EM Cloud Control.
Confidential, Warren, NJ
- Involved in the design and planning phases of the Hadoop cluster.
- Responsible for regular health checks of the Hadoop cluster using custom scripts.
- Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.
- Provided Hadoop, OS, and Hardware optimizations.
- Installed and configured Cloudera Manager for easy management of the existing Hadoop cluster.
- Performed monthly Linux server maintenance, including controlled shutdowns of essential Hadoop NameNode and DataNode services.
- Collaborated with the infrastructure, network, database, application, and BI teams to ensure data quality and availability.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Experienced in managing and reviewing the Hadoop log files.
- Balanced the Hadoop cluster using the balancer utility to spread data evenly across the cluster.
- Implemented data ingestion and processing with Pig and Hive in the production environment.
- Performed routine cluster maintenance every weekend to make required configuration changes, installations, etc.
- Expertise in Linux Enterprise using HP ProLiant Servers and Virtual Connect Technology.
- Implemented the Kerberos security authentication protocol for the existing cluster.
- Designed and created ETL jobs in Talend to load huge volumes of data into Cassandra, Hadoop ecosystems, and relational databases.
- Worked extensively with Sqoop to import metadata from Oracle. Used Sqoop to import data from SQL Server to Cassandra.
- Implemented Flume, Spark, and Spark Streaming frameworks for real-time data processing. Developed analytical components using Scala, Spark, and Spark Streaming. Implemented proofs of concept on the Hadoop and Spark stack and different big data analytics tools, using Spark SQL as an alternative to Impala.
- Implemented Apache Spark data processing project to handle data from RDBMS and streaming sources.
- Monitored and debugged Hadoop jobs/applications running in production.
- Provided user support and application support on the Hadoop infrastructure.
- Created Kerberos keytabs for ETL application use cases before onboarding to Hadoop.
- Responsible for adding users to the Hadoop cluster.
- Evaluated and compared different tools for test data management with Hadoop.
- Helped and directed the testing team to get up to speed on Hadoop application testing.
- Installed a 20-node UAT Hadoop cluster.
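The keytab-creation step for onboarding ETL applications, mentioned above, typically looks like the following. The realm, principal, host, and file path are illustrative placeholders, assuming an MIT Kerberos KDC.

```shell
# Create a service principal for the ETL application
# (EXAMPLE.COM realm and principal name are placeholders)
kadmin -p admin/admin -q "addprinc -randkey etl_app/gateway-host@EXAMPLE.COM"

# Export its key into a keytab file the application can use
kadmin -p admin/admin -q "ktadd -k /etc/security/keytabs/etl_app.keytab etl_app/gateway-host@EXAMPLE.COM"

# Lock down the keytab and verify its contents
chown etl_user /etc/security/keytabs/etl_app.keytab
chmod 400 /etc/security/keytabs/etl_app.keytab
klist -kt /etc/security/keytabs/etl_app.keytab

# The application then authenticates non-interactively before
# submitting jobs to the secured cluster:
kinit -kt /etc/security/keytabs/etl_app.keytab etl_app/gateway-host@EXAMPLE.COM
```

These commands require a reachable KDC and admin credentials, so they are shown as a command sketch.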
Environment: Cloudera, Java, RedHat Linux, HDFS, Mahout, Map-Reduce, Cassandra, Hive, Pig, Sqoop, Spark, Scala, Flume, Zookeeper, Oozie, DB2, HBase and Pentaho.
Confidential, Chicago, IL
Hadoop Administrator
- Supported technical team members for automation, installation and configuration tasks.
- Wrote shell scripts to monitor the health of Hadoop component services and respond accordingly to any warning or failure conditions.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Stored unstructured data in semi-structured form on HDFS using HBase.
- Followed change management and incident management processes per company standards.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Installed, configured and maintained Hadoop cluster.
- Implemented Oozie workflows for Map Reduce, Hive and Sqoop actions.
- Involved in data migration from Oracle database to MongoDB.
- Created HBase tables to store variable data formats of data coming from different applications.
- Experience in managing and reviewing Hadoop log files.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
- Performed planned and break-fix changes in the infrastructure.
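The Hive partitioning and bucketing work described above can be sketched as follows. The database, table, and column names are illustrative placeholders; the two SET statements enable dynamic partitioning, which is off (or strict) by default.

```shell
# Create a partitioned, bucketed Hive table and load it via
# dynamic partitioning (table and column names are placeholders)
hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

CREATE TABLE IF NOT EXISTS calls_by_day (
  caller_id STRING,
  duration_sec INT
)
PARTITIONED BY (call_date STRING)
CLUSTERED BY (caller_id) INTO 16 BUCKETS
STORED AS ORC;

-- Each distinct call_date in the source becomes its own partition;
-- rows are hashed on caller_id into 16 bucket files per partition
INSERT OVERWRITE TABLE calls_by_day PARTITION (call_date)
SELECT caller_id, duration_sec, call_date FROM calls_raw;
"
```

Partitioning prunes whole directories at query time, while bucketing gives evenly sized files that speed up joins and sampling on the bucketed column. The statement requires a running HiveServer/metastore, so it is shown as a sketch.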
Environment: CentOS, CDH 5.4.5, Oracle, MS SQL, Zookeeper 3.4.6, Oozie 4.1.0, MapReduce, YARN 2.6.1, Nagios, REST APIs, Amazon Web Services, HDFS, Sqoop 1.4.6, Hive 1.2.1, Pig 0.15.0.
Java/ Hadoop Administrator
- Involved in requirements analysis and the design of an object-oriented domain model.
- Involved in detailed documentation and wrote functional specifications for the module.
- Involved in development of the application with Java and J2EE technologies.
- Developed and maintained an elaborate services-based architecture utilizing open-source technologies such as Hibernate ORM and the Spring Framework.
- Developed server-side services using Java multithreading, Struts MVC, EJB, Spring, and Web Services (SOAP, WSDL, and Axis).
- Responsible for developing the DAO layer using Spring MVC and configuration XMLs for Hibernate, and for managing CRUD operations (insert, update, and delete).
- Designed, developed, and implemented JSPs in the presentation layer for the submission, application, and reference implementations.
- Deployed Web, presentation and business components on Apache Tomcat Application Server.
- Developed PL/SQL procedures for different use case scenarios.
- Involved in post-production support and testing; used JUnit for unit testing of the module.
Environment: Java/J2EE, JSP, XML, Spring Framework, Hibernate, Eclipse (IDE), JavaScript, Ant, SQL, PL/SQL, Oracle, Windows, UNIX, SOAP, Jasper Reports.
Senior Web application Developer
- Involved in writing programs for XA transaction management on multiple databases of the application.
- Developed Java programs, JSP pages, and servlets using the Jakarta Struts framework.
- Involved in creating database tables and writing complex T-SQL queries and stored procedures in SQL Server.
- Used EJBs in the application and developed session beans to implement business logic at the middle tier.
- Actively involved in writing SQL using SQL Query Builder.
- Involved in coordinating the on-shore/Off-shore development and mentoring the new team members.
- Extensively used the Ant tool to build and configure J2EE applications and used Log4j for logging in the application.
- Used JAXB to read and manipulate XML properties.
- Used JNI for calling the libraries and other implemented functions in C language.
- Used Prototype, MooTools, and script.aculo.us for a fluid user interface.
- Involved in fixing defects and unit testing with test cases using JUnit.
Environment: Java, EJB, Servlets, XSLT, CVS, J2EE, AJAX, Struts, Hibernate, ANT, Tomcat, JMS, UML, Log4J, Oracle 10g, Eclipse, Solaris, JUnit and Windows 7/XP, Maven.