- 7+ years of experience in IT specializing in Hadoop Administration, AWS infrastructure setup, DevOps and Software testing.
- Hands on experience in installation, configuration, management and support of full stack Hadoop Cluster both on premise and cloud using Hortonworks and Cloudera bundles.
- Solid hands - on experience in the installation and configuration of Hadoop Ecosystem components like HDFS, YARN, Flume, Sqoop, HBase, Hive, Impala, Spark, Kafka and Zookeeper.
- Good understanding on cluster capacity planning and configuring the cluster components based on requirements.
- Exposure to configuration of high availability cluster.
- Collecting and aggregating a large amount of log data using Apache Flume and storing data in HDFS for further analysis.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Experience in data processing using Hive and Impala.
- Knowledge of securing the Hadoop cluster using Kerberos and Sentry.
- Knowledge of data processing using Apache Spark.
- Working knowledge of Kafka for real-time data streaming and event based architecture.
- Experience in setting up automatic failover control and manual failover control using ZooKeeper and quorum journal nodes.
- Involved in Hadoop cluster environment administration that includes adding and removing cluster nodes, cluster monitoring and troubleshooting.
- Backup configuration and recovery from a Name-node failure .
- Experience in performance tuning of Hadoop cluster using various JVM metrics.
- Experience in creating small Hadoop cluster (Sandbox) for POCs performed by development.
- Knowledge of minor and major upgrades of Hadoop and eco-system components.
- Hands-on experience in AWS cloud landscape including EC2, Identity and Access management (IAM), S3, Cloud Watch, VPC, RDS.
- Managed users, groups, roles and policies using IAM.
- Hands on experience in deploying AWS services using Terraform.
- Coordinated with UNIX system admin to add and provision new server for Hadoop cluster.
- Experience in monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network.
- Hands on experience in Linux admin activities on RHEL & Ubuntu.
- Experience in Black Box Testing.
- Expertise in Writing, Reviewing and Executing Test Cases.
- Conceptual Knowledge in Load testing, Performance testing, and Stress Testing.
- Experience in the entire SDLC and specialized in Verification & Validation.
Hadoop Ecosystem: HDFS, YARN, HBase, Hive, Impala, Spark, Kafka, ZooKeeper, Flume, SqoopCluster Management Tools: Cloudera Manager, Ambari
Scripting: Puppet, Shell Scripting, Terraform
OS: Windows, Linux (RHEL7, Ubuntu, CentOS)
Cloud Service: AWS(Amazon Web Services), Terraform
Configuration management: GitHub
Database: MYSQL, MS Access
Confidential, Baltimore, MD
- Provided Administration, management and support for large scale Big Data platforms on Hadoop eco-system.
- Involved in Cluster Capacity planning, deployment and managing Hadoop for our data platform operations.
- Created EC2 instances and implemented large multi node Hadoop clusters in AWS cloud from scratch using automated scripts such as terraform.
- Configured AWS IAM and Security Groups.
- Developed terraform template to deploy Cloudera Manager on AWS.
- Hands on experience in managing and monitoring the Hadoop cluster using Cloudera Manager.
- Experience in decommissioning failed nodes and commissioning new nodes as the cluster grows and to accommodate more data on HDFS.
- Experience in enabling High Availability to avoid any data loss or cluster down time.
- Backup configuration and recovery from a Name-node failure.
- Hands on experience in cluster upgrade and patching without any data loss and with proper backup plans.
- Experience in HDFS data storage and support for running map-reduce jobs.
- Involved in analysing system failures, identifying root causes, and recommended course of actions. Documented the systems processes and procedures for future references.
- Managed and reviewed Hadoop log files as a part of administration for troubleshooting purposes.
- Extensive knowledge on HDFS and YARN implementation in cluster for better performance.
- Installed, configured Hadoop Cluster using automation tool Puppet and created alerts using shell script.
- Installed and configured Sqoop to establish import/export pipeline from various databases such as MySQL.
- Installed and configured Flume to establish import/export data pipeline from various unstructured and semi structured data source.
- Installed and configured HBase for random access and modification of the data in HDFS.
- Installed and configured Hive for faster data processing with SQL-like interface.
- Installed and configured Impala for real time data processing.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
Environment: Hadoop, HDFS, Map-Reduce, YARN, Cloudera Manager, ZooKeeper, Flume, Sqoop, HBase, Hive, Impala, Red Hat Linux, Amazon Web Services (AWS), EC2, Terraform, Java, Puppet.
Confidential, Camp Hill, PA
- Installed and configured Hortonworks stack of Hadoop which included HDFS (Hadoop Distributed File System), YARN, Flume, HBase, ZooKeeper and Sqoop.
- Managed and monitored the cluster using Ambari dashboard.
- Conducted root cause analysis of production problems and data issues and resolved them.
- Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
- Documented and managed failure/recovery (loss of name node, loss of data node, replacement of HW or node)
- Involved in Minor and Major Release work activities.
- Executed tasks for upgrading cluster on the staging platform before doing it on production cluster.
- Installed and configured HBase.
- Proactively monitored Hadoop Namenode, Datanodes and cluster health status.
- Added nodes to the clusters and removed nodes from the cluster by using scripts.
- Periodically managed and reviewed various Hadoop log files.
Environment: Hadoop, YARN, HDFS, HBase, Ambari, Flume, MapReduce, Sqoop, ZooKeeper, Oozie, NoSQL, agile, Java, Oracle, Linux.
Confidential, Houston, TX.
- Installed and configured full stack hadoop cluster from scratch which consists of HDFS, YARN, HBase, ZooKeeper and Sqoop.
- Extensively worked on commissioning and decommissioning of nodes, trouble shooting of failed disks, filesystem integrity checks and cluster data replication.
- Setup HBase cluster which includes master and region server configuration, High availability configuration, data migration and backing up/restoring HBase data.
- Configured YARN Schedulers (Fair and Capacity).
- Setup HDFS Quotas to enforce the fair share of computing resources.
- Implemented migration of data from different databases to Hadoop.
- Worked on day to day issues pertaining to HDFS file system, name node failure, data node failure, resource manager, node manager and YARN.
- Set up alerts and dashboards in cloudera manager for proactive monitoring of key hadoop components.
- Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Collected cluster usage metrics periodically and provided reports to management for future actions.
- Created necessary user accounts and groups and provided users the access to the Hadoop cluster.
- Involved in loading data from UNIX file system to HDFS through cron job.
- Coordinated with QA team during testing phase.
- Provided necessary application support to the development team.
- Monitored workload and job performance of the cluster.
Environment: Cloudera, HDFS, Sqoop, Zookeeper and HBase, Windows 2000/2003, Unix, Linux, Java, HDFS, Map Reduce, HBase, Flume, Sqoop, Shell Scripting
Confidential, Albany, NY
- Analysed the user requirements by interacting with system architect, developers and business users.
- Suggested improvements in test process by gathering and analysing data.
- Reviewed and analysed Detail Designed Specification and Technical Specification documents.
- Analysed the SRS (System Requirement Specifications) and developed Test Suites to cover the overall quality assurance testing.
- Prepared Test Cases with the complete description of requirements, uploaded test cases and report results into Test Director.
- Performed Manual Testing to check flow of the application.
- During testing life cycle, performed different types of testing like System Testing, Integration Testing and Regression Testing.
- Created Test Execution Matrices during the test cycle.
- Identifying bugs and interacted with QA Lead and Developers to resolve the issues of on bugs.
- Participated in QA Team meetings and weekly QA testing reviews.
Environment: Manual Testing, HTML, Java Script, Oracle, Windows and UNIX.
Confidential, Jersey City, NJ
Manual QA Tester
- Identified the test requirements based on application business requirements and blueprints.
- Performed manual testing and maintain documentation on different types of Testing viz., Positive, Negative, Regression, Integration, System, User-acceptance, Performance and Black Box
- Involved in analyzing the applications and development of test cases
- Involved in doing System testing of the entire applications along with team members Applications are tested manually.
- Analyzed and reviewed the software requirements, functional specifications and design documents.
- Proficient in QA processes, test strategies and experience in creating documents like Test plan, Test procedures.
- Developed test scenarios and test procedures based on the test requirements.
- Participated in Preparing Test Plans.
- Wrote SQL queries and stored procedures to validate data.
- Documented errors and implemented their resolutions.
- Created test scripts, executed test scripts.
- Developed Test Objectives and test Procedures.
Environment: Manual testing, Quality Center, Oracle, Visual Basic, Windows.