- Around 8 years of experience in the IT industry, including 3+ years in Hadoop and 4+ years in Linux.
- Hands-on experience in installing, configuring and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Pig, Flume, Hive, ZooKeeper and Sqoop.
- Accomplished Hadoop administrator on Linux servers, with hands-on experience in designing, building, and maintaining high-performance Cloudera Hadoop environments and installing Hortonworks Hadoop clusters.
- Hands-on experience with productionalizing Hadoop applications (such as administration, configuration management, debugging, and performance tuning).
- Worked in multi-cluster environments and set up the Cloudera Hadoop ecosystem.
- Set up standards and processes for Hadoop-based application design and implementation.
- Implemented Kerberos and Sentry for managing security in Hadoop Ecosystem.
- Implemented high availability for different applications in the Hadoop cluster.
- Practical knowledge of Hadoop ecosystem components such as Pig, Hive, Impala, MapReduce, Hue, Sqoop, Flume, Oozie, ZooKeeper, Cloudera Manager and Ambari.
- Experience with Hadoop security tools such as Sentry, Ranger, Knox and Kerberos.
- Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing.
- Support and utilize multiple Oracle-based applications and tools, including SQL and PL/SQL, TOAD, Oracle views, stored procedures and triggers, as well as the Microsoft Office suite.
- Design, create or modify, and implement documented solutions, as agreed by all business partners, to design and integrate a computing system from start to finish.
- Experience in installation and configuration of HBase, using HBase Master and HBase RegionServers.
- Worked with system engineering teams to plan and deploy hardware and software environments optimized for Hadoop implementations.
- Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon Web Services (AWS).
- Experience in managing and tuning Spark clusters.
- Experience in working with Flume to stream data from multiple sources into HDFS.
- Extensive experience in Software Development Life Cycle (SDLC) which includes Requirement Analysis, System Design, Development, Testing, and Implementation.
- Investigated new Apache technologies to keep up with industry developments.
Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, ZooKeeper, Ranger, Flume, Splunk, Knox, Sentry, Hue, Impala.
NoSQL Databases: HBase
Database: DB2, Oracle, MySQL, SQL Server, Teradata
Scripting Languages: Shell Scripting
IDE: NetBeans, Eclipse, Visual Studio, Microsoft SQL Server, MS Office
Operating Systems: Linux (Red Hat, CentOS, Ubuntu), Windows.
Web Servers: Apache Tomcat, JBoss and Apache HTTP Server
Cluster Management Tools: Cloudera Manager and HDP Ambari
Virtualization technologies: VMware vSphere, Citrix XenServer
- Managed several Hadoop clusters in production, development, and disaster recovery environments.
- Expertise in recommending hardware configurations for Hadoop clusters.
- Troubleshooting issues such as DataNode failures, network failures and missing data blocks.
- Installed and configured Ranger server to enable schema-level security.
- Experience with UNIX and Linux OS.
- Integrated external components like Informatica BDE, Tibco and Tableau with Hadoop.
- Expertise in performance tuning and optimizing Hadoop clusters to achieve high performance.
- Worked on POCs to evaluate and document new and emerging trends in Hadoop and related technologies.
- Managed cluster coordination services through ZooKeeper.
- Provided on-call support and worked with vendors on issues in our existing clusters.
- Commissioning and decommissioning nodes in the cluster environment.
- Set up automated monitoring for the Hadoop cluster using Ganglia, which helped track load distribution and memory usage and indicated when more space was needed.
- Provisioning, installing, configuring, monitoring, and maintaining HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig, Hive and ZooKeeper.
- Installed and configured HBase by installing HBase Master and HBase RegionServers.
- Tuned HBase writes to meet the performance requirements.
- Installed the packages required for data analytics, including Spark, R and Python.
- Installed and configured Revolution R and RStudio Server and integrated with Hadoop Cluster.
- Installed and configured Hive using HiveServer2 and HCatalog.
- Patching and upgrading Hortonworks Ambari clusters.
- Worked on Spark Streaming and Spark SQL.
- Worked on Linux in cloud and virtualized environments on AWS.
- Set up IAM users in AWS with least-privilege access.
- Transferred data from MySQL to the AWS Hadoop environment using Sqoop.
- Created the VPCs for the environment and performed regular administration of the AWS cloud environment.
- Suggested new AWS cloud services based on business requirements.
- Recovering from node failures and troubleshooting common Hadoop cluster issues.
- Hadoop package installation and configuration to support fully-automated deployments.
- Supporting Hadoop developers, coordinating with different teams, and assisting in optimization of MapReduce jobs, Hive scripts, and HBase ingest as required.
Environment: Hortonworks Ambari, HDFS, HBase, MapReduce, Hive, Pig, ZooKeeper, Flume, Oozie, Sqoop, Eclipse, Kerberos, and Ranger.
Confidential, Santa Clara
- Responsible for building a cluster on HDP 2.0 with Hadoop 2.2.0 using Ambari.
- Expertise in analyzing data with Hive, Pig and HBase.
- Worked on POCs to evaluate and document new and emerging trends in Hadoop and related technologies.
- Expertise in Cluster Capacity Planning.
- Implemented Kerberos authentication for Hadoop clusters, with Knox and Ranger for granular access control.
- Expertise in troubleshooting Pig and Hive queries.
- Good experience in troubleshooting production level issues in the cluster and its functionality.
- Expertise in managing and reviewing data backups.
- Frequent monitoring of Hadoop connectivity issues and security issues.
- Reviewing Hadoop log files while troubleshooting errors and failed jobs.
- Backed up data on a regular basis to a remote cluster using DistCp.
- Imported logs from web servers with Flume to ingest the data into HDFS.
- Responsible for creating new users on the Hadoop cluster and providing access to the datasets.
- Expertise in disaster recovery processes as required.
- Implemented the Capacity Scheduler to allocate a fair share of resources to small jobs.
- Monitored data transfers between the MySQL database and HDFS, in both directions, using Sqoop.
- Implemented rack-aware topology on the Hadoop cluster.
- Good experience in planning, installing and configuring Hadoop clusters on Cloudera and Hortonworks distributions.
- Expertise in applying Hadoop updates, patches and version upgrades as required.
- Expertise in optimizing and tuning the Hadoop environment to meet the performance requirements.
Environment: Hadoop HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, AWS, Eclipse, Hortonworks Ambari, Hortonworks
- Worked with the Hadoop production support team to install operating system and Hadoop updates, patches and version upgrades as required.
- Installed and configured Puppet for automation, including the installation and configuration of the Puppet master, agent nodes and an admin control workstation.
- Provided a POC for the test and QA clusters on Cloudera 4.5, using HBase as the NoSQL database.
- Involved in moving all log files generated from various sources to HDFS for further processing.
- Involved in installation of Hive, Pig and Sqoop.
- Provided Level 1, 2 and 3 technical support.
- Familiar with developing Oozie workflows and job controllers for job automation.
- Experience in working on production support and maintenance projects.
- Experience working with multiple relational databases: DB2/400, MySQL, Oracle.
- Copied data from one cluster to another using DistCp.
- Deployed Network File System (NFS) for NameNode metadata backup.
- Responsible for setting up and configuring the MySQL database for the cluster.
- Extracted output files using Sqoop and loaded the extracted log data using Flume.
- Created user accounts and gave users access to the Hadoop cluster.
- Worked on AWS cloud management and Chef automation.
- Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without affecting running jobs or data.
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Hue, Linux, Java, Oozie, HBase, AWS, Cloudera Hadoop.
Linux System Administrator
- Administration, package installation, configuration of Oracle Enterprise Linux 5.x.
- Administration of RHEL, which includes installation, testing, tuning, upgrading and loading patches, troubleshooting both physical and virtual server issues.
- Creating and cloning Linux virtual machines.
- Installing Red Hat Linux using Kickstart and applying security policies to harden servers in line with company policy.
- RPM and YUM package installation, patching and other server management.
- Managing routine system backups, scheduling jobs (including enabling and disabling cron jobs), and enabling system and network logging of servers for maintenance, performance tuning and testing.
- Tech and non-tech refresh of Linux servers, including new hardware, OS upgrades, application installation and testing.
- Set up user and group login IDs, printing parameters, network configuration, passwords, and user and group quotas; resolved permission issues.
- Creating physical volumes, volume groups, and logical volumes.
- Gathering requirements from customers and business partners, and designing, implementing and providing solutions for building the environment.
- Installing, configuring and supporting Apache on Linux production servers.