- 7+ years of professional experience, including 3+ years in Big Data analytics as a Hadoop Administrator.
- Experience in designing, installing, configuring, supporting and managing Hadoop clusters using Apache, Hortonworks and Cloudera distributions.
- Experience in managing Hadoop infrastructure with Cloudera Manager.
- Strong knowledge of Hadoop cluster capacity planning, performance tuning, monitoring and troubleshooting.
- Experience in designing Big Data solutions for traditional enterprise businesses.
- Good knowledge of network monitoring daemons such as Nagios.
- Experience in backup configuration and recovery from a NameNode failure.
- Excellent command of creating backup, recovery and disaster recovery procedures and implementing backup and recovery strategies for offline and online backups.
- Involved in benchmarking Hadoop/HBase cluster file systems with various batch jobs and workloads.
- Prepared Hadoop clusters for development teams working on POCs.
- Experience in minor and major upgrades of Hadoop and the Hadoop ecosystem.
- Experience in commissioning, decommissioning, balancing and managing nodes, and tuning servers for optimal cluster performance.
- Involved in cluster maintenance, troubleshooting and monitoring, following proper backup and recovery strategies.
- Installed and configured Hadoop ecosystem components such as Hive, Sqoop and Pig.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems/mainframes.
- Optimized performance of HBase/Hive/Pig jobs.
- Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon Web Services (AWS) EC2.
- Upgraded the Hadoop cluster from CDH3 to CDH4.
- Excellent interpersonal, communication, documentation and presentation skills.
Hadoop Framework: HDFS, MapReduce, Pig, Hive, HBase, Sqoop, ZooKeeper, Oozie, Nagios
NoSQL Databases: HBase
Microsoft: MS Office, MS Project, MS Visio, MS Visual Studio 2003
Databases: MySQL, SQL Server, PL/SQL Developer
Operating Systems: Linux, CentOS
Scripting: Shell Scripting, HTML Scripting, Puppet
Cluster Management Tools: Cloudera Manager, HUE
Confidential, Fremont, CA
Sr. Hadoop Administrator
- Handle the installation and configuration of a Hadoop cluster.
- Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
- Handle the data exchange between HDFS and different Web Applications and databases using Flume and Sqoop.
- Monitor the data streaming between web sources and HDFS.
- Closely monitor and analyze MapReduce job executions on the cluster at the task level.
- Advise the development team on efficient use of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks.
- Adjust cluster configuration properties based on the volume of data being processed and the performance of the cluster.
- Set up identity, authentication and authorization.
- Maintain the cluster so that it remains healthy and in optimal working condition.
- Handle upgrades and patch updates.
- Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.
- Balance HDFS manually to decrease network utilization and increase job performance.
- Commission and decommission DataNodes from the cluster in case of problems.
- Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
- Set up and manage High Availability NameNode and NameNode federation using Apache Hadoop 2.0 to avoid single points of failure in large clusters.
- Hold regular discussions with other technical teams regarding upgrades, process changes, special processing and feedback.
Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL and Unix/Linux.
Confidential, New York, NY
- Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, and managing and reviewing data backups and log files.
- Day-to-day responsibilities included resolving developer issues, deploying code from one environment to another, providing access to new users, delivering prompt fixes to reduce impact, and documenting them to prevent future issues.
- Experienced in installing new components and removing them through Ambari.
- Implemented and configured a High Availability Hadoop cluster (quorum-based).
- Installed and configured Hadoop monitoring and administration tools: Nagios and Ganglia.
- Backed up data from the active cluster to a backup cluster using DistCp.
- Periodically reviewed Hadoop-related logs, fixed errors, and prevented further errors by analyzing warnings.
- Hands-on experience with Hadoop ecosystem components such as MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop, Pig and Flume.
- Experience in configuring ZooKeeper to coordinate the servers in clusters and maintain data consistency.
- Experience in using Flume to stream data into HDFS from various sources. Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive and Sqoop, as well as system-specific jobs.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Worked on analyzing data with Hive and Pig.
- Helped in setting up Rack topology in the cluster.
- Upgraded the Hadoop cluster from CDH3 to CDH4.
- Deployed a Hadoop cluster using CDH3 integrated with Nagios and Ganglia.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
- Deployed a Network File System (NFS) for NameNode metadata backup.
- Performed cluster backups using DistCp, Cloudera Manager BDR and parallel ingestion.
- Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.
- Designed the cluster so that only one Secondary NameNode daemon could run at any given time.
- Implemented commissioning and decommissioning of DataNodes, killed unresponsive TaskTrackers and dealt with blacklisted TaskTrackers.
- Transferred data from HDFS to a MySQL database and vice versa using Sqoop.
Environment: Flume, Oozie, Pig, Hive, MapReduce, YARN, and Cloudera Manager
Confidential - Minnesota, MN
- Installed and configured Hadoop; responsible for maintaining the cluster and for managing and reviewing Hadoop log files.
- Loaded data from various data sources into HDFS using Flume.
- Used Cloudera to analyze data stored in HDFS.
- Worked extensively with Hive and Pig.
- Worked on large sets of structured, semi-structured and unstructured data.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Created Hive tables, loaded them with data and wrote Hive queries that run internally as MapReduce jobs.
- Participated in design and development of scalable and custom Hadoop solutions as per dynamic data needs.
- Coordinated with technical team for production deployment of software applications for maintenance.
- Good knowledge of reading data from and writing data to Cassandra.
- Provided operational support services relating to Hadoop infrastructure and application installation. Handled the imports and exports of data onto HDFS using Flume and Sqoop.
- Supported technical team members in management and review of Hadoop log files and data backups.
- Participated in development and execution of system and disaster recovery processes.
- Formulated procedures for installation of Hadoop patches, updates and version upgrades.
- Automated processes for troubleshooting, resolution and tuning of Hadoop clusters.
- Set up automated processes to send alerts in case of predefined system and application level issues.
- Set up automated processes to send notifications in case of any deviations from the predefined resource utilization.
Environment: Red Hat Linux/CentOS 4, 5, 6, Logical Volume Manager, Hadoop, VMware ESX 5.1/5.5, Apache and Tomcat Web Server, Oracle 11, 12, Oracle RAC 12c, HPSM, HPSA.
- Installed and upgraded OE & Red Hat Linux and Solaris 8/9/10 x86 & SPARC on servers such as HP DL380 G3, G4 and G5 and Dell PowerEdge.
- Experience with LDOMs and in creating sparse root and whole root zones; administered the zones for web, application and database servers and worked with SMF on Solaris 10.
- Experience working in AWS Cloud Environment like EC2 & EBS.
- Implemented and administered VMware ESX 3.5 and 4.x to run Windows, CentOS, SUSE and Red Hat Linux servers on development and test servers.
- Installed and configured Apache on Linux and Solaris and configured Virtual hosts and applied SSL certificates.
- Implemented JumpStart on Solaris and Kickstart for Red Hat environments.
- Experience working with HP LVM and Red Hat LVM.
- Experience in implementing P2P and P2V migrations.
- Involved in installing and configuring CentOS and SUSE 11 & 12 servers on HP x86 servers.
- Implemented HA using Red Hat Cluster and VERITAS Cluster Server 5.0 for the WebLogic agent.
- Managed and troubleshot DNS and NIS servers.
- Troubleshot application issues on Apache web servers and database servers running on Linux and Solaris.
- Experience in migrating Oracle and MySQL data using Double-Take products.
- Used Sun Volume Manager for Solaris and LVM on Linux & Solaris to create volumes with layouts like RAID 1, 5, 10, 51.
- Recompiled the Linux kernel to remove services and applications that were not required.
- Performed performance analysis using tools such as prstat, mpstat, iostat, sar, vmstat, truss and DTrace.
- Experience working with LDAP user accounts and configuring LDAP on client machines.
- Upgraded ClearCase from 4.2 to 6.x running on Linux (CentOS & Red Hat).
- Worked on patch management tools like Sun Update Manager.
- Experience supporting middleware servers running Apache, Tomcat and Java applications.
- Worked on day-to-day administration tasks and resolved tickets using Remedy.
- Used HP Service Center and the change management system for ticketing.
- Administered WebLogic 9 and JBoss 4.2.2 servers, including installation and deployments.
- Worked on F5 load balancers to load balance and reverse proxy WebLogic servers.
- Wrote shell scripts to automate regular tasks such as removing core files, backing up important files and transferring files among servers.
Environment: Solaris 8/9/10, Veritas Volume Manager, web servers, LDAP directory, Active Directory, BEA WebLogic servers, SAN Switches, Apache, Tomcat servers, WebSphere application server.
- Handle all user issues and installation requests through the Spiceworks in-house ticketing system.
- Currently maintain OpenStack cluster servers
- Maintain Linux servers running versions of CentOS, Red Hat, and Oracle Linux
- Assist in server maintenance across Windows Server 2003, 2008, and 2012
- Responsible for supporting entire Inter-Tel and Mitel VoIP Phone and PBX systems
- Responsible for user account creation, group policies objects and password maintenance through Active Directory
- Perform email troubleshooting and account creation through Microsoft Exchange server
- Perform VMware host upgrades and virtual machine maintenance/migrations when needed
- Perform laptop and desktop repair and troubleshooting ranging from hard drive repair to memory upgrades and parts replacement
- Report findings for all issue resolutions to management on a daily basis
- Ensure daily backups run efficiently by injecting and ejecting tapes from tape library
- Use Symantec NetBackup Console to make sure tape libraries are running at peak performance
Environment: COBOL, JCL, TSO/ISPF, File Manager, File Master.