Hadoop Administrator Resume
New York, NY
SUMMARY
- 25+ years of IT experience, including 2 years with the Hadoop ecosystem, installing and configuring Hadoop ecosystem components in existing clusters.
- Experience in Hadoop administration (HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, and HBase) and NoSQL administration.
- Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop cluster using Nagios and Ganglia.
- Experience in installing Hadoop clusters using different Apache Hadoop distributions: Cloudera and Hortonworks.
- Good experience in understanding clients' big data business requirements and transforming them into Hadoop-centric technologies.
- Experience in analyzing clients' existing Hadoop infrastructure, identifying performance bottlenecks, and tuning performance accordingly.
- Installed, configured, and maintained HBase.
- Worked with Sqoop to import and export data between HDFS/Hive and databases such as MySQL and Oracle (see the import sketch after this summary).
- Defining job flows in Hadoop environment using tools like Oozie for data scrubbing and processing.
- Experience in configuring Zookeeper to provide Cluster coordination services.
- Loading logs from multiple sources directly into HDFS using Flume (a minimal agent configuration follows this summary).
- Experience in benchmarking, performing backup and recovery of Namenode metadata, and data residing in the cluster.
- Familiar with commissioning and decommissioning of nodes on a Hadoop cluster.
- Adept at configuring NameNode high availability.
- Worked on disaster recovery with Hadoop clusters.
- Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
- Experience in deploying and managing multi-node development, testing, and production clusters.
- Experience in understanding Hadoop security requirements and integrating with Kerberos authentication infrastructure: KDC server setup, and creating and managing the realm domain.
- Worked on setting up NameNode high availability for a major production cluster and designed automatic failover control using ZooKeeper and Quorum Journal Nodes (see the configuration sketch after this summary).
- Installation, patching, upgrading, tuning, configuring, and troubleshooting of Linux-based operating systems (Red Hat, Ubuntu, and CentOS) and virtualization across a large set of servers.
- Experience in creating, building, and managing public and private cloud infrastructure.
- Experience in deploying Hadoop clusters using Puppet automation in cloud environments.
- Ability to quickly master new concepts and technologies.
- Well experienced in building DHCP, PXE (with Kickstart), DNS, and NFS servers, and in using them to build out infrastructure in a Linux environment.
- Experienced in Linux Administration tasks like IP Management (IP Addressing, Subnetting, Ethernet Bonding and Static IP).
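A minimal sketch of the Sqoop imports referenced above, assuming a MySQL source; the host, database, table, and credentials are placeholders:

```bash
# Import a MySQL table into HDFS, then a second import directly into Hive.
# Connection details below are illustrative only.
sqoop import \
  --connect jdbc:mysql://db-host:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4

sqoop import \
  --connect jdbc:mysql://db-host:3306/sales \
  --username etl_user -P \
  --table orders \
  --hive-import --hive-table analytics.orders
```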
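A minimal Flume agent configuration for tailing an application log into HDFS, along the lines described above; the agent name, paths, and channel size are assumptions:

```bash
# Write a small Flume agent config (exec source -> memory channel -> HDFS sink),
# then start the agent. All names and paths are placeholders.
cat > /etc/flume/conf/agent1.conf <<'EOF'
agent1.sources  = src1
agent1.channels = ch1
agent1.sinks    = sink1

agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app/app.log
agent1.sources.src1.channels = ch1

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = /data/logs/app/%Y-%m-%d
agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.channel = ch1
EOF

flume-ng agent --name agent1 --conf /etc/flume/conf --conf-file /etc/flume/conf/agent1.conf
```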
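And a sketch of the core NameNode HA settings with Quorum Journal Nodes and automatic failover; the nameservice, host names, and ports are placeholders, and a full setup also needs per-NameNode RPC/HTTP addresses and fencing configured:

```bash
# Configuration fragment to merge into hdfs-site.xml (shown as a fragment,
# not a complete document), plus the ZKFC bootstrap commands.
cat > /tmp/hdfs-ha-fragment.xml <<'EOF'
<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
EOF

hdfs zkfc -formatZK                  # initialize the failover znode in ZooKeeper
hdfs haadmin -getServiceState nn1    # verify which NameNode is active
```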
TECHNICAL SKILLS
Operating Systems: Red Hat, CentOS, Ubuntu, Solaris, Windows ...
Hardware: Sun Ultra Enterprise Servers (E3500, E4500), SPARC server 1000, SPARC server 20 Enterprise Servers
Languages: C++, Core Java and JDK 7/8
Web Languages: HTML, CSS, and XML
Hadoop Distributions: Cloudera and Hortonworks. Ecosystem: Hadoop MapReduce, YARN, HDFS, Sqoop, Hive, Pig, HBase, Flume, and Oozie.
Tools: JIRA, PuTTY, WinSCP, FileZilla.
Databases: HBase; RDBMS: Sybase, Oracle 7.x/8.0/9i, MySQL; SQL.
Protocols: TCP/IP, FTP, SSH, SFTP, SCP, SSL, ARP, DHCP, TFTP, RARP, PPP, and POP3
Shell Scripting: Bash
Cloud Technologies: AWS
PROFESSIONAL EXPERIENCE
Hadoop Administrator
Confidential, New York, NY
Responsibilities:
- Installed and configured a multi-node, fully distributed Hadoop cluster.
- Involved in Hadoop cluster environment administration, including commissioning and decommissioning nodes, cluster capacity planning, balancing, performance tuning, cluster monitoring, and troubleshooting.
- Configured the Fair Scheduler to provide service-level agreements for multiple users of a cluster (see the allocation-file sketch at the end of this role).
- Implemented NameNode HA services to make the Hadoop services highly available.
- Worked on installing Hadoop ecosystem components such as Sqoop, Pig, Hive, Oozie, and HCatalog.
- Involved in HDFS maintenance and administering it through the Hadoop Java API.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Proficient in writing Flume and Hive scripts to extract, transform, and load data into the database.
- Responsible for maintaining, managing, and upgrading Hadoop cluster connectivity and security.
- Worked on Hadoop CDH upgrade from CDH4.x to CDH5.x.
- Analyzed logs and resolved issues within the cluster and ecosystem components.
- Resolved configuration and troubleshooting issues with Apache add-on tools.
- Worked on tuning the cluster to meet performance standards and implemented backup and recovery for the Hadoop cluster.
- Deployed the Hadoop cluster with Kerberos to provide secure access to the cluster (see the principal and keytab sketch at the end of this role).
- Implemented Sentry enterprise security for fine-grained authorization to data; Sentry provides advanced authorization control to enable multi-user applications and data sets.
- As part of a POC, used Amazon S3 as the underlying file system for Hadoop and ran Elastic MapReduce jobs on the data in S3 buckets.
- Provided 24x7 on-call support during weekends on a roster basis.
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, DB2, Oracle, XML, Cloudera Manager
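A sketch of the kind of Fair Scheduler allocation file mentioned above; queue names, weights, and resource figures are illustrative, and the file path must match the yarn.scheduler.fair.allocation.file setting:

```bash
# Define two queues with different weights and guarantees for the Fair Scheduler.
# Queue names and numbers below are placeholders.
cat > /etc/hadoop/conf/fair-scheduler.xml <<'EOF'
<?xml version="1.0"?>
<allocations>
  <queue name="etl">
    <weight>2.0</weight>
    <minResources>10000 mb,10 vcores</minResources>
  </queue>
  <queue name="adhoc">
    <weight>1.0</weight>
    <maxRunningApps>20</maxRunningApps>
  </queue>
</allocations>
EOF
```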
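And a sketch of provisioning Kerberos credentials for a secured cluster as described above; the realm, admin principal, host name, and keytab path are placeholders:

```bash
# Create a service principal for a worker host and export it to a keytab.
kadmin -p admin/admin -q "addprinc -randkey hdfs/worker01.example.com@EXAMPLE.COM"
kadmin -p admin/admin -q "xst -k /etc/security/keytabs/hdfs.keytab hdfs/worker01.example.com@EXAMPLE.COM"
klist -kt /etc/security/keytabs/hdfs.keytab   # verify the keytab entries
```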
Hadoop Administrator
Confidential, Jersey City, NJ
Responsibilities:
- The client used the Hortonworks distribution of Hadoop to store and process the huge volumes of data generated by different enterprises.
- Experience in installing, configuring, and monitoring HDP stacks 2.1, 2.2, and 2.3.
- Installed and configured Hadoop ecosystem components like Hive, Pig, Sqoop, Flume, Oozie, and HBase.
- Experience in cluster planning, performance tuning, monitoring, and troubleshooting of the Hadoop cluster.
- Experience in cluster planning phases: planning the cluster, preparing the nodes, pre-installation and testing.
- Responsible for cluster HDFS maintenance tasks: commissioning and decommissioning nodes, balancing the cluster, and rectifying failed disks (see the decommissioning sketch at the end of this role).
- Responsible for cluster MapReduce maintenance tasks: commissioning and decommissioning TaskTrackers and managing MapReduce jobs.
- Experience in using Sqoop to import and export data between external databases and the Hadoop cluster.
- Experience in using Flume to ingest log files into the Hadoop cluster.
- Experience in configuring MySQL to store the Hive metadata (see the hive-site.xml fragment at the end of this role).
- Experience in administration of NoSQL databases, including HBase and MongoDB.
- Communicating with the development teams and attending daily meetings.
- Addressing and troubleshooting issues on a daily basis.
- Experience in setting up Kerberos in the Hortonworks cluster.
- Working with data delivery teams to set up new Hadoop users.
- This job includes setting up Linux users, setting up Kerberos principals, and testing HDFS and Hive access.
- Cluster maintenance as well as creation and removal of nodes.
- Monitor Hadoop cluster connectivity and security.
- Manage and review Hadoop log files.
- File system management and monitoring.
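A sketch of the decommissioning and rebalancing flow mentioned above; the excludes-file path depends on the cluster's dfs.hosts.exclude setting, and the host name is a placeholder:

```bash
# Decommission a worker: add it to the excludes file, refresh, then rebalance.
echo "worker07.example.com" >> /etc/hadoop/conf/dfs.exclude
hdfs dfsadmin -refreshNodes                  # NameNode starts re-replicating blocks off the host
hdfs dfsadmin -report | grep -A 2 worker07   # watch for the "Decommissioned" state
hdfs balancer -threshold 10                  # even out block distribution afterwards
```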
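And the hive-site.xml fragment for backing the Hive metastore with MySQL; the host, database, and credentials are placeholders:

```bash
# Metastore connection properties (shown as a fragment to merge into hive-site.xml).
cat > /tmp/hive-metastore-fragment.xml <<'EOF'
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-db:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive_password</value>
</property>
EOF
```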
Hadoop Engineer
Confidential
Responsibilities:
- Involved in creating design documents by gathering information from business teams.
- Developed and integrated various software components like Hive, Pig, Sqoop, and Oozie, as requested by business users as a proof of concept.
- Developed HQL and Pig Latin scripts for the Fraud and Risk teams.
- Processed XML files using Pig.
- Wrote MapReduce programs for cleansing data in HDFS.
- Implemented Netezza queries in HDFS with Hive and Pig.
- Developed MapReduce code for calculating histograms.
- Development experience with Datameer.
- Implemented custom Python UDFs in Hive and Pig.
- Involved in migrating the code to production and supporting bug fixes.
- Developed Python scripts for streaming on Hadoop and Pig scripts for various ETL activities (see the streaming sketch at the end of this role).
- Developed scripts to automate application deployments and Hadoop cluster performance tuning and monitoring.
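A sketch of running a Python job via Hadoop Streaming as mentioned above; the jar path varies by distribution, and mapper.py/reducer.py stand in for scripts that read stdin and write stdout:

```bash
# Ship the Python scripts with the job and run them as mapper and reducer.
# The streaming jar location and the HDFS paths are placeholders.
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
  -files mapper.py,reducer.py \
  -mapper mapper.py \
  -reducer reducer.py \
  -input /data/raw/events \
  -output /data/clean/events
```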
Linux Administrator
Confidential
Responsibilities:
- Installation, configuration, upgrading, and administration of Windows, Sun Solaris, and Red Hat Linux.
- Linux and Solaris installation, administration and maintenance.
- User account management: managing passwords, setting up quotas, and providing support.
- Worked on Linux Kick-start OS integration, DDNS, DHCP, SMTP, Samba, NFS, FTP, SSH, and LDAP integration.
- Network traffic control, IPSec, QoS, VLAN, proxy, and RADIUS integration on Cisco hardware via Red Hat Linux software.
- Installation and configuration of MySQL on Windows Server nodes.
- Responsible for configuring and managing Squid server in Linux and Windows.
- Configuration and Administration of NIS environment.
- Involved in Installation and configuration of NFS.
- Package and Patch management on Linux servers.
- Worked with Logical Volume Manager to create file systems as per user and database requirements (see the LVM sketch at the end of this role).
- Data migration at Host level using Red Hat LVM, Solaris LVM, and Veritas Volume Manager.
- Expertise in establishing and documenting procedures to ensure data integrity, including system failover and backup/recovery, on the AIX operating system.
- Managed 100+ UNIX servers running RHEL and HP-UX on Oracle, HP, and Dell servers, including blade centers.
- Solaris disk mirroring (SVM), zone installation and configuration.
- Escalating issues accordingly, managing team efficiently to achieve desired goals.
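A sketch of the LVM workflow referenced above; the device and volume names are placeholders:

```bash
# Carve a new filesystem out of a fresh disk with LVM and mount it persistently.
pvcreate /dev/sdb                          # initialize the disk for LVM
vgcreate datavg /dev/sdb                   # create a volume group on it
lvcreate -n applv -L 50G datavg            # carve out a 50 GB logical volume
mkfs.ext4 /dev/datavg/applv                # build the filesystem
mkdir -p /app && mount /dev/datavg/applv /app
echo "/dev/datavg/applv /app ext4 defaults 0 2" >> /etc/fstab   # persist across reboots
```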