Hadoop Administrator Resume
Manhattan, NY
SUMMARY
- 6+ years of IT experience across multiple domains, including 2+ years as a Hadoop Administrator and 2 years as a UNIX/Linux Administrator.
- Balanced the cluster after adding/removing nodes or major data clean-up.
- Configured property files such as core-site.xml, hdfs-site.xml, mapred-site.xml and hadoop-env.sh based on job requirements.
- Installed, Configured and Maintained Apache Hadoop clusters for application development and Hadoop tools like HBase, Zookeeper and Sqoop.
- Hands-on experience on major components in Hadoop Ecosystem including HDFS, Yarn, Hive, Flume, Zookeeper, Oozie and other ecosystem Products.
- Experience implementing NameNode High Availability and Hadoop cluster capacity planning for adding and removing nodes.
- Added security to the cluster using Kerberos.
- Experience monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network
- As an administrator, handled cluster maintenance, troubleshooting and monitoring, and followed proper backup and recovery strategies.
- Hands-on experience analyzing log files for Hadoop and ecosystem services to find root causes.
- Expertise in installing, configuring and managing Red Hat Enterprise Linux 7.
- Experience with multiple Hadoop distributions: Apache, Cloudera, MapR and Hortonworks.
- Experience in securing Hadoop clusters using Kerberos and Sentry.
- Experience with distributed computation tools such as Apache Spark and Hadoop.
- Experience as Deployment Engineer and System Administrator on Linux (Centos, Ubuntu, Red Hat).
- Well versed in installing, configuring and tuning Hadoop distributions: Cloudera, Hortonworks on Linux systems.
- Experience rebalancing HDFS clusters.
- Cluster management using Cloudera Manager.
- Commissioning and decommissioning nodes on a running Hadoop cluster.
- Experience installing, configuring and optimizing Cloudera Hadoop versions CDH 3, CDH 4.x and CDH 5.x in a multi-cluster environment.
- Expertise in Hadoop Cluster capacity planning, performance tuning, cluster Monitoring and Troubleshooting.
- Hands on experience in installation, configuration, supporting and managing Hadoop Clusters using Cloudera.
- Hands on experience on backup configuration and Recovery from a Name Node failure.
- Experience implementing big data projects using Cloudera; automated system tasks using Puppet.
- Extensive experience in installation, configuration, management and deployment of Big Data components and the underlying infrastructure of Hadoop Cluster.
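The node add/remove and rebalancing work above is typically driven by a host exclude file referenced from hdfs-site.xml; this is a minimal illustrative fragment (the file path is a placeholder, not a value from an actual engagement):

```xml
<!-- hdfs-site.xml: point the NameNode at a host exclude file
     (the path below is an illustrative placeholder) -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```

After adding hostnames to the exclude file, `hdfs dfsadmin -refreshNodes` starts decommissioning, and `hdfs balancer -threshold 10` redistributes blocks once nodes have been added or removed.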
TECHNICAL SKILLS
Hadoop: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, Zookeeper, HBase, Spark
System software: Linux, Windows XP, Server 2003, Server 2008
Network admin: TCP/IP fundamentals, wireless networks, LAN and WAN
Languages: SQL, J2SE, J2EE, HQL, Python, UNIX shell scripting
Web Technologies: XML, HTML, DHTML.
Databases: Oracle 11g/10g, DB2, MS-SQL Server, MySQL, MS- Access, Hadoop, HBase, MongoDB, Cassandra
Methodologies: Agile, V-model, Waterfall model
Web Tools: HTML, XML, JDBC, JSON, JSP, and Tableau
PROFESSIONAL EXPERIENCE
Confidential, Manhattan, NY
Hadoop Administrator
Responsibilities:
- Installed, configured and maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, HBase, Zookeeper and Sqoop.
- Responsible for loading the customer's data and event logs from Oracle database, Teradata into HDFS using Sqoop
- Monitor Hadoop cluster connectivity and security
- Manage and review Hadoop log files
- Installing, Upgrading and Managing Hadoop Cluster on Cloudera distribution
- Worked with Kafka on a proof of concept for carrying out log processing on a distributed system.
- Configured NameNode high availability and NameNode federation.
- Use of Sqoop to import and export data from HDFS to Relational database and vice-versa.
- Performed data analysis by running Hive queries.
- Involved in installing Cloudera Manager, Hadoop, Zookeeper, HBase, Hive, Pig, etc.
- Involved in configuring Quorum-based HA for the NameNode, making the cluster more resilient.
- Extensively involved in Installation and configuration of Cloudera distribution Hadoop NameNode, Secondary NameNode, Resource Manager, Node Manager and DataNodes.
- Moved Relational Database data using Sqoop into Hive Dynamic partition tables using staging tables.
- Configuring Sqoop and Exporting/Importing data into HDFS
- Collected the logs data from web servers and integrated into HDFS using Flume.
- Worked on installing cluster, commissioning & decommissioning of DataNodes, NameNode
- Worked on Hue interface for querying the data
- Experienced in loading data from UNIX local file system to HDFS.
- Managing and scheduling Jobs on a Hadoop cluster using Oozie.
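The Sqoop imports from Oracle described above generally follow the shape below; the connection string, credentials and table names are illustrative placeholders, not values from an actual engagement:

```shell
# Hypothetical Sqoop import from Oracle into a Hive table
# (all identifiers below are illustrative placeholders)
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_EVENTS \
  --hive-import --hive-table staging.customer_events \
  --num-mappers 4
```

The reverse direction (`sqoop export`) pushes HDFS data back into the relational database in the same style.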
Environment: Hadoop, HDFS, MapReduce, RedHat Linux, Cloudera Manager
Confidential, Louisville, KY
Hadoop Administrator
Responsibilities:
- Configured various property files like core-site.xml, hdfs-site.xml, mapred-site.xml based upon the job requirement
- Imported and exported data into HDFS using Sqoop.
- Involved in start to end process of Hadoop cluster setup including installation, configuration and monitoring the Hadoop Cluster
- Monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures
- Installed various Hadoop Ecosystems and Hadoop Daemons.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Involved in Installing and configuring Kerberos for the authentication of users and Hadoop daemons.
- Administered Cluster maintenance, commissioning and decommissioning Data nodes, Cluster Monitoring, Troubleshooting
- Performed Adding/removing new nodes to an existing Hadoop cluster
- Managed and reviewed Hadoop Log files as a part of administration for troubleshooting purposes. Communicated and escalated issues appropriately.
- Applied standard backup policies to ensure high availability of the cluster.
- Implemented backup configurations and recovery from NameNode failure.
- Integrated Hive and HBase to perform analysis on data
- Involved in analyzing system failures, identifying root causes and recommending courses of action. Documented system processes and procedures for future reference.
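Enabling Kerberos as described above hinges on two core-site.xml properties; this is a minimal fragment only, omitting the per-daemon keytab and principal settings that a real deployment also needs:

```xml
<!-- core-site.xml: switch authentication from 'simple' to Kerberos -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
```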
Environment: Hadoop, HDFS, Zookeeper, Map Reduce, YARN, HBase, Hive, Sqoop, Oozie, Linux- CentOS, Ubuntu, Red Hat, Big Data Cloudera CDH, Apache Hadoop
Confidential, St. Louis, MO
Hadoop Administrator
Responsibilities:
- Monitor the data streaming between web sources and HDFS.
- Closely monitor and analyze MapReduce job executions on the cluster at the task level.
- Provide inputs to development on efficient utilization of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks.
- Handle the installation and configuration of a Hadoop cluster.
- Balancing HDFS manually to decrease network utilization and increase job performance.
- Set up and manage NameNode High Availability and NameNode federation using Apache Hadoop 2.0 to avoid single points of failure in large clusters.
- Hold regular discussions with other technical teams regarding upgrades, process changes, special processing and feedback.
- Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
- Handle the data exchange between HDFS and different Web Applications and databases using Flume and Sqoop.
- Adjust the configuration properties of the cluster based on the volume of data being processed and cluster performance.
- Setting up Identity, Authentication, and Authorization.
- Commission and decommission DataNodes from the cluster in case of problems.
- Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
- Maintain the cluster in a healthy and optimal working condition.
- Handle the upgrades and Patch updates.
- Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.
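The automated log analysis above can be sketched as a small shell job; the log format and the error markers grepped for are illustrative assumptions, not values from an actual cluster:

```shell
# Minimal sketch of a log-scan alert: count error-level lines in a service
# log so a wrapper can page the on-call group when the count is non-zero.
# A sample log is generated here so the sketch is self-contained.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
2015-03-01 10:00:01 INFO  org.apache.hadoop.hdfs.server.namenode.NameNode: startup
2015-03-01 10:05:12 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: disk failure
2015-03-01 10:06:00 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: out of memory
EOF

# Count lines whose log level is ERROR or FATAL
errors=$(grep -cE ' (ERROR|FATAL) ' "$logfile")
echo "found $errors error-level lines"
rm -f "$logfile"
```

In a real deployment this would run from cron against the actual Hadoop log directories and send alerts to the appropriate groups instead of printing.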
Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Cloudera, Flume, HBase, Zookeeper, CDH3, Oracle, NoSQL and Unix/Linux
Confidential
Linux System Administrator
Responsibilities:
- User account creation and account maintenance both local and centralized (LDAP - Sun Identity Manager).
- Performed all duties related to system administration like troubleshooting, providing sudo access, modifying DNS entries, NFS, backup recovery (scripts).
- Worked on Change Requests raised by the customer/team and followed up.
- Performed Root Cause Analysis on Problem Tickets and frequently occurring incidents.
- Raised cases with vendors when software or hardware needed to be updated, replaced or repaired.
- Raised cases with Red Hat and followed up with them as and when required.
- Set up passwordless login using SSH public-private keys.
- Installed RedHat Enterprise Linux (RHEL 6) on production servers.
- Completed Work Requests raised by the customer/team and followed up with them.
- Engaged members of other teams when a ticket required support from multiple teams.
- Effectively and efficiently monitored SDM/Remedy queues to prevent SLA breaches.
- Worked in a 24X7 on call rotation to support critical production environments.
- Provided Support to Production Servers.
- Updated firmware on Servers, Installed patches and packages for security vulnerabilities for Linux.
- Monitored system resources, like network, logs, disk usage etc.
- Setting up cron jobs for the application owners to deploy scripts on production servers.
- Performed sanity checks on file systems and volume groups.
- Developed scripts for internal use for automation of some regular jobs using shell scripting.
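Routine resource monitoring like the disk-usage checks above can be sketched as a small helper function; the threshold and the df parsing shown in the comment are illustrative assumptions:

```shell
# check_fs_usage: warn when a filesystem's usage percentage meets or
# exceeds a threshold (a hypothetical helper; values are placeholders).
check_fs_usage() {
  local used=$1 threshold=$2
  if [ "$used" -ge "$threshold" ]; then
    echo "WARN: usage ${used}% >= ${threshold}%"
  else
    echo "OK: usage ${used}%"
  fi
}

# In practice the percentage would come from df, e.g.:
#   df -P / | awk 'NR==2 {gsub("%","",$5); print $5}'
check_fs_usage 92 90
check_fs_usage 40 90
```

Scheduled from cron, a script like this can mail the administrators instead of echoing, covering the routine disk-usage monitoring described above.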
Environment: RedHat LINUX Release 5.x, 6.x, SUSE LINUX v 10.1, 11, OpenBSD, TCP/IP Wrapper, SSH, SCP, RSYNC, Service Desk Manager, BMC Remedy, Hostinfo, Apache Web Server, Samba Server, IPtables, FTP, DHCP, DNS, NFS, RPM, YUM, LDAP, Auto FS, LAN, WAN, KVM, RedHat Ent Virtualization, Xen, VMware.