
Hadoop Administrator Resume


Manhattan, NY

SUMMARY

  • 6+ years of IT experience, including 2+ years as a Hadoop Administrator and 2 years as a UNIX/Linux Administrator across different domains.
  • Rebalanced the cluster after adding/removing nodes or after major data clean-ups.
  • Configured various property files like core-site.xml, hdfs-site.xml, mapred-site.xml and hadoop-env.sh based upon the job requirement.
  • Installed, Configured and Maintained Apache Hadoop clusters for application development and Hadoop tools like HBase, Zookeeper and Sqoop.
  • Hands-on experience on major components in Hadoop Ecosystem including HDFS, Yarn, Hive, Flume, Zookeeper, Oozie and other ecosystem Products.
  • Experience in implementing NameNode High Availability and in Hadoop cluster capacity planning to add and remove nodes.
  • Added security to the cluster using Kerberos.
  • Experience monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network
  • As an admin, involved in cluster maintenance, troubleshooting and monitoring, and followed proper backup & recovery strategies.
  • Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
  • Expertise in installing, configuring and managing Red Hat Enterprise Linux 7.
  • Experience with multiple Hadoop distributions: Apache, Cloudera, MapR and Hortonworks.
  • Experience in securing Hadoop clusters using Kerberos and Sentry.
  • Experience with distributed computation tools such as Apache Spark and Hadoop.
  • Experience as Deployment Engineer and System Administrator on Linux (CentOS, Ubuntu, Red Hat).
  • Well versed in installing, configuring and tuning Hadoop distributions: Cloudera, Hortonworks on Linux systems.
  • Experience in rebalancing HDFS clusters.
  • Cluster Management using Cloudera Manager
  • Decommissioning and commissioning nodes on a running Hadoop cluster.
  • Experience in installing, configuring and optimizing Cloudera Hadoop versions CDH3, CDH4.x and CDH5.x in a multi-clustered environment.
  • Expertise in Hadoop Cluster capacity planning, performance tuning, cluster Monitoring and Troubleshooting.
  • Hands on experience in installation, configuration, supporting and managing Hadoop Clusters using Cloudera.
  • Hands-on experience in backup configuration and recovery from a NameNode failure.
  • Experience implementing big data projects using Cloudera; automated system tasks using Puppet.
  • Extensive experience in installation, configuration, management and deployment of Big Data components and the underlying infrastructure of Hadoop Cluster.
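
HDFS rebalancing, as mentioned above, is driven by the balancer CLI; a minimal sketch, shown as an echoed dry run since it needs a live cluster (the threshold is an illustrative value):

```shell
# Rebalance HDFS after adding or removing DataNodes.
# Shown as a dry run (echo only) because it requires a running cluster.
# -threshold 10: each DataNode's disk usage should end up within
# 10 percentage points of the cluster-wide average (illustrative value).
BALANCER_CMD="hdfs balancer -threshold 10"
echo "Would run: $BALANCER_CMD"
```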

TECHNICAL SKILLS

Hadoop: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, Zookeeper, HBase, Spark

System software: Linux, Windows XP, Server 2003, Server 2008

Network admin: TCP/IP fundamentals, wireless networks, LAN and WAN

Languages: SQL, J2SE, J2EE, HQL, Python, UNIX shell scripting

Web Technologies: XML, HTML, DHTML.

Databases: Oracle 11g/10g, DB2, MS SQL Server, MySQL, MS Access, HBase, MongoDB, Cassandra

Methodologies: Agile, V-model, Waterfall model

Web Tools: HTML, XML, JDBC, JSON, JSP, and Tableau

PROFESSIONAL EXPERIENCE

Confidential, Manhattan, NY

Hadoop Administrator

Responsibilities:

  • Installed, Configured and Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, HBase, Zookeeper and Sqoop.
  • Responsible for loading customers' data and event logs from Oracle and Teradata databases into HDFS using Sqoop.
  • Monitor Hadoop cluster connectivity and security
  • Manage and review Hadoop log files
  • Installing, Upgrading and Managing Hadoop Cluster on Cloudera distribution
  • Worked with Kafka on a proof of concept for log processing on a distributed system.
  • Configured NameNode high availability and NameNode federation.
  • Use of Sqoop to import and export data from HDFS to Relational database and vice-versa.
  • Performed data analysis by running Hive queries.
  • Involved in installing Cloudera Manager, Hadoop, Zookeeper, HBase, Hive, Pig, etc.
  • Involved in configuring quorum-based HA for the NameNode, making the cluster more resilient.
  • Extensively involved in Installation and configuration of Cloudera distribution Hadoop NameNode, Secondary NameNode, Resource Manager, Node Manager and DataNodes.
  • Moved Relational Database data using Sqoop into Hive Dynamic partition tables using staging tables.
  • Configuring Sqoop and Exporting/Importing data into HDFS
  • Collected the logs data from web servers and integrated into HDFS using Flume.
  • Worked on installing cluster, commissioning & decommissioning of DataNodes, NameNode
  • Worked on Hue interface for querying the data
  • Experienced in loading data from UNIX local file system to HDFS.
  • Managing and scheduling Jobs on a Hadoop cluster using Oozie.
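
A typical Sqoop import of the kind described above might look like the following; the connection string, credentials, table and target path are all hypothetical placeholders, and the command is echoed rather than executed since it needs a live cluster and database:

```shell
# Sketch: Sqoop import of an Oracle table into HDFS.
# All connection details below are hypothetical placeholders.
SQOOP_CMD="sqoop import \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_EVENTS \
  --target-dir /data/raw/customer_events \
  --num-mappers 4"
echo "Would run: $SQOOP_CMD"
```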

Environment: Hadoop, HDFS, MapReduce, RedHat Linux, Cloudera Manager
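
The quorum-based NameNode HA mentioned above is configured in hdfs-site.xml; a minimal excerpt, with the nameservice name and JournalNode hostnames as placeholders:

```xml
<!-- hdfs-site.xml excerpt: quorum-based NameNode HA.
     "mycluster", nn1/nn2 and jn1..jn3 are placeholder names. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```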

Confidential, Louisville, KY

Hadoop Administrator

Responsibilities:

  • Configured various property files like core-site.xml, hdfs-site.xml, mapred-site.xml based upon the job requirement
  • Performed Importing and exporting data into HDFS using Sqoop
  • Involved in start to end process of Hadoop cluster setup including installation, configuration and monitoring the Hadoop Cluster
  • Monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures
  • Installed various Hadoop ecosystem components and Hadoop daemons.
  • Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Involved in Installing and configuring Kerberos for the authentication of users and Hadoop daemons.
  • Administered Cluster maintenance, commissioning and decommissioning Data nodes, Cluster Monitoring, Troubleshooting
  • Performed Adding/removing new nodes to an existing Hadoop cluster
  • Managed and reviewed Hadoop Log files as a part of administration for troubleshooting purposes. Communicated and escalated issues appropriately.
  • Applied standard backup policies to ensure high availability of the cluster.
  • Implemented Backup configurations and Recoveries from a Name Node failure.
  • Integrated Hive and HBase to perform analysis on data
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action. Documented system processes and procedures for future reference.

Environment: Hadoop, HDFS, Zookeeper, MapReduce, YARN, HBase, Hive, Sqoop, Oozie, Linux (CentOS, Ubuntu, Red Hat), Cloudera CDH, Apache Hadoop
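
Decommissioning a DataNode, as in the bullets above, is usually done through the NameNode's exclude file; a sketch shown as an echoed dry run (the hostname and exclude-file path are placeholders):

```shell
# Sketch: gracefully decommission a DataNode.
# Echoed rather than executed: requires a live cluster.
EXCLUDE_FILE="/etc/hadoop/conf/dfs.exclude"   # the file named by dfs.hosts.exclude
NODE="datanode07.example.com"                 # placeholder hostname

echo "Would append $NODE to $EXCLUDE_FILE"
echo "Would run: hdfs dfsadmin -refreshNodes"
# Watch the NameNode web UI until the node reports 'Decommissioned',
# then stop the DataNode daemon and remove the entry from the exclude file.
```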

Confidential, St. Louis, MO

Hadoop Administrator

Responsibilities:

  • Monitor the data streaming between web sources and HDFS.
  • Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Provided inputs to development on efficient utilization of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks.
  • Handle the installation and configuration of a Hadoop cluster.
  • Balancing HDFS manually to decrease network utilization and increase job performance.
  • Set up and managed NameNode High Availability and NameNode federation using Apache Hadoop 2.0 to avoid single points of failure in large clusters.
  • Held regular discussions with other technical teams regarding upgrades, process changes, special processing and feedback.
  • Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
  • Handle the data exchange between HDFS and different Web Applications and databases using Flume and Sqoop.
  • Changed cluster configuration properties based on the volume of data being processed and the performance of the cluster.
  • Setting up Identity, Authentication, and Authorization.
  • Commission and decommission the Data nodes from cluster in case of problems.
  • Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
  • Maintained the cluster to keep it healthy and in optimal working condition.
  • Handle the upgrades and Patch updates.
  • Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.

Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Cloudera, Flume, HBase, Zookeeper, CDH3, Oracle, NoSQL and Unix/Linux
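
The automated archive/clean process described above often reduces to a find-based retention script run from cron; a minimal runnable sketch against a temporary directory (the 7-day retention and cron schedule are illustrative, and GNU coreutils `touch -d` is assumed for the demo data):

```shell
# Delete files older than RETENTION_DAYS from TARGET_DIR.
# In practice TARGET_DIR would be a log/staging area and the script
# would run from cron, e.g.:  0 2 * * * /usr/local/bin/cleanup_old.sh
RETENTION_DAYS=7
TARGET_DIR="$(mktemp -d)"    # temp dir here, for demonstration only

# Demo data: one fresh file and one file backdated 10 days (GNU touch).
touch "$TARGET_DIR/fresh.log"
touch -d "10 days ago" "$TARGET_DIR/stale.log"

# -mtime +7 matches files last modified more than 7*24h ago.
find "$TARGET_DIR" -type f -mtime +"$RETENTION_DAYS" -delete
```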

Confidential

Linux System Administrator

Responsibilities:

  • User account creation and account maintenance both local and centralized (LDAP - Sun Identity Manager).
  • Performed all duties related to system administration like troubleshooting, providing sudo access, modifying DNS entries, NFS, backup recovery (scripts).
  • Worked on Change Request raised by customer/team and follow up.
  • Did Root Cause Analysis on Problem Tickets and frequently occurring incidents.
  • Raised Case with vendors if any software or hardware needs to be updated/replaced/repaired.
  • Raised cases with RedHat and followed up with them as and when required.
  • Set up passwordless login using SSH public-private key pairs.
  • Installed RedHat Enterprise Linux (RHEL 6) on production servers.
  • Completed Work Requests raised by customer/team and following up with them.
  • Engaged members of different teams when a ticket required support from multiple teams.
  • Effectively and efficiently monitored SDM/Remedy queues to ensure no SLA breaches occurred.
  • Worked in a 24X7 on call rotation to support critical production environments.
  • Provided Support to Production Servers.
  • Updated firmware on Servers, Installed patches and packages for security vulnerabilities for Linux.
  • Monitored system resources, like network, logs, disk usage etc.
  • Set up cron jobs for application owners to deploy scripts on production servers.
  • Performed check out for the sanity of the file systems and volume groups.
  • Developed scripts for internal use for automation of some regular jobs using shell scripting.

Environment: RedHat Linux Release 5.x, 6.x, SUSE Linux 10.1, 11, OpenBSD, TCP Wrappers, SSH, SCP, RSYNC, Service Desk Manager, BMC Remedy, Hostinfo, Apache Web Server, Samba Server, IPtables, FTP, DHCP, DNS, NFS, RPM, YUM, LDAP, AutoFS, LAN, WAN, KVM, RedHat Enterprise Virtualization, Xen, VMware.
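
Passwordless SSH login, as set up above, comes down to generating a key pair and publishing the public key; a sketch shown as an echoed dry run (the key path and remote host are placeholders):

```shell
# Sketch: passwordless SSH via a public-private key pair.
# Echoed rather than executed: ssh-copy-id needs a reachable host.
KEY="$HOME/.ssh/id_rsa"
REMOTE="admin@server01.example.com"   # placeholder host

echo "Would run: ssh-keygen -t rsa -b 4096 -f $KEY -N ''"
echo "Would run: ssh-copy-id -i $KEY.pub $REMOTE"
# After this, ssh to the remote host no longer prompts for a password.
```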
