
Hadoop Admin Resume


Bloomington, MN

SUMMARY

  • Over 6 years of experience in Information Technology, including 4 years in Hadoop administration.
  • Extensive experience in Hadoop administration activities such as installation and configuration of clusters using Apache, Azure, Cloudera, Hortonworks, AWS, and ECS.
  • Experience in deploying Hadoop clusters on public and private cloud environments using Cloudera, Hortonworks, and Amazon AWS.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as HDFS, YARN, Spark, Flume, MapReduce, Hive, Pig, HBase, Oozie, Impala, Cloudera, and Hue, with knowledge of Kafka, Falcon, Ranger, and Knox.
  • Hands-on experience in installing, configuring, supporting, and managing Hadoop clusters using Apache and Cloudera distributions (CDH3, CDH4, and YARN-based CDH 5.x).
  • Strong experience in designing, configuring, and managing backup, restore, and disaster recovery for Hadoop data, including troubleshooting.
  • Experience with Amazon AWS technologies such as EMR, EC2, IAM, S3, and CloudWatch.
  • Experience in HDFS data storage and support for running MapReduce jobs; installed and configured ecosystem components such as Sqoop, Pig, and Hive, with knowledge of HBase and ZooKeeper.
  • Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
  • Adding and removing nodes in existing Hadoop clusters.
  • Configuring high availability for the NameNode and ResourceManager; HDFS maintenance and support.
  • Experience in understanding security requirements for Hadoop and integrating with Kerberos authentication.
  • Experience in commissioning, decommissioning, balancing, and managing nodes and tuning servers for optimal cluster performance.
  • Installing and monitoring Hadoop cluster resources using Ganglia and Nagios.
  • Experience in updating and migrating objects across environments using a deployment management module.
  • Knowledge of log analysis for user behavioral patterns in Splunk.
  • Working knowledge of Windows, Linux, and UNIX shell scripting.
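Several of the administration tasks above, such as commissioning and decommissioning DataNodes, follow a standard excludes-file workflow. The sketch below illustrates it with a hypothetical hostname and a scratch configuration directory; on a real cluster the commented `hdfs dfsadmin` commands would apply and track the change.

```shell
# Sketch: decommissioning a DataNode via the HDFS excludes file.
# Paths and hostnames are illustrative; a scratch directory stands in
# for the real Hadoop configuration directory.
CONF_DIR="/tmp/hadoop-conf-demo"
mkdir -p "$CONF_DIR"
EXCLUDES="$CONF_DIR/dfs.exclude"
NODE="datanode07.example.com"   # hypothetical host to drain

# Add the node to the excludes file (idempotently).
grep -qx "$NODE" "$EXCLUDES" 2>/dev/null || echo "$NODE" >> "$EXCLUDES"

# On a live cluster, the NameNode would then be told to re-read the file:
#   hdfs dfsadmin -refreshNodes
# and drain progress watched with:
#   hdfs dfsadmin -report
cat "$EXCLUDES"
```

Commissioning a node back is the reverse: remove it from the excludes file and run `-refreshNodes` again.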

TECHNICAL SKILLS

Operating System: RHEL, CentOS, Windows 7, Linux, UNIX.

Languages: UNIX Shell scripting, Core Java.

Databases: Oracle, MySQL, SQL Server, HBase.

Ecosystem: Hive, MapReduce, Pig, Sqoop, Kafka, Spark.

Distribution System: Cloudera Manager CDH 5.x, Hortonworks HDP 2.x.

Hadoop Scheduler: Oozie.

Hadoop Security: Kerberos, Apache Ranger, Key Management Service.

System Monitoring Tools: Splunk Enterprise & Cloud, Ambari, Cloudera Manager.

Developing Skills: Hive/MR, SQL Programming.

Cloud Technologies: Amazon Web Services (AWS).

Data Analytics & Visualization Tools: Tableau, SSRS.

PROFESSIONAL EXPERIENCE

Confidential, Bloomington, MN

Hadoop Admin

Responsibilities:

  • Involved in the end-to-end process of Hadoop cluster setup: installation, configuration, and monitoring of the Hadoop cluster.
  • Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, and managing and reviewing data backups and Hadoop log files.
  • Monitoring systems and services, architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Configured property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml based on job requirements.
  • Experienced in managing and reviewing Hadoop log files.
  • Installation of various Hadoop ecosystem components and Hadoop daemons.
  • Installation and configuration of Sqoop, Flume, and HBase.
  • Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
  • As an admin, followed standard backup policies to ensure high availability of the cluster.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.
  • Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios.
  • Installed and configured Flume, Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
  • Involved in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
  • Monitored workload, job performance, and capacity planning using Cloudera Manager.

Environment: Hadoop, HDFS, Hive, Sqoop, Flume, ZooKeeper, HBase, Cloudera CDH, Toad, SQL*Plus, Red Hat/SUSE Linux, EM Cloud Control.
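The property files configured above (core-site.xml, hdfs-site.xml, mapred-site.xml) are plain Hadoop XML configuration. A minimal hdfs-site.xml fragment might look like the following; the property values are illustrative, not taken from the actual cluster:

```xml
<?xml version="1.0"?>
<!-- hdfs-site.xml: illustrative values only -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <!-- file listing DataNodes to decommission -->
    <name>dfs.hosts.exclude</name>
    <value>/etc/hadoop/conf/dfs.exclude</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/1/dfs/nn,/data/2/dfs/nn</value>
  </property>
</configuration>
```

Changes to these files generally require a service restart (or, for a few properties such as the excludes file, an admin refresh command) to take effect.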

Confidential, PA

HADOOP ADMIN

Responsibilities:

  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
  • Working with data delivery teams to set up new Hadoop users. This includes setting up Linux users, setting up Kerberos principals, and testing HDFS, Hive, Pig, and MapReduce access for the new users.
  • Cluster maintenance, including creation and removal of nodes using Cloudera Manager.
  • Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
  • Screening Hadoop cluster job performance and capacity planning.
  • Monitoring Hadoop cluster connectivity and security.
  • Configuring Hive, Pig, Impala, Sqoop, Flume, and Oozie in CDH 5.
  • Manage and review Hadoop log files.
  • File system management and monitoring.
  • Major upgrade from CDH 4 to CDH 5.2.
  • HDFS support and maintenance.
  • Collaborating with application teams to install operating system and Hadoop updates, patches, and version upgrades when required.
  • Scheduling and managing Oozie jobs to automate sequences of rotational activities.
  • Deciding on the security and access control model for the cluster and data protection.
  • Testing the production cluster before and after Hadoop installation for high availability and performance.
  • Planning user migration requirements ahead of production rollout to avoid last-minute access issues.
  • Planning data topology, rack topology, and resource availability for users to share as required.
  • Planning and implementing data migration from the existing staging cluster to the production cluster.
  • Installed and configured Hadoop ecosystem components such as MapReduce, Hive, Pig, Sqoop, HBase, ZooKeeper, FUSE, and Oozie.
  • Supported MapReduce Programs and distributed applications running on the Hadoop cluster.

Environment: Hadoop, YARN, Hive, HBase, Flume, Hortonworks, Apache Phoenix, Kafka, ZooKeeper, Oozie, Sqoop, MapReduce, Ambari, HDFS, Splunk, Elasticsearch, Jenkins, Kerberos, MySQL, Apache, NoSQL, Linux/UNIX.
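Setting up a new Hadoop user, as described above, typically chains a Linux account, a Kerberos principal, and an HDFS home directory. The sketch below only prints the command sequence so the order of steps is easy to review; the user name, group, and realm are hypothetical, and a human user would normally get a password rather than a random key.

```shell
# Sketch: onboarding a new Hadoop user. Commands are printed, not run;
# user name, group, and Kerberos realm are hypothetical.
NEW_USER="analyst1"
REALM="EXAMPLE.COM"
ONBOARD_STEPS="useradd -m -G hadoop $NEW_USER
kadmin -q \"addprinc -randkey $NEW_USER@$REALM\"
hdfs dfs -mkdir /user/$NEW_USER
hdfs dfs -chown $NEW_USER:hadoop /user/$NEW_USER"
printf '%s\n' "$ONBOARD_STEPS"
```

After these steps, access is verified by running `kinit` as the new user and issuing a simple HDFS listing and a test Hive query.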

Confidential

HADOOP ADMIN

Responsibilities:

  • Monitoring and controlling the HDP 2.5 cluster via Ambari 2.4.1.
  • Performed capacity planning for an estimated 60-terabyte workload of periodic snapshots.
  • Brought the application up and down for cluster upgrades and issue resolution.
  • Creating and running Hive queries to generate reports for analysis.
  • Worked with text file formats, compressing and decompressing data with the Snappy algorithm.
  • Worked on configuring and optimizing HiveServer2 and the Hive Metastore to use the Tez engine and the ORC file format for data processing.
  • Worked closely with the ETL team on troubleshooting issues.
  • Interacting with internal customers and end users via ServiceNow (SNOW).
  • Administering and troubleshooting MySQL.
  • Worked with third-party tools (Tableau and Splunk) and integrated them with Hadoop.
  • Developing and maintaining Hive queries (HQL) in the CLI (Opsware) and Ambari Hive views.
  • Involved in Hadoop cluster environment administration, including adding and removing nodes, cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting.
  • Managed and reviewed Hadoop log files and logged cases with Ambari.
  • Configured the Capacity Scheduler to provide service-level agreements for multiple users of a cluster.
  • Experience with the Hadoop stack (MapR, YARN, HBase, Hive, Sqoop, Pig, ZooKeeper).
  • HDFS file system management and monitoring; scheduled regular health checks on the file system and Hive and HDFS data replication.
  • Imported and exported data between HDFS and RDBMSs (Oracle and SQL Server) using Sqoop.
  • Installed and configured Hive ODBC drivers on client machines to export data from Hive to desktop applications.
  • Monitored and reported data usage across the clusters.
  • Added security to the cluster using Apache Ranger and defined user policies authorizing access to databases.
  • Worked on LDAP integration with Ranger, authenticating users against HiveServer2 from SAS.
  • Performed a POC on cloud computing: installed Splunk Cloud and performed searches and analysis to identify anomalies in application logs.

Environment: RHEL 6.7, Oracle, MS SQL, ZooKeeper, MapReduce, YARN, Hortonworks HDP 2.5, REST APIs, Ranger, Splunk, Ambari 2.4.1, Sqoop, Hive, SAS.
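The Sqoop imports and exports between HDFS and the RDBMSs mentioned above come down to a single command line. The sketch below assembles (but does not run) a hypothetical Oracle import; the JDBC URL, credentials, table, and target directory are all illustrative.

```shell
# Sketch: a Sqoop import from Oracle into HDFS. The command is assembled
# and printed rather than executed; connection details are hypothetical.
SQOOP_CMD="sqoop import \
  --connect jdbc:oracle:thin:@dbhost.example.com:1521/ORCL \
  --username etl_user -P \
  --table SALES.ORDERS \
  --target-dir /data/raw/orders \
  --num-mappers 4"
echo "$SQOOP_CMD"
```

The export direction uses `sqoop export` with `--export-dir` in place of `--target-dir`; `--num-mappers` controls how many parallel map tasks hit the database.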

Confidential

Systems Engineer

Responsibilities:

  • Installed and configured MySQL on Linux and Windows environments.
  • Managed and troubleshot MySQL 5.0.22 and 5.1.24 in production and non-production environments on Linux.
  • Wrote Java programs to interface with the MySQL database.
  • Strong MySQL skills and general system administration skills in Linux and Windows environments.
  • Performed installation, new databases design, configuration, backup, recovery, security, upgrade and schema changes, tuning and data integrity.
  • Increased database performance by utilizing MySQL config changes, multiple instances and by upgrading hardware.
  • Experienced in database optimization and in developing triggers, joins, views, and SQL on MySQL databases.
  • Designed data marts that were used as the source for analytic reporting.
  • Responsible for maintaining database and reporting any issues immediately to management.
  • Created and deleted users, groups and set up restrictive permissions, configuration of the sudo files.
  • Documented all servers and databases.

Environment: MS SQL Server 2005/2000, MS Access 2000, MS SQL Analysis Services 2005, MS Excel, Visual Studio 2005, Windows Server 2003, Linux.
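Routine backup work like that described above is commonly scripted around `mysqldump`. The sketch below prepares a dated dump path (the directory and naming scheme are illustrative); the dump command itself is shown as a comment because it requires a live server and credentials.

```shell
# Sketch: nightly logical MySQL backup. The dump command is commented
# out because it needs a running server and credentials.
BACKUP_DIR="/tmp/mysql-backups-demo"
mkdir -p "$BACKUP_DIR"
STAMP=$(date +%Y%m%d)
DUMP_FILE="$BACKUP_DIR/all-databases-$STAMP.sql.gz"
# On a live server:
#   mysqldump --all-databases --single-transaction --routines \
#     | gzip > "$DUMP_FILE"
echo "$DUMP_FILE"
```

`--single-transaction` gives a consistent snapshot of InnoDB tables without locking them for the duration of the dump.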
