We provide IT Staff Augmentation Services!

Hadoop/system Administrator Resume

5.00/5 (Submit Your Rating)

Atlanta, GA

PROFESSIONAL SUMMARY:

  • 4+ years of professional IT experience which includes 2 years of proven experience in Hadoop Administration in deploying, maintaining, monitoring and upgradingHadoopClusters using Hortonworks (HDP) and Cloudera (CDH) Distributions.
  • Experience in installing and configuring Hortonworks Data Flow(HDF).
  • Good understanding on Hadoopecosystem components likeHadoopMapReduce, HDFS, Spark, Ranger, HDFS encryption, Zookeeper, Oozie, Hive, HBase, Tez, Sqoop, Pig.
  • Worked on Multi Clustered environment and setting up Cloudera and Hortonworks Hadoopecho - System.
  • Good knowledge on MapR Distribution.
  • Experience in benchmarking, performing backup and disaster recovery of Name Node metadata and important sensitive data residing on cluster.
  • Experience in setting up Ranger policies for authorization on top of Hadoop eco-system components.
  • Experience in configuring Zookeeper to provide Cluster coordination services.
  • Worked on setting up Name Node high availability and involved in designing automatic failover control using zookeeper and quorum journal nodes.
  • Experience in providing security forHadoopCluster with Kerberos.
  • Enabled SSL for different services of Hadoop.
  • Setup and manage HA on nodes to avoid single point of failures in large clusters.
  • Experience in setting up the High-AvailabilityHadoopClusters.
  • Experience in upgrading the existingHadoopcluster to latest releases.
  • Hands on experience on configuring a Hadoop cluster in a professional environment and on VMware and Amazon Web Service (AWS) using an EC2 instance.
  • Experience in developing Shell Scripts for system management.
  • Good knowledge in developing Python scripts.
  • Good working knowledge on NO-SQL database HBase.
  • Benchmarking and Stress Testing on Hadoop Cluster.
  • Ability to diagnose network problems.
  • Understanding of TCP/IP networking and its security considerations.
  • Excellent in communicating with clients, customers, managers, and other teams in the enterprise at all levels.
  • Effective problem-solving skills and outstanding interpersonal skills. Ability to work independently as well as within a team environment. Driven to meet deadlines.
  • Motivated to produce robust, high-performance software.
  • Ability to learn and use new technologies quickly.

TECHNICAL SKILLS:

Bigdata Technologies: HDFS, MapReduce (Mrv1 & Mrv2), Ambari, NiFi, Airflow, Ranger, Falcon, Spark, Impala, Kerberos, Hive, Tez, Pig, Zookeeper, Sqoop, Oozie, Flume, Kafka, HBase, Cassandra.

Hadoop Distributions: Cloudera (CDH4, CDH5), HDP (2.4, 2.5.3)

Programming Languages: Shell Scripting, Python, HQL & Java

Operating Systems: Windows 98/2000/XP/Vista/NT/ 8.1, Red hat Linux/Centos 4, 5, Unix

Database: Oracle 10g/11g, PL/SQL

PROFESSIONAL EXPERIENCE:

Confidential, Atlanta, GA

Hadoop/System Administrator

Responsibilities:

  • Hands on experience in Installing, Upgrade and maintain Hadoop clusters with Apache & Hortonworks Hadoop Ecosystem components.
  • Good understanding and related experience with Hadoop stack - internals, HDFS, Spark, Hive, HBase, Pig, Zookeeper, Yarn, Hive, Kerberos and Map/Reduce.
  • Supported in deploying 50 node Hortonworks Hadoop Cluster (HDP 2.4.2.0 and 2.5.3, 2.6.0.1) using Ambari-2.4 and 2.5.
  • Experienced in managingHadoopinfrastructure like adding capacity.
  • Deployed HDF-2.0.1 cluster with Ambari-2.4 that comes with NiFi on 10 nodes with HA.
  • Installed and configured Ranger authorization for different flows of NiFi that controls access for different users.
  • Enabled Ranger UserSync for synchronization of users from LDAP and creating new policies for HDFS, Hive, NiFi e.t.c.,
  • Enabled SSL with signed s (symmantec) for NiFi instances.
  • Enabled Kerberos authentication for HDF cluster and established cross realm setup between HDP cluster from HDF cluster.
  • Deployed Grafana, that uses Graphite and Collectl to monitor metrics on 750 servers.
  • Enabled SSL for NiFi login module with NiFi Authority.
  • Enabled Ambari views for HDFS Files, SmartSense, Hive, Tez and Yarn queue manager.
  • Involved in managing the cluster resources by implementing fair and capacity scheduler with ACL enabled.
  • Developed automated scripts using Unix Shell for storing Namenode metadata backup, file system health check and User/Group creation on HDFS.
  • Enabled Name node HA with auto Fail over.
  • Involved in setting up Kafka brokers.
  • Managing the configuration of the clusters to meet the needs of analysis whether I/O bound or CPU bound.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational databases management systems using Sqoop.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Performing benchmark test on Hadoop clusters and tweak the solution based on test results.
  • Supported users in running Pig and Hive queries and with the debugging.
  • Involved in troubleshooting issues in the execution of Map Reduce jobs by inspecting and reviewing log files.
  • Responsible to manage data coming from different sources.
  • Deployed Grafana Dashboards for monitoring cluster nodes using Graphite as a Data Source and collectl as a metric sender.
  • Involved in loading data from UNIX file system to HDFS.
  • Cluster maintenance as well as creation and removal of nodes using Hortonworks.
  • Monitor System health and respond accordingly to any warning or failure conditions.
  • Involved in writing Automation scripts for loading data to cluster and deployment with installation of services using scripts.
  • Installed and configured Airflow for workflow management in cluster mode and standalone mode.
  • Enabled High Availability for Airflow scheduler in a cluster mode.
  • Worked with data governance services like Falcon for automating distcp jobs between the clusters.
  • Enabled different process, mirror entities of Falcon for scheduling distcp jobs with data lineage.
  • Responsible for User onboarding and creating user directories for new users on HDFS.
  • Responsible for creating Principals and Keytabs for the UID’s and Loadid’s.
  • Worked on enabling query search engines like Ambari-Infra(Solr) for storing and analyzing ranger audits.
  • Enabled Log Search service for HDP cluster for analyzing and debugging different service issues(logs).
  • Worked on configuring bundles from SmartSense on HDP cluster for cluster diagnostics and cluster tune-up.
  • Experienced in installaing and integrating RCloud with Hadoop, that uses GitHub as a data repository and successfully completed data migration from the GitHub enterprise to the RCloud Gist Service.
  • Constantly learning various Big Data tools and providing strategic direction as per development requirement.

Environment: RHEL, CentOS, Ubuntu, HDP 2.4 and HDP 2.5.3, Apache Hadoop, HDFS, Map, Reduce, Yarn, Zookeeper, Hbase, Shell Scripts.

Confidential, Wilson, NC

BigData Intern

Responsibilities:

  • Installing, Upgrade and maintain the Hadoop Clusters using Horton Works.
  • Deployed 100 node Hortonworks Hadoop Cluster (HDP 2.1) using Ambari server 1.6
  • Performed major and minor upgrades in large environments.
  • Hands on experience with Apache & Hortonworks Hadoop Ecosystem components such as Sqoop, Hbase and MapReduce.
  • Good understanding and related experience with Hadoop stack - internals, Hive, Pig and Map/Reduce.
  • Design and Configure the Cluster with the services required (Sentry, Hive server2, Kerberos, HDFS, Hue, Spark, Hive, Hbase, Zookeeper).
  • Involved in Monitoring and support through Nagios and Ganglia.
  • Managing the configuration of the clusters to meet the needs of analysis whether I/O bound or CPU bound.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational databases management systems using Sqoop.
  • Flume configuration for the transfer of data from the webservers to the HDFS.
  • Performing benchmark test on Hadoop clusters and tweak the solution based on test results.
  • Supported users in running Pig and Hive queries and with the debugging.
  • Responsible for troubleshooting issues in the execution of Map Reduce jobs by inspecting and reviewing log files.
  • Design and maintain the Name node and Data nodes with appropriate processing capacity and disk space.
  • Performed Data scrubbing and processing with Oozie.
  • Used Tableau to visualize the analyzed data.
  • Installed and configured Kerberos for Hadoop and all of its eco system tools for security.
  • Monitored and configured a test cluster on amazon web services for further testing process and gradual migration.
  • Responsible to manage data coming from different sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Monitor System health and respond accordingly to any warning or failure conditions.
  • Writing automation scripts for loading data to cluster and deployment with installation of services using scripts.
  • Create Execute and Debug SQL queries to perform data completeness, correctness, data transformation and data quality testing.
  • Sentry configuration for appropriate user permissions accessing Hive server2/beeline.
  • Cluster maintenance as well as creation and removal of nodes using Hortonworks.
  • Created Data migration plan from one cluster to another using BDR.
  • Monitoring the cluster on a daily basis and to check the error logs and debugging them.
  • Provided support for users and helped to resolve any job failures.

Environment: CentOS, Horton Works, FLUME, HBase, HDFS, Map-Reduce, Hive, Oozie, Zookeeper, Sqoop.

Confidential

Linux Admin

Responsibilities:

  • Install, setup and configure Hortonworks(HDP) sandbox on Oracle virtual box to run queries with Pig, MapReduce etc.,
  • Install, setup and configure Cloudera (CDH) sandbox on Oracle virtual box.
  • Implemented Pig Latin script to retrieve word count and Punctuation count for 5GB data set.
  • Implemented MapReduce program to retrieve word count and Punctuation count for 5GB data set.
  • Deployed Pig and MapReduce programs to generate matrix multiplication on top Hortonworks sandbox.

Confidential

Linux Admin

Responsibilities:

  • Worked on Administration, monitoring and fine tuning on an existing Cloudera Hadoop Cluster used by internal and external users as a Data and Analytics as a Service Platform.
  • Worked on Cloudera Cluster backup and recovery, performance monitoring, load balancing, rebalancing and tuning, capacity planning, and disk space management.
  • Assisted in designing, development and architecture of HADOOP and HBase systems.
  • Coordinated with technical teams for installation of HADOOP and third related applications on systems.
  • Formulated procedures for planning and execution of system upgrades for all existing HADOOP clusters.
  • Supported daily operations and helped to develop strategies in-order to improve availability and utilization of UNIX environments.
  • Worked on system administration, user creation, file/directory permissions, LVM, loading of software and system patches.
  • Supported technical team members for automation, installation and configuration tasks.
  • Involved in troubleshooting problems and issues related to the efficient, secure operation of the Linux operating system.
  • Worked closely in designing and optimizing the configuration of Linux to meet the Service Level Agreements of our applications and services.
  • Worked in the development and maintenance of UNIX shell scripts for automation and suggested improvement processes for all process automation scripts and tasks.
  • Worked on Linux Administration LVM (Logical Volume Manager) administration, user management, exposure to yum, zypper, rpm and security hardening.
  • Monitored server and application performance & tuning via various commands (vmstat, top, iostat, ping, vmstat, free etc).
  • Implemented Firewall with (iptables) rules for new servers to enable communication with application servers.
  • Extensive UNIX system administration experience, user creation, file/directory permissions, LVM, loading of software and system patches in a large-scale server environment.
  • Worked upon concepts of tools, technologies and methodologies to collaborate with other technical specialists when carrying out assigned duties.
  • Effectively negotiated with technical peers and customers to implement technical solutions.
  • Worked in performing firmware upgrade in Linux Platform.

Environment: Hadoop, Unix, LVM, Redhat Linux 4, Cloudera, Firewall

Confidential

System Admin

Responsibilities:

  • Install and maintain all server hardware and software systems and administer all server performance and ensure availability for same.
  • Responsible for creating and managing users accounts, groups and security policies, assigning permissions rights, Accounts, Profiles and policies, maintaining of all user complaints.
  • Install, configure and provide technical support for Windows Servers like Active Directory, DNS, DHCP, DFS, FTP and VPN.
  • Monitor everyday systems and evaluate availability of all server resources and perform all activities for Linux servers.
  • Data Protection and maintenance of healthy network using different Backups and Recovery strategies.
  • Perform tests on all new software and maintain patches for management services and perform audit on all security processes.
  • Developed Automation Scripts using shell scripts to check the log files size and report the application.
  • Involved in Database Testing Using SQL to pull data from database and check whether it matches with GUI.

Environment: RedHat Linux 6.3, MySQL, VMware, Shell, Perl.

We'd love your feedback!