Hadoop/System Administrator Resume
Atlanta, GA
PROFESSIONAL SUMMARY:
- 4+ years of professional IT experience, including 2 years of Hadoop administration: deploying, maintaining, monitoring, and upgrading Hadoop clusters using Hortonworks (HDP) and Cloudera (CDH) distributions.
- Experience in installing and configuring Hortonworks DataFlow (HDF).
- Good understanding of Hadoop ecosystem components such as Hadoop MapReduce, HDFS, Spark, Ranger, HDFS encryption, ZooKeeper, Oozie, Hive, HBase, Tez, Sqoop, and Pig.
- Worked in multi-cluster environments, setting up the Cloudera and Hortonworks Hadoop ecosystems.
- Good knowledge of the MapR distribution.
- Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
- Experience in setting up Ranger policies for authorization on top of Hadoop ecosystem components.
- Experience in configuring ZooKeeper to provide cluster coordination services.
- Worked on setting up NameNode high availability and was involved in designing automatic failover control using ZooKeeper and quorum journal nodes.
- Experience in securing Hadoop clusters with Kerberos.
- Enabled SSL for different services of Hadoop.
- Set up and managed HA on nodes to avoid single points of failure in large clusters.
- Experience in setting up high-availability Hadoop clusters.
- Experience in upgrading existing Hadoop clusters to the latest releases.
- Hands-on experience configuring Hadoop clusters in a professional environment, on VMware, and on Amazon Web Services (AWS) using EC2 instances.
- Experience in developing Shell Scripts for system management.
- Good knowledge in developing Python scripts.
- Good working knowledge of the NoSQL database HBase.
- Experience in benchmarking and stress testing Hadoop clusters.
- Ability to diagnose network problems.
- Understanding of TCP/IP networking and its security considerations.
- Excellent at communicating with clients, customers, managers, and other teams at all levels of the enterprise.
- Effective problem-solving skills and outstanding interpersonal skills. Ability to work independently as well as within a team environment. Driven to meet deadlines.
- Motivated to produce robust, high-performance software.
- Ability to learn and use new technologies quickly.
TECHNICAL SKILLS:
Big Data Technologies: HDFS, MapReduce (MRv1 & MRv2), Ambari, NiFi, Airflow, Ranger, Falcon, Spark, Impala, Kerberos, Hive, Tez, Pig, ZooKeeper, Sqoop, Oozie, Flume, Kafka, HBase, Cassandra.
Hadoop Distributions: Cloudera (CDH4, CDH5), HDP (2.4, 2.5.3)
Programming Languages: Shell Scripting, Python, HQL & Java
Operating Systems: Windows 98/2000/XP/Vista/NT/8.1, Red Hat Linux/CentOS 4, 5, Unix
Database: Oracle 10g/11g, PL/SQL
PROFESSIONAL EXPERIENCE:
Confidential, Atlanta, GA
Hadoop/System Administrator
Responsibilities:
- Hands-on experience installing, upgrading, and maintaining Hadoop clusters with Apache and Hortonworks Hadoop ecosystem components.
- Good understanding of and related experience with the Hadoop stack: internals, HDFS, Spark, Hive, HBase, Pig, ZooKeeper, YARN, Kerberos, and MapReduce.
- Supported deployment of a 50-node Hortonworks Hadoop cluster (HDP 2.4.2.0, 2.5.3, and 2.6.0.1) using Ambari 2.4 and 2.5.
- Experienced in managing Hadoop infrastructure, such as adding capacity.
- Deployed an HDF 2.0.1 cluster, which includes NiFi, on 10 nodes with HA using Ambari 2.4.
- Installed and configured Ranger authorization for NiFi flows to control access for different users.
- Enabled Ranger UserSync to synchronize users from LDAP and created new policies for HDFS, Hive, NiFi, etc.
- Enabled SSL with signed certificates (Symantec) for NiFi instances.
- Enabled Kerberos authentication for the HDF cluster and established a cross-realm setup between the HDP and HDF clusters.
- Deployed Grafana, using Graphite and collectl, to monitor metrics on 750 servers.
- Enabled SSL for NiFi login module with NiFi Authority.
- Enabled Ambari views for HDFS Files, SmartSense, Hive, Tez, and the YARN Queue Manager.
- Involved in managing cluster resources by implementing the Fair and Capacity schedulers with ACLs enabled.
- Developed automated Unix shell scripts for NameNode metadata backup, file system health checks, and user/group creation on HDFS.
- Enabled NameNode HA with automatic failover.
- Involved in setting up Kafka brokers.
- Managed the configuration of the clusters to meet the needs of analysis, whether I/O-bound or CPU-bound.
- Supported setup of the QA environment and updated configurations for running Pig, Hive, and Sqoop scripts.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Configured Flume for efficiently collecting, aggregating, and moving large amounts of log data from many different sources to HDFS.
- Performed benchmark tests on Hadoop clusters and tuned the configuration based on the results.
- Supported users in running and debugging Pig and Hive queries.
- Involved in troubleshooting MapReduce job execution issues by inspecting and reviewing log files.
- Responsible for managing data coming from different sources.
- Deployed Grafana Dashboards for monitoring cluster nodes using Graphite as a Data Source and collectl as a metric sender.
- Involved in loading data from UNIX file system to HDFS.
- Performed cluster maintenance, including creation and removal of nodes, using Hortonworks.
- Monitored system health and responded to warning or failure conditions.
- Wrote automation scripts for loading data into the cluster and for deploying and installing services.
- Installed and configured Airflow for workflow management in cluster mode and standalone mode.
- Enabled high availability for the Airflow scheduler in cluster mode.
- Worked with data governance services like Falcon for automating distcp jobs between the clusters.
- Enabled Falcon process and mirror entities for scheduling distcp jobs with data lineage.
- Responsible for User onboarding and creating user directories for new users on HDFS.
- Responsible for creating principals and keytabs for the UIDs and Load IDs.
- Worked on enabling search engines such as Ambari Infra (Solr) for storing and analyzing Ranger audits.
- Enabled the Log Search service for the HDP cluster for analyzing and debugging service issues from logs.
- Worked on configuring bundles from SmartSense on HDP cluster for cluster diagnostics and cluster tune-up.
- Experienced in installing RCloud, which uses GitHub as a data repository, and integrating it with Hadoop; successfully completed data migration from GitHub Enterprise to the RCloud Gist Service.
- Constantly learning various Big Data tools and providing strategic direction as per development requirement.
Environment: RHEL, CentOS, Ubuntu, HDP 2.4 and HDP 2.5.3, Apache Hadoop, HDFS, MapReduce, YARN, ZooKeeper, HBase, Shell Scripts.
Confidential, Wilson, NC
BigData Intern
Responsibilities:
- Installed, upgraded, and maintained Hadoop clusters using Hortonworks.
- Deployed a 100-node Hortonworks Hadoop cluster (HDP 2.1) using Ambari Server 1.6.
- Performed major and minor upgrades in large environments.
- Hands-on experience with Apache and Hortonworks Hadoop ecosystem components such as Sqoop, HBase, and MapReduce.
- Good understanding of and related experience with the Hadoop stack: internals, Hive, Pig, and MapReduce.
- Designed and configured the cluster with the required services (Sentry, HiveServer2, Kerberos, HDFS, Hue, Spark, Hive, HBase, ZooKeeper).
- Involved in Monitoring and support through Nagios and Ganglia.
- Managed the configuration of the clusters to meet the needs of analysis, whether I/O-bound or CPU-bound.
- Supported setup of the QA environment and updated configurations for running Pig, Hive, and Sqoop scripts.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Configured Flume to transfer data from web servers to HDFS.
- Performed benchmark tests on Hadoop clusters and tuned the configuration based on the results.
- Supported users in running and debugging Pig and Hive queries.
- Responsible for troubleshooting MapReduce job execution issues by inspecting and reviewing log files.
- Designed and maintained the NameNode and DataNodes with appropriate processing capacity and disk space.
- Performed Data scrubbing and processing with Oozie.
- Used Tableau to visualize the analyzed data.
- Installed and configured Kerberos to secure Hadoop and all of its ecosystem tools.
- Configured and monitored a test cluster on Amazon Web Services for further testing and gradual migration.
- Responsible for managing data coming from different sources.
- Involved in loading data from UNIX file system to HDFS.
- Monitored system health and responded to warning or failure conditions.
- Wrote automation scripts for loading data into the cluster and for deploying and installing services.
- Created, executed, and debugged SQL queries to test data completeness, correctness, transformation, and quality.
- Configured Sentry to enforce appropriate user permissions for HiveServer2/Beeline access.
- Performed cluster maintenance, including creation and removal of nodes, using Hortonworks.
- Created Data migration plan from one cluster to another using BDR.
- Monitored the cluster daily, checking error logs and debugging the issues found.
- Provided support for users and helped to resolve any job failures.
Environment: CentOS, Hortonworks, Flume, HBase, HDFS, MapReduce, Hive, Oozie, ZooKeeper, Sqoop.
Confidential
Linux Admin
Responsibilities:
- Installed, set up, and configured the Hortonworks (HDP) sandbox on Oracle VirtualBox to run queries with Pig, MapReduce, etc.
- Installed, set up, and configured the Cloudera (CDH) sandbox on Oracle VirtualBox.
- Implemented a Pig Latin script to compute word and punctuation counts for a 5 GB data set.
- Implemented a MapReduce program to compute word and punctuation counts for a 5 GB data set.
- Deployed Pig and MapReduce programs to perform matrix multiplication on the Hortonworks sandbox.
Confidential
Linux Admin
Responsibilities:
- Worked on administration, monitoring, and fine-tuning of an existing Cloudera Hadoop cluster used by internal and external users as a Data and Analytics as a Service platform.
- Worked on Cloudera Cluster backup and recovery, performance monitoring, load balancing, rebalancing and tuning, capacity planning, and disk space management.
- Assisted in the design, development, and architecture of Hadoop and HBase systems.
- Coordinated with technical teams for installation of Hadoop and related third-party applications on systems.
- Formulated procedures for planning and executing system upgrades for all existing Hadoop clusters.
- Supported daily operations and helped develop strategies to improve availability and utilization of UNIX environments.
- Worked on system administration, user creation, file/directory permissions, LVM, loading of software and system patches.
- Supported technical team members for automation, installation and configuration tasks.
- Involved in troubleshooting problems and issues related to the efficient, secure operation of the Linux operating system.
- Worked closely in designing and optimizing the configuration of Linux to meet the Service Level Agreements of our applications and services.
- Developed and maintained UNIX shell scripts for automation and suggested process improvements for automation scripts and tasks.
- Worked on Linux administration, including LVM (Logical Volume Manager) administration, user management, exposure to yum, zypper, and rpm, and security hardening.
- Monitored and tuned server and application performance using various commands (vmstat, top, iostat, ping, free, etc.).
- Implemented firewall (iptables) rules for new servers to enable communication with application servers.
- Extensive UNIX system administration experience, including user creation, file/directory permissions, LVM, and loading of software and system patches in a large-scale server environment.
- Applied knowledge of tools, technologies, and methodologies to collaborate with other technical specialists when carrying out assigned duties.
- Effectively negotiated with technical peers and customers to implement technical solutions.
- Performed firmware upgrades on Linux platforms.
Environment: Hadoop, Unix, LVM, Red Hat Linux 4, Cloudera, Firewall
Confidential
System Admin
Responsibilities:
- Installed and maintained all server hardware and software systems, administered server performance, and ensured server availability.
- Responsible for creating and managing user accounts, groups, profiles, and security policies, assigning permissions and rights, and handling user complaints.
- Installed, configured, and provided technical support for Windows Server services such as Active Directory, DNS, DHCP, DFS, FTP, and VPN.
- Monitored systems daily, evaluated availability of all server resources, and performed all activities for Linux servers.
- Protected data and maintained a healthy network using various backup and recovery strategies.
- Tested all new software, maintained patches for management services, and audited all security processes.
- Developed automation shell scripts to check log file sizes and report on the application.
- Involved in database testing, using SQL to pull data from the database and verify that it matches the GUI.
Environment: Red Hat Linux 6.3, MySQL, VMware, Shell, Perl.
