
Sr. Hadoop Administrator Resume


Austin, TX

PROFESSIONAL SUMMARY:

  • Overall 7+ years of experience in software analysis, design, development and maintenance in diversified areas of client-server, distributed and embedded applications.
  • Hands-on experience with the Hadoop stack (HDFS, MapReduce, YARN, Sqoop, Flume, Hive/Beeline, Impala, Tez, Pig, ZooKeeper, Oozie, Solr, Sentry, Kerberos, Centrify DC, Falcon, Hue, Kafka, Storm).
  • Experience with Cloudera Hadoop clusters on CDH 5.6.0 with Cloudera Manager 5.7.0.
  • Experienced with Hortonworks Hadoop clusters on HDP 2.4 with Ambari 2.2.
  • Hands-on with day-to-day operation of the environment, with knowledge of and deployment experience in the Hadoop ecosystem.
  • Configured various property files such as core-site.xml, hdfs-site.xml, mapred-site.xml and hadoop-env.sh based on job requirements.
  • Installed, configured and maintained Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, ZooKeeper and Sqoop.
  • Experience in installing, configuring and optimizing Cloudera Hadoop versions CDH 3, CDH 4.x and CDH 5.x in a multi-cluster environment.
  • Commissioned and decommissioned cluster nodes and performed data migration; also involved in setting up a DR cluster with BDR replication and implemented wire encryption for data at rest.
  • Implemented TLS (Level 3) security on all CDH services along with Cloudera Manager.
  • Implemented Dataguise analytics on the secured cluster.
  • Successfully implemented Blue-Talend integration and Greenplum migration.
  • Ability to plan and manage HDFS storage capacity and disk utilization (see the sketch after this list).
  • Assisted developers with troubleshooting MapReduce and BI jobs as required.
  • Provided granular ACLs for local file datasets as well as HDFS URIs; maintained role-level ACLs.
  • Cluster monitoring and troubleshooting using tools such as Cloudera Manager, Ganglia, Nagios and Ambari Metrics.
  • Managed and reviewed HDFS data backups and restores on the production cluster.
  • Implemented new Hadoop infrastructure, OS integration and application installation; installed OS (RHEL 5, RHEL 6, CentOS and Ubuntu) and Hadoop updates, patches and version upgrades as required.
  • Implemented and maintained LDAP and Kerberos security as designed for the cluster.
  • Expert in setting up Hortonworks (HDP 2.4) clusters with and without Ambari 2.2.
  • Experienced in setting up Cloudera (CDH 5.6) clusters using packages as well as parcels with Cloudera Manager 5.7.0.
  • In-depth understanding of Hadoop architecture and components such as HDFS, YARN, JobTracker, TaskTracker, NameNode, DataNode and MapReduce concepts.
  • Solid understanding of all phases of development using multiple methodologies, i.e. Agile with JIRA and Kanban boards, along with the Remedy and ServiceNow ticketing tools.
  • Expertise in Red Hat Linux tasks including upgrading RPMs and the kernel using YUM, and configuring SAN disks, multipath and LVM file systems.
  • Created and maintained user accounts, profiles, security, rights, disk space and process monitoring; handled and generated tickets via the BMC Remedy ticketing tool.
  • Configured UDP, TLS/SSL, HTTPD, HTTPS, FTP, SFTP, SMTP, SSH, Kickstart, Chef, Puppet and PDSH.
  • Strong overall experience in system administration, installation, upgrades, patching, migration, configuration, troubleshooting, security, backup, disaster recovery, performance monitoring and fine-tuning on Linux (RHEL) systems.
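
A minimal sketch of the kind of HDFS administration commands behind the capacity-planning, node-decommissioning and ACL bullets above; it assumes HDFS superuser access, the hostname and paths are illustrative only, and the excludes file location must match dfs.hosts.exclude in hdfs-site.xml.

```bash
#!/usr/bin/env bash
# Illustrative HDFS admin commands (hostnames/paths are examples only).

# Check overall HDFS capacity and per-DataNode disk utilization.
hdfs dfsadmin -report | head -n 20

# Decommission a DataNode: add it to the excludes file referenced by
# dfs.hosts.exclude in hdfs-site.xml, then ask the NameNode to re-read it.
echo "datanode05.example.com" >> /etc/hadoop/conf/dfs.exclude
hdfs dfsadmin -refreshNodes

# Grant a granular ACL on an HDFS path for one group
# (requires dfs.namenode.acls.enabled=true).
hdfs dfs -setfacl -m group:analytics:r-x /data/warehouse
hdfs dfs -getfacl /data/warehouse
```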

TECHNICAL SKILLS:

Big Data Technologies: Hadoop, HDFS, MapReduce, YARN, Pig, Hive, HBase, SmartSense, ZooKeeper, Oozie, Ambari, Kerberos, Knox, Ranger, Sentry, Spark, Tez, Accumulo, Impala, Hue, Storm, Kafka, Flume, Sqoop, Solr.

Hardware: IBM pSeries, PureFlex, RS/6000, IBM blade servers, HP ProLiant DL360/DL380, HP blade servers C6000/C7000.

SAN: EMC Clariion, EMC DMX, IBM XIV.

Operating Systems: Linux, AIX, CentOS, Solaris & Windows.

Networking: DNS, DHCP, NFS, FTP, NIS, Samba, LDAP, OpenLDAP, SSH, Apache, NIM.

Tools & Utilities: Managenow, Remedy, Maximo, Nagios, Chipre & SharePoint.

Databases: Oracle 10g/11g/12c, DB2, MySQL, HBase, Cassandra, MongoDB.

Backups: Veritas Netbackup & TSM Backup.

Virtualization: VMware vSphere, VIO

Cluster Technologies: HACMP 5.3/5.4, PowerHA 7.1, Veritas Cluster Server 4.1.

Web/Application Servers: Tomcat, WebSphere Application Server 5.0/6.0/7.0, Message Broker, MQSeries, WebLogic Server, IBM HTTP Server.

Cloud Knowledge: OpenStack, AWS.

Scripting & Programming Languages: Shell, Perl and Python.

PROFESSIONAL EXPERIENCE:

Confidential, Austin TX

Sr. Hadoop Administrator

Responsibilities:

  • Understood the existing enterprise data warehouse setup and provided design and architecture suggestions for converting to the Hadoop ecosystem.
  • Deployed a Hortonworks Distribution Hadoop cluster and installed ecosystem components HDFS, YARN, ZooKeeper, HBase, Hive, MapReduce, Pig, Kafka, Storm and Spark on Linux servers using Ambari.
  • Set up automated 24x7x365 monitoring and escalation infrastructure for the Hadoop cluster using Nagios Core and Ambari.
  • Designed and implemented a disaster recovery plan for Hadoop clusters.
  • Implemented high availability and automatic failover infrastructure, utilizing ZooKeeper services, to overcome the NameNode single point of failure.
  • Integrated the Hadoop cluster with Active Directory and enabled Kerberos for authentication.
  • Implemented the Capacity Scheduler on the YARN ResourceManager to share cluster resources among the MapReduce jobs submitted by users (see the first sketch after this list).
  • Set up Linux users and tested HDFS, Hive, Pig and MapReduce access for the new users.
  • Monitored Hadoop jobs and reviewed logs of failed jobs to debug issues based on the errors.
  • Optimized Hadoop cluster components (HDFS, YARN, Hive, Kafka) to achieve high performance.
  • Worked with Linux server admin team in administering the server Hardware and operating system.
  • Interacted with Networking team to improve bandwidth.
  • Provided user, platform and application support on the Hadoop infrastructure.
  • Applied patches and bug fixes on the Hadoop cluster.
  • Proactively involved in ongoing Maintenance, Support and Improvements in Hadoop clusters.
  • Conducted Root Cause Analysis and resolved production problems and data issues.
  • Performed disk space management for the users and groups in the cluster.
  • Added nodes to the cluster and decommissioned nodes from the cluster whenever required.
  • Performed backup and recovery processes in order to upgrade the Hadoop stack.
  • Used Sqoop and DistCp utilities for data copying and data migration (see the second sketch after this list).
  • Managed end-to-end data flow from sources to a NoSQL database (MongoDB) using Oozie.
  • Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs such as MapReduce, Pig, Hive and Sqoop, as well as system-specific jobs such as Java programs and shell scripts.
  • Installed Kafka cluster with separate nodes for brokers.
  • Performed Kafka operations on regular basis.
  • Monitored cluster stability, used tools to gather statistics and improved performance.
  • Used Apache Tez, an extensible framework for building high-performance batch and interactive data processing applications, for Pig and Hive jobs.
  • Identified disk space bottlenecks, installed Nagios Log Server and integrated it with the PRD cluster to aggregate service logs from multiple nodes, and created dashboards for important service logs for better analysis of historical log data.
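
A minimal sketch of the post-configuration checks behind the Kerberos and Capacity Scheduler bullets above; the principal, keytab path and queue name are placeholders, not values from this environment.

```bash
# Authenticate with a Kerberos keytab before running HDFS/YARN commands
# (principal and keytab path are examples).
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-cluster@EXAMPLE.COM
klist

# Confirm HDFS access works with the ticket in place.
hdfs dfs -ls /user

# Inspect the Capacity Scheduler queue that user MapReduce jobs land in.
yarn queue -status default

# Apply capacity-scheduler.xml changes without restarting the ResourceManager.
yarn rmadmin -refreshQueues
```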
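
A minimal sketch of the Sqoop import and DistCp copy pattern referenced in the data-migration bullet; the JDBC URL, table name and NameNode addresses are placeholders.

```bash
# Import a relational table into HDFS with Sqoop (connection details are examples).
sqoop import \
  --connect jdbc:mysql://dbhost.example.com:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4

# Copy a dataset between clusters with DistCp, preserving attributes and
# only transferring files that have changed.
hadoop distcp -p -update \
  hdfs://prod-nn.example.com:8020/data/raw/orders \
  hdfs://dr-nn.example.com:8020/data/raw/orders
```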

Environment: Hue, Oozie, Eclipse, HBase, Flume, Splunk, Linux, Java, Hibernate, Java JDK, Kickstart, Puppet, PDSH, Chef, gcc 4.2, Git, Cassandra, AWS, NoSQL, RedHat, CDH 4.x, Impala, MySQL, MongoDB, Nagios.

Confidential, Farmington Hills, MI

Hadoop Administrator

Responsibilities:

  • Provided administration, management and support for large-scale Big Data platforms on the Hadoop ecosystem.
  • Involved in cluster capacity planning, deployment and management of Hadoop for our data platform operations with a group of Hadoop architects and stakeholders.
  • Developed backup policies for Hadoop systems and action plans for network failure.
  • Involved in user/group management in Hadoop with AD/LDAP integration.
  • Resource management and load management using capacity scheduling, applying changes according to requirements.
  • Implemented a strategy to upgrade the OS on all cluster nodes from RHEL 5 to RHEL 6 and ensured the cluster remained up and running.
  • Expertise in Hadoop cluster tasks such as commissioning and decommissioning nodes, for maintenance and backup, without any effect on running jobs and data.
  • Monitored multiple Hadoop cluster environments using Nagios and Ganglia.
  • Implemented high availability of the Hadoop NameNode using the Quorum Journal Manager (see the first sketch after this list).
  • Good understanding of partitioning concepts and the different file formats supported in Hive and Pig.
  • Developed scripts in shell and Python to automate many day-to-day admin activities.
  • Implemented HCatalog to make partitions available to Pig/Java MapReduce and established a remote Hive metastore using MySQL.
  • Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
  • Implemented a POC by spinning up an 8-node IBM BigInsights (BigInsights for Apache Hadoop edition) cluster per the management requirement.
  • Moved data from production into the IBM BigInsights cluster for testing and performance evaluation.
  • Created Hive tables and set the user permissions.
  • Played a responsible role in deciding the hardware configuration for the cluster along with other teams in the company.
  • Wrote automated scripts for monitoring the file systems.
  • Responsible for giving presentations to teams and managers about new ecosystem components to be implemented in the cluster.
  • Applied patches to the cluster (e.g. HIVE-1975).
  • Added new DataNodes when needed and rebalanced the cluster.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Continuously monitored and managed the Hadoop cluster through Ganglia and Nagios.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs, which run independently with time and data availability.
  • Dealt with major and minor upgrades to the Hadoop cluster without decreasing cluster performance.
  • Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
  • Performed stress and performance testing and benchmarking of the cluster using DFSIO and TeraSort (see the second sketch after this list).
  • Commissioned and decommissioned DataNodes in the cluster in case of problems.
  • Debugged and solved major issues with the help of the Cloudera support team.
  • Supported 300+ business users on the Hadoop platform, resolving the tickets and issues they ran into and helping them find the best ways to achieve their results.
  • Continuous integration of new services into the Hadoop cluster.
  • Installed several projects on Hadoop servers and configured each project to run jobs and scripts successfully.
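
A minimal sketch of the routine checks used after setting up NameNode HA with the Quorum Journal Manager and after adding new DataNodes; the NameNode service IDs and threshold value are illustrative.

```bash
# Confirm which NameNode is active and which is standby (service IDs are examples).
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Optionally exercise a controlled failover during a maintenance window.
hdfs haadmin -failover nn1 nn2

# After commissioning new DataNodes, rebalance blocks across the cluster;
# the threshold is the allowed deviation (in percent) from average utilization.
hdfs balancer -threshold 10
```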
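
A minimal sketch of the TestDFSIO and TeraSort benchmarks mentioned above; the jar locations follow a typical CDH parcel layout and will differ on other installs.

```bash
# HDFS write/read throughput benchmark with TestDFSIO.
JOBCLIENT_JAR=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar
hadoop jar "$JOBCLIENT_JAR" TestDFSIO -write -nrFiles 10 -fileSize 1000
hadoop jar "$JOBCLIENT_JAR" TestDFSIO -read  -nrFiles 10 -fileSize 1000

# TeraGen/TeraSort/TeraValidate to benchmark and validate a full distributed sort.
EXAMPLES_JAR=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
hadoop jar "$EXAMPLES_JAR" teragen 100000000 /benchmarks/terasort-input
hadoop jar "$EXAMPLES_JAR" terasort /benchmarks/terasort-input /benchmarks/terasort-output
hadoop jar "$EXAMPLES_JAR" teravalidate /benchmarks/terasort-output /benchmarks/terasort-report
```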

Environment: CDH 3.x/4.x/5.x, Cloudera Manager 4 & 5, Nagios, Ganglia, Tableau, shell scripting, Oozie, Pig, Hive, Flume, Bash scripting, Teradata, Ab Initio, Kafka, Impala, Sentry, CentOS.

Confidential

Hadoop Administrator.

Responsibilities:

  • Involved in the architectural design of cluster infrastructure, resource mobilization, risk analysis and reporting.
  • Installation and configuration of the BigInsights cluster with the help of IBM engineers.
  • Commissioned and decommissioned DataNodes and involved in NameNode maintenance.
  • Installed Kerberos security on the cluster for AAA (authentication, authorization and auditing).
  • Performed regular backups and cleared logs from HDFS space to utilize DataNodes optimally; wrote shell scripts for time-bound command execution.
  • Edited and configured HDFS and tracker parameters.
  • Scripted the requirements using Big SQL and provided time statistics of running jobs.
  • Involved in code review tasks for simple to complex MapReduce jobs using Hive and Pig.
  • Cluster monitoring using the BigInsights ionosphere tool.
  • Imported data from various data sources and parsed it into structured data region-wise and date-wise; analysed the data by performing Hive queries and running Pig scripts to study customer behaviour.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Configured Oozie workflows to run multiple Hive and Pig jobs which run independently based on time and data availability (see the sketch after this list).
  • Optimized MapReduce code and Pig scripts and performed performance tuning and analysis.
  • Implemented a POC with Spark SQL to interpret JSON records.
  • Created table definitions and made the contents available as a schema-backed RDD.
  • Implemented advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
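
A minimal sketch of submitting and checking the kind of time- and data-triggered Oozie jobs described above; the Oozie URL, properties file and job ID are placeholders.

```bash
# Point the CLI at the Oozie server (URL is an example).
export OOZIE_URL=http://oozie-host.example.com:11000/oozie

# Validate the workflow definition, then submit and start the job.
oozie validate /home/hadoop/jobs/daily-etl/workflow.xml
oozie job -config /home/hadoop/jobs/daily-etl/job.properties -run

# Check status and logs; the job ID is printed by the -run command.
oozie job -info 0000012-250101000000000-oozie-oozi-W
oozie job -log  0000012-250101000000000-oozie-oozi-W
```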

Environment: Linux, Hadoop, BigInsights, Hive, Puppet, Java, C++.

Confidential

Sr. Linux Administrator

Responsibilities:

  • Provided system support for 100+ Red Hat Linux servers including routine maintenance, patching, system backups and restores, and software and hardware upgrades.
  • Worked on building new SUSE and Red Hat Linux servers, supported lease replacements and implemented system patches using the HP Server Automation tool.
  • Configuration, implementation and administration of clustered servers in a SUSE Linux environment.
  • System administration, system planning, coordination, and group-level and user-level management.
  • Created and managed logical volumes in Linux (see the sketch after this list).
  • Experience with backup and recovery software such as NetBackup in a Linux environment.
  • Set up Nagios monitoring software.
  • Supported production systems 24 x 7 on a rotational basis.
  • Resolved security access requests via Peregrine ServiceCenter to provide the requested user access.
  • Handling and generating tickets via the BMC Remedy ticketing tool.
  • Performance monitoring and tuning using top, prstat, sar, vmstat, netstat, jps, iostat, etc.
  • Created new file systems and managed and checked file system data consistency.
  • Successfully migrated virtual machines from the legacy VMware vSphere 4.1 environment to the new VMware vSphere 5.1 environment.
  • Documented the procedure for retiring obsolete servers, resulting in a considerable reduction in time and mistakes for this process as well as streamlining the existing process.
  • Communicated and coordinated with internal/external customers to resolve issues and reduce downtime.
  • Disaster Recovery and Planning.
  • Problem determination, Security, Shell Scripting.
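
A minimal sketch of creating and extending a logical volume of the kind referenced above; device names, VG/LV names and mount points are examples only.

```bash
# Create a physical volume, volume group, and logical volume (names are examples).
pvcreate /dev/sdb1
vgcreate appvg /dev/sdb1
lvcreate -L 20G -n applv appvg

# Build a filesystem and mount it.
mkfs.ext4 /dev/appvg/applv
mkdir -p /app/data
mount /dev/appvg/applv /app/data

# Later, grow the LV by 10 GB and resize the filesystem online.
lvextend -L +10G /dev/appvg/applv
resize2fs /dev/appvg/applv
```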

Environment: Red Hat Linux 5 and 6, HP Gen8 blade and rack-mount servers, Oracle VirtualBox.

Confidential

Linux/ Database Administrator.

Responsibilities:

  • Installed and maintained the Linux servers.
  • Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method.
  • Monitored system metrics and logs for any problems.
  • Added, removed, or updated user account information and reset passwords, etc.
  • Created and managed logical volumes; used Java JDBC to load data into MySQL.
  • Maintained the MySQL server and granted the required users access to databases.
  • Installed and updated packages using YUM (see the sketch after this list).
  • Installed and updated patches on servers.
  • Virtualization on RHEL servers (through Xen and KVM).
  • Resized LVM disk volumes as needed; administered VMware virtual Linux servers.
  • Installation and configuration of Linux for the new build environment.
  • Created virtual servers on Citrix XenServer-based hosts and installed operating systems on the guest servers.
  • Set up PXE boot and Kickstart on multiple servers for remote installation of Linux.
  • Monitored system activity, performance and resource utilization.
  • Updated the YUM repository and Red Hat Package Manager (RPM) packages.
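
A minimal sketch of the YUM/RPM patching workflow referenced above; the package name is an example, and the security sub-commands assume the yum security plugin available on RHEL 5/6.

```bash
# List available updates, including security errata
# (requires yum-plugin-security on RHEL 5/6).
yum check-update
yum updateinfo list security

# Apply security-only updates, or a full update during a patch window.
yum -y update --security
yum -y update

# Install and verify an individual package (package name is an example).
yum -y install httpd
rpm -qi httpd
rpm -V httpd

# Clean cached metadata after changing repository definitions.
yum clean all
```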

Environment: Linux, CentOS, Ubuntu, FTP, NTP, MySQL.
