
Hadoop Admin Resume


SUMMARY:

  • Over 7 years of experience, including 4+ years with the Hadoop ecosystem, spanning installation and administration of UNIX/Linux servers and configuration of Hadoop ecosystem components in existing clusters.
  • Experience in configuring, installing and managing MapR, Hortonworks & Cloudera Distributions.
  • Hands-on experience in installing, configuring, monitoring and using Hadoop components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, Oozie, Apache Spark and Impala.
  • Working experience building and supporting large-scale Hadoop environments, including design, configuration, installation, performance tuning and monitoring.
  • Experience in setting up automated monitoring and escalation infrastructure for Hadoop Cluster using Ganglia and Nagios.
  • Hands-on experience configuring Hadoop clusters in professional environments and on Amazon Web Services (AWS) using EC2 instances.
  • Good knowledge on Data Warehousing, ETL development, Distributed Computing, and large scale data processing.
  • Good knowledge on implementation and design of big data pipelines.
  • Extensive experience in installing, configuring and administering Hadoop clusters for major Hadoop distributions such as CDH5 and HDP.
  • Worked on agile projects delivering an end-to-end continuous integration/continuous delivery pipeline by integrating tools such as Jenkins, Puppet and AWS for VM provisioning.
  • Experience with the complete software development lifecycle, including design, development, testing and implementation of moderately to highly complex systems.
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using the Apache, Hortonworks, Cloudera and MapR distributions.
  • Excellent knowledge of NoSQL databases such as HBase and Cassandra.
  • Experience in administration of Kafka and Flume streaming using Cloudera Distribution.
  • Experience in job scheduling with the Fair, Capacity and FIFO schedulers, and in inter-cluster data copying with the DistCp tool (a backup copy is sketched after this list).
  • Experience in designing and implementing secure Hadoop clusters using Kerberos.
  • Experience with AWS CloudFront, including creating and managing distributions to provide access to S3 buckets or HTTP servers running on EC2 instances.
  • Involved in implementing security on HDP and HDF Hadoop clusters with Kerberos for authentication, Ranger for authorization, and LDAP integration for Ambari, Ranger and NiFi.
  • Responsible for supporting the Hadoop production environment, which includes Hive, YARN, Spark, Impala, Kafka, SOLR, Oozie, Sentry, Encryption, HBase, etc.
  • Migrated applications from existing systems such as MySQL, Oracle, DB2 and Teradata to Hadoop.
  • Good hands-on experience in Linux administration and in troubleshooting network- and OS-level issues.
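A minimal sketch of the DistCp usage referenced above, copying a warehouse directory from a production cluster to a remote backup cluster (hostnames and paths are hypothetical):

    # Incremental copy; -update skips files already present with matching size
    hadoop distcp -update -skipcrccheck \
      hdfs://prod-nn01:8020/user/hive/warehouse \
      hdfs://dr-nn01:8020/backups/hive/warehouse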

TECHNICAL SKILLS:

Big Data Tools: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, Kafka, Hortonworks, Ambari, Knox, Phoenix, Impala, Storm.

Hadoop Distribution: Cloudera Distribution of Hadoop (CDH), Hortonworks HDP (Ambari 2.6.5)

Operating Systems: UNIX, Linux, Windows XP, Windows Vista, Windows 2003 Server

Servers: WebLogic, WebSphere and JBoss application servers; Tomcat and Nginx web servers.

Programming Languages/Scripting: Java, PL/SQL, Shell Script, Perl, Python.

Tools: Interwoven TeamSite, GMS, BMC Remedy, Eclipse, Toad, SQL Server Management Studio, Jenkins, GitHub, Ranger, TestNG, JUnit.

Databases: MySQL, Oracle, Teradata, InfluxDB; NoSQL: Couchbase, HBase, MongoDB, Cassandra.

Processes: Systems Administration, Incident Management, Release Management, Change Management.

WORK EXPERIENCE:

Hadoop Admin

Confidential

Responsibilities:

  • Maintained the cluster, adding and removing nodes using cluster monitoring tools (Cloudera Manager and Ganglia), configured NameNode high availability (verification is sketched after this list) and kept track of all running jobs.
  • Implementation, management and administration of the overall ecosystem.
  • Execution of day-to-day running of clusters, including ensuring availability on a continuous basis.
  • Implementation of capacity planning and estimation of requirements for lowering or increasing cluster capacity.
  • Management and review of log files.
  • Implementation of recovery and backup tasks.
  • Implementation of resource and security management.
  • Built new Dev/QA environments on a new Cloudera 5.15.2 Hadoop cluster.
  • Enabled Kerberos and TLS security.
  • Created policies for RBAC (role-based access control).
  • Created AD groups for the respective teams.
  • Built logical and physical diagrams for Dev/QA and provided hardware recommendations.
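A minimal sketch of verifying the NameNode HA setup referenced in the first bullet above (the service IDs nn1/nn2 are common defaults and depend on hdfs-site.xml):

    # Confirm which NameNode is currently active
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # Gracefully fail over for planned maintenance (hypothetical service IDs)
    hdfs haadmin -failover nn1 nn2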

Hadoop Admin

Confidential, Chandler, AZ

Responsibilities:

  • Deployed HDInsight clusters in Microsoft Azure, ranging from Dev to Prod.
  • Implemented the ESP (Enterprise Security Package) on the HDInsight cluster.
  • Involved in creating role-based access policies.
  • Created keytabs for all application-specific service accounts (sketched after this list).
  • Involved in cluster tuning for Hive, Tez, YARN, HDFS and Spark.
  • Configured Hive LLAP for better query performance.
  • Worked closely with the developer team to troubleshoot all ongoing issues.
  • Worked on RStudio connectivity to a secured HDP 3.1 cluster.
  • Worked on Zeppelin with the Spark, Livy and Hive interpreters.
  • Installed and configured a Kerberos client/ticket manager on client machines to connect to the secured Hadoop cluster.
  • Installed ggplot2 libraries on all R nodes for the analytics team to develop graphs in Zeppelin.
  • Involved in performance tuning of the spark.livy2 interpreter while troubleshooting ongoing issues.
  • Worked on Spark SQL and Hive query execution.
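A minimal sketch of the keytab creation mentioned above for an application service account, using MIT Kerberos ktutil (principal, realm and paths are hypothetical):

    # Interactive ktutil session; the password is prompted after addent
    ktutil
      addent -password -p svc-etl@EXAMPLE.COM -k 1 -e aes256-cts-hmac-sha1-96
      wkt /etc/security/keytabs/svc-etl.keytab
      quit

    # Validate the new keytab against the secured cluster
    kinit -kt /etc/security/keytabs/svc-etl.keytab svc-etl@EXAMPLE.COM
    klist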

Hadoop Admin

Confidential, Denver, CO

Responsibilities:

  • Installed a Hadoop cluster on the Hortonworks distribution HDP 2.6.
  • Implemented LDAP integration with Ambari.
  • Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, and managing and reviewing Hadoop log files.
  • Worked as a Hadoop administrator on the Hortonworks distribution of Hadoop.
  • Planned and installed Hadoop clusters from PoC to Prod environments.
  • Implemented HA for the NameNode, YARN, Hive and Spark.
  • Upgraded the Big Data Dev & Prod clusters from HDP 2.3.x to HDP 2.6.x.
  • Upgraded Ambari on the Dev & Prod clusters from 2.1.x to 2.6.x.x.
  • Installed an Elasticsearch cluster using Ansible.
  • Installation, configuration and administration experience with the Hortonworks Ambari big data platform and Apache Hadoop on RedHat and CentOS as data storage, retrieval and processing systems.
  • Experience in designing, installing, configuring, supporting and managing Hadoop clusters using the Apache and Hortonworks distributions.
  • Created an Ansible playbook for the Elasticsearch configuration.
  • Troubleshot and debugged cluster problems on RHEL-based Linux infrastructure.
  • Configured the YARN Capacity Scheduler based on infrastructure needs and performed YARN tuning (sketched after this list).
  • Reviewed the NameNode Web UI for information concerning DataNode volume failures.
  • Purged older log files.
  • Implemented ACLs and audits to meet compliance specifications using Ranger.
  • Installed Kafka Manager to manage and monitor the Kafka cluster and to track consumer lag (a CLI equivalent is sketched below).
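A minimal sketch of the Capacity Scheduler tuning noted above (queue names and percentages are hypothetical; on Ambari-managed clusters the same properties are edited through Ambari):

    # capacity-scheduler.xml (fragment)
    #   yarn.scheduler.capacity.root.queues           default,etl
    #   yarn.scheduler.capacity.root.default.capacity 60
    #   yarn.scheduler.capacity.root.etl.capacity     40

    # Apply the new queue layout without restarting the ResourceManager
    yarn rmadmin -refreshQueues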

Environment: Hortonworks, Hive, HDFS, Impala, Spark, YARN, Kafka, SOLR, Oozie, Sentry, Encryption, HBase.
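The consumer-lag monitoring mentioned above can also be checked from the Kafka CLI, as a rough equivalent of what Kafka Manager displays (broker host and group name are hypothetical):

    kafka-consumer-groups.sh --bootstrap-server broker01:9092 \
      --describe --group log-ingest
    # Output lists CURRENT-OFFSET, LOG-END-OFFSET and LAG per partition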

Hadoop Admin

Confidential, St. Louis, MO

Responsibilities:

  • Deployed a Hadoop cluster of the Cloudera Distribution and installed ecosystem components HDFS, YARN, Zookeeper, HBase, Hive, MapReduce, Pig, Kafka, Confluent Kafka, Storm and Spark on Linux servers.
  • Configured Capacity Scheduler on the Resource Manager to provide a way to share large cluster resources.
  • Deployed Name Node high availability for major production cluster.
  • Installed Kafka cluster with separate nodes for brokers.
  • Experienced in writing automated scripts for monitoring the file systems and key MapR services.
  • Configured Oozie for workflow automation and coordination.
  • Troubleshoot production level issues in the cluster and its functionality.
  • Backed up data on a regular basis to a remote cluster using DistCp.
  • Involved in installing and configuring Confluent Kafka in the R&D line, and validated the installation with the HDFS and Hive connectors.
  • Set up the cluster and installed all ecosystem components both through MapR and manually through the command line in the lab cluster.
  • Implemented High Availability and automatic failover infrastructure to overcome single point of failure for Name node utilizing Zookeeper services.
  • Used Sqoop to connect to Oracle, MySQL and Teradata and move the data into Hive/HBase tables (an import is sketched after this section).
  • Used Apache Kafka for importing real time network log data into HDFS.
  • Performed Disk Space management to the users and groups in the cluster.
  • Used Storm and Kafka Services to push data to HBase and Hive tables.
  • Documented slides and presentations on the Confluence page.
  • Used Kafka to allow a single cluster to serve as the central data backbone for a large organization.
  • Added Nodes to the cluster and Decommissioned nodes from the cluster whenever required.
  • Used the Sqoop and DistCp utilities for data copying and data migration.
  • Worked on end-to-end data flow management from sources to a NoSQL (MongoDB) database using Oozie.
  • Worked with the Continuous Integration team to set up GitHub for scheduling automatic deployments of new/existing code in production.
  • Monitored multiple Hadoop cluster environments using Nagios. Monitored workload, job performance and capacity planning using the MapR Control System.
  • Monitor Hadoop cluster connectivity and security.
  • Manage and review Hadoop log files.
  • File system management and monitoring.
  • Used the Avro SerDe packaged with Hive for serialization and deserialization to parse the contents of streamed log data.

Environment: Kafka, HBase, Hive, Pig, Sqoop, YARN, Apache Oozie workflow scheduler, Flume, Zookeeper, Kerberos, Nagios, MapR.
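A minimal sketch of the Sqoop import pattern referenced above, pulling an Oracle table into Hive (connection string, credentials and table names are hypothetical):

    sqoop import \
      --connect jdbc:oracle:thin:@//oradb01:1521/ORCL \
      --username etl_user -P \
      --table CUSTOMERS \
      --hive-import --hive-table staging.customers \
      --num-mappers 4   # parallel map tasks; tune to source DB capacity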

Hadoop Admin

Confidential, Atlanta, GA

Responsibilities:

  • Experience in managing scalable Hadoop cluster environments.
  • Involved in managing, administering and monitoring clusters in Hadoop Infrastructure.
  • Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
  • Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades when required.
  • Experience in HDFS maintenance and administration.
  • Managing nodes on Hadoop cluster connectivity and security.
  • Experience in commissioning and decommissioning nodes from the cluster (decommissioning is sketched after this section).
  • Experience in Name Node HA implementation.
  • Working on architected solutions that process massive amounts of data on corporate and AWS cloud-based servers.
  • Working with data delivery teams to setup new Hadoop users.
  • Installed the Oozie workflow engine to run multiple MapReduce, Hive and Pig jobs.
  • Configured the Hive Metastore for the Hadoop ecosystem and management tools.
  • Hands-on experience in Nagios and Ganglia monitoring tools.
  • Experience in HDFS data storage and support for running Map Reduce jobs.
  • Performed tuning and troubleshooting of MR jobs by analyzing and reviewing Hadoop log files.
  • Installed and configured Hadoop ecosystem components such as Sqoop, Pig, Flume and Hive.
  • Maintaining and monitoring clusters. Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Experience in using DistCp to migrate data between and across clusters.
  • Installed and configured Zookeeper.
  • Hands-on experience in analyzing log files for Hadoop ecosystem services.
  • Coordinate root cause analysis efforts to minimize future system issues.
  • Troubleshot hardware issues and worked closely with various vendors on hardware/OS and Hadoop issues.

Environment: Cloudera 4.2, HDFS, Hive, Pig, Sqoop, HBase, Chef, RHEL, Mahout, Tableau, MicroStrategy, Shell Scripting, Red Hat Linux.
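A minimal sketch of the node decommissioning workflow mentioned above (hostname and exclude-file path are hypothetical; the path must match dfs.hosts.exclude in hdfs-site.xml):

    # Mark the DataNode for decommissioning and tell the NameNode to re-read
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Safe to stop the node once its state reads "Decommissioned"
    hdfs dfsadmin -report | grep -A 1 "datanode07"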

Linux Systems Administrator

Confidential

Responsibilities:

  • Installation, configuration and troubleshooting of Red Hat 4.x/5.x, Ubuntu 12.x and HP-UX 11.x on various hardware platforms.
  • Installed and configured JBoss application server on various Linux servers
  • Configured Kickstart for RHEL (4 and 5), Jumpstart for Solaris and NIM for AIX to perform image installation over the network.
  • Involved in configuring and troubleshooting multipathing on Solaris and bonding on Linux servers.
  • Configured DNS Servers and Clients, and involved in troubleshooting DNS issues
  • Worked with Red Hat Linux tools such as RPM to install packages and patches on Red Hat Linux servers.
  • Developed scripts to automate administration tasks such as customizing user environments and performance monitoring and tuning with nfsstat, netstat, iostat and vmstat (a snapshot script is sketched after this list).
  • Performed network and system troubleshooting activities under day to day responsibilities.
  • Created Virtual server on VMware ESX/ESXi based host and installed operating system on Guest Servers.
  • Installed and configured the RPM packages using the YUM Software manager.
  • Involved in developing custom scripts using Shell (bash, ksh) to automate jobs.
  • Defined and developed plans for the Change, Problem and Incident Management processes based on ITIL.
  • Worked with networking protocols such as TCP/IP, Telnet, FTP, NDM, SSH and rlogin.
  • Installed, configured, maintained and administered the network servers DNS, NIS, NFS and Sendmail, and the application servers Apache and JBoss on Linux.
  • Extensive use of Logical Volume Manager (LVM): creating volume groups, creating logical volumes and disk mirroring on HP-UX, AIX and Linux (sketched at the end of this section).
  • Worked with clusters, adding multiple IP addresses to servers via virtual network interfaces to minimize network traffic (load-balancing and failover clusters).
  • User and security administration: adding users to the system and password management.
  • System monitoring and log management on UNIX and Linux servers, including crash and swap management, password recovery and performance tuning.
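A minimal sketch of the kind of performance-monitoring script described above (the log path is hypothetical):

    #!/bin/bash
    # Capture a point-in-time performance snapshot to a dated log file
    LOG=/var/log/perf-snapshot.$(date +%F).log
    {
      echo "== $(date) =="
      vmstat 1 5              # CPU, memory and run-queue samples
      iostat -x 1 3           # extended per-device I/O statistics
      netstat -s | head -20   # network protocol counters
      nfsstat -c              # NFS client call statistics
    } >> "$LOG"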

Environment: Red Hat, Ubuntu, JBoss, RPM, VMware ESXi, LVM, DNS, NIS, HP-UX.
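A minimal sketch of the LVM workflow noted in the responsibilities above, on Linux (device names and sizes are hypothetical):

    # Create a volume group from two physical disks
    vgcreate vg_data /dev/sdb /dev/sdc

    # Mirrored logical volume (-m 1 keeps one extra copy for redundancy)
    lvcreate -L 50G -m 1 -n lv_app vg_data

    mkfs.ext4 /dev/vg_data/lv_app
    mount /dev/vg_data/lv_app /app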
