
Hadoop Administrator Resume


Knoxville, Tennessee

SUMMARY

  • 8+ years of professional IT experience, including around 4 years as a Linux Administrator and 4+ years in Big Data analytics as a Hadoop/Big Data Administrator.
  • Experience in architecting, designing, installing, configuring and managing Apache Hadoop clusters on the MapR, Hortonworks and Cloudera distributions.
  • Experience in configuring and maintaining High Availability for HDFS, the YARN (Yet Another Resource Negotiator) ResourceManager, MapReduce, Hive, HBase and Kafka.
  • Practical knowledge of the functionality of every Hadoop daemon, the interactions between them, resource utilization and dynamic tuning to keep the cluster available and efficient.
  • Experience in managing Hadoop infrastructure tasks such as commissioning, decommissioning, log rotation and rack topology implementation.
  • Experience in understanding and managing Hadoop Log Files.
  • Configured ZooKeeper to coordinate the servers in clusters and to maintain data consistency.
  • Experience with Hadoop's multiple data processing engines, such as interactive SQL, real-time streaming, data science and batch processing, to handle data stored on a single platform under YARN.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop (see the Sqoop sketch after this list).
  • Experience in collecting logs from log collectors into HDFS using Flume.
  • Experience in multi-node Kafka cluster setup.
  • Experience in setting up and managing the Oozie batch scheduler.
  • Extended Hive functionality by writing custom UDFs.
  • Experience in integrating AD/LDAP users with Ambari and Ranger.
  • Good experience in implementing Kerberos and Ranger in the Hadoop ecosystem.
  • Experience in configuring Ranger policies to secure Hadoop services (Hive, HBase, HDFS, etc.).
  • Good Understanding of Rack Awareness in the Hadoop cluster.
  • Experience in using Monitoring tools like Cloudera manager and Ambari.
  • Experienced in adding, configuring and removing services through Ambari.
  • Experienced in configuring Ambari alerts for various components and managing those alerts.
  • Involved in migration of cluster to AWS.
  • Good understanding of Lambda functions.
  • Actively worked on enabling SSL for Hadoop services in EMR.
  • Analyzed and tuned performance of Spark jobs in EMR, accounting for the type and size of the input and the specific instance types used.
  • Good Understanding of data ingestion pipelines.
  • Set up disks for MapR, handled disk failures, configured storage pools and worked with a logical volume manager.
  • Managed data with volumes and worked with snapshots, mirror volumes, data protection and scheduling in MapR.
  • Experience with UNIX commands and shell scripting.
  • Excellent interpersonal, communication, documentation and presentation skills.
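
A minimal sketch of the Sqoop import/export pattern noted above, assuming a MySQL source; the host, database, table and HDFS paths are illustrative placeholders, not actual project values:

    # Import a relational table into HDFS
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/salesdb \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export processed results from HDFS back to the database
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/salesdb \
      --username etl_user -P \
      --table orders_summary \
      --export-dir /data/processed/orders_summary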

TECHNICAL SKILLS

Hadoop/Big Data Technologies: HDFS, MapReduce, YARN, HBase, Hive, Tez, Sqoop, Flume, Zookeeper, Spark, Storm, MongoDB, Pig, Hue, Ranger, Impala, Kafka, Oozie and Kerberos

Programming Languages: Shell Scripting, Java, Python

Monitoring Tools: Cloudera Manager, Ambari, MapR

Databases: SQL Server, MySQL, Cassandra

Web Technologies: HTML, XML, JSON, JavaScript

Operating Systems: Linux, Unix, Windows, Mac, CentOS

Other Concepts: OOPS, Data Structures, Algorithms, Software Engineering

Protocols: TCP/IP, FTP, SSH, Telnet, SCP, RSH, ARP and RARP

Configuration Tools: Puppet, IBM-TEM tool

PROFESSIONAL EXPERIENCE

Confidential, Knoxville, Tennessee

Hadoop Administrator

Responsibilities:

  • Cluster maintenance, monitoring, troubleshooting, managing and reviewing data backups, and managing and reviewing log files using Hortonworks.
  • Implemented and configured a High Availability Hadoop cluster using the Hortonworks distribution.
  • Experience working on Hadoop components like HDFS, YARN, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Storm, Flume, Ambari Infra, Ambari Metrics, Kafka.
  • Experience in configuring Zookeeper to coordinate the servers in clusters to maintain the data consistency.
  • Experience in using Flume to stream data into HDFS from various sources. Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive and Sqoop, as well as system-specific jobs.
  • Deployed a network file system for NameNode metadata backup.
  • Moved data from HDFS to a MySQL database and vice versa using Sqoop.
  • Backed up data from the active cluster to a backup cluster using DistCp (see the DistCp sketch after this list).
  • Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Implemented an instance of ZooKeeper for the Kafka brokers.
  • Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller.
  • Worked with Kerberos, Active Directory/LDAP and UNIX-based file systems.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Implemented commissioning and decommissioning of DataNodes, killing unresponsive TaskTrackers and dealing with blacklisted TaskTrackers.
  • Performance-tuned jobs when YARN jobs, Tez jobs or data loads were slow.
  • Managed the alerts on the Ambari page and took corrective and preventive actions.
  • Managed HDFS disk space and generated HDFS disk utilization reports for capacity planning.
  • Managed user access by setting up new users and assigning them name quotas and space quotas (see the quota sketch after this list).
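
A minimal sketch of the DistCp backup step referenced above; the NameNode hostnames and paths are hypothetical, not actual project values:

    # Copy a directory tree from the active cluster to the backup cluster,
    # updating changed files and preserving attributes
    hadoop distcp -update -p \
      hdfs://active-nn:8020/data/warehouse \
      hdfs://backup-nn:8020/data/warehouse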
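
A minimal sketch of the name and space quota administration referenced above; the directory and limits are illustrative:

    # Cap the number of names (files and directories) under a user directory
    hdfs dfsadmin -setQuota 100000 /user/jdoe

    # Cap the raw space (including replication) the directory may consume
    hdfs dfsadmin -setSpaceQuota 500g /user/jdoe

    # Review current quota usage
    hdfs dfs -count -q -h /user/jdoe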

Environment: HDFS, YARN, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Storm, Flume, Ambari Infra, Ambari Metrics, Kafka, Ranger, Kerberos

Confidential, Herndon, Virginia

Hadoop Administrator

Responsibilities:

  • Installed and configured Hadoop monitoring and administration tools such as Cloudera Manager.
  • Participated in setting up Rack topology in the cluster.
  • Implemented and configured a High Availability (quorum-based) Hadoop cluster using the Cloudera distribution of Hadoop.
  • Periodically reviewed Hadoop-related logs, fixed errors and prevented recurrences by analyzing the warnings.
  • Removed the nodes for maintenance or malfunctioning nodes using decommissioning and added nodes using commissioning.
  • Hands-on experience working on Hadoop ecosystem components like YARN, Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop, Pig, Flume.
  • Monitored services through Zookeeper
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Worked on analyzing data with Hive and Pig.
  • Performed cluster backups using DistCp, Cloudera Manager BDR and parallel ingestion.
  • Configured and managed TCP/IP networking on RHEL systems. Managed filesystems using fdisk and LVM.
  • Configured NFS servers and mounted exported NFS resources on the client side.
  • Set up secure, passwordless SSH authentication on servers using SSH keys (see the SSH sketch after this list). Monitored and controlled system processes.
  • Performed hardware/software installations and used LVM (Logical Volume Management) to create physical volumes and volume groups and to extend and reduce logical volumes.
  • Performed system monitoring using native Linux tools such as top, vmstat, sar, tcpdump and ps. Managed disk usage using native Linux tools such as du, df and sar.
  • Responsible for monitoring system logs for unauthorized root usage and access. Handled NFS/CIFS filesystem mounting and support for developers. Experience upgrading server operating systems, tech refreshes, imaging and patch management.
  • Used Linux utilities such as lvextend and vgextend to grow filesystems as needed (see the LVM sketch after this list). Troubleshot user login and home directory issues.
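
A minimal sketch of the key-based SSH setup referenced above; the key type, user and hostname are illustrative:

    # Generate a key pair on the admin host
    ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ""

    # Push the public key to a managed node
    ssh-copy-id admin@node01.example.com

    # Verify passwordless login
    ssh admin@node01.example.com uptime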
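
A minimal sketch of the LVM filesystem growth referenced above, assuming an ext4 filesystem; the device, volume group and logical volume names are hypothetical:

    # Add a new disk to the volume group, then grow the logical volume
    pvcreate /dev/sdc
    vgextend vg_data /dev/sdc
    lvextend -L +50G /dev/vg_data/lv_app

    # Grow the filesystem to fill the larger volume (use xfs_growfs for XFS)
    resize2fs /dev/vg_data/lv_app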

Environment: Hadoop (quorum-based HA), Oozie, Hive, Pig, Sqoop, MapReduce, HDFS, Cloudera, Cloudera Manager, ZooKeeper, Nagios, Ganglia, Metadata, Flume, YARN.

Confidential, Columbus, Ohio

Hadoop Administrator

Responsibilities:

  • Installed and configured Hadoop monitoring and administration tools for the MapR distribution.
  • Installed and Configured Sqoop to import and export the data into MapR-FS, HBase and Hive from Relational databases.
  • Administering large MapR Hadoop environments to build and support cluster set up, performance tuning and monitoring in an enterprise environment.
  • Involved in the setup and configuration of Kafka as a messaging system.
  • Worked on setting up Kafka for streaming data and monitoring it for the cluster (see the Kafka sketch after this list). Loaded data from relational databases into the MapR-FS filesystem and HBase using Sqoop.
  • Good knowledge of writing and using user-defined functions in Hive, Pig and MapReduce.
  • Set up passwordless login using SSH public/private keys.
  • Set up MapR metrics with a NoSQL database to log metrics data. Integrated clusters with Active Directory and enabled Kerberos for authentication.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
  • Installed and configured other open-source software such as Pig, Hive, HBase, Flume and Sqoop.
  • Worked on creating the data model for HBase from the current Oracle data model. Implemented High Availability and automatic failover infrastructure to overcome the single point of failure for the NameNode, utilizing ZooKeeper services.
  • Used Hive, created Hive tables and was involved in data loading and writing Hive UDFs. Worked with the Linux server admin team in administering the server hardware and operating system.
  • Worked closely with data analysts to construct creative solutions for their analysis tasks. Collaborated with application teams to install operating system and Hadoop updates, patches and version upgrades when required.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop.
  • Performed both major and minor upgrades to the existing cluster and rolling back to the previous version.
  • Developed multiple Kafka Producers and Consumers from scratch as per the business requirements.
  • Handle the issues reported by the developers and clients. Monitor Hadoop cluster connectivity and security.
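
A minimal sketch of the Kafka setup and smoke test referenced above; the broker and ZooKeeper endpoints, topic name and sizing are illustrative, and older Kafka releases take --zookeeper where newer ones use --bootstrap-server for topic creation:

    # Create a replicated topic
    kafka-topics.sh --create --zookeeper zk01:2181 \
      --topic app-events --partitions 6 --replication-factor 3

    # Smoke-test the topic with the console producer and consumer
    kafka-console-producer.sh --broker-list kafka01:9092 --topic app-events
    kafka-console-consumer.sh --bootstrap-server kafka01:9092 \
      --topic app-events --from-beginning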

Environment: MapR, MapR-FS, YARN, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Storm, Flume, Kafka, Ranger, Kerberos

Confidential, Irvine, CA

Hadoop/Bigdata Administrator

Responsibilities:

  • Handle the installation and configuration of a Hadoop cluster using Hortonworks Distribution.
  • Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
  • Handle the data exchange between HDFS and different Web Applications and databases using Flume and Sqoop.
  • Monitor the data streaming between web sources and HDFS.
  • Worked with Kerberos and how it interacts with Hadoop and LDAP.
  • Provided input to development on the efficient utilization of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks.
  • Experience in Continuous Integration and expertise in Jenkins and Hudson tools.
  • Made changes to the cluster's configuration properties based on the volume of data being processed and the performance of the cluster.
  • Set up automated processes to analyze the system and Hadoop log files for predefined errors and send alerts to appropriate groups (see the log-scan sketch after this list).
  • Experience in architecting, designing, installation, configuration and management of Apache Hadoop, Hortonworks Distribution.
  • Responsible for doing capacity planning based on the data size requirements provided by end-clients.
  • Worked with UNIX commands and shell scripting.
  • Experience in doing performance tuning based on the inputs received from the currently running jobs.
  • Used Apache Oozie for scheduling and managing Hadoop jobs. Knowledge of HCatalog for Hadoop-based storage management.
  • Worked with core competencies in Java, HTTP, XML and JSON.
  • Worked on Spark, a fast and general-purpose cluster computing system.
  • Worked on Storm, a distributed real-time computation system that provides a set of general primitives. Commissioned and decommissioned DataNodes from the cluster in case of problems.
  • Experience with GitHub, a web-based Git repository hosting service that offers all the distributed revision control and source code management (SCM) functionality of Git as well as its own added features.
  • Experience in Hortonworks Distribution Platform (HDP) cluster installation and configuration.
  • Worked in statistics collection and table maintenance on MPP platforms.
  • Worked on large sets of structured, semi-structured and unstructured data.
  • Use of Sqoop to import and export data from HDFS to RDBMS and vice-versa.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which internally run as MapReduce jobs (see the Hive sketch after this list).
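
A minimal sketch of the automated log-scan alerting referenced above; the log path, error patterns and mail alias are illustrative placeholders:

    #!/bin/bash
    # Scan the NameNode log for predefined error patterns and mail the on-call group
    LOG=/var/log/hadoop/hdfs/hadoop-hdfs-namenode.log
    PATTERNS='FATAL|OutOfMemoryError|Connection refused'
    HITS=$(grep -E "$PATTERNS" "$LOG" | tail -n 50)

    if [ -n "$HITS" ]; then
      echo "$HITS" | mail -s "Hadoop log alert on $(hostname)" hadoop-oncall@example.com
    fi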
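
A minimal sketch of the Hive table creation and querying referenced above; the database, table, columns and HDFS path are illustrative:

    # Create an external Hive table over raw data in HDFS and run an aggregate query
    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS sales.orders (
      order_id BIGINT, customer STRING, amount DOUBLE)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/data/raw/orders';
    SELECT customer, SUM(amount) AS total FROM sales.orders GROUP BY customer;"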

Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Hortonworks, Flume, HBase, ZooKeeper, Oracle, NoSQL and Unix/Linux.

Confidential

Linux Administrator

Responsibilities:

  • Installing and upgrading Red Hat Linux and Solaris 8 on SPARC, on servers such as HP DL380 G3, G4 and G5 and Dell PowerEdge servers.
  • Experience in LDOMs; created sparse-root and whole-root zones, administered the zones for web, application and database servers, and worked with SMF on Solaris 10.
  • Experience working with HP LVM and Red Hat LVM.
  • Experience in implementing P2P and P2V migrations.
  • Involved in installing and configuring CentOS and SUSE 11 and 12 on HP x86 servers.
  • Implemented HA using Red Hat Cluster and VERITAS Cluster Server 5.0 for the WebLogic agent.
  • Managing DNS, NIS servers and troubleshooting the servers.
  • Troubleshooting application issues on Apache web servers and database servers running on Linux and Solaris.
  • Experience in migrating Oracle and MySQL data using Double-Take products.
  • Used Sun Volume Manager on Solaris and LVM on Linux and Solaris to create volumes with layouts such as RAID 1, 5, 10 and 51 (see the LVM sketch after this list).
  • Performed performance analysis using tools like prstat, mpstat, iostat, sar, vmstat, truss, Dtrace.
  • Experience working on LDAP user accounts and configuring LDAP on client machines.
  • Upgraded ClearCase from 4.2 to 6.x running on Linux (CentOS and Red Hat).
  • Worked on patch management tools like Sun Update Manager.
  • Experience supporting middleware servers running Apache, Tomcat and Java applications.
  • Worked on day-to-day administration tasks and resolved tickets using Remedy.
  • Used HP Service Center and a change management system for ticketing.
  • Worked on the administration of WebLogic 9 and JBoss 4.2.2 servers, including installation and deployments.
  • Worked on F5 load balancers to load-balance and reverse-proxy WebLogic servers.
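
A minimal sketch of the Linux LVM side of the RAID volume layouts referenced above; the device, volume group and logical volume names and sizes are hypothetical:

    # Initialize disks and build a volume group
    pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
    vgcreate vg_app /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # RAID 1 (mirrored) logical volume
    lvcreate --type raid1 -m 1 -L 20G -n lv_mirror vg_app

    # RAID 0 (striped) logical volume across two physical volumes
    lvcreate -i 2 -I 64 -L 40G -n lv_stripe vg_app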

Environment: Solaris 8/9/10, Veritas Volume Manager, web servers, LDAP directory, Active Directory, BEA Web logic servers, SAN Switches, Apache, Tomcat servers, Web Sphere application server.

Confidential

Linux Administrator

Responsibilities:

  • Installing, configuring and updating Solaris 7 and 8, Red Hat 7.x, 8 and 9, and Windows NT/2000 systems using media, JumpStart and Kickstart.
  • Installing and configuring Windows 2000 Active Directory servers and Citrix servers.
  • Published and administered applications via Citrix MetaFrame.
  • Creating system disk partitions, mirroring the root disk drive and configuring device groups in UNIX and Linux environments.
  • Working with VERITAS Volume Manager 3.5 and Logical Volume Manager for file system management, data backup and recovery.
  • Implementing a backup solution using a Dell T120 autoloader and CA ARCserve 7.0.
  • Installed and configured SSH Gate for remote, secured connections.
  • Configuration of DHCP, DNS, NFS and the automounter.
  • Creating, troubleshooting and mounting NFS file systems on different OS platforms (see the NFS sketch after this list).
  • Installing, configuring and troubleshooting various software such as Windd, Citrix, Clarify, Rave, VPN, SSH Gate, Visio 2000, Star Application, Lotus Notes, mail clients, Business Objects, Oracle and Microsoft Project.
  • Working 24/7 on call for application and system support.
  • Experience working with and supporting Sybase databases running on Linux servers.
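
A minimal sketch of the NFS export and client mount referenced above; the exported path, client network and server hostname are illustrative:

    # On the NFS server: export a directory and reload the export table
    echo '/export/projects 10.0.0.0/24(rw,sync,no_root_squash)' >> /etc/exports
    exportfs -ra

    # On the client: mount the export and make it persistent
    mount -t nfs nfsserver:/export/projects /mnt/projects
    echo 'nfsserver:/export/projects /mnt/projects nfs defaults 0 0' >> /etc/fstab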

Environment: HP ProLiant servers, SUN Servers (6500, 4500, 420, Ultra 2 Servers), Solaris 7/8, Veritas Net Backup, Veritas Volume Manager, Samba, NFS, NIS, LVM, Linux, Shell Programming.
