Hadoop Administrator Resume

Dallas, TX

PROFESSIONAL SUMMARY

  • Around 7 years of professional IT experience including 3 years in Big data ecosystem related technologies.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on development and implementation experience with a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, Apache Kite and other Hadoop-related ecosystem tools as data storage and retrieval systems.
  • Setting up monitoring infrastructure for Hadoop Cluster using Nagios and Ganglia.
  • Performed importing and exporting data into HDFS and Hive using Sqoop.
  • Experience in managing and reviewing Hadoop log files.
  • Excellent technical and analytical skills with clear understanding of ETL design and project architecture based on reporting requirements.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom Map Reduce programs in Java.
  • Extending Hive and Pig core functionality by writing UDFs.
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Pig, Flume, Sentry and ZooKeeper.
  • Familiar with major Hadoop distributions: Cloudera CDH, Apache Hadoop and Hortonworks.
  • Utilized the Apache Hadoop environment provided by Hortonworks.
  • Experience in installing, configuring, supporting and managing Cloudera's Hadoop platform, including CDH 4 and CDH 5 clusters.
  • Expert level understanding of Hadoop technologies, integration and troubleshooting.
  • Provided support to data analysts in running Pig and Hive queries.
  • Writing shell scripts to dump the Shared Data from MySQL servers to HDFS.
  • Good understanding of the NoSQL databases like MongoDB, Cassandra.
  • Implemented configuration management system using Puppet
  • Refactored several site-specific Puppet manifests into reusable Puppet modules.
  • Configuration management and deployment with Chef.
  • Experience in designing, installing and configuring VMware ESXi within a vSphere 5 environment with Virtual Center management, Consolidated Backup, DRS, HA, vMotion and VMware Data.
  • Familiar with Java virtual machine (JVM) and multi-threaded processing
  • Hands on experience in Agile and Scrum methodologies.
  • Extensive experience in working with the Customers to gather required information to analyze, provide data fix or code fix for technical problems, and providing Technical Solution documents for the users.
  • Deep knowledge of the Java programming language and exposure to other JVM languages and tech especially Scala, Akka, and Storm
  • Excellent coding skills in Java and Python, and familiarity with source code control.
  • Installing and configuring network security measures such as SSL certificates.
  • Managed small projects interfacing with key stakeholders to update servers and operating systems, security vulnerabilities, patching, passwords, setup and installations and ad hoc projects.
  • Worked with network security technologies such as VPN, IKE, IPsec, firewalls and SSL certificates.

TECHNICAL SKILLS

Big Data Ecosystem: HDFS, HBase, Hadoop MapReduce, ZooKeeper, Hive, Pig, Sqoop, Flume, Oozie, Hue, Nagios, Whirr, Apache Spark, Mahout, Sentry, Impala, Sun Grid Engine, Puppet, Chef

RDBMS/ Database: SQL Server 2000/2005/2008 R2, MS-Access XP/2007/2008, ORACLE 10g/9i, MySQL, NoSQL: MongoDB, HBase, Cassandra

Scripting Languages: UNIX shell scripting, JavaScript, Python, SQL, Pig Latin

Networking: HTTP, SMTP, NFS, FTP, DNS, DHCP, CIFS, TCP/IP

Operating Systems: Unix, Linux, AIX, Windows XP, Windows Server 2000/2003/2008

PROFESSIONAL EXPERIENCE

Confidential, Dallas, TX

Hadoop Administrator

Environment: Hadoop, Map Reduce, Hive, Oozie, Sqoop, Flume, Cloudera Manager, Shell Script, Hortonworks.

Responsibilities:

  • Performed a major upgrade of the cluster from CDH3u6 to CDH4.2.0.
  • Implemented NameNode High Availability on the Hadoop cluster to eliminate the single point of failure.
  • Installed Cloudera Manager on an already existing Hadoop cluster.
  • Involved in efficiently collecting and aggregating large amounts of streaming log data into Hadoop Cluster using Apache Flume.
  • Responsible for ongoing maintenance, expansion and improvement of a cross-regional ESX infrastructure, supporting over 400 Virtual Servers, as well as offshore Desktops systems.
  • Worked closely and coordinated efforts with the storage team to analyze performance data, and used it to plan and deploy methods of maximizing performance and reliability.
  • Studied user behavior and patterns by analyzing data stored in HDFS using Hive.
  • Converted existing SQL queries into Hive queries using HiveQL.
  • Responsible for designing and implementing ETL processes to load data from different sources, perform data mining and analyze data using visualization/reporting tools to leverage the performance of OpenStack.
  • Exported the analyzed data, mined from large volumes of data, to MySQL using Sqoop.
  • Developed custom MapReduce programs and custom User Defined Functions (UDFs) in Hive to transform the large volumes of data with respect to business requirement.
  • Involved in installing and configuring Kerberos to secure the Hadoop cluster and provide authentication for users.
  • Worked on installation of DataStax Cassandra cluster.
  • Installing and monitoring the Hadoop cluster resources using Ganglia and Nagios.
  • Worked with Big Data analysts, designers and scientists in troubleshooting MapReduce job failures and issues with Hive, Pig, Flume, Apache Spark and Sentry.
  • Managed extracting, loading and transforming data in and out of Hadoop using Hive and Impala.
  • Rebuilt the existing Nagios system to integrate tightly with Puppet for automatic monitoring configuration.
  • Collaborated closely with engineers to track down various issues causing segfaults and performance degradations in our environment.
  • Instituted a change management process using Git and Puppet, and added validation and automated problem reporting.
  • Used Apache Kafka, a fast, scalable, fault-tolerant system, to buffer incoming load before the data is analyzed.
  • Managed Hue, the open-source web-based interface for interacting with Hadoop services.
  • Utilized Apache Spark for Interactive Data Mining and Data Processing.

Confidential, NC

Hadoop Administrator

Environment: MapReduce, HDFS, Pig, Hive, HBase, Flume, Sqoop, Oozie, ZooKeeper, Nagios, Ganglia, Tableau and Cloudera Manager.

Responsibilities:

  • Worked on setting up the Hadoop cluster for the dev, test and prod environments.
  • Worked on pulling data from Oracle databases into the Hadoop cluster using Sqoop import.
  • Worked with Flume to import log data from the reaper logs and syslogs into the Hadoop cluster.
  • Monitored disk, memory, heap and CPU utilization on all master and slave machines using Cloudera Manager.
  • Worked on High Availability for the NameNode using Cloudera Manager to avoid a single point of failure.
  • Configured Fair Scheduler to provide fair resources to all the applications across the cluster.
  • Worked with application teams to install Hadoop updates, patches, version upgrades as required.
  • Managed and reviewed data backups and log files.
  • Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured that they were able to read data from HDFS without any issues.
  • Implemented Nagios and integrated it with Puppet for automatic monitoring of servers known to Puppet.
  • Effectively used Sqoop to transfer data from databases (MySQL, Oracle) to HDFS and Hive.
  • Created Hive Managed and External tables defined with static and dynamic partitions.
  • Used Ganglia to monitor the cluster around the clock.
  • Supported Data Analysts in running Map Reduce Programs.
  • Contributed to the creation and maintenance of system documentation.
  • Deployed Hadoop on AWS EC2 with Whirr.
  • Created authenticated URLs with time-bounded validity using Amazon S3.
  • Utilized Apache Solr to replace the core content, boost features and improve performance.

Confidential, Sunnyvale, CA

Linux Administrator

Environment: Linux (Red Hat Enterprise, CentOS), Windows 2000/NT, HP, IBM, Solaris, Oracle 8i, Cisco routers/switches, Dell 6400, 1250, Sun E450, E250.

Responsibilities:

  • Installation and configuration of Red Hat Linux, Solaris, Fedora and CentOS on new server builds as well as during upgrades.
  • Log management, including monitoring and cleaning up old log files.
  • System audit reports covering number of logins, successes and failures, and running cron jobs.
  • System performance monitoring on an hourly or daily basis.
  • Remotely copying files using sftp, ftp, scp, WinSCP and FileZilla.
  • Resolved security access requests via Peregrine ServiceCenter to provide the requested user access.
  • Created user roles and groups for securing resources using local operating system authentication.
  • Experienced in tasks like managing user accounts and groups, and managing disks and file systems.
  • Installed and configured Intrusion Detection Systems (IDS) such as Tripwire, Snort and LIDS.
  • Configuring & monitoring DHCP server.
  • Taking backups using tar and recovering data after data loss.
  • Experience in writing bash scripts for job automation.
  • Problem determination, Security, Shell Scripting.
  • Documenting the installation of third-party software.
  • Configuring printers for the Solaris and Linux servers and installing third-party software.
  • Maintaining relations with project managers, DBAs, developers, application support teams and operational support teams to facilitate effective project deployment.
  • Manage system installation, troubleshooting, maintenance, performance tuning, managing storage resources, network configuration to fit application and database requirements.
  • Responsible for modifying and optimizing backup schedules and developing shell scripts for it.
  • Performed regular installation of patches using RPM and YUM.
  • Troubleshooting Linux network and security-related issues and capturing packets, using tools such as iptables, firewalls, TCP Wrappers and Nmap.
  • Maintained LVM, VxVM and SVM filesystems along with NFS.
  • Designed and built a platform on AWS Elastic MapReduce and S3/HDFS storage with HBase and Apache Mahout.

Confidential

System Administrator

Responsibilities:

  • Installation and configuration of Red Hat Linux, Solaris, Fedora and CentOS on new server builds as well as during upgrades.
  • Installing Red Hat Linux using Kickstart and applying security policies for hardening the server based on company policies.
  • Log management, including monitoring and cleaning up old log files.
  • System audit reports covering number of logins, successes and failures, and running cron jobs.
  • System performance monitoring on an hourly or daily basis.
  • Remotely copying files using sftp, ftp, scp, WinSCP and FileZilla.
  • Created user roles and groups for securing resources using local operating system authentication.
  • Experienced in tasks like managing user accounts and groups, and managing disks and file systems.
  • Installed and configured Intrusion Detection Systems (IDS) such as Tripwire, Snort and LIDS.
  • Configuring & monitoring DHCP server.
  • Taking backups using tar and recovering data after data loss.
  • Experience in writing bash scripts for job automation.
  • Documenting the installation of third-party software.
  • Configuring printers for the Solaris and Linux servers and installing third-party software.
  • Maintaining relations with project managers, DBAs, developers, application support teams and operational support teams to facilitate effective project deployment.
  • Manage system installation, troubleshooting, maintenance, performance tuning, managing storage resources, network configuration to fit application and database requirements.
  • Responsible for modifying and optimizing backup schedules and developing shell scripts for it.
  • Performed regular installation of patches using RPM and YUM.
  • Maintained LVM, VxVM and SVM filesystems along with NFS.
