We provide IT Staff Augmentation Services!

Hadoop Administrator Resume

2.00/5 (Submit Your Rating)

Dallas, TX

PROFESSIONAL SUMMARY:

  • Overall 8 years of IT experience which include 4+ years of experience in Hadoop Administration and 3+ years of experience into Linux/UNIX Systems administration.
  • Experienced in installation, configuration, supporting and monitoring Hadoop clusters using ClouderaHortonworks and MapR distributions.
  • Experience with complete Software Design Lifecycle including design, development, testing and implementation of moderate to advanced complex systems.
  • Hadoop Cluster capacity planningperformance tuningcluster MonitoringTroubleshooting.
  • Design Big Data solutions for traditional enterprise businesses.
  • Backup configuration and Recovery from a NameNode failure.
  • Excellent command in creating Backups & Recovery and Disaster recovery procedures and Implementing BACKUP and RECOVERY strategies for off - line and on-line Backups.
  • Experience in minor and major upgrades of Hadoop and Hadoop eco system
  • Experience monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network
  • Ability to work closely with Devops teams, in order to ensure high quality and timely delivery of builds and releases.
  • Hands on experience in analyzing Log files for Hadoop and eco system services and finding root cause.
  • Experience on Commissioning, Decommissioning, Balancing, and Managing Nodes and tuning server for optimal performance of the cluster.
  • As an admin involved in Cluster maintenance, trouble shooting, Monitoring and followed proper backup& Recovery strategies.
  • Good Experience in setting up the Linux environments, Password less SSH, Creating file systems, disabling firewalls, swappiness, Selinux and installing Java.
  • Experienced with Devops tools like Chef, Puppet, Ansible, Jenkins, Jira, Docker and Splunk. 
  • Good Experience in Planning, Installing and Configuring Hadoop Cluster in Cloudera and Hortonworks Distributions
  • Installing and configuring Hadoop eco system like Pig, Hive.
  • Hands on experience in Installing, Configuring and managing the Hue and HCatalog.
  • Experience in importing and exporting the data using Sqoop from HDFS to Relational Database systems/mainframe and vice-versa.
  • Experience in importing and exporting the logs using Flume.
  • Optimizing performance of Hbase/Hive/Pig jobs.
  • Experienced on Devops Virtualization technologies like V M ware V Sphere &XEN. 
  • Hands on experience in Zookeeper and ZKFC in managing and configuring in NameNode failure scenarios.
  • Knowledge on MEAN Stack - Express.js, AngularJS, Node.js.  Expertise using TalendPentaho, Sqoop for ETL operations
  • Handsome experience in Linux admin activities on RHEL & Cent OS.
  • Experience in deploying Hadoop 2.0(YARN).
  • Expertise in implementing enterprise level security using AD/LDAPKerberosKnox, Sentry and Ranger.
  • Extensive experience in developing the SOA middleware based out of Fuse ESB and Mule ESB.And Configured, Elastic Search Log StashKibana to monitor spring batch jobs.
  • Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability
  • Familiar with writing Oozie workflows and Job Controllers for job automation.
  • Hands on experience in provisioning and managing multi-tenant Hadoop clusters on public cloud environment -  Amazon Web Services (AWS)-EC2 and on private cloud infrastructure -  Open Stack cloud platform.
  • Effective problem solving skills and outstanding interpersonal skills. Ability to work independently as well as within a team environment. Driven to meet deadlines. Ability to learn and use new technologies quickly.

TECHNICAL SKILLS:

  • HDFS, Map Reduce, Pig, Hive, Hbase, Sqoop, Zookeeper, Oozie, Hue, HCatalog, Storm, Kafka, Key Value Store Indexer, and Flume.
  • MySQL, Oracle 8i/9i/10g, SQLServer, PL/SQL.
  • Hbase, Cassandra, Cloudera  Impala, Mongo DB.
  • HDP Ambari, Cloudera Manager, Hue, SolrCloud.
  • Shellscripting, HTMLscripting, Puppet, Ansible.
  • Apache Tomcat, JBOSS and Apache Http web server.
  • Net Beans, Eclipse, Visual Studio, Microsoft SQL Server, MS Office.
  • Kerberos, NagiOS & Ganglia
  • Java, HTML, MVC, Struts, Hibernate, Servlet, Spring, Web services.
  • Windows XP, 7, 8, UNIX, MAC, MS DOS.
  • MS Office, MS Project, MS Visio, MS Visual Studio 2003/ 2005/ 2008.

PROFESSIONAL EXPERIENCE:

Hadoop Administrator

Confidential, Dallas, Tx

Responsibilities:

  • Responsible for adding/installation of new components and removal of them through HDP.
  • Responsible for Deployment Automation using multiple tools Chef, GIT, ANT Scripts on AWS.
  • Involved in Analyzing system failures, identifying root causes, and recommended course of actions.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Working with data delivery teams to setup new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing HDFS, Hive
  • Installed, configured and deployed a 50 node MapR Hadoop Cluster for Development and Production
  • Responsible for Cluster maintenance, commissioning and decommissioning Data nodes, Cluster Monitoring, Troubleshooting, Manage and review data backups, Manage & review Hadoop log files.
  • Managing Amazon Web Services (AWS) infrastructure with automation and configuration management tools such as Chef, Ansible, Puppet
  • Experience in projects involving movement of data from other databases to Cassandra with basic knowledge of Cassandra Data Modeling.
  • Worked with development teams to assist in Hive tables and storage design, query analysis
  • Experience in setting up automated monitoring and escalation infrastructure for Hadoop Cluster using Ganglia and Nagios. 
  • Responsible for Cluster maintenance, Monitoring, commissioning and decommissioning Data nodes, Troubleshooting, Manage and review data backups, Manage & review log files.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (like MapReduce, Pig, Hive, Sqoop) as well as system specific jobs.
  • Migrated services from a managed hosting environment to AWS including service design, network layout, data migration, automation, monitoring, deployments and cutover, documentation, overall plan, cost analysis, and timeline. 
  • Integrated Kafka with Flume in sand box Environment using Kafka source and Kafka sink.
  • Worked on Audit login, Error login and Reference check while loading data to Data warehouse.
  • Experienced installing, configuring, upgrading, managing, and administering a Cassandra database.
  • Integrated Impala to use the same file and data formats, metadata, security and resource management frameworks.
  • Worked on tuning the performance Pig queries.
  • Installed Oozie workflow engine to run multiple Hive and pig jobs.
  • Troubleshoot the build issue during the Jenkins build process. Implement Docker to create containers for Tomcat Servers, Jenkins.
  • Coordinated deployment methodologies (Bash, Puppet & Ansible).

Environment: HDFS, Map Reduce, Hive, AWS, Pig, GIT, Flume, kafka, Ambari, Oozie, Sqoop, HDP, Informatica, SOLR, python, Nosql, chef, Kerberos, MySQL and Oracle.

Hadoop Administrator

Confidential, McLean, VA

Responsibilities:

  • Day to day responsibilities includes solving developer issues, deployments moving code from one environment to other environment, providing access to new users and providing instant solutions to reduce the impact and documenting the same and preventing future issues.
  • Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Involved in Analyzing system failures, identifying root causes, and recommended course of actions.
  • Interacting with Cloudera support and log the issues in Cloudera portal and fixing them as per the recommendations.
  • Using Flume and Spool directory loading the data from local system to HDFS.
  • Retrieved data from HDFS into relational databases with Sqoop. Parsed cleansed and mined useful and meaningful data in HDFS using Map-Reduce for further analysis
  • Fine tuning hive jobs for optimized performance.
  • Worked on NoSQL databases including HBase, Mongo DB, and Cassandra. Implemented multi-data center and multi-rack Cassandra cluster.
  • Configured internode communication between Cassandra nodes and client using SSL encryption.
  • Partitioned and queried the data in Hive for further analysis by the BI team. Extending the functionality of Hive and Pig with custom UDF s and UDAF’s.
  • Monitoring SOLR transparency’s and reviews the SOLR servers. Creating and deploying a corresponding SolrCloud collection.
  • Creating and truncating HBase tables in hue and taking backup of submitter ID(s).
  • Configuring, Managing permissions for the users in hue. Responsible for building scalable distributed data solutions using Hadoop.
  • Commissioned and Decommissioned nodes on CDH5 Hadoop cluster on Red hat LINUX.
  • Involved in loading data from LINUX file system to HDFS. Creating and managing the Cron jobs.
  • Implementation of Ranger, Ranger plug-ins and Knox security tools. Implemented Kerberos Security Authentication protocol for existing cluster.
  • Researched applications in use that is not able to use TLS and possibly stand up a proxy LDAP server to fulfill this request.
  • Implemented test scripts to support test driven development and continuous integration.
  • Responsible to manage data coming from different sources. Involved in loading data from UNIX file system to HDFS.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.

Environment: HDFS, Map Reduce, Hive 1.1.0, Hue 3.9.0, Pig, Flume, Salt, Puppet, chef, Oozie, Sqoop, CDH5, Apache Hadoop 2.6, Spark, SOLR, Storm, Knox, Impala, Red Hat, MySQL and Oracle.

Hadoop Administrator

Confidential, Herndon, VA

Responsibilities:

  • Worked on setting up Hadoop cluster for the Production Environment.
  • Supported 200+ servers and 50+ users to use Hadoop platform and resolve tickets and issues they run into and provide training to users to make Hadoop usability simple and updating them for best practices.
  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Installed, configured and deployed a 50 node MapR Hadoop Cluster for Development and Production
  • Responsible for Cluster maintenance, commissioning and decommissioning Data nodes, Cluster Monitoring, Troubleshooting, Manage and review data backups, Manage & review Hadoop log files.
  • Configured, installed, monitored MapR Hadoop on 10 AWS ec2 instances and configured MapR on Amazon EMR making AWS S3 as default file system for the cluster
  • Integrated Apache Storm with Kafka to perform web analytics. Uploaded click stream data from Kafka to HDFS, Hbase and Hive by integrating with Storm.
  • Monitoring systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
  • Responsible for Installing, setup and Configuring Apache Kafka and Apache Zookeeper.
  • Wrote a technical paper and created slideshow outlining the project and showing how Cloudera can be potentially used to improve performance.
  • Setting up monitoring tools for Hadoop monitoring and alerting. Monitoring and maintaining Hadoop cluster HBase/zookeeper.
  • Write scripts to automate application deployments and configurations. Hadoop cluster performance tuning and monitoring. Troubleshoot and resolve Hadoop cluster related system problems.
  • As an admin followed standard Back up policies to make sure the high availability of cluster.
  • Involved in Analyzing system failures, identifying root causes, and recommended course of actions. Documented the systems processes and procedures for future references.
  • Performance tuning of Hadoop clusters and Hadoop MapReduce routines. Screen Hadoop cluster job performances and capacity planning.
  • Monitored Hadoop cluster connectivity and security and also involved in management and monitoring Hadoop log files.

Environment: Big Data, HDFS, Pig, Hive, Storm, HBase, MapReduce, Sqoop, Flume, Zookeeper, Eclipse, MYSQL, UNIX Shell Scripting.

Linux - Hadoop Administrator

Confidential, Saint Louis, MO

Responsibilities:

  • Performed Benchmarking and performance tuning on the Hadoop infrastructure.
  • Automated data loading between production and disaster recovery cluster. Migrated hive schema from production cluster to DR cluster.
  • Provided on-site Linux System Administration support for the Hadoop Cluster and related software stack.
  • Worked on Migrating application by doing POC's from relation database systems. Helping users and teams with incidents related to administration and development.
  • Installed, Configured and Maintained Debian/RedHat Servers at multiple Data Centers. Configured RedHat Kickstart server for installing multiple production servers.
  • Performed high-level, day-to-day operational maintenance, support, and upgrades for the Hadoop operating system, workstations and servers.
  • Coordinate, direct and perform complex software installations and upgrades to operating systems and layered software packages.
  • Continually monitor and tune multiple systems to achieve optimum performance levels.
  • On boarding and training on best practices for new users who are migrated to our clusters. Guide users in development and work with developers closely for preparing a data lake.
  • Log data Stored in HBase is processed and analyzed and then imported into Hive warehouse, which enabled end business analysts to write SQL queries.
  • Built re-usable Hive UDF libraries which enabled various business analysts to use these UDF's in Hive querying.
  • Created Hive external tables for loading the parse data using partitions.
  • Extensive knowledge in troubleshooting code related issues.
  • Auto Populate Hbase tables with data. Designed and coded application components in an agile environment utilizing test driven development approach.

Environment: HDFS, Map Reduce, Linux Scripting, Shell Scripting, Zoo keeper, cluster health, monitoring security, RedHat Linux

Linux/ Unix Systems Administrator

Confidential 

Responsibilities:

  • Involved in writing scripts to migrate consumer data from one production server to another production server over the network with the help of Bash and Perl scripting.
  • Installed and configured monitoring tools Munin and NagiOS for monitoring the network bandwidth and the hard drives status.
  • Good experience in installation/upgradation of VMware. Automated server building using System Imager, PXE, Kickstart and Jumpstart.
  • Planning, documenting and supporting high availability, data replication, business persistence, and fail-over, fail-back using Veritas Cluster Server in Solaris, RedHat Cluster Server in Linux and HP Service Guard in HP environment.
  • Worked on monitoring of VMware virtual environments with ESXi 4 servers and Virtual Center. Automated tasks using shell scripting for doing diagnostics on failed disk drives.
  • Configured Global File System (GFS) and Zetta byte File System (ZFS). Troubleshooting production servers with IPMI tool to connect over SOL.
  • Established and maintained network users, user environment, directories, and security.
  • Configured system imaging tools Clonezilla and System Imager for data center migration. Configured yum repository server for installing packages from a centralized server.
  • Installation, Configuration and Administration of Solaris 8/9/10 and Linux (RHEL 4/5, SLES 10).
  • Hands on experience working with production servers at multiple data centers.
  • Installed Fuse to mount the keys on every Production server for password-less authentication on Debian servers.
  • Performed SAN Administration for EMC storage, including VMAX, VNX and XtremeIO. Management of RedHat Linux user accounts, groups, directories and file permissions.
  • Implemented the Clustering Topology that meets High Availability and Failover requirement for performance and functionality.
  • Performance monitoring using SAR, Iostat, VMstat and MPstat on servers and also logged to munin monitoring tool for graphical view.
  • Used LDAP in Active directory to add new user to a directory, remove, modify, and grant privileges and policy.
  • Performed Kernel tuning with the sysctl and installed packages with yum and rpm. Installed and configured PostgresSQL database on RedHat/Debian Servers.
  • Configuration and Administration of Apache Web Server and SSL. Backup management Recovery through Veritas Net Backup (VNB).
  • Provided 24/7 on call support on Linux Production Servers. Responsible for maintaining security on Red Hat Linux.

Environment: RHEL 5.x/4.x, Solaris 8/9/10, Nagios, Sun Fire, IBM blade servers, Web sphere 5.x/6.x, Apache 1.2/1.3/2.x, iPlanet, Oracle 11g/10g/9i, Logical Volume Manager, Veritas net backup 5.x/6.0, SAN Multipathing (MPIO, HDLM, Power path), VM ESX 4.1.

We'd love your feedback!