
Hadoop Administrator Resume

Chesterfield, MO

SUMMARY:

  • 8+ years of extensive IT experience, with experience as a Hadoop Administrator and 4 years of experience as a UNIX/Linux Administrator.
  • Experience in Hadoop administration activities such as installation, configuration, and management of clusters in Cloudera (CDH), Hortonworks (HDP), and MapR distributions using Cloudera Manager and Ambari.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components like HDFS, MapReduce, Hive, Impala, Sqoop, Pig, Oozie, Zookeeper, Falcon, Spark, Hue, Flume, Storm, Kafka, and YARN.
  • Experience in Performance Tuning of Yarn, Spark and Hive.
  • Experience in importing and exporting data between HDFS and Relational Database Management systems using Sqoop and troubleshooting for any issues.
  • Experience in performing backup and disaster recovery of Namenode metadata and important sensitive data residing on cluster.
  • Experience in developing MapReduce Programs using Apache Hadoop for analyzing the big data as per the requirement.
  • Experience in monitoring the health of cluster using Ambari, Nagios, Ganglia and Cron jobs.
  • Cluster maintenance and commissioning/decommissioning of DataNodes.
  • Good understanding/knowledge of Hadoop Architecture and various components such as HDFS, JobTracker, TaskTracker, Namenode, Datanode and MapReduce concepts.
  • Implemented security controls using Kerberos principals, ACLs, and data encryption using dm-crypt to protect entire Hadoop clusters.
  • Assisted development team in identifying the root cause of slow performing jobs / queries.
  • Expertise in installation, administration, patching, upgrades, configuration, performance tuning, and troubleshooting of Red Hat Linux, SUSE, and CentOS.
  • Created and maintained user accounts, profiles, security, rights, disk space and process monitoring.
  • Experience in administration activities of RDBMS databases, such as MS SQL Server.
  • Involved in log file management where logs older than 7 days were removed from the log folder, loaded into HDFS, and retained for 3 months (see the sketch after this list).
  • Planned, documented, and supported high availability, data replication, business persistence, fail-over, and fallback solutions.
  • Knowledge of NoSQL databases such as Cassandra, and MongoDB.
  • Knowledge in Installing, Configuring, Supporting and managing Hadoop Cluster on Amazon Web Services (AWS).
  • Knowledge in Implementing AWS solutions using EC2, S3 and load balancers.
  • Installation knowledge on AWS EC2 instances and configuring the storage on S3 buckets.
  • Developed a fully automated continuous integration system using Git, Jenkins, Maven and custom tools developed in Python and Bash.
  • Provided 24/7 technical support to production and development environments.
  • Major strengths include familiarity with multiple software systems, the ability to learn new technologies quickly, and adaptability to new environments; a focused and quick learner with excellent interpersonal, technical, and communication skills.
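
The log-retention bullet above can be illustrated with a minimal shell sketch. The paths, file pattern, and 7-day/90-day windows below are assumptions for illustration, not actual environment values:

    #!/bin/bash
    # Sketch: move local logs older than 7 days into HDFS, keep HDFS copies ~3 months.
    LOG_DIR=/var/log/app            # hypothetical local log folder
    HDFS_ARCHIVE=/archive/logs      # hypothetical HDFS archive location

    # Ship logs older than 7 days to HDFS, then delete the local copy.
    find "$LOG_DIR" -type f -name '*.log' -mtime +7 | while read -r f; do
      hdfs dfs -put -f "$f" "$HDFS_ARCHIVE/" && rm -f "$f"
    done

    # Purge archived files older than 90 days (column 6 of 'hdfs dfs -ls' is the date).
    hdfs dfs -ls "$HDFS_ARCHIVE" | awk -v cutoff="$(date -d '90 days ago' +%Y-%m-%d)" \
      '$6 < cutoff {print $8}' | while read -r old; do
      hdfs dfs -rm -skipTrash "$old"
    done

A job like this would typically run nightly from cron on an edge node.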

TECHNICAL SKILLS:

Hadoop ecosystem tools: MapReduce, HDFS, Pig, Hive, HBase, Sqoop, Zookeeper, Oozie, Hue, Storm, Kafka, Spark, Flume

Programming Languages: Java (Core Java), C, C++, HTML

Databases: MySQL, SQL Server, MongoDB

Platforms: Linux (RHEL, CentOS, SUSE, Ubuntu)

Scripting languages: Shell Scripting, HTML scripting, PHP, Puppet.

Hadoop Distributions: Cloudera, Hortonworks and MapR

Cluster Management Tools: HDP Ambari, Cloudera Manager, Hue

PROFESSIONAL EXPERIENCE:

Confidential, Chesterfield, MO

Hadoop Administrator

Responsibilities:

  • Currently working as an administrator on the Hortonworks (HDP 2.3.4.0) distribution for 3 clusters spanning Test, Dev, and PROD, containing 50 nodes.
  • Responsible for Cluster maintenance, Cluster Monitoring, commissioning and decommissioning Data nodes.
  • Troubleshooting; managing and reviewing data backups and log files.
  • Day-to-day responsibilities include solving developer issues, providing prompt solutions to reduce impact, documenting fixes, and preventing future issues.
  • Installed and configured Apache Ranger and Apache Knox for securing HDFS, Hive, and HBase.
  • Experience on new component installations and upgrading the cluster with proper strategies.
  • Experience on new Discovery Tools installation and integration with Hadoop Components.
  • Monitoring systems and services, architecture design and implementation of deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Hands-on experience with cluster upgrades and patch upgrades without any data loss and with proper backup plans.
  • Experience in setup, configuration and management of security for Hadoop clusters using Kerberos and integration with LDAP/AD at an Enterprise level.
  • Installed, Configured, Managed Hadoop Cluster using HDP Ambari.
  • Supported data analyst in running Pig and Hive queries.
  • Performed Data scrubbing and processing with Oozie.
  • Job management using Fair scheduler.
  • Efficiently performed Commissioning and Decommissioning of nodes on the Hadoop Cluster.
  • Successfully troubleshot the cluster when a node went down, fixed the issue, and rebalanced data distribution across the DataNodes using the HDFS balancer (see the sketch after this list).
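
A minimal sketch of the decommissioning and rebalancing flow referenced above; the excludes-file path and hostname are assumptions (in HDP, Ambari normally manages these files):

    # Mark the node for decommissioning in the HDFS excludes file (hypothetical path/host).
    echo "datanode05.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes                       # NameNode starts draining blocks off the node
    hdfs dfsadmin -report | grep -i decommission      # watch decommissioning progress

    # After removing or adding nodes, spread blocks evenly across the remaining DataNodes:
    hdfs balancer -threshold 10                       # balance to within 10% of average utilization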

Environment: Hadoop, HDFS, MapReduce, Yarn, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, SUSE Linux, Hortonworks Distribution.

Confidential, Rolling Meadow, IL

Hadoop Administrator

Responsibilities:

  • Involved in the end-to-end process of Hadoop cluster setup on Hortonworks, including installation, configuration, and monitoring of the Hadoop cluster.
  • Installed, configured, and maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, HBase, Zookeeper, and Sqoop.
  • Worked on distributed/cloud computing (MapReduce/Hadoop, Hive, Pig, HBase, Sqoop, Flume, Spark, Zookeeper, Tableau, etc.) on Hortonworks (HDP 2.2.4.2) across 4 clusters ranging from POC to PROD, containing nearly 100 nodes.
  • Installed and worked on 4 Hadoop clusters for different teams; developed a data lake that serves as a base layer for storage and analytics; provided services to developers, installed their custom software, upgraded Hadoop components, resolved their issues, and helped troubleshoot long-running jobs; worked as L3 and L4 support for the data lake and managed clusters for other teams.
  • Involved in implementing security on the Hortonworks Hadoop cluster using Kerberos, working along with the operations team to move the non-secured cluster to a secured cluster.
  • Responsible for upgrading Hortonworks Hadoop HDP 2.2.0 and MapReduce 2.0 with YARN in a multi-clustered node environment.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and Spark, and loaded data into HDFS (see the ingest sketch after this list).
  • Involved in projects extensively on Hive, Spark, Pig, and Sqoop throughout the development Lifecycle until the projects went into Production.
  • Integrated Kafka with Flume in sand box Environment using Kafka source and Kafka sink.
  • Building automation frameworks for data ingestion, processing in Python, Java, JavaScript, and Scala with NoSQL and SQL databases.
  • Used Python scripts to update the content in Database and manipulate files.
  • Migrated services from a managed hosting environment to AWS including the service design, network layout, data migration, automation, monitoring, deployments and cutover, documentation, overall plan, cost analysis, and timeline.
  • Monitored multiple Hadoop clusters environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Ambari.
  • Performed many complex system analyses in order to improve ETL performance, identified high critical batch jobs to prioritize.
  • Implemented Spark solution to enable real time reports from Cassandra data. Was also actively involved in designing column families for various Cassandra Clusters.
  • Managed Hadoop clusters: setup, install, monitor, maintain.
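
A minimal sketch of the kind of Sqoop ingest referenced above; the JDBC URL, credentials, table, and target directory are placeholders, not actual project values:

    # Pull a relational table into HDFS with 4 parallel map tasks (hypothetical source).
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4 \
      --fields-terminated-by '\t'

Downstream Hive or Spark transformations would then read from the landing directory.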

Environment: Hortonworks Hadoop, Cassandra, Flat files, Oracle 11g/10g, MySQL, Toad 9.6, Windows NT, Sqoop, Hive, Oozie, Unix Shell Scripts, Python, ZooKeeper, SQL, MapReduce, Pig, Kerberos, Jenkins.

Confidential, Kansas City, MO

Hadoop Administrator

Responsibilities:

  • Installed and configured various components of Hadoop ecosystem and maintained their integrity.
  • Installed and configured Cloudera CDH with Hadoop ecosystem components like Hive, Oozie, Hue, Spark, Kafka, HBase, and YARN.
  • Planning for production cluster hardware and software installation on production cluster and communicating with multiple teams to get it done.
  • Designed, configured and managed the backup and disaster recovery for HDFS data.
  • Commissioned Data Nodes when data grew and decommissioned when the hardware degraded.
  • Migrated data across clusters using DistCp.
  • Experience in collecting metrics for Hadoop clusters using Ganglia.
  • Experience in creating shell scripts for detecting and alerting on system problems.
  • Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Monitored multiple Hadoop clusters environments using Ganglia and Nagios.
  • Monitored workload, job performance and capacity planning.
  • Worked with application teams to install Hadoop updates, patches, version upgrades as required.
  • Installed and configured Hive, Pig, Sqoop and Oozie on the 2.0 cluster.
  • Involved in implementing High Availability and automatic failover infrastructure to overcome a single point of failure for the NameNode utilizing ZooKeeper services.
  • Implemented the HDFS snapshot feature (see the sketch after this list).
  • Worked with big data developers, designers and scientists in troubleshooting map reduce job failures and issues with Hive, Pig and Flume.
  • Configured custom interceptors in Flume agents for replicating and multiplexing data into multiple sinks.
  • Developed simple to complex MapReduce streaming jobs using Python that are implemented using Hive and Pig.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
  • Implemented Kerberos Security Authentication protocol for existing cluster.
  • Working with data delivery teams to setup new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing HDFS, Hive.
  • Used Impala to read, write and query the Hadoop data in HDFS or HBase or Cassandra.
  • Exported the analyzed data to their relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Executed tasks for upgrading cluster on the staging platform before doing it on production cluster.
  • Worked with Linux server admin team in administering the server hardware and operating system.
  • Perform maintenance, monitoring, deployments, and upgrades across infrastructure that supports all our Hadoop clusters
  • Provided ad-hoc queries and data metrics to the Business Users using Hive, Pig.
  • Debugging and troubleshooting the issues in development and Test environments.
  • Monitor cluster stability, use tools to gather statistics and improve performance.
  • Help to plan for future upgrades and improvements to both processes and infrastructure.
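
A minimal sketch of the HDFS snapshot and DistCp commands referenced above; directory paths and NameNode hosts are placeholders:

    # Enable and take a snapshot of a protected directory (hypothetical path).
    hdfs dfsadmin -allowSnapshot /data/warehouse
    hdfs dfs -createSnapshot /data/warehouse backup_$(date +%Y%m%d)
    hdfs dfs -ls /data/warehouse/.snapshot            # list existing snapshots

    # Copy a dataset between clusters, transferring only changed files (hypothetical hosts).
    hadoop distcp -update -skipcrccheck \
      hdfs://prod-nn.example.com:8020/data/warehouse \
      hdfs://dr-nn.example.com:8020/data/warehouse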

Environment: MapReduce, Sqoop, Flume, Hive, HQL, Pig, RHEL, CentOS, Oracle, MS-SQL, Zookeeper, Oozie, PostgreSQL, Nagios, Cloudera.

Confidential, Hartford, CT

Hadoop Administrator

Responsibilities:

  • Responsible to manage data coming from different sources and involved in HDFS maintenance and loading of structured and unstructured data.
  • Processed multiple data source inputs to the same reducer using GenericWritable and the MultipleInputs format.
  • Created Data Pipeline of Map Reduce programs using Chained Mappers.
  • Visualized HDFS data for customers in a BI tool with the help of the Hive ODBC driver.
  • Familiarity with a NoSQL database such as MongoDB.
  • Implemented optimized joins across different data sets to get top claims by state using MapReduce.
  • Implemented complex map reduce programs to perform joins on the Map side using Distributed Cache in Java.
  • Responsible for importing log files from various sources into HDFS using Flume.
  • Created a customized BI tool for the manager team that performs query analytics using HiveQL.
  • Used Hive and Pig to generate BI reports.
  • Imported data using Sqoop to load data from MySQL to HDFS on regular basis.
  • Created Partitions, Buckets based on State to further process using Bucket based Hive joins.
  • Created Hive generic UDFs to process business logic that varies based on policy.
  • Moved relational database data using Sqoop into Hive dynamic partition tables using staging tables (see the sketch after this list).
  • Worked on custom Pig Loaders and storage classes to work with variety of data formats such as JSON and XML file formats.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Developed unit test cases using JUnit, EasyMock, and MRUnit testing frameworks.
  • Experienced in Monitoring Cluster using Cloudera manager.
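
A minimal sketch of the staging-table to dynamic-partition load referenced above; the table and column names are illustrative only:

    # Load a state-partitioned Hive table from a staging table with dynamic partitioning.
    hive -e "
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;
    INSERT OVERWRITE TABLE claims PARTITION (state)
    SELECT claim_id, policy_id, claim_amount, state
    FROM   claims_staging;
    "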

Environment: Hadoop, HDFS, HBase, MongoDB, MapReduce, Java, Hive, Pig, Sqoop, Flume, Oozie, Hue, SQL, ETL, Cloudera Manager, MySQL.

Confidential

Linux Administrator

Responsibilities:

  • Supported a large server and network infrastructure of Solaris, Linux and Windows environment.
  • Managed installation, configuration, upgrades, and patch systems running OS such as Red Hat, Fedora, Centos and Oracle Solaris.
  • Utilized bash and ksh shell scripting to automate daily system administration tasks (see the sketch after this list).
  • Researched to improve service performance and reliability through investigation and root cause analysis.
  • Managed data backup of UNIX, Windows, Virtual servers, disk storage tier1 and tier2 backups.
  • Copied important backup images to tape media and sent them offsite every week. Performed quarterly offsite media audits and reports.
  • Ensured that backup lifecycle policies and daily backup jobs were running and that failed jobs were fixed.
  • Troubleshot technical issues related to tier 3 storage and Quantum tape libraries, and reported and logged all media and drive errors. Worked with vendors to resolve hardware and software issues.
  • Configured NDMP backups and troubleshot NDMP backup failures associated with storage.
  • Maintained configuration and security of the UNIX/Linux operating systems within the enterprise's computing environment. Provided required evidence to support internal controls per SOX quarterly audit requirements.
  • Monitored system activities and fine-tuned system parameters and configurations to optimize performance and ensure security of systems.
  • Added servers to the domain and managed groups and users in Active Directory.
  • Custom built Windows 2003 and Windows 2008 servers, which included adding users, SAN and network configuration, installing application-related packages, and managing services.
  • Responsible for maintenance of development tools and utilities and to maintain shell, Perl automation Scripts.
  • Worked with project manager and auditing teams to implement PCI compliance.
  • Installed and configured Virtual I/O Server V1.3.01 with fix pack 8.1.
  • Integrating WebLogic 10.x and Apache 2.x and deploying EAR, WAR files in WebLogic Application servers.
  • As a member of the team, monitored the VERITAS Cluster Server 4.1 in SAN Environment.
  • Created new groups and tested first in development, QA Boxes and then implemented the same in production Boxes.
  • Created and maintained detailed procedural documentation regarding operating system installation and configuration of software packages, vendor contact information, etc.
  • Trained and worked Primarily on RHEL 4/5 Operating Systems.
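
A minimal sketch of the kind of daily automation referenced above; the usage threshold, mount-point parsing, and alert address are assumptions, and a local mail command is assumed to be available:

    #!/bin/bash
    # Sketch: alert when any mounted filesystem exceeds a usage threshold.
    THRESHOLD=85                          # hypothetical percentage threshold
    ALERT_TO="oncall@example.com"         # hypothetical alert recipient

    df -hP | awk 'NR>1 {gsub("%","",$5); print $5, $6}' | while read -r used mount; do
      if [ "$used" -ge "$THRESHOLD" ]; then
        echo "$(hostname): ${mount} is ${used}% full" | mail -s "Disk usage alert" "$ALERT_TO"
      fi
    done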

Environment: Solaris 10, RHEL 5/4, Windows 2008/2003, Sun SPARC and Intel servers, VMware Infrastructure, Red Hat Enterprise Linux 4/5, Solaris 9/10, Sun E10k, E25K, E4500, SunFire V440/880, DMX3 & DMX4, SiteMinder, SonicMQ 7.0, VxFS 4.1, VxVM 4.1.

Confidential

Junior Linux System Administrator

Responsibilities:

  • Trained and worked Primarily on RHEL 4/5 Operating Systems.
  • Assisted senior-level administrators in various aspects of Linux (Red Hat) server administration including installing and maintaining the operating system software, performance monitoring, problem analysis and resolution and production support.
  • Assisted other Linux/UNIX administrators when help was needed (i.e. creating Linux/UNIX accounts, writing scripts to perform system administrator functions, responding to trouble tickets, etc.)
  • Involved in preparation of functional and system specifications. Estimated storage requirements for applications.
  • Performed primary Linux server administration tasks, including setup, installation, OS patching, data backup, user account management, and access control.
  • Disk management such as adding and replacing hot-swappable drives on existing servers, partitioning according to requirements, creating new file systems, and growing existing ones.
  • Performed swap space management and installed patches and packages as needed.
  • Established and maintained user accounts, assigned file permissions and established password and account policies.
  • Troubleshoot and resolved basic level system hardware, software and communication problems.
  • Improved system performance by working with the development team to analyze, identify, and resolve issues quickly.
  • Performed basic system monitoring, verified the integrity and availability of all hardware, server resources, systems and key processes, reviewed system and application logs, verified completion
  • Monitored server and application performance, tuned I/O and memory, and installed SSH and configured key-based authentication.
  • Created local Yum repositories to support package management with Yum and RPM, and installed and configured a secure FTP daemon to support an FTP-based Yum repository (see the sketch after this list).
  • Scheduled jobs and automated processes using cron and at; created and maintained file systems and performed RAID configuration on Linux.
  • Monitored systems daily, evaluated the availability of all server resources, and performed all activities for Linux servers.
  • Managed and maintained user accounts; configured and managed network interfaces.
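
A minimal sketch of the local Yum repository setup referenced above; the RPM staging directory and repo id are placeholders:

    # Build repository metadata for locally staged RPMs (hypothetical path).
    yum -y install createrepo
    createrepo /opt/local-rpms

    # Point yum at the local repository.
    {
      echo "[local]"
      echo "name=Local RPM Repository"
      echo "baseurl=file:///opt/local-rpms"
      echo "enabled=1"
      echo "gpgcheck=0"
    } > /etc/yum.repos.d/local.repo

    yum clean all && yum repolist                     # verify the repo is visible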

Environment: Red Hat Linux 4/5, SAN, NAS, Samba, Jira, Apache, Tomcat, WebSphere.
