Hadoop Administrator / Developer Resume
Westlake Village, CA
SUMMARY:
- Around 5 years of overall experience, including 3+ years of solid Hadoop Administrator experience building, operationalizing and managing Hadoop clusters using distributions such as CDH 5.x/4.x, HDP 2.x and EMR.
- Hands-on experience in installation, configuration, security and monitoring of Hadoop clusters on RHEL.
- Experience in installing, configuring and managing Hadoop ecosystem components such as HDFS, YARN, HBase, Oozie, Hive, Impala, Spark, Sqoop, Pig and Flume.
- Expertise with the Hortonworks Hadoop platform (HDFS, Hive, Oozie, Sqoop, YARN).
- Expertise in configuring Hadoop Security using Kerberos.
- Installed, configured and maintained Hadoop clusters in High Availability environments.
- Experience administering Linux systems to deploy Hadoop clusters and monitoring clusters using Ambari.
- Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing and reviewing Hadoop log files.
- Worked on implementing and integrating NoSQL databases such as HBase.
- Good experience in Hadoop cluster capacity planning and in designing NameNode, Secondary NameNode, DataNode, JobTracker and TaskTracker layouts.
- Set up HDFS directories and permissions for different applications.
- Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
- Experience in scheduling jobs using Oozie workflows.
- Knowledge of the Software Development Lifecycle (SDLC), Agile and the Application Maintenance Change Process (AMCP).
- Strong knowledge of Spark concepts such as RDD operations, caching and persistence.
- Proficient in configuring Confidential and Flume on an existing Hadoop cluster.
- Good knowledge of Data Warehousing, ETL development, distributed computing and large-scale data processing.
- Experience in writing shell scripts for various ETL needs.
- Hands-on experience creating folders, groups, roles and users in the Admin console and granting their permissions.
- Knowledge of implementing ETL/ELT processes with MapReduce, Pig and Hive.
- Worked closely with testing teams on performance testing, unit testing, user acceptance testing and system integration testing.
- Experience with Metadata Manager, Business Glossary, IDQ and Data Analyst.
- Supported testers and deployers and provided knowledge transfer (KT) to the team on best practices.
- Worked extensively on different flavors of UNIX and Windows operating systems.
- Good experience in UNIX shell scripting.
- Collected information on and configured network devices such as servers, printers, hubs, switches and routers on an Internet Protocol (IP) network.
- Team player with good interpersonal and problem-solving skills.
- Experienced in monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network.
- Participated in 24x7 on-call rotation, off-hour production problem resolution activities.
- A self-motivated, responsible and reliable team player with a set of strong technical skills.
- Excellent analytical, interpersonal and communication skills.
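The capacity-planning experience above usually starts from a back-of-the-envelope sizing calculation; the sketch below shows the arithmetic with purely hypothetical figures (the ingest rate, retention window, replication factor and headroom are illustrative assumptions, not values from any real cluster).

```shell
#!/bin/sh
# Rough HDFS capacity planning: raw storage needed for a given
# logical data volume. All inputs are example assumptions.
daily_ingest_gb=500        # hypothetical daily ingest (GB)
retention_days=365         # hypothetical retention window
replication=3              # default HDFS replication factor
overhead_pct=25            # headroom for temp/intermediate data (%)

logical_gb=$((daily_ingest_gb * retention_days))
raw_gb=$((logical_gb * replication))
total_gb=$((raw_gb + raw_gb * overhead_pct / 100))

echo "Logical data: ${logical_gb} GB"
echo "Raw (x${replication} replication): ${raw_gb} GB"
echo "With ${overhead_pct}% headroom: ${total_gb} GB"
```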
TECHNICAL SKILLS:
Big Data Technologies: Hortonworks, Apache Hadoop, HDFS, HBase, Hive, Sqoop, Flume, Pig, Oozie, Kerberos, Spark, Confidential, Cloudera Manager
Hadoop Distribution: Cloudera, Hortonworks, AWS
Operating Systems: Windows 10, Windows 8, Windows 7, Windows Server 2008/2003, Mac OS, Ubuntu, Red Hat Linux, Linux, UNIX
Languages: C, C++, Java, XML, SQL, PL/SQL, UNIX Shell Scripting, Python
RDBMS: Oracle 12c/11g/10g, SQL Server 2012/2008/2005
Project Management: MS-Project
PROFESSIONAL EXPERIENCE
Confidential, Westlake Village, CA
Hadoop Administrator / Developer
Responsibilities:
- Installed, configured and maintained HDFS, Hive, Pig, HBase, Oozie, Sqoop, Spark and YARN.
- Installed, configured, upgraded and applied patches and bug fixes to production, test and development servers.
- Implemented multiple Spark jobs at scale for data cleaning and transformation.
- Installed complex R packages (NLP, H2O cluster, etc.).
- Processed data using Spark.
- Used Sqoop to import data into HDFS from a MySQL database and vice versa.
- Experience in managing and reviewing Hadoop log files.
- Installed various Hadoop ecosystem components and Hadoop daemons.
- Tested various other Mesos frameworks, such as Kafka, and tested isolation.
- Worked on the installation and configuration of a Hadoop HA cluster.
- Involved in capacity planning and design of Hadoop clusters.
- Set up alerts in Ambari for monitoring Hadoop clusters.
- Set up security authentication using Kerberos.
- Commissioned and decommissioned data nodes in clusters.
- Wrote and modified UNIX shell scripts to manage HDFS environments.
- Administered, configured and performance-tuned Spark applications.
- Created directories and set up appropriate permissions for different applications.
- Backed up HBase tables to HDFS directories using the export utility.
- Involved in planning and implementation of a Hadoop cluster upgrade.
- Installed, configured and administered HDP on Red Hat Enterprise Linux 6.6.
- Used Sqoop to import data into HDFS from an Oracle database.
- Provided 24x7 on-call support for production job failures and resolved issues in a timely manner.
- Deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment.
- Troubleshot problems regarding databases, applications and development tools.
Environment: Hortonworks, Ambari, Core Java, HDFS, YARN, Spark, Pig, Oozie, Hive, Sqoop, Flume, Confidential, Kerberos, LINUX, AWS, UNIX.
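A Sqoop import of the kind listed above typically takes the shape sketched below; the host, database, table and target directory are hypothetical placeholders, and the script only prints the command rather than running it against a cluster.

```shell
#!/bin/sh
# Dry-run sketch of a Sqoop import from MySQL into HDFS.
# All connection details are hypothetical placeholders.
DB_HOST="mysql.example.com"
DB_NAME="sales"
TABLE="orders"
TARGET_DIR="/data/raw/orders"

SQOOP_CMD="sqoop import \
--connect jdbc:mysql://${DB_HOST}/${DB_NAME} \
--table ${TABLE} \
--target-dir ${TARGET_DIR} \
--num-mappers 4"

# Echo instead of executing so the sketch is safe to run anywhere.
echo "$SQOOP_CMD"
```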
Confidential, VA
Hadoop Administrator / Developer
Responsibilities:
- Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.
- Experience with Hortonworks and Cloudera Manager.
- Implemented multiple Spark jobs in Scala and SQL for data cleaning and transformation.
- Involved in testing HDFS, Hive, Pig and MapReduce access for new users.
- Cluster maintenance as well as creation and removal of nodes using Apache Ambari.
- Configured Confidential to implement node coordination in clustering support.
- Creating snapshots and restoring snapshots.
- Worked on setting up Hadoop cluster for the Production Environment.
- Used Impala to read, write and query the Hadoop data in HDFS.
- Tested Mesos frameworks such as Kafka and tested isolation.
- Used Impala instead of Hive to optimize query performance.
- Processed data using Spark.
- Involved in installation and configuration of Tableau Server.
- Experience in understanding the security requirements for Hadoop and integrating with a Kerberos authentication infrastructure: KDC server setup and creating realms/domains.
- Built massively scalable multi-threaded applications for bulk data processing, primarily with Apache Spark and Pig on Hadoop.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Involved in cluster capacity Planning, Hardware Planning, Installation, Performance tuning of the Hadoop Cluster.
- Loaded log data into HDFS using Flume.
- Wrote MapReduce jobs to power data search and aggregation.
- Enabled HA for the NameNode, ResourceManager, YARN configuration and Hive Metastore.
- Ran benchmark tools to test cluster performance.
- Provided support to users in diagnosing, reproducing and fixing Hadoop-related issues.
- Ensured critical user issues were addressed quickly and effectively.
- Configured Rack Awareness on HDP clusters.
Environment: Cloudera Manager, Core Java, Hortonworks, HDFS, Yarn, Spark, Hive, Pig, HBase, MapReduce, Sqoop, Flume, Kerberos, Confidential, RHEL.
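Snapshot creation and restore, mentioned in this and the following section, follows a standard HDFS command sequence; the sketch below uses hypothetical paths and file names and only prints the commands (a dry run) instead of executing them against a cluster.

```shell
#!/bin/sh
# Dry-run sketch of HDFS snapshot backup and restore.
# Directory, snapshot and file names are hypothetical placeholders.
DIR="/data/warehouse"
SNAP="nightly_backup"

CMDS="hdfs dfsadmin -allowSnapshot ${DIR}
hdfs dfs -createSnapshot ${DIR} ${SNAP}
hdfs dfs -cp ${DIR}/.snapshot/${SNAP}/part-00000 /backup/part-00000"

# Print the command sequence rather than running it.
echo "$CMDS"
```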
Confidential
Hadoop Administrator
Responsibilities:
- Performed configuration, administration and monitoring of Hadoop clusters.
- Worked on evaluating and installing/setting up the Hortonworks 2.1/1.8 Big Data ecosystem, which includes Apache Hadoop HDFS, Pig, Hive and Sqoop.
- Managed HDFS directory permissions for applications.
- Managed the filesystem, including creating backups of HDFS and creating new file systems.
- Performed detailed analysis of systems and applications per functional requirements.
- Worked on the installation and configuration of a Hadoop HA cluster.
- Responsible for creating repository users and user groups and granting users privileges and access to repository folders.
- Addressed and troubleshot issues on a daily basis.
- Configured Confidential to implement node coordination in cluster support.
- Allocated name and space quotas to users in case of space problems.
- Created snapshots and restored snapshots.
- Commissioned and decommissioned nodes depending upon the amount of data.
- Maintained the cluster to keep it healthy and in optimal working condition.
- Handled upgrades and patches.
- Involved in configuring LDAP on different applications for secure login.
- Worked on the YARN Capacity Scheduler by creating queues to allocate resource guarantees to specific groups.
- Scheduled several time-based Oozie workflows by developing Python scripts.
- Maintained EC2 (Elastic Compute Cloud) and RDS (Relational Database Service) instances in Amazon Web Services.
- Created Hive external tables, loaded data into the tables and queried data using HQL.
- Performed performance tuning of Hadoop clusters and of jobs related to Hive and Spark.
- Created process methods for the Kerberos KDC cluster setup.
- Created and truncated HBase tables in Hue and took backups of submitter IDs.
- Responsible for building scalable distributed data solutions using Hadoop.
- Managed and reviewed Hadoop log files.
- Used GitLab UI with Puppet as the platform to manage the Hadoop Users.
- Worked with BI teams in generating reports and designing ETL workflows in Tableau.
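The space-quota work above involves one subtlety worth showing: `hdfs dfsadmin -setSpaceQuota` counts raw bytes across all replicas, so a user's logical allowance must be scaled by the replication factor. A sketch with illustrative numbers (the allowance and the user path are hypothetical):

```shell
#!/bin/sh
# HDFS space quotas count raw (post-replication) bytes, so the
# logical allowance is multiplied by the replication factor.
# All numbers and paths below are illustrative assumptions.
logical_quota_gb=100   # hypothetical per-user logical allowance
replication=3          # cluster default replication factor

raw_quota_bytes=$((logical_quota_gb * replication * 1024 * 1024 * 1024))

# The real command would be (printed here, not executed):
echo "hdfs dfsadmin -setSpaceQuota ${raw_quota_bytes} /user/alice"
```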
Environment: Hortonworks 2.1/1.8, Ambari, HDFS, HBase, SQL, Oozie, Hive, Spark, LINUX
Confidential
Linux Administrator
Responsibilities:
- Performed mutual redistribution between EIGRP and OSPF in internal networks, per specific client requirements, with on-demand routing.
- Maintained and monitored all server frameworks, provided on-call support for all systems and maintained strong Linux knowledge.
- Installed and maintained all server hardware and software systems, administered server performance and ensured availability.
- Performed switching technology administration, including VLANs, trunking, STP, RSTP, inter-VLAN routing, port aggregation and link negotiation.
- Configured Access Control Lists (standard, extended and named) to allow users across the company to access different applications while blocking others.
- Performed tests on all new software, maintained patches for management services and performed audits of all security processes.
- Built, implemented and maintained system-level software such as OS, clustering, disk and file management, backup, web applications, DNS and LDAP.
- Performed tech and non-tech refreshes of Linux servers, including new hardware, OS upgrades, application installation and testing.
- Installed RPM and YUM package patches and performed other server management tasks.
- Experience in system authentication on RHEL servers using Kerberos and LDAP.
- Configured and troubleshot DHCP on RHEL servers.
- Involved in database testing, using SQL to pull data from the database and check whether it matched the GUI.
- Familiar with hardware tools like servers, printers, VOIP, networking and telecommunication devices.
- Excellent troubleshooting skills in complex software and hardware problems.
- Configured routers and TCP/IP protocols.
- Solid understanding of all phases of SMPS and UPS.
- Creating database objects such as Tables, Indexes, Views, Sequences, Primary and Foreign keys, Constraints and Triggers.
- Installed CentOS using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers.
- Monitored system metrics and logs for any problems.
- Configured and troubleshot desktops, laptops and servers.
Environment: Linux/UNIX, Red Hat Linux servers, Oracle, DHCP, Windows Server 2008/2007, UNIX Shell Scripting, SQL Server Management Studio, Microsoft SQL Server 2000/2005/2008, MS Access
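Log monitoring of the kind described above often reduces to scanning for error patterns; the self-contained sketch below fabricates a small sample log purely for illustration and counts its ERROR lines.

```shell
#!/bin/sh
# Minimal log-scanning sketch: count ERROR lines in a log file.
# The sample log content below is fabricated for illustration.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
2024-01-01 10:00:01 INFO  service started
2024-01-01 10:00:05 ERROR disk /dev/sdb not responding
2024-01-01 10:00:09 WARN  high memory usage
2024-01-01 10:00:12 ERROR disk /dev/sdb not responding
EOF

errors=$(grep -c ' ERROR ' "$LOG")
echo "ERROR lines: $errors"
rm -f "$LOG"
```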
