Hadoop Security Admin Resume

Oak Brook, IL

SUMMARY:

  • Around 8 years of professional experience, including 4+ years of Hadoop administration and 4 years as a Linux admin.
  • Experienced in installing, configuring, supporting, and monitoring 100+ node Hadoop clusters using Cloudera Manager and Hortonworks distributions.
  • Experience in performing various major and minor Hadoop upgrades in large environments.
  • As an admin, involved in cluster maintenance, troubleshooting, and monitoring, and followed proper backup and recovery strategies.
  • Experience in HDFS data storage and support for running MapReduce jobs.
  • Involved in Infrastructure set up and installation of HDP stack on Amazon Cloud.
  • Experience with ingesting data from RDBMS sources such as Oracle, SQL, and Teradata into HDFS using Sqoop.
  • Experience in big data technologies: Hadoop HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, ZooKeeper, and NoSQL.
  • Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and important, sensitive data residing on the cluster.
  • Experience in designing and implementing HDFS access controls, directory and file permissions, and user authorization that facilitate stable, secure access for multiple users in a large multi-tenant cluster.
  • Experience in using Ambari for Installation and management of Hadoop clusters.
  • Experience in Ansible and related tools for configuration management.
  • Experience in working in large environments and leading infrastructure support and operations.
  • Migrating applications from existing systems such as MySQL, Oracle, DB2, and Teradata to Hadoop.
  • Expertise with Hadoop, MapReduce, Pig, Sqoop, Oozie, and Hive.
  • Benchmarking Hadoop clusters to validate the hardware before and after installation to tweak the configurations to obtain better performance.
  • Experience in administering the Linux systems to deploy Hadoop cluster and monitoring the cluster.
  • Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance.

SKILLS:

Big Data Technologies: HDFS, MapReduce, Hive, Cassandra, Pig, HCatalog, Phoenix, Falcon, Sqoop, Flume, ZooKeeper, Mahout, Oozie, Avro, HBase, Storm, Hortonworks and Cloudera distributions.

Monitoring Tools: Ambari, Cloudera Manager, Ganglia, Nagios, CloudWatch.

Scripting Languages: Shell Scripting, Puppet, Python, Bash, CSH, Ruby, PHP.

Programming Languages: C, Java, SQL, and PL/SQL.

Front End Technologies: HTML, XHTML, XML.

Application Servers: Apache Tomcat, WebLogic Server, WebSphere.

Databases: Oracle 11g, MySQL, MS SQL Server, IBM DB2.

NoSQL Databases: HBase, Cassandra, MongoDB

Operating Systems: Linux, UNIX, Mac, Windows NT/98/2000/XP/Vista, Windows 7, Windows 8.

Networks: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP.

Security: Kerberos, Ranger, Ranger KMS, Knox.

WORK EXPERIENCE:

Hadoop Security Admin

Confidential, Oak Brook, IL

Responsibilities:

  • Designed and implemented an end-to-end big data platform solution on AWS.
  • Managed Hadoop clusters in production, development, and disaster recovery environments.
  • Implemented SignalHub, a data science tool, and configured it on top of HDFS.
  • Provisioning, installing, configuring, monitoring, and maintaining HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig, Hive, Ranger, Ranger KMS, Falcon, SmartSense, Storm, and Kafka.
  • Recovering from node failures and troubleshooting common Hadoop cluster issues.
  • Scripting Hadoop package installation and configuration to support fully-automated deployments.
  • Automated Hadoop deployment using Ambari blueprints and Ambari REST APIs.
  • Automated Hadoop and cloud deployment using Ansible.
  • Integrated Active Directory with the Hadoop environment.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Configured Ranger for policy-based authorization and fine-grained access control to the Hadoop cluster.
  • Implemented HDFS encryption by creating encryption zones in HDFS.
  • Configured Ranger KMS to manage encryption keys.
  • Implemented a multi-tenant Hadoop cluster and onboarded tenants to the cluster.
  • Achieved data isolation through Ranger policy-based access control.
  • Used YARN capacity scheduler to define compute capacity.
  • Responsible for building a cluster on HDP 2.5.
  • Worked closely with developers to investigate problems and make changes to the Hadoop environment and associated applications.
  • Expertise in recommending hardware configurations for Hadoop clusters.
  • Installing, upgrading, and managing Hadoop clusters on Hortonworks.
  • Troubleshooting many cloud-related issues such as DataNode outages, network failures, login issues, and missing data blocks.
  • Managing and reviewing Hadoop log files.
  • Proven results-oriented person with a focus on delivery.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Managed cluster coordination services through ZooKeeper.
  • System/cluster configuration and health check-up.
  • Continuous monitoring and managing the Hadoop cluster through Ambari.
  • Created user accounts and granted users access to the Hadoop cluster.
  • Resolved tickets submitted by users, troubleshooting, documenting, and resolving the errors.
  • Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without affecting running jobs or data.

Environment: HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, HDP 2.5, Ambari 2.4, Spark, Solr, Storm, Knox, CentOS 7 and MySQL.

Hadoop Administrator

Confidential, Hoffman Estates, IL

Responsibilities:

  • Worked as Hadoop admin responsible for all aspects of clusters totaling 100 nodes, ranging from POC (proof-of-concept) to PROD.
  • Worked as admin on the Cloudera (CDH 5.5.2) distribution for clusters ranging from POC to PROD.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
  • Day-to-day responsibilities included solving developer issues, handling deployments that move code from one environment to another, providing access to new users, providing quick solutions to reduce impact, and documenting those solutions to prevent future issues.
  • Added, installed, and removed components through Cloudera Manager.
  • Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.
  • Interacted with Cloudera support, logged issues in the Cloudera portal, and fixed them per the recommendations.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Loaded data from the local system to HDFS using Flume with a spooling directory.
  • Experience in Chef, Puppet or related tools for configuration management.
  • Retrieved data from HDFS into relational databases with Sqoop.
  • Parsed, cleansed, and mined useful and meaningful data in HDFS using MapReduce for further analysis.
  • Fine-tuned Hive jobs for optimized performance.
  • Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
  • Partitioned and queried the data in Hive for further analysis by the BI team.
  • Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, HBase, and Sqoop.
  • Created collections and configurations, and registered a Lily HBase Indexer configuration with the Lily HBase Indexer Service.
  • Created and truncated HBase tables in Hue and took backups of submitter ID(s).
  • Configured and managed user permissions in Hue.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Troubleshooting, debugging, and fixing Talend-specific issues while maintaining the health and performance of the ETL environment.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.

Environment: HDFS, MapReduce, Hive 1.1.0, Hue 3.9.0, Pig, Flume, Oozie, Sqoop, CDH5, Apache Hadoop 2.6, Spark, Solr, Storm, Knox, Cloudera Manager, Red Hat, MySQL and Oracle.

Hadoop Administrator

Confidential, Los Angeles, CA

Responsibilities:

  • Managed several Hadoop clusters in production, development, and disaster recovery environments.
  • Responsible for building clusters on HDP 2.2 and HDP 2.4.
  • Worked with software developers to investigate problems and make changes to the Hadoop environment and associated applications.
  • Expertise in recommending hardware configurations for Hadoop clusters.
  • Installing, upgrading, and managing Hadoop clusters on Hortonworks.
  • Troubleshooting many cloud-related issues such as DataNode outages, network failures, and missing data blocks.
  • Performed a major upgrade from HDP 2.2 to HDP 2.4.
  • Managing and reviewing Hadoop and HBase log files.
  • Proven results-oriented person with a focus on delivery.
  • Built and configured log data loading into HDFS using Flume.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Managed cluster coordination services through ZooKeeper.
  • Provisioning, installing, configuring, monitoring, and maintaining HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig, Hive, Ranger, Falcon, SmartSense, Storm, and Kafka.
  • Recovering from node failures and troubleshooting common Hadoop cluster issues.
  • Scripting Hadoop package installation and configuration to support fully-automated deployments.
  • Supporting Hadoop developers and assisting in the optimization of MapReduce jobs, Pig Latin scripts, Hive scripts, and HBase ingestion as required.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • System/cluster configuration and health check-up.
  • Continuous monitoring and managing the Hadoop cluster through Ambari.
  • Created user accounts and granted users access to the Hadoop cluster.
  • Resolved tickets submitted by users, troubleshooting, documenting, and resolving the errors.
  • Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without affecting running jobs or data.

Environment: HDFS, MapReduce, Hive 1.1.0, Hue 3.9.0, Pig, Flume, Oozie, Sqoop, CDH5, Apache Hadoop 2.6, Spark, Solr, Storm, Knox, Cloudera Manager, Red Hat, MySQL and Oracle.

Hadoop Administrator

Confidential, Plano, TX

Responsibilities:

  • Helped the team increase cluster size from 16 nodes to 52 nodes. The configuration for the additional DataNodes was managed using Puppet.
  • Installed, configured, and deployed a 30-node Cloudera Hadoop cluster for development and production.
  • Worked on setting up high availability for a major production cluster and designed it for automatic failover.
  • Tuned the Hadoop cluster to achieve higher performance.
  • Configured the Hive metastore with MySQL, which stores the metadata of Hive tables.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
  • Used Ganglia and Nagios for monitoring the cluster around the clock.
  • Wrote Nagios plugins to monitor Hadoop NameNode health status, the number of TaskTrackers running, and the number of DataNodes running.
  • Designed and implemented a distributed network monitoring solution based on Nagios and Ganglia using Puppet.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Moved data from HDFS to RDBMS and vice versa using Sqoop.
  • Developed Hive queries and UDFs to analyze the data in HDFS.
  • Analyzed and transformed data with Hive and Pig.
  • Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and SSH keyless login.
  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Implemented the Capacity Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
  • Worked on analyzing data with Hive and Pig.
  • Helped in setting up Rack topology in the cluster.
  • Helped with day-to-day operations support.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Deployed a Network File System (NFS) for NameNode metadata backup.
  • Designed and allocated HDFS quotas for multiple groups.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, HBase, Linux, Java, XML.

Linux Administrator

Confidential

Responsibilities:

  • Responsible for configuring real-time backup of web servers. Managed log files for troubleshooting and identifying probable errors.
  • Responsible for reviewing all open tickets and resolving and closing any existing tickets.
  • Documented solutions for any issues that had not been discovered previously.
  • Worked with file systems, including the UNIX file system and Network File System. Planned, scheduled, and implemented OS patches on both Solaris and Linux.
  • Teamed diligently with the infrastructure, network, database, application, and business intelligence teams to guarantee high data quality and availability.
  • Highly experienced in optimizing performance of WebSphere Application Server using Workload Management (WLM).
  • Patch management of servers and maintaining server environments in Development/QA/Staging/Production.
  • Performing Linux systems administration on production and development servers (Red Hat Linux, CentOS, and other UNIX utilities).
  • Installing patches and packages on Unix/Linux servers. Provisioning, building, and supporting Linux servers, both physical and virtual, using VMware for Production, QA, and Developer environments.
  • Installed, configured, and administered all UNIX/Linux servers, including the design and selection of relevant hardware to support the installation and upgrades of Red Hat, CentOS, and Ubuntu operating systems.
  • Network traffic control, IPsec, QoS, VLAN, proxy, and RADIUS integration on Cisco hardware via Red Hat Linux software.
  • Responsible for managing the Chef client nodes and uploading the cookbooks to the Chef server from the workstation.
  • Performance Tuning, Client/Server Connectivity and Database Consistency Checks using different Utilities.
  • Shell scripting for Linux/Unix Systems Administration and related tasks. Point of Contact for Vendor escalation.
  • Environment: Linux/CentOS 4, 5, 6, Logical Volume Manager, VMware ESX, Apache and Tomcat Web Server, HPSM, HPSA.

Environment: Windows 2008/2007 server, Unix Shell Scripting, SQL Manager Studio, Red Hat Linux, Microsoft SQL Server 2000/2005/2008, MS Access, NoSQL, Linux/Unix, PuTTY Connection Manager, PuTTY, SSH.