
Big Data Admin Resume

Alpharetta, GA

SUMMARY:

  • 6+ years of professional experience, including 3+ years of Hadoop administration and 3 years as a Linux administrator.
  • Experience in performing various major and minor Hadoop upgrades on large environments.
  • As an admin, involved in cluster maintenance, troubleshooting, and monitoring, following proper backup and recovery strategies.
  • Experienced in installation, configuration, supporting and monitoring 300+ node Hadoop cluster using CDH and HDP.
  • Experience in HDFS data storage and support for running MapReduce jobs.
  • Experience in designing and implementing HDFS access controls, directory and file permissions, and user authorization that facilitate stable, secure access for multiple users in a large multi-tenant cluster.
  • Experience in using Cloudera Manager for Installation and management of Hadoop clusters.
  • Experience in Chef, Puppet or related tools for configuration management.
  • Experience in working in large environments and leading infrastructure support and operations.
  • Installing and configuring Kafka and monitoring the cluster using Nagios and Ganglia.
  • Loaded log data into HDFS using Flume and Kafka, and performed ETL integrations.
  • Experience with ingesting data from RDBMS sources such as Oracle, SQL and Teradata into HDFS using Sqoop.
  • Experience in big data technologies: Hadoop HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, Zookeeper.
  • Experience in benchmarking, performing backup and disaster recovery of Name Node metadata and important sensitive data residing on cluster.
  • Assisted with development of a MapR-DB Garbage Collection prototype as reader/write solution for another client. Development included using the MapR-DB service to create a semi-distributed table using the HBase Shell along with using MapR C-APIs.
  • Started the HDFS (start-dfs.sh) and MapReduce (start-mapred.sh) daemons.
  • Monitoring and support through Nagios and Ganglia.
  • Migrating applications from existing systems such as MySQL, Oracle, DB2 and Teradata to Hadoop.
  • Expertise with Hadoop, MapReduce, Pig, Sqoop, Oozie, and Hive.
  • Benchmarking Hadoop clusters to validate the hardware before and after installation to tweak the configurations to obtain better performance.
  • Experience in administering the Linux systems to deploy Hadoop cluster and monitoring the cluster.
  • Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance.

TECHNICAL SKILLS:

Big Data Technologies: HDFS, Hive, MapReduce, Pig, Phoenix, Falcon, Sqoop, Flume, Zookeeper, Mahout, Oozie, Avro, HBase, Kafka, Storm, CDH 5.3, 5.4, 5.5

Monitoring Tools: Cloudera Manager, Ambari, Nagios, Ganglia.

Scripting Languages: Shell Scripting, Puppet, Python, Bash, Ruby, PHP

Programming Languages: C, Java, SQL, and PL/SQL.

Front End Technologies: HTML, XHTML, XML.

Application Servers: Apache Tomcat, WebLogic Server, WebSphere

Databases: Oracle 11g, MySQL, MS SQL Server, IBM DB2.

NoSQL Databases: HBase, Cassandra, MongoDB

Operating Systems: Linux, UNIX, Mac OS, Windows NT/98/2000/XP/Vista, Windows 7, Windows 8.

Networks: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP.

PROFESSIONAL EXPERIENCE:

BIG DATA ADMIN

Confidential, Alpharetta, GA

Responsibilities:

  • Working as Hadoop Admin, responsible for all aspects of clusters totaling 90 nodes, ranging from POC (Proof-of-Concept) to PROD.
  • Provided regular user and application support for highly complex issues involving multiple components such as Hive, Impala, Spark, Kafka, MapReduce.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
  • Created Kafka topics, provided ACLs to users, and set up rest mirror and MirrorMaker to transfer data between two Kafka clusters. Used MapR, HDFS, YARN, MapReduce, Pig, Hive and Oozie on Amazon EMR.
  • Experienced in AWS services.
  • Day-to-day responsibilities include resolving developer issues, deploying code from one environment to another, providing access to new users, delivering quick fixes to reduce impact, and documenting solutions to prevent recurrence.
  • Adding/installation of new components and removal of them through Cloudera Manager.
  • Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Interacted with Cloudera support, logged issues in the Cloudera portal, and fixed them per the recommendations.
  • Experience integrating Kafka with Spark for real-time data processing.
  • Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed Spark code using Spark SQL and Spark Streaming for faster testing and processing of data.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Installing and configuring Kafka cluster and monitoring the cluster using Nagios and Ganglia.
  • Loaded data from the local file system into HDFS using Flume with a spooling directory.
  • Experience in Chef, Puppet or related tools for configuration management.
  • Retrieved data from HDFS into relational databases with Sqoop.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Created collections and configurations, and registered Lily HBase Indexer configurations with the Lily HBase Indexer Service.
  • Created and truncated HBase tables in Hue and took backups of submitter ID(s).
  • Configured and managed user permissions in Hue.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.

Environment: HDFS, Map Reduce, Hive, Hue, Pig, Flume, Oozie, Sqoop, CDH5, Apache Hadoop, Spark, SOLR, Storm, Knox, Zeppelin, Kafka, Cloudera Manager, Red Hat, MySQL and Oracle.

HADOOP ADMINISTRATOR

Confidential, IL

Responsibilities:

  • Installed and configured Confidential HDP and Hadoop using Ambari.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Worked on installing cluster, commissioning & decommissioning of DataNode, NameNode recovery, capacity planning, and slots configuration.
  • Installed, configured, and tested a DataStax Enterprise Cassandra multi-node cluster with 4 datacenters of 5 nodes each.
  • Installed and configured Cassandra cluster and CQL on the cluster.
  • Loaded log data into HDFS using Flume and Kafka, and performed ETL integrations.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Managing and reviewing Hadoop log files and debugging failed jobs.
  • Implemented Kerberos Security Authentication protocol for production cluster.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Responsible for adding new ecosystem components such as Spark, Storm, Flume and Knox, with custom configurations based on the requirements.
  • Managed the design and implementation of data quality assurance and data governance processes.
  • Worked with Infrastructure teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Implemented Fair scheduler to allocate fair amount of resources to small jobs.
  • Assisted the BI team by Partitioning and querying the data in Hive.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.

Environment: Hadoop HDFS, MapReduce, Confidential, Hive, Pig, Kafka, Oozie, Zeppelin, Flume, Sqoop, HBase.

HADOOP ADMINISTRATOR

Confidential

Responsibilities:

  • Installed and configured Hadoop on a cluster.
  • Wrote multiple Java-based MapReduce jobs for data cleaning and preprocessing.
  • Experienced in defining job flows using Oozie.
  • Experienced in managing and reviewing Hadoop log files.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Responsible for managing data coming from different sources and applications.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
  • Day-to-day responsibilities include resolving developer issues, deploying code from one environment to another, providing access to new users, delivering quick fixes to reduce impact, and documenting solutions to prevent recurrence.
  • Adding/installation of new components and removal of them through Ambari.
  • Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades.
  • Monitored workload and job performance, and performed capacity planning.
  • Analyzed system failures, identified root causes, and recommended courses of action.

Environment: Hadoop 2.4, 2.5.2, HDFS, MapReduce, Hive, Flume, Sqoop, Cloudera CDH4, HBase, Oozie, Pig, AWS EC2 cloud.

LINUX ADMINISTRATOR

Confidential

Responsibilities:

  • Worked with the Legato NetWorker tool to back up and retrieve logs and transfer them to an FTP server at the request of the development team for further incident investigation.
  • Member of the Storage Team supporting one of the largest telecom companies in Belgium.
  • Handling both online and offline storage support.
  • Assigning tasks to all team members.
  • Giving offline support through EMC NetWorker 7.6 and handling tasks related to backup failures.
  • Configuring backups for all types of clients (Solaris, HP, Linux, Windows).
  • Configuring database backups (both online and RMAN).
  • Handling tickets based on severity raised against backup failures through Peregrine tool
  • Restoration of files on request.
  • Recently giving online support on HP XP storage tasks such as LUN creation, thin provisioning, and setting up Business Copy.
  • Installed, Configured and Maintained Debian/RedHat Servers at multiple Data Centers.
  • Configured RedHat Kickstart server for installing multiple production servers.
  • Configuration and administration of DNS, LDAP, NFS, NIS and Sendmail on RedHat Linux/Debian Servers.
  • Hands on experience working with production servers at multiple data centers.
  • Involved in writing scripts to migrate consumer data from one production server to another production server over the network with the help of Bash and Perl scripting.
  • Installed and configured the monitoring tools Munin and Nagios for monitoring network bandwidth and hard drive status.
  • Automated server building using System Imager, PXE, Kickstart and Jumpstart.
  • Set up password-less SSH login and agent forwarding using the ssh-keygen tool.
  • Established and maintained network users, user environment, directories, and security.
  • Thoroughly documented the steps involved in data migration on production servers, as well as the testing procedures performed before the migration.
  • Provided 24/7 on call support on Linux Production Servers. Responsible for maintaining security on Red Hat Linux.

Environment: RHEL 5.x/4.x, Solaris 8/9/10, Sun Fire, IBM blade servers, WebSphere 5.x/6.x, Apache 1.2/1.3/2.x, iPlanet, Oracle 11g/10g/9i, Logical Volume Manager, Veritas NetBackup 5.x/6.0, SAN Multipathing (MPIO, HDLM, PowerPath), VMware ESX 3.x/2.x.

LINUX/ UNIX SYSTEM ADMINISTRATOR

Confidential

Responsibilities:

  • System installation and configuration of AIX 5.3 operating system and Red Hat 3.x, 4.x & 5.x servers.
  • User Administration, adding and removing user accounts, changing user attributes.
  • Working with paging spaces, creating, increasing and decreasing paging spaces as per requirement.
  • Configured VGs and LVs and extended LVs for file system growth needs using LVM commands.
  • Patch Management. Problem determination in File systems and Logical Volumes.
  • Creating and managing the default and User defined Paging Spaces.
  • Creating and updating the Crontab files.
  • NFS Administration. System Resource Controller Administration.
  • Corporate client support for mission critical environments.
  • Responsible for 200 Linux Servers: RHEL 3.0, 4.0 & 5.x, Bash scripting for automation of tasks.
  • Installation and configuration of many Open Source Packages.
