Hadoop Administrator/Developer Resume

Chicago, IL

SUMMARY

  • 8+ years of experience in the IT industry, including 5 years of experience in Hadoop administration and development using Apache, Cloudera (CDH), and Hortonworks (HDP) distributions.
  • Experience in installation, configuration, supporting and monitoring 100+ node Hadoop clusters from major distributions like CDH 4, CDH 5 using Cloudera Manager and Apache Ambari.
  • In-depth understanding of Hadoop frameworks (versions 1 and 2), YARN/MapReduce/HDFS, and their components such as JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, NodeManager, and ApplicationMaster.
  • Experience in installation and configuration of various Hadoop ecosystem components like Hive, Pig, Spark, Sqoop, Flume, Kafka, Oozie, Zookeeper, HBase, MongoDB, Cassandra, Impala, and R.
  • Expertise in designing and implementing complete end-to-end Hadoop infrastructure.
  • Experience in cluster capacity planning and cluster optimization to meet SLAs.
  • Well versed in managing and reviewing log files of Hadoop and ecosystem services to determine root causes.
  • Configured metadata backups and performed NameNode disaster recovery using backed-up edit logs and fsimage.
  • Strong knowledge in configuring NameNode High Availability and NameNode Federation.
  • Experience configuring Rack Awareness in the Hadoop cluster.
  • Experience in importing and exporting data between RDBMS and Hadoop using Sqoop and troubleshooting issues related to Sqoop jobs (a representative import is sketched after this list).
  • Experience in using Flume to stream data from various sources into HDFS.
  • Experience using the DistCp command-line utility to copy files between clusters.
  • Cluster coordination services through ZooKeeper.
  • Setting up Kerberos authentication for Hadoop.
  • Hands-on experience using network monitoring daemons like Ganglia and service monitoring tools like Nagios.
  • Experience configuring the Capacity Scheduler, Fair Scheduler, and HOD scheduler for job and user management.
  • Defining job flows in Hadoop environment using tools like Oozie for data scrubbing and processing.
  • Experience in performing minor and major upgrades, HDFS balancing, and commissioning and decommissioning of DataNodes on Hadoop clusters.
  • Experience in using Puppet and Chef and in writing/modifying shell scripts for configuration process automation and cluster monitoring.
  • Good knowledge of setting up Hadoop clusters on AWS EC2 and S3, and of automating the setup and extension of clusters in the AWS cloud.
  • Hands on experience in writing Ad-hoc queries for moving data from HDFS to Hive and analyzing data using Hive QL.
  • Very good knowledge on ETL process consisting of data transformation, data sourcing, mapping, conversion and loading.
  • Experience in performing ETL on structured, semi-structured data using Pig Latin Scripts.
  • Development experience with RDBMS, including writing SQL queries, views, stored procedures, triggers, etc.
  • Extensive experience in Linux administration activities on RHEL and CentOS distributions.
  • Very good knowledge of Data warehouse tools.
  • Good knowledge and experience in Core Java, JSP, Servlets, Multi-Threading, JDBC, HTML.
  • Good understanding of Software Development Lifecycle (SDLC), Waterfall and Agile methodologies.
  • Effective problem solving and interpersonal skills. Ability to learn and use new technologies quickly.
  • Self-starter with ability to work independently as well as within a team environment.
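
For illustration, a minimal sketch of the kind of Sqoop import referenced above; the host, database, credentials, table, and HDFS path are hypothetical placeholders, not values from an actual engagement:

    # Hypothetical example: import an RDBMS table into HDFS with Sqoop.
    # Connection string, user, table, and target directory are placeholders.
    sqoop import \
      --connect jdbc:mysql://db-host:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4 \
      --fields-terminated-by '\t'

    # Confirm the imported files landed in HDFS
    hdfs dfs -ls /data/raw/orders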

TECHNICAL SKILLS

Big Data components: HDFS, MapReduce, YARN, HBase, Cassandra, MongoDB, Pig, Hive, Spark, Impala, Kafka, Sqoop, Flume, Oozie, ZooKeeper, & Kettle

Programming Languages: HiveQL, Pig Latin, Shell scripting, Java, J2EE, SQL, C/C++, & PL/SQL

Web Development: JavaScript, jQuery, HTML5, CSS3, AJAX, JSON.

UNIX Tools: Apache, Yum, RPM.

Operating Systems: Red Hat Linux, CentOS, Ubuntu, Windows, Mac OS

Protocols: TCP/IP, HTTP and HTTPS

Web Servers: Apache Tomcat

Cluster Management Tools: Cloudera Manager, Apache Ambari

Methodologies: Agile, V-model, Waterfall model

Databases: HBase, MongoDB, Cassandra, Oracle 10g, MySQL, MS SQL server

Encryption Tools: VeraCrypt, AxCrypt, BitLocker, GNU Privacy Guard

Tools: FileZilla, PuTTY, TOAD SQL Client, MySQL Workbench, ETL, Pentaho

PROFESSIONAL EXPERIENCE

Hadoop Administrator/Developer

Confidential - Chicago, IL

Responsibilities:

  • Installed, configured, and administered a CDH 5.2.3 Hadoop cluster and its components.
  • Deployed hardware and software for Hadoop to expand memory and storage on nodes according to requirements.
  • Performed data exchange operations using Sqoop and Flume between HDFS and different Web Applications and databases.
  • Monitored Data streaming between web sources and HDFS.
  • Configured YARN and optimized memory related settings.
  • Collaborated with infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
  • Performed architecture design, data modeling, and implementation of SQL and Big Data platforms and analytic applications for consumer products.
  • Partitioned and queried the data in Hive for further analysis by the BI team.
  • Fine-tuned Hive jobs for better performance.
  • Performed rolling upgrades of Hadoop cluster.
  • Installed operating system and Hadoop updates, patches, version upgrades when required.
  • Monitored job performance in the Hadoop cluster and performed capacity planning.
  • Managed configuration changes based on volume of the data being processed.
  • Monitored connectivity and security of Hadoop cluster.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Imported and exported data between RDBMS and HDFS using Sqoop.
  • Performed data migration to Hadoop from existing data stores.
  • Set up new Linux users and tested HDFS, Hive, Pig and Map Reduce access for them.
  • Performed Linux systems administration on production and development servers (RHEL, CentOS and other UNIX utilities).
  • Commissioned and decommissioned data nodes in the Cluster.
  • Configured a Hadoop cluster with 20-30 nodes (Amazon EC2 Spot Instances) to transfer data from Amazon S3 to HDFS and from HDFS back to Amazon S3 (a transfer sketch follows this list).
  • Performed job and user management using the Capacity Scheduler.
  • Installed Patches and packages on Unix/Linux Servers.
  • Installed and configured the vSphere client, created virtual servers, and allocated resources.
  • Used various Utilities to do Performance Tuning, Client/Server Connectivity and Database Consistency Checks.
  • Analyzed running statistics of Map and Reduce tasks and provided input to the development team for efficient utilization of cluster memory and CPU.
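
For illustration, one common way to move data between S3 and HDFS is DistCp; this is a hypothetical sketch (bucket names and paths are placeholders, the s3n:// scheme assumes the CDH 5-era S3 connector with credentials in core-site.xml), not necessarily the exact method used:

    # Hypothetical example: copy raw data from Amazon S3 into HDFS with DistCp
    hadoop distcp s3n://example-bucket/raw/ hdfs:///data/incoming/

    # Push processed output back to S3, copying only new or changed files
    hadoop distcp -update hdfs:///data/processed/ s3n://example-bucket/output/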

Environment: CDH 5.2.3, Cloudera Manager, Red Hat Linux/CentOS 4, 5, 6, AWS EC2, Logical Volume Manager, HDFS, Hive, Pig, Sqoop, Flume, ESX 5.1/5.5, Apache and Tomcat web servers, Oracle 11g/12c, Oracle RAC 12c, HPSM, HPSA, Kerberos security.

Hadoop Administrator/Developer

Confidential - Cleveland, OH

Responsibilities:

  • Worked on multiple projects on Architecting Hadoop Clusters.
  • Installed, configured, and managed the Hadoop cluster using Cloudera Manager and Puppet.
  • Upgraded Hadoop CDH 4.2 to CDH 4.6 in development environment.
  • Performed metadata backups and upgrades on Hadoop Development cluster.
  • Set up and configured Zookeeper for cluster coordination services.
  • Managed cluster configuration to meet the needs of both I/O-bound and CPU-bound analysis workloads.
  • Managed and reviewed Hadoop Log files for troubleshooting issues.
  • Performed benchmark tests on Hadoop clusters and tweaked the solution based on test results.
  • Commissioned and Decommissioned the Data nodes in Hadoop Cluster.
  • Performed data validation using HIVE dynamic partitioning.
  • Transformed large sets of structured and semi structured data by applying ETL processes using Hive.
  • Developed Map Reduce programs for data analysis.
  • Worked on troubleshooting, monitoring, tuning the performance of Map reduce Jobs.
  • Developed Pig scripts for transformation of raw data into intelligent data.
  • Supported data analysts in running Pig scripts and Hive queries.
  • Scheduled the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs (a representative workflow submission is sketched after this list).
  • Configured the Fair Scheduler on the ResourceManager to manage cluster resources for jobs and users.
  • Migrated data across clusters using distcp.
  • Collaborated with DevOps team to meet the business requirements of customers and proposed Hadoop solutions.
  • Performed data analysis, data cleansing (scrubbing), data validation and verification, and data conversion.
  • Supported data analysis projects through Elastic MapReduce (EMR) on Amazon Web Services (AWS) and the Rackspace cloud. Performed export and import of data to and from S3.
  • Prepared documentation on the cluster configuration for future reference.
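
For illustration, a minimal sketch of submitting and checking an Oozie workflow of the kind referenced above; the Oozie URL, properties file path, and job ID are hypothetical placeholders:

    # Hypothetical example: submit an Oozie workflow that chains MapReduce, Hive, and Pig actions
    oozie job -oozie http://oozie-host:11000/oozie \
      -config /home/hadoop/jobs/etl/job.properties -run

    # Check the status of the running workflow (job ID is a placeholder)
    oozie job -oozie http://oozie-host:11000/oozie -info 0000001-170101000000001-oozie-oozi-W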

Environment: Cloudera Hadoop, Linux, HDFS, Hive, Pig, Sqoop, Flume, Zookeeper, HBase, YARN, RDBMS, Oozie, AWS.

Hadoop Administrator

Confidential - Grand Rapids, MI

Responsibilities:

  • Managed, Administered and Monitored clusters in Hadoop Infrastructure.
  • Diligently teamed with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
  • Collaborated with application teams to install Hadoop updates, patches, when required.
  • Managed connectivity of nodes and security on Hadoop cluster.
  • Commissioned and decommissioned DataNodes in the cluster (a decommissioning sketch follows this list).
  • Implemented Name Node High Availability.
  • Worked with data delivery teams to setup new Hadoop users.
  • Installed and configured Hadoop ecosystem components like Hive, Pig, Flume, Sqoop, and HBase.
  • Installed the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs.
  • Configured Metastore for Hadoop ecosystem and management tools.
  • Hands-on experience with Nagios and Ganglia monitoring tools.
  • Managed HDFS data storage and supported running MapReduce jobs.
  • Performed tuning and troubleshooting of MR jobs by analyzing and reviewing Hadoop log files.
  • Loaded data into HDFS from dynamically generated files using Flume and from RDBMS using Sqoop.
  • Used Sqoop to export the analyzed data from HDFS to RDBMS for business use cases.
  • Skillfully used distcp to migrate data between and across the clusters.
  • Installed and configured Zookeeper to co-ordinate Hadoop daemons.
  • Coordinated root cause analysis efforts to minimize future system issues.
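
For illustration, one common command-line approach (outside Cloudera Manager) to the DataNode decommissioning referenced above; the hostname and exclude-file path are hypothetical placeholders and must match the dfs.hosts.exclude setting in hdfs-site.xml:

    # Hypothetical example: gracefully decommission a DataNode
    echo "worker-node-17.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Watch decommissioning progress, then rebalance the remaining nodes
    hdfs dfsadmin -report | grep -A 3 "worker-node-17"
    hdfs balancer -threshold 10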

Environment: Cloudera 4.2, HDFS, Hive, Pig, Sqoop, HBase, Chef, RHEL, Mahout, Tableau, MySQL, Shell Scripting.

Linux Administrator

Confidential, CT

Responsibilities:

  • Installed, configured, and administered Red Hat Linux servers, providing support and regular upgrades using Kickstart-based network installation.
  • Provided 24x7 System Administration support for Red Hat Linux 3.x, 4.x, 5.x servers and resolved trouble tickets on shift rotation basis.
  • Configured HP ProLiant, Dell PowerEdge R-series, Cisco UCS, and IBM p-series machines for production, staging, and test environments.
  • Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts.
  • Configured Linux native device mappers (MPIO) and EMC PowerPath for RHEL 5.5, 5.6, and 5.7.
  • Used performance monitoring utilities such as iostat, vmstat, top, netstat, and sar (see the sketch after this list).
  • Worked on support for AIX matrix subsystem device drivers.
  • Worked on both physical and virtual computing, from the desktop to the data center, using SUSE Linux; built, installed, loaded, and configured boxes.
  • Experienced in Installation, Configuration, and Troubleshooting of Tivoli Storage Manager.
  • Remediated failed backups, took manual incremental backups of failing servers.
  • Upgraded TSM from 5.1.x to 5.3.x. Worked on HMC configuration and management of the HMC console, including upgrades and micro-partitioning.
  • Installed and configured adapter cards and cables. Worked on Integrated Virtual Ethernet and building VIO servers.
  • Installed SSH keys for passwordless login so that SRM data could be sent to the server for daily backups of vital data such as processor and disk utilization.
  • Provided redundancy with HBA cards, EtherChannel configuration, and network devices.
  • Coordinated with application and database teams for troubleshooting the application.
  • Coordinated with the SAN team for allocation of LUNs to increase file system space.
  • Configured and administered Fibre Channel adapter cards and handled the AIX part of the SAN.
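
For illustration, a minimal sketch of the routine performance checks listed above; the intervals, counts, and piped filters are arbitrary examples:

    # Hypothetical example: spot-check CPU, memory, disk, and network on a RHEL server
    vmstat 5 3                  # memory, swap, and CPU over three 5-second samples
    iostat -dx 5 3              # per-device disk utilization and service times
    sar -n DEV 5 3              # network throughput per interface
    netstat -tulpn | head -20   # listening sockets and the processes that own them
    top -b -n 1 | head -20      # one batch-mode snapshot of the busiest processes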

Environment: Red Hat Linux (RHEL 3/4/5), Solaris 10, Logical Volume Manager, Sun & Veritas Cluster Server, VMware, Global File System, Red Hat Cluster Server.

Systems Administrator

Confidential

Responsibilities:

  • Administered RHEL 4.x and 5.x which includes installation, testing, tuning, upgrading and loading patches, troubleshooting both physical and virtual server issues.
  • Created & cloned Linux Virtual Machines, templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
  • Installed Red Hat Linux using Kickstart and applied security policies for hardening the servers based on company policies.
  • Installed and verified that all AIX/Linux patches and updates were applied to the servers.
  • Installed RPM and YUM packages and patches and performed other server management tasks.
  • Managed routine system backups and scheduled jobs (such as enabling and disabling cron jobs), and enabled system and network logging of servers for maintenance, performance tuning, and testing.
  • Worked and performed data-center operations including rack mounting and cabling.
  • Installed, configured, and maintained WebLogic 10.x and Oracle 10g on Solaris and Red Hat Linux.
  • Set up user and group login IDs, printing parameters, network configuration, and passwords; resolved permission issues; and managed user and group quotas.
  • Configured multipathing, added SAN storage, and created physical volumes, volume groups, and logical volumes (a representative LVM sketch follows this list).
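
For illustration, a minimal sketch of bringing a newly presented SAN LUN into LVM and growing a filesystem; the device name, volume group, logical volume, size, and mount point are hypothetical placeholders:

    # Hypothetical example: extend a filesystem onto a new multipath SAN device
    pvcreate /dev/mapper/mpath3           # initialize the device for LVM
    vgextend vg_data /dev/mapper/mpath3   # add it to an existing volume group
    lvextend -L +50G /dev/vg_data/lv_app  # grow the logical volume by 50 GB
    resize2fs /dev/vg_data/lv_app         # grow the ext filesystem to match
    df -h /app                            # confirm the new capacity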

Environment: RHEL, VMware 3.5, Solaris 2.6/2.7/8, Oracle 10g, WebLogic 10.x, Veritas NetBackup, Veritas Volume Manager, Samba, NFS, NIS, LVM, Shell Scripting.
