- 8+ years of experience in the IT industry, including 5 years of experience in Hadoop administration and development using Apache, Cloudera (CDH), and Hortonworks (HDP) distributions.
- Experience in installing, configuring, supporting, and monitoring 100+ node Hadoop clusters on major distributions such as CDH 4 and CDH 5 using Cloudera Manager and Apache Ambari.
- In-depth understanding of the Hadoop framework (versions 1 and 2), including YARN, MapReduce, and HDFS, and their components: JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, NodeManager, and ApplicationMaster.
- Experience in installing and configuring Hadoop ecosystem components such as Hive, Pig, Spark, Sqoop, Flume, Kafka, Oozie, ZooKeeper, HBase, MongoDB, Cassandra, Impala, and R.
- Expertise in designing and implementing complete end-to-end Hadoop infrastructure.
- Experience in cluster capacity planning and optimizing clusters to meet SLAs.
- Well versed in managing and reviewing Hadoop and ecosystem service log files to determine root causes.
- Configured metadata backups and performed NameNode disaster recovery using backed-up edit logs and fsimage.
- Strong knowledge of configuring NameNode High Availability and NameNode Federation.
- Experience configuring Rack Awareness in the Hadoop cluster.
- Experience importing and exporting data between RDBMS and Hadoop using Sqoop, and troubleshooting issues with Sqoop jobs (a minimal sketch follows this summary).
- Experience using Flume to stream data into HDFS from various sources.
- Experience using the DistCp command-line utility to copy files between clusters.
- Provided cluster coordination services through ZooKeeper.
- Set up Kerberos authentication for Hadoop.
- Hands-on experience with network monitoring daemons such as Ganglia and service monitoring tools such as Nagios.
- Experience configuring the Capacity Scheduler, Fair Scheduler, and HOD scheduler for job and user management.
- Defined job flows in the Hadoop environment using tools such as Oozie for data scrubbing and processing.
- Experience performing minor and major upgrades, HDFS balancing, and commissioning and decommissioning of DataNodes on Hadoop clusters.
- Experience using Puppet and Chef, and writing/modifying shell scripts, to automate configuration and monitor clusters.
- Good knowledge of setting up Hadoop clusters on AWS EC2 and S3, and of automating cluster setup and expansion in the AWS cloud.
- Hands-on experience writing ad-hoc queries to move data from HDFS into Hive and analyzing it using HiveQL.
- Very good knowledge of the ETL process: data sourcing, mapping, transformation, conversion, and loading.
- Experience performing ETL on structured and semi-structured data using Pig Latin scripts.
- Development experience with RDBMSs, including writing SQL queries, views, stored procedures, and triggers.
- Extensive experience in Linux administration activities on RHEL and CentOS distributions.
- Very good knowledge of Data warehouse tools.
- Good knowledge of and experience with Core Java, JSP, Servlets, multithreading, JDBC, and HTML.
- Good understanding of the Software Development Lifecycle (SDLC) and of Waterfall and Agile methodologies.
- Effective problem solving and interpersonal skills. Ability to learn and use new technologies quickly.
- Self-starter with ability to work independently as well as within a team environment.
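
A minimal sketch of the Sqoop import/export pattern referenced above; the JDBC URL, credentials, table names, HDFS paths, and mapper count are hypothetical placeholders.

```sh
# Import a table from MySQL into HDFS with four parallel mappers
# (connection details and names are placeholders).
sqoop import \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4

# Export aggregated results from HDFS back to the RDBMS.
sqoop export \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username etl_user -P \
  --table order_summary \
  --export-dir /data/curated/order_summary
```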
Big Data components: HDFS, MapReduce, YARN, HBase, Cassandra, MongoDB, Pig, Hive, Spark, Impala, Kafka, Sqoop, Flume, Oozie, ZooKeeper, & Kettle
Programming Languages: HiveQL, Pig Latin, Shell scripting, Java, J2EE, SQL, C/C++, & PL/SQL
UNIX Tools: Apache, Yum, RPM.
Operating Systems: Red Hat Linux, CentOS, Ubuntu, Windows, Mac OS
Protocols: TCP/IP, HTTP and HTTPS
Web Servers: Apache Tomcat
Cluster Management Tools: Cloudera Manager, Apache Ambari
Methodologies: Agile, V-model, Waterfall model
Databases: HBase, MongoDB, Cassandra, Oracle 10g, MySQL, MS SQL server
Encryption Tools: VeraCrypt, AxCrypt, BitLocker, GNU Privacy Guard
Tools: FileZilla, PuTTY, TOAD SQL Client, MySQL Workbench, Pentaho (ETL)
Confidential - Chicago, IL
- Installed, configured, and administered a CDH 5.2.3 Hadoop cluster and its components.
- Deployed hardware and software for Hadoop to expand memory and storage on nodes as required.
- Performed data exchange between HDFS and various web applications and databases using Sqoop and Flume.
- Monitored data streaming between web sources and HDFS.
- Configured YARN and optimized memory-related settings.
- Collaborated with infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Performed architecture design, data modeling, and implementation of SQL and big data platforms and analytic applications for consumer products.
- Partitioned and queried the data in Hive for further analysis by the BI team.
- Fine-tuned Hive jobs for better performance.
- Performed rolling upgrades of Hadoop cluster.
- Installed operating system and Hadoop updates, patches, version upgrades when required.
- Monitored job performance on the Hadoop cluster and performed capacity planning.
- Managed configuration changes based on volume of the data being processed.
- Monitored connectivity and security of Hadoop cluster.
- Implemented Kerberos to authenticate all services in the Hadoop cluster (a principal/keytab sketch follows this section's Environment line).
- Imported and exported data between RDBMS and HDFS using Sqoop.
- Performed data migration to Hadoop from existing data stores.
- Set up new Linux users and tested HDFS, Hive, Pig, and MapReduce access for them.
- Performed Linux systems administration on production and development servers (RHEL, CentOS and other UNIX utilities).
- Commissioned and decommissioned data nodes in the Cluster.
- Configured a 20-30 node Hadoop cluster on Amazon EC2 Spot Instances to transfer data from Amazon S3 to HDFS and from HDFS back to S3 (see the DistCp sketch after this list).
- Managed jobs and users using the Capacity Scheduler.
- Installed Patches and packages on Unix/Linux Servers.
- Installed and configured the vSphere client, created virtual servers, and allocated resources.
- Used various Utilities to do Performance Tuning, Client/Server Connectivity and Database Consistency Checks.
- Analyzed running statistics of map and reduce tasks and provided input to the development team for efficient utilization of cluster memory and CPU.
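
A minimal sketch of the S3/HDFS transfers noted in the bullet above, assuming the AWS access and secret keys are configured in core-site.xml; bucket and path names are hypothetical, and older CDH releases would use the s3n:// scheme in place of s3a://.

```sh
# Pull a dataset from S3 into HDFS (bucket and paths are placeholders).
hadoop distcp s3a://example-bucket/landing/events /data/raw/events

# Push curated output back to S3; -update copies only files that changed.
hadoop distcp -update /data/curated/events s3a://example-bucket/curated/events
```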
Environment: CDH 5.2.3, Cloudera Manager, Red Hat Linux/CentOS 4, 5, 6, AWS EC2, Logical Volume Manager, HDFS, Hive, Pig, Sqoop, Flume, ESX 5.1/5.5, Apache and Tomcat web servers, Oracle 11g/12c, Oracle RAC 12c, HPSM, HPSA, Kerberos security.
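
A sketch of one step in the Kerberos rollout above: creating a service principal and keytab on the KDC and verifying it. The realm, hostname, and keytab path are hypothetical; the per-service configuration is otherwise managed through Cloudera Manager.

```sh
# Create a service principal for a DataNode host and write its keytab
# (realm, hostname, and paths are placeholders).
kadmin.local -q "addprinc -randkey hdfs/dn01.example.com@EXAMPLE.COM"
kadmin.local -q "ktadd -k /etc/security/keytabs/hdfs.keytab hdfs/dn01.example.com@EXAMPLE.COM"

# Verify the keytab by obtaining a ticket, then list it.
kinit -kt /etc/security/keytabs/hdfs.keytab hdfs/dn01.example.com@EXAMPLE.COM
klist
```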
Confidential - Cleveland, OH
- Worked on multiple projects architecting Hadoop clusters.
- Installed, configured, and managed the Hadoop cluster using Cloudera Manager and Puppet.
- Upgraded Hadoop from CDH 4.2 to CDH 4.6 in the development environment.
- Performed metadata backups and upgrades on Hadoop Development cluster.
- Set up and configured Zookeeper for cluster coordination services.
- Managed cluster configuration to meet the needs of both I/O-bound and CPU-bound analysis workloads.
- Managed and reviewed Hadoop Log files for troubleshooting issues.
- Performed benchmark tests on Hadoop clusters and tuned the setup based on the results.
- Commissioned and decommissioned DataNodes in the Hadoop cluster.
- Performed data validation using Hive dynamic partitioning (a sketch follows this list).
- Transformed large sets of structured and semi-structured data by applying ETL processes using Hive.
- Developed MapReduce programs for data analysis.
- Troubleshot, monitored, and tuned the performance of MapReduce jobs.
- Developed Pig scripts for transformation of raw data into intelligent data.
- Supported data analysts in running Pig scripts and Hive queries.
- Scheduled the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs (a submission sketch follows this section's Environment line).
- Configured the Fair Scheduler on the ResourceManager to manage cluster resources across jobs and users.
- Migrated data across clusters using DistCp.
- Collaborated with DevOps team to meet the business requirements of customers and proposed Hadoop solutions.
- Performed data analysis, data cleansing (scrubbing), data validation and verification, and data conversion.
- Supported data analysis projects using Elastic MapReduce (EMR) on AWS and the Rackspace cloud; performed export and import of data to and from S3.
- Prepared documentation of the cluster configuration for future reference.
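
A minimal sketch of Hive dynamic partitioning as referenced above; the table and column names are hypothetical.

```sh
# Rewrite raw events into a date-partitioned table; Hive derives the
# partition value from the last column selected (names are placeholders).
hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE events_partitioned PARTITION (event_date)
SELECT user_id, event_type, event_ts, event_date
FROM events_raw;
"
```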
Environment: Cloudera Hadoop, Linux, HDFS, Hive, Pig, Sqoop, Flume, Zookeeper, HBase, YARN, RDBMS, Oozie, AWS.
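
A sketch of submitting one of the Oozie workflows mentioned above from the shell; the Oozie URL, HDFS application path, and property values are hypothetical.

```sh
# job.properties tells Oozie where the workflow definition lives in HDFS
# (all values are placeholders).
cat > job.properties <<'EOF'
nameNode=hdfs://nn01.example.com:8020
jobTracker=rm01.example.com:8032
oozie.wf.application.path=${nameNode}/user/etl/workflows/daily-ingest
EOF

# Submit and start the workflow, then check its status by job ID.
oozie job -oozie http://oozie01.example.com:11000/oozie -config job.properties -run
oozie job -oozie http://oozie01.example.com:11000/oozie -info <job-id>
```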
Confidential - Grand Rapids, MI
- Managed, administered, and monitored clusters in the Hadoop infrastructure.
- Diligently teamed with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Collaborated with application teams to install Hadoop updates, patches, when required.
- Managed connectivity of nodes and security on Hadoop cluster.
- Commissioned and decommissioned DataNodes in the cluster (a decommissioning sketch follows this list).
- Implemented NameNode High Availability (a state-check sketch follows this section's Environment line).
- Worked with data delivery teams to setup new Hadoop users.
- Installed and configured Hadoop ecosystem components such as Hive, Pig, Flume, Sqoop, and HBase.
- Installed the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs.
- Configured the Metastore for the Hadoop ecosystem and management tools.
- Monitored the cluster hands-on with Nagios and Ganglia.
- Managed HDFS data storage and supported running MapReduce jobs.
- Tuned and troubleshot MapReduce jobs by analyzing and reviewing Hadoop log files.
- Loaded data into HDFS from dynamically generated files using Flume and from RDBMS using Sqoop.
- Used Sqoop to export the analyzed data from HDFS to the RDBMS for business use cases.
- Used DistCp to migrate data between and across clusters.
- Installed and configured ZooKeeper to coordinate Hadoop daemons.
- Coordinated root cause analysis efforts to minimize future system issues.
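
A minimal sketch of the decommissioning flow referenced above, assuming dfs.hosts.exclude in hdfs-site.xml points at the exclude file shown; the hostname and paths are hypothetical.

```sh
# Add the host to the exclude file named by dfs.hosts.exclude, then have
# the NameNode re-read it; the node drains its blocks before removal.
echo "dn07.example.com" >> /etc/hadoop/conf/dfs.exclude
hdfs dfsadmin -refreshNodes

# Watch decommissioning progress, then rebalance the remaining nodes.
hdfs dfsadmin -report | grep -A 3 "dn07.example.com"
hdfs balancer -threshold 10
```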
Environment: Cloudera 4.2, HDFS, Hive, Pig, Sqoop, HBase, Chef, RHEL, Mahout, Tableau, MySQL, Shell Scripting.
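
A quick sketch of verifying the NameNode HA state mentioned above; the service IDs nn1 and nn2 are hypothetical values from hdfs-site.xml.

```sh
# Check which NameNode is active and which is standby, and fail over
# manually if needed (nn1/nn2 are placeholder service IDs).
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
hdfs haadmin -failover nn1 nn2
```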
- Installed, configured, and administered Red Hat Linux servers, provided ongoing server support, and performed regular upgrades using Kickstart-based network installation.
- Provided 24x7 system administration support for Red Hat Linux 3.x, 4.x, and 5.x servers and resolved trouble tickets on a shift rotation basis.
- Configured HP ProLiant, Dell PowerEdge R-series, Cisco UCS, and IBM p-series machines for production, staging, and test environments.
- Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts.
- Configured Linux native device mapper multipathing (MPIO) and EMC PowerPath for RHEL 5.5, 5.6, and 5.7.
- Used performance monitoring utilities such as iostat, vmstat, top, netstat, and sar (representative invocations follow this list).
- Worked on support for AIX matrix subsystem device drivers.
- Worked with both physical and virtual computing, from the desktop to the data center, using SUSE Linux; built, installed, loaded, and configured boxes.
- Experienced in Installation, Configuration, and Troubleshooting of Tivoli Storage Manager.
- Remediated failed backups and took manual incremental backups of failing servers.
- Upgraded TSM from 5.1.x to 5.3.x; worked on HMC configuration and management of the HMC console, including upgrades and micro-partitioning.
- Installed and configured adapter cards and cables; worked on Integrated Virtual Ethernet and built VIO servers.
- Installed SSH keys for passwordless logins so SRM data could be collected from servers during daily backups of vital data such as processor and disk utilization.
- Provided redundancy with HBA cards, EtherChannel configuration, and network devices.
- Coordinated with application and database teams for troubleshooting the application.
- Coordinated with the SAN team on allocation of LUNs to increase file system space.
- Configured and administered fibre channel adapters and handled the AIX side of the SAN.
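
Representative invocations of the utilities listed above, as one might run them while investigating a slow server; the sampling intervals and counts are arbitrary examples.

```sh
# Snapshot CPU, memory, disk, and network health on a suspect node.
vmstat 5 3          # memory and CPU: three samples, five seconds apart
iostat -dx 5 3      # per-device I/O utilization and service times
sar -n DEV 5 3      # per-interface network throughput
netstat -tulpn      # listening sockets and the processes that own them
```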
Environment: Red Hat Linux (RHEL 3/4/5), Solaris 10, Logical Volume Manager, Sun & Veritas Cluster Server, VMware, Global File System, Red Hat Cluster Server.
- Administered RHEL 4.x and 5.x, including installation, testing, tuning, upgrading, and loading patches, and troubleshot both physical and virtual server issues.
- Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
- Installed Red Hat Linux using Kickstart and applied security policies to harden servers per company policy.
- Installed and verified that all AIX/Linux patches and updates were applied to the servers.
- Installed RPM and YUM package patches and performed other server management tasks.
- Managed routine system backups and scheduled jobs (disabling and enabling cron jobs, enabling system logging and network logging on servers) for maintenance, performance tuning, and testing.
- Worked and performed data-center operations including rack mounting and cabling.
- Installed, configured, and maintained WebLogic 10.x and Oracle 10g on Solaris and Red Hat Linux.
- Set up user and group login IDs, printing parameters, network configuration, and passwords; resolved permission issues; managed user and group quotas.
- Configured multipathing, added SAN LUNs, and created physical volumes, volume groups, and logical volumes (see the sketch below).
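
A minimal sketch of the SAN-to-LVM workflow in the last bullet; the multipath device, volume group, and logical volume names are hypothetical.

```sh
# After the new SAN LUN shows up via multipath as /dev/mapper/mpathb
# (a placeholder name), fold it into LVM and grow the filesystem.
pvcreate /dev/mapper/mpathb
vgextend vg_data /dev/mapper/mpathb
lvextend -L +50G /dev/vg_data/lv_app
resize2fs /dev/vg_data/lv_app   # online resize for ext3/ext4
```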
Environment: RHEL, VMware 3.5, Solaris 2.6/2.7/8, Oracle 10g, Weblogic10.x, Veritas NetBackup, Veritas Volume Manager, Samba, NFS, NIS, LVM, Shell Scripting.