Hadoop Administrator Resume
Kalamazoo, MI
PROFESSIONAL SUMMARY:
- Over 8 years of project-based professional experience with IT technologies, with good working knowledge as a Hadoop admin, including experience configuring Hadoop clusters and installing Hadoop ecosystem technologies.
- Around 3.5 years of experience in Hadoop infrastructure, including MapReduce, Hive, Oozie, Sqoop, HBase, Pig, HDFS, YARN, and SAS interface configuration projects in direct client-facing roles.
- Extensive experience in installing, configuring, and administering Hadoop clusters for major Hadoop distributions like CDH5 and HDP.
- Hands-on experience configuring Hadoop clusters in professional environments and on Amazon Web Services (AWS) using EC2 instances.
- Experience with the complete software development life cycle, including design, development, testing, and implementation of moderately to highly complex systems.
- Hands-on experience in installing, configuring, supporting, and managing Hadoop clusters using Apache, Hortonworks, and Cloudera distributions.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and tuning servers for optimal cluster performance.
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes, in both directions (see the Sqoop sketch after this list).
- Excellent understanding and knowledge of NoSQL databases like HBase and Cassandra.
- Experience in monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network.
- Experience in setting up standards and processes for Hadoop-based application design and implementation.
- Strong expertise in data-warehouse tooling, including ETL tools like Ab Initio and Informatica, BI tools like Cognos, MicroStrategy, and Tableau, relational database systems like Oracle PL/SQL, and Unix shell scripting.
- Strong expertise working in legacy IBM mainframe environments: z/OS, MVS, COBOL, DB2, CICS, JCL, and enterprise-level job schedulers like Control-M, AutoSys, and Tivoli.
- Experience in deploying and managing multi-node development, testing, and production clusters.
- Experience in developing and scheduling ETL workflows in Hadoop using Oozie.
- Substantial experience writing MapReduce jobs in Java, as well as working with Pig, Flume, ZooKeeper, Hive, and Storm.
- Strong experience in System Administration, Installation, Upgrading, Patches, Migration, Configuration, Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring, and Fine-tuning on Linux systems (RHEL).
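A minimal sketch of the Sqoop import/export pattern referenced above; the connection string, credentials, tables, and paths are illustrative placeholders, not values from any specific engagement:

```sh
# Import a table from the relational database into HDFS.
# Hostname, service name, user, and table are hypothetical.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMERS \
  --target-dir /user/etl/customers \
  --num-mappers 4

# Export processed results from HDFS back into the database.
sqoop export \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_SUMMARY \
  --export-dir /user/etl/customer_summary \
  --num-mappers 4
```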
KEY SKILLS:
- HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Scala, ZooKeeper, Ambari, Oozie, Storm, Kafka, Flume, and Spark.
- NoSQL, MySQL, Oracle, SQL Server, Cassandra, Couchbase, Cloudera, Hortonworks, AWS
- Shell Scripting, Python, JavaScript
- Apache Tomcat, JBoss, and Apache HTTP web server
- NetBeans, Eclipse, Visual Studio, Microsoft SQL Server, MS Office
- Kerberos, Docker, Nagios, Ganglia, Puppet, Ansible, Perl, Chef
- Agile, Scrum environments
- Java, HTML, MVC, Hibernate
- Windows XP/7/8, UNIX, Mac OS
PROFESSIONAL EXPERIENCE:
Confidential, Kalamazoo, MI
Hadoop Administrator
Responsibilities:
- Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes (see the decommissioning sketch after this list), troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
- Performed Cloudera Hadoop upgrades and patches, and installed ecosystem products through Cloudera Manager, along with Cloudera Manager upgrades.
- Hands-on experience configuring a Hadoop cluster in a professional environment and on Amazon Web Services (AWS) using an EC2 instance.
- Interacted with Cloudera support, logged issues in the Cloudera portal, and fixed them per recommendations. Worked on the continuous integration tool Jenkins.
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data. Implemented Kerberos security in all environments.
- Very good experience with NoSQL databases like HBase, Cassandra, and DynamoDB (AWS).
- Worked on a Cassandra database in a multi-datacenter cluster, with experience in setting up Cassandra clusters.
- Used Spark Streaming to consume topics from the distributed messaging source Kafka and periodically push batches of data to Spark for real-time processing.
- Imported web server logs into HDFS with Flume, using a spooling-directory source to load data from the local file system (see the Flume sketch after this list).
- Involved in setup, configuration and management of security for Hadoop clusters using Kerberos and integration with LDAP/AD at an Enterprise level.
- Monitored cluster resources and configured alerts using Cloudera Manager. Experience working with the reporting tools Cognos and QlikView.
- Managed Amazon Web Services (AWS) infrastructure with automation and configuration management tools such as Chef, Ansible, and Puppet; designed custom-built cloud-hosted solutions with specific AWS product suite experience.
- Retrieved data from HDFS into relational databases with Sqoop. Parsed, cleansed, and mined useful, meaningful data in HDFS using MapReduce for further analysis. Fine-tuned Hive jobs for optimized performance.
- Implemented custom Flume interceptors to filter data and defined channel selectors to multiplex data into different sinks. Worked on NoSQL databases including HBase and Cassandra.
- Analyzed the Hadoop cluster and various big data analytic tools, including Pig, the HBase database, and Sqoop.
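A brief sketch of the DataNode decommissioning flow mentioned above, assuming the NameNode's dfs.hosts.exclude property points at the exclude file shown (on CDH this is typically driven through Cloudera Manager rather than by hand):

```sh
# Hypothetical host and exclude-file path.
echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

# Tell the NameNode to re-read its include/exclude lists;
# it then re-replicates the node's blocks elsewhere.
hdfs dfsadmin -refreshNodes

# Watch until the node reports "Decommission Status : Decommissioned".
hdfs dfsadmin -report | grep -A 3 datanode07
```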
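A minimal sketch of a spooling-directory Flume agent like the one described above; the agent name, paths, and channel sizing are assumptions for illustration:

```sh
# Write an illustrative agent definition (memory channel, HDFS sink).
cat > /etc/flume-ng/conf/spool-agent.conf <<'EOF'
agent.sources  = spool
agent.channels = mem
agent.sinks    = hdfs-out

agent.sources.spool.type     = spooldir
agent.sources.spool.spoolDir = /var/log/webservers/incoming
agent.sources.spool.channels = mem

agent.channels.mem.type     = memory
agent.channels.mem.capacity = 10000

agent.sinks.hdfs-out.type                   = hdfs
agent.sinks.hdfs-out.channel                = mem
agent.sinks.hdfs-out.hdfs.path              = /data/weblogs/%Y-%m-%d
agent.sinks.hdfs-out.hdfs.fileType          = DataStream
agent.sinks.hdfs-out.hdfs.useLocalTimeStamp = true
EOF

# Start the agent against that configuration.
flume-ng agent --name agent --conf /etc/flume-ng/conf \
  --conf-file /etc/flume-ng/conf/spool-agent.conf
```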
Environment: HDFS, MapReduce, AWS, Hive 1.1.0, Hue 3.9.0, Pig, Flume, Scala, Oozie, Sqoop, CDH5, HBase, Spark, Solr, Kafka, Storm, Impala, Cassandra, MySQL, Shell Scripting, and Oracle 9.
Confidential, American Fork, UT
Hadoop Administrator
Responsibilities:
- Responsible for the implementation and ongoing administration of Hadoop infrastructure. Implemented the Kerberos security authentication protocol for the existing cluster (see the Kerberos sketch after this list).
- Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Experienced in setting up Hortonworks clusters and installing all ecosystem components through Ambari and manually from the command line.
- Hands on experience in provisioning and managing multi-node Hadoop Clusters on public cloud environment Amazon Web Services (AWS) - EC2 and on private cloud infrastructure.
- Involved in implementing security on the Hortonworks Hadoop cluster with Kerberos, working with the operations team to move from a non-secured cluster to a secured cluster.
- Installed, upgraded, and managed the Hadoop cluster on Hortonworks.
- Experienced in adding, installing, and removing components through Ambari (see the Ambari REST sketch after this list).
- Monitoring systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Used Informatica PowerCenter to create mappings, mapplets, user-defined functions, workflows, worklets, sessions, and tasks. Worked on installation of a DataStax Cassandra cluster.
- Moved data in both directions between RDBMS and Hive/HDFS using Sqoop. Installed and managed multiple Hadoop clusters: production, stage, and development.
- Set up monitoring tools for Hadoop monitoring and alerting; monitored and maintained the Hadoop, HBase, and ZooKeeper services.
- Wrote scripts to automate application deployments and configurations. Tuned and monitored Hadoop cluster performance. Troubleshot and resolved Hadoop cluster-related system problems.
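A short sketch of the kind of Kerberos setup referenced above: creating service principals and keytabs for Hadoop daemons. The realm, hosts, and keytab paths are placeholders:

```sh
# Create service principals with random keys (hypothetical realm/hosts).
kadmin.local -q "addprinc -randkey nn/master01.example.com@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey dn/worker01.example.com@EXAMPLE.COM"

# Export the NameNode principal to a keytab for the service to use.
kadmin.local -q "xst -k /etc/security/keytabs/nn.service.keytab nn/master01.example.com@EXAMPLE.COM"

# Verify the keytab and confirm authentication works.
klist -kt /etc/security/keytabs/nn.service.keytab
kinit -kt /etc/security/keytabs/nn.service.keytab nn/master01.example.com@EXAMPLE.COM
```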
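A sketch of driving Ambari from the command line via its REST API, as an alternative to the web UI; the Ambari host and cluster name ("prod") are assumptions:

```sh
# Stop a service by setting its desired state to INSTALLED.
curl -u admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Stop HIVE via REST"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
  http://ambari.example.com:8080/api/v1/clusters/prod/services/HIVE

# Start it again by setting the desired state back to STARTED.
curl -u admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Start HIVE via REST"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
  http://ambari.example.com:8080/api/v1/clusters/prod/services/HIVE
```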
Environment: Hadoop, Ambari, MapReduce, HDFS, Pig, Hive, YARN, HBase, Sqoop, Flume, ZooKeeper, Hortonworks, Eclipse, MySQL, Python, Shell Scripting.
Confidential, Lexington, KY
Hadoop/Linux Administrator
Responsibilities:
- Performance tuning for infrastructure and Hadoop settings for optimal performance of jobs and their throughput.
- Involved in analyzing system failures on lab clusters, identifying root causes, and recommending courses of action. Designed cluster tests before and after upgrades to validate cluster status.
- Involved in coding JUnit test cases and using ANT for building the application. Implemented Hadoop NameNode HA services to make Hadoop services highly available (see the HA sketch after this list).
- Regularly commissioned and decommissioned nodes as disk failures occurred, using Cloudera Manager.
- Documented and prepared runbooks of system processes and procedures for future reference.
- Automated data loading between the production and disaster recovery clusters. Migrated the Hive schema from the production cluster to the DR cluster.
- Integrated Kafka with Flume in a sandbox environment using a Kafka source and Kafka sink. Configured a Flume agent with a syslog source to receive data from syslog servers.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (MapReduce, Pig, Hive, Sqoop) as well as system-specific jobs (see the Oozie sketch after this list).
- Worked on multiple projects spanning Hadoop cluster architecture, installation, configuration, and management.
- Designed and configured Red Hat Enterprise Linux 5/4/3 user administration, management, and archiving.
- Expertise in package management: creating, installing, and configuring packages using Red Hat RPM.
- Configured the Red Hat Satellite server; managed, configured, and maintained customer entitlements, including upgrading and patching Linux servers.
- Maintained and modified hardware and software components, content and documentation.
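A minimal sketch of checking and exercising NameNode HA as mentioned above; the service IDs nn1/nn2 are placeholders that would come from dfs.ha.namenodes.<nameservice> in hdfs-site.xml:

```sh
# Confirm which NameNode is active and which is standby.
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Gracefully fail over from nn1 to nn2 (fencing is applied as configured).
hdfs haadmin -failover nn1 nn2

# Sanity-check the filesystem after the switch.
hdfs dfsadmin -report | head
```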
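A minimal sketch of an Oozie workflow of the kind described above, wrapping a single Sqoop action; the database URL, HDFS paths, and Oozie server address are illustrative:

```sh
# Write a one-action workflow definition.
cat > workflow.xml <<'EOF'
<workflow-app name="nightly-etl" xmlns="uri:oozie:workflow:0.4">
  <start to="sqoop-import"/>
  <action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <command>import --connect jdbc:mysql://dbhost/sales --table orders --target-dir /data/orders</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>Sqoop import failed</message></kill>
  <end name="end"/>
</workflow-app>
EOF

# Stage the workflow in HDFS and submit it to the Oozie server.
hdfs dfs -put -f workflow.xml /user/etl/nightly-etl/
oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run
```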
Environment: Hadoop, HDFS, MapReduce, Shell Scripting, Spark, Splunk, Solr, Pig, Hive, HBase, Sqoop, Flume, Oozie, ZooKeeper, NoSQL, Red Hat Linux.
Confidential
Linux/Unix Administrator
Responsibilities:
- Installed, configured, and maintained Debian/Red Hat servers at multiple data centers, with hands-on experience working with production servers.
- Installed and configured the monitoring tools Munin and Nagios to monitor network bandwidth and hard drive status.
- Wrote scripts to migrate consumer data from one production server to another over the network using Bash and Perl.
- Configured a Red Hat Kickstart server for installing multiple production servers (see the Kickstart sketch after this list).
- Configured and administered DNS, LDAP, NFS, NIS, NIS+, and Sendmail on Red Hat Linux/Debian servers. Automated server builds using SystemImager, PXE, Kickstart, and JumpStart.
- Wrote, optimized, and troubleshot dynamically created SQL within procedures.
- Automated tasks using shell scripting to run diagnostics on failed disk drives (see the diagnostics sketch after this list). Configured the Global File System (GFS) and the Zettabyte File System (ZFS).
- Troubleshot production servers with ipmitool, connecting over Serial over LAN (SOL). Configured the system imaging tools Clonezilla and SystemImager for data center migration.
- Installed and configured a DHCP server to issue IP leases to production servers. Managed Red Hat Linux user accounts, groups, directories, and file permissions.
- Thoroughly documented the steps involved in data migration on production servers, as well as testing procedures before the migration.
- Provided 24/7 on-call support for Linux production servers. Responsible for maintaining security on Red Hat Linux.
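A sketch of the Kickstart setup mentioned above: a minimal ks.cfg served over HTTP for PXE-booted installs. The repository URL, timezone, and partitioning choices are illustrative:

```sh
# Publish an illustrative Kickstart file from the web server's docroot.
cat > /var/www/html/ks/rhel-prod.cfg <<'EOF'
install
url --url=http://repo.example.com/rhel/5/os/x86_64
lang en_US.UTF-8
keyboard us
rootpw changeme
# (in practice, use: rootpw --iscrypted <hash>)
timezone America/Detroit
bootloader --location=mbr
clearpart --all --initlabel
autopart
reboot
%packages
@base
EOF

# PXE-booted machines then pull the file with a kernel argument like:
#   ks=http://ks-server.example.com/ks/rhel-prod.cfg
```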
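A small sketch of the failed-disk diagnostics scripting described above, assuming smartmontools is installed; the device glob and log path are assumptions:

```sh
#!/bin/bash
# Scan SATA/SCSI disks with smartctl and log any drive not reporting PASSED.
LOG=/var/log/disk-health.log
for dev in /dev/sd?; do
    # smartctl -H prints "... overall-health self-assessment test result: PASSED"
    status=$(smartctl -H "$dev" | awk -F: '/overall-health/ {gsub(/ /,"",$2); print $2}')
    if [ "$status" != "PASSED" ]; then
        echo "$(date '+%F %T') $dev health check: ${status:-UNKNOWN}" >> "$LOG"
    fi
done
```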
Environment: IBM blade servers, WebSphere 5.x/6.x, Apache 1.2/1.3/2.x, Oracle, Logical Volume Manager, VERITAS NetBackup 5.x/6.0, VMware ESX 3.x/2.x, RHEL 5.x/4.x, Solaris 8/9/10, Sun Fire.
