Hadoop Admin Resume

O'Fallon, MO


  • 7+ years of experience in the IT industry, including 4+ years of proven experience in Hadoop development and administration using Cloudera and Hortonworks distributions.
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Apache, Cloudera (CDH4, CDH5) and YARN distributions.
  • Excellent understanding/knowledge of Hadoop architecture and various components such as HDFS, Resource Manager, Node Manager, Name Node, Data Node and MapReduce programming paradigm.
  • Hands-on experience with major components of the Hadoop ecosystem, including Hive, Sqoop and HBase, and knowledge of the MapReduce/HDFS framework.
  • Solid background in UNIX and Linux Network Programming.
  • Experience in deploying and managing the multi-node development, testing and production Hadoop cluster with different Hadoop components (HDFS, Hive, Hue, Impala, Oozie, Solr, Spark, Sqoop, YARN, ZooKeeper) using Cloudera Manager and Hortonworks Ambari.
  • Experience in working with Flume to load log data from multiple sources directly into HDFS.
  • Experience in benchmarking, performing backup and disaster recovery of Name Node metadata and important sensitive data residing on cluster.
  • Experience in performing minor and major upgrades, commissioning and decommissioning of data nodes on Hadoop cluster.
  • Experienced in installing, configuring, supporting and monitoring a 200+ node Hadoop cluster using Cloudera Manager.
  • Experience in designing and implementing HDFS access controls, directory and file permissions, and user authorization that facilitate stable, secure access for multiple users in a large multi-tenant cluster.
  • Strong knowledge in configuring Name Node High Availability and Name Node Federation.
  • Familiar with writing Oozie workflows and job controllers for job automation (shell, Hive, Sqoop).
  • As admin, involved in cluster maintenance, capacity planning, performance tuning, cluster monitoring and troubleshooting.
  • Involved in adding and removing nodes in an existing Hadoop cluster.
  • Involved in benchmarking Hadoop/HBase cluster file systems with various batch jobs and workloads.
  • Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
  • Scheduling all Hadoop/Hive jobs using beeline.
  • Rack aware configuration for quick availability and processing of data.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating keytab files for each service and managing keytabs using keytab tools.
  • Actively participated in the daily Scrum calls, Sprint planning, Effort Estimation, Sprint review, Sprint Demo and Retrospective sessions.
  • Experience with various software development life cycle methodologies such as Waterfall and Agile.
  • Hands on experience in Core Java, Servlets, JSP, JDBC, Struts, Hibernate, Tomcat, Glassfish.
  • Good working experience using Eclipse and NetBeans IDEs.
  • Effective problem solving skills and outstanding interpersonal skills. Ability to work independently as well as within a team environment. Driven to meet deadlines. Ability to learn and use new technologies quickly.


Hadoop/BigData Components: HDFS, Hue, Map Reduce, Hive, Sqoop, Spark, Impala, Oozie, YARN, Flume, Kafka, Pig, Zookeeper

NoSQL Databases: HBase, Cassandra

Programming Language: Java, HTML

Database: PostgreSQL, Derby, MySQL, SQL Server

Scripting Languages: Shell Scripting, Puppet

Frameworks: MVC, Spring, Struts, Hibernate

IDE: NetBeans, Eclipse, Visual Studio, Microsoft SQL Server, MS Office

Operating Systems: Linux (Red Hat, CentOS, Ubuntu), Windows, Mac

Web Servers: Apache Tomcat, JBoss and Apache HTTP Server

Cluster Management Tools: Cloudera Manager and HDP Ambari

Virtualization Technologies: VMware vSphere, Citrix XenServer


Confidential, O'Fallon, MO

Hadoop Admin


  • Managed 200+ node CDH 5.13.1 Hadoop clusters with 14 petabytes of data on RHEL.
  • Involved in the end-to-end process of Hadoop cluster setup, including installation, configuration and monitoring of the Hadoop cluster.
  • Responsible for cluster maintenance, commissioning and decommissioning data nodes, cluster monitoring, troubleshooting, and managing and reviewing Hadoop log files.
  • Monitoring services, architecture design and implementation of Hadoop deployment, configuration management.
  • Experienced in defining job flows with Oozie.
  • Experienced in managing and reviewing Hadoop log files.
  • Installed various Hadoop ecosystem components and Hadoop daemons.
  • Installed and configured Sqoop, Flume and HBase.
  • Managed and reviewed Hadoop Log files as a part of administration for troubleshooting purposes. Communicate and escalate issues appropriately.
  • As admin, followed standard backup policies to ensure high availability of the cluster.
  • Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop Clusters.
  • Installed and configured Hue, Hive, Sqoop and Oozie on the Hadoop cluster.
  • Involved in Installing and configuring Kerberos for the authentication of users and Hadoop daemons.
  • Involved in cluster migration and expansion
  • Involved in Adding new nodes to an existing cluster.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Involved in migrating Oozie database to Postgres due to Derby database deadlock issues.
  • Involved in upgrades such as CDH 5.11.1 to 5.12.1 and Spark 1.6 to Spark 2.0.
  • Involved in upgrading Java 1.7 to Java 1.8.
  • Involved in installing Kudu 1.4 in CDH 5.12.1.
  • Involved in Cloudera Navigator access for auditing and viewing data.
  • Responsible for cluster availability; experienced in on-call support.
  • Coordinated with Cloudera support team through support portal to sort out the critical issues during upgrades.

Environment: Hadoop, HDFS, Hive, Hue, Zookeeper, Impala, Oozie, HBase, Sentry, Solr, Spark, Sqoop, YARN (MR2 included), Oracle 11g with Red Hat, Cloudera CDH.

Confidential, Dearborn, MI

Hadoop Admin


  • Managed a 100+ node HDP 2.2.4 cluster with 10 petabytes of data using Ambari 2.0 and CentOS Linux 6.5.
  • Installed and configured Hortonworks Ambari for easy management of existing Hadoop cluster.
  • Responsible for the design and implementation of a multi-datacenter Hadoop environment intended to support the analysis of large amounts of unstructured data along with ETL processing.
  • Coordinated with Hortonworks support team through support portal to sort out the critical issues during upgrades.
  • Conducted root cause analysis (RCA) to identify data issues and resolve production problems.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
  • Developed Sqoop jobs to extract data from RDBMS databases - Oracle and Teradata.
  • Loaded data from Teradata to HDFS using Teradata Hadoop connectors.
  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark.
  • Worked with big data developers, designers and scientists in troubleshooting MapReduce job failures and issues with Hive and Sqoop.
  • Experienced in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
  • Worked on design and implementation, configuration, performance tuning of Hortonworks HDP 2.3 Cluster with High Availability and Ambari 2.2.
  • Analyzed server logs for errors and exceptions; scheduled Jenkins job builds and monitored their console output.
  • Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
  • Experienced in managing and reviewing Hadoop log files.
  • Worked with Sqoop to import and export data between databases such as MySQL and Oracle and HDFS and Hive.
  • Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Experience with HBase high availability, manually verified using failover tests.
  • Created queues and allocated cluster resources to prioritize jobs.
  • Working experience in creating and maintaining MySQL databases, setting up users and maintaining backups of cluster metadata databases with cron jobs.
  • Provided technical assistance for configuration, administration and monitoring of Hadoop clusters.
  • Coordinated with technical teams for installation of Hadoop and related third-party applications on systems.
  • Supported technical team members for automation, installation and configuration tasks.
  • Assisted in the design, development and architecture of Hadoop and HBase systems.
  • Formulated procedures for planning and execution of system upgrades for all existing Hadoop clusters.
  • Responsible for cluster maintenance, monitoring, troubleshooting, tuning, and commissioning and decommissioning of nodes.
  • Responsible for cluster availability; experienced in on-call support.
  • Involved in analyzing system failures, identifying root causes and recommending courses of action. Documented system processes and procedures for future reference.

Environment: Hortonworks, Ambari, Hive, Pig, Sqoop, Zookeeper, Hbase, Knox, Spark, Yarn, MapReduce.

Confidential, Oregon

Hadoop Admin


  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Involved in analyzing system failures, identifying root causes and recommending courses of action.
  • Retrieved data from HDFS into relational databases with Sqoop. Parsed, cleansed and mined useful and meaningful data in HDFS using MapReduce for further analysis.
  • Fine tuning Hive jobs for optimized performance.
  • Partitioned and queried the data in Hive for further analysis by the BI team.
  • Implemented Apache Impala for data processing on top of Hive.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented nodes on a CDH3 Hadoop cluster on Red Hat Linux.
  • Involved in loading data from the Linux file system to HDFS.
  • Imported weblogs from the web servers into HDFS using Flume.
  • Implemented test scripts to support test-driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop.
  • Responsible for managing data coming from different sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Services through Zookeeper.
  • Experience in managing and reviewing Hadoop log files.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.

Environment: Hadoop HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, Yarn, Zookeeper, Impala, Cloudera Manager.

Confidential, Columbus, Ohio

Hadoop Developer


  • Developed several advanced MapReduce programs to process data files received.
  • Developed Pig scripts, Pig UDFs and Hive Scripts, Hive UDFs to load data files into Hadoop.
  • Used Sqoop to import data into HDFS from MySQL databases and vice versa.
  • Developed Java programs to process huge JSON files received from the marketing team and convert them into the standardized format for the application.
  • Installed, configured and deployed data node hosts for Hadoop Cluster deployment.
  • Installed various Hadoop ecosystems and Hadoop Daemons.
  • Managed commissioning & decommissioning of data nodes.
  • Implemented optimization and performance testing and tuning of Hive and Pig.
  • Used Sqoop to import data into HDFS and Hive from other data systems.
  • Conducted knowledge transfer sessions on the developed applications for colleagues.
  • Involved in installation and configuration of Tableau Server.

Environment: Apache Hadoop, Apache Cassandra, Hive, Sqoop, Solr, Tomcat, Eclipse Kepler, SVN repository, Linux, Putty, WinSCP.




  • Developed the JSP pages as part of UI.
  • Performed validations using Validator Plug-in.
  • Developed the control logic as part of Action classes.

Environment: Core Java, Struts, Servlets, JSP, Hibernate, NetBeans, Tomcat.
