Hadoop Admin Resume
O'Fallon, MO
SUMMARY
- 7+ years of experience in the IT industry, including 4+ years of proven experience in Hadoop Development and Administration using Cloudera and Hortonworks distributions.
- Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Apache, Cloudera (CDH4, CDH5) and YARN distributions.
- Excellent understanding/knowledge of Hadoop architecture and various components such as HDFS, Resource Manager, Node Manager, Name Node, Data Node and MapReduce programming paradigm.
- Hands-on experience with major components of the Hadoop ecosystem including Hive, Sqoop and HBase, and knowledge of the MapReduce/HDFS framework.
- Solid background in UNIX and Linux Network Programming.
- Experience in deploying and managing the multi-node development, testing and production Hadoop cluster with different Hadoop components (HDFS, Hive, Hue, Impala, Oozie, Solr, Spark, Sqoop, YARN, ZooKeeper) using Cloudera Manager and Hortonworks Ambari.
- Experience in working with Flume to load log data from multiple sources directly into HDFS.
- Experience in benchmarking, performing backup and disaster recovery of Name Node metadata and important sensitive data residing on cluster.
- Experience in performing minor and major upgrades, commissioning and decommissioning of data nodes on Hadoop cluster.
- Experienced in installation, configuration, supporting and monitoring of a 200+ node Hadoop cluster using Cloudera Manager.
- Experience in designing and implementing HDFS access controls, directory and file permissions, and user authorization that facilitate stable, secure access for multiple users in a large multi-tenant cluster.
- Strong knowledge in configuring Name Node High Availability and Name Node Federation.
- Familiar with writing Oozie workflows and job controllers for automating shell, Hive and Sqoop jobs.
- As an admin, involved in cluster maintenance, capacity planning, performance tuning, cluster monitoring and troubleshooting.
- Involved in Adding/removing new nodes to an existing Hadoop cluster.
- Involved in benchmarking Hadoop/HBase cluster file systems with various batch jobs and workloads.
- Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
- Scheduled all Hadoop/Hive jobs using Beeline (a scheduling sketch follows this summary).
- Configured rack awareness for quick availability and processing of data (a topology-script sketch follows this summary).
- Experience in understanding the security requirements for Hadoop and integrating with a Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service and managing keytabs with keytab tools.
- Actively participated in the daily Scrum calls, Sprint planning, Effort Estimation, Sprint review, Sprint Demo and Retrospective sessions.
- Experience in various software development life cycles such as Waterfall and Agile methodologies.
- Hands on experience in Core Java, Servlets, JSP, JDBC, Struts, Hibernate, Tomcat, Glassfish.
- Good working experience using the Eclipse and NetBeans IDEs.
- Effective problem-solving skills and outstanding interpersonal skills; able to work independently as well as within a team, driven to meet deadlines, and quick to learn and use new technologies.
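A minimal sketch of the Beeline-based job scheduling referenced above; the HiveServer2 URL, script paths and cron schedule are illustrative placeholders rather than details from an actual deployment:

    #!/bin/bash
    # run_daily_hive.sh - illustrative wrapper that runs a HiveQL script through Beeline
    beeline -u "jdbc:hive2://hs2.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM" \
            --hivevar run_date="$(date +%F)" \
            -f /opt/jobs/daily_aggregation.hql \
            >> /var/log/hadoop-jobs/daily_aggregation.log 2>&1

    # Example crontab entry running the wrapper at 02:00 every day:
    # 0 2 * * * /opt/jobs/run_daily_hive.sh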
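A minimal sketch of rack-aware configuration; the subnets, rack names and script path are illustrative placeholders:

    #!/bin/bash
    # /etc/hadoop/conf/rack-topology.sh - maps each host/IP argument to a rack
    for node in "$@"; do
      case "$node" in
        10.1.1.*) echo "/dc1/rack1" ;;
        10.1.2.*) echo "/dc1/rack2" ;;
        *)        echo "/default-rack" ;;
      esac
    done

The script path is then set as net.topology.script.file.name in core-site.xml so the NameNode can resolve each DataNode's rack when placing block replicas.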
TECHNICAL SKILLS
Hadoop/Big Data Components: HDFS, Hue, MapReduce, Hive, Sqoop, Spark, Impala, Oozie, YARN, Flume, Kafka, Pig, ZooKeeper
NoSQL Databases: HBase, Cassandra
Programming Languages: Java, HTML
Databases: PostgreSQL, Derby, MySQL, SQL Server
Scripting Languages: Shell Scripting, Puppet
Frameworks: MVC, Spring, Struts, Hibernate
IDEs/Tools: NetBeans, Eclipse, Visual Studio, Microsoft SQL Server, MS Office
Operating Systems: Linux (Red Hat, CentOS, Ubuntu), Windows, Mac
Web Servers: Apache Tomcat, JBoss and Apache HTTP Server
Cluster Management Tools: Cloudera Manager and HDP Ambari
Virtualization Technologies: VMware vSphere, Citrix XenServer
PROFESSIONAL EXPERIENCE
Confidential, O'Fallon, MO
Hadoop Admin
Responsibilities:
- Managed a 200+ node CDH 5.13.1 Hadoop cluster with 14 petabytes of data on RHEL.
- Involved in the end-to-end process of Hadoop cluster setup, including installation, configuration and monitoring of the cluster.
- Responsible for cluster maintenance, commissioning and decommissioning data nodes, cluster monitoring, troubleshooting, and managing and reviewing Hadoop log files (a decommissioning sketch appears at the end of this role).
- Monitoring services, architecture design and implementation of Hadoop deployment, configuration management.
- Experienced in defining job flows with Oozie.
- Experienced in managing and reviewing Hadoop log files.
- Installation of various Hadoop Ecosystems and Hadoop Daemons.
- Installation and configuration of Sqoop, Flume and HBase.
- Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
- As an admin, followed standard backup policies to ensure high availability of the cluster.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop Clusters.
- Installed and configured Hue, Hive, Sqoop and Oozie on the Hadoop cluster.
- Involved in installing and configuring Kerberos for the authentication of users and Hadoop daemons (a keytab-creation sketch appears at the end of this role).
- Involved in cluster migration and expansion
- Involved in Adding new nodes to an existing cluster.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Involved in migrating Oozie database to Postgres due to Derby database deadlock issues.
- Involved in upgrades such as CDH 5.11.1 to 5.12.1 and Spark 1.6 to Spark 2.0.
- Involved in upgrading Java 1.7 to Java 1.8.
- Involved in installing Kudu 1.4 in CDH 5.12.1.
- Administered Cloudera Navigator access for auditing and viewing data.
- Responsible for cluster availability and experienced in on-call support.
- Coordinated with Cloudera support team through support portal to sort out the critical issues during upgrades.
Environment: Hadoop, HDFS, Hive, Hue, ZooKeeper, Impala, Oozie, HBase, Sentry, Solr, Spark, Sqoop, YARN (MR2 included), Oracle 11g on Red Hat, Cloudera CDH.
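A minimal sketch of the data node decommissioning flow mentioned in this role; the hostname and exclude-file path are illustrative placeholders:

    # Add the node to the file referenced by dfs.hosts.exclude, then tell the NameNode to re-read it
    echo "datanode17.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Watch the node move through "Decommission in progress" to "Decommissioned"
    hdfs dfsadmin -report | grep -A 3 "datanode17.example.com"

    # Refresh the NodeManager include/exclude lists on the ResourceManager as well
    yarn rmadmin -refreshNodes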
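A minimal sketch of the Kerberos principal and keytab steps mentioned above, assuming an MIT KDC; the realm, host and paths are illustrative placeholders:

    # Create a service principal with a random key and export it to a keytab
    kadmin -p admin/admin@EXAMPLE.COM -q "addprinc -randkey hdfs/dn01.example.com@EXAMPLE.COM"
    kadmin -p admin/admin@EXAMPLE.COM -q "xst -k /etc/security/keytabs/hdfs.service.keytab hdfs/dn01.example.com@EXAMPLE.COM"

    # Verify the keytab contents and test authentication with it
    klist -kt /etc/security/keytabs/hdfs.service.keytab
    kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/dn01.example.com@EXAMPLE.COM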
Confidential, Dearborn, MI
Hadoop Admin
Responsibilities:
- Managed a 100+ node HDP 2.2.4 cluster with 10 petabytes of data using Ambari 2.0 on CentOS 6.5.
- Installed and configured Hortonworks Ambari for easy management of existing Hadoop cluster.
- Responsible for the design and implementation of a multi-datacenter Hadoop environment intended to support the analysis of large amounts of unstructured data along with ETL processing.
- Coordinated with Hortonworks support team through support portal to sort out the critical issues during upgrades.
- Conducted root cause analysis (RCA) to identify data issues and resolve production problems.
- Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
- Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
- Developed Sqoop jobs to extract data from RDBMS databases, Oracle and Teradata (an import sketch appears at the end of this role).
- Loaded data from Teradata to HDFS using Teradata Hadoop connectors.
- Implemented advanced procedures such as text analytics and processing using in-memory computing capabilities like Spark.
- Worked with big data developers, designers and scientists in troubleshooting MapReduce job failures and issues with Hive and Sqoop.
- Experienced in job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Worked on design and implementation, configuration, performance tuning of Hortonworks HDP 2.3 Cluster with High Availability and Ambari 2.2.
- Analyzed server logs for errors and exceptions; scheduled and monitored Jenkins job builds and their console outputs.
- Used Agile/scrum Environment and used Jenkins, GitHub for Continuous Integration and Deployment.
- Experienced in managing and reviewing Hadoop log files.
- Worked with Sqoop to import and export data between databases such as MySQL and Oracle and HDFS/Hive.
- Worked on setting up high availability for the major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
- Experienced with HBase high availability, verified manually using failover tests.
- Created queues and allocated cluster resources to prioritize jobs (a queue-configuration sketch appears at the end of this role).
- Working experience in creating and maintaining MySQL databases, setting up users, and backing up cluster metadata databases with cron jobs.
- Provided technical assistance for configuration, administration and monitoring of Hadoop clusters.
- Coordinated with technical teams for installation of Hadoop and related third-party applications on systems.
- Supported technical team members for automation, installation and configuration tasks.
- Assisted in the design, development and architecture of Hadoop and HBase systems.
- Formulated procedures for planning and execution of system upgrades for all existing Hadoop clusters.
- Responsible for cluster maintenance, monitoring, troubleshooting, tuning, commissioning and decommissioning of nodes.
- Responsible for cluster availability and experienced in on-call support.
- Involved in analyzing system failures, identifying root causes and recommending courses of action; documented system processes and procedures for future reference.
Environment: Hortonworks, Ambari, Hive, Pig, Sqoop, ZooKeeper, HBase, Knox, Spark, YARN, MapReduce.
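A minimal sketch of the kind of Sqoop import used for the RDBMS extracts above; the connection string, credentials and table names are illustrative placeholders:

    sqoop import \
      --connect jdbc:oracle:thin:@//oradb.example.com:1521/ORCL \
      --username etl_user --password-file /user/etl/.oracle_pass \
      --table SALES.ORDERS \
      --split-by ORDER_ID -m 8 \
      --hive-import --hive-database staging --hive-table orders \
      --target-dir /user/etl/staging/orders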
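A minimal sketch of YARN queue allocation via the Capacity Scheduler; the queue names and capacity percentages are illustrative placeholders:

    <!-- capacity-scheduler.xml fragment (normally managed through Ambari on HDP) -->
    <property><name>yarn.scheduler.capacity.root.queues</name><value>etl,adhoc</value></property>
    <property><name>yarn.scheduler.capacity.root.etl.capacity</name><value>70</value></property>
    <property><name>yarn.scheduler.capacity.root.adhoc.capacity</name><value>30</value></property>

    # Apply the new queue definitions without restarting the ResourceManager:
    yarn rmadmin -refreshQueues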
Confidential, Oregon
Hadoop Admin
Responsibilities:
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Involved in analyzing system failures, identifying root causes and recommending courses of action.
- Retrieved data from HDFS into relational databases with Sqoop; parsed, cleansed and mined useful, meaningful data in HDFS using MapReduce for further analysis.
- Fine-tuned Hive jobs for optimized performance.
- Partitioned and queried the data in Hive for further analysis by the BI team.
- Implemented Apache Impala for data processing on top of Hive.
- Involved in extracting the data from various sources into Hadoop HDFS for processing.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented nodes on a CDH3 Hadoop cluster on Red Hat Linux.
- Involved in loading data from the Linux file system to HDFS.
- Imported weblogs from the web servers into HDFS using Flume (an agent-configuration sketch appears at the end of this role).
- Implemented test scripts to support test-driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Automated workflows using shell scripts to pull data from various databases into Hadoop.
- Responsible for managing data coming from different sources.
- Involved in loading data from UNIX file system to HDFS.
- Coordinated cluster services through ZooKeeper.
- Experience in managing and reviewing Hadoop log files.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
Environment: Hadoop HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, YARN, ZooKeeper, Impala, Cloudera Manager.
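A minimal sketch of a Flume agent for shipping web server logs into HDFS, as referenced above; the agent name, log path and HDFS path are illustrative placeholders:

    # /etc/flume-ng/conf/weblog-agent.conf (illustrative)
    agent1.sources  = weblog-src
    agent1.channels = mem-ch
    agent1.sinks    = hdfs-sink

    agent1.sources.weblog-src.type = exec
    agent1.sources.weblog-src.command = tail -F /var/log/httpd/access_log
    agent1.sources.weblog-src.channels = mem-ch

    agent1.channels.mem-ch.type = memory
    agent1.channels.mem-ch.capacity = 10000

    agent1.sinks.hdfs-sink.type = hdfs
    agent1.sinks.hdfs-sink.hdfs.path = /data/weblogs/%Y-%m-%d
    agent1.sinks.hdfs-sink.hdfs.fileType = DataStream
    agent1.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
    agent1.sinks.hdfs-sink.channel = mem-ch

    # Start the agent against that configuration:
    flume-ng agent --name agent1 --conf /etc/flume-ng/conf --conf-file /etc/flume-ng/conf/weblog-agent.conf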
Confidential, Columbus, Ohio
Hadoop Developer
Responsibilities:
- Developed several advanced MapReduce programs to process data files received.
- Developed Pig scripts, Pig UDFs and Hive Scripts, Hive UDFs to load data files into Hadoop.
- Used Sqoop to import data into HDFS from a MySQL database and vice versa.
- Developed Java programs to process huge JSON files received from the marketing team and convert them into a format standardized for the application.
- Installed, configured and deployed data node hosts for Hadoop Cluster deployment.
- Installed various Hadoop ecosystems and Hadoop Daemons.
- Managed commissioning & decommissioning of data nodes.
- Implemented optimization and performance testing and tuning of Hive and Pig.
- Used Sqoop to import data into HDFS and Hive from other data systems.
- Conducted knowledge transfer sessions on the developed applications for colleagues.
- Involved in installation and configuration of Tableau Server.
Environment: Apache Hadoop, Apache Cassandra, Hive, Sqoop, Solr, Tomcat, Eclipse Kepler, SVN repository, Linux, Putty, WinSCP.
Confidential
Developer
Responsibilities:
- Developed the JSP pages as part of UI.
- Performed validations using Validator Plug-in.
- Developed the Control Logic as part of Action Classes
Environment: Core Java, Struts, Servlets, JSP, Hibernate, NetBeans, Tomcat.