
Hadoop Admin Resume


Plano, TX

SUMMARY:

  • 8 years of IT experience in analysis, design, development, implementation, and testing of a wide range of applications, data warehouses, client-server technologies, and web-based applications.
  • Over 4 years of experience with Apache Hadoop components such as HDFS, Hive, MapReduce, HBase, Pig, Sqoop, Nagios, Spark, Impala, Oozie, and Flume for Big Data and analytics.
  • Experience in installation, configuration, support, and monitoring of Hadoop clusters using Cloudera and Hortonworks distributions.
  • Expert in installing, configuring, and maintaining Apache/Tomcat, Samba, Sendmail, and WebSphere application servers.
  • Experience in Amazon Web Services (AWS) provisioning and good knowledge of AWS services such as EC2, ELB, Elastic Container Service, S3, DMS, VPC, Route53, CloudWatch, and IAM.
  • Hands-on experience installing Kerberos security, setting up permissions, and establishing standards and processes for Hadoop-based application design and implementation.
  • Experience working in a DevOps environment with technologies such as Puppet, Chef, Docker, Git, Jenkins, AWS, and Maven.
  • Experience in managing and handling Linux platform servers (especially Ubuntu) and hands-on experience with Red Hat Linux.
  • Good experience in UNIX/Linux administration along with SQL development, designing and implementing relational database models per business needs in different domains.
  • Hands-on experience with major components of the Hadoop ecosystem, including HDFS, the MapReduce framework, YARN, HBase, Hive, Pig, Sqoop, and ZooKeeper.
  • Experience with networking concepts such as DNS, NIS, NFS, and DHCP, troubleshooting network problems such as TCP/IP issues, and supporting users in solving their problems.
  • Hands-on experience in installing and configuring Hadoop ecosystem components such as MapReduce, Spark, HDFS, HBase, Oozie, Hive, Pig, Impala, ZooKeeper, YARN, Kafka, and Sqoop.
  • Experience with scripting languages such as shell, Python, and JavaScript.
  • Good knowledge of Kafka installation and integration with Spark Streaming.
  • Strong hold on Informatica PowerCenter, Oracle, Vertica, Hive, SQL Server, shell scripting, and QlikView.
  • Architected and designed a 30-node Hadoop innovation cluster with Sqrrl, Spark, Puppet, and HDP 2.2.4.
  • Good understanding of Big Data and experience in developing predictive applications using open source technologies.
  • Proficient in working on Hadoop 1.x (HDFS and MapReduce) and Hadoop 2.x (HDFS, MapReduce, YARN).
  • Set up HDFS quotas to enforce fair sharing of storage resources (a quota command sketch follows this list).
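
The HDFS quota setup mentioned above is usually done with dfsadmin commands. The following is a minimal sketch; the /user/project directory and the limits shown are illustrative assumptions, not values from any specific cluster.

    # Set a namespace quota (max number of files and directories) on a project directory
    hdfs dfsadmin -setQuota 100000 /user/project

    # Set a space quota (max raw storage including replication), e.g. 10 TB
    hdfs dfsadmin -setSpaceQuota 10t /user/project

    # Verify current quota usage
    hdfs dfs -count -q -h /user/project

    # Remove the quotas if no longer needed
    hdfs dfsadmin -clrQuota /user/project
    hdfs dfsadmin -clrSpaceQuota /user/project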

WORK EXPERIENCE:

Hadoop Admin

Confidential, Plano, TX

Responsibilities:

  • Configured, maintained, and monitored the Hadoop cluster using the Cloudera Manager (CDH 5) distribution.
  • Day-to-day responsibilities included resolving developer issues, deploying code from one environment to another, granting access to new users, providing immediate solutions to reduce impact, documenting them, and preventing future issues.
  • Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
  • Enabled automatic-failover high availability for the NameNode, ResourceManager, HBase, and HiveServer2 to overcome single points of failure.
  • Installed various Hadoop ecosystem components and Hadoop daemons.
  • Installed MySQL, Cassandra, and HBase.
  • Imported and exported Hive tables and HBase snapshots.
  • Commissioned and decommissioned Hadoop cluster nodes, including balancing HDFS block data (see the command sketch after this list).
  • Troubleshot production-level issues in the cluster and its functionality.
  • Debugged production jobs when they failed.
  • Created queues in the YARN Queue Manager to share cluster resources among users' MapReduce jobs.
  • Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Responsible for creating Hive tables based on business requirements.
  • Added and removed new components through Cloudera Manager.
  • Monitored workload, job performance, and capacity planning using Cloudera Manager.
  • Participated in the development and execution of system and disaster recovery processes.
  • Loaded data into the NoSQL database HBase.
  • Involved in extracting data from various sources into Hadoop HDFS for processing.
  • Provided input to development teams on efficient utilization of resources such as memory and CPU, based on the running statistics of map and reduce tasks.
  • Adjusted cluster configuration properties based on the volume of data being processed and the performance of the cluster.
  • Performed regular disk management such as adding/replacing hot-swappable drives on existing servers/workstations, partitioning according to requirements, creating new file systems or growing existing ones, and managing file systems.
  • Held regular discussions with other technical teams regarding upgrades, process changes, special processing, and feedback.
  • Periodically reviewed Hadoop-related logs, fixed errors, and prevented errors by analyzing warnings.
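
The node decommissioning and rebalancing work above generally follows the standard HDFS exclude-file procedure. The sketch below assumes a hypothetical exclude-file path and host name rather than values from this cluster.

    # Add the host to the exclude file referenced by dfs.hosts.exclude (path is illustrative)
    echo "datanode05.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Tell the NameNode to re-read the include/exclude lists and begin decommissioning
    hdfs dfsadmin -refreshNodes

    # Watch the node's status until it reports "Decommissioned"
    hdfs dfsadmin -report | grep -A 3 "datanode05"

    # After removing or adding nodes, rebalance HDFS blocks (10% utilization threshold)
    hdfs balancer -threshold 10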

Environment: CDH 3/4, HDFS, HBase, NoSQL, Python, RHEL 4/5/6, Hive, Kerberos, Pig, Perl scripting, AWS (S3, EC2), Hadoop, Sqoop, Shell scripting, Ubuntu, Red Hat Linux.

Confidential, Portland, OR

Responsibilities:

  • Expertise in cluster planning, performance tuning, monitoring, and troubleshooting of the Hadoop cluster.
  • Responsible for building a Hortonworks cluster from scratch on HDP 2.x and deploying a Hadoop cluster integrated with Nagios and Ganglia.
  • Expertise in cluster audit findings and tuning of configuration parameters.
  • Expertise in configuring MySQL to store the Hive metadata.
  • Built high availability for the major production cluster and designed automatic failover using the ZooKeeper Failover Controller and Quorum JournalNodes.
  • Commissioned and decommissioned nodes as needed.
  • Installed and configured Hadoop monitoring and administration tools such as Nagios and Ganglia.
  • Implemented Kerberos to authenticate all services in the Hadoop cluster.
  • Responsible for adding/installing new services and removing them through Ambari.
  • Configured various views in Ambari such as the Hive view, Tez view, and YARN Queue Manager.
  • Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up Kerberos principals, and testing HDFS, Hive, Pig, and MapReduce access for the new users.
  • Periodically reviewed Hadoop-related logs, fixed errors, and prevented errors by analyzing warnings.
  • Responsible for setting log retention policies and configuring the HDFS trash interval.
  • Monitoring the data streaming between web sources and HDFS.
  • Involved in configuring ZooKeeper to coordinate the servers in the cluster and maintain data consistency.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Backed up data from the active cluster to a backup cluster using DistCp (see the sketch after this list).
  • Worked on analyzing data with Hive and Pig.
  • Worked on Hadoop ecosystem components such as Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop, Pig, and Flume.
  • Deployed a network file system for NameNode metadata backup.
  • Monitored Hadoop cluster connectivity and security.
  • Installed operating system and Hadoop updates, patches, and version upgrades as required.
  • Transferred data from HDFS to a MySQL database and vice versa using Sqoop.
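
The DistCp backups mentioned above are normally run as MapReduce copy jobs between clusters. The following is a minimal sketch, assuming hypothetical active/backup NameNode hostnames and paths.

    # Copy a directory from the active cluster to the backup cluster, updating changed
    # files and preserving replication, block size, ownership, and permissions
    hadoop distcp -update -prbugp \
        hdfs://active-nn.example.com:8020/data/warehouse \
        hdfs://backup-nn.example.com:8020/data/warehouse

    # For very large copies, cap the number of mappers to limit cluster impact
    hadoop distcp -m 20 -update \
        hdfs://active-nn.example.com:8020/data/warehouse \
        hdfs://backup-nn.example.com:8020/data/warehouse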

Environment: Hortonworks HDP 2.2, Ambari 2.x, HDFS, MapReduce, YARN, ZooKeeper, Unix/Linux, Pig, Hive, HBase, Flume, Sqoop, Shell scripting, Kerberos, Nagios & Ganglia.

Confidential, Fresno, CA

Responsibilities:

  • Experience in Commissioning and Decommissioning nodes.
  • Involved in installation, configuration, support, and management of Hadoop clusters; Hadoop cluster administration included commissioning and decommissioning of DataNodes, capacity planning, slot configuration, performance tuning, cluster monitoring, and troubleshooting.
  • Installed and configured other open source software such as Pig, Hive, HBase, Flume, and Sqoop.
  • Built an automated setup for cluster monitoring and the issue escalation process.
  • Worked closely with the SA team to ensure all hardware and software were properly set up for optimal resource usage.
  • Planned and executed system upgrades for existing Hadoop clusters.
  • Created a POC to store server log data in Cassandra to identify system alert metrics.
  • Performed rack-aware configuration, configured client machines, and set up monitoring and management tools.
  • Used the Fair Scheduler to manage MapReduce jobs so that each job received roughly the same amount of CPU time.
  • Recovered from NameNode failures.
  • Loaded and transformed large sets of structured data from AS/400, mainframe, and SQL Server sources into HDFS using Talend Big Data Studio.
  • Supported Hadoop developers and assisted in the optimization of MapReduce jobs, Pig Latin scripts, Hive scripts, and HBase ingestion as required.
  • Adjusted cluster configuration properties based on the volume of data being processed and cluster performance.
  • Handled upgrades and patch updates.
  • Worked on configuring security for the Hadoop cluster and managing and scheduling jobs on the Hadoop cluster.
  • Commissioned or decommissioned DataNodes from the cluster in case of problems.
  • Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode (a cleanup script sketch follows this list).
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Configured rack awareness and worked with AWS.
  • Set up cluster high availability (HA).
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive, and Sqoop.
  • Provided day-to-day support for cluster issues and job failures.
  • Worked with the development team to tune jobs; knowledge of writing Hive jobs.
  • Constantly learned various Big Data tools and provided strategic direction per development requirements.
  • Analyzed and understood business requirements.
  • Developed Informatica mappings using PowerCenter Designer to load data from flat files to the target database (Teradata).
  • Prepared and executed test scripts for unit testing.
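
An automated cleanup process of the kind mentioned above can be implemented as a small cron-driven shell script. This is a minimal sketch; the paths and the 30-day retention window are hypothetical, not the script actually used.

    #!/bin/bash
    # Remove aged job logs and temporary files after a retention window (illustrative values).
    RETENTION_DAYS=30
    CUTOFF=$(date -d "-${RETENTION_DAYS} days" +%Y-%m-%d)

    # Delete old application log directories in HDFS (skip trash to free space immediately)
    hdfs dfs -ls /tmp/logs/*/logs 2>/dev/null | awk -v cutoff="$CUTOFF" '$6 < cutoff {print $8}' |
    while read -r dir; do
        hdfs dfs -rm -r -skipTrash "$dir"
    done

    # Clean aged rotated log files from a local staging area on the NameNode host
    find /var/log/hadoop-hdfs -name "*.log.*" -mtime +${RETENTION_DAYS} -delete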

Environment: Cloudera CDH 4.3.2, HDFS, MapReduce, Hive, Pig, Sqoop, ZooKeeper, HBase, Flume, NoSQL, Windows 2000/2003, Unix/Linux, Java, Oracle 9i/10g/11g RAC on Solaris/Red Hat, Exadata X2/X3, Apache Hadoop, Toad, MySQL Plus, Oracle Enterprise Manager (OEM), RMAN, Shell scripting, GoldenGate, Red Hat/SUSE Linux, EM Cloud Control, Teradata 13.

Confidential, Dallas, TX

Responsibilities:

  • Involved in installing, configuring and using Hadoop Ecosystems (Hortonworks).
  • Involved in importing and exporting data into HDFS and Hive using Sqoop (a sample command sketch follows this list).
  • Experienced in managing and reviewing Hadoop log files.
  • Developed and Modified Oracle Packages, Procedures, Triggers as per the business requirements.
  • Implemented security (Kerberos) for various Hadoop clusters.
  • Working with data delivery teams to setup new Hadoop users.
  • Designed and deployed secured Hadoop clusters using Kerberos, Knox, Ranger, and LDAP.
  • Implemented Puppet modules and Chef recipes to automate configuration of a broad range of services; configured and set up multiple nodes by writing Puppet manifest scripts.
  • Built nodes and managed the configuration of various services through Puppet.
  • Installed Puppet server and clients and created classes and modules.
  • Set up a multi-node Hadoop cluster with a configuration management/deployment tool (Puppet).
  • Worked with tools such as Cloudera Manager, Ganglia, and Nagios to monitor Hadoop cluster performance and collect metrics; deployed and automated tasks using Puppet.
  • Experience in designing and implementing a secure Hadoop cluster using Kerberos.
  • Designed and implemented ETL frameworks and concepts as a Hadoop admin.
  • Supported MapReduce programs running on the cluster.
  • Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs and Scala/Python.
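
Sqoop import/export work of the type referenced above typically looks like the following. The MySQL host, database, table names, and HDFS paths are hypothetical, not the actual source systems.

    # Import a MySQL table into HDFS with 4 parallel mappers
    # (add --split-by <column> if the table has no primary key;
    #  add --hive-import --hive-table <db.table> to load directly into Hive)
    sqoop import \
        --connect jdbc:mysql://dbhost.example.com:3306/sales \
        --username etl_user -P \
        --table orders \
        --target-dir /user/etl/orders \
        --num-mappers 4

    # Export processed results from HDFS back into MySQL
    sqoop export \
        --connect jdbc:mysql://dbhost.example.com:3306/sales \
        --username etl_user -P \
        --table order_summary \
        --export-dir /user/etl/order_summary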

Environment: HDFS, MapReduce, Hive, Sqoop, ZooKeeper, HBase, UNIX/Linux, Java, Chef, Python, Pig, Flume, Puppet, Kafka, Shell scripting.

Linux Admin

Confidential

Responsibilities:

  • Installed and maintained Linux servers; installed, configured, and administered all UNIX/Linux servers, including the design and selection of relevant hardware to support installations/upgrades of Red Hat (5/6), CentOS 5/6, and Ubuntu operating systems.
  • Support internal and external teams in relation to information security initiatives.
  • Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems (an LVM command sketch follows this list).
  • Deep understanding of monitoring and troubleshooting mission critical Linux machines.
  • Improve system performance by working with the development team to analyze, identify and resolve issues quickly.
  • Used Oozie scripts for application deployment and Perforce as the secure versioning software; analyzed existing automation scripts and tools for missing efficiencies or gaps.
  • Hands-on experience working with Spark using both Scala and Python; performed various actions and transformations on Spark RDDs and DataFrames.
  • Developed the user interface using the Struts MVC framework; implemented JSPs using Struts tag libraries and developed action classes.
  • Monitored the cluster for performance, networking, and data integrity issues.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
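
LVM volume and filesystem creation of the kind described above generally follows the pattern below. The device names, volume names, and sizes are hypothetical.

    # Initialize a disk for LVM and create a volume group (device name is illustrative)
    pvcreate /dev/sdb
    vgcreate vg_data /dev/sdb

    # Create a 50 GB logical volume, put an ext4 filesystem on it, and mount it
    lvcreate -L 50G -n lv_app vg_data
    mkfs.ext4 /dev/vg_data/lv_app
    mkdir -p /app
    mount /dev/vg_data/lv_app /app

    # Grow an existing logical volume and its filesystem online
    lvextend -L +20G /dev/vg_data/lv_app
    resize2fs /dev/vg_data/lv_app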

Environment: Java, J2EE, Struts 1.2, JSP, Hibernate 3.0, Spring 2.0, Servlets, JMS, XML, Python, SOAP, JDBC, ANT, HTML, JavaScript, CSS, UML, JAXP, CVS, Log 4J, JUnit, Weblogic 10.3, Eclipse 3.4.1, Oracle 10g.

Linux/Hadoop Admin

Confidential

Responsibilities:

  • Install and maintain the native Hadoop Cluster and Cloudera Manager Cluster.
  • Design and Configure the Cluster with the services required (HDFS, Hive, Hbase, Oozie, Zookeeper).
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Managed and maintained the existing Informatica interfaces: Tidal, PowerCenter, ODBC connections, etc.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Responsible for managing data coming from different sources.
  • Experience using and setting up Scribe, Flume, and Sqoop for data transfer from data centers (a Flume agent sketch follows this list).
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
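
A simple Flume agent for this kind of transfer can be defined and started as below. The agent name, spool directory, and HDFS landing path are hypothetical.

    # Write a minimal Flume agent config (illustrative names and paths)
    cat > /etc/flume/conf/agent1.conf <<'EOF'
    agent1.sources  = src1
    agent1.channels = ch1
    agent1.sinks    = sink1

    agent1.sources.src1.type = spooldir
    agent1.sources.src1.spoolDir = /var/log/app/incoming
    agent1.sources.src1.channels = ch1

    agent1.channels.ch1.type = memory
    agent1.channels.ch1.capacity = 10000

    agent1.sinks.sink1.type = hdfs
    agent1.sinks.sink1.hdfs.path = /data/landing/applogs/%Y-%m-%d
    agent1.sinks.sink1.hdfs.fileType = DataStream
    agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
    agent1.sinks.sink1.channel = ch1
    EOF

    # Start the agent
    flume-ng agent --conf /etc/flume/conf --conf-file /etc/flume/conf/agent1.conf --name agent1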

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, MySQL, LINUX, Java (jdk1.7), HBase and Big Data.
