Lead Bigdata Hadoop Administrator & Architect Resume

Dallas, TX

SUMMARY

  • Over 10 years of IT experience, including Hadoop system administration, HDFS cluster administration, Hadoop architecture and design, RedHat Enterprise Linux systems administration, HP-UX systems administration, Sun Solaris systems administration, AIX server administration, server/VM builds, and deployments.
  • Coordinated and directed projects, made detailed plans to accomplish tasks and goals as needed.
  • Built BigData Hadoop clusters (HDFS), designed cluster architecture, created and configured Redhat GlusterFS and Docker containers, and integrated Centrify, Kerberos, and Sentry.
  • Involved in Backup Administration and project engineering.
  • Installation, configuration, supporting and managing Hortonworks Hadoop cluster.
  • Closely worked with different branches of IT operations such as Network engineering, Network monitoring, Database services, Systems Administration, Application Teams, Change management, Middleware & Systems performance etc.
  • Worked as Architect and Sr. Unix/Linux Systems Administrator in a heterogeneous SAN environment comprising HP-UX, SUN, AIX, RedHat Enterprise Linux, Red Hat Enterprise Virtualization, and CentOS; also led the UNIX SME team.
  • Worked on Hadoop HDFS clusters; installed and configured NameNodes, DataNodes, and Secondary NameNodes in a master-slave architecture (distributed framework).
  • Extensively worked on Red Hat Virtualization technology, VMs RHEVM, RHN Satellite server, XEN Hypervisor, KVM Hypervisor, VMware, ESX and vSphere (snapshot/cloning etc), vCenter Orchestrator 5.1.0 and vCloud Director.
  • Good working knowledge of Hadoop security, including Kerberos and Sentry.
  • Experience in different layers of the Hadoop framework: storage (HDFS), analysis (Pig and Hive), and engineering (jobs and workflows).
  • Worked with Infrastructure teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • AWS (Amazon Web Services) Cloud computing EC2.
  • OCFS2 and Automatic Storage Management (Oracle ASM) on RedHat Linux.
  • Worked on Redhat Firewall configuration using IPTABLES.
  • Extensively worked on physical-to-virtual server conversion (HP/SUN to RedHat).
  • IBM Edge Universal Load balancer configured on Linux servers.
  • Extensively Worked on LVM (HP/Linux/AIX) and VERITAS Volume manager (SUN).
  • LDAP-UX integration with AD server.
  • Storage migration using OpenMigrator and LVM.
  • Installations and configurations of multi-vendor software and applications support.
  • Installed and configured GroundWork using open source packages like JBOSS, Apache, Nagios, Cacti, and MySQL.
  • Used automation tools like SPLUNK, NewRelic, Puppet, Cfengine, VMWare, vCenter, RHEL Satellite.
  • Installed and set up PostgreSQL 9.2 on RedHat Enterprise Linux servers.
  • Worked on Red Hat cluster GFS and High Availability HA Cluster.
  • Supported 24/7 Datacenter environment.
  • Change and Configuration Management (CCM).

PROFESSIONAL EXPERIENCE

Confidential, Dallas, TX

Lead BigData Hadoop Administrator & Architect

Responsibilities:

  • Lead BigData Hadoop/YARN Architect and Hadoop Operations support.
  • Upgraded HDFS/YARN cluster from version 4.5 to version 5.4.5.
  • Worked on analyzing Hortonworks Hadoop cluster and different big data analytic tools including Pig, HBase Database and Sqoop.
  • Built new Clusters (versions CDH4.X, 5.X) of sizes 15 nodes, 52 nodes, 118 nodes etc, from scratch and updated CM/CN/CDH versions as needed.
  • Implemented security (Kerberos) for various hadoop clusters
  • Experienced in setting up Hortonworks clusters and installing all ecosystem components through Ambari and the command-line interface.
  • Worked on evaluating, architecting, installation/setup of Hortonworks 2.1/1.8 Big Data ecosystem
  • Managed YARN Cluster Resource Manager, Node Manager, Application Master, Job History server, Namenode, datanode, zookeeper services through Cloudera Enterprise Manager and CLI.
  • Built automation frameworks for data ingestion and processing in Python and Scala, using NoSQL and SQL databases along with Chef, Puppet, Kibana, Elasticsearch, Tableau, GoCD, and RedHat infrastructure for data ingestion, processing, and storage.
  • Implemented Kerberos security and integrated it with Centrify for security and AD authentication.
  • Implemented Sentry enterprise security for fine-grained authorization to data. Sentry was used for advanced authorization control to enable multi-user applications and data sets.
  • Configured and implemented HDFS BDR replication between secured clusters and implemented snapshots.
  • Configured and implemented HIVE BDR replication between secured clusters
  • Implemented HDFS ACLs to achieve fine-grained file permissions for specific named users and named groups.
  • Implemented Kerberos on Cloudera, Ambari, and MCS clusters and integrated it with Sentry.
  • Customized HDP services, including moving the Ambari server, tuning the Ambari server, and LZO compression.
  • Used Cloudera Manager for Day-to-day Cluster operations management and node health monitoring and took preventive actions as needed.
  • Decommission and re-commission of cluster nodes.
  • Provided assistance and guidance (recommendations) in fine tuning the cluster.
  • Installed and configured the Hadoop framework: HDFS, YARN, master (NameNode) and slave (DataNodes), MapReduce MRv1 and MRv2 (YARN), HUE, and ecosystem components Hive, Pig Latin, Sqoop, Spark, Flume, Storm, etc.
  • Leveraged Chef to manage and maintain builds in various environments.
  • Configured and managed Datanode, Namenode high availability (HA) & HDFS/YARN cluster.
  • Configured high availability (HA) for HUE and Hive for resiliency.
  • Installed and configured PostgreSQL database in the environment.
  • Configured Cloudera Navigator Audit server to send to remote syslog server so that Splunk can pull logs for analysis and monitoring.
  • Worked with the Systems Engineering team to plan and deploy new Hadoop hardware and software environments and expanded existing Hadoop environments.
  • Monitored cluster health, took preventive maintenance steps as needed, reviewed cluster logs, and troubleshot issues.
  • Hadoop user administration: Worked with data center provisioning teams to set up new Hadoop users (LDAP). This included setting up Linux users, setting up Kerberos principals, and testing HDFS, Hive, Pig, Spark, and MapReduce access so that new users could reach the appropriate datasets.
  • Provided support on Kerberos related issues.
  • Used Sentry to control access to databases and datasets.
  • Coordinated Hadoop installations/upgrades and patch installations in the environment.
  • Installed, configured and implemented Hadoop Security on HDFS cluster.
  • Installed and configured HIVE, PIG, SQOOP, SPARK on Hadoop clients for DEV/Test environment.
  • Monitored multiple Hadoop cluster environments (Test/DEV/PROD) using Cloudera Manager Enterprise, JIRA, and the CASE ticketing system. Also worked with data center operations teams to resolve server hardware and software issues as needed.
  • Worked with Hadoop users, developers, researchers, data analysts, and data scientists on Hadoop-related issues and provided resolutions and recommendations.
  • Worked with Vendor Cloudera for RCAs and maintained a RUN Book.
  • Performed TeraSort benchmark tests on newly built CDH 5.4.4 clusters before releasing user access.
  • Created and reviewed standard operating procedures for Cluster maintenance and operation support.

Confidential, Boston, MA

Lead Hadoop Administrator

Responsibilities:

  • Worked as admin on the Hortonworks (HDP 2.2.4.2) distribution for 4 clusters ranging from POC to PROD. Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
  • Day-to-day responsibilities included resolving developer issues, handling deployments (moving code from one environment to another), providing access to new users, providing prompt solutions to reduce impact, and documenting them to prevent future issues.
  • Worked on evaluating, architecting, installation/setup of Hortonworks 2.1/1.8 Big Data ecosystem
  • Used Apache Spark API over Hortonworks Hadoop YARN cluster to perform analytics on data in Hive.
  • Involved in implementing security on the Cloudera Hadoop cluster using Kerberos, working with the operations team to move from a non-secured cluster to a secured cluster.
  • Experienced in adding and removing components through Ambari.
  • Monitored systems and services through the Ambari dashboard to keep the clusters available for the business.
  • Architecture design and implementation of deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Hands-on experience with cluster upgrades and patch upgrades without data loss and with proper backup plans.
  • Involved in implementing High Availability and automatic failover infrastructure to overcome single point of failure for Namenode utilizing zookeeper services.
  • Expertise with the Hortonworks Hadoop platform (HDFS, Hive, Oozie, Sqoop, YARN).
  • Changed configurations based on user requirements to improve job performance.
  • Experienced in Ambari-alerts configuration for various components and managing the alerts.
  • Provided security and authentication with Ranger, where Ranger Admin provides administration and UserSync adds new users to the cluster.
  • Installed and configured Chef server, workstation, and nodes via CLI tools on AWS nodes.
  • Good troubleshooting skills on Hue, which provides a GUI for developers and business users for day-to-day activities.
  • Troubleshooting non-default databases with Ambari.
  • Configuring and Managing Ambari Alerts.
  • Configured ACLs on HDFS and enabled Kerberos authentication using Ambari.
  • Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
  • Implemented complex MapReduce programs to perform joins on the Map side using distributed cache
  • Set up Flume for different sources to bring log messages from external systems into Hadoop HDFS.
  • Implemented Name Node HA in all environments to provide high availability of clusters.
  • Capacity scheduler implementation in all environments to provide resources based on the allocation.
  • Created queues and allocated cluster resources to prioritize jobs.
  • Experienced in setting up projects and volumes for new projects.
  • Involved in snapshots and mirroring to maintain backups of cluster data, including remotely.
  • Implemented SFTP for projects to transfer data in from external servers.
  • Experienced in managing and reviewing log files.
  • Working experience in creating and maintaining MySQL databases, setting up users, and backing up cluster metadata databases with cron jobs.
  • Set up MySQL master-slave replication and helped business applications maintain their data on MySQL servers.
  • Helping the users in production deployments throughout the process.
  • Experienced in production support, resolving user incidents ranging from sev1 to sev5.
  • Managed and reviewed Log files as a part of administration for troubleshooting purposes. Communicate and escalate issues appropriately.
  • As an admin, followed standard backup policies to ensure high availability of the cluster.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Documented the systems processes and procedures for future references.
  • Worked with systems engineering team to plan and deploy new environments and expand existing clusters.
  • Monitored multiple cluster environments using Ambari Alerts, Metrics, and Nagios.

Environment: Hadoop HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Eclipse, Hortonworks, Ambari.

Confidential, NYC, NY

Sr. Hadoop Administrator

Responsibilities:

  • Handle the installation and configuration of a Hadoop cluster.
  • Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components such as Hive and HBase.
  • Handle the data exchange between HDFS and different web sources using Flume and Sqoop.
  • Monitor the data streaming between web sources and HDFS.
  • Installation, configuration, and administration experience with Big Data platforms (Hortonworks Ambari, Apache Hadoop on Redhat and CentOS) as data storage, retrieval, and processing systems.
  • Monitor the Hadoop cluster functioning through monitoring tools.
  • Production experience in large environments using configuration management tools like Chef and Puppet, supporting a Chef environment with 500+ servers and developing manifests. Developed Chef cookbooks to manage system configuration.
  • Aided in developing Pig scripts to report data for analysis. Moved data between HDFS and RDBMS using Sqoop. Deployed and managed multi-node HDP clusters (Hortonworks).
  • Responsible for implementation and ongoing administration of Hadoop infrastructure
  • Close monitoring and analysis of the MapReduce job executions on cluster at task level.
  • Inputs to development regarding the efficient utilization of resources like memory and CPU utilization based on the running statistics of Map and Reduce tasks.
  • Performed automation on RHEL 7 using Chef configuration management and managed the infrastructure environment with Puppet.
  • Changes to the configuration properties of the cluster based on volume of the data being processed and performance of the cluster.
  • Monitored local file system disk space and CPU usage using Ambari.
  • Handle the upgrades and Patch updates.
  • Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Commission or decommission the data nodes from cluster in case of problems.
  • Set up automated processes to archive/clean the unwanted data on the cluster, in particular on Name node and Secondary name node.
  • Set up and manage HA NameNode and NameNode federation using Apache Hadoop 2.0 to avoid single points of failure in large clusters.
  • Set up checkpoints to gather system statistics for critical setups.
  • Discussions with other technical teams on regular basis regarding upgrades, Process changes, any special processing and feedback.

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Cloudera.

Confidential, Chicago, IL

Sr. Linux/Hadoop Administrator

Responsibilities:

  • Supported the design and architecture of BigData Cloudera Hadoop, YARN, HBase, MapReduce, and ZooKeeper.
  • Installed and configured the Hadoop framework: HDFS master (NameNode) and slave (DataNodes), MapReduce MRv1, YARN, and ecosystem components Hive, Pig Latin, Sqoop, Flume, etc. Configured and managed DataNode, NameNode high availability (HA), and the HDFS cluster.
  • Hadoop security setup using MIT Kerberos, AD integration (LDAP) and Sentry authorization.
  • Worked with the Systems Engineering team to plan and deploy new Hadoop hardware and software environments and expanded existing Hadoop environments.
  • Expertise with the Hortonworks Hadoop platform (HDFS, Hive, Oozie, Sqoop, YARN).
  • Monitored cluster health, reviewed cluster logs, and troubleshot issues.
  • Worked with data center provisioning teams to set up new Hadoop users (LDAP). This included setting up Linux users, setting up Kerberos principals, and testing HDFS, Hive, Pig, and MapReduce access for the new users.
  • Coordinated Hadoop installations/upgrades and patch installations.
  • Installed, configured and implemented Hadoop Security on HDFS cluster.
  • Installed and configured HIVE, PIG, and SQOOP on Hadoop clients.
  • Monitored multiple Hadoop cluster environments (Test/DEV/PROD) using Cloudera Manager Enterprise, Dell OpenManage, and other tools. Also worked with data center operations teams to resolve server hardware and software issues as needed.
  • Performance monitoring and tuning of a HDFS cluster.
  • Coordinate with production application operators/admins and development teams to resolve application specific issues.
  • Resolved day-to-day issues as needed.
  • Maintained the integrity and security of the Linux Servers.
  • Used AWS (Amazon Web Services) EC2 cloud computing for provisioning, such as creating new instances (VMs).
  • Used puppet tool for configuration management.
  • Applied patches, firmware, version upgrades etc.
  • Used tools like SPLUNK and NewRelic to monitor system health, view dynamic logs, and troubleshoot.
  • Installed and configured PostgreSQL 9.2 on RedHat Enterprise Linux servers.
  • Experience installing Hadoop clusters using different distributions (Apache Hadoop, Cloudera, Hortonworks). Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Used vCenter Orchestrator, VCOPS, Hyperic APPINSIGHT, and vCloud Director for visualization needs.
  • Managed Apache/JBOSS web and application servers.
  • Integrated and implemented server solutions and server components such as proxy servers, web servers, application servers and database servers.
  • Unix/Linux Production/DEV/Test Environment support.
  • Provided production support to all IRS applications and resolved issues as per SLAs.
  • Provided Unix support for Middleware servers (JBOSS, Apache, Websphere).
  • Troubleshooting OS/Networks issues.
  • Managed change requests and Incident tickets using ServiceNow ticketing system.

Confidential, Chicago, IL

Sr. UNIX/Linux Systems Administrator/Engineer

Responsibilities:

  • Led the UNIX/Linux SME (Subject Matter Expert) team and resolved day-to-day issues as needed.
  • Coordinated and directed projects, made detailed plans to accomplish tasks and goals, directed the integration of technical activities, and negotiated the preparation of project technical specifications. Handled all customer requests, assigned and prioritized tasks among team members, and delivered per SLAs.
  • Coordinated and worked with different team leads like Network engineering, Network monitoring, Database services, Systems Administration, Change management, Middleware & Systems performance to accomplish tasks and to resolve issues as needed.
  • Assessed server loads and worked with technical leads (SDLs) to develop tuning recommendations.
  • IBM Edge Universal Load balancer (IBM ULB) configured on Linux servers.
  • Bonding and GigE: Configured NIC to GigE speed and bonded the interfaces.
  • Managed Redhat Firewall configuration using IPTABLES.
  • Configured Multipath on Linux servers.
  • Built Linux physical servers using KickStart configuration.
  • Built VMs (Linux virtual machines) using XEN, RHEV, and VMware/vSphere. Configured Kickstart to establish a standard baseline for all Linux servers in the enterprise.
  • Cloned servers using vSphere (VMware)
  • Built Redhat clusters, GFS, etc.
  • Conducted server builds, operations, and maintenance; coordinated server installations per application requirements. Created change requests and requests for service from external and internal service providers.
  • Implemented kernel tuning for optimal system performance on Redhat Linux servers per application needs.
  • Installation and configurations on Linux CENTOS, Red Hat Enterprise v.3/4/5/6
  • OS installation, upgrades, and firmware upgrades.
  • Day to day UNIX administration on LINUX servers.
  • Linux VG/LV/FS management; worked extensively on vgextend, lvextend, etc.
  • Installed and configured BMC Patrol, NET-SNMP, PowerPath and EMC Solutions Enabler, SYMCLI on Linux/UNIX hosts.
  • Implemented server consolidation, storage consolidation, and P2V migration projects.
  • Worked on virtualization technologies, creating Linux virtual machines (VMs) using VMware vSphere, RHEVM, XEN, and KVM.
  • Configured Redhat repositories as needed.
  • Installed Dell OpenManage Systems management software on physical servers.
  • Worked on MC/ServiceGuard High Availability Cluster.
  • Configured Multipath, powerpath and NIC/HBA etc.
  • Unix/Linux Production Environment support.
  • Supported multiple applications in 24/7 environment.
  • Installation and configurations on HP-UX 10.20/11.0/11i, Linux Red Hat Enterprise v.3.x, v4.x, v5.x, v6.3, and Sun Solaris 8.0/ 8i/9.0.
  • Created repo servers and built and managed the Redhat Satellite server.
  • Built new Linux servers using kickstart, RHN satellite server and installed RPM packages on Red Hat Enterprise Linux servers. Installed patches using RPM/YUM.
  • Installed/upgraded/configured ClearCase, Middleware MQ, WebSphere, Veritas NetBackup, EMC PowerPath, Dell OpenManage, and capacity planning software TeamQuest, etc., on Redhat Linux 4.x/5.x/6.x.
  • LVM management, such as creating new filesystems, extending filesystems, and extending volume groups on Redhat Linux servers.
  • Filesystems management, export and import volume groups.
  • UNIX support for Oracle 10G/11G upgrades on Redhat Linux servers.
  • Installed and configured PostgreSQL database on RedHat Enterprise Linux servers.
  • Configured LDAP on HP-UX servers. LDAP-UX integration with eDirectory server on HPUX 11.31 servers.
  • Patch installation and firmware upgrades and managed patches on HPUX/AIX/Linux servers.
  • Performed IGNITE backup on all HP-UX servers.
  • Installed and configured HP MC Service Guard cluster.
  • User profile Administration and user security implementation.
  • Server hardening and Production Readiness Review (PRR).
  • Server commissioning and decommissioning.
  • Disk space management, extending VGs and file systems etc.
  • Problem management and implementation of changes to the storage operational infrastructure.
  • Participated in storage capacity, measurement and planning.
  • Implemented Red Hat Linux clusters.
  • VERITAS Netbackup 6.5 client/server configuration, storage node configuration, scheduling backup configuration for linux servers.
  • Managing change requests Using REMEDY.
  • Change and Configuration Management CCM.
  • Troubleshooting, documentation, and response to system alerts.

Confidential, Chicago, IL, July 2006 - Nov 2009

UNIX/Linux Administrator

Responsibilities:

  • Extensively worked with Solution Architects on multiple projects and provided UNIX support as needed.
  • Installation and configurations on HP-UX 10.20/11.0/11i, Linux Red Hat Enterprise v.3, and Sun Solaris 8.0/ 8i/9.0.
  • OS installation and upgrades.
  • NPAR and VPAR configuration on HP-UX and AIX servers.
  • Extensively worked on Server consolidation and Storage consolidation.
  • Supported an enterprise storage environment (EMC SAN) comprising DMX3000/DMX3/IBM FAStT during server/storage migrations. Used utilities like OpenMigrator, symcli, inq, syminq, etc.
  • Unix Production Environment support.
  • Supported multiple applications (SAS, Teradata) and projects in 24/7 environment.
  • Actively involved in Server consolidation, Oracle license consolidation and enterprise storage migrations at CapitalOne datacenters.
  • Supported Clearcase, Middleware MQ and Teradata etc.
  • LVM management on HP-UX and Linux servers.
  • Filesystems management, export and import volume groups.
  • Upgraded CapitalOne applications as needed.
  • UNIX support on Oracle 10G upgrades.
  • Patch installation on HP-UX servers (10.20/11.0/11.11i/11.23/11.31) and managed SD-UX patch depots.
  • Performed IGNITE backup.
  • User profile Administration and user security implementation.
  • Built new servers and implemented server hardening.
  • Server commissioning and decommissioning.
  • Supported storage migration from EMC DMX3000 and IBM FAStT to DMX3 Symmetrix frames.
  • Disk space management and LVM management.
  • Installed and configured EMC SYMCLI, PowerPath and TimeFinder.
  • Involved in Datacenter storage consolidation.
  • Participated in storage capacity, measurement and planning.
  • Managing change requests Using HPOV Service desk HPSD.
  • Change and Configuration Management CCM.
  • Involved in BCE Business Continuity Exercise (DR).

TECHNICAL SKILLS:

Hadoop Framework: HDFS, MapReduce, Pig, Hive, HBase, Sqoop, ZooKeeper, Oozie, Hue, HCatalog, Storm, Kafka, Spark, Key-Value Store Indexer, Flume.

NoSQL Databases: HBase.

Programming Language: Java, C#.Net, HTML.

Microsoft: MS Office, MS Project, MS Visio, MS Visual Studio 2003/ 2005/ 2008

Databases: MySQL, Oracle 8i/9i/10g, SQL Server, PL/SQL Developer.

Operating Systems: Linux, CentOS, RHEL, …

Scripting: Shell Scripting, HTML Scripting, puppet

Programming: C, C++, Core Java, PL/SQL.

WEB Servers: Apache Tomcat, JBOSS and Apache Http web server, IIS.

Cluster Management Tools: HDP Ambari, Cloudera Manager, Hue, SolrCloud.

Environment: On-premise, AWS, Azure.

IDE: NetBeans, Eclipse, Visual Studio, Microsoft SQL Server, MS Office
