Lead Hadoop Administrator Resume

New Jersey

SUMMARY

  • 10 years of IT experience, including 7 years with Hadoop on Cloudera and Hortonworks: HDFS, MapReduce, and the Hadoop ecosystem (Pig, Hive, Oozie, Sqoop, Spark, Kafka, NiFi, Druid, Hadoop 1.x/2.x).
  • Working knowledge of HaaS and PaaS. Can-do attitude with a positive, proactive style of working with the team.
  • Expertise in identifying the product roadmap and in product management activities.
  • As an architect, responsible for modern data architecture, Hadoop, big data, data and BI requirements; defined the strategy, technical architecture, implementation plan, and the management and delivery of big data applications and solutions.
  • Automated most repetitive tasks, including metadata backups, user logs, and job history logs.
  • Experience administering Hadoop: installation, configuration, testing, backup, recovery, customization, and maintenance of clusters using Apache Hadoop and Cloudera Hadoop.
  • Proficient with container systems such as Docker and container orchestration such as EC2 Container Service (ECS), GKE, and Kubernetes; worked with Terraform.
  • Experience with SSL/TLS integration for Kubernetes Ingress using cert-manager and Let's Encrypt (a minimal sketch appears at the end of this summary).
  • Expertise in identifying issues with Apache NiFi and HDF NiFi and in cluster troubleshooting; R&D on a customized .nar file deployed locally to provide authentication with Base64-encoded passwords and authorization in Apache NiFi without AD or LDAP.
  • Experience in using Flume to load log files into HDFS.
  • Expertise in using Oozie for configuring job flows.
  • Experience in capacity planning and analysis for Hadoop infrastructures/clusters
  • Experience with file conversion formats and compression formats.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experience in designing both time-driven and data-driven automated workflows using Oozie.
  • Experience in loading log data into HDFS using Flume.
  • Hadoop security and access controls (Kerberos, Active Directory)
  • Hadoop cluster integration with Nagios and Ganglia.
  • Experience in developing Test Strategy, Test plan, Test cases, Test scripts and traceability matrices.
  • Managed cluster alerts using Artificial Intelligence (AI) bots.
  • Expertise in designing and implementing disaster recovery plans for Hadoop clusters.
  • Knowledge of networking (TCP/IP, Ethernet), FTP, NFS, DNS, DHCP and RAID levels.
  • Experience in scripting for automation and monitoring using Shell and Python scripts.
  • Sound understanding of IT Infrastructure Administration with project management skills.
  • Good understanding of server hardware and hardware Architecture.
  • Implemented cloud services using Docker and OpenShift.
  • Team player with good management, analytical, communication and interpersonal skills.
  • Technical professional with management skills, excellent business understanding and strong communication skills.
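
A minimal sketch of the SSL/TLS setup referenced above (Kubernetes Ingress with cert-manager and Let's Encrypt). The host name, backend service, and the letsencrypt-prod ClusterIssuer are placeholders, not values from any specific engagement:

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumes this ClusterIssuer exists
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - demo.example.com
      secretName: demo-tls            # cert-manager stores the issued certificate here
  rules:
    - host: demo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo-svc        # placeholder backend service
                port:
                  number: 80
EOF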

TECHNICAL SKILLS

Big Data Technologies: Hadoop (HDFS, Hive, Pig, Flume, Yarn, Oozie, Sqoop, Zookeeper, HBASE, Kafka, Ranger, NiFi, NiFi Registry, Spark, Scala, Cloudera & Hortonworks).

Operating System: RHEL 6/7, CentOS 7/8, Ubuntu, Windows 10, XP/Vista/2010

Script Languages: Shell scripts, Python, JavaScript, Ansible, Terraform.

Networks: NFS, DHCP, DNS, FDNS, NDM

RDBMS: MySQL, PostgreSQL, XtraDB, Percona, MongoDB.

Protocols/Tools: SSH, SFTP, HTTP, TCP/IP, nslookup, tcpdump, Route 53

Programming Languages: C, SQL, HTML, Java, XML, Pig Latin, Python 2/3.x

Environment: CLOUDERA, HORTONWORKS, APACHE, AWS.

CDH: 4.0/6.2.x, HDP 1.3/2.3.4, Hive 0.7/0.13, Sqoop, Apache Hadoop 2.4/2.7, Kerberos, Sentry, CDF, NiFi 1.x, Ranger, Amelia (AI), Nagios, Ganglia, Prometheus, Grafana, Druid, AutoSys, Amazon EC2, S3, Apache NiFi

Version Control: SVN, GitHub.

PROFESSIONAL EXPERIENCE

Confidential, New Jersey

Lead Hadoop Administrator

Responsibilities:

  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
  • Cluster maintenance, as well as creation and removal of nodes, using tools like Ganglia, Nagios, Cloudera Manager Enterprise, Dell OpenManage, and other tools.
  • Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
  • Screen Hadoop cluster job performance and capacity planning.
  • Manage and review Hadoop log files.
  • File system management and monitoring (a minimal monitoring sketch appears after this list).
  • Diligently teaming with the infrastructure, network, database, application, and business intelligence teams to guarantee high data quality and availability.
  • Collaboration with application teams to install operating system and Hadoop updates, patches, and version upgrades when required.
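
A minimal sketch of the kind of file system monitoring check referred to above; the 85% threshold and the alert recipient are placeholder values, not taken from this engagement:

#!/bin/bash
# Alert if cluster-wide HDFS usage ("DFS Used%" from dfsadmin) crosses a threshold.
THRESHOLD=85   # percent used; placeholder value
USED=$(hdfs dfsadmin -report | awk '/DFS Used%/{gsub(/%/,"",$3); print int($3); exit}')
if [ "${USED:-0}" -ge "$THRESHOLD" ]; then
  echo "HDFS usage at ${USED}% on $(hostname)" | mail -s "HDFS capacity alert" hadoop-admins@example.com
fi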

Environment: Hortonworks HDP-3.1.4.x, HDF 3.4.x, Apache Apex, Data Torrents, JDK 1.8x, REDHAT 7.x, Centos 7, 8, Airflow, Druid, Apache Kafka, Spark, Python3, Prometheus, Grafana, Kubernetes, Terraform, Ansible, AWS, Azure, GCE, GKE, Docker.

Confidential, Fremont, CA

Hadoop Admin (Lead)

Responsibilities:

  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Aligning with the procurement team to propose and deploy new hardware required for Hadoop to expand existing environments.
  • Working with data delivery teams to set up new users. This includes setting up Linux users, setting up Kerberos principals, and testing HDFS, Hive, Kafka, NiFi, Druid, and Spark/Scala access for the new users; for the Apache clusters we use FreeIPA for centralized authentication, authorization, LDAP, and DNS.
  • Cluster maintenance as well as commissioning and decommissioning nodes; we have Apache Kafka, NiFi, and Spark clusters in place, and explore new features by running POCs with the latest bundles and documenting pros and cons.
  • Developed Python scripts used for monitoring, keeping track of user activity in the cluster, and generating reports on a weekly basis.
  • Involved in developing Spark code using Scala and Spark SQL for faster testing and processing of data, and in exploring optimizations.
  • Implemented HDFS data-at-rest encryption (end-to-end encryption of data read from and written to HDFS) and created multiple encryption zones (see the sketch after this list).
  • Day-to-day activities include monitoring ongoing cluster job performance and capacity planning, monitoring cluster connectivity and security, reviewing log files, archiving old data, and file system management.
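
A minimal sketch of creating an HDFS encryption zone as described above; it assumes a Hadoop KMS is already configured, and the key name and path are placeholders:

hadoop key create projectKey                          # create an encryption key in the KMS
hdfs dfs -mkdir -p /data/secure                       # the target directory must exist and be empty
hdfs crypto -createZone -keyName projectKey -path /data/secure
hdfs crypto -listZones                                # verify the new encryption zone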

Environment: Hortonworks HDP-3.1.4.x, HDF 3.4.x, Apache Nifi 1.9x, 1.13.2, JDK 1.8x, REDHAT 7.x, Centos 7, 8, Airflow, Druid, Spark, Scala, Python3, Prometheus, Grafana, Kubernetes, Terraform, Ansible, VMware, AWS, Azure, GCE, GKE.

Confidential, Charlotte NC

Application Architect

Responsibilities:

  • Working knowledge of Hadoop as a Service (HaaS) technical strategy and architecture for the platform initiatives.
  • Provided technical recommendations and trade-offs which address business needs and timelines.
  • Participated in cross-functional, cross-discipline architecture teams to enhance/set architectural direction for key business initiatives.
  • Provided technical information on impacts to existing system operations, in the form of conceptual and logical architecture diagrams, to meet strategic business needs such as global information security and standards.
  • Documented frequent ad-hoc requests and provided useful information to application teams and various audiences, including technology and business executives, via SharePoint and the enterprise Confluence wiki.
  • Communicating effectively with the teams to gather requirements, introduce new tools to the existing platform environment and promote adherence to the bank policies and standards by initiating POCs.
  • As an architect, able to drive initiatives and recommend best practices, patterns, reuse, and acceleration.
  • Participated in a POC for BlueData containerization and virtualization (Kubernetes, Docker); built and configured Kubernetes-based infrastructure on premises and in the cloud (kube-proxy, Ingress, etc.) and leveraged Kubernetes for deploying services such as CDP, Kafka, Spark, and other distributed services (see the Spark-on-Kubernetes sketch after this list).
  • Understanding infrastructure and security concerns; leveraging tools and strategies; worked with vendors to thoroughly understand tool behavior and how to deploy and integrate it on the platform, turning best practices into problem avoidance and continuous improvement.
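
A minimal sketch of submitting a Spark job to a Kubernetes cluster, in the spirit of the POC above; the API server address, container image, namespace, and service account are placeholders:

spark-submit \
  --master k8s://https://k8s-apiserver.example.com:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=3 \
  --conf spark.kubernetes.namespace=spark-jobs \
  --conf spark.kubernetes.container.image=registry.example.com/spark:3.0.1 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.0.1.jar 1000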

Environment: Cloudera CDH 5.14, 6.2x, CDP DC, CFM NiFi 1.9x, JDK 1.8x, REDHAT 7.x, Autosys, GitHub, Ansible, Kubernetes, Docker, Toad, AppHost VMs, OpenShift, Hitachi S3 Object Storage, APIs, JDBC, ODBC calls.

Confidential, Chicago IL

Hadoop Administrator

Responsibilities:

  • Developed Hadoop monitoring service checks using the Cloudera Manager APIs (a minimal sketch appears after this list).
  • Responsible for doing capacity planning based on the data size requirements provided by end-clients.
  • Participated in system configuration design to finalize on the master and slave configuration for the cluster.
  • Experience in designing, implementing, and maintaining high-performing Hadoop clusters and integrating them with existing infrastructure.
  • Experience managing users and permissions on the cluster, using different authentication methods.
  • Involved in regular Hadoop cluster maintenance such as patching security holes & updating system packages.
  • Experience in doing performance tuning based on the inputs received from the currently running jobs.
  • Preparing Standard Operating Procedures (SOPs) for the process related tasks handled by team members.
  • Responsible for onsite-offshore coordination for smoother operations.
  • Troubleshooting issues: diagnosing and remediating through automation.
  • Take admin actions on services and roles, such as start, stop, restart, and failover, as well as more advanced workflows such as setting up high availability and decommissioning.
  • Installed CDH 5.11 with Cloudera Manager and integrated other third-party tools and components.
  • Coordinating with clients, application teams, vendors for ongoing process.
  • Maintaining cluster, monitor user jobs and other cluster activities, taking backups frequently, and tuning the cluster performance.
  • Collaborating with offshore teams, vendors, and various development teams during failovers and maintenance downtime.
  • Ability to document existing processes and recommend improvements.
  • Improving monitoring checks based on Nagios application (IPmon, IPdiscovery, IPcmdb)
  • Knowledge of data security; coordinating with centralized data security teams.
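
A minimal sketch of the kind of Cloudera Manager API service check described above; the host, API version, cluster name, and credentials are placeholders:

#!/bin/bash
# List service state and health from the Cloudera Manager REST API.
CM_HOST="cm.example.com"     # placeholder Cloudera Manager host
CLUSTER="cluster1"           # placeholder cluster name
curl -s -u "admin:admin" "http://${CM_HOST}:7180/api/v19/clusters/${CLUSTER}/services" \
  | grep -E '"name"|"serviceState"|"healthSummary"'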

Environment: Cloudera Hadoop, Redhat, CentOS, Oozie 4.2, Sqoop, Hive, Flume, HBase, SHA, SSH, Jdk 1.7, IPmon, Ipdiscovery, Ipcenter, IPcmdb, VMware VSphere client.

Confidential

Hadoop / MySQL Administrator

Responsibilities:

  • Migrated 14TB of data from HBase Cluster to MySQL Database.
  • Performed data validation using a validation algorithm and test data.
  • Experience in using EMC tool for data migration techniques.
  • Involved in defining job flows using Oozie to schedule and manage Apache Hadoop jobs.
  • Worked on importing and exporting data between an Oracle database and HDFS/Hive using Sqoop (see the sketch after this list).
  • Monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Extensively involved in Cluster Capacity planning, Hardware planning, Performance tuning of the Hadoop Cluster.
  • Implemented Rack Awareness for data locality optimization.
  • Optimized and tuned the Hadoop environments to meet performance requirements.
  • Hands-on experience with AWS cloud: EC2, S3.
  • Collaborating with offshore team.
  • Ability to document existing processes and recommend improvements.
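
A minimal sketch of the kind of Sqoop import from Oracle into HDFS/Hive mentioned above; the connection string, credentials, and table names are placeholders:

sqoop import \
  --connect jdbc:oracle:thin:@//db.example.com:1521/ORCL \
  --username etl_user -P \
  --table SALES.ORDERS \
  --target-dir /user/etl/orders \
  --hive-import --hive-table orders \
  --num-mappers 4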

Environment: Cloudera Hadoop, Oozie 4.2, Redhat, CentOS, Sqoop 1.4.6, Hive 1.2, HBase, Oracle Sql Developer, Teradata, SVN, SFTP, SSH, Eclipse, Jdk 1.7, Maven.

Confidential, San Ramon, CA

Hadoop Administrator

Responsibilities:

  • Hands-on installation and configuration of Hortonworks Data Platform (HDP) 2.3.4.
  • Worked on installing production cluster, commissioning & decommissioning of Data Nodes, Name Node recovery, capacity planning, and slots configuration
  • Worked on Hadoop administration; responsibilities included software installation, configuration, software upgrades, backup and recovery, cluster setup, daily cluster performance monitoring, and keeping the cluster up and healthy.
  • Implemented the security requirements for Hadoop and integrate with Kerberos authentication and authorization infrastructure.
  • Designed, developed and implemented connectivity products that allow efficient exchange of data between the core database engine and the Hadoop ecosystem.
  • Involved in defining job flows using Oozie for scheduling jobs to manage apache Hadoop jobs.
  • Implemented Name Node High Availability on the Hadoop cluster to overcome single point of failure.
  • Worked on YARN capacity scheduler by creating queues to allocate resource guarantee to specific groups.
  • Worked on importing and exporting data from Oracle database into HDFS and HIVE using Sqoop.
  • Monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Extensively involved in Cluster Capacity planning, Hardware planning, Performance tuning of the Hadoop Cluster.
  • Wrote automation scripts and set up crontab jobs to keep the cluster stable and healthy.
  • Installed Ambari on an already existing Hadoop cluster.
  • Implemented rack awareness for data locality optimization (see the topology-script sketch after this list).
  • Optimized and tuned the Hadoop environments to meet performance requirements.
  • Hands-on experience with AWS cloud: EC2, S3.
  • Collaborating with offshore team.
  • Ability to document existing processes and recommend improvements.
  • Shared knowledge and assisted other team members as needed.
  • Assist with maintenance and troubleshooting of scheduled processes.
  • Participated in development of system test plans and acceptance criteria.
  • Collaborate with offshore developers to monitor ETL jobs and troubleshoot steps.
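
A minimal sketch of the kind of rack-topology script behind the rack awareness work above; the IP-to-rack mapping is a placeholder, and core-site.xml's net.topology.script.file.name would point at this script:

#!/bin/bash
# Hadoop passes one or more host IPs/names; print one rack path per argument.
for host in "$@"; do
  case "$host" in
    10.0.1.*) echo "/dc1/rack1" ;;
    10.0.2.*) echo "/dc1/rack2" ;;
    *)        echo "/default-rack" ;;
  esac
done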

Environment: Hortonworks HDP 2.3.x, Ambari (with Blueprints), Oozie 4.2, Sqoop 1.4.6, Hive 1.2, Redhat, CentOS, MapReduce2, Oracle Sql Developer, Teradata, SVN, SFTP, SSH, Eclipse, Jdk 1.7, Maven.

Confidential

System Engineer

Responsibilities:

  • Involved in installation and configuration of Linux/UNIX servers and workstations; distributions include Red Hat.
  • Responsible for designing, implementing, troubleshooting, and administering servers, desktops, peripherals, and systems on Windows NT and Linux, on Local Area Networks.
  • Server operating system installation and management of RHEL 5, CentOS 5, and Ubuntu.
  • Use various virus removal techniques to clean the infected computers as per user's request.
  • Installation of Windows Operating systems XP, Vista, Windows 7, 2003 server, 2008 R2 server.
  • Performed RPM and YUM package installations, patching, and other server management.
  • Managed routine system backups, scheduled jobs, enabled cron jobs, and enabled system and network logging on servers for maintenance (see the backup cron sketch after this list).
  • Monitoring System Metrics and logs for any problems.
  • Security Management, providing/restricting login and sudo access on business specific and Infrastructure servers & workstations.
  • Maintaining the RDBMS server and Authentication to required users for databases.
  • Running crontab to back up data and troubleshooting Hardware/OS issues.
  • Handling and debugging Escalations from L1 Team.
  • Involved in developing custom scripts using Shell (bash, ksh) to automate jobs.
  • Monitoring day-to-day administration and maintenance operations of the company network and systems working on Linux and Solaris Systems.
  • Extensively used log4j for application logging.
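
A minimal sketch of the kind of cron-driven backup described above; the paths, schedule, and retention period are placeholders:

#!/bin/bash
# Nightly backup script; a crontab entry like the following would schedule it:
#   0 2 * * * /usr/local/bin/nightly_backup.sh >> /var/log/nightly_backup.log 2>&1
set -euo pipefail
BACKUP_DIR=/backup
DATE=$(date +%F)
tar -czf "${BACKUP_DIR}/etc-${DATE}.tar.gz" /etc               # archive system configuration
tar -czf "${BACKUP_DIR}/appdata-${DATE}.tar.gz" /srv/appdata   # archive application data (placeholder path)
find "${BACKUP_DIR}" -name '*.tar.gz' -mtime +14 -delete       # keep roughly two weeks of backups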

Environment: Linux, RHEL/SUSE, NFS, AUTOFS, NTP, Telnet, CentOS, FTP, HTML and UNIX/Linux.

Confidential

Internship

Responsibilities:

  • Experience with administering Red Hat Linux, HP UNIX, and Solaris UNIX based systems.
  • VMware and VSphere experience.
  • Experience with Clustering Technologies on Red Hat Linux, HP UNIX, and Solaris UNIX based systems.
  • Performing Linux and Unix Systems Administration duties
  • Experience with installing Operating System, software packages, and patches.
  • Knowledge of working in data centers and with vendors; knowledge of hardware administration.
  • Understanding of hardware and software RAID configurations and builds.
  • Perform OS tuning, understand networking, perform user management, and perform security hardening, review OS logs as needed and perform system backups.
  • Familiarity with VMWARE / ESXi installation, vCenter installation, configuration, troubleshooting
  • Experience with scripting, particularly shell scripting: Perl, sh and csh variants, and ksh.
  • Led Root Cause Analysis, Lessons Learned, and prevention processes to provide high-standard deliverables that meet SLAs (Service Level Adherence) and ensure CSAT (Customer Satisfaction).

Environment: Linux, RHEL, NFS, AUTOFS, NTP, Telnet, CentOS.
