Sr. Hadoop Administrator Resume
Charlotte, NC
SUMMARY
- Around 7 years of IT experience, including 4+ years with the Hadoop ecosystem, installing and configuring Hadoop ecosystem components in existing clusters.
- Experienced in installation, configuration, monitoring and administration of ecosystem components such as HIVE, PIG, SQOOP, FLUME, ZOOKEEPER, SCALA, OOZIE and HBASE.
- Extensive experience in planning, installing, configuring and administering Hadoop clusters for major Hadoop distributions like Cloudera and Hortonworks.
- Knowledge of NoSQL database administration.
- Experienced in understanding clients' Big Data business requirements and translating them into Hadoop-centric solutions.
- Hands-on experience installing, configuring and managing Hue.
- Analyzed clients' existing Hadoop infrastructure to identify performance bottlenecks and tune performance accordingly.
- Used Sqoop to import and export data between different databases and HDFS, HBase and Hive.
- Defined job flows in Hadoop environment using Oozie.
- Experience in configuring Zookeeper to provide Cluster coordination services.
- Loaded log files from multiple sources directly into HDFS.
- Hands-on experience in Linux admin activities on RHEL and CentOS.
- Experience in benchmarking, backup and recovery of NameNode metadata.
- Experience in commissioning and decommissioning of nodes.
- Adept at configuring NameNode High Availability.
- Set up NameNode high availability for a large production cluster and designed automatic failover control using Zookeeper and JournalNodes.
- Strong knowledge of Hadoop HDFS 1.x and 2.x architectures and the MapReduce framework.
- Experienced in deploying and managing the multi-node cluster for development, testing and production environments.
- Experienced in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure, including KDC server setup and realm creation.
- Used Cluster Monitoring tools like Ganglia and Nagios.
- Well versed in multiple operating systems such as Windows, DOS, UNIX and Linux.
- Excellent command of change management and coordinating deployments in distributed environments.
- Experienced in Installation of Operating Systems, Packages and Patches, adding peripherals, maintaining user accounts, System Security maintenance, performance tuning, troubleshooting at various levels.
- Hands-on experience with configuration management systems like Chef.
- Security management, including performing security health checks per policies and procedures and patching servers based on advisories for applications and operating systems.
- Hands-on experience in diagnosing and troubleshooting networking, hardware and Linux server service issues and performing preventive maintenance.
- Experienced in writing / modifying shell scripts for process automation of systems, applications, backups.
TECHNICAL SKILLS
Big Data Ecosystem: Cloudera, Hortonworks, MapR, HDFS, HBase, Hadoop MapReduce, Zookeeper, Hive, Spark, Impala, Pig, Sqoop, Flume, Oozie, YARN.
Database: Oracle, MySQL, Postgres, MS Access, NoSQL
Scripting Languages: Bash, UNIX shell scripting.
Operating Systems: Red Hat Linux 4.0, RHEL 5.4, RHEL 6.4, RHEL 7, IBM AIX, HP-UX 11.0, HP-UX 11i, UNIX, VMware ESX 2.x, Windows XP, Windows Server 2008/2012.
Management: Cloudera Manager, Ambari
Monitoring Tool: Splunk, IBM Guardium.
Cloud Computing: Amazon AWS Cloud, Azure, EC2 and VPC instances, S3.
SDLC: Agile (Scrum), Waterfall.
PROFESSIONAL EXPERIENCE
Confidential, Charlotte, NC
Sr. Hadoop Administrator
Responsibilities:
- Acting in the highest-level technical role as an individual contributor and/or team lead for the most complex computer applications. Administering the Enterprise Risk Management Technology (EBT) platform and providing day-to-day support.
- Installing, configuring, upgrading and providing support for Hadoop Hortonworks clusters and Quantitative Analytical tools in the Platform.
- Utilizing a thorough understanding of available technology, tools, and existing designs. Supporting development of the Hortonworks Hadoop HDP/HDF platform, Spark clusters, ETL/ELT tools, open source tools (R, Python, Scala/Java) and visualization tools. Troubleshooting data wrangling tools like Jupyter/Zeppelin notebooks. Provisioning notebooks, development tools and libraries as requested by Model Validators. Working with CI/CD tools like Git, JIRA, Jenkins and Artifactory.
- Working on the most complex problems where analysis of situations or data requires evaluation of intangible variance factors. Working with CMOR business to act as liaison to understand the model validation efforts and work along with the technology team.
- Planning, performing, and acting as the escalation point for the most complex platform designs, coding, and testing. Optimizing and troubleshooting application performance with tuning of Hive/HQL, LLAP and Spark.
- Leading multiple highly complex modeling, simulation, and analysis efforts. Designing and building model solutions for business requirements using Big Data technologies. Working as a data scientist to understand the business needs and provide support as needed.
- Acting as an expert technical resource to programming staff in the program development, testing, and implementation process. Administering, installing and configuring Hortonworks HDP services like HDFS, YARN, MapReduce, Hive, Ranger, Zookeeper and Spark.
Environment: Ambari (2.7.1 & 2.7.5), Ranger, HDP (3.1.0 & 3.1.5), Kerberos, HDFS, Hive, Tez, MapReduce, Shell Scripting, Jupyter, Git, Jenkins, Spark2, Solr, Sqoop, Kafka, Zookeeper, Red Hat Linux 7.9.
Confidential, Charlotte, NC
Sr. Hadoop Administrator
Responsibilities:
- Experience managing and administering Hadoop platforms, with good architectural insight into large Hadoop data and analytics environments running MapR 5/6, CM & CDH 5.14 and 6.2.1, and RHEL 6.10+/7.6+ services.
- As a platform admin & architect install/upgrade CDH & MapR components, set up new clusters, collaborate with SA team on OS revisions, conduct capacity planning, optimize platform performance, test/troubleshoot Hadoop components, explore new big data tools & technologies; collaborate with development teams to create common libraries, establish Hadoop programming standards/best practices.
- CM & CDH upgrades and rollbacks (5.14.4 to 6.2.1).
- Programming experience in Java and Linux Shell scripting.
- Understanding of Linux OS, networking, security using Kerberos/Active Directory/SSL/PowerBroker, space management to run Hadoop on bare metal servers.
- Experienced in Agile methods & Jira storyboarding.
- Data Analytics Platform Environment administration & support for Cloudera clusters and MapR clusters.
- Supporting the transition of Global Markets applications from MapR to Cloudera Hadoop.
- Installing and administering new Hadoop clusters and expanding existing clusters to meet growing demand.
- Working on new proofs of concept for products like IBM BigSQL, data privacy/data encryption tools and object storage.
- Installing and configuring products like BlueData, NiFi, AtScale, SAS InDB and Protegrity for tenant use upon successful POC.
Environment: Cloudera Manager (5.14.4 & 6.2.1), Sentry, MapR 5/6, PowerBroker, Kerberos, HDFS, HBase, Hive, StreamSets, Impala, MapReduce, Shell Scripting, Spark, Solr, Sqoop, Kafka, Flume, Oozie, Zookeeper, Red Hat Linux 6 & 7.0.
Confidential, Raritan, NJ
Sr. Hadoop Engineer/Administrator
Responsibilities:
- Analyzes, designs, creates and implements Cloudera infrastructures, including access methods, device allocations, logical and physical infrastructure designs, validation checks, organization and security.
- Provide 24x7 production system support: monitor Hadoop cluster connectivity and security, troubleshoot issues, and manage and review Hadoop log files.
- Cloudera infrastructure migration, i.e., migrating Dev, QA and Prod applications (databases and data) from the old Cloudera infrastructure to a new one with greater resources.
- Upgrade CM & CDH from 5.11.2 to 5.15.
- POCs on “Cloudera on Azure using Cloudera Director”, NiFi, Trifacta, StreamSets, Cloudera Data Science Workbench (CDSW), Control Hub, etc.
- Automate disk space utilization reports of projects for all environments.
- Experience migrating multiple Cloudera platforms and workloads from on-premises to on-premises or from on-premises to cloud (Azure).
- Strong Cloudera administration experience with Cloudera Manager, Cloudera Navigator, Kudu and HDFS/Hive, including spinning up clusters and handling enterprise-level security and compliance.
- Assists in system planning, scheduling, and implementation.
- Initiates corrective actions to stay on schedule.
- Installs Cloudera clusters (5.11, 5.15.2, 6.2) and tests complex big data deployments.
- Develops and implements recovery plans and procedures.
- Solid administrative knowledge of Cloudera Distribution of Hadoop.
- Experience with database replication and scaling.
- Design, install and maintain highly available systems (including monitoring, security, backup, and performance tuning).
- Experience in Linux (RHEL 6 & 7) and Scripting.
Environment: Cloudera Manager (5.11, 5.15.2 & 6.2), Sentry, Kerberos, HDFS, HBase, Hive, StreamSets, Impala, MapReduce, Shell Scripting, Spark, Solr, Sqoop, Kafka, Flume, Oozie, Tomcat, Zookeeper, Splunk, Red Hat Linux 7.0.
Confidential, Charlotte, NC
Hadoop Security Engineer/Administrator
Responsibilities:
- Responsible for cluster maintenance, upgrades, patching, fine tuning, commissioning and decommissioning nodes, monitoring, troubleshooting, and managing and reviewing data backups and log files.
- Collaborate with key stakeholders within GIS and Cyber Security to establish logging standards to address various governance and develop specific use cases to address business needs. Also apply technical expertise in Hadoop, Cloudera and big data storage to the task of applying real-time database activity monitoring technology using IBM Guardium.
- Engage with information security and technical leaders to determine best approach for applying existing bank security policies to new data storage technologies.
- Develop automation for security tools management and build data pipelines from Hadoop to ArcSight, i.e., through Cloudera Navigator, Kafka, Guardium appliances, Splunk and ArcSight.
- Tested all the available commands for Hive, Impala, HDFS and HBase and analyzed the audits of those commands in Cloudera Navigator. Also collaborated with Cloudera Support to fix specific commands that had auditing issues.
- Implemented real-time Hadoop Monitoring use cases like Hadoop Failed logins, Unauthorized Admin commands, Unauthorized Access to Confidential data and Unauthorized Updates to Confidential data in GIS Production.
- Used Impala for Data Analytics as per the business requirement.
- Built a DR cluster to store and safeguard data of the GIS Production cluster.
- Documented the workflow, use cases, issues as per the requirement.
Environment: Cloudera Manager (5.9.1, 5.13.2 & 5.14), Sentry, Kerberos, HDFS, HBase, Hive, Impala, MapReduce, Shell Scripting, Spark, Solr, Sqoop, Kafka, Flume, Oozie, Tomcat, Zookeeper, Splunk, IBM Guardium, Red Hat Linux 7.0.
Confidential, Parlin, NJ
Hadoop/Systems Engineer
Responsibilities:
- As a Systems Engineer, installed hardware and software, gathered business requirements and converted them into system-specific technical requirements.
- Provisioning, installing, configuring, monitoring, and maintaining Cloudera clusters (5.5.4), HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig and Hive.
- Provided support to the IT team during the Software Development Life Cycle (SDLC) by developing scripts and writing, updating and debugging the repository for code revision.
- Work with Senior Management to develop forms and reports to structure different types of operational data collected.
- Attended meetings to discuss, update and correct design and programming issues.
- Troubleshot and resolved system faults and produced project documentation.
- Troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop data.
- Involved in loading data from UNIX file system to HDFS.
- Monitored multiple Hadoop cluster environments using Cloudera Manager.
- Loaded data from MySQL data sources into HDFS using Sqoop and then into partitioned Hive tables.
- Built an automated setup for cluster monitoring and the issue escalation process.
- Administering, installing, upgrading and managing distributions of Hadoop (Cloudera Manager), Hive and HBase.
Environment: HDFS, MapReduce, Shell Scripting, Spark, Solr, Pig, Hive, HBase, Sqoop, Flume, Oozie, Zookeeper, cluster health, monitoring security, Red Hat Linux 7.0, Impala, Cloudera Manager 5.5.4, Matlab, C++, Java.
Confidential
Linux Administrator
Responsibilities:
- Installing and updating packages using YUM.
- Installing and maintaining the Linux servers.
- Created volume groups, logical volumes and partitions on Linux servers and mounted file systems.
- Deep understanding of monitoring and troubleshooting mission critical Linux machines.
- Improve system performance by working with the development team to analyze, identify and resolve issues quickly.
- Ensured data recovery by implementing system and application level backups.
- Performed various configurations, including networking and iptables, host name resolution and SSH key-based passwordless login.
- Automate administration tasks through the use of scripting and Job Scheduling using CRON.
- Monitoring System Metrics and logs for any problems.
- Running crontab jobs to back up data.
- Adding, removing, or updating user account information, resetting passwords, etc.
- Using Java JDBC to load data into MySQL.
- Maintaining the MySQL server and granting database access to required users.
- Supporting pre-production and production support teams in the analysis of critical services and assisting with maintenance operations.
Environment: Red Hat 5.x, Solaris 9/10, Linux, VMware, TCP/IP, CRON, MySQL, Windows Server 2008.
