Hadoop Admin Resume
Princeton, NJ
SUMMARY:
- 8 years of experience in design, development and implementation of robust technology systems, with specialized expertise in Hadoop Administration, Big Data, Linux Administration and Storage Administration in EMC VMAX/VNX/VPLEX, Symmetrix and NetApp (NAS).
- Experience includes Hadoop development and ecosystem analytics, and development and design of Java-based enterprise applications.
- Experience in using various Hadoop ecosystem components such as MapReduce, Pig, Hive, Zookeeper, HBase, Sqoop, YARN, Spark, Kafka, Oozie, and Flume for data storage and analysis.
- Expertise in Commissioning and Decommissioning the nodes in Hadoop Cluster.
- Experience collecting and aggregating large amounts of log data using Apache Flume and storing the data in HDFS for further analysis.
- Experience with job/workflow scheduling and monitoring tools like Oozie.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Worked in complete Software Development Life Cycle (analysis, design, development, testing, implementation and support) using Agile Methodologies.
- Installed and configured various Hadoop distributions like CDH 5.7 and HDP 2.6.5 and higher versions.
- Experience setting up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Nagios and Ganglia.
- Extensive experience in installing, configuring and administering Hadoop clusters for major Hadoop distributions like CDH5 and HDP.
- Experience in Sentry, Ranger, Knox configuration to provide the security for Hadoop components.
- Good experience in designing, configuring and managing backup and disaster recovery for Hadoop data.
- Experience in setting up Hadoop clusters in cloud services like AWS and Azure.
- Knowledge of AWS services such as EC2, S3, Glacier, IAM, EBS, SNS, SQS, RDS, VPC, Load Balancers, Auto Scaling, CloudFormation, CloudFront and CloudWatch.
- Experience in Linux System Administration, Linux System Security, Project Management and Risk Management in Information Systems.
- Involved in the functional usage and deployment of applications to Oracle WebLogic, JBoss, Apache Tomcat, Nginx and WebSphere servers.
- Experience working with VMware Workstation and VirtualBox.
PROFESSIONAL EXPERIENCE:
Confidential, PRINCETON, NJ
HADOOP ADMIN
Responsibilities:
- Administered Hadoop clusters with 84 nodes, including Prod and Dev environments, on the Hortonworks distribution.
- Built and maintained HDP and HDF/NiFi clusters (Hortonworks Data Platform and Hortonworks DataFlow) in all four environments: Dev, Prod, UAT1 and UAT2.
- Installed and maintained Ambari to install and monitor HDP and HDF.
- Installed MySQL and PostgreSQL as the backend databases.
- Installed all the required services in all environments according to the requirements of the business and developers.
- Implemented row-level security in the environment through Ranger.
- Enabled Kerberos for authentication as part of securing the cluster.
- Added additional nodes to the clusters for better performance.
- Installed all the clusters in the cloud on Microsoft Azure.
- Experienced in shell scripting with good Unix and Linux knowledge. Installed services such as Hive, Sqoop and SmartSense.
- Monitored HDFS file system/disk-space management, cluster & database connectivity, log files, management of backup/security and troubleshooting various user issues.
- Implemented various scripts for backing up HDFS daily/weekly/monthly with a retention period.
- Responsible for day-to-day activities which include HDFS support and maintenance, Cluster maintenance, creation/removal of nodes, Cluster Monitoring/Troubleshooting, Manage and review Hadoop log files, Backup restoring and capacity planning.
- Configured the YARN Capacity Scheduler by creating queues that guarantee resources to specific groups for applications running on top of Hadoop.
- Experience in methodologies such as Agile, Scrum, and Test-driven development.
- Installed and maintained the AtScale application on all the platforms through an edge node for business use.
- Hands-on experience in installing AtScale 6, AtScale 7 and above.
Environment: HDP 2.6.5, HDP 2.6.3, HDF 2.2, HDF 2.3, Ambari 2.6.1, Ambari 2.6.2, Microsoft Azure cloud nodes, Hive, Sqoop, Kafka, Spark, Spark2, YARN, HBase, Zookeeper, SmartSense and Slider
Confidential, SANTA CLARA, CA
HADOOP ADMIN
Responsibilities:
- Worked as Administrator for Hadoop Cluster (180 nodes).
- Performed Requirement Analysis, Planning, Architecture Design and Installation of the Hadoop cluster.
- Experience in Upgrades and Patches and Installation of Ecosystem Products through Ambari.
- Automated the configuration management for several servers using Chef and Puppet.
- Monitored job performances, file system/disk-space management, cluster & database connectivity, log files, management of backup/security and troubleshooting various user issues.
- Responsible for day-to-day activities which include HDFS support and maintenance, Cluster maintenance, creation/removal of nodes, Cluster Monitoring/Troubleshooting, Manage and review Hadoop log files, Backup restoring and capacity planning.
- Designed and deployed clustered HPC monitoring systems, including a dedicated monitoring cluster.
- Developed and documented best practices for HDFS support and maintenance and for setting up new Hadoop users.
- Responsible for the administration of new and existing Hadoop infrastructure.
- Handled DBA responsibilities such as data modeling, design and implementation, software installation and configuration, database backup and recovery, database connectivity and security.
- Built data platforms, pipelines, and storage systems using Apache Kafka, Apache Storm and search technologies such as Elasticsearch.
- Implemented Hadoop ecosystem components such as YARN, MapReduce, HDFS, HBase, Zookeeper, Pig and Hive.
- In charge of installing, administering, and supporting Windows and Linux operating systems in an enterprise environment.
- Involved in installing and configuring Ranger for the authorization of users and Hadoop daemons.
- Experience in methodologies such as Agile, Scrum, and Test-driven development.
- Worked with cloud services like Amazon Web Services (AWS) and was involved in ETL, data integration, data warehousing, migration, and Kafka installation.
- Used Flume extensively in gathering and moving log data files from Application Servers to a central location in Hadoop Distributed File System (HDFS).
ENVIRONMENT: Hadoop, MapReduce, Cassandra, HDFS, Pig, Git, Jenkins, Kafka, Puppet, Ansible, Maven, Spark, YARN, HBase, Oozie, MapR, NoSQL, ETL, MySQL, Agile, Windows, UNIX Shell Scripting
Confidential, SAN MATEO, CA
HADOOP ADMIN
Responsibilities:
- Installed and configured Hadoop and Ecosystem components in Cloudera and Hortonworks environments.
- Configured Hadoop, Hive and Pig on Amazon EC2 servers. Analyzed system failures, identified root causes, and recommended courses of action. Documented the system processes and procedures for future reference.
- Installed and configured Hive, Pig, Sqoop and Oozie on the HDP 2.2 cluster and implemented Sentry for the Dev cluster.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop. Tuned the performance of Pig queries.
- Converted ETL operations to Hadoop system using Pig Latin Operations, transformations and functions.
- Implemented best income logic using Pig scripts and UDFs. Captured data from existing databases that provide SQL interfaces using Sqoop.
- Worked on the YARN Capacity Scheduler by creating queues that guarantee resources to specific groups.
- Implemented the Hadoop stack and various big data analytics tools, and migrated data from different databases to Hadoop (HDFS).
- Developed backup policies for Hadoop systems and action plans for network failure.
- Involved in the User/Group Management in Hadoop with AD/LDAP integration.
- Managed resources and load using capacity scheduling, applying changes according to requirements.
- Implemented a strategy to upgrade the OS on all cluster nodes from RHEL 5 to RHEL 6 while keeping the cluster up and running.
- Developed shell and Python scripts to automate many day-to-day admin activities.
- Implemented HCatalog to make partitions available for Pig/Java MR and established a remote Hive metastore using MySQL.
- Installed several projects on Hadoop servers and configured each project to run jobs and scripts successfully.
ENVIRONMENT: Cloudera Manager 4 & 5, Ganglia, Tableau, Shell Scripting, Oozie, Pig, Hive, Flume, Bash scripting, Teradata, Kafka, Impala, Sentry, CentOS.
Confidential, SAN ANTONIO, TX
SAN/ NAS ADMINISTRATOR
Responsibilities:
- Administered Storage Area Networks (EMC VMAX), Network Attached Storage (NetApp 7-Mode, C-Mode), and data protection and tape backup (Data Domain)/restore technology in a mission-critical enterprise environment.
- Created aggregates, volumes and qtrees using the command line interface and GUI (System Manager, FilerView from version 8 onwards in C-Mode).
- Exported NAS volumes and qtrees to Linux servers and created CIFS shares for Windows access.
- Implemented multiprotocol access to NAS shares (NTFS security-style share access from Unix and vice versa).
- Mapped Unix users to Windows users and vice versa for cross-platform access using user mapping and force group.
- Created CIFS shares pointing to various subdirectory levels as requested and managed permissions on them.
- Coordinated with storage vendors for storage upgrades (Enginuity code upgrades on VMAX arrays, OneFS updates on NAS fabric switches).
- Migrated 10 petabytes of data using StorageX and 7MTT tools in a NetApp NAS environment from 7-Mode to C-Mode for over 500 applications with remediation.
- Experienced with Auto Provisioning in VMAX using SYMCLI and Symmetrix Management Console (SMC) and Virtual Provisioning for effective utilization of storage by the host as per the requirement.
- Experienced in managing Cisco switches using Cisco CLI, Fabric Manager and Device Manager, and Brocade switches using CLI and Web Tools. Hands-on experience in zoning, creating devices, meta creation, binding devices, mapping and masking.
- Collaborated closely with server, database and application data owners.
- Tracked and managed service requests, enterprise storage usage and performance.
- Worked closely with Storage/SAN/NAS and Backup Systems clients, providing proactive feedback for effective and efficient operation.
- Hands-on expertise in data migration using native Windows (robocopy) and Unix (rsync) tools.
- Performed operational tasks such as storage provisioning, decommissioning, documentation, troubleshooting, and request and problem management.
- Conducted general maintenance and administration of SAN fabric and storage arrays under specific, time-sensitive deadlines.
- Followed standardized procedures as outlined in internal documents. Communicated periodically with users on status, including changes, problems, enhancements, and potential impacts.
- Assisted team members in configuring SMTP settings to send email alerts.
- Worked on Data Domain for backup and exporting backup volumes.
- Participated in service restoration calls for supported managed SAN fabric and storage arrays.
- Configured backup schedules on a daily/weekly basis, as well as tape backup using NDMP in Cluster-Mode 8.3.
- Performed cleanup activities after the migration was completed, verifying that all the physical disks were balanced.
- Managed SAN/NAS storage and backup infrastructure effectively and efficiently on a daily basis.
ENVIRONMENT: Symmetrix VMAX-1, VMAX40K, VNX 7500, VNX5300, V-Block, NetApp Data ONTAP 8 mode & 7-Mode, NetApp FAS2240-4, FAS3040, FAS3140, FAS3170, FAS3240, FAS6240, FAS8040, Cisco Fabric Manager, Cisco Device Manager, Performance Manager, Storage Scope, Symmetrix Optimizer, PowerPath, Navisphere Manager, Navisphere Analyzer, Solutions Enabler (SYMCLI), Windows 2003/W2K, Solaris 8, 9, 10, P4000, Fabric/Switch: Brocade DCX/48k.
Confidential
System Administrator
Responsibilities:
- Administer and maintain the Windows 2003 and 2008 Active Directory Infrastructure for Production.
- Migrated multiple application and print servers, including data, shares and printers, from Windows 2003 to Windows 2008.
- Created multiple Device Groups depending on the application and requirements of the environment.
- Provide user account administration for the distributed server environment and infrastructure Applications.
- Knowledge of installing and configuring ESX 3.0/3.5 servers, configuring DRS and HA in vSphere, Fault Tolerance, and migrating virtual machines using vMotion.
- Performed Storage vMotion, installed VirtualCenter and managed ESX hosts through VC.
- Deployed VMs with clones and templates and hot-added devices to virtual machines.
- Provided high availability for VMs.
- Responsible for the general operation of the distributed server environment, including performance, reliability and efficient use of network resources.
- In charge of overseeing the systems; allocated storage on EMC DMX-3, DMX-4, DMX 3000/2000 and CX600/700 arrays.
- Data replication (BCV, SRDF): used array-based migration techniques, i.e. SRDF/Open Replicator, to migrate data from old DMX 3000/DMX 2000 to DMX-4/DMX-3 storage systems in UNIX, Windows, Linux and AIX environments for online/offline data migration.
- Implemented Business Continuance features like EMC TimeFinder in Symmetrix/DMX arrays.
- Worked on NetApp SnapMirror, FlexVol, Snapshots, NetApp FilerView, NetApp Management Console, implementation of aggregates, FAS 3200, V-Series 3200.
- Analyzed and maintained performance data to ensure optimal usage of the storage resources available.
- Created RAID groups and storage groups and bound the CLARiiON LUNs to the hosts using Navisphere Manager and NaviCLI.
- Created larger LUNs (metas) to support the application needs using SYMCLI.
- Planned and configured file systems with CIFS and NFS protocols and implemented them in a multiprotocol environment.
- Configured CIFS servers and VDMs for a Windows-only environment. Protected through DPM 2006, 2007, SP1 2012.
ENVIRONMENT: Symmetrix DMX 3000, VNX, CLARiiON CX3-80, CX3-20, CX3-10c, CX700, CX300, CX500, NetApp FAS 270, 960 and 3040 series, Brocade 5300 and 4800.