Sr. Hadoop Admin Resume
Greensboro, NC
SUMMARY
- 8+ years of professional IT experience, including experience with Big Data ecosystem technologies.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Pig, Hive, Sqoop, Oozie, HA, HBase, YARN, and the MapReduce programming paradigm.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as HDFS, MapReduce, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, and Flume.
- Well versed in installing, configuring, managing, and supporting Hadoop clusters using distributions such as Apache Hadoop, Cloudera CDH, and Hortonworks HDP.
- Experience in managing and reviewing Hadoop log files.
- Experienced in building and supporting large-scale Hadoop environments, including design, configuration, installation, performance tuning, and monitoring.
- Coordinated with technical teams on the installation of Hadoop and related third-party applications.
- Good knowledge of Hadoop, Pig, Hive, Sqoop, and YARN; designed and implemented MapReduce jobs to support distributed processing of large data sets on the Hadoop cluster.
- Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems and application architecture.
- Handled importing data from various sources, performed transformations using Hive and Pig, and loaded data into HDFS and HBase.
- Experience securing Hadoop clusters with Kerberos.
- Familiar with data architecture, including data ingestion pipeline design, Hadoop information architecture, data modeling, data mining, machine learning, and advanced data processing; experienced in optimizing ETL workflows.
- Hands-on programming experience in technologies such as Java, J2EE, HTML, and XML.
- Loaded data from different sources (Teradata and DB2) into HDFS using Sqoop and into partitioned Hive tables (a sketch of a typical import appears at the end of this summary).
- Experience in Hadoop cluster maintenance, including data and metadata backups, file system checks, commissioning and decommissioning of nodes, and upgrades.
- Conducted detailed analysis of system and application architecture components as per functional requirements
- Hands-on experience with open-source monitoring tools, including Nagios and Ganglia.
- Good knowledge of NoSQL databases such as Cassandra and MongoDB.
- Monitored and managed Linux servers (hardware profiles, resource usage, service status, etc.), including server backup and restore, server status reporting, user account management, password policies, and file permissions.
- Techno-functional responsibilities included interfacing with users, identifying functional and technical gaps, producing estimates, designing custom solutions, development, leading developers, producing documentation, and production support.
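A minimal sketch of the Sqoop import into a partitioned Hive table referenced above; the DB2 connection string, credentials, table, and partition value are hypothetical placeholders, not details from any actual engagement.

```bash
# Import one day's data from a DB2 source table into a date-partitioned Hive table.
# Host, database, table, and partition value are illustrative.
sqoop import \
  --connect jdbc:db2://db2host.example.com:50000/SALESDB \
  --username etl_user -P \
  --table DAILY_TXN \
  --hive-import \
  --hive-table sales.daily_txn \
  --hive-partition-key load_date \
  --hive-partition-value 2016-03-01 \
  --num-mappers 4
```

In practice a wrapper script would pass the partition value for the day being loaded.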
TECHNICAL SKILLS:
Big Data Technologies: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, Zookeeper, YARN, Hadoop HA
Hadoop Distributions: Apache Hadoop, Cloudera CDH, Hortonworks HDP
Scripting Languages: JavaScript, Shell, Python
Programming Languages: Java, C++, C, SQL, PL/SQL
Web Services: SOAP (JAX-WS), WSDL, SOA, RESTful (JAX-RS), JMS
Application Servers: Apache Tomcat, WebLogic Server, WebSphere, JBoss
Databases: Oracle 9.x/10g/11g, MS SQL Server, MySQL Server, DB2, HBase, MongoDB, Cassandra
Version Control: PVCS, CVS, VSS
Networking & Protocols: TCP/IP, Telnet, HTTP, HTTPS, FTP, SNMP, LDAP, DNS.
Operating Systems: Linux, UNIX, Mac OS, Windows NT/98/2000/XP/Vista
PROFESSIONAL EXPERIENCE
Sr. Hadoop Admin
Confidential, Greensboro, NC
Responsibilities:
- Installed, configured, and maintained Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Worked extensively with Cloudera Distribution of Hadoop (CDH 5.x and CDH 4.x).
- Extensively involved in cluster capacity planning, hardware planning, installation, and performance tuning of the Hadoop cluster.
- Worked on installing cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Configured property files such as core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and hadoop-env.sh based on job requirements.
- Worked on Hue interface for querying the data.
- Automated system tasks using Puppet.
- Collected log data from web servers and ingested it into HDFS using Flume.
- Provided timely and reliable support for all production and development environments: deployment, upgrades, operations, and troubleshooting.
- Implemented both major and minor version upgrades to the existing cluster, as well as rollbacks to the previous version.
- Implemented a major version upgrade from 4.7.1 to 5.2.3 and several minor version upgrades (5.2.5 to 5.3.3 and 5.3.3 to 5.4.3).
- Created Hive tables to store the processed results in a tabular format.
- Utilized cluster co-ordination services through ZooKeeper.
- Designed, implemented and managed the Backup and Recovery environment.
- Experienced in Hadoop security requirements and integration with the Kerberos authentication infrastructure: KDC server setup, creating the realm/domain, managing principals, generating a keytab file for each service, and managing keytabs with keytab tools (a sketch appears after this list).
- Configured Sqoop and exported/imported data into HDFS.
- Managed and scheduled jobs on the Hadoop cluster using Oozie.
- Configured NameNode high availability and NameNode federation.
- Used Sqoop to import and export data between RDBMS and HDFS.
- Performance-tuned the Hadoop cluster to improve efficiency.
- Involved in configuring Quorum based HA for NameNode and made the cluster more resilient.
- Integrated Kerberos into Hadoop to harden the cluster and secure it against unauthorized access.
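A minimal sketch of the keytab workflow described above, run on the KDC; the realm, hostname, and keytab paths are illustrative assumptions.

```bash
# Create service principals and export them to a keytab (illustrative realm and host).
kadmin.local -q "addprinc -randkey hdfs/nn01.example.com@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey HTTP/nn01.example.com@EXAMPLE.COM"
kadmin.local -q "xst -k /etc/security/keytabs/hdfs.service.keytab hdfs/nn01.example.com@EXAMPLE.COM HTTP/nn01.example.com@EXAMPLE.COM"

# Verify the keytab contents and confirm the principal can authenticate.
klist -kt /etc/security/keytabs/hdfs.service.keytab
kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/nn01.example.com@EXAMPLE.COM
```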
Environment: CDH 5.4.3 and 4.x, Cloudera Manager CM 5.1.1, Hadoop, HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, Kafka, Red Hat/CentOS 6.5
Hadoop Admin
Confidential, San Leandro, CA
Responsibilities:
- Installed/Configured/Maintained Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Extensively involved in Installation and configuration of Cloudera distribution Hadoop CDH 3.x, CDH 4.x.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions (a sketch appears after this list).
- Installed and configured Hadoop, MapReduce, HDFS (Hadoop Distributed File System), developed multiple MapReduce jobs for data cleaning.
- Involved in setting up a Hadoop cluster across a network of 70 nodes.
- Experienced in loading data from UNIX local file system to HDFS.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Involved in developing new workflow MapReduce jobs using the Oozie framework.
- Collected log data from web servers and ingested it into HDFS using Flume.
- Worked on upgrading cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Used Pig as ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
- Involved in the installation of CDH3 and its upgrade to CDH4.
- Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
- Used Sqoop to import and export data between RDBMS and HDFS.
- Used Hive and created Hive external/internal tables and involved in data loading and writing Hive UDFs.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports.
- Involved in migrating ETL processes from Oracle to Hive to test easier data manipulation.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
- Created Hive External tables and loaded the data in to tables and query data using HQL.
- Created Hive queries to compare raw data with EDW reference tables and perform aggregations.
- Wrote shell scripts to automate rolling day-to-day processes.
- Automated workflows using shell scripts to pull data from various databases into Hadoop.
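A minimal sketch of the kind of daemon health-check script mentioned above; the daemon list, alert address, and alerting mechanism are assumptions, not the original script.

```bash
#!/bin/bash
# Check that expected Hadoop daemons are running on this host and mail an alert if any is missing.
# Requires the JDK's jps on the PATH; daemon names and the alert address are illustrative.
ALERT_TO="hadoop-ops@example.com"
DAEMONS="NameNode DataNode JobTracker TaskTracker"

for daemon in $DAEMONS; do
    if ! jps | grep -qw "$daemon"; then
        echo "$(hostname): $daemon is not running at $(date)" \
            | mail -s "Hadoop daemon alert: $daemon down" "$ALERT_TO"
    fi
done
```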
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH 3.x, CDH 4.x, MongoDB, Cassandra, Oracle, NoSQL, and Unix/Linux
Hadoop Admin
Confidential, Salt Lake City, Utah
Responsibilities:
- Installed/Configured/Maintained Apache Hadoop clusters and Cloudera Distribution Hadoop CDH for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Extensively involved in cluster capacity planning, hardware planning, installation, and performance tuning of the Hadoop cluster.
- Performed installation and configuration of a 90-node Hadoop cluster with the Cloudera distribution (CDH3).
- Installed NameNode, Secondary NameNode, JobTracker, DataNode, and TaskTracker.
- Performed benchmarking and analysis using TestDFSIO and TeraSort.
- Implemented commissioning and decommissioning of DataNodes, killed unresponsive TaskTrackers, and dealt with blacklisted TaskTrackers.
- Implemented Rack Awareness for data locality optimization.
- Moved data between the MySQL database and HDFS using Sqoop.
- Used Ganglia and Nagios to monitor the cluster around the clock.
- Created a local YUM repository for installing and updating packages.
- Copied data from one cluster to another using DistCp and automated the procedure with shell scripts (a sketch appears after this list).
- Implemented NameNode backup using NFS.
- Performed various configurations, including networking and iptables, resolving hostnames, user accounts and file permissions, HTTP, FTP, and SSH key-based login.
- Worked wif the Linux administration team to prepare and configure the systems to support Hadoop deployment
- Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
- Implemented the Capacity Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Worked on importing and exporting Data into HDFS and HIVE using Sqoop.
- Worked on analyzing data with Hive and Pig.
- Helped in setting up Rack topology in the cluster.
- Helped with day-to-day operational support.
- Performed a minor upgrade from CDH3u4 to CDH3u6.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Implemented Kerberos for authenticating all the services in Hadoop Cluster.
- Deployed Network file system for Name Node Metadata backup.
- Designed and allocated HDFS quotas for multiple groups.
- Configured and deployed the Hive metastore using MySQL and the Thrift server.
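A minimal sketch of the DistCp automation referenced above; the NameNode hosts, HDFS paths, mapper count, and log location are placeholders.

```bash
#!/bin/bash
# Copy yesterday's data set from the source cluster to the backup cluster with DistCp.
# Cluster hostnames, paths, and mapper count are illustrative.
DAY=$(date -d "yesterday" +%Y-%m-%d)
SRC="hdfs://prod-nn.example.com:8020/data/events/${DAY}"
DST="hdfs://dr-nn.example.com:8020/data/events/${DAY}"

hadoop distcp -update -m 20 "$SRC" "$DST" \
    >> /var/log/hadoop/distcp_${DAY}.log 2>&1
```

Run from cron on an edge node, -update skips files already copied, so reruns after a failure are safe.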
Environment: Hadoop distribution (CDH3), HDFS, MapReduce, Hive, Pig, Sqoop, Unix, Red Hat Linux 6.x/7.x, shell scripting, Nagios, Ganglia monitoring, Kerberos
Unix/Linux System Administrator
Confidential, Chicago, IL
Responsibilities:
- Installed, configured and administered RHEL 5/6 on VMware server 3.5.
- Converted many physical servers on Dell R820 hardware into virtual machines for a lab environment.
- Managed file space, created logical volumes, and extended file systems using LVM (a sketch appears after this list).
- Performed daily server maintenance and tuned systems for optimum performance by turning off unwanted peripherals and vulnerable services.
- Managed RPM packages for Linux distributions.
- Monitored system performance using top, free, vmstat, and iostat.
- Set up user and group login IDs, passwords, and ACL file permissions, and assigned user and group quotas.
- Configured networking including TCP/IP and troubleshooting.
- Designed Firewall rules to enable communication between servers.
- Monitored scheduled jobs, workflows, and other day-to-day system administration tasks.
- Responded to tickets through ticketing systems.
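A minimal sketch of the LVM workflow noted above; the device, volume group, and logical volume names are assumptions.

```bash
# Bring a new partition under LVM and grow an existing filesystem online.
# Device and VG/LV names are illustrative.
pvcreate /dev/sdc1                      # initialize the new partition as a physical volume
vgextend vg_data /dev/sdc1              # add it to the existing volume group
lvextend -L +50G /dev/vg_data/lv_app    # grow the logical volume by 50 GB
resize2fs /dev/vg_data/lv_app           # resize the ext3/ext4 filesystem to fill the LV
```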
Environment: Redhat Enterprise Linux 5/6, VMware server 3.5, NIS, NFS, DHCP, and Dell R820
UNIX/LINUX Administrator
Confidential, Madison, WI
Responsibilities:
- Installation and configuration of Red Hat Enterprise Linux (RHEL) 5x, 6x Servers on HP, Dell Hardware and VMware virtual environment
- Installation, setup and configuration of RHEL, CentOS, OEL and VMware ESX on HP, Dell and IBM hardware
- Installation and Configuration of Sun Enterprise Servers, HP and IBM Blade Servers, HP 9000, RS 6000, IBM P series
- Expertise in enterprise class storage including SCSI, RAID and Fiber-Channel technologies
- Configuration and maintenance of the virtual server environment using VMware ESX 5.1/5.5 and vCenter
- Involved in supporting hardware ranging from rack to blade servers, and supported Vblock hardware running the ESX environment
- Configured enterprise UNIX/Linux systems in a heterogeneous environment (Red Hat and SUSE Linux, Solaris, HP-UX) with SAN/NAS infrastructure across multiple sites on mission-critical business systems
- Created a standard kickstart-based installation method for RHEL servers, including all changes required to meet the company's security standards; installation is possible over HTTP or via PXE on a separate network segment (a sketch appears after this list)
- Added SAN storage using multipath and created physical volumes, volume groups, and logical volumes
- Storage provisioning, volume and file system management using LVM, VERITAS Volume Manager, and VERITAS File System (VERITAS Storage Foundation); configured ZFS file systems
- Creating user accounts, user administration, local and global groups on Solaris and Red Hat Linux platform
- Setup, implementation, configuration, and documentation of backup/restore solutions for client disaster/business recovery using TSM backup on UNIX, SUSE, and Red Hat Linux platforms.
- Installed and configured Netscape, Apache web servers and Samba Server
- Installed and Configured WebSphere Application servers on Linux and AIX
- Installation and configuration of MySQL on Linux servers.
- Performed quarterly patching of Linux, AIX, Solaris and HPUX servers on regular schedule
- Heavily utilized the LAMP stack (Linux, Apache, MySQL, PHP/Perl) to meet customer needs
- Installation and configuration of Oracle 11g RAC on Red Hat Linux nodes
- Installed and configured applications such as Apache, Tomcat, JBoss, xrdp, and WebSphere, and worked closely with the respective teams
- Setup, Implementation, Configuration of SFTP/FTP servers
- Set up a JBoss cluster and configured Apache with JBoss on Red Hat Linux; proxy serving with Apache; troubleshooting Apache with JBoss and mod_jk for clients
- Installed Jenkins, created users, and maintained Jenkins to build and deploy Java code developed by developers.
- System performance monitoring and tuning.
- Worked on shell and Perl scripts
- Set up the lab environment with Tomcat/Apache and configured it with an F5 virtual load balancer for a customer application
- Documented all system changes and created Standard Operating Procedures (SOPs) for departmental use; resolved all helpdesk tickets and added/updated asset records using the Remedy Action Request System
- Provided 24x7 on-call production support on a rotating basis.
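A minimal sketch of a kickstart installation served over HTTP and PXE, in the spirit of the setup described above; the web root, TFTP paths, kickstart URL, and boot label are hypothetical.

```bash
# Publish the kickstart file over HTTP and point a PXE boot entry at it.
# Web root, TFTP layout, and URLs are illustrative.
cp rhel6-base.ks /var/www/html/ks/rhel6-base.ks

cat > /var/lib/tftpboot/pxelinux.cfg/default <<'EOF'
default rhel6
prompt 0
label rhel6
  kernel images/rhel6/vmlinuz
  append initrd=images/rhel6/initrd.img ks=http://ks.example.com/ks/rhel6-base.ks ksdevice=eth0
EOF
```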
Environment: Red Hat 4.x/5.x, Solaris 9/10, AIX, HP-UX, VMware ESX 5.0/5.1/5.5, vSphere, HP ProLiant DL 380/580 servers, Dell servers (R series), Windows Server 2008, EMC CLARiiON, NetApp, SCSI, VMware Converter, Apache web server, F5 load balancer, Oracle, MySQL, PHP, DNS, DHCP, Bash, NFS, NAS, Spacewalk, WebSphere, WebLogic, Java, Jenkins, JBoss, Tomcat, Kickstart.
Linux Administrator
Confidential
Responsibilities:
- Performed Red Hat Enterprise Linux, Oracle Enterprise Linux (OEL), Solaris, and Windows Server deployments to build new environments using Kickstart and JumpStart.
- Performed installation, addition, and replacement of resources such as disks, CPUs, memory, and NIC cards; increased swap; and maintained Linux/UNIX and Windows servers.
- Implemented NFS and Samba file servers and Squid caching proxy servers
- Implemented centralized user authentication using OpenLDAP and Active Directory
- Worked with VMware ESX Server configured for Red Hat Enterprise Linux.
- Configured IT hardware: switches, hubs, desktops, and rack servers
- Structured datacenter stacking, racking, and cabling
- Installed, configured, troubleshot, and administered VERITAS Volume Manager and Logical Volume Manager, and managed file systems.
- Monitored system performance, tuned kernel parameters, and added/removed/administered hosts and users
- Created and administered user accounts using native tools and managed access using sudo (a sketch appears after this list).
- Actively participated in and supported the migration of 460+ production servers from the old data center to the new data center
- Involved in using RPM for package management and Patching.
- Creating documentation for datacenter hardware setups, standard operational procedures and security policies
- Create and maintain technical documentation for new installations and systems changes as required
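A minimal sketch of account creation and sudo-based access management as mentioned above; the group, user, and permitted commands are hypothetical.

```bash
# Create a group and user, then grant limited root access through sudo.
# Group, user, and the commands granted are illustrative.
groupadd webops
useradd -m -g webops -c "Web operations" jdoe
passwd jdoe

# visudo edits /etc/sudoers safely (syntax-checked before save); an entry such as:
#   %webops ALL=(root) /sbin/service httpd restart, /sbin/service httpd status
visudo
```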
Environment: RHEL 3/4/5, Solaris 8/9, ESX 3/4, HP DL 180/360/580 G5, IBM p5 series, Fujitsu M4000, EMC Symmetrix DMX 2000/3000, Linux Satellite Server