Hadoop & Linux Administrator Resume
Boston, MA
SUMMARY
- Over 7 years of professional IT experience, including 3+ years in Hadoop administration and Big Data ecosystem technologies and 4 years in Linux administration.
- Experience in the full lifecycle development process, including planning, design, development, testing, and implementation of moderately to highly complex systems.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, the ResourceManager, and the MapReduce programming paradigm.
- Well-versed in Hadoop distributions such as Cloudera and Hortonworks.
- Experience in large-scale data processing on Amazon EC2 and DigitalOcean clusters.
- Hands-on development and implementation experience on a Big Data Management Platform using HDFS, MapReduce, Hive, Pig, Oozie, and other Hadoop ecosystem components as data storage and retrieval systems.
- Strong knowledge of NameNode High Availability and recovery of NameNode metadata and data residing in the cluster.
- Strong knowledge of Hadoop HDFS architecture, cluster planning, and the MapReduce framework.
- Hands-on experience importing and exporting data between databases such as Oracle and Teradata and HDFS/Hive using Sqoop.
- Experience in managing and reviewing Hadoop log files.
- Worked on disaster recovery for the Hadoop cluster.
- Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Worked with Flume to collect logs from log collectors into HDFS.
- Worked with Chef for automated deployments.
- Experience in configuration and management of security for Hadoop cluster using Kerberos and integration with LDAP/AD at an Enterprise level.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, and Flume.
- Experience configuring ZooKeeper to provide high availability and cluster service coordination.
- Skilled in setting up Apache Knox, Ranger, and Sentry.
- Provided support to data analysts in running Pig and Hive queries.
- Experience in benchmarking, minor and major upgrades, and commissioning and decommissioning of DataNodes on the Hadoop cluster.
- Wrote shell scripts to dump shared data from MySQL servers to HDFS (see the sketch after this list).
- Good knowledge of Java, J2EE, HTML, JSP, Servlets, CSS, JavaScript, and XML.
- Familiar with the Java Virtual Machine (JVM) and multi-threaded processing.
- Hands on experience in Agile and Scrum methodologies.
- Extensive experience working with customers to gather the information needed to analyze issues, providing data or code fixes for technical problems, and delivering technical solution documents for users.
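For illustration, a minimal sketch of the kind of shell-driven Sqoop import referenced above; the host, database, credentials, and paths are hypothetical placeholders, not actual client values:

    #!/bin/bash
    # Nightly dump of a shared MySQL table into HDFS via Sqoop.
    # All connection details below are placeholder assumptions.
    DB_HOST="mysql01.example.com"
    TARGET_DIR="/data/raw/shared/$(date +%Y-%m-%d)"

    sqoop import \
      --connect "jdbc:mysql://${DB_HOST}:3306/shared_db" \
      --username etl_user \
      --password-file /user/etl/.mysql_pass \
      --table shared_data \
      --target-dir "${TARGET_DIR}" \
      --num-mappers 4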
TECHNICAL SKILLS
Big Data Ecosystem: HDFS, HBase, MapReduce, ZooKeeper, Hive, Pig, Sqoop, Flume, Oozie, Spark
RDBMS/Database: SQL Server 2000/2005/2008, MySQL
Scripting Languages: UNIX shell scripting, JavaScript, Python, SQL, Pig Latin
Network: HTTP/HTTPS, TCP/IP, SSH, FTP, Telnet
Security: Kerberos
Operating Systems: Unix, Linux (RHEL, Ubuntu), AIX, Windows XP, Windows Server 2000/2003/2008
PROFESSIONAL EXPERIENCE
Confidential, Boston, MA
Hadoop & Linux Administrator
Environment: Hadoop HDFS, MapReduce, Hortonworks, Hive, Pig, Oozie, Sqoop, HBase
Responsibilities:
- Installed and configured Hortonworks HDP 2.2 (Hadoop 2.6) using Ambari.
- Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, the HBase database, and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Managed and reviewed Hadoop log files and debugged failed jobs.
- Implemented the Kerberos authentication protocol for the production cluster.
- Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with Infrastructure teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Backed up data on a regular basis to a remote cluster using DistCp (see the DistCp sketch after this list).
- Responsible for managing data coming from different sources.
- Involved in data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud.
- Provided cluster coordination services through ZooKeeper.
- Loaded datasets into Hive for ETL operations.
- Automated all jobs that pull data from an FTP server into Hive tables using Oozie workflows.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team (see the Sqoop export sketch after this list).
- Implemented the Fair Scheduler to allocate a fair share of resources to small jobs.
- Assisted the BI team by partitioning and querying the data in Hive.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
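A minimal sketch of the DistCp-based remote backup described above; the cluster NameNode addresses and paths are hypothetical:

    #!/bin/bash
    # Copy warehouse data to a remote DR cluster with DistCp.
    SRC="hdfs://prod-nn:8020/data/warehouse"   # placeholder source
    DST="hdfs://dr-nn:8020/backup/warehouse"   # placeholder target

    # -update copies only changed files; -p preserves file status.
    hadoop distcp -update -p "${SRC}" "${DST}" \
      >> /var/log/hadoop/distcp_backup.log 2>&1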
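And a sketch of the Sqoop export that hands analyzed results back to a relational database for BI reporting; the connection string and table names are assumptions:

    #!/bin/bash
    # Push an analyzed Hive result set from HDFS to MySQL.
    sqoop export \
      --connect jdbc:mysql://bi-db.example.com:3306/reports \
      --username bi_user \
      --password-file /user/etl/.bi_pass \
      --table daily_aggregates \
      --export-dir /user/hive/warehouse/daily_aggregates \
      --input-fields-terminated-by '\001'   # Hive's default field delimiter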
Confidential, Columbus, IN
Hadoop Administrator
Environment: Hadoop, MapReduce, Hive, Oozie, Sqoop, Flume, Cloudera Manager, Shell Script
Responsibilities:
- Implemented NameNode High Availability on the Hadoop cluster to overcome the single point of failure.
- Installed Cloudera Manager on an existing Hadoop cluster.
- Involved in efficiently collecting and aggregating large amounts of streaming log data into the Hadoop cluster using Apache Flume.
- Managed jobs using the Fair Scheduler.
- Created a local YUM repository for installing and updating packages (see the repository sketch after this list).
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Implemented NameNode backup using NFS for high availability.
- Used HiveQL to rewrite existing SQL queries as Hive queries.
- Dumped data from one cluster to another using DistCp and automated the dumping procedure with shell scripts.
- Responsible for ongoing maintenance, expansion, and improvement of a cross-regional ESX infrastructure supporting over 400 virtual servers, as well as offshore desktop systems.
- Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and SSH key-based login.
- Worked closely with the storage team to analyze performance data and used it to plan and deploy methods of maximizing performance and reliability.
- Implemented Rack Awareness for data locality optimization.
- Studied user behavior and usage patterns by analyzing data stored in HDFS using Hive.
- Exported the data mined from huge volumes of data to MySQL using Sqoop.
- Designed and allocated HDFS quotas for multiple groups.
- Involved in installing and configuring Kerberos to secure the Hadoop cluster and provide authentication for users.
- Developed shell scripts to monitor the health of Hadoop daemon services and respond to warnings or failures (see the health-check sketch after this list).
- Worked with big data analysts, designers, and scientists in troubleshooting MapReduce job failures and issues with Hive, Pig, Flume, etc.
- Managed and reviewed Hadoop log files.
- Involved in upgrading the cluster from CDH4 to CDH5, including upgrading Cloudera Manager.
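A sketch of the local YUM repository setup mentioned above; the paths, host, and repository name are hypothetical:

    #!/bin/bash
    # Build a local YUM repository from downloaded RPMs and
    # define a client-side repo file pointing at it.
    REPO_DIR=/var/www/html/localrepo
    mkdir -p "${REPO_DIR}"
    cp /tmp/rpms/*.rpm "${REPO_DIR}/"
    createrepo "${REPO_DIR}"    # generates the repodata/ metadata

    cat > /etc/yum.repos.d/local.repo <<'EOF'
    [localrepo]
    name=Local Repository
    baseurl=http://repohost.example.com/localrepo
    enabled=1
    gpgcheck=0
    EOF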
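And a minimal sketch of the daemon health-check script described above; the daemon list and alert address are assumptions:

    #!/bin/bash
    # Cron-driven health check for core Hadoop daemons; mails ops
    # if a daemon's JVM is not found in the jps process listing.
    ADMIN="hadoop-ops@example.com"   # placeholder address

    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        if ! jps | grep -q "${daemon}"; then
            echo "$(hostname): ${daemon} is DOWN at $(date)" \
                | mail -s "Hadoop daemon alert: ${daemon}" "${ADMIN}"
        fi
    done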
Confidential, Dallas, TX
Linux System Administrator
Environment: Linux, TCP/IP, LVM, RAID, XEN, Networking, Security, RPM, user management.
Responsibilities:
- Built, installed, and configured Sun/HP/Dell servers from scratch with Solaris (8/10) and Linux (Red Hat 4.x/5.x/6.x).
- Performed network-based automated operating system installations using JumpStart for Solaris and Kickstart for RHEL through TPM (Tivoli Provisioning Manager).
- Installed and configured a Sendmail mail server.
- Installed, configured, and maintained virtualization technologies such as VMware.
- Design, development, and implementation of the package and patch management process.
- Troubleshot VMs using Virtual Center and the VMware Infrastructure Client.
- Cloned and troubleshot VMware ESX hosts and guest servers.
- Installed, set up, configured, secured, and maintained servers such as Active Directory, NFS, FTP, Samba, NIS, NIS+, LDAP, DHCP, DNS, SMTP/mail, Apache, and proxy servers in a heterogeneous environment.
- Implemented software and hardware RAID.
- Implemented LVM.
- Scheduled backups with the rsync tool (see the rsync sketch after this list).
- Wrote shell scripts for system maintenance and server automation.
- Created virtual machines in the vSphere Client.
- Configured YUM.
- Monitored system performance and performed kernel tuning to enhance performance.
- Proactive maintenance on systems by timely scheduling of at jobs, batch jobs and cron jobs.
- Worked with NetBackup team to maintain backup on the servers through Veritas NetBackup/SAN.
- Verified successful completion of monthly full backups, daily incremental backups, and weekly cumulative backups, following established procedures.
- Supported 24/7 high-availability production servers.
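A minimal sketch of the rsync backup scheduling mentioned above; the source, destination host, and schedule are hypothetical:

    #!/bin/bash
    # Mirror application data to a backup host nightly via rsync.
    SRC=/data/app                                # placeholder source
    DST=backup01.example.com:/backups/app        # placeholder target

    # -a preserves permissions/timestamps, -z compresses in transit,
    # --delete removes files on the target that vanished at the source.
    rsync -az --delete "${SRC}/" "${DST}/" \
        >> /var/log/rsync_backup.log 2>&1

    # Example crontab entry to run the script daily at 01:30:
    # 30 1 * * * /usr/local/bin/rsync_backup.sh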
Confidential
Linux Administrator
Environment: Redhat 4/5, Solaris 8/9/10, CentOS 4/5, SUSE Linux 10.1/10.3, VMware
Responsibilities:
- Maintained UNIX (Red Hat Enterprise Linux 4/5, CentOS 4/5, VMware) on Sun Enterprise and Dell servers.
- Implemented JumpStart and Kickstart servers to automate server builds for multiple profiles.
- Administered Apache servers with virtual hosting.
- Worked extensively with the vi editor to edit configuration files and write shell scripts.
- Installed and deployed RPM packages.
- Added new users and groups, granted sudo access on test and development servers, and managed central file synchronization via sudoers, authorized_keys, passwd, shadow, and group files (see the sketch after this list).
- Coordinated with application teams on installation, configuration, and troubleshooting of Apache and WebLogic issues on Linux servers.
- Installed and configured SSH, Telnet, FTP, DHCP, and DNS.
- Administered servers, routers, and networks locally and remotely using Telnet and SSH.
- Installed, configured, maintained, and administered network servers (DNS, NFS) and application servers (Apache, Samba).
- Involved in backups, firewall rules, LVM configuration, server monitoring, and on-call support.
- Worked on UNIX shell scripting for systems and applications: automating server tasks, installing and monitoring applications, and handling data-feed file transfers and log files.
- Created Bash shell scripts to automate system maintenance and scheduled them as cron jobs.
- Monitored client disk quotas and disk space usage.
- Worked on backup technologies such as Veritas NetBackup 4.x/5.0/6.x and Tivoli Storage Manager 5.5.
- Worked on and installed Blade Servers.
- Worked with package management tools: RPM, YUM, and YaST.
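A sketch of the user, group, and sudo setup described above; the group and user names are placeholders:

    #!/bin/bash
    # Create a group and user, then grant the group sudo access
    # on a test/development server.
    groupadd devops
    useradd -m -g devops -s /bin/bash jsmith
    passwd jsmith

    # Grant sudo via a drop-in file instead of editing /etc/sudoers
    # directly; visudo -c validates the syntax afterwards.
    echo '%devops ALL=(ALL) ALL' > /etc/sudoers.d/devops
    chmod 0440 /etc/sudoers.d/devops
    visudo -c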
Confidential
Linux Administrator
Environment: Linux (Red Hat Enterprise, CentOS), Windows 2000/NT, HP, IBM, Solaris, Oracle 8i, Cisco routers/switches, Dell 6400, 1250, Sun E450, E250.
Responsibilities:
- Installed and configured Red Hat Linux, Solaris, Fedora, and CentOS on new server builds as well as during upgrades.
- Performed log management, including monitoring and cleaning old log files.
- Administered RHEL 4.x/5.x, including installation, testing, tuning, upgrading, and patching, and troubleshot both physical and virtual server issues.
- Produced system audit reports covering logins (successes and failures) and running cron jobs.
- Monitored system performance on an hourly or daily basis.
- Remotely copied files using sftp, ftp, scp, WinSCP, and FileZilla.
- Created user roles and groups for securing the resources using local operating System authentication.
- Experienced in tasks like managing User Accounts and Groups, managing Disks and File systems.
- Installed Red Hat Linux using Kickstart and applied security policies to harden servers per company policy.
- Installed and configured Intrusion Detection Systems (IDS) such as Tripwire, Snort, and LIDS.
- Configured and monitored the DHCP server.
- Took backups using tar and recovered data after data loss (see the tar sketch after this list).
- Experienced in writing Bash scripts for job automation.
- Documented the installation of third-party software.
- Configured printers on Solaris and Linux servers and installed third-party software.
- Maintained relationships with project managers, DBAs, developers, application support teams, and operational support teams to facilitate effective project deployment.
- Managed system installation, troubleshooting, maintenance, performance tuning, storage resources, and network configuration to fit application and database requirements.
- Responsible for modifying and optimizing backup schedules and developing shell scripts for it.
- Performed regular installation of patches using RPM and YUM.
- Maintained LVM, VxVM and SVM filesystems along with NFS.
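A minimal sketch of the tar backup and restore routine mentioned above; the archived path and destination are hypothetical:

    #!/bin/bash
    # Archive a directory tree with tar; restore it after data loss.
    BACKUP=/backup/etc-$(date +%Y%m%d).tar.gz

    # -c create, -z gzip-compress, -p preserve permissions, -f file
    tar -czpf "${BACKUP}" /etc

    # To restore after a loss, extract back into / :
    # tar -xzpf "${BACKUP}" -C /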