Hadoop Administrator Resume
Dallas, TX
PROFESSIONAL SUMMARY
- Around 7 years of professional IT experience, including 3 years in Big Data ecosystem technologies.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
- Hands-on development and implementation experience in a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, Apache Kite and other Hadoop ecosystem tools as data storage and retrieval systems.
- Setting up monitoring infrastructure for Hadoop Cluster using Nagios and Ganglia.
- Performed importing and exporting data into HDFS and Hive using Sqoop.
- Experience in managing and reviewing Hadoop log files.
- Excellent technical and analytical skills with clear understanding of ETL design and project architecture based on reporting requirements.
- Experience in analyzing data using HiveQL, Pig Latin, HBase and custom Map Reduce programs in Java.
- Extending Hive and Pig core functionality by writing UDFs.
- Experience in designing both time driven and data driven automated workflows using Oozie.
- Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Pig, Flume, Sentry and ZooKeeper.
- Familiar with major Hadoop distributions: Cloudera CDH, Apache Hadoop and Hortonworks.
- Utilized the Apache Hadoop environment provided by Hortonworks.
- Experience in installing, configuring, supporting and managing Cloudera’s Hadoop platform along with CDH 4 and CDH 5 clusters.
- Expert level understanding of Hadoop technologies, integration and troubleshooting.
- Providing support to data analysts in running Pig and Hive queries.
- Writing shell scripts to dump the Shared Data from MySQL servers to HDFS.
- Good understanding of the NoSQL databases like MongoDB, Cassandra.
- Implemented configuration management system using Puppet
- Refactored several site-specific Puppet manifests into reusable Puppet modules.
- Configuration management and deployment with Chef.
- Experience in designing, installing and configuring VMware ESXi within a vSphere 5 environment with Virtual Center management, Consolidated Backup, DRS, HA, vMotion and VMware Data.
- Familiar with Java virtual machine (JVM) and multi-threaded processing
- Hands on experience in Agile and Scrum methodologies.
- Extensive experience in working with the Customers to gather required information to analyze, provide data fix or code fix for technical problems, and providing Technical Solution documents for the users.
- Deep knowledge of the Java programming language and exposure to other JVM languages and technologies, especially Scala, Akka, and Storm.
- Excellent coding skills in Java and Python, and familiar with code controls.
- Installing and configuring network security components such as SSL certificates.
- Managed small projects interfacing with key stakeholders to update servers and operating systems, security vulnerabilities, patching, passwords, setup and installations and ad hoc projects.
- Worked with network security technologies such as VPN, IKE, IPsec, firewalls and SSL certificates.
TECHNICAL SKILLS
Big Data Ecosystem: HDFS, HBase, Hadoop MapReduce, ZooKeeper, Hive, Pig, Sqoop, Flume, Oozie, Hue, Nagios, Whirr, Apache Spark, Mahout, Sentry, Impala, Sun Grid Engine, Puppet, Chef
RDBMS/ Database: SQL Server 2000/2005/2008 R2, MS-Access XP/2007/2008, ORACLE 10g/9i, MySQL, NoSQL: MongoDB, HBase, Cassandra
Scripting Languages: UNIX shell scripting, JavaScript, Python, SQL, Pig Latin
Networking: HTTP, SMTP, NFS, FTP, DNS, DHCP, CIFS, TCP/IP
Operating Systems: Unix, Linux, AIX, Windows XP, Windows Server 2000/2003/2008.
PROFESSIONAL EXPERIENCE
Confidential, Dallas, TX
Hadoop Administrator
Environment: Hadoop, Map Reduce, Hive, Oozie, Sqoop, Flume, Cloudera Manager, Shell Script, Hortonworks.
Responsibilities:
- Performed a major upgrade of the cluster from CDH3u6 to CDH4.2.0.
- Implemented NameNode High Availability on the Hadoop cluster to eliminate the single point of failure.
- Installed Cloudera Manager on an already existing Hadoop cluster.
- Involved in efficiently collecting and aggregating large amounts of streaming log data into Hadoop Cluster using Apache Flume.
- Responsible for ongoing maintenance, expansion and improvement of a cross-regional ESX infrastructure, supporting over 400 Virtual Servers, as well as offshore Desktops systems.
- Worked closely and coordinated efforts with the storage team to analyze performance data, and used it to plan and deploy methods of maximizing performance and reliability.
- Studied user behavior and patterns by analyzing the data stored in HDFS using Hive.
- Rewrote existing SQL queries as Hive queries in HiveQL.
- Responsible for designing and implementing ETL processes to load data from different sources, perform data mining, and analyze data using visualization/reporting tools to leverage the performance of OpenStack.
- Exported the analyzed data, mined from large data volumes, to MySQL using Sqoop.
- Developed custom MapReduce programs and custom User Defined Functions (UDFs) in Hive to transform the large volumes of data with respect to business requirement.
- Involved in installing and configuring Kerberos to implement security to the Hadoop cluster and providing authentication for users.
- Worked on installation of DataStax Cassandra cluster.
- Installing and monitoring the Hadoop cluster resources using Ganglia and Nagios.
- Worked with Big Data analysts, designers and scientists in troubleshooting MapReduce job failures and issues with Hive, Pig, Flume, Apache Spark and Sentry.
- Managed extraction, loading and transformation of data into and out of Hadoop using Hive and Impala.
- Rebuilt the existing Nagios system to integrate tightly with Puppet for automatic monitoring configuration.
- Collaborated closely with engineers to track down various issues causing segfaults and performance degradations in our environment.
- Instituted a change management process using Git and Puppet, and added validation and automated problem reporting.
- Used Apache Kafka, a fast, scalable, fault-tolerant system, to absorb incoming load before the data was analyzed.
- Administered Hue, the open-source web-based interface for interacting with Hadoop services.
- Utilized Apache Spark for Interactive Data Mining and Data Processing.
Confidential, NC
Hadoop Administrator
Environment: MapReduce, HDFS, Pig, Hive, HBase, Flume, Sqoop, Oozie, ZooKeeper, Nagios, Ganglia, Tableau and Cloudera Manager.
Responsibilities:
- Worked on setting up the Hadoop cluster for the dev, test and prod environments.
- Worked on pulling data from Oracle databases into the Hadoop cluster using Sqoop import.
- Worked with Flume to import log data from the reaper logs and syslogs into the Hadoop cluster.
- Monitored disk, memory, heap and CPU utilization on all master and slave machines using Cloudera Manager.
- Worked on High Availability for the NameNode using Cloudera Manager to avoid a single point of failure.
- Configured Fair Scheduler to provide fair resources to all the applications across the cluster.
- Worked with application teams to install Hadoop updates, patches, version upgrades as required.
- Managed and reviewed data backups and log files.
- Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured that they were able to read data from HDFS without any issues.
- Implemented Nagios and integrated it with Puppet for automatic monitoring of servers known to Puppet.
- Effectively used Sqoop to transfer data from databases (MySQL, Oracle) to HDFS and Hive.
- Created Hive Managed and External tables defined with static and dynamic partitions.
- Used Ganglia to monitor the cluster around the clock.
- Supported Data Analysts in running Map Reduce Programs.
- Contributed to the creation and maintenance of system documentation.
- Deployed Hadoop on AWS EC2 with Whirr.
- Created authenticated URLs with time-bounded validity using Amazon S3.
- Utilized Apache Solr to replace the core content, boost features and improve its performance.
Confidential, Sunnyvale, CA
Linux Administrator
Environment: Linux (Red Hat Enterprise, CentOS), Windows 2000/NT, HP, IBM, Solaris, Oracle 8i, Cisco routers/switches, Dell 6400, 1250, Sun E450, E250.
Responsibilities:
- Installation and configuration of Red Hat Linux, Solaris, Fedora and CentOS on new server builds as well as during upgrades.
- Performed log management, including monitoring and cleaning up old log files.
- Produced system audit reports covering the number of logins, successes and failures, and running cron jobs.
- Monitored system performance on an hourly or daily basis.
- Remotely copying files using sftp, ftp, scp, WinSCP and FileZilla.
- Resolved security access requests via Peregrine ServiceCenter to provide the requested user access.
- Created user roles and groups for securing resources using local operating system authentication.
- Experienced in tasks like managing User Accounts and Groups, managing Disks and File systems.
- Installed and configured Intrusion Detection Systems (IDS) such as Tripwire, Snort and LIDS.
- Configuring & monitoring DHCP server.
- Taking backups using tar and recovering data after data loss.
- Experience in writing bash scripts for job automation.
- Problem determination, Security, Shell Scripting.
- Documenting the installation of third-party software.
- Configuring printers on the Solaris and Linux servers and installing third-party software.
- Maintaining relations with project managers, DBAs, developers, application support teams and operational support teams to facilitate effective project deployment.
- Managed system installation, troubleshooting, maintenance, performance tuning, storage resources and network configuration to fit application and database requirements.
- Responsible for modifying and optimizing backup schedules and developing shell scripts for it.
- Performed regular installation of patches using RPM and YUM.
- Troubleshooting Linux network and security-related issues, and capturing packets, using tools such as iptables, firewall, TCP Wrappers and Nmap.
- Maintained LVM, VxVM and SVM filesystems along with NFS.
- Designed and built a platform on AWS Elastic MapReduce and S3/HDFS storage with HBase and Apache Mahout.
Confidential
System Administrator
Responsibilities:
- Installation and configuration of Red Hat Linux, Solaris, Fedora and CentOS on new server builds as well as during upgrades.
- Installing Red Hat Linux using Kickstart and applying security policies for hardening the server based on the company policies.
- Performed log management, including monitoring and cleaning up old log files.
- Produced system audit reports covering the number of logins, successes and failures, and running cron jobs.
- Monitored system performance on an hourly or daily basis.
- Remotely copying files using sftp, ftp, scp, WinSCP and FileZilla.
- Created user roles and groups for securing resources using local operating system authentication.
- Experienced in tasks like managing user accounts and groups, and managing disks and filesystems.
- Installed and configured Intrusion Detection Systems (IDS) such as Tripwire, Snort and LIDS.
- Configuring & monitoring DHCP server.
- Taking backups using tar and recovering data after data loss.
- Experience in writing bash scripts for job automation.
- Documenting the installation of third-party software.
- Configuring printers on the Solaris and Linux servers and installing third-party software.
- Maintaining relations with project managers, DBAs, developers, application support teams and operational support teams to facilitate effective project deployment.
- Managed system installation, troubleshooting, maintenance, performance tuning, storage resources and network configuration to fit application and database requirements.
- Responsible for modifying and optimizing backup schedules and developing shell scripts for it.
- Performed regular installation of patches using RPM and YUM.
- Maintained LVM, VxVM and SVM filesystems along with NFS.
