Hadoop Administrator Resume
Buffalo, NY
SUMMARY:
- Certified, experienced Hadoop administrator with a strong passion for server support and the Hadoop ecosystem. Seeking employment with a company where I can apply my professional experience to address and solve its technical needs.
- 8+ years of IT experience, including 4 years with Hadoop, Ambari, HDFS, MapReduce and the Hadoop ecosystem (Pig, Hive, HBase, Oozie, Sqoop).
- Experience in installation, configuration, backup, recovery, customization and maintenance of clusters using Hortonworks Hadoop.
- Implementing, managing and administering the overall Hadoop infrastructure.
- Experience in installation and configuration of the Hadoop ecosystem on AWS EMR.
- Experience in capacity planning and analysis for Hadoop infrastructure/clusters.
- Good experience importing and exporting data into HDFS and Hive using Sqoop (see the Sqoop sketch after this list).
- Good experience using Nagios and Ganglia.
- Good knowledge of and experience with BI tools such as Cognos, MicroStrategy and Tableau.
- Good knowledge of relational databases such as Oracle, MySQL and Teradata.
- Experience with the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Kylin, NiFi, Oozie and Zookeeper.
- Experience in commissioning, decommissioning, load balancing and managing nodes, and in tuning servers for optimal cluster performance.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Experience in implementing enterprise-level security using AD/LDAP, Kerberos, Ranger, Knox and Sentry.
- Optimizing performance of HBase/Hive/Pig jobs.
- Hands-on experience with Zookeeper and ZKFC in managing and configuring NameNode failover scenarios.
- Experience in using Splunk to load log files into HDFS, and experience with file conversion and compression formats.
- Experience with Hadoop's multiple data processing engines, such as interactive SQL, real-time streaming, data science and batch processing, handling data stored in a single platform on YARN.
- Experience in adding and removing nodes in a Hadoop cluster, and in managing Hadoop clusters with HDP and Cloudera.
- Experience in integrating various data sources such as Oracle, DB2, Sybase, SQL Server and MS Access, as well as non-relational sources such as flat files, into a staging area.
- Experience in data analysis, data cleansing (scrubbing), data validation and verification, data conversion, data migration and data mining.
- Strong experience in Linux/UNIX administration, with expertise in Red Hat Enterprise Linux 4, 5 and 6, and familiarity with Solaris 9 & 10 and IBM AIX 6.
- Installing, upgrading and configuring Linux servers using Kickstart as well as manual installations, including recovery of root passwords.
- Strong experience in system administration, installation, patch upgrades, migration, configuration, troubleshooting, security, backup, disaster recovery, performance monitoring and fine tuning on Linux (RHEL) systems.
- Experience creating and managing user accounts, security and rights, and monitoring disk space and processes in Red Hat Linux.
- Good experience in shell scripting (bash) to automate system administration jobs.
- Utilized industry-standard tools for system management, with emphasis on SSH/SCP/SFTP.
- Implemented Nagios for automatic monitoring of servers.
- User/file management: adding and removing users and granting access rights on a server; changing permissions and ownership of files and directories; assigning special privileges to selected users; and scheduling system-related cron jobs.
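A minimal sketch of the kind of Sqoop import/export referenced in the Sqoop bullet above; the JDBC URL, credentials, table names and Hive database are hypothetical placeholders and would be adjusted for the actual source systems.

    # Import an RDBMS table into a Hive table (hypothetical connection and names)
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --hive-import --hive-database staging --hive-table orders \
      --num-mappers 4

    # Export aggregated results back to the RDBMS (hypothetical table and path)
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /apps/hive/warehouse/staging.db/order_summary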
AREAS OF EXPERTISE:
- Installation, configuration, backup, recovery, customization and maintenance of clusters using Hortonworks Hadoop
- Installation and configuration of the Hadoop ecosystem on AWS EMR
- Capacity planning and analysis for Hadoop infrastructure/clusters
- Commissioning, decommissioning, load balancing and managing nodes (see the node-management sketch after this list)
- Tuning servers for optimal performance of the cluster
- Designing both time-driven and data-driven automated workflows using Oozie
- Using Splunk to load log files into HDFS
- File conversion formats, compression formats
- Linux/UNIX administration in Red Hat Enterprise Linux 4, 5 and 6
- Familiar with Solaris 9 & 10 and IBM AIX 6
- Shell scripting (bash) to automate system administration jobs
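A minimal sketch of the node-management workflow mentioned above (decommissioning, re-commissioning and load balancing); the host name and exclude-file path are hypothetical and depend on the dfs.hosts.exclude setting in the cluster's hdfs-site.xml.

    # Decommission a DataNode: add it to the exclude file, then refresh the NameNode's host lists
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Re-commission: remove the host from the exclude file and refresh again
    sed -i '/datanode07.example.com/d' /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Rebalance block placement after adding or removing nodes (10% threshold)
    hdfs balancer -threshold 10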
TECHNICAL SKILLS:
Languages: Java, Shell, Python, PowerShell
Databases: MySQL, SQL, MongoDB, Teradata, Cassandra, Oracle
Methodologies: Agile, Waterfall
Hadoop ecosystem: Ambari, HDFS, MapReduce, Hive, Pig, Sqoop, HBase, Knox, Ranger, Zookeeper, Kafka, Splunk, Kylin, NiFi, Flume, Oozie, Spark
Operating Systems: RHEL, Linux, Windows, CentOS, Ubuntu, SUSE, Solaris, Mac
Web/App Servers: Apache, Tomcat, TFS, IIS, Nginx
Networks: NIS, NIS+, DNS, DHCP, Telnet, FTP, rlogin
Network Protocols: TCP/IP, PPP, SNMP, SMTP, DNS, NFSv2, NFSv3
Hypervisors: VMware ESXi, Microsoft Azure
PROFESSIONAL EXPERIENCE:
Confidential, Buffalo, NY
Hadoop Administrator
Responsibilities:
- Implemented and configured a High Availability Hadoop cluster.
- Involved in managing and reviewing Hadoop log files.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Used Sqoop to import and export data from HDFS to RDBMS and vice versa.
- Hands-on experience working with Hadoop ecosystem components such as HDFS, MapReduce, YARN, Zookeeper, Pig, Hive, Sqoop and Kafka.
- Worked on setting up high availability for a major production cluster and designed automatic failover control using Zookeeper and quorum journal nodes (see the HA failover sketch after this list).
- Responsible for ongoing administration of Hadoop Infrastructure.
- Effectively used the Oozie workflow engine to run multiple Hive and Pig jobs.
- Involved in configuring Ranger for the authentication of users and the Hadoop daemons.
- Implemented rack-aware topology on the Hadoop cluster.
- Monitored and managed the Hadoop cluster through Nagios.
- Experience in installation and configuration of the Hadoop ecosystem on AWS EMR.
- Experience in using Kafka to stream data into HDFS from various sources.
- Responsible for troubleshooting issues in the execution of Map Reduce jobs by inspecting and reviewing log files.
- Implemented the Spark Streaming framework for real-time data processing.
- Used Spark Streaming to receive real-time data from Kafka and stored the streamed data in HDFS using Scala, as well as in NoSQL databases such as HBase and Cassandra.
- Implemented Kerberos for authenticating all the services in Hadoop Cluster.
- Experience in configuring Zookeeper to coordinate the servers in clusters to maintain the data consistency.
- Worked on the disaster recovery plan for the Hadoop cluster by implementing cluster data backups using AWS storage solutions.
- Created HBase tables to store various data formats of data coming from different portfolios.
- Used Hortonworks for installation and management of Hadoop cluster.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Involved in Commissioning and Decommissioning of nodes depending upon the amount of data.
- Automated the workflow using shell scripts.
- Performance tuning of Hive queries written by other developers.
- Monitored workload, job performance and capacity planning.
Environment: Hadoop, MapReduce, YARN, Pig, Hive, HBase, Oozie, Zookeeper, HDFS, Sqoop, Spark, Kafka, Hortonworks, Linux.
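A minimal sketch of the NameNode HA state checks and manual failover that go along with the automatic failover setup described above; the nn1/nn2 service IDs are hypothetical and are defined by dfs.ha.namenodes in hdfs-site.xml.

    # Check which NameNode is active and which is standby
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # Trigger a controlled failover from nn1 to nn2 (e.g. before maintenance)
    hdfs haadmin -failover nn1 nn2

    # Confirm the ZKFC, JournalNode and NameNode processes are running on the relevant hosts
    jps | egrep 'DFSZKFailoverController|JournalNode|NameNode'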
Confidential, Alpharetta, GA
Hadoop Administrator
Responsibilities:
- Specified the cluster size, allocated resource pools and monitored jobs.
- Configured the Hive setup.
- Exported result sets from one SQL server to another (MySQL) using Sqoop.
- Helped analysts with their Hive queries.
- Helped the team grow the cluster from 25 nodes to 40 nodes; the configuration for the additional data nodes was managed through Serengeti.
- Maintained system integrity of all sub-components across the multiple nodes in the cluster.
- Managed the data exchange between HDFS and different web sources using Flume and Sqoop.
- Involved in user/group management in Hadoop with AD/LDAP integration.
- Monitored cluster health and cleaned up logs when required.
- Performed upgrades and configuration changes.
- Set up data authorization roles for the services through Apache Sentry.
- Commissioned/decommissioned nodes as needed.
- Troubleshot query performance through Cloudera Manager.
- Created Hive tables from JSON data using the Avro data serialization framework.
- Managed resources in a multi-tenancy environment.
- Configured Zookeeper to implement node coordination in clustering support.
- Used snapshots to back up HDFS files and HBase tables through Cloudera Manager (see the snapshot sketch after this list).
- Involved in scheduling snapshots for backups on demand.
- Configured Zookeeper when setting up the HA cluster.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Set up the compression for different volumes in the cluster.
- Developed MapReduce programs for analysis and research, and identified and recommended technical and operational improvements, resulting in improved reliability and efficiency of the cluster.
- Wrote MapReduce jobs for benchmark tests and automated them in a script.
Environment: Hadoop, HDFS, HBase, Pig, Hive, Oozie, MapReduce, Sqoop, Hortonworks, Cassandra, Linux
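A minimal sketch of the snapshot-based backups referenced above; the directory, table and DR cluster names are hypothetical.

    # Allow and take an HDFS snapshot of a critical directory
    hdfs dfsadmin -allowSnapshot /data/critical
    hdfs dfs -createSnapshot /data/critical nightly-$(date +%Y%m%d)

    # Snapshot an HBase table from the HBase shell
    echo "snapshot 'customer_events', 'customer_events-$(date +%Y%m%d)'" | hbase shell

    # Copy the HBase snapshot to a DR cluster
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot customer_events-$(date +%Y%m%d) \
      -copy-to hdfs://drcluster:8020/hbase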
Confidential, Alpharetta, GA
Hadoop Administrator
Responsibilities:
- Responsible for loading customer data and event logs from Kafka into HBase using the REST API.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
- Worked on debugging, performance tuning and analyzing data using the Hadoop components Hive and Pig.
- Implemented generic export framework for moving data from HDFS to RDBMS and vice-versa.
- Worked on installing the cluster, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning and slots configuration.
- Wrote shell scripts to automate rolling day-to-day processes.
- Worked on loading data from the Linux file system into HDFS.
- Used Hortonworks for installation and management of the Hadoop cluster.
- Moved data from Hadoop to Cassandra using the BulkOutputFormat class.
- Imported and exported data into HDFS and Hive using Sqoop.
- Automated jobs that pull data from an FTP server and load it into Hive tables using Oozie workflows (see the FTP-to-Hive sketch after this list).
- Responsible for processing unstructured data using Pig and Hive .
- Developed Pig Latin scripts for extracting data
- Extensively used Pig for data cleansing and wrote Hive queries for the analysts.
- Created Pig script jobs while keeping query optimization overhead minimal.
- Strong experience with Apache server configuration.
Environment: Hadoop, HDFS, HBase, Pig, Hive, Oozie, MapReduce, Sqoop, Hortonworks, Cassandra, Kafka, Linux
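A minimal sketch of the kind of FTP-to-Hive load that the Oozie workflows above automated; the FTP URL, staging paths and table name are hypothetical, and the Hive table is assumed to be partitioned by load date.

    #!/bin/bash
    # Pull the daily extract from the FTP server to a local staging area
    wget -q "ftp://ftp.example.com/exports/orders_$(date +%Y%m%d).csv" -P /tmp/staging/

    # Push the file to HDFS and load it into the Hive staging table
    hdfs dfs -put -f "/tmp/staging/orders_$(date +%Y%m%d).csv" /landing/orders/
    hive -e "LOAD DATA INPATH '/landing/orders/orders_$(date +%Y%m%d).csv'
             INTO TABLE staging.orders PARTITION (load_date='$(date +%Y-%m-%d)')"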
Confidential
Linux Administrator
Responsibilities:
- Provided on-call support on a rotation basis, delivering 24x7x365 support in a 3,000-server environment.
- Administration of RHEL 5, 6 and IBM AIX which includes installation, configuration, testing on both physical and virtual machines.
- Setting up cron schedules for backups and monitoring processes.
- Automated server building using System Imager, PXE, Kickstart and Jumpstart RHEL Servers.
- Installed patches and packages in Linux and AIX environments.
- Set up and configured failover load balancing using IP bonding for network cards.
- Experience with iptables commands to add, append, delete, insert or replace rules within a particular chain, and with the parameters required to construct packet-filtering rules (see the iptables sketch after this list).
- Performed system installs and performance tuning; configured and administered Unix.
- Performed day-to-day LVM operations and system administration tasks.
- Set up and troubleshot issues with Secure Shell in the environment to accommodate script automation and password changes.
- Installed WebSphere Portal Server 5.1/6.0/6.1 and enabled Web Content management.
- Migrated WAS 5.0 Network Deployment and Base on AIX/Windows platforms to WAS 6.0 ND.
- Experience in installing, configuring and administering MQ 5.3/6.0 in AIX and Linux environments.
- Installation and configuration of LPARs with AIX 5.3 on P5 servers; managed LPARs and provided virtual memory management and memory optimization.
- Experienced working with Systems Engineers to implement storage solutions which provide high performance, data protection and cost-effective use of available storage.
- Installation, Configuration and Troubleshooting of Tivoli Storage Manager (TSM) and License Manager (TLM). Upgrade TSM from 5.1.x to 5.3.x.
- Management of all SAN storage systems (Hitachi, EMC, Sun), including capacity planning and performance tuning.
- Configuration of VIO server and VIO clients from the Hardware Management Console.
- Installation and configuration of Red Hat device multipathing.
- Monitored Linux server for CPU Utilization, Memory Utilization, and Disk Utilization for performance monitoring.
Environment: DNS, TCP/IP, DHCP, Linux, Unix, Shell, VxVM
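A minimal sketch of the iptables rule management described above; the subnet, host addresses, rule positions and ports are hypothetical.

    # Append a rule allowing SSH from the admin subnet
    iptables -A INPUT -p tcp -s 10.10.0.0/24 --dport 22 -j ACCEPT

    # Insert a rule at position 1 of the INPUT chain to drop a noisy host
    iptables -I INPUT 1 -s 192.0.2.50 -j DROP

    # Replace rule 3 in the INPUT chain, then delete rule 5
    iptables -R INPUT 3 -p tcp --dport 80 -j ACCEPT
    iptables -D INPUT 5

    # List rules with line numbers and persist the running rule set (RHEL)
    iptables -L INPUT --line-numbers -n
    service iptables save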
Confidential
Linux Administrator
Responsibilities:
- Installed Red Hat Linux using Kickstart.
- Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts.
- Managed routine system backups, scheduled jobs, enabled cron jobs, and enabled system and network logging on servers for maintenance.
- Performed RPM and YUM package installations, patching and other server management.
- Installed and configured Logical Volume Manager (LVM) and RAID (see the LVM sketch after this list).
- Documented all setup procedures and system-related policies (SOPs).
- Provided 24/7 technical support to Production and development environments.
- Administered DHCP, DNS and NFS services in Linux.
- Created and maintained user accounts, profiles, security, rights, disk space and process monitoring.
- Provided technical support by troubleshooting Day-to-Day issues with various Servers on different platforms.
- Diagnosed, solved and provided root-cause analysis for hardware and OS issues.
- Ran prtdiag -v to verify that all memory and boards were online and to check for failures.
- Supported Linux and Sun Solaris Veritas clusters.
- Notified server owners of failovers or crashes and escalated to L3 Unix/Linux Server Support.
- Checked for core files and, when present, sent them to Unix/Linux Server Support for core file analysis.
- Monitored CPU loads, restarted processes and checked file systems.
- Installed, upgraded and applied patches for UNIX, Red Hat Linux and Windows servers in clustered and non-clustered environments.
- Helped install systems using Kickstart.
- Installation and maintenance of Windows 2000 and XP Professional, DNS, DHCP and WINS for the Bear Stearns domain.
- Used LDAP to authenticate users in Apache and other user applications.
- Performed remote administration using Terminal Services, VNC and pcAnywhere.
- Created/removed Windows accounts using Active Directory.
- Reset user passwords on Windows Server 2003 using the dsmod command-line tool.
- Provided end-user technical support for applications.
- Maintained, created and updated documentation.
Environment: DNS, TCP/IP, DHCP, LDAP, Linux, Unix, Shell
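A minimal sketch of the routine LVM work referenced above; the disk device, volume group and logical volume names are hypothetical.

    # Create a physical volume, a volume group, and a 20G logical volume
    pvcreate /dev/sdb1
    vgcreate vg_data /dev/sdb1
    lvcreate -L 20G -n lv_app vg_data

    # Make a filesystem and mount it
    mkfs.ext4 /dev/vg_data/lv_app
    mkdir -p /app && mount /dev/vg_data/lv_app /app

    # Later, grow the volume and resize the filesystem online
    lvextend -L +10G /dev/vg_data/lv_app
    resize2fs /dev/vg_data/lv_app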