Hadoop Administrator Resume
NC
SUMMARY:
- Over 6 years of experience in systems administration across various industries, including hands-on experience with Big Data ecosystem technologies.
- Experience working with MapReduce on Apache Hadoop to process Big Data.
- Strong knowledge of HDFS architecture and the MapReduce framework.
- Experience installing, configuring, supporting, and monitoring Hadoop clusters using the Apache and Cloudera distributions.
- Experience deploying Hadoop clusters in public and private cloud environments such as Amazon AWS and OpenStack.
- Experience installing Hadoop clusters using different distributions: Apache Hadoop, Cloudera, and Hortonworks.
- Worked on installing, configuring, and maintaining HBase; also used Pig, Hive, Sqoop, and Cloudera Manager.
- Worked with Sqoop to import and export data between databases such as MySQL and Oracle and HDFS/Hive (see the import sketch after this summary).
- Defined job flows in the Hadoop environment using tools such as Oozie for data scrubbing and processing.
- Experience configuring ZooKeeper to provide cluster coordination services.
- Loaded logs from multiple sources directly into HDFS using tools such as Flume.
- Good experience in performing minor and major upgrades.
- Worked on disaster recovery for the Hadoop cluster.
- Deployed and managed multi-node development, testing, and production environments; used Puppet for application deployment.
- Familiar with commissioning, decommissioning, and re-commissioning nodes on a Hadoop cluster.
- Managed Hadoop cluster node connectivity and security.
- Experience with the Nagios and Ganglia monitoring tools.
- Experience in all phases of the data warehouse life cycle: requirements analysis, design, coding, testing, and deployment.
- Excellent interpersonal and communication skills; creative, technically competent, and results-oriented, with problem-solving and leadership skills.
- Experience supporting systems with 24x7 availability and monitoring.
- Strictly follow security compliance requirements, management policies, and the ITIL framework.
- Installed and configured Spark in a multi-node environment.
- Experienced in optimizing HBase running on a multi-node cluster.
- Experienced in installing, administering, and supporting Linux operating systems and hardware in an enterprise environment.
- Ability to handle multiple tasks and work independently as well as in a team under minimal or no supervision.
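A minimal sketch of the kind of Sqoop import described above, assuming a hypothetical MySQL host (dbhost.example.com), database (salesdb), table (orders), and user (etl_user); actual connection details varied by project:

    # Import a MySQL table into HDFS as delimited text files
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/salesdb \
      --username etl_user -P \
      --table orders \
      --target-dir /user/etl/orders \
      --num-mappers 4

    # Or land the same table directly in a Hive table
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/salesdb \
      --username etl_user -P \
      --table orders \
      --hive-import --hive-table orders

-P prompts for the database password interactively, and --num-mappers controls how many parallel map tasks split the import.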
TECHNICAL SKILLS:
Operating Systems: Linux, CentOS, AIX, Sun Solaris, HP-UX, Windows 95/98/NT/2000/XP/2003
Languages: C, C++, Java, SQL, PL-SQL, HTML, JavaScript
Scripting: UNIX scripting (ksh), Perl, WLST, VBScript, JavaScript
Databases: Oracle 9.x/10g/11g, MS SQL Server 2000/2005/2008, DB2, VSAM
Big Data: Apache Hadoop, Cloudera, Hortonworks
Version Control: PVCS, CVS, VSS
Technologies: JSP, XML/XSL, SOAP, ASP, HTML/DHTML, SNA/COMTI, Crystal Reports, Actuate Reporting.
Integration Tech.: ALSB, Business Works, MQ Series 5.x.
Tools: Axway gateway Interchange, Axway CSOS and Axway Secure Transport, MSOffice, Wily, JProbe, Clear Quest, Clear Case.
Networking & Protocols: TCP/IP, Telnet, HTTP, HTTPS, FTP, SNMP, LDAP, DNS.
Application Servers: IBM WebSphere (WAS), WebLogic 11g
PROFESSIONAL EXPERIENCE:
Confidential, NC
Hadoop Administrator
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, the HBase database, and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented a ten-node CDH3 Hadoop cluster on Ubuntu Linux.
- Involved in loading data from the Linux file system into HDFS.
- Worked on cluster installation, DataNode commissioning and decommissioning, NameNode recovery, capacity planning, and slot configuration (see the decommissioning sketch after this section).
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Monitored the health of all processes related to NameNode HA, HDFS, YARN, Pig, Hive, HBase, Spark, and Tez using Cloudera Manager.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Responsible for managing data coming from different sources.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Experience in managing and reviewing Hadoop log files.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team (see the export sketch after this section).
- Migrated the Hadoop cluster from CDH 3.x.x to CDH 4.x.x.
- Involved in installing Cloudera Manager, Hadoop, ZooKeeper, HBase, Hive, Pig, etc.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
Environment: Apache Hadoop, Spark, HDFS, Cloudera Manager, MapReduce, Hive, HBase, Pig, YARN, Sqoop, Oozie, ZooKeeper, Tez, SQL, and Java.
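A sketch of the DataNode decommissioning flow referenced above, assuming dfs.hosts.exclude in hdfs-site.xml points at the exclude file shown; the hostname and path are hypothetical:

    # List the node to be retired in the exclude file
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Ask the NameNode to re-read its include/exclude lists; the node
    # moves to "Decommission In Progress" while its blocks re-replicate
    hdfs dfsadmin -refreshNodes

    # Watch until the node reports "Decommissioned", then power it off
    hdfs dfsadmin -report | grep -A 2 "datanode07"

Commissioning a node is the reverse: remove it from the exclude file, add it to the include file if one is used, and run -refreshNodes again.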
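The Sqoop export mentioned above followed this general shape, with a hypothetical reporting database and table; Hive's default field delimiter (\001) is assumed for the warehouse files:

    # Push analyzed results from HDFS back to a relational table for BI
    sqoop export \
      --connect jdbc:mysql://dbhost.example.com:3306/reports \
      --username bi_user -P \
      --table daily_metrics \
      --export-dir /user/hive/warehouse/daily_metrics \
      --input-fields-terminated-by '\001'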
Confidential
On-Campus Job, Graduate Asst., Hadoop Lab
Responsibilities:
- Monitored disk, memory, heap, and CPU utilization on all master and slave machines using Cloudera Manager and took the necessary measures to keep the cluster up and running on a 24/7 basis.
- Monitored all MapReduce write jobs running on the cluster using Cloudera Manager and ensured that they wrote data to HDFS without issues and that data was evenly distributed over the cluster.
- Installed, upgraded, and managed Hadoop clusters on the Cloudera and Hortonworks distributions.
- Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured that they read data from HDFS without any issues.
- Provided statistics for all successfully completed jobs in a detailed report format.
- Provided statistics for all failed jobs in a detailed report format and worked on finding root causes and resolutions, e.g., job failures due to disk errors, node issues, etc.
- Viewed the performance of the map and reduce tasks that make up a job using Cloudera Manager.
- Involved in adding new nodes to the cluster and decommissioning affected nodes from it.
- Interacted with developers to deploy new jobs and to resolve jobs throwing exceptions and data-related issues on the production cluster.
- Fine-tuned the Hadoop cluster by using compression for input and output data (see the compression sketch after this section).
- Fine-tuned the Hadoop cluster by setting the proper number of map and reduce slots for the TaskTrackers.
- Fine-tuned the JobTracker by changing a few properties in mapred-site.xml.
- Fine-tuned shuffle, merge, sort, JVM, and memory parameters, improving overall cluster performance.
- Involved in configuring quorum-based HA for the NameNode, making the cluster more resilient (see the HA sketch after this section).
- Involved in configuring Service Level Authorization (SLA) to ensure that Hadoop users have the proper permissions.
- Involved in configuring job authorization with ACLs.
- Integrated Kerberos into Hadoop to secure the cluster against unauthorized users.
- Configured user authentication for accessing the web UI.
Environment: HDFS, YARN, Pig, Hive, Spark, Tez, Cloudera Manager, Hadoop, ZooKeeper
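A sketch of the per-job compression settings referenced above, using the Hadoop 2 property names; the jar, driver class, and paths are hypothetical, and the -D overrides assume the driver uses ToolRunner (cluster-wide defaults went into mapred-site.xml instead):

    # Compress intermediate map output and final job output with Snappy
    hadoop jar analytics-job.jar com.example.JobDriver \
      -D mapreduce.map.output.compress=true \
      -D mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
      -D mapreduce.output.fileoutputformat.compress=true \
      -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
      /data/input /data/output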
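For the quorum-based NameNode HA work above, these are the standard health and setup checks, assuming NameNode IDs nn1 and nn2 as defined in hdfs-site.xml:

    # Confirm which NameNode is active and which is standby
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # One-time step when first enabling automatic failover:
    # initialize the failover controller's znode in ZooKeeper
    hdfs zkfc -formatZK

    # Manually fail over from nn1 to nn2 (e.g., before maintenance)
    hdfs haadmin -failover nn1 nn2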
Confidential
Support Hadoop Administrator
Responsibilities:
- Experience in managing scalable Hadoop cluster environments.
- Involved in managing, administering and monitoring clusters in Hadoop Infrastructure.
- Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades when required.
- Experience in HDFS maintenance and administration.
- Managing Hadoop cluster node connectivity and security.
- Experience in commissioning and decommissioning of nodes from cluster.
- Experience in NameNode HA implementation.
- Working with data delivery teams to set up new Hadoop users.
- Installed the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs.
- Configured the Metastore for the Hadoop ecosystem and management tools.
- Hands-on experience in Nagios and Ganglia monitoring tools.
- Experience in HDFS data storage and support for running MapReduce jobs.
- Performing tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- Installing and configuring Hadoop ecosystem components such as Sqoop, Pig, Flume, and Hive.
- Maintaining and monitoring clusters; loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Experience using DistCp to migrate data within and across clusters (see the DistCp sketch after this section).
- Installed and configured ZooKeeper.
- Hands-on experience analyzing log files for Hadoop ecosystem services.
- Coordinated root-cause analysis efforts to minimize future system issues.
- Troubleshot hardware issues and worked closely with various vendors on hardware, OS, and Hadoop issues.
Environment: Cloudera 4.2, Hive, Pig, Sqoop, HBase, Tableau, MicroStrategy, shell scripting, Red Hat Linux.
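A sketch of the DistCp usage mentioned above; the NameNode URIs and paths are hypothetical:

    # Copy a dataset across clusters, preserving block size (b) and
    # replication (r); -update copies only files that differ at the target
    hadoop distcp -pbr -update \
      hdfs://source-nn:8020/data/events \
      hdfs://target-nn:8020/data/events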
Confidential
System Administrator
Responsibilities:
- Installed and maintained the Linux servers.
- Responsible for managing Red Hat Linux servers and workstations.
- Created, modified, disabled, and deleted UNIX user accounts and email accounts per the FGI standard process (see the account-management sketch after this section).
- Installed and supported local area networks (LANs), wide area networks (WANs), network segments, and intranets.
- Quickly arranged hardware repairs in the event of hardware failure.
- Performed patch management, with patch updates on a quarterly basis.
- Set up security for users and groups as well as firewall and intrusion detection systems.
- Added, deleted, and modified UNIX groups using standard processes; reset user passwords and locked/unlocked user accounts.
- Effectively managed hosts and automount maps in NIS, DNS, and Nagios.
- Monitored system metrics and logs for any problems.
- Managed security by providing and restricting login and sudo access on business-specific and infrastructure servers and workstations.
- Ran cron jobs to back up data and troubleshot hardware/OS issues (see the cron sketch after this section).
- Involved in adding, removing, and updating user account information, resetting passwords, etc.
- Maintained the RDBMS server and granted database access to required users.
- Handled and debugged escalations from the L1 team.
- Took backups at regular intervals and maintained a disaster recovery plan.
- Corresponded with customers to suggest changes and configurations for their servers.
- Maintained server, network, and support documentation including application diagrams.
Environment: Oracle, Shell, PL/SQL, DNS, TCP/IP, Apache Tomcat, HTML and UNIX/Linux.
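A sketch of the routine account-management commands behind the UNIX user and group work above; the user and group names are hypothetical:

    useradd -m -g appusers -c "ETL service account" etl01   # create user with home dir
    usermod -aG wheel etl01                                 # append a secondary group
    passwd etl01                                            # reset the password
    usermod -L etl01                                        # lock the account
    usermod -U etl01                                        # unlock the account
    userdel -r etl01                                        # delete user and home dir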
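The cron-driven backups mentioned above looked roughly like this crontab entry, with hypothetical paths; note that % must be escaped in crontab:

    # m h dom mon dow  command: nightly 01:30 tarball of /etc with a dated name
    30 1 * * * /usr/bin/tar -czf /backup/etc-$(date +\%F).tar.gz /etc >> /var/log/etc-backup.log 2>&1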