Sr. Hadoop Administrator Resume
Chicago, IL
SUMMARY:
- Over 8 years of professional IT experience, including installation and configuration of Hadoop ecosystem components in existing clusters. Expertise in Big Data technologies as an administrator, with proven capability both in project-based teamwork and as an individual contributor, and good communication skills.
- Experience in working with Hadoop clusters using AWS EMR, Cloudera (CDH5) and HortonWorks Distributions.
- Hands-on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Spark, Kafka, Cassandra, Scala, Pig, Knox, Spark Streaming and Flume.
- Experience in managing backups and Disaster Recovery
- Worked on developing scripts for performing benchmarking with Terasort/Teragen.
- Experience in performing version upgrades from CDH 5.6 to CDH 5.7, HDP 2.3 to HDP 2.4
- Hands-on implementation experience in Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, Apache Kite and other ecosystem components as data storage and retrieval systems.
- Performed importing and exporting data into HDFS and Hive using Sqoop.
- Experience in managing and reviewing Hadoop log files.
- Experience in analyzing clients' Hadoop infrastructure, identifying performance bottlenecks and tuning performance accordingly.
- Experienced in installing and configuring Hadoop 1.x and Hadoop 2.x and Hadoop clients, both online and from packages.
- Good experience installing, configuring, testing Hadoop ecosystem components.
- Capacity Planning, Hands on experience in various aspects of Database Systems as well as installing, configuring & maintaining the Hadoop clusters.
- Well-experienced in Hadoop cluster expansion and planning.
- Hands on experience in Installing, Configuring and managing the Hue and HCatalog.
- Experience in designing both time driven and data driven automated workflows using Oozie.
- Experience in installing, configuring, supporting and managing the Cloudera Hadoop platform, including CDH4 and CDH5 clusters.
- Good understanding of Scrum methodologies, Test Driven Development and continuous integration.
- Expertise in working with databases like Oracle, MS SQL Server, PostgreSQL, and MS Access 2000, along with exposure to Hibernate for mapping an object-oriented domain model to a traditional relational database.
- Hands on experience in Agile and Scrum methodologies.
- Experienced in Linux Administration tasks like IP Management (IP Addressing, Subnetting, Ethernet Bonding and Static IP).
- Worked on Disaster Management with Hadoop Cluster.
- Worked on multiple stages of Software Development Life Cycle including Development, Component Integration, Performance Testing, Deployment and Support Maintenance.
- Quick to adapt to new software applications and products; self-starter with excellent communication skills and a good understanding of business workflow.
- Experience in configuring Zookeeper to provide Cluster coordination services.
- Experienced in Service Monitoring, Service and Log Management, Auditing and Alerts, Hadoop Platform Security, and Configuring Kerberos.
- Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating and managing the realm domain.
TECHNICAL SKILLS:
Languages: Shell Scripting, Python, MySQL
Big Data Framework and Eco Systems: Hadoop, MapReduce, Hive, Pig, Kafka, Cassandra, HDFS, Zookeeper, Sqoop, Spark, Scala, Apache Crunch, Oozie and Flume
No SQL: Cassandra, HBase and MemBase
Linux / Unix: RedHat, CentOS, Ubuntu
Databases: Oracle 8i/9i/10g/11g, MySQL, PostgreSQL and MS Access
Operating Systems: Windows XP/2000/NT, Linux (Red Hat, CentOS), Macintosh, UNIX
Tools: Ant, Maven, TOAD, ArgoUML, WinSCP, PuTTY, Lucene
Storage: CIFS/NFS, LUN, RAID
Version Control Tools: CVS, SVN
Hadoop: Cloudera, Apache, Hortonworks
WORK EXPERIENCE:
Confidential, Chicago, IL
Sr. Hadoop Administrator
Responsibilities:
- Involved in design and planning phases of Hadoop Cluster planning
- Responsible for Regular health checkups of the Hadoop cluster using custom scripts
- Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.
- Provided Hadoop, OS, and Hardware optimizations.
- Installed and configured Cloudera Manager for easy management of existing Hadoop cluster.
- Performed monthly Linux server maintenance, including shutting down essential Hadoop name node and data node services.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability
- Developed Shell and Python scripts to automate the jobs.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Experienced in managing and reviewing the Hadoop log files.
- Balancing Hadoop cluster using balancer utilities to spread data across the cluster equally.
- Implemented data ingestion techniques like Pig and Hive on production environment.
- Involved in installing and configuring the HDP 2.6.0.
- Performed routine cluster maintenance every weekend to make required configuration changes, installations, etc.
- Implemented Kerberos Security Authentication protocol for existing cluster.
- Worked extensively with the development team to resolve performance issues, configuration changes, memory management, etc.
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra
- Responsible for setting space quota for users
- Responsible for monitoring Disk usage and resolving HDFS disk space related issues
- Implemented Apache Spark data processing project to handle data from RDBMS and streaming sources.
- Monitoring and Debugging Hadoop jobs/Applications running in production.
- End to end Data flow management from sources to NoSQL (MongoDB) Database using Oozie.
- Worked on Providing User support and application support on Hadoop Infrastructure.
- Created Kerberos keytabs for ETL application use cases before onboarding to Hadoop.
- Responsible for adding User to Hadoop cluster
- Worked on Evaluating, comparing different tools for test data management with Hadoop.
- Helped and directed testing team to get up to speed on Hadoop Application testing.
- Worked on installing a 20-node UAT Hadoop cluster.
Environment: Cloudera, Java, RedHat Linux, HDFS, Mahout, MapReduce, Cassandra, Hive, Pig, Sqoop, Spark, Scala, Flume, Zookeeper, Oozie, DB2, HBase and Pentaho.
Confidential, Memphis, TN
Sr. Hadoop Administrator
Responsibilities:
- Primary responsibilities include building scalable distributed data solutions using Hadoop ecosystem.
- Responsible for installing, configuring, monitoring HDP stacks 2.1, 2.2, 2.3 and 2.4.
- Responsible for monitoring and troubleshooting the Hadoop Cluster using Ambari.
- Installed and configured Hadoop ecosystem components like Hive, Pig, Sqoop, Flume, Oozie and HBase.
- Monitor Hadoop cluster connectivity with different interfacing Databases and downstream systems.
- Manage and review Hadoop log files
- Used Hive to query the Hadoop data stored in HDFS.
- Responsible for managing backups and version upgrades.
- Worked on installation of NoSQL databases including MongoDB, Cassandra and HBase.
- Worked with data delivery teams to set up new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing MFS and Hive.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL to HDFS using Sqoop.
- Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Continuous monitoring and managing the Hadoop cluster using Cloudera Manager.
- Experience in using Sqoop to migrate data between HDFS and MySQL or Oracle; deployed Hive and HBase integration to perform OLAP operations on HBase data.
- Exported the analyzed data to relational databases using Hive for visualization and to generate reports.
- Perform data analysis on large datasets.
- Worked closely with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
Environment: Hortonworks, Ranger, Pig, Eclipse, Hive, MapReduce, JIRA, HBase, Sqoop, Cassandra, Spark, Scala, Kafka, Storm, Linux, Big Data, Java, SQL, NoSQL and MongoDB.
Confidential, Bloomington, IL
Jr. Hadoop Administrator
Responsibilities:
- Responsible for Hadoop cluster planning.
- Used monitoring tools like Cloudera Manager to monitor the Hadoop cluster.
- Experience in installing Cloudera and Hortonworks on multi-node clusters.
- Performed Data integration from multiple internal clients using Apache Kafka.
- Used Data Access Policies to make proper HDFS ACLs for users and groups.
- Actively involved in tuning Hive queries for better performance.
- Wrote Python scripts for monitoring the optimization and efficiency of Hadoop jobs.
- Actively involved in tuning HQL queries for better performance.
- Providing support for System Integration Testing & User Acceptance Testing.
- Involved in resolving the issues routed through trouble tickets from production floor.
- Installed and configured the monitoring tools Munin and Nagios for monitoring network bandwidth and hard drive status.
- Involved in Performance Tuning of the application.
- Responsible for benchmark testing, tool comparisons and log monitoring.
- Used Log4J for extensible logging, debugging and error tracing.
- Involved in Production support and Maintenance.
- Involved in transferring data from MySQL to HDFS using Sqoop.
- Developed Java programs to clean and pre-process huge datasets.
- Day-to-day administration on Sun Solaris and RHEL 4/5, including installation, upgrades, and patch and package management.
- Assist with overall technology strategy and operational standards for the UNIX domains.
- Interacted and reported the fetched results for multiple BI projects.
Environment: CDH, Hadoop, HDFS, Pig, Kafka, MySQL and MapReduce.
Confidential
Linux Administrator
Responsibilities:
- Installation, configuration, upgrade and administration of Windows, Sun Solaris and RedHat Linux.
- Linux and Solaris installation, administration and maintenance.
- User account management, managing passwords, setting up quotas and support.
- Worked on Linux Kick-start OS integration, DDNS, DHCP, SMTP, Samba, NFS, FTP, SSH, and LDAP integration.
- Network traffic control, IPSec, QoS, VLAN, Proxy, Radius integration on Cisco Hardware via Red Hat Linux Software.
- Installation and configuration of MySQL on Windows Server nodes.
- Responsible for configuring and managing Squid server in Linux and Windows.
- Configuration and Administration of NIS environment.
- Involved in Installation and configuration of NFS.
- Package and Patch management on Linux servers.
- Worked on Logical volume manager to create file systems as per user and database requirements.
- Data migration at Host level using Red Hat LVM, Solaris LVM, and Veritas Volume Manager.
- Expertise in establishing and documenting the procedures to ensure data integrity including system fail-over and backup/recovery in AIX operating system.
- Managed 100+ UNIX servers running RHEL and HP-UX on Oracle, HP and Dell servers, including blade centers.
- Solaris Disk Mirroring (SVM), ZONE installation and configuration
- Escalating issues accordingly, managing team efficiently to achieve desired goals.
Environment: Linux, Solaris, Kick-start OS, DDNS, DHCP, SMTP, Samba, NFS, FTP, SSH, LDAP, MySQL, NIS, Red Hat LVM, Solaris Disk Mirroring.
Confidential
System Administrator
Responsibilities:
- Provided onsite and remote support for Redhat Linux & AIX Servers.
- Provided 24x7 on call server support for UNIX environment including AIX, Linux, HP-UX, and Sun Solaris.
- Configured HP Proliant, Dell Poweredge, R series, Cisco UCS and IBM p-series machines, for production, staging and test environments.
- Administration, installation, upgrade and maintenance of HP Proliant DL 585 G7 using Redhat Enterprise Linux jumpstart, flash archives and upgrade method.
- Responsible for setting up Oracle RAC for a three node RHEL5 cluster.
- Experience with provisioning Linux using Kickstart and Redhat Satellite server
- Extensively used NFS, NIS, DHCP, FTP, Sendmail, and Telnet for Linux.
- Coordinated with database administrators while setting up Oracle 10g/11g on Linux.
- Monitoring and troubleshooting performance-related issues.
- Configured Linux native device mapper (MPIO), EMC powerpath for RHEL 5.4, 5.5, 5.6, 5.7.
- Used performance monitoring utilities like IOSTAT, VMSTAT, TOP, NETSTAT and SAR.
- Worked on support for AIX matrix subsystem device drivers.
- Worked on Virtualization of different machines.
- Expertise in Build, Install, load and configure boxes.
- Worked with the team members to create, execute and implement the plans.
- Experience in Installation, Configuration and Troubleshooting of Tivoli Storage Manager (TSM).
- Remediating failed backups, taking manual incremental backups of failing servers.
- Upgrading TSM from 5.1.x to 5.3.x.
- Worked on HMC configuration and management of the HMC Console, which included upgrades and micro-partitioning.
- Installation of adapter cards and cables and configuring them.
- Worked on Integrated Virtual Ethernet and building up of VIO servers.
- Installed SSH keys so that srmdata can log in to the server without a password prompt for daily backup of vital data such as processor utilization, disk utilization, etc.
- Coordinating with application and database team for troubleshooting the application or Database outages.
- Coordinating with SAN team for allocation of LUNs in order to increase filesystem space.
- Configuration and administration of Fibre Channel adapters and handling the AIX part of SAN.
- Good LVM skills, used LVM, created VGs, LVs, and disk mirroring.
- Implemented PLM (Partition Load Manager) on AIX 5.3 and 6.1.
Environment: Redhat Linux, AIX, Oracle 10g/11g, VIO servers, Kickstart, Redhat satellite server, RHEL, IOSTAT, VMSTAT, TOP, NETSTAT, SAR, HMC.
