Sr. Hadoop Admin Resume
Raleigh, NC
SUMMARY
- Over 8 years of experience, including 2.5 years with the Hadoop ecosystem, covering installation and administration of UNIX/Linux servers and configuration of Hadoop ecosystem components in existing clusters.
- Around 3.5 years of experience in Hadoop infrastructure, including MapReduce, Hive, Oozie, Sqoop, HBase, Pig, HDFS, YARN, and SAS interface configuration projects in a direct client-facing role.
- Experience in setting up automated monitoring and escalation infrastructure for Hadoop clusters using Ganglia and Nagios.
- Experience with the complete software development lifecycle, including design, development, testing, and implementation of moderately to highly complex systems.
- Hands-on experience in installing, configuring, supporting, and managing Hadoop clusters using Apache, Hortonworks, Cloudera, and MapR distributions.
- Extensive experience in installing, configuring, and administering Hadoop clusters for major Hadoop distributions like CDH5 and HDP.
- Hands-on experience configuring a Hadoop cluster in a professional environment and on Amazon Web Services (AWS) using EC2 instances.
- Hadoop ecosystem: Cloudera, Hortonworks, MapR, HDFS, HBase, YARN, ZooKeeper, Nagios, Hive, Pig, Ambari, Spark, Impala.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance.
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes and vice versa (sample Sqoop commands appear after this summary).
- Excellent knowledge of NoSQL databases like HBase and Cassandra. Experience in monitoring and troubleshooting issues with Linux memory, CPU, OS, storage, and network.
- Experience in creating custom Lucene/Solr query components.
- Experience in setting up standards and processes for Hadoop-based application design and implementation.
- Strong expertise in data-warehouse tooling, including ETL tools like Ab Initio and Informatica, BI tools like Cognos, MicroStrategy, and Tableau, relational database systems like Oracle with PL/SQL, and Unix shell scripting.
- Experience in developing and scheduling ETL workflows in Hadoop using Oozie, with substantial experience writing MapReduce jobs in Java and working with Pig, Hive, Flume, ZooKeeper, and Storm.
- Expertise with tools in the Hadoop ecosystem including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
- Used NoSQL databases including Cassandra and MongoDB.
- Strong experience in System Administration, Installation, Upgrading, Patches, Migration, Configuration, Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring and Fine-tuning on Linux (RHEL) systems.
- Extensively involved in Test Plan re-design, Test Case re-Creation, Test Automation and Test Execution of web and client server applications as per change requests.
- Experience working with internet-related technologies such as HTML, JavaScript, VBScript, and XML.
- Strong experience in Linux/UNIX administration, with expertise in Red Hat Enterprise Linux 4, 5, and 6, and familiarity with Solaris 9/10 and IBM AIX 6.
- Created a POC to store server log data in Cassandra to identify system alert metrics.
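As a hedged illustration of the Sqoop transfers noted above, a minimal import/export pair follows; the JDBC URL, credentials, table names, and HDFS paths are placeholders, not values from any actual engagement.

    # Pull a table from an RDBMS into HDFS (placeholder connection details)
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Push processed results from HDFS back to the RDBMS
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders_summary \
      --export-dir /data/out/orders_summary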
TECHNICAL SKILLS
Big Data Technologies: HDFS, Hive, Map Reduce, Pig, Sqoop, Oozie, Zookeeper, YARN, Avro, Spark
Scripting Languages: Shell, Python, Perl
Tools: Quality Center v11.0/ALM, TOAD, JIRA, HP QTP, HP UFT, Selenium, TestNG, JUnit
Programming Languages: Java, C++,C,SQL,PL/SQL
QA methodologies: Waterfall, Agile, V-model.
Front End Technologies: HTML, XHTML, CSS, XML, JavaScript, AJAX, Servlets, JSP
Java Frameworks: MVC, Apache Struts 2.0, Spring, and Hibernate
Defect Management: Jira, Quality Center.
Domain Knowledge: GSM, WAP, GPRS, CDMA and UMTS (3G)
Web Services: SOAP (JAX-WS), WSDL, SOA, RESTful (JAX-RS), JMS
Application Servers: Apache Tomcat, Web Logic Server, Web Sphere, JBoss
Databases: Oracle 11g, MySQL, MS SQL Server, IBM DB2
NoSQL Databases: HBase, MongoDB, Cassandra (DataStax Enterprise 4.6.1)
RDBMS: Oracle 9i, Oracle 10g, MS Access, MS SQL Server, IBM DB2, PL/SQL
Operating Systems: Linux, UNIX, Mac OS, Windows NT/98/2000/XP/Vista, Windows 7, Windows 8
PROFESSIONAL EXPERIENCE
Sr. Hadoop Admin
Confidential, Raleigh, NC
Responsibilities:
- Currently working as Hadoop Admin, responsible for everything related to clusters totaling 100 nodes, ranging from POC to PROD.
- Experienced in setting up Hortonworks clusters and installing all ecosystem components through Ambari and manually from the command line.
- Responsible for adding new ecosystem components such as Spark, Storm, Flume, and Knox, with required custom configurations based on requirements.
- Used the data wrangling tool Trifacta to help clean data; performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Postgres into HDFS using Sqoop. Validated infrastructure readiness for the H2O and Ambari components.
- Responsible for cluster maintenance; commissioning and decommissioning DataNodes (a decommissioning sketch follows this list); cluster monitoring and troubleshooting; and managing and reviewing data backups and Hadoop log files.
- Involved in planning, installing, configuring, maintaining, and monitoring Hadoop clusters using Apache and Cloudera (CDH3, CDH4, and CDH5) distributions, performing non-root "user space / single user mode" installations with the CloudScoring Accelerator for large databases.
- Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
- Expertise with NoSQL databases like HBase, Cassandra, DynamoDB (AWS), and MongoDB.
- Experience with Cloudera Hadoop upgrades and patches, installing ecosystem products through Cloudera Manager, and upgrading Cloudera Manager itself.
- Responsible for importing data into and exporting data out of HDFS using Sqoop.
- Responsible for installing various Hadoop ecosystem components and Hadoop daemons.
- Working experience maintaining MySQL databases: creating databases, setting up users, and maintaining backups.
- Implemented the Kerberos security authentication protocol for an existing cluster (a principal-creation sketch follows this list).
- Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
- Involved in transforming data from mainframe tables to HDFS and HBase tables using Sqoop and Pentaho Kettle; also worked with Impala to analyze stored data.
- Deep and thorough understanding of ETL tools and how they apply in a big data environment; supported and managed Hadoop clusters using Apache, Hortonworks, Cloudera, and MapR distributions.
- Involved in loading data from the UNIX file system to HDFS; created custom Solr query components to enable optimal search matching.
- Involved in writing MapReduce programs and testing them with MRUnit.
- Installed and configured a local 3-node Hadoop cluster and set up a 4-node cluster on EC2.
- Wrote MapReduce code to process and parse data from various sources, storing the parsed data in HBase and Hive using HBase-Hive integration.
- Developed scripts and batch jobs to schedule an Oozie bundle (a group of coordinators) consisting of various Hadoop programs.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
- Created internal environments to reproduce customer-reported issues and test/verify hot fixes and upgrades.
- Performing Linux systems administration on production and development servers (Red Hat Linux, CentOS and other UNIX utilities).
- Installation and Configuration of VMware vSphere client, Virtual Server creation and resource allocation.
- Monitored the Hadoop cluster through Cloudera Manager, implemented alerts based on error messages, and provided cluster usage metrics reports to management.
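A minimal sketch of the DataNode decommissioning flow mentioned in the cluster-maintenance bullet above; the hostname and exclude-file path are assumptions based on a typical hdfs-site.xml (dfs.hosts.exclude) setup.

    # Add the node to the excludes file referenced by dfs.hosts.exclude
    echo "dn42.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Ask the NameNode to re-read its include/exclude lists
    hdfs dfsadmin -refreshNodes

    # Wait for the node to report "Decommissioned" before stopping it
    hdfs dfsadmin -report | grep -A 1 "dn42.example.com"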
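For the Kerberos bullet above, a sketch of creating a service principal and keytab with MIT Kerberos kadmin; the realm, hostname, and keytab path are placeholders.

    # Create a service principal for one host (placeholder realm)
    kadmin.local -q "addprinc -randkey hdfs/dn42.example.com@EXAMPLE.COM"

    # Export the principal's keys to a keytab the Hadoop service can read
    kadmin.local -q "ktadd -k /etc/security/keytabs/hdfs.service.keytab hdfs/dn42.example.com@EXAMPLE.COM"

    # Sanity-check: authenticate from the keytab and list the ticket
    kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/dn42.example.com@EXAMPLE.COM
    klist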
Environment: HDFS, MapReduce, HBase, Kafka, YARN, MongoDB, Hive, Impala, Oozie, Pig, Sqoop, Shell Scripting, MySQL, Red Hat Linux, CentOS and other UNIX utilities, Cloudera Manager.
Hadoop Admin
Confidential, Ridgefield, NJ
Responsibilities:
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Installed/configured/maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Responsible for developing a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS. Involved in migrating ETL processes from Oracle to Hive to test easy data manipulation.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
- Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System), and developed multiple MapReduce jobs in Java for data cleaning.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Experience in methodologies such as Agile, Scrum, and test-driven development.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and Cassandra and slots configuration.
- Installed, configured, and administered a small 10-node Hadoop cluster; monitored the cluster for performance, networking, and data integrity issues.
- Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
- Formulated procedures for installation of Hadoop patches, updates, and version upgrades.
- Architected and designed a 30-node Hadoop innovation cluster with Sqrrl, Spark, Puppet, and HDP 2.2.4.
- Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up Kerberos principals, and testing HDFS and Hive access.
- Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
- Managed a 350+ node HDP 2.2.4 cluster with 4 petabytes of data using Ambari 2.0 on Linux CentOS 6.5.
- Created 25+ Linux Bash scripts for users, groups, data distribution, capacity planning, and system monitoring (a monitoring sketch follows this list).
- Installed the OS and administered the Hadoop stack with the CDH5 (with YARN) Cloudera distribution, including configuration management, monitoring, debugging, and performance tuning.
- Upgraded the Hadoop cluster from CDH4.7 to CDH5.2.
- Supported MapReduce programs and distributed applications running on the Hadoop cluster.
- Scripted Hadoop package installation and configuration to support fully automated deployments.
- Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
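One of the Bash monitoring scripts mentioned above could look like this sketch; the 80% threshold and mail recipient are illustrative assumptions.

    #!/usr/bin/env bash
    # Alert when overall HDFS usage crosses a threshold
    THRESHOLD=80
    PCT=$(hdfs dfsadmin -report | grep -m1 '^DFS Used%' | tr -dc '0-9.')
    USED=${PCT%%.*}   # integer part of the percentage
    if [ "$USED" -ge "$THRESHOLD" ]; then
      echo "HDFS usage at ${USED}% on $(hostname)" \
        | mail -s "HDFS capacity alert" hadoop-ops@example.com
    fi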
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, ETL, Ambari 2.0, Linux CentOS, MongoDB, Cassandra, Ganglia, Cloudera Manager.
Hadoop Administrator
Confidential, Atlanta, GA
Responsibilities:
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files on Hortonworks, MapR, and Cloudera clusters.
- Responsible for architecting Hadoop clusters with Hortonworks distribution platform HDP 1.3.2 and Cloudera CDH4.
- Experienced in installing and configuring Hortonworks HDP 1.3.2 and Cloudera CDH4.
- Upgraded Hortonworks distribution HDP 1.3.2 to HDP 2.2
- Responsible for onboarding new users to the Hadoop cluster (creating user home directories and providing access to datasets).
- Played a key role in deciding hardware configurations for the cluster, along with other teams in the company.
- Resolved user-submitted tickets and P1 issues, troubleshooting, documenting, and resolving errors.
- Experienced in writing automated scripts for monitoring file systems and key MapR services.
- Responsible for presenting new ecosystem components proposed for the cluster to teams and managers.
- Helped the users in production deployments throughout the process.
- Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
- Applied patches to cluster.
- Added new DataNodes when needed and ran the balancer (a balancer sketch follows this list).
- Responsible for building scalable distributed data solutions using Hadoop.
- Continuously monitored and managed the Hadoop cluster through Ganglia and Nagios.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs, which run independently with time and data availability.
- Performed major and minor upgrades to the Hadoop cluster.
- Upgraded the Cloudera Hadoop ecosystem components in the cluster using Cloudera distribution packages.
- Performed stress and performance testing and benchmarking for the cluster.
- Commissioned and decommissioned DataNodes in the cluster when problems arose.
- Debugged and resolved major Cloudera Manager issues by working directly with the Cloudera support team.
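A short sketch of the add-nodes-then-rebalance routine from the bullets above; the 10% threshold is a common default choice, not a value from the original text.

    # After adding the new DataNodes to the include file, refresh the NameNode
    hdfs dfsadmin -refreshNodes

    # Rebalance blocks; threshold = allowed % deviation from average node usage
    hdfs balancer -threshold 10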
Environment: Flume, Oozie, Pig, Sqoop, MongoDB, HBase, Hive, MapReduce, YARN, Hortonworks, Cloudera Manager
UNIX/LINUX Administrator
Confidential
Responsibilities:
- Installed and configured Red Hat Enterprise Linux (RHEL) 5.x and 6.x servers on HP and Sun Enterprise servers, HP and IBM blade servers, HP 9000, RS/6000, and IBM P-series hardware; installed, configured, and maintained Solaris 9/10/11, Red Hat Linux 4/5/6, SUSE 10.3/11.1, HP-UX 11.x, and IBM AIX operating systems.
- Worked on configuring and installing Solaris and Linux servers using custom JumpStart and Kickstart.
- Expertise in enterprise-class storage, including SCSI, RAID, and Fibre Channel technologies.
- Configured enterprise UNIX/Linux systems in a heterogeneous environment (Red Hat and SUSE Linux, Solaris, HP-UX) with SAN/NAS infrastructure across multiple sites on mission-critical business systems.
- Created a standard Kickstart-based installation method for RHEL servers, including all changes required to meet the company's security standards; installation is possible over HTTP or via PXE on a separate network segment (a PXE/Kickstart sketch follows this list).
- Set up, implemented, configured, and documented backup/restore solutions for disaster/business recovery of clients using TSM backup on UNIX, SUSE, and Red Hat Linux platforms.
- Installation and configuration of Oracle 11g RAC on Red Hat Linux nodes
- Installed and configured applications such as Apache, Tomcat, JBoss, xrdp, and WebSphere, and worked closely with the respective teams.
- Set up a JBoss cluster and configured Apache with JBoss on Red Hat Linux, including proxy serving with Apache; troubleshot Apache, JBoss, and mod_jk issues for clients.
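A rough sketch of staging the HTTP/PXE Kickstart install described above; the web root, TFTP paths, and kickstart URL are assumptions for a typical RHEL setup.

    # Publish the kickstart file over HTTP (placeholder web root)
    cp rhel6-ks.cfg /var/www/html/ks/rhel6-ks.cfg

    # Add a PXE menu entry that points the installer at the kickstart file
    cat >> /var/lib/tftpboot/pxelinux.cfg/default <<'EOF'
    label rhel6-ks
      kernel rhel6/vmlinuz
      append initrd=rhel6/initrd.img ks=http://ks.example.com/ks/rhel6-ks.cfg ksdevice=eth0
    EOF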
Environment: Red Hat 4.x/5.x, Solaris 9/10, AIX, HP-UX, VMware ESX 5.0/5.1/5.5, vSphere, HP ProLiant DL 380/580 servers, Dell R-series servers, Windows 2008 Server, EMC CLARiiON, NetApp, SCSI, VMware Converter, Apache web server, F5 load balancer, Oracle, MySQL, PHP, DNS, DHCP, Bash, NFS, NAS, Spacewalk, WebSphere, WebLogic, Java, Jenkins, JBoss, Tomcat, Kickstart.
Linux Admin
Confidential
Responsibilities:
- Provisioned, built, and supported physical and virtual Linux servers (using VMware) for production, QA, and developer environments.
- Installed, configured, and administered all UNIX/Linux servers, including designing and selecting hardware to support installations/upgrades of Red Hat (5/6), CentOS 5/6, and Ubuntu operating systems.
- Network traffic control, IPsec, QoS, VLAN, proxy, and RADIUS integration on Cisco hardware via Red Hat Linux software.
- Responsible for managing Chef client nodes and uploading cookbooks to the Chef server from the workstation (a knife sketch follows this list).
- Worked in an Agile/Scrum environment, using Jenkins and GitHub for continuous integration and deployment.
- Responsible for configuring real-time backup of web servers; managed log files for troubleshooting and probable errors.
- Responsible for reviewing all open tickets and resolving and closing existing tickets.
- Documented solutions for any issues that had not been discovered previously.
- Worked with file systems including the UNIX file system and the Network File System (NFS); planned, scheduled, and implemented OS patches on both Solaris and Linux; teamed diligently with the infrastructure, network, database, application, and business intelligence teams to guarantee high data quality and availability.
- Patch management of servers and maintenance of server environments across Development/QA/Staging/Production.
- Performing Linux systems administration on production and development servers (Red Hat Linux, CentOS, and other UNIX utilities).
- Installing Patches and packages on Unix/Linux Servers.
- Installation, configuration, upgrade, and administration of Sun Solaris and Red Hat Linux.
- Installation and Configuration of VMware vSphere client, Virtual Server creation and resource allocation.
- Performance Tuning, Client/Server Connectivity and Database Consistency Checks using different Utilities.
- Shell scripting for Linux/Unix Systems Administration and related tasks. Point of Contact for Vendor escalation.
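For the Chef workflow above, a minimal knife sketch; the cookbook and node names are placeholders.

    # Upload a cookbook from the workstation's chef-repo to the Chef server
    knife cookbook upload apache

    # Add the cookbook to a node's run list and confirm the change
    knife node run_list add web01.example.com 'recipe[apache]'
    knife node show web01.example.com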
Environment: Red Hat Linux/CentOS 4, 5, 6, Logical Volume Manager, VMware ESX 5.1/5.5, Apache and Tomcat web servers, Oracle 11/12, HPSM, HPSA.
