Hadoop Administrator Resume
San Rafael, CA
SUMMARY:
- Maintained Hadoop clusters for Development/Quality/Production.
- Good knowledge of Hadoop cluster architecture and cluster monitoring.
- Involved in all phases of the Software Development Life Cycle (SDLC), including development, implementation, and administration activities.
- Experience writing Oracle queries for database-side validations.
- Experience installing and configuring the complete Hadoop ecosystem (HDFS, MapReduce, Pig, Hive, Oozie, Flume).
- Experience importing and exporting data between databases such as MySQL and Oracle and HDFS using Sqoop (see the illustrative commands after this list).
- Experience deploying and managing Hadoop clusters using Cloudera Manager, Ambari, and MCS.
- Experience using Pig, Hive, Sqoop, and ZooKeeper.
- Exposure to both traditional (RDBMS) and non-traditional databases.
- Experienced in securing Hadoop clusters with Kerberos.
- Experience in monitoring and managing 100+ node Hadoop cluster.
- Created a complete processing engine based on Cloudera distribution.
- Provided Level 1, 2, and 3 technical support.
- Hands-on experience with production Hadoop applications, including administration, configuration management, debugging, and performance tuning.
- Implemented Kerberos in the Hadoop ecosystem.
- Experienced in automating job flows using Oozie.
- Supported MapReduce programs running on the cluster.
- Worked with the application team via Scrum to provide operational support and to install Hadoop updates, patches, and version upgrades as required.
- Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
- Worked on disaster recovery for the Hadoop cluster.
- Built an ingestion framework using Flume to stream logs and aggregate the data into HDFS.
- Built a data transformation framework using MapReduce and Pig.
- Worked with business users to extract clear requirements to create business value.
- Worked with big data teams to move ETL tasks to Hadoop.
- Investigated new technologies such as Spark to keep up with industry developments.
- Played a major role in integrating the analytics cluster with the data store and keeping it running smoothly.
- Well experienced in building DHCP, PXE (with Kickstart), DNS, and NFS servers, using them to build out infrastructure in Linux environments, and working with Puppet for application deployment.
- Experienced in Linux administration tasks such as IP management (IP addressing, subnetting, Ethernet bonding, and static IPs).
- Good communication and interpersonal skills, a committed team player and a quick learner.
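A minimal sketch of the kind of Sqoop transfer described above; the host, database, table names, and HDFS paths are hypothetical placeholders, and credentials are prompted for with -P:

    # Import a MySQL table into HDFS, then export an aggregated table back out.
    sqoop import --connect jdbc:mysql://dbhost/sales --username etl_user -P \
        --table orders --target-dir /data/raw/orders --num-mappers 4
    sqoop export --connect jdbc:mysql://dbhost/sales --username etl_user -P \
        --table order_summary --export-dir /data/curated/order_summary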
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, YARN, HBase, Pig, Hive, Sqoop, Flume, Navigator, Sentry, Kerberos, Puppet, Oozie, ZooKeeper
Operating Systems: UNIX, Linux, Windows 95/98/2000/XP/Vista/7/8, MS-DOS
Database Languages: SQL, PL/SQL, Teradata loading tools
NoSQL Databases: HBase, Cassandra, Spark, Impala
Programming Languages: C, C++, Java, J2EE, Python, Ant scripts, Linux shell scripts
MS Office: Excel, Word, PowerPoint
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL
PROFESSIONAL EXPERIENCE:
Confidential, San Rafael, CA
Hadoop Administrator
Responsibilities:
- Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
- Installed 64+ node Hadoop clusters using the Cloudera, Hortonworks, and MapR distributions.
- Performance tuned and optimized Hadoop clusters to achieve high performance.
- Implemented schedulers on the JobTracker to share cluster resources among users' MapReduce jobs.
- Helped the team to increase cluster size from 32 nodes to 64 nodes.
- Worked with Spark.
- Good knowledge of Impala.
- Installed, configured, and deployed a 64-node Cloudera Hadoop cluster for development and production.
- Worked on setting up high availability for the major production cluster and designed automatic failover.
- Performed both major and minor upgrades to the existing CDH cluster.
- Tuned the Hadoop cluster to achieve higher performance.
- Configured Hive metastore with MySQL, which stores the metadata of Hive tables.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
- Benchmarked Hadoop clusters using TeraGen and TeraSort (see the illustrative commands after this list).
- Provided user and application support on the Hadoop infrastructure.
- Reviewed ETL application use cases before onboarding them to Hadoop.
- Created Hive external and managed tables and designed data models in Hive.
- Involved in business requirements gathering and analysis of business use cases.
- Prepared System Design document with all functional implementations.
- Involved in data modeling sessions to develop models for Hive tables.
- Studied the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
- Worked on SequenceFiles, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Worked with Sqoop import and export functionality to handle large data set transfers between a DB2 database and HDFS.
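An illustrative TeraGen/TeraSort benchmark run of the kind mentioned above; the examples jar path and HDFS directories vary by distribution and are placeholders:

    # Generate ~1 GB of synthetic rows, sort them, then validate the sorted output.
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen 10000000 /benchmarks/teragen
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort /benchmarks/teragen /benchmarks/terasort
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar teravalidate /benchmarks/terasort /benchmarks/teravalidate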
Confidential, Murray Hill, NJ
Hadoop Consultant
Responsibilities:
- Helped the team increase the cluster size from 35 nodes to 113 nodes; the configuration for the additional DataNodes was managed using Puppet.
- Installed, configured, and deployed a 60-node Cloudera Hadoop cluster for development and production.
- Worked on setting up high availability for major production cluster and designed automatic failover.
- Performed both major and minor upgrades to the existing CDH cluster.
- Tuned the Hadoop cluster to achieve higher performance.
- Configured Hive metastore with MySQL, which stores the metadata of Hive tables.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
- Exposure to Spark and Impala.
- Benchmarked Hadoop clusters using DFSIO, TeraGen, and TeraSort.
- Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
- Used Ganglia and Nagios for monitoring the cluster around the clock.
- Wrote Nagios plugins to monitor NameNode health status and the number of running TaskTrackers and DataNodes (see the sketch after this list).
- Designed and implemented a distributed network monitoring solution based on Nagios and Ganglia using Puppet.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Moved data between HDFS and RDBMSs using Sqoop.
- Developed HIVE queries and UDFs to analyze the data in HDFS.
- Analyzed and transformed data with Hive and Pig.
- Implemented the Hadoop Float equivalent of the Teradata Decimal type.
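A minimal sketch of a Nagios-style NameNode check along the lines described above; the host, port, and DataNode threshold are placeholders, and the JMX bean name assumes a Hadoop 2.x NameNode web UI:

    #!/bin/bash
    # check_namenode.sh (hypothetical): query the NameNode JMX endpoint and report live DataNodes.
    NN_HOST=${1:-namenode.example.com}
    NN_PORT=${2:-50070}
    LIVE=$(curl -s "http://${NN_HOST}:${NN_PORT}/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState" \
        | grep -o '"NumLiveDataNodes" *: *[0-9]*' | grep -o '[0-9]*$')
    if [ -z "$LIVE" ]; then
        echo "CRITICAL - NameNode web UI not reachable on ${NN_HOST}:${NN_PORT}"; exit 2
    elif [ "$LIVE" -lt 3 ]; then
        echo "WARNING - only ${LIVE} live DataNodes reported"; exit 1
    else
        echo "OK - ${LIVE} live DataNodes reported"; exit 0
    fi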
Confidential, San Jose, CA
Hadoop Consultant
Responsibilities:
- Installed and configured 64-node Hadoop clusters on HDP 2.x.
- Installed and configured Hadoop cluster across various environments.
- Performed both major and minor upgrades to the existing Hortonworks Hadoop cluster.
- Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
- Experience using the Ambari Web UI for both automated and manual installations.
- Good knowledge of installing and using Spark and Impala.
- Experience setting up a new cluster from an existing cluster using Ambari Blueprints.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
- Performance tuned and optimized Hadoop clusters to achieve high performance.
- Configured the Hive View and File View in Ambari and maintained the Ambari database.
- Configured High Availability (HA) for NameNode and HiveServer2 (see the sketch below).
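Illustrative commands for verifying a NameNode HA pair of the kind configured above; the service IDs nn1 and nn2 are placeholders taken from hdfs-site.xml:

    hdfs haadmin -getServiceState nn1    # expect "active" or "standby"
    hdfs haadmin -getServiceState nn2
    hdfs haadmin -failover nn1 nn2       # manual failover test from nn1 to nn2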
Confidential
System Administrator
Responsibilities:
- Serve as the primary systems administrator for Linux servers and services.
- Monitor system performance and utilization.
- Install, configure, and document new servers and applications.
- Maintain and audit user accounts.
- Manage fileserver utilization (shared folders, quotas, etc.).
- Maintain servers and system security according to campus standards.
- Track vulnerabilities and apply appropriate patches and upgrades.
- Generate statistics for operational review and planning.
- Manage the backup process and perform data recoveries as needed (see the sketch after this list).
- Ensure systems team support requests are answered within one business day.
- Respond rapidly to system maintenance needs, including on weekends and evenings.
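A minimal sketch of the kind of nightly backup job referenced above; the source and destination paths and the retention window are placeholders:

    #!/bin/bash
    # nightly_backup.sh (hypothetical): snapshot shared folders and prune old copies.
    SRC=/srv/fileshares
    DEST=/backup/fileshares/$(date +%F)
    mkdir -p "$DEST"
    rsync -a --delete "$SRC/" "$DEST/"
    # Keep only the last 14 daily copies.
    find /backup/fileshares -mindepth 1 -maxdepth 1 -type d -mtime +14 -exec rm -rf {} +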