
Hadoop Administrator Resume


San Rafael, CA

SUMMARY:

  • Maintained Hadoop clusters for Development/Quality/Production.
  • Good knowledge of Hadoop cluster architecture and cluster monitoring.
  • Involved in all phases of the Software Development Life Cycle (SDLC) and worked on all activities related to development, implementation, and administration.
  • Experience in writing Oracle queries for database-side validations.
  • Experience in installing and configuring the complete Hadoop ecosystem (HDFS, MapReduce, Pig, Hive, Oozie, Flume).
  • Experience in importing and exporting data between databases such as MySQL and Oracle and HDFS using Sqoop (a minimal example appears at the end of this summary).
  • Experience in deploying and managing Hadoop clusters using Cloudera Manager, Ambari, and MCS.
  • Experience in using Pig, Hive, Sqoop, and ZooKeeper.
  • Exposure to both traditional (RDBMS) and non-traditional databases.
  • Experienced in securing Hadoop clusters with Kerberos.
  • Experience in monitoring and managing a 100+ node Hadoop cluster.
  • Created a complete processing engine based on the Cloudera distribution.
  • Provided level 1, 2, and 3 technical support.
  • Hands-on experience with production Hadoop operations such as administration, configuration management, debugging, and performance tuning.
  • Implemented Kerberos in the Hadoop ecosystem.
  • Experienced in automating job flows using Oozie.
  • Supported MapReduce programs running on the cluster.
  • Worked with application teams via Scrum to provide operational support and to install Hadoop updates, patches, and version upgrades as required.
  • Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
  • Worked on disaster recovery for the Hadoop cluster.
  • Built an ingestion framework using Flume for streaming logs and aggregating the data into HDFS.
  • Built a data transformation framework using MapReduce and Pig.
  • Worked with business users to extract clear requirements to create business value.
  • Worked with big data teams to move ETL tasks to Hadoop.
  • Investigated new technologies such as Spark to keep up with industry developments.
  • Played a major role in integrating the analytics cluster with the data store and ensuring its smooth functioning.
  • Well experienced in building DHCP, PXE (with Kickstart), DNS, and NFS servers, using them to build out infrastructure in Linux environments, and working with Puppet for application deployment.
  • Experienced in Linux administration tasks such as IP management (IP addressing, subnetting, Ethernet bonding, and static IPs).
  • Good communication and interpersonal skills, a committed team player and a quick learner.
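
As a minimal sketch of the Sqoop imports mentioned above (host, database, table, and directory names are placeholders, assuming a MySQL source), a typical load into HDFS looked like:

    sqoop import \
      --connect jdbc:mysql://db-host:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

The same pattern, with an Oracle JDBC connection string, covers the Oracle-to-HDFS loads; sqoop export reverses the direction.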

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, YARN, HBase, Pig, Hive, Sqoop, Flume, Navigator, Sentry, Kerberos, Puppet, Oozie, ZooKeeper

Operating Systems: UNIX, Linux, Windows 95/98/2000/XP/Vista/7/8, MS-DOS

Database Languages: SQL, PL/SQL, Teradata loading tools

NoSQL Databases and Query Engines: HBase, Cassandra, Spark, Impala

Programming Languages: C, C++, Java, J2EE, Python, Ant scripts, Linux shell scripts

MS Office: Excel, Word, PowerPoint

Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL

PROFESSIONAL EXPERIENCE:

Confidential, San Rafael, CA

Hadoop Administrator

Responsibilities:

  • Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
  • Installed 64+ node Hadoop clusters using the Cloudera, Hortonworks, and MapR distributions.
  • Performance tuned and optimized Hadoop clusters to achieve high performance.
  • Implemented schedulers on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Helped the team to increase cluster size from 32 nodes to 64 nodes.
  • Worked with Spark.
  • Good knowledge of Impala.
  • Installed, configured, and deployed a 64-node Cloudera Hadoop cluster for development and production.
  • Worked on setting up high availability for the major production cluster and designed automatic failover.
  • Performed both major and minor upgrades to the existing CDH cluster.
  • Tuned the Hadoop cluster to achieve higher performance.
  • Configured Hive metastore with MySQL, which stores the metadata of Hive tables.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Benchmarked Hadoop clusters using TeraGen and TeraSort.
  • Provided user and application support on the Hadoop infrastructure.
  • Reviewed ETL application use cases before onboarding them to Hadoop.
  • Created Hive external and managed tables and designed data models in Hive (a minimal DDL sketch appears after this list).
  • Involved in business requirements gathering and analysis of business use cases.
  • Prepared System Design document with all functional implementations.
  • Involved in data modeling sessions to develop models for Hive tables.
  • Studied the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
  • Worked on sequence files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
  • Worked with Sqoop import and export functionality to handle large data set transfers between a DB2 database and HDFS.
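
A minimal sketch of the Hive external-table work referenced above (the table, column names, and HDFS location here are hypothetical; the real schemas came out of the data-model sessions noted in the list):

    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
      ip      STRING,
      request STRING,
      status  INT
    )
    PARTITIONED BY (log_date STRING)
    STORED AS SEQUENCEFILE
    LOCATION '/data/raw/web_logs';
    "

External tables leave the underlying HDFS files in place when a table is dropped, which suits raw data landed by Sqoop or Flume, and partitioning by date supports the performance and storage improvements listed above.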

Confidential, Murray Hill, NJ

Hadoop Consultant

Responsibilities:

  • Helped the team to increase cluster size from 35 nodes to 113 nodes. The configuration for additional data nodes was managed using Puppet.
  • Installed, configured, and deployed a 60-node Cloudera Hadoop cluster for development and production.
  • Worked on setting up high availability for the major production cluster and designed automatic failover.
  • Performed both major and minor upgrades to the existing CDH cluster.
  • Tuned the Hadoop cluster to achieve higher performance.
  • Configured Hive metastore with MySQL, which stores the metadata of Hive tables.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Exposure to Spark and Impala.
  • Benchmarked Hadoop clusters using DFSIO, TeraGen, and TeraSort.
  • Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
  • Used Ganglia and Nagios for monitoring the cluster around the clock.
  • Wrote Nagios plugins to monitor Hadoop NameNode health status, the number of TaskTrackers running, and the number of DataNodes running (a minimal check-script sketch appears after this list).
  • Designed and implemented a distributed network monitoring solution based on Nagios and Ganglia using Puppet.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Moved data between HDFS and RDBMS in both directions using Sqoop.
  • Developed Hive queries and UDFs to analyze the data in HDFS.
  • Analyzed and transformed data with Hive and Pig.
  • Implemented a Hadoop float type equivalent to the Teradata Decimal type.
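
A minimal sketch of the kind of NameNode check mentioned above (the host name, port, and script name are placeholders for a Hadoop 2.x NameNode web UI; the actual plugins also covered TaskTracker and DataNode counts):

    #!/bin/bash
    # check_namenode.sh - query the NameNode JMX servlet and map the result
    # to Nagios exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL).
    NN_HOST=${1:-namenode.example.com}
    JMX=$(curl -sf "http://${NN_HOST}:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo")
    if [ -z "$JMX" ]; then
      echo "CRITICAL - NameNode web UI unreachable"; exit 2
    elif echo "$JMX" | grep -q '"Safemode" : ""'; then
      echo "OK - NameNode up and out of safe mode"; exit 0
    else
      echo "WARNING - NameNode reports safe mode"; exit 1
    fi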

Confidential, San Jose, CA

Hadoop Consultant

Responsibilities:

  • Installed and configured 64-node Hadoop clusters on HDP 2.x.
  • Installed and configured Hadoop cluster across various environments.
  • Performed both major and minor upgrades to the existing Hortonworks Hadoop cluster.
  • Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
  • Used the Ambari Web UI for both automated and manual installations.
  • Good knowledge of installing and using Spark and Impala.
  • Set up a new cluster from an existing cluster using an Ambari blueprint (a minimal REST sketch appears after this list).
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Performance tuned and optimized Hadoop clusters to achieve high performance.
  • Configured the Hive View and Files View in Ambari and maintained the Ambari database.
  • Configured High Availability (HA) for the NameNode and HiveServer2.
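
A minimal sketch of the Ambari blueprint flow mentioned above (Ambari host names, credentials, cluster names, and JSON file names are placeholders): export a blueprint from the existing cluster, register it on the target Ambari server, then create the new cluster from it with a host-mapping template.

    # Export a blueprint from the existing cluster
    curl -u admin:admin -H "X-Requested-By: ambari" \
      "http://old-ambari:8080/api/v1/clusters/prod?format=blueprint" > prod-blueprint.json

    # Register the blueprint on the new Ambari server, then create the cluster
    curl -u admin:admin -H "X-Requested-By: ambari" -X POST -d @prod-blueprint.json \
      "http://new-ambari:8080/api/v1/blueprints/prod-blueprint"
    curl -u admin:admin -H "X-Requested-By: ambari" -X POST -d @cluster-template.json \
      "http://new-ambari:8080/api/v1/clusters/newcluster"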

Confidential

System Administrator

Responsibilities:

  • Serve as the primary systems administrator for Linux servers and services.
  • Monitor system performance and utilization.
  • Install, configure, and document new servers and applications.
  • Maintain and audit user accounts.
  • Manage fileserver utilization (shared folders, quotas, etc.).
  • Maintain servers and system security according to campus standards.
  • Track vulnerabilities and apply appropriate patches and upgrades.
  • Generate statistics for operational review and planning.
  • Manage backup process and perform data recoveries as needed.
  • Ensure systems team support requests are answered within one business day.
  • Respond rapidly to system maintenance needs, including on weekends and evenings.
