Hadoop Administrator Resume
San Rafael, CA
SUMMARY:
- Maintained Hadoop clusters for Development/Quality/Production.
- Good knowledge of Hadoop cluster architecture and cluster monitoring.
- Involved in all phases of the Software Development Life Cycle (SDLC), including development, implementation, and administration activities.
- Experience writing Oracle queries for database-side validations.
- Experience installing and configuring the complete Hadoop ecosystem (HDFS, MapReduce, Pig, Hive, Oozie, Flume).
- Experience importing and exporting data between databases such as MySQL and Oracle and HDFS using Sqoop (see the illustrative commands after this list).
- Experience deploying and managing Hadoop clusters using Cloudera Manager, Ambari, and MCS.
- Experience using Pig, Hive, Sqoop, and ZooKeeper.
- Exposure to both traditional (RDBMS) and non-traditional databases.
- Experienced in securing Hadoop clusters with Kerberos.
- Experience in monitoring and managing 100+ node Hadoop cluster.
- Created a complete processing engine based on Cloudera distribution.
- Provided Level 1, 2, and 3 technical support.
- Hands-on experience with production Hadoop applications, including administration, configuration management, debugging, and performance tuning.
- Implemented Kerberos in the Hadoop ecosystem.
- Experienced in automating job flows using Oozie.
- Supported MapReduce programs running on the cluster.
- Worked with the application team via Scrum to provide operational support and to install Hadoop updates, patches, and version upgrades as required.
- Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
- Worked on disaster recovery for the Hadoop cluster.
- Built an ingestion framework using Flume to stream logs and aggregate the data into HDFS.
- Built a data transformation framework using MapReduce and Pig.
- Worked with business users to extract clear requirements to create business value.
- Worked with big data teams to move ETL tasks to Hadoop.
- Investigated new technologies such as Spark to keep up with industry developments.
- Played a major role in integrating the analytics cluster with the data store and keeping it running smoothly.
- Well experienced in building DHCP, PXE (with Kickstart), DNS, and NFS servers, using them to build out infrastructure in Linux environments, and working with Puppet for application deployment.
- Experienced in Linux administration tasks such as IP management (IP addressing, subnetting, Ethernet bonding, and static IPs).
- Good communication and interpersonal skills, a committed team player and a quick learner.
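A minimal sketch of the kind of Sqoop transfer described above; the host, database, table names, and HDFS paths are hypothetical placeholders, and credentials are prompted for with -P:

    # Import a MySQL table into HDFS, then export an aggregated table back out.
    sqoop import --connect jdbc:mysql://dbhost/sales --username etl_user -P \
        --table orders --target-dir /data/raw/orders --num-mappers 4
    sqoop export --connect jdbc:mysql://dbhost/sales --username etl_user -P \
        --table order_summary --export-dir /data/curated/order_summary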
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, YARN, HBase, Pig, Hive, Sqoop, Flume, Navigator, Sentry, Kerberos, Puppet, Oozie, ZooKeeper
Operating Systems: UNIX, Linux, Windows 95/98/2000/XP/Vista/7/8, MS-DOS
Database Languages: SQL, PL/SQL, Teradata loading tools
NoSQL Databases: HBase, Cassandra, Spark, Impala
Programming Languages: C, C++, Java, J2EE, Python, Ant scripts, Linux shell scripts
MS Office: Excel, Word, PowerPoint
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL
PROFESSIONAL EXPERIENCE:
Confidential, San Rafael, CA
Hadoop Administrator
Responsibilities:
- Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
- Installed 64+ node Hadoop clusters using the Cloudera, Hortonworks, and MapR distributions.
- Performance tuned and optimized Hadoop clusters to achieve high performance.
- Implemented schedulers on the JobTracker to share cluster resources among users' MapReduce jobs.
- Helped the team to increase cluster size from 32 nodes to 64 nodes.
- Worked with Spark.
- Good knowledge of Impala.
- Installed, configured, and deployed a 64-node Cloudera Hadoop cluster for development and production.
- Worked on setting up high availability for the major production cluster and designed automatic failover.
- Performed both major and minor upgrades to the existing CDH cluster.
- Tuned the Hadoop cluster to achieve higher performance.
- Configured Hive metastore with MySQL, which stores the metadata of Hive tables.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
- Benchmarked Hadoop clusters using TeraGen and TeraSort (see the illustrative commands after this list).
- Provided user and application support on the Hadoop infrastructure.
- Reviewed ETL application use cases before onboarding them to Hadoop.
- Created Hive external and managed tables and designed data models in Hive.
- Involved in business requirements gathering and analysis of business use cases.
- Prepared System Design document with all functional implementations.
- Involved in data modeling sessions to develop models for Hive tables.
- Studied the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
- Worked on SequenceFiles, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Worked with Sqoop import and export functionality to handle large data set transfers between a DB2 database and HDFS.
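An illustrative TeraGen/TeraSort benchmark run of the kind mentioned above; the examples jar path and HDFS directories vary by distribution and are placeholders:

    # Generate ~1 GB of synthetic rows, sort them, then validate the sorted output.
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen 10000000 /benchmarks/teragen
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort /benchmarks/teragen /benchmarks/terasort
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar teravalidate /benchmarks/terasort /benchmarks/teravalidate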
Confidential, Murray Hill, NJ
Hadoop Consultant
Responsibilities:
- Helped the team increase the cluster size from 35 nodes to 113 nodes; the configuration for the additional DataNodes was managed using Puppet.
- Installed, configured, and deployed a 60-node Cloudera Hadoop cluster for development and production.
- Worked on setting up high availability for major production cluster and designed automatic failover.
- Performed both major and minor upgrades to the existing CDH cluster.
- Tuned the Hadoop cluster to achieve higher performance.
- Configured Hive metastore with MySQL, which stores the metadata of Hive tables.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
- Exposure to Spark and Impala.
- Benchmarked Hadoop clusters using DFSIO, TeraGen, and TeraSort.
- Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
- Used Ganglia and Nagios for monitoring the cluster around the clock.
- Wrote Nagios plugins to monitor NameNode health status and the number of running TaskTrackers and DataNodes (see the sketch after this list).
- Designed and implemented a distributed network monitoring solution based on Nagios and Ganglia using Puppet.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Moved data between HDFS and RDBMSs using Sqoop.
- Developed HIVE queries and UDFs to analyze the data in HDFS.
- Analyzed and transformed data with Hive and Pig.
- Implemented the Hadoop Float equivalent of the Teradata Decimal type.
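A minimal sketch of a Nagios-style NameNode check along the lines described above; the host, port, and DataNode threshold are placeholders, and the JMX bean name assumes a Hadoop 2.x NameNode web UI:

    #!/bin/bash
    # check_namenode.sh (hypothetical): query the NameNode JMX endpoint and report live DataNodes.
    NN_HOST=${1:-namenode.example.com}
    NN_PORT=${2:-50070}
    LIVE=$(curl -s "http://${NN_HOST}:${NN_PORT}/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState" \
        | grep -o '"NumLiveDataNodes" *: *[0-9]*' | grep -o '[0-9]*$')
    if [ -z "$LIVE" ]; then
        echo "CRITICAL - NameNode web UI not reachable on ${NN_HOST}:${NN_PORT}"; exit 2
    elif [ "$LIVE" -lt 3 ]; then
        echo "WARNING - only ${LIVE} live DataNodes reported"; exit 1
    else
        echo "OK - ${LIVE} live DataNodes reported"; exit 0
    fi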
Confidential, San Jose, CA
Hadoop Consultant
Responsibilities:
- Installed and configured 64-node Hadoop clusters on HDP 2.x.
- Installed and configured Hadoop cluster across various environments.
- Performed both major and minor upgrades to the existing Hortonworks Hadoop cluster.
- Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
- Experience using the Ambari Web UI for both automated and manual installations.
- Good knowledge of installing and using Spark and Impala.
- Experience setting up a new cluster from an existing cluster using Ambari Blueprints.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
- Performance tuned and optimized Hadoop clusters to achieve high performance.
- Configured the Hive View and File View in Ambari and maintained the Ambari database.
- Configured High Availability (HA) for NameNode and HiveServer2 (see the sketch below).
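Illustrative commands for verifying a NameNode HA pair of the kind configured above; the service IDs nn1 and nn2 are placeholders taken from hdfs-site.xml:

    hdfs haadmin -getServiceState nn1    # expect "active" or "standby"
    hdfs haadmin -getServiceState nn2
    hdfs haadmin -failover nn1 nn2       # manual failover test from nn1 to nn2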
Confidential
System Administrator
Responsibilities:
- Serve as the primary systems administrator for Linux servers and services.
- Monitor system performance and utilization.
- Install, configure, and document new servers and applications.
- Maintain and audit user accounts.
- Manage fileserver utilization (shared folders, quotas, etc.).
- Maintain servers and system security according to campus standards.
- Track vulnerabilities and apply appropriate patches and upgrades.
- Generate statistics for operational review and planning.
- Manage the backup process and perform data recoveries as needed (see the sketch after this list).
- Ensure systems team support requests are answered within one business day.
- Respond rapidly to system maintenance needs, including on weekends and evenings.
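A minimal sketch of the kind of nightly backup job referenced above; the source and destination paths and the retention window are placeholders:

    #!/bin/bash
    # nightly_backup.sh (hypothetical): snapshot shared folders and prune old copies.
    SRC=/srv/fileshares
    DEST=/backup/fileshares/$(date +%F)
    mkdir -p "$DEST"
    rsync -a --delete "$SRC/" "$DEST/"
    # Keep only the last 14 daily copies.
    find /backup/fileshares -mindepth 1 -maxdepth 1 -type d -mtime +14 -exec rm -rf {} +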