Hadoop Admin Resume
Irvine, CA
SUMMARY
- 8+ years of IT industry experience in administering Linux, managing databases, developing MapReduce applications, and designing, building and administering large-scale Hadoop production clusters.
- 2.5 years of experience in big data technologies: Hadoop HDFS, MapReduce, Pig, Hive, Oozie, Flume, Sqoop, ZooKeeper, and the NoSQL databases Cassandra and HBase.
- Experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (HIVE, PIG, SQOOP, OOZIE, FLUME, HCATALOG, HBASE, ZOOKEEPER) using Hortonworks Ambari.
- Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
- Strong knowledge of the Apache Hive data warehouse, data cubes, HiveServer, partitioning, bucketing, clustering, and writing UDFs, UDAFs and UDTFs for Hive in Java (a UDF sketch follows this summary).
- Solid experience in Pig administration and development, including writing Pig UDFs (Eval, Filter, Load and Store) and macros.
- Experience in administering Linux systems to deploy Hadoop clusters and monitoring the clusters using Nagios and Ganglia.
- Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster.
- Experience in performing minor and major upgrades and in commissioning and decommissioning DataNodes on Hadoop clusters.
- Strong knowledge of configuring NameNode High Availability and NameNode Federation.
- Familiar with writing Oozie workflows and job controllers for automating shell, Hive and Sqoop jobs.
- Familiar with importing and exporting data using Sqoop from RDBMSs such as MySQL, Oracle and Teradata, including the use of fast loaders and connectors.
- Experience in using Flume to stream data from various sources into HDFS.
- Hands-on experience in provisioning and managing multi-tenant Hadoop clusters in a public cloud environment, Amazon Web Services (AWS) EC2.
- Experience in installing and administering a PXE server with Kickstart, setting up FTP, DHCP and DNS servers, and Logical Volume Management.
- Experience in configuring and managing NAS (file-level access, NFS) and SAN (block-level access, iSCSI) storage devices.
- Experience in storage management including JBOD, RAID levels 1, 5, 6 and 10, logical volumes, volume groups and partitioning.
- Exposure to Maven/Ant and Git, along with shell scripting, for the build and deployment process.
- Experience in understanding Hadoop security requirements and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service and managing keytabs using keytab tools.
- Experience in handling multiple relational databases: MySQL, SQL Server.
- Familiar with Agile methodology (Scrum) and software testing.
- Effective problem-solving and outstanding interpersonal skills; able to work independently as well as within a team, driven to meet deadlines, and quick to learn and use new technologies.
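As a concrete illustration of the Hive UDF work noted above, here is a minimal sketch of a string-normalizing UDF against the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name, function name and registration paths are placeholders, not taken from an actual project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative Hive UDF: trims whitespace and lower-cases a string column.
// Registered in Hive with (paths/packages are placeholders):
//   ADD JAR /path/to/udfs.jar;
//   CREATE TEMPORARY FUNCTION normalize_str AS 'com.example.NormalizeString';
public class NormalizeString extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                              // pass NULLs through unchanged
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```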
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, SQOOP, FLUME, MAP-REDUCE, HIVE, PIG, OOZIE, ZOOKEEPER
NoSQL Databases: HBase, Cassandra
Security: Kerberos
Database: MySQL, SQL Server
Cluster Management Tools: Cloudera Manager, Ambari
OS: Linux (CentOS, RHEL), Windows, Mac
PROFESSIONAL EXPERIENCE
Confidential, Irvine, CA
Hadoop Admin
Responsibilities:
- Managing 5 Hortonworks clusters totaling 1,500 nodes (Development, R&D, Discovery PROD, MARS HBase and MARS PROD)
- Designed and architected the R&D cluster with HDP 2.3.2 and Ambari 2.2.0
- Worked on 4 different versions of HDP (1.3.2, 2.1.5, 2.2.6 and 2.3.2, the latest enterprise release)
- Upgraded HDP 1.3.2 and 2.1.5 to 2.2.6 using Blueprints, and 2.2.6 to 2.3.2 using a rolling upgrade with no downtime to the PROD cluster
- Configured Hadoop High Availability for NameNode, HBase, Hive, YARN and Storm (Nimbus)
- Configured Hadoop security with Kerberos, Ranger and Knox for a secured cluster
- Configured HDFS data-at-rest encryption using Ranger KMS
- Configured Storm HA
- Installed and configured Spark
- Created Kafka topics and produced and consumed messages (a producer sketch follows this list)
- Cluster performance tuning
- Set up 3 ZooKeeper instances dedicated to HBase, Storm and Kafka; the first instance is managed by Ambari and the other 2 are outside Ambari
- Configured Apache Ranger for centralized security and auditing of HDFS, YARN, Hive, HBase, Storm and Kafka.
- Installed and configured Informatica 9.6.1 HF1 Big Data Edition for Hadoop ETL
- Commissioned and decommissioned DataNodes
- Troubleshot issues reported by Nagios
- Built and configured log data loading into HDFS using Flume.
- Wrote shell scripts to monitor a few components outside Ambari
- Imported and exported data into HDFS and Hive using Sqoop.
- Provisioned, installed, configured, monitored and maintained HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig and Hive
- Recovered from node failures and troubleshot common Hadoop cluster issues
- Supported Hadoop developers and assisted in optimizing MapReduce jobs, Pig Latin scripts, Hive scripts and HBase ingest as required
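A minimal sketch of producing messages to a Kafka topic, as in the "created Kafka topics and produced and consumed messages" item above; the broker address and topic name are illustrative assumptions, and the code uses the standard org.apache.kafka.clients producer API.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed broker endpoint; replace with the cluster's real bootstrap servers.
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Topic name is illustrative; the topic would have been created beforehand.
            producer.send(new ProducerRecord<>("test-topic", "key1", "hello from the producer"));
            producer.flush();   // block until the message has been handed to the broker
        }
    }
}
```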
Confidential, San Francisco, CA
Hadoop Admin
Responsibilities:
- Designed and developed data solutions to help business and product teams make data-driven decisions
- Worked closely with data analysts to construct creative solutions for their analysis tasks
- Led end-to-end efforts to design, develop and implement data warehousing and business intelligence solutions
- Performed a major upgrade of the cluster from CDH3u6 to CDH 4.4.0
- Developed Puppet modules to automate the installation, configuration and deployment of software, operating systems and network infrastructure at the cluster level
- Implemented NameNode HA and automatic failover infrastructure, utilizing ZooKeeper services, to eliminate the NameNode single point of failure
- Implemented Cloudera Manager on the existing cluster
- Optimized our Hadoop infrastructure at both the software and hardware level
- Ensured our Hadoop clusters are built and tuned in the most optimal way to support the activities of our Big Data teams
- Developed MapReduce programs to extract and transform the data sets; results were exported back to an RDBMS using Sqoop (a sketch follows this list)
- Installed, Configured and managed Flume Infrastructure
- Was responsible for importing the data (mostly log files) from various sources into HDFS using Flume
- Created tables in Hive and loaded the structured data resulting from MapReduce jobs
- Configured the Hive metastore to use a MySQL database, making all tables created in Hive available to different users simultaneously.
- Developed many HiveQL queries and extracted the information required by the business.
- Exported the required business information to an RDBMS using Sqoop, making the data available to the BI team for generating reports.
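A minimal sketch of the kind of extract-and-transform MapReduce program described above, written against the standard org.apache.hadoop.mapreduce API; the tab delimiter and field positions are assumptions for illustration, and the output directory would then be exported to the RDBMS with Sqoop as noted.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ExtractTransformJob {

    // Map-only job: keep two illustrative fields from each tab-delimited record.
    public static class ExtractMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 3) {
                // Field positions are placeholders, not from a real schema.
                context.write(NullWritable.get(), new Text(fields[0] + "\t" + fields[3]));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "extract-transform");
        job.setJarByClass(ExtractTransformJob.class);
        job.setMapperClass(ExtractMapper.class);
        job.setNumReduceTasks(0);                      // map-only: output written straight to HDFS
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```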
Confidential, Jersey City, NJ
Hadoop Admin
Responsibilities:
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
- Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP and password-less SSH (key-based) login.
- Implemented authentication service using Kerberos authentication protocol.
- Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
- Configured master node disks with RAID 1+0
- Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
- Tuned the cluster by commissioning and decommissioning DataNodes.
- Upgraded the Hadoop cluster.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Deployed high availability on the Hadoop cluster using quorum journal nodes.
- Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller.
- Configured Ganglia, including installing the gmond and gmetad daemons, which collect metrics across the distributed cluster and present them in real-time dynamic web pages that help with debugging and maintenance.
- Implemented Kerberos for authenticating all the services in Hadoop Cluster.
- Deployed a network file system (NFS) mount for NameNode metadata backup.
- Performed cluster backup using DistCp, Cloudera Manager BDR and parallel ingestion.
- Designed and allocated HDFS quotas for multiple groups.
- Configured and deployed the Hive metastore using MySQL and the Thrift server.
- Used Hive schemas to create relations in Pig using HCatalog.
- Developed Pig scripts for handling raw data for analysis.
- Deployed Sqoop server to perform imports from heterogeneous data sources to HDFS.
- Deployed and configured flume agents to stream log events into HDFS for analysis.
- Deployed YARN, which enables multiple applications to run on the cluster.
- Configured Oozie for workflow automation and coordination.
- Wrote custom monitoring scripts for Nagios to monitor the daemons and cluster status (a health-check sketch follows this section).
- Wrote custom shell scripts for automating redundant tasks on the cluster.
- Worked with BI teams in generating the reports and designing ETL workflows on Pentaho.
Environment: LINUX, HDFS, SQOOP, FLUME, MAP-REDUCE, HIVE, PIG, OOZIE, ZOOKEEPER
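The Nagios checks above were shell-based; as an illustration of the same idea, here is a minimal Java sketch of an HDFS capacity check using the standard Hadoop FileSystem API. The 85% threshold and the Nagios-style exit codes are assumptions, not values from the original scripts.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class HdfsCapacityCheck {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from the core-site.xml on the classpath.
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            FsStatus status = fs.getStatus();
            double usedPct = 100.0 * status.getUsed() / status.getCapacity();
            System.out.printf("HDFS used: %.1f%% of %d bytes%n", usedPct, status.getCapacity());
            // 85% is an illustrative threshold; Nagios treats exit code 2 as CRITICAL.
            System.exit(usedPct > 85.0 ? 2 : 0);
        }
    }
}
```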
Confidential
Java Developer
Responsibilities:
- Involved in the design and followed Agile Software Development Methodology throughout the software development lifecycle.
- Designed Use Cases, Class Diagrams, and Sequence Diagrams using Visual Paradigm to model the detail design of the application.
- Developed the user interface for the presentation layer using JSP standard tags, JavaScript, HTML and CSS.
- Used Spring validation for web form validation by implementing the Validator interface (a sketch follows this list).
- Built the application on the Spring MVC framework with Hibernate as the ORM.
- Used the Spring Core module for dependency injection and integrated the view layer using Apache Tiles.
- Consumed third-party web services (WSDL, SOAP, UDDI) for authorizing payments to/from customers using the CXF framework.
- Used JMS queue communication in the authorization module.
- Mapped DTOs (one-to-many, one-to-one and many-to-one relations) to Oracle database tables, and Java data types to SQL data types, by creating Hibernate mapping XML files.
- Used an Oracle database and wrote stored procedures for common SQL queries.
- Used Ant for building the enterprise application modules, CVS for version control and Log4j to monitor error logs, and performed unit testing using JUnit.
- Deployed the applications on IBM WebSphere Application Server 5.0.
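A minimal sketch of the Validator-interface approach mentioned above, shown against the current org.springframework.validation.Validator signature; the form bean, field name and error codes are illustrative only, not taken from the original application.

```java
import org.springframework.validation.Errors;
import org.springframework.validation.ValidationUtils;
import org.springframework.validation.Validator;

// Illustrative form bean; the real application used its own command objects.
class PaymentForm {
    private String accountNumber;
    public String getAccountNumber() { return accountNumber; }
    public void setAccountNumber(String accountNumber) { this.accountNumber = accountNumber; }
}

public class PaymentFormValidator implements Validator {

    @Override
    public boolean supports(Class<?> clazz) {
        return PaymentForm.class.isAssignableFrom(clazz);
    }

    @Override
    public void validate(Object target, Errors errors) {
        // Reject empty required fields; error codes resolve to messages in the resource bundle.
        ValidationUtils.rejectIfEmptyOrWhitespace(errors, "accountNumber", "accountNumber.required");
        PaymentForm form = (PaymentForm) target;
        if (form.getAccountNumber() != null && !form.getAccountNumber().matches("\\d{10,16}")) {
            errors.rejectValue("accountNumber", "accountNumber.invalid");
        }
    }
}
```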