
Hadoop Admin Resume


Bellevue, WA

SUMMARY:

  • 7+ years of experience in the IT industry, including proven experience in Big Data administration and development technologies.
  • Experienced with Hadoop ecosystem components such as MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, and Flume.
  • Expertise in setting up processes for Hadoop-based application design and implementation.
  • Experience importing and exporting data with Sqoop between HDFS and relational database systems (see the sketch after this list).
  • Good understanding of NoSQL databases such as HBase and Cassandra.
  • Hands-on experience developing Hadoop applications on Spark using Scala as a functional and object-oriented language.
  • Hands-on experience with cluster monitoring tools such as Ambari and Cloudera Manager.
  • Experience working with data file formats such as XML, JSON, Parquet, and Avro, as well as data stored in databases.
  • Expertise with web technologies such as HTML5, CSS3, vanilla JavaScript (ES6), and Bootstrap.
  • Expertise with Git for version control.
  • Security administration during installation, with knowledge of Kerberos, Apache Ranger, etc.
  • Knowledge of processing and analyzing real-time data streams using Kafka.
  • Experience coding in Python, Scala, and core Java.
  • Analyzed streaming data and identified important trends for further analysis using Spark Streaming.
  • Hands-on experience writing Spark SQL scripts and implementing Spark RDD transformations and actions in Python/Scala.
  • Experience analyzing log files for Hadoop ecosystem services and finding root causes.
  • Experience starting/stopping Hadoop services during OS patching and during hardware failures on data nodes.
  • Experience communicating effectively with onsite teams and coordinating offshore team activities accordingly.
  • Teamed diligently with infrastructure, network, database, application, and business intelligence teams to ensure high data quality and availability.
  • Knowledge of Kerberos security, Kafka, Storm, Spark, and AWS.
  • Hands-on experience configuring Hadoop clusters both in professional environments and on Amazon Web Services (AWS) using EC2 instances.
  • Used Kubernetes to orchestrate the deployment, scaling, and management of Docker containers.
  • Monitored and configured a test cluster on AWS with EMR and EC2 instances for further testing and gradual migration.
  • Proficient in deploying and troubleshooting JAR, WAR, and EAR files in clustered environments.
  • Hands-on experience with Linux administration activities on RHEL and CentOS.
  • Strong overall experience in system administration, installation, upgrades, patching, migration, configuration, troubleshooting, security, backup, disaster recovery, performance monitoring, and fine-tuning on Linux (RHEL) systems.
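An illustrative Sqoop import/export sketch for the HDFS-to-RDBMS transfers above (the JDBC URL, credentials, table names, and HDFS paths are placeholders, not details from a specific engagement):

    # Pull a table from MySQL into HDFS, splitting the work across 4 mappers.
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Push processed results back out to the relational database.
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders_summary \
      --export-dir /data/curated/orders_summary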

TECHNICAL SKILLS:

Big Data / Hadoop Technologies: HDFS, MapReduce, Hive, Spark, Sqoop, ZooKeeper

Programming Languages: Python, Scala, Core Java, Shell scripting (Bash).

Operating Systems: Windows, Linux.

Databases: HBase, Cassandra, SQL Server.

Web/Application Servers: WebLogic, WebSphere, Apache

Web Technologies: HTML, CSS, JavaScript, React JS, Bootstrap

PROFESSIONAL EXPERIENCE:

Hadoop Admin

Confidential, Bellevue, WA

Responsibilities:

  • Installed, configured, and maintained single-node and multi-node Hadoop clusters.
  • Responsible for all configuration changes and for applying patches on the cluster.
  • Responsible for all customer issues related to the Big Data Hadoop cluster during offshore hours, providing support through the ServiceNow ticketing tool for tickets of various severities.
  • Administered back-end services and databases in the virtual environment.
  • Coordinated with technical teams to install Hadoop and related third-party applications on systems.
  • Configured Kerberos security to enable a secure Hadoop cluster in the PROD/DEV environments.
  • Hands-on experience with cluster monitoring tools such as Ambari.
  • Configured Sqoop to import data from an external MySQL database.
  • Load-balanced the cluster using HDFS balancer scripts.
  • Managed Hadoop users for HDFS and MapReduce.
  • Assisted with performance tuning and monitoring.
  • Supported technical team members for automation, installation and configuration tasks.
  • Worked with users to resolve issues related to access and to jobs running on the cluster.
  • Commissioned and decommissioned worker nodes on the Hadoop cluster (see the sketch after this list).
  • Imported/exported data between RDBMS and HDFS using data ingestion tools such as Sqoop.
  • Used the Fair Scheduler to manage MapReduce jobs so that each job received, on average, an equal share of cluster resources over time.
  • Recovered nodes from failures and troubleshot common Hadoop cluster issues.
  • Used Kubernetes to orchestrate the deployment, scaling, and management of Docker containers.
  • Created Hive internal/external tables, loaded them with data, and troubleshot Hive jobs.
  • Performed security administration during installation, with knowledge of Kerberos, Apache Ranger, etc.
  • Configured security for the Hadoop cluster and managed and scheduled jobs on it.
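An illustrative sketch of the node commissioning/decommissioning and balancing steps above (the hostname, excludes-file path, and balancer threshold are placeholders and vary by distribution):

    # Decommission a worker: list it in the excludes file referenced by dfs.hosts.exclude,
    # then have the NameNode re-read its host lists and drain blocks off the node.
    echo "worker05.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes
    hdfs dfsadmin -report | grep -A 6 worker05.example.com   # watch decommissioning progress

    # After commissioning new nodes, rebalance existing blocks across the cluster;
    # the threshold is the allowed disk-usage deviation (in percent) between DataNodes.
    hdfs balancer -threshold 10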

Environment: Hadoop, HDFS, Spark, Kafka, Hive, HTML, CSS, JavaScript, React JS, Bootstrap

Big Data Administrator

Confidential, Franklin Lakes, NJ

Responsibilities:

  • Involved in installing, configuring, supporting, and managing Hadoop clusters on distributions ranging from Hortonworks Data Platform (HDP) to Cloudera Distribution of Hadoop (CDH).
  • Worked on Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Loaded data into Spark RDDs and performed in-memory data computation to generate the output response.
  • Prepared builds, deployed them, and coordinated with the release management team to ensure the proper process was followed during releases.
  • Optimized Hive joins for large tables and developed SQL for a full outer join of two large tables.
  • Installed Hadoop/HBase, performed initial HDFS and MapReduce configuration, and set up High Availability (HA) clusters.
  • Denormalized and flattened power-sensor data spread across multiple tables to support HBase operations.
  • Developed Spark scripts using Scala as per requirements.
  • Developed Rich UI using HTML, CSS, Vanilla JavaScript and Bootstrap.
  • Ran import and export jobs to copy data between RDBMS and HDFS using Sqoop.
  • Used Spark SQL to draw useful insights from the loaded data and presented findings to the board.
  • Used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark.
  • Designed and developed applications in Spark using Scala to compare the performance of Spark with Cassandra.
  • Hands-on experience with cluster monitoring tools such as Ambari and Cloudera Manager.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data (see the sketch after this list).
  • Captured data logs from the web server into HDFS using Flume and Splunk for analysis.
  • Experienced in writing Pig scripts and Pig UDFs to pre-process the data for analysis.
  • Experienced in projects using JIRA for tracking and Maven, MSBuild, and Jenkins as build tools, along with testing.
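An illustrative Hadoop streaming sketch for the XML-processing jobs above (the streaming jar path varies by distribution, and the mapper/reducer scripts and HDFS paths are hypothetical):

    # Run a streaming job that parses raw XML with a Python mapper/reducer pair.
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
      -D mapreduce.job.reduces=8 \
      -files parse_xml.py,aggregate.py \
      -mapper parse_xml.py \
      -reducer aggregate.py \
      -input /data/raw/sensor_xml \
      -output /data/staged/sensor_parsed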

Environment: HDFS, Spark RDD, Spark SQL, HTML, CSS, Sqoop, Jira.

Hadoop Administrator/Developer

Confidential, Dallas, TX

Responsibilities:

  • Responsible for data cleaning of several geomatics data files received on a daily basis.
  • Responsible for development of the web pages from mockups.
  • Involved in the design and development of new programs and sub programs, as well as enhancements, modifications and corrections to the existing software.
  • Hands-on experience installing, configuring, maintaining, monitoring, performance-tuning, and troubleshooting Hadoop clusters in development, test, and production environments.
  • Hands-on experience with cluster monitoring tools such as Ambari and Cloudera Manager.
  • Added/installed new components and removed them through Cloudera Manager.
  • Good experience addressing cluster audit findings and tuning configuration parameters.
  • Worked extensively with GitHub as part of the project's transition from Subversion; expertise in Git commands.
  • Migrated ETL jobs to Pig scripts that perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
  • Built data models of the received geomatics data using D3.js and presented them to the board.
  • Experienced in an Agile environment involving collaboration between cross-functional teams.
  • Handled front-end development of the company website using HTML, CSS, Sass, and flexbox.
  • Followed OOP design and development practices for building the enterprise web application.
  • Created Hive tables, loaded them with data, and wrote Hive queries that invoke and run MapReduce jobs in the backend (see the sketch after this list).
  • Designed and implemented incremental imports into Hive tables.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Experienced in managing and reviewing the Hadoop log files.
  • Worked in AWS environment for development and deployment of Custom Hadoop Applications.
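An illustrative Hive sketch for the table-creation and log-querying work above (table and column names are hypothetical):

    # Create an external Hive table over log files already in HDFS, then run an
    # aggregation that Hive executes as a MapReduce job in the backend.
    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
      ts      STRING,
      host    STRING,
      request STRING,
      status  INT,
      bytes   BIGINT
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/data/raw/web_logs';

    SELECT status, COUNT(*) AS hits
    FROM web_logs
    GROUP BY status;
    "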

Environment: HDFS, GitHub, Data Cleaning, Web Technologies, HTML, CSS, AWS, JavaScript

Linux Administrator

Confidential

Responsibilities:

  • Installed and configured hardware and Unix/Linux operating systems such as HP-UX and RHEL 5/6 on x86 servers.
  • Built servers, configured them, installed tools and patches, and transitioned servers to production support.
  • Experience supporting Red Hat Cluster and Oracle RAC environments running Oracle databases in high availability.
  • Experience creating and managing HP ProLiant DL G4, G5, G6 & G7 servers and C7000 blade enclosures.
  • Improved monitoring with Nagios and custom plugins; designed and implemented the Nagios installation for server monitoring.
  • Installed and managed packages using the RPM and YUM command-line utilities.
  • Added storage to cluster disks and grew/shrank filesystems in RHEL.
  • Responsible for providing reliable network infrastructure and file-sharing services over the IPv4/IPv6 TCP stack; installed, configured, and maintained LDAP, NIS/NIS+, DHCP, DNS, FTP, VSFTP, NFS, AutoFS, Samba, and mail servers, managed packages with Red Hat Package Manager (RPM), and updated YUM repositories.
  • Monitored system CPU, memory, and disk utilization using top, vmstat, netstat, etc.; experienced with network protocols such as HTTP, UDP, POP, FTP, TCP/IP, and SMTP.
  • Provided second-tier technical support and issue resolution for Linux-based servers; installed and configured Red Hat Linux locally or over the network via Kickstart (NFS, FTP, and HTTP).
  • Configured dynamic and static IPv4 network settings and packet filtering, managed file systems and removable media, and configured NFS shares with AutoFS.
  • Managed file systems on software RAID and recovered arrays; managed, resized, and protected file systems with Logical Volume Management (LVM) and LVM snapshots (see the sketch after this list).
  • Experience in automation using scripts in Perl and shell (Bash and Korn).
  • Experience supporting single sign-on authentication using LDAP in Linux environments.
  • Configured Kickstart servers to install Red Hat Linux and VMware ESX on multiple machines.
  • Experience configuring LDAP clients and performing activities such as user administration using LDAP.
  • Worked on virtualization with VMware ESX/vSphere 4.1, created VMs, and performed P2V and P2P migrations.
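An illustrative LVM sketch for the file-system resize and snapshot work above (volume group and logical volume names are placeholders):

    # Grow a logical volume by 20 GB and resize its filesystem in one step.
    lvextend --size +20G --resizefs /dev/vg_data/lv_app

    # Take an LVM snapshot before risky maintenance so the volume can be rolled back.
    lvcreate --size 5G --snapshot --name lv_app_snap /dev/vg_data/lv_app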

Environment: Linux (RHEL 4.x and 5.x), VMware, vSphere, ESXi, Git, IBM Rational ClearQuest, SVN, ANT, Shell (Bash), LVM, DNS, DHCP, HTTP, TFTP, Apache Tomcat, NFS, RPM, YUM, and RAID.
