Hadoop Admin Resume
Bellevue, WA
SUMMARY:
- 7+ years of experience in the IT industry, with proven experience in Big Data administration and development technologies.
- Experienced with the Hadoop ecosystem, including MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, and Flume.
- Expertise in setting up processes for Hadoop-based application design and implementation.
- Experience importing and exporting data with Sqoop between HDFS and relational database systems (sample commands follow this summary).
- Good understanding of NoSQL databases such as HBase and Cassandra.
- Hands-on experience developing Spark applications on Hadoop using Scala as a functional and object-oriented language.
- Hands-on experience with cluster monitoring tools such as Ambari and Cloudera Manager.
- Experience working with data formats such as XML, JSON, Parquet, and Avro, as well as data stored in databases.
- Expertise working with web technologies such as HTML5, CSS3, vanilla JavaScript, ES6, and Bootstrap.
- Expertise in version control with Git.
- Security administration during installation, with knowledge of Kerberos, Apache Ranger, etc.
- Knowledge of processing and analyzing real-time data streams using Kafka.
- Experience in coding in Python, Scala and Core Java.
- Analyzed streaming data and identified important trends for further analysis using Spark Streaming.
- Hands-on experience writing Spark SQL scripts and implementing Spark RDD transformations and actions in Python/Scala.
- Experience analyzing log files for Hadoop ecosystem services and identifying root causes.
- Experience starting/stopping Hadoop services during OS patching and during hardware failures on data nodes.
- Experience communicating effectively with the onsite team and coordinating offshore team activities accordingly.
- Diligently teamed with infrastructure, network, database, application, and business intelligence teams to ensure high data quality and availability.
- Knowledge of Kerberos security, Kafka, Storm, Spark, and AWS.
- Hands-on experience configuring Hadoop clusters in a professional environment and on Amazon Web Services (AWS) using EC2 instances.
- Used Kubernetes to orchestrate the deployment, scaling, and management of Docker containers.
- Monitored and configured a test cluster on Amazon Web Services with EMR and EC2 instances for further testing and gradual migration.
- Proficient in deployment and troubleshooting of JAR, WAR and EAR files in clustered environments.
- Hands-on experience with Linux administration activities on RHEL and CentOS.
- Strong overall experience in system administration, installation, upgrades, patching, migration, configuration, troubleshooting, security, backup, disaster recovery, performance monitoring, and fine-tuning on Linux (RHEL) systems.
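Sample Sqoop commands illustrating the HDFS-to-RDBMS import/export work above (a hedged sketch: the connection string, credentials, table names, and paths are placeholders, not actual project values):

# Import a MySQL table into HDFS with four parallel mappers.
sqoop import \
  --connect jdbc:mysql://db-host:3306/sales_db \
  --username etl_user -P \
  --table orders \
  --target-dir /user/hadoop/staging/orders \
  --num-mappers 4

# Export aggregated results from HDFS back to the relational database.
sqoop export \
  --connect jdbc:mysql://db-host:3306/sales_db \
  --username etl_user -P \
  --table order_summary \
  --export-dir /user/hadoop/output/order_summary \
  --input-fields-terminated-by ','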
TECHNICAL SKILLS:
Big data / Hadoop Technologies: HDFS, MapReduce, Hive, Spark, Sqoop, Zookeeper
Programming Languages: Python, Scala, Core Java, Shell Scripting (Bash).
Operating Systems: Windows, Linux.
Databases: HBase, Cassandra, SQL Server.
Web Servers: WebLogic, WebSphere, Apache
Web Technologies: HTML, CSS, JavaScript, React JS, Bootstrap
PROFESSIONAL EXPERIENCE:
Hadoop Admin
Confidential, Bellevue, WA
Responsibilities:
- Installed, configured, and maintained single-node and multi-node Hadoop clusters.
- Responsible for all configuration changes and applying patches on the cluster.
- Responsible for all customer issues related to the Big Data Hadoop cluster during offshore hours, providing support through the ServiceNow ticketing tool for tickets of various severities.
- Administered back end services and databases in the virtual environment.
- Coordinated with technical teams for installation of Hadoop and related third-party applications on systems.
- Configured Kerberos security to enable a Secure Hadoop cluster in PROD/DEV environment.
- Hands-on experience with cluster monitoring tools such as Ambari.
- Configured Sqoop to import data from an external MySQL database.
- Load balanced the cluster using HDFS balancer scripts (see the maintenance commands at the end of this list).
- Managed users on Hadoop for HDFS and MapReduce.
- Assisted with performance tuning and monitoring.
- Supported technical team members for automation, installation and configuration tasks.
- Worked with users to resolve issues related to access and jobs running on the cluster.
- Commissioned and decommissioned worker nodes in the Hadoop cluster.
- Imported/exported data between RDBMS and HDFS using data ingestion tools such as Sqoop.
- Used the Fair Scheduler to manage MapReduce jobs so that each job received roughly the same share of cluster resources.
- Recovering nodes from failures and troubleshooting common Hadoop cluster issues.
- Used Kubernetes to orchestrate the deployment, scaling, and management of Docker containers.
- Involved in creating Hive internal/external tables, loading them with data, and troubleshooting Hive jobs.
- Performed security administration during installation, with knowledge of Kerberos, Apache Ranger, etc.
- Worked on configuring security for the Hadoop cluster and managing and scheduling jobs on it.
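Sample cluster-maintenance commands referenced above (an illustrative sketch: the threshold value, hostname, and exclude-file path are placeholders, not recorded settings):

# Rebalance block distribution across DataNodes; threshold is the allowed
# percentage deviation from average utilization.
hdfs balancer -threshold 10

# Decommission a worker node: add it to the excludes file referenced by
# dfs.hosts.exclude in hdfs-site.xml, then have the NameNode re-read the list.
echo "worker05.example.com" >> /etc/hadoop/conf/dfs.exclude
hdfs dfsadmin -refreshNodes

# Verify decommissioning progress and overall DataNode health.
hdfs dfsadmin -report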
Environment: Hadoop, HDFS, Spark, Kafka, Hive, HTML, CSS, JavaScript, React JS, Bootstrap
Big Data Administrator
Confidential, Franklin Lakes, NJ
Responsibilities:
- Involved in installing, configuring, supporting, and managing Hadoop clusters, including migration from the Hortonworks Data Platform (HDP) to Cloudera's Distribution of Hadoop (CDH).
- Worked on Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Prepared builds, deployed them, and coordinated with the release management team to ensure the proper process was followed during each release.
- Optimized Hive Joins for large tables and developed SQL for full outer join of two large tables.
- Installed Hadoop/HBase, performed initial HDFS and MapReduce configuration, and set up High Availability (HA) clusters.
- Denormalized and flattened power-sensor data spread across multiple tables to perform HBase operations.
- Developed Spark scripts using Scala as per requirements.
- Developed rich UIs using HTML, CSS, vanilla JavaScript, and Bootstrap.
- Ran import/export jobs to copy data between RDBMS and HDFS using Sqoop.
- Ran Spark SQL queries to draw useful insights from the loaded data and presented them to the board.
- Used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark.
- Experience designing and developing applications in Spark using Scala to compare the performance of Spark with Cassandra.
- Hands-on experience with cluster monitoring tools such as Ambari and Cloudera Manager.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Captured web-server logs into HDFS using Flume and Splunk for analysis (a sample agent configuration follows this list).
- Experienced in writing Pig scripts and Pig UDFs to preprocess the data for analysis.
- Experienced in projects using JIRA, Maven, MSBuild, and Jenkins build tools, as well as testing.
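A minimal Flume sketch for the web-log capture described above (the agent name, log path, NameNode address, and HDFS path are assumptions for illustration, not recorded project values):

# Write a single-agent configuration: exec source tailing the access log,
# in-memory channel, HDFS sink writing into dated directories.
cat > weblog-agent.conf <<'EOF'
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

a1.sources.r1.type    = exec
a1.sources.r1.command = tail -F /var/log/httpd/access_log

a1.channels.c1.type     = memory
a1.channels.c1.capacity = 10000

a1.sinks.k1.type                   = hdfs
a1.sinks.k1.hdfs.path              = hdfs://namenode:8020/data/weblogs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType          = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true

a1.sources.r1.channels = c1
a1.sinks.k1.channel    = c1
EOF

# Start the agent against the configuration above.
flume-ng agent --name a1 --conf /etc/flume/conf --conf-file weblog-agent.conf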
Environment: HDFS, Spark RDD, Spark SQL, HTML, CSS, Sqoop, JIRA.
Hadoop Administrator/Developer
Confidential, Dallas, TX
Responsibilities:
- Responsible for data cleaning of several geomatics data files received on a daily basis.
- Responsible for development of the web pages from mockups.
- Involved in the design and development of new programs and sub programs, as well as enhancements, modifications and corrections to the existing software.
- Hands-on experience installing, configuring, maintaining, monitoring, performance tuning, and troubleshooting Hadoop clusters in different environments such as development, test, and production.
- Hands-on experience with cluster monitoring tools such as Ambari and Cloudera Manager.
- Added and removed components through Cloudera Manager.
- Good experience acting on cluster audit findings and tuning configuration parameters.
- Worked extensively with GitHub as part of the project's transition from Subversion; expertise in Git commands.
- Migrated ETL jobs to Pig scripts that perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
- Built data models of the received geomatics data using D3.js and presented them to the board.
- Experienced in an Agile environment involving collaboration between cross-functional teams.
- Handled front-end development of the company website using HTML, CSS, Sass, and flexbox.
- Followed OOP design and development practices for building the enterprise web application.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend (a sample table definition follows this list).
- Designed and implemented incremental imports into Hive tables.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Experienced in managing and reviewing the Hadoop log files.
- Worked in AWS environment for development and deployment of Custom Hadoop Applications.
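A hedged sketch of the Hive table work referenced above (the JDBC URL, table, columns, and HDFS path are illustrative placeholders, not project values):

# Create a delimited Hive table, load a cleaned file already staged in HDFS,
# and run an aggregation that executes as a MapReduce job in the backend.
beeline -u jdbc:hive2://hiveserver:10000 <<'EOF'
CREATE TABLE IF NOT EXISTS geo_points (
  point_id    BIGINT,
  latitude    DOUBLE,
  longitude   DOUBLE,
  captured_at STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

LOAD DATA INPATH '/user/hadoop/clean/geo_points.csv' INTO TABLE geo_points;

SELECT captured_at, COUNT(*) FROM geo_points GROUP BY captured_at;
EOF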
Environment: HDFS, GitHub, Data Cleaning, Web Technologies, HTML, CSS, AWS, JavaScript
Linux Administrator
Confidential
Responsibilities:
- Installed and configured hardware and operating systems such as HP-UX and RHEL 5/6 on x86 servers.
- Built servers, configured them, installed tools and patches, and transitioned servers to production support.
- Experience supporting Red Hat Cluster and Oracle RAC environments running Oracle databases in high availability.
- Experience creating and managing HP ProLiant DL G4, G5, G6 & G7, and C7000 Blade Centers.
- Improved monitoring with Nagios and custom plugins; designed and implemented the Nagios installation for server monitoring.
- Installed and managed packages from the command line using RPM and YUM.
- Added storage to cluster disks and increased/decreased filesystems in RHEL.
- Responsible for providing reliable network infrastructure and file-sharing services over the IPv4/IPv6 TCP stack; installed, configured, and maintained LDAP, NIS/NIS+, DHCP, DNS, FTP/VSFTP, NFS, AutoFS, Samba, and mail servers, and managed RPM packages and updated YUM repositories.
- Monitored system CPU, memory, and disk utilization using top, vmstat, netstat, etc., and experienced with various network protocols such as HTTP, UDP, POP, FTP, TCP/IP, and SMTP.
- Provided second-tier technical support and issue resolution for Linux-based servers; installed and configured Red Hat Linux locally or over the network via Kickstart (NFS, FTP, and HTTP).
- Configured dynamic and static IPv4 network settings and packet filtering, reviewed filesystem management and removable-media concepts, and configured NFS shares with AutoFS.
- Managed filesystems on software RAID and recovered arrays; managed filesystems with Logical Volume Management (LVM), resized them, and protected them with LVM snapshots (a sample command sequence follows this list).
- Experience in automation using scripts in Perl and shell (bash and korn).
- Experience supporting single sign-on authentication using LDAP in Linux environments.
- Configured Kickstart servers to install Red Hat Linux and VMware ESX on multiple machines.
- Experience configuring LDAP clients and performing activities like user administration using LDAP.
- Involved in virtualization with VMware ESX/vSphere 4.1; created VMs and performed P2V and P2P migrations.
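A sample LVM command sequence for the filesystem-resizing and snapshot work above (device, volume group, logical volume, and sizes are placeholders, not recorded values):

# Bring a new disk into the volume group, grow the logical volume,
# and resize the filesystem online.
pvcreate /dev/sdc1
vgextend vg_data /dev/sdc1
lvextend -L +20G /dev/vg_data/lv_app
resize2fs /dev/vg_data/lv_app        # ext3/ext4; use xfs_growfs for XFS

# Take an LVM snapshot before patching so the volume can be rolled back.
lvcreate --snapshot --size 5G --name lv_app_snap /dev/vg_data/lv_app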
Environment: Linux (RHEL 4.x and 5.x), VMware vSphere, ESXi, Git, IBM Rational ClearQuest, SVN, ANT, Shell (Bash), LVM, DNS, DHCP, HTTP, TFTP, Apache Tomcat, NFS, RPM, YUM, and RAID.