Hadoop Admin/Developer Resume
Middletown, NJ
PROFESSIONAL SUMMARY:
- Over 8 years of professional IT experience, including around 4 years of hands-on Hadoop experience with Cloudera and Hortonworks distributions; working environment includes MapReduce, HDFS, HBase, Zookeeper, Oozie, Hive, Sqoop, Pig, Cassandra, and Flume.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as HDFS, MapReduce, YARN, Zookeeper, Sqoop, Flume, Hive, HBase, Spark, Pig, and Oozie.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs.
- Experience in installing and configuring Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
- Experience in deploying Hadoop clusters using Cloudera 5.x, integrated with Ambari for monitoring and alerting.
- Experience in launching and setting up Hadoop clusters on AWS as well as on physical servers, including configuring the different components of Hadoop.
- Experience in developing and monitoring Puppet Configuration Manager to automate the configuration files of Hadoop Ecosystem.
- Experience in configuring various configuration files like core-site.xml, hdfs-site.xml, and mapred-site.xml based upon the job requirement.
- Experience in installing and configuring Zookeeper to coordinate the Hadoop daemons.
- Working knowledge of importing and exporting data into HDFS using Sqoop (a representative command is sketched at the end of this summary).
- Experience in defining batch job flows with Oozie.
- Experience in Loading log data directly into HDFS using Flume.
- Experienced in managing and reviewing Hadoop log files to troubleshoot issues that occurred.
- Experience in following standard backup measures to ensure high availability of the cluster.
- Experience in Implementing Rack Awareness for data locality optimization.
- Experience in scheduling volume snapshots for backup, performing root cause analysis of failures, documenting bugs and fixes, and scheduling downtimes and maintenance of the cluster.
- Experience in database imports and in working with imported data to populate Hive tables.
- Exposure to exporting data from relational databases to the Hadoop Distributed File System.
- Experience in cluster maintenance, commissioning and decommissioning the data nodes.
- Experience in monitoring systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Experience working with systems engineering teams to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Experience in monitoring multiple Hadoop clusters environments using Ganglia and Nagios as well as workload, job performance and capacity planning using Cloudera Manager.
- Experience in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
- Hands-on experience in Linux admin activities on RHEL and CentOS.
- Knowledge on Cloud technologies like AWS Cloud.
- Experience in Benchmarking, Backup and Disaster Recovery of Name Node Metadata.
- Experience in performing minor and major Upgrades of Hadoop Cluster.
- Experience with Source Code Management tools and proficient in GIT.
- Excellent interpersonal and communication skills, creative, research-minded with problem solving skills.
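The Sqoop import/export workflow referenced above typically looks like the following shell sketch; the connection string, table names, and HDFS paths are hypothetical placeholders:

  # Import an RDBMS table into HDFS (connection details are placeholders)
  sqoop import \
    --connect jdbc:mysql://dbhost:3306/sales \
    --username etl_user -P \
    --table orders \
    --target-dir /user/etl/orders \
    --num-mappers 4

  # Export processed results back to the relational database
  sqoop export \
    --connect jdbc:mysql://dbhost:3306/sales \
    --username etl_user -P \
    --table orders_summary \
    --export-dir /user/etl/orders_summary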
TECHNICAL SKILLS:
Big Data Tools: HDP 2.2, CDH 4.x and CDH 5.x, Apache Hadoop 1.2.1/2.x, HDFS, HBase, Hive, Pig, Sqoop, Oozie, Zookeeper, MapR Sandbox, HDP Hortonworks Sandbox
Operating System: Linux, RHEL 5.x, 6.x, UNIX, CentOS, Windows
Databases: MySQL, MS SQL Server, Oracle 11g, DB2; NoSQL: HBase
Scripting: Shell Scripting
Programming: C, Core Java, SQL, HiveQL, Pig Latin
Tools: Puppet, WinSCP, Putty
PROFESSIONAL EXPERIENCE:
Confidential, Middletown, NJ
Hadoop Admin/Developer
Responsibilities:
- Installed, configured, and maintained Hadoop clusters for application development along with Hadoop tools like Hive, Pig, HBase, Zookeeper, and Sqoop.
- Extensively worked with Cloudera Distribution of Hadoop (CDH 5.x, CDH 4.x).
- Extensively involved in Cluster Capacity planning, Hardware planning, Installation, Performance Tuning of the Hadoop Cluster.
- Worked on installing the cluster, commissioning and decommissioning DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Implemented both major and minor version upgrades on the existing cluster and rolled back to previous versions when needed.
- Created Hive tables to store the processed results in a tabular format.
- Used Oozie workflows to automate jobs on Amazon EMR.
- Utilized cluster co-ordination services through ZooKeeper.
- Configured Sqoop and exported/imported data into HDFS.
- Implemented Flume, Spark, and Spark Streaming frameworks for real-time data processing.
- Implemented proofs of concept on the Hadoop and Spark stack and different big data analytic tools, using Spark SQL as an alternative to Impala.
- Used Sqoop to import and export data from RDBMS to HDFS and vice-versa.
- Integrated Kerberos into Hadoop to make the cluster stronger and more secure against unauthorized users.
- Spun up EMR clusters with the required EC2 instance types based on job type and data size.
- Updated CloudFormation templates with IAM roles for S3 bucket access, security groups, subnet IDs, EC2 instance types, ports, and AWS tags. Worked on Bitbucket, Git, and Bamboo to deploy EMR clusters.
- Involved in updating scripts and step actions to install Ranger plugins.
- Debugged Spark job failures and provided workarounds.
- Used Splunk to analyze job logs and Ganglia to monitor servers.
- Involved in enabling SSL for Hue on the on-prem CDH cluster.
- Wrote shell scripts and successfully migrated data from on-prem to AWS EMR (S3); see the copy sketch below.
Environment: CDH 5.7.6, Cloudera Manager, Hadoop, HDFS, MapReduce, Yarn, Hive, Pig, Sqoop, Oozie, Flume, Zookeeper, Red Hat/CentOS 6.5.
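A minimal sketch of the on-prem HDFS to S3 migration step mentioned above; the bucket name, paths, and credential handling are assumptions:

  # Copy a warehouse directory from on-prem HDFS to S3 (bucket and paths are placeholders)
  hadoop distcp \
    -Dfs.s3a.access.key="$AWS_ACCESS_KEY_ID" \
    -Dfs.s3a.secret.key="$AWS_SECRET_ACCESS_KEY" \
    hdfs:///user/hive/warehouse/sales_db \
    s3a://example-emr-landing/warehouse/sales_db

  # Spot-check the copy by listing objects (assumes the AWS CLI is configured)
  aws s3 ls s3://example-emr-landing/warehouse/sales_db/ --recursive | wc -l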
Confidential, Denver, CO
Hadoop Admin
Responsibilities:
- Installed and configured Hadoop cluster across various environments through Cloudera Manager.
- Enable TLS between Cloudera manager and agents.
- Installed and configured MySQL and enabled High Availability.
- Installed and configured Revolution R and RStudio Server and integrated with Hadoop Cluster.
- Worked on setting up High availability for major Hadoop Components like NameNode, Resource Manager, HUE, Hive and Cloudera Manager.
- Designed automatic failover using Zookeeper.
- Installed and configured Sentry server to enable schema level Security.
- Troubleshot cluster issues and prepared run books.
- Integrated external components like Informatica BDE, Tibco and Tableau with Hadoop using Hive server2.
- Implemented the HDFS snapshot feature.
- Migrated data across clusters using DistCp (see the sketch at the end of this section).
- Performed both major and minor upgrades to the existing Cloudera Hadoop cluster.
- Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
- Applied patches and bug fixes on Hadoop Clusters.
- Performed tuning and optimized Hadoop clusters to achieve high performance.
- Implemented schedulers on the Resource Manager to share the resources of the cluster.
- Used Cloudera Navigator for Hadoop audit files and data lineage.
- Monitored Hadoop clusters using Cloudera Manager and provided 24x7 on-call support.
- Implemented and designed disaster recovery plan for Hadoop Cluster.
- Worked on Providing User support and application support on Hadoop Infrastructure.
- Prepared System Design document with all functional implementations.
- Worked with SQOOP import and export functionalities to handle large data set transfer between traditional databases and HDFS.
Environment: CDH 5.x, Hive2, Hue, R, MySQL, Tableau, Sqoop, CM 5.x, Zookeeper
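The snapshot and DistCp migration steps mentioned above can be sketched as below; the cluster hostnames, ports, and paths are placeholders:

  # Allow and take an HDFS snapshot on the source directory
  hdfs dfsadmin -allowSnapshot /data/warehouse
  hdfs dfs -createSnapshot /data/warehouse before_migration

  # Copy the snapshotted data to the target cluster with DistCp
  hadoop distcp -update -p \
    hdfs://source-nn:8020/data/warehouse/.snapshot/before_migration \
    hdfs://target-nn:8020/data/warehouse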
Confidential, Detroit, MI
Hadoop Systems Administrator
Responsibilities:
- Installed, configured, monitored, and maintained HDFS, Yarn, HBase, Flume, Sqoop, Oozie, Pig, Hive.
- Worked on scripting Hadoop package installation and configuration to support fully automated deployments.
- Supported Hadoop developers and assisted in optimizing MapReduce jobs, Pig Latin scripts, Hive scripts, and HBase ingest as required.
- Defined job flows and managed and reviewed Hadoop and Hbase log files.
- Ran Hadoop streaming jobs to process terabytes of text data.
- Loaded and transformed large sets of structured, semi structured and unstructured data.
- Supported MapReduce programs running on the cluster.
- Loaded data from UNIX file system to HDFS.
- Installed and configured Hive and wrote HiveQL scripts.
- Managed data coming from different sources.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Built and configured log data loading into HDFS using Flume.
- Implemented Partitioning, Dynamic Partitions, and Buckets in Hive (see the sketch at the end of this section).
- Performed Importing and exporting data into HDFS and Hive using Sqoop.
- Managed several Hadoop clusters in production, development, Disaster Recovery environments.
- Troubleshot many cluster-related issues such as DataNodes going down, network failures, and missing data blocks.
- Managed cluster coordination services through Zoo Keeper.
- Implemented Kerberos for authenticating all the services in Hadoop Cluster.
- Worked on System/cluster configuration and health check-up.
- Monitored and managed the Hadoop cluster through Ambari.
- Created user accounts and granted users access to the Hadoop cluster.
- Resolved tickets submitted by users, troubleshot the errors, and documented the resolutions.
Environment: Hadoop HDFS, MapReduce, Hive, Pig, Puppet, Zookeeper, HBase, Flume, Ganglia, Sqoop, Linux, CentOS, Ambari.
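A minimal sketch of the Hive partitioning and bucketing pattern mentioned above, run from the shell; the table and column names are hypothetical:

  # Create a partitioned, bucketed table and load it with dynamic partitions
  hive -e "
  CREATE TABLE IF NOT EXISTS web_logs (
    ip STRING, url STRING, response_code INT)
  PARTITIONED BY (log_date STRING)
  CLUSTERED BY (ip) INTO 16 BUCKETS
  STORED AS ORC;

  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;
  SET hive.enforce.bucketing=true;
  INSERT INTO TABLE web_logs PARTITION (log_date)
  SELECT ip, url, response_code, log_date FROM raw_web_logs;
  "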
Confidential
Systems Administrator
Responsibilities:
- Worked on administration of RHEL 4.x and 5.x, including installation, testing, tuning, upgrading, and loading patches, and troubleshot both physical and virtual server issues.
- Created and cloned Linux Virtual Machines, templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
- Installed Red Hat Linux using Kickstart and applied security policies for hardening the servers based on company policies.
- Installed and verified that all AIX/Linux patches or updates are applied to the servers.
- Performed RPM and YUM package installations, patching, and other server management tasks.
- Managed systems routine backup, scheduling jobs like disabling and enabling cron jobs, enabling system logging, network logging of servers for maintenance, performance tuning and testing.
- Worked and performed data-center operations including rack mounting and cabling.
- Installed, configured, and maintained WebLogic 10.x and Oracle 10g on Solaris and Red Hat Linux.
- Set up user and group login IDs, printing parameters, network configuration, and passwords; resolved permission issues and managed user and group quotas.
- Configured multipathing, added SAN LUNs, and created physical volumes, volume groups, and logical volumes (see the sketch below).
Environment: VMware 3.5, Solaris 2.6/2.7/8, Oracle 10g, WebLogic 10.x, Veritas NetBackup, Veritas Volume Manager, Samba, NFS, NIS, LVM, Linux, Shell Programming.
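A brief sketch of the multipath/LVM provisioning mentioned above; the device name, volume group, filesystem, and mount point are placeholders:

  # Create LVM structures on a newly presented SAN LUN (device names are placeholders)
  pvcreate /dev/mapper/mpatha
  vgcreate appvg /dev/mapper/mpatha
  lvcreate -L 50G -n applv appvg
  mkfs.ext3 /dev/appvg/applv
  mount /dev/appvg/applv /app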
Confidential
Linux Administrator
Responsibilities:
- Worked on a daily basis on user access and permissions, and on installation and maintenance of Linux servers.
- Installed CentOS using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers, including remote installation of Linux using PXE boot.
- Monitored System activity, Performance and Resource utilization.
- Maintained Raid-Groups and LUN Assignments as per agreed design documents.
- Performed all System administration tasks like cron jobs, installing packages and patches.
- Used LVM extensively and created Volume Groups and Logical volumes.
- Performed RPM and YUM package installations, patch and other server management.
- Configured Linux guests in a VMware ESX environment.
- Built, implemented and maintained system-level software packages such as OS, Clustering, disk, file management, backup, web applications, DNS, LDAP.
- Performed scheduled backup and necessary restoration.
- Configured Domain Name System (DNS) for hostname to IP resolution.
- Troubleshot and fixed issues at the user, system, and network levels using various tools and utilities. Scheduled backup jobs by implementing a cron schedule during non-business hours (see the sketch below).
Environment: Solaris 8,9,10, Red Hat Linux AS/EL 4.0, AIX 5.2, 5.3, Sun E10k, E25K, E4500, SonicMQ 7.0, VxFS 4.1, VxVM 4.1, SVM, DB2 Connect, Perl, Shell, Veritas, OpenOffice, C, MySQL, Oracle, DB2.
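The off-hours cron backup scheduling mentioned above can be sketched as a single crontab entry; the script path and run time are assumptions:

  # Run the backup script at 2 AM daily and capture its output (path is a placeholder)
  # m h dom mon dow  command
  0 2 * * *  /usr/local/bin/nightly_backup.sh >> /var/log/nightly_backup.log 2>&1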