Hadoop Administration Resume
Tempe, AZ
SUMMARY:
- Overall 7 years of working experience, including 5+ years as a Hadoop Administrator and around 2 years in related administration roles.
- Hadoop Administrator responsibilities include software installation, configuration, and upgrades; backup and recovery; commissioning and decommissioning data nodes; cluster setup; daily cluster performance monitoring; and keeping clusters healthy on different Hadoop distributions such as Hortonworks and Cloudera.
- Experience in installation, management, and monitoring of Hadoop clusters using Apache Hadoop.
- Optimized the configurations of MapReduce, Pig, and Hive jobs for better performance.
- Advanced understanding of Hadoop architecture components such as HDFS, YARN, and Helix.
- Strong experience configuring Hadoop ecosystem tools with Docker, including Pig, Hive, HBase, Sqoop, Flume, Kafka, Oozie, ZooKeeper, Spark, and Storm.
- Experience in designing, installing, configuring, supporting, and managing Hadoop clusters using Apache and Hortonworks distributions.
- Good understanding of the AWS cloud computing platform and related services.
- Expert-level understanding of the performance characteristics of Puppet.
- Experience in managing the Hadoop infrastructure with Ambari.
- Working experience importing and exporting data between MySQL and HDFS/Hive using the ETL tool Sqoop (see the Sqoop sketch after this list).
- Working experience with the ETL/data-integration tool Talend.
- Strong knowledge of Hadoop cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting.
- Experience in backup configuration and recovery from a NameNode failure.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance (see the sketch after this list).
- Involved in cluster maintenance, bug fixing, troubleshooting, and monitoring, and followed proper backup and recovery strategies.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, Helix, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Managed security in Hadoop clusters using Kerberos, Ranger, Knox, and ACLs.
- Used Helix for cluster management and for distributing resources among the nodes.
- Excellent experience in Shell Scripting.
- Performed day-to-day Linux administration such as user accounts, logon scripts, directory services, file system shares, and permissions.
- Supported the installation of packages and patches on Linux platforms.
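A minimal sketch of the Sqoop transfers described above, assuming a plain MySQL source; the host, database, and table names are placeholders:

    # Import a MySQL table into HDFS (host/db/table names are hypothetical).
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export processed results from HDFS back to MySQL.
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /data/out/order_summary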
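And a sketch of the decommissioning and balancing workflow, assuming dfs.hosts.exclude points at /etc/hadoop/conf/dfs.exclude; the hostname is a placeholder:

    # Decommission a DataNode: list it in the excludes file, then tell the
    # NameNode to re-read its host lists.
    echo "dn07.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Once nodes join or leave, rebalance HDFS blocks until every DataNode
    # is within 10% of average utilization.
    hdfs balancer -threshold 10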
TECHNICAL SKILLS:
Databases & Cloud: MS SQL Server, Google Cloud Platform, Amazon Web Services EC2, MS Excel
Operating Systems: Windows 10/8/7/XP/Vista, macOS, Linux (Red Hat, Ubuntu), CentOS 6.0
Languages: Linux commands, SQL, UNIX Shell Scripting, C, Python, HTML
Hadoop Frameworks: HDFS, Spark, MapReduce, Hive, Alteryx, Pig, ZooKeeper, YARN, Ranger, Druid
Relational Database: MySQL
NoSQL Databases: HBase, Cassandra
Data Ingestion: Flume, Sqoop, Storm, Kafka
Security: Kerberos, AD, LDAP
PROFESSIONAL EXPERIENCE:
Confidential - Tempe, AZ
Hadoop Administrator
Responsibilities:
- Worked on installation, configuration, maintenance, monitoring, performance tuning, and troubleshooting of Hadoop clusters in different environments: development, test, and production.
- Enabled High Availability and automatic failover infrastructure for the NameNode, ResourceManager, HBase, and HiveServer2 to overcome single points of failure (see the sketch after this list).
- Installed various Hadoop ecosystem components and Hadoop daemons.
- Imported and exported Hive tables and HBase snapshots (see the sketch after this list).
- Commissioned and decommissioned Hadoop cluster nodes, including balancing HDFS block data.
- Debugged production jobs when they failed.
- Enabled and configured static service pools.
- Involved in Active Directory/LDAP security integration with Big Data.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms (see the sketch after this list).
- Responsible for creating Hive tables based on business requirements.
- Participated in development and execution of system and disaster recovery processes.
- Created cluster utilization reports and dashboards in Cloudera Manager.
- Worked on block counts, small-files issues, and data compaction.
- Implemented and managed MySQL database backups and recovery.
- Involved in extracting data from various sources into HDFS for processing.
- Periodically reviewed Hadoop-related logs, fixed errors, and prevented future errors by analyzing warnings.
- Good experience acting on cluster audit findings and tuning configuration parameters.
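A minimal sketch of verifying the HA/failover setup described above, assuming the usual nn1/nn2 and rm1/rm2 logical IDs:

    # Check which NameNode in the HA pair is currently active.
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # Check the ResourceManager HA state the same way.
    yarn rmadmin -getServiceState rm1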
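A sketch of the Hive table and HBase snapshot export flow; the database, table, and cluster names are placeholders:

    # Export a Hive table to HDFS; the matching IMPORT runs on the target cluster.
    hive -e "EXPORT TABLE sales.orders TO '/tmp/exports/orders';"
    hive -e "IMPORT TABLE sales.orders FROM '/tmp/exports/orders';"

    # Snapshot an HBase table and copy the snapshot to a backup cluster.
    echo "snapshot 'orders', 'orders_snap'" | hbase shell
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot orders_snap \
      -copy-to hdfs://backup-nn:8020/hbase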
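And a sketch of per-job compression tuning; the example jar and paths are illustrative:

    # Compress intermediate map output and final job output with Snappy.
    hadoop jar hadoop-mapreduce-examples.jar wordcount \
      -D mapreduce.map.output.compress=true \
      -D mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
      -D mapreduce.output.fileoutputformat.compress=true \
      -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
      /data/in /data/out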
Environment: SQL, Hadoop, Cloudera, Hive, Pig, Sqoop, Flume, HBase, Kafka, Impala, Spark, ZooKeeper, Cloud, YARN, HDFS.
Confidential - Cumberland, RI
Hadoop/ZooKeeper Administrator
Responsibilities:
- Installed, configured, and administered Hortonworks Hadoop in Dev, QA, and Prod environments.
- Maintained multiple Hadoop clusters (minimum 50 nodes), Hadoop ecosystem components, third-party software, and databases, including updates/upgrades, performance tuning, and monitoring.
- Supported, troubleshot, and scheduled jobs running in the production cluster.
- Integrated Hive with Druid for streaming data and long-term storage of CVS.com data (see the sketch after this list).
- Replicated and distributed resources hosted on a cluster of nodes using Apache Helix.
- Resolved issues, answered questions, and provided day-to-day support for users and clients on Hadoop and its ecosystem.
- Installed, configured, and operated ZooKeeper, Helix, Sqoop, Hive, HBase, and Kafka for business needs.
- Worked intensively on ZooKeeper administration (see the health-check sketch after this list).
- Worked on installing, deploying, maintaining, and securing nodes and multi-node clusters.
- Served as the team's technical expert in Hadoop administration, guiding the team and helping them solve problems.
- Expertise in Kerberos and LDAP integration for cluster security (see the Kerberos sketch after this list).
- Worked on ETL processes, importing data from various sources and performing transformations.
- Set up and administered the Hadoop cluster environment, including adding and removing cluster nodes, capacity planning, and performance tuning.
- Responsible for administering system and application monitoring tools.
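A sketch of the Hive-to-Druid integration above, assuming Hive 2.2+ with the Druid storage handler on the classpath; the table and column names are placeholders:

    # Publish a Hive table as a Druid datasource; the first projected column
    # must be the __time timestamp that Druid partitions on.
    hive -e 'CREATE TABLE events_druid
      STORED BY "org.apache.hadoop.hive.druid.DruidStorageHandler"
      TBLPROPERTIES ("druid.segment.granularity" = "DAY")
      AS SELECT CAST(event_ts AS timestamp) AS `__time`, page, user_id, clicks
      FROM staging.events;'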
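A quick sketch of routine ZooKeeper health checks, assuming the standard client port 2181; the hostname is a placeholder:

    # Four-letter-word checks against a ZooKeeper server.
    echo ruok | nc zk1.example.com 2181   # a healthy server answers "imok"
    echo stat | nc zk1.example.com 2181   # role (leader/follower), connections, latency

    # Local status via the bundled init script.
    zkServer.sh status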
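And a sketch of the Kerberos side of the security work, assuming an MIT KDC; the realm, admin principal, and paths are placeholders:

    # Create a service principal for a new host and export its keytab.
    kadmin -p admin/admin@EXAMPLE.COM -q "addprinc -randkey hdfs/dn07.example.com@EXAMPLE.COM"
    kadmin -p admin/admin@EXAMPLE.COM -q "ktadd -k /etc/security/keytabs/hdfs.keytab hdfs/dn07.example.com@EXAMPLE.COM"

    # Verify the keytab before handing it to the service.
    kinit -kt /etc/security/keytabs/hdfs.keytab hdfs/dn07.example.com@EXAMPLE.COM
    klist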
Environment: Java, Hadoop, Hortonworks, Hive, Pig, Sqoop, Flume, Druid, HBase, Spring, Kafka, Helix, ZooKeeper, Cloud, Data Lake, YARN, HDFS.
Confidential - Atlanta, GA
Big Data/Hadoop Administrator
Responsibilities:
- Installed, configured, and maintained the Hadoop cluster for application development, along with Hadoop ecosystem components such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- In-depth understanding of Hadoop architecture and components such as HDFS, NameNode, DataNode, ResourceManager, and NodeManager, and of the YARN/MapReduce programming paradigm.
- Monitored the Hadoop cluster through Ambari and implemented alerts based on error messages; provided management with reports on cluster usage metrics and charged customers back based on their usage.
- Very good understanding of tuning the number of mappers and reducers for MapReduce jobs.
- Set up HDFS quotas to enforce fair sharing of storage resources (see the sketch after this list).
- Strong knowledge of configuring and maintaining YARN schedulers, both Fair and Capacity (see the Capacity Scheduler sketch after this list).
- Implemented Puppet modules to automate configuration of a broad range of services (see the sketch after this list).
- Supported parallel data loads into Hadoop.
- Involved in setting up HBase, including master and RegionServer configuration, high-availability configuration, performance tuning, and administration.
- Created user accounts and provided access to the Hadoop cluster.
- Involved in loading data from UNIX file system to HDFS.
- Worked on ETL processes, importing data from various sources and performing transformations.
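A minimal sketch of the HDFS quota setup mentioned above; the directory and limits are placeholders:

    # Cap a project directory at one million names and 10 TB of raw space.
    hdfs dfsadmin -setQuota 1000000 /user/projectx
    hdfs dfsadmin -setSpaceQuota 10t /user/projectx

    # Review current usage against both quotas.
    hdfs dfs -count -q -h /user/projectx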
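A sketch of a two-queue Capacity Scheduler layout; the queue names and percentages are illustrative, and on Ambari-managed clusters the same properties are edited through the Ambari UI:

    # Key capacity-scheduler.xml settings for two queues, shown as name=value
    # pairs to be merged into the file:
    #   yarn.scheduler.capacity.root.queues          = etl,adhoc
    #   yarn.scheduler.capacity.root.etl.capacity    = 70
    #   yarn.scheduler.capacity.root.adhoc.capacity  = 30

    # Reload queue definitions without restarting the ResourceManager.
    yarn rmadmin -refreshQueues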
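And a sketch of the kind of Puppet automation described above; the module content (keeping NTP installed and running on every node) is illustrative:

    # Write a minimal Puppet manifest, then apply it locally.
    printf '%s\n' \
      "package { 'ntp': ensure => installed }" \
      "service { 'ntpd': ensure => running, enable => true, require => Package['ntp'] }" \
      > /tmp/ntp.pp
    puppet apply /tmp/ntp.pp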
Environment: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Kafka, ZooKeeper, HBase, UNIX/Linux, Java, Shell Scripting.
Confidential - Atlanta, GA
Hadoop Administrator
Responsibilities:
- Installed and configured various components of the Hadoop ecosystem and maintained their integrity.
- Planned production cluster hardware and software installation, communicating with multiple teams to get it done.
- Designed, configured, and managed backup and disaster recovery for HDFS data.
- Experience with UNIX/Linux, including shell scripting.
- Installed, upgraded, and managed Hadoop clusters.
- Commissioned DataNodes as data volumes grew and decommissioned them when hardware degraded.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Worked with application teams to install Hadoop updates, patches, and version upgrades as required.
- Installed and Configured Hive, Pig, Sqoop and Oozie on the HDP cluster.
- Involved in implementing High Availability and automatic failover infrastructure for the NameNode, using ZooKeeper services, to overcome a single point of failure.
- Expanded and reconfigured the cluster with Helix.
- Involved in the end-to-end process of Hadoop cluster setup: installation, configuration, and monitoring.
- Ran monthly security checks across the UNIX/Linux environment and installed the security patches required to maintain a high security level for our clients (see the sketch after this list).
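A sketch of the monthly patch sweep on RHEL/CentOS nodes, assuming the yum security plugin is available:

    # List pending security-only errata, then apply them.
    yum updateinfo list security
    yum update --security -y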
Environment: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Helix; RDBMS/DBs used: flat files, MySQL, HBase.