Hadoop Admin Resume
White Plains, NY
SUMMARY:
- 7+ years of experience in the IT sector, including 5+ years in Hadoop administration.
- Experienced with Cloudera CDH and Hortonworks HDP environments.
- Experience with Linux, Cloudera Manager, Ambari, HDFS, Hive, Pig, Sqoop, Oozie, Cloudera Navigator, ZooKeeper, Kerberos, Apache Sentry, Apache Ranger, Nagios, MySQL, and SQL Server, among others. Working knowledge of Bash, Python, and Apache HBase. Theoretical knowledge of various Amazon AWS services.
- Installation and configuration of Cloudera CDH and Hortonworks HDP environments, including operational, security, monitoring, and management components and tools for Dev/Test and Prod servers.
- Upgrading, monitoring, and maintenance of Hadoop components, services, and daemons.
- Securing Hadoop clusters by implementing Kerberos authentication, LDAP integration, data-in-transit encryption with TLS, and data-at-rest encryption with Cloudera Navigator Encrypt.
- Data migration from relational databases to HDFS and Hive using Sqoop.
- Experience designing, configuring, and managing backup and disaster recovery using data replication, snapshots, and Cloudera BDR utilities.
- Experience in Data Warehousing and ETL processes.
- Performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing job counters and application log files.
- Additional responsibilities include interacting with the offshore team daily, communicating requirements, delegating tasks to offshore/on-site team members, and reviewing their deliverables.
- Implemented role-based authorization for HDFS, Hive, and Impala using Apache Sentry and Apache Ranger.
- Experience analyzing log files and failures, identifying root causes, and taking or recommending corrective action.
- Participating in application on-boarding meetings with application owners and architects, helping them identify and review the technology stack, use cases, and resource requirement estimates.
- Fixing issues by interacting with dependent and support teams and logging cases based on priority.
- Experience documenting standard practices and compliance policies and meeting SLAs and OLAs.
- Providing 24/7 on-demand administration support.
- Optimizing storage, ensuring proper data distribution, and increasing the efficiency of stored data and available resources.
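A representative Sqoop import for the relational-to-Hive migrations described above (the host, credentials file, database, and table names are illustrative, not from an actual engagement):

```shell
# Import one SQL Server table into a Hive warehouse table.
# Host, database, user, and table names below are hypothetical.
sqoop import \
  --connect "jdbc:sqlserver://dbhost.example.com:1433;databaseName=sales" \
  --username etl_user \
  --password-file /user/etl/.sqoop.pwd \
  --table orders \
  --hive-import \
  --hive-table warehouse.orders \
  --num-mappers 4
```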
WORK EXPERIENCE:
Hadoop Admin
Confidential, White Plains, NY
Responsibilities:
- Installation and configuration of a new Hadoop test cluster on Cloudera CDH.
- Monitoring and maintenance of Hadoop production cluster using Nagios, Cloudera Manager and Cloudera Navigator.
- Migration of medication and research-related relational data from SQL Server to HDFS; analyzing, planning, and provisioning for storing and using these data in the Hadoop cluster; assisting developers in organizing and modifying the data according to business requirements before storing it in the Hive warehouse.
- Creating new users, assigning roles and setting permissions on different objects.
- Ensuring client authentication using Kerberos and authorization using Apache Sentry.
- Installation and configuration of Cloudera Manager, Hive, Pig, Impala, HBase, Cloudera Navigator, Oozie, Spark, and ZooKeeper.
- Exporting/importing of relational data from different RDBMSs to Hive warehouse using Sqoop.
- Moving data across clusters using distcp command.
- Integration of BI tools such as Tableau with Cloudera CDH.
- Monitoring and troubleshooting MapReduce jobs.
- Troubleshooting performance and resource-allocation issues of running jobs using Linux commands, HDFS commands, and process-management tools.
- Deploying configuration changes: provisioning server downtime, informing users, cross-checking compatibility with related tools, services, and daemons, and monitoring closely after deployment for unexpected issues, meeting pre- and post-installation requirements.
- Allocation of available resources using resource pool.
- Optimizing HDFS by removing orphan files, merging many small files into a single Hadoop archive (.har) file, altering file formats, and compressing and encrypting files depending on their sensitivity.
Environment: Cloudera CDH 5, HDFS, Cloudera Manager, MapReduce, YARN, Pig, Hive, Impala, Sqoop, Oozie, ZooKeeper, Spark, Nagios, Cloudera Navigator, Kerberos, Apache Sentry, JIRA.
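The cross-cluster copy and small-file consolidation work above can be sketched as follows (NameNode addresses and paths are hypothetical):

```shell
# Copy a dataset between clusters; nn-prod/nn-dr are hypothetical NameNodes.
hadoop distcp hdfs://nn-prod:8020/data/medication hdfs://nn-dr:8020/data/medication

# Merge many small files under a parent directory into one Hadoop
# archive (.har) to reduce NameNode metadata pressure, then list it.
hadoop archive -archiveName meds.har -p /data/medication /data/archives
hdfs dfs -ls har:///data/archives/meds.har
```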
Hadoop Admin
Confidential, Baltimore, MD
Responsibilities:
- Installation, configuration, monitoring and maintenance of Hadoop cluster in CDH5 environment.
- Configuration of Cloudera Manager, Hive, Pig, Sqoop, Impala, Oozie, Spark, HBase, and ZooKeeper.
- Resource management among users and groups using different types of resource schedulers.
- Assisting the system team with upgrading RHEL from 6.x to 7.x: cross-checking version dependencies against CDH, the JDK, and related components; planning and provisioning server downtime; deploying the updates and configuration changes needed for the project; and maintaining continuous communication between the system team and Cloudera Support for successful completion.
- Creating new users and setting roles according to requirements.
- Ensuring client authentication using Kerberos and authorization using Apache Sentry.
- Monitoring and troubleshooting jobs and log files for seamless performance and prevention of future downtime.
- Supporting the offshore development team in running queries and jobs with Hive and Spark to analyze data, surface meaningful insights, and save the results in HDFS for further import into SQL Server.
- Ensuring the security and confidentiality of sensitive data through periodic auditing and by tracking and monitoring client activity using Cloudera Navigator.
- Monitoring health of HDFS, commissioning and decommissioning data nodes.
- Keeping track of the health of the active and standby NameNodes to ensure high availability.
- Installation and Upgrading of daemons and services.
- Transferring data between clusters.
- Monitoring and troubleshooting ETL jobs of Spark and Hive.
- Loading data from different RDBMSs such as Teradata and SQL Server into HDFS or the Hive warehouse using Sqoop.
- Configuring jobs using Oozie.
- Working with other Hadoop administrators to upgrade CDH and related components and daemons, implement important configuration changes, and fix bugs.
- Integration of BI tools such as Tableau with Cloudera CDH for data analysis and report generation.
- Providing 24/7 emergency admin support.
Environment: Cloudera CDH 5, HDFS, MapReduce, YARN, Pig, Hive, Sqoop, Oozie, ZooKeeper, Impala, Cloudera Manager, Cloudera Navigator, Kerberos, Apache Sentry, JIRA, Trello.
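Typical commands behind the HDFS health, HA, and node commissioning bullets above (the NameNode service IDs nn1/nn2 are illustrative):

```shell
# Cluster capacity and live/dead DataNode report.
hdfs dfsadmin -report

# Check which NameNode is active vs. standby (service IDs are hypothetical).
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Block-level health check of the filesystem.
hdfs fsck / -blocks -locations

# Decommission a DataNode: add its host to the file named by
# dfs.hosts.exclude, then tell the NameNode to re-read host lists.
hdfs dfsadmin -refreshNodes
```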
Hadoop Admin
Confidential, Boston, MA
Responsibilities:
- Involved in Hadoop cluster installation, configuration, monitoring and maintenance in Hortonworks HDP environment.
- Commissioning and decommissioning of data nodes and services.
- Monitoring and troubleshooting of data backups, log files, health and performance of the entire cluster and other hardware and network related issues using Linux CLI, Ambari and Nagios.
- Installation and upgrading of Ambari, Hive, Pig, Oozie, and ZooKeeper.
- Configuring property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml depending on job requirements.
- Loading data from HDFS to Hive warehouse.
- Provisioning, configuring, monitoring, and maintaining HDFS, YARN, Sqoop, Oozie, Pig, Hive.
- Loading data from RDBMS data sources such as MySQL and Netezza into HDFS or Hive using Sqoop.
- Creating new users and setting permissions on different objects and tools according to the requirement.
- Documentation of various Hadoop processes and issues for future reference.
- Supporting Developer and BI teams 24/7 on demand.
Environment: Hortonworks HDP, HDFS, MapReduce, YARN, Ambari, Pig, Hive, Sqoop, Oozie, ZooKeeper, Kerberos, Apache Ranger, Nagios, JIRA.
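A minimal sketch of the property-file configuration named above (the hostname and values are illustrative, not site-specific settings):

```xml
<!-- core-site.xml: default filesystem URI (hostname is hypothetical) -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>

<!-- hdfs-site.xml: block replication factor (value is illustrative) -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

<!-- mapred-site.xml: run MapReduce on YARN -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```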
DBA (SQL)
Confidential, Boston, MA
Responsibilities:
- Installation and configuration of SQL Server 2008 R2 and 2012 for both development and production environments.
- Building ETL packages for data warehousing using SSIS.
- Generating reports based on company requirements using SSRS.
- Creating tables, views, functions, stored procedures.
- Data migration from SQL Server 2005/2008 R2 to SQL Server 2012.
- Assisting on replication, backup and disaster recovery.
- Monitoring and troubleshooting performance issues using Performance Monitor.
- Running, tuning, and optimizing queries using SQL Profiler.
- Creating users, roles and implementing permissions on different objects to ensure security.
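The user/role/permission setup above can be sketched with sqlcmd (the server, database, login, role, and password are all hypothetical):

```shell
# Illustrative only: create a login, map it to a database user, and
# grant object permissions through a role. All names are hypothetical.
sqlcmd -S DBSERVER -d ReportsDB -Q "
CREATE LOGIN report_reader WITH PASSWORD = 'S3cure!Pass';
CREATE USER report_reader FOR LOGIN report_reader;
CREATE ROLE readers;
ALTER ROLE readers ADD MEMBER report_reader;
GRANT SELECT ON SCHEMA::dbo TO readers;"
```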