
Hadoop Administrator Resume

Oaks, PA

SUMMARY:

  • Over 7 years of experience in large-scale Information Technology projects.
  • Around 3 years of experience implementing Big Data analytics based on the Hadoop ecosystem.
  • Maintained Hadoop clusters for Development/Quality/Production environments.
  • Experience with Hortonworks & Cloudera Manager Administration.
  • Excellent knowledge of Hadoop architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, YARN, Resource Manager, Node Manager & MapReduce.
  • Hands on experience on Installation and configuration of Apache Hadoop cluster on Linux platform.
  • Adept in installing, configuring and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Pig, Hive, Sqoop, Zookeeper, Flume and Kafka.
  • Good experience in implementing Kerberos & Ranger in Hadoop Ecosystem.
  • Provided Level 1, 2 & 3 technical support.
  • Experience importing and exporting data between HDFS and relational database systems using Sqoop (a minimal example follows this list).
  • Strong knowledge of Hadoop platforms and other distributed data processing platforms.
  • Monitored Hadoop clusters using tools such as Nagios, Ganglia, Ambari & Cloudera Manager.
  • Investigated new technologies such as Spark to keep pace with industry developments.
  • Good Knowledge of Informatica and SAS.
  • Experience creating reports using Tableau.
  • Strong knowledge in writing Shell Scripts.
  • Demonstrated leadership capability and ability to effectively communicate with all levels of management.
  • Worked with business users to extract clear requirements to create business value.
  • Worked with big data teams to move ETL tasks to Hadoop.
  • Exceptionally well organized, self-motivated, creative, and dedicated, with a proven ability to learn new technologies within a short span of time.
  • Achieved SLA/SLO standards.
  • Strong experience in 24x7 environments.
  • Good experience in both Waterfall & Agile methodologies.
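
A minimal sketch of the Sqoop import/export pattern referenced above; the JDBC URL, credentials, table names, and HDFS paths are assumptions rather than details from any actual engagement:

    # Import a relational table into HDFS (hypothetical connection details)
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export processed results back to the relational database
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders_summary \
      --export-dir /data/processed/orders_summary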

TECHNICAL SKILLS:

Hadoop: HDFS, MapReduce, Yarn, Hive, HBase, Impala, Hue, HCatalog, Pig, Sqoop, Flume, Oozie, Zookeeper, Ranger, Kerberos, Knox, Cassandra

Operating Systems: UNIX, Linux, Windows 95/98/2000/XP/Vista/7/8, MS-DOS

Database Languages: SQL, PL/SQL

NoSQL Databases: HBase, Cassandra

Hadoop Cloud: AWS, Rackspace, S3

Hadoop Monitoring & Management: Ganglia, Ambari, Nagios, Cloudera Manager

Hadoop Distributions: Hortonworks HDP, Cloudera CDH, AWS

Other Tools: MS Office (Excel, Word, PowerPoint), Adobe Photoshop, Macromedia Flash

PROFESSIONAL EXPERIENCE:

Confidential, Oaks, PA

Hadoop Administrator

Responsibilities:

  • Helped the team increase the cluster size from 35 nodes to 80+ nodes; the configuration for the additional data nodes was managed using Puppet.
  • Performed both major and minor upgrades to the existing Hortonworks Hadoop cluster.
  • Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
  • Applied patches and bug fixes on Hadoop Clusters.
  • Worked on setting up High Availability for major production cluster and designed automatic failover.
  • Performance-tuned the Hadoop cluster to achieve higher throughput.
  • Set up a new cluster from an existing cluster using Ambari blueprints.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Configured Hive metastore with MySQL, which stores the metadata for Hive tables.
  • Set up YARN queues so that application teams' SLAs were met, ensuring robust capacity scheduling.
  • Worked on Apache Ranger for HDFS, HBase, Hive access and permissions to the users through active directory.
  • Worked on different compression codecs and ORC file format.
  • Worked on integration of Hiveserver2 with Tableau.
  • Benchmarking Hadoop cluster using DFSIO, Teragen and Terasort.
  • Installed and configured disaster recovery using Apache Falcon.
  • Enabled Kerberos for Hadoop cluster authentication and integrated with Active Directory for managing users and application groups.
  • Moved data from HDFS to RDBMS and vice-versa using SQOOP.
  • Analyzed and transformed data with Hive and Pig.
  • Mentored and supported team members in managing the Hadoop cluster.
  • Reviewed and updated all configuration and documentation of Hadoop clusters as part of continuous improvement processes.
  • Performed data completeness, correctness, data transformation, and data quality testing using available tools and techniques.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (a minimal sketch follows this list).
  • Developed MapReduce jobs in Java for data cleaning and preprocessing.
  • Provided rotational on-call support and educated users on new implementations as part of maintenance.
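
A minimal sketch of the kind of daemon health check mentioned above; the daemon list, alert address, and mail transport are assumptions rather than details from the actual environment:

    #!/bin/bash
    # Check that core Hadoop daemons are running on this host and that HDFS
    # is not stuck in safe mode; email an alert if anything looks wrong.
    ALERT="hadoop-admins@example.com"   # hypothetical alert address
    for daemon in NameNode ResourceManager; do
        if ! jps | grep -qw "$daemon"; then
            echo "$(date): $daemon is not running on $(hostname)" | mail -s "Hadoop alert" "$ALERT"
        fi
    done
    if hdfs dfsadmin -safemode get | grep -q "Safe mode is ON"; then
        echo "$(date): HDFS is in safe mode" | mail -s "Hadoop alert" "$ALERT"
    fi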

Environment: Hortonworks Data Platform, Linux (RHEL, CentOS), HBase, Hive, Pig, Kerberos, Ranger, Knox, Sqoop, Oozie, Zookeeper, Capacity Scheduler, Fair Scheduler, Shell Scripting, MySQL, Ganglia, Nagios, Git, Jenkins.

Confidential, Sacramento, CA

Hadoop Technical Consultant

Responsibilities:

  • Installation and Administration of a Hadoop cluster using Cloudera Distribution.
  • Debugging and troubleshooting the issues in development and Test environments.
  • Commissioned data nodes as per the cluster capacity and decommissioned when the hardware degraded or failed.
  • Designed, configured and managed the backup and disaster recovery for HDFS data.
  • Designed workflows by scheduling Hive processes for file data streamed into HDFS using Flume.
  • Designed and developed ETL workflows using Oozie for business requirements, including automating the extraction of data from MySQL databases into HDFS using Sqoop scripts.
  • Performed systems monitoring, upgrades, performance tuning.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Performed migration of data across clusters using DistCp (see the sketch after this list).
  • Implemented schedulers on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Used Cloudera Navigator for Hadoop audit files and data lineage.
  • Monitored Hadoop clusters using the Cloudera Manager UI and Ganglia, and provided 24x7 on-call support.
  • Involved in implementing High Availability and automatic failover infrastructure to overcome the single point of failure for the NameNode, using ZooKeeper services.
  • Worked on Impala performance tuning with different workloads and file formats.
  • Analyzed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
  • Provided user support and application support on the Hadoop infrastructure.
  • Worked with big data developers, designers, and scientists to troubleshoot MapReduce job failures and issues with Hive, Pig, and Flume.
  • Performed Linux server administration on server hardware and operating systems.
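
A minimal sketch of a cross-cluster DistCp migration of the kind referenced above; the NameNode hostnames and paths are assumptions:

    # Copy a directory tree between clusters; -update copies only files that
    # changed, -p preserves permissions, ownership, and related attributes.
    hadoop distcp -update -p \
      hdfs://source-nn:8020/data/warehouse \
      hdfs://target-nn:8020/data/warehouse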

Environment: CDH, Cloudera Manager, RHEL, Hive, Pig, Oozie, Sqoop, HBase, Cassandra, AWS, S3, Ganglia.

Confidential, Dallas, TX

Hadoop Consultant

Responsibilities:

  • Installed, configured and deployed a 30 node Cloudera Hadoop cluster for development and production.
  • Performance-tuned the Hadoop cluster to achieve higher throughput.
  • Loaded data from different data sources (Teradata and DB2) into HDFS using Sqoop and loaded it into Hive tables.
  • Developed Oozie workflows for daily incremental loads that pull data from Teradata and import it into Hive tables.
  • Developed Pig scripts to transform the data into a structured format, automated through Oozie coordinators.
  • Developed Hive queries for Analysis across different banners.
  • Worked on pulling the data from relational databases, Hive into the Hadoop cluster using the Sqoop import for visualization and analysis.
  • Used Flume to collect, aggregate, and store web log data from sources such as web servers and mobile and network devices, and pushed it to HDFS.
  • Used Ganglia and Nagios for monitoring the cluster around the clock.
  • Used Nagios plugins to monitor NameNode health status, the number of TaskTrackers running, and the number of DataNodes running.
  • Designed and implemented a distributed network monitoring solution based on Nagios and Ganglia using Puppet.
  • Automated backups and recurring jobs needed for other applications with cron (a minimal crontab sketch follows this list).
  • Provided 24x7 on-call support on a rotational basis.
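
A minimal crontab sketch of the backup automation mentioned above; the script paths, log locations, and schedule are assumptions:

    # Edit with `crontab -e`; one line per recurring job.
    # Nightly configuration backup at 01:00, application data backup every 6 hours.
    0 1 * * *   /opt/scripts/backup_configs.sh  >> /var/log/backup_configs.log 2>&1
    0 */6 * * * /opt/scripts/backup_app_data.sh >> /var/log/backup_app_data.log 2>&1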

Confidential, New York

LINUX System Administrator

Responsibilities:

  • Installation and configuration of Red Hat Linux, Solaris, Fedora, and CentOS on new server builds as well as during upgrades.
  • Performed log management, such as monitoring and cleaning old log files.
  • Generated system audit reports covering the number of logins, successes and failures, and running cron jobs.
  • Monitored system performance on an hourly and daily basis.
  • Remotely copied files using SFTP, FTP, SCP, WinSCP, and FileZilla.
  • Created user roles and groups for securing the resources using local OS authentication.
  • Configured and monitored the DHCP server.
  • Took backups using tar and recovered data after data loss (a minimal sketch follows this list).
  • Wrote bash scripts for job automation.
  • Maintained relationships with project managers, DBAs, developers, application support teams, and operational support teams to facilitate effective project deployment.
  • Managed system installation, troubleshooting, maintenance, performance tuning, storage resources, and network configuration to fit application and database requirements.
  • Responsible for modifying and optimizing backup schedules and developing shell scripts for it.
  • Performed regular installation of patches using RPM and YUM.
  • Managed file systems using LVM and SVM, configured networked file systems using NFS, NAS, and SAN methodologies, and installed RAID devices.
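
A minimal sketch of the tar-based backup and recovery mentioned above; the archive location and directories are assumptions:

    # Create a dated, compressed archive of key directories, preserving permissions.
    ARCHIVE=/backup/system-$(date +%F).tar.gz
    tar -czpf "$ARCHIVE" /etc /home

    # Restore the archive into a recovery directory after data loss.
    tar -xzpf "$ARCHIVE" -C /restore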

Confidential, Boston, MA

Linux Administrator

Responsibilities:

  • Installed and maintained Linux servers.
  • Good knowledge of Apache Web Server, mail servers, database servers, DNS servers, backup management, server security management, remote live migration between servers, virtualization (OpenVZ), and RAID devices.
  • MySQL database administration - tuned MySQL server parameters, repaired crashed tables, handled error logging and database maintenance, and ensured remote MySQL availability.
  • Monitoring System Metrics and logs for any problems.
  • Ran crontab and AutoSys jobs to back up MySQL data.
  • Installing and updating packages using YUM.
  • Performance monitoring - DDoS monitoring, load balancing, remote desktops, VNC, etc.
  • Application protocols - excellent understanding of SMTP, HTTP, FTP, PPTP, and OpenVPN. Systems/hardware - IPMI, KVMs.
  • Security - iptables, CSF, SSH; prevented DDoS attacks using iptables and other security tools (a minimal sketch follows this list).
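
A minimal sketch of iptables rate limiting of the kind used to blunt simple floods; the port, thresholds, and policy are assumptions, not production values:

    # Accept at most a few new SSH connections per minute, drop the rest.
    iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW \
             -m limit --limit 3/min --limit-burst 5 -j ACCEPT
    iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -j DROP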
