Hadoop Administrator Resume
San Diego, CA
SUMMARY:
- Overall 8+ years of experience in Linux/System administration and around 4 years of experience in Big Data Hadoop technologies
- Proficient in working with Linux-based high-performance clusters (Jaguar, the ORNL supercomputer)
- Experience in installation, configuration and management of Hadoop Clusters
- Experience with Cloudera CDH3, CDH4 and CDH5 distributions
- Experience in setting up and managing Hadoop clusters using Amazon EC2 instances
- Experience in using Cloudera Manager for tracking cluster utilization and Cloudera Navigator for defining data lifecycle rules
- In-depth knowledge of the functionality of each Hadoop daemon, the interactions between them, their resource utilization, and the dynamic tuning needed to keep a cluster available and efficient
- Experience in providing security for Hadoop clusters with Kerberos (see the authentication sketch after this list)
- Experience in controlling access to staged data based on user groups using the extended ACL features in HDFS (see the ACL sketch after this list)
- Experience in creating job pools, assigning users to pools and restricting production job submissions based on pool
- Experience in setting up monitoring tools such as Nagios and Ganglia to monitor and analyze the functioning of the cluster
- Experience in setting up and managing data gathering tools such as Flume for event ingest and Sqoop for batch ingest
- Good understanding of NoSQL databases such as HBase and Cassandra
- Experience in analyzing data on HDFS through MapReduce, Hive and Pig
- Extensive experience with big data ETL and query tools such as Pig Latin and HiveQL
- Experience in setting up and scheduling workflows using Oozie
- Experience with UNIX commands and shell scripting
- Excellent interpersonal, communication, documentation and presentation skills
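
A minimal sketch of the day-to-day Kerberos workflow on a secured cluster, as referenced above; the keytab path, principal, and realm are hypothetical placeholders.

```bash
#!/bin/bash
# Authenticate to the KDC with a service keytab before running HDFS
# commands; keytab path, principal, and realm are illustrative only.
KEYTAB=/etc/security/keytabs/hdfs.service.keytab
PRINCIPAL=hdfs/admin-node.example.com@EXAMPLE.COM

kinit -kt "$KEYTAB" "$PRINCIPAL"

# Confirm the ticket was granted, then verify HDFS access works.
klist
hdfs dfs -ls /
```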
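
A minimal sketch of the HDFS extended-ACL usage mentioned above; the directory and group name are hypothetical, and ACLs must be enabled on the NameNode (dfs.namenode.acls.enabled=true).

```bash
# Grant an analytics group read access to a staging directory without
# changing its owner or base permissions; path and group are placeholders.
hdfs dfs -setfacl -R -m group:analytics:r-x /data/staging/sales

# Add a default rule so newly created files inherit the same access.
hdfs dfs -setfacl -m default:group:analytics:r-x /data/staging/sales

# Inspect the effective ACLs.
hdfs dfs -getfacl /data/staging/sales
```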
TECHNICAL SKILLS:
Hadoop/Big Data platform: HDFS, MapReduce, HBase, Cassandra, Hive, Pig, Oozie, ZooKeeper, Flume, Sqoop
Hadoop distribution: Cloudera
Admin operations: Access control, Cluster maintenance, Performance tuning, Storage capacity management
Programming Languages: C, C++, MATLAB, Pig Latin
Scripting Languages: Shell Scripting, Perl, Python, XML
Databases: MySQL, Cassandra, HBase
Operating Systems: UNIX, Linux, Windows XP/Vista/7/8, Mac OS X
Writing & Plotting Tools: MS Office, LaTeX, Origin
PROFESSIONAL EXPERIENCE:
Confidential, San Diego, CA
Hadoop Administrator
Responsibilities:
- Handle the installation and configuration of a Hadoop cluster on CDH4 with the help of the Cloudera support team
- Handle data ingestion, data partitioning, access control, staging, data pipeline building, and productionization for ETL tasks on the cluster
- Handle enterprise data movement using Sqoop and Flume
- Closely monitor and analyze MapReduce job execution on the cluster at the task level
- Provide input to development teams on efficient utilization of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks
- Handle upgrades and patch updates
- Set up automated processes to analyze system and Hadoop log files for predefined errors and send alerts to the appropriate groups
- Automate procedures to analyze the remaining storage space on the cluster and send alerts if it needs to be scaled (see the capacity-check sketch after this list)
- Set up checkpoints to gather system statistics for critical setups
- Hold regular discussions with other technical teams regarding upgrades, process changes, special processing, and feedback
- Manage Hive architecture and administration
- Handle ad hoc requests for admin tasks on Hive tables (see the Hive sketch after this list)
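
A minimal sketch of the kind of storage check that could run from cron; the threshold and alert address are hypothetical.

```bash
#!/bin/bash
# Alert when HDFS usage crosses a threshold; values are illustrative.
THRESHOLD=80
ALERT_ADDR=hadoop-ops@example.com

# The cluster summary prints a line such as "DFS Used%: 83.15%";
# take the integer part and stop before the per-datanode sections.
used_pct=$(hdfs dfsadmin -report | awk '/DFS Used%/ {gsub(/%/,"",$3); print int($3); exit}')

if [ "$used_pct" -ge "$THRESHOLD" ]; then
    echo "HDFS usage at ${used_pct}% (threshold ${THRESHOLD}%); cluster may need scaling." \
        | mail -s "HDFS capacity alert" "$ALERT_ADDR"
fi
```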
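
A minimal sketch of typical ad hoc Hive admin requests, run non-interactively with hive -e; the database, table, and partition values are hypothetical.

```bash
# List the partitions of a table in a given database.
hive -e "USE sales_db; SHOW PARTITIONS web_logs;"

# Register partitions that were loaded directly into HDFS.
hive -e "USE sales_db; MSCK REPAIR TABLE web_logs;"

# Drop an obsolete partition once its retention window has passed.
hive -e "USE sales_db; ALTER TABLE web_logs DROP IF EXISTS PARTITION (dt='2014-01-01');"
```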
Environment: Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, Cloudera CDH4
Confidential, Woburn, MA
Hadoop Administrator
Responsibilities:
- Captured data from existing databases that provide SQL interfaces using Sqoop (see the import sketch after this list)
- Implemented the Hadoop stack and various big data analytic tools, and handled migration from different databases to Hadoop
- Processed information from HDFS to derive insights for the decision-making process and presented those insights to users in the form of charts
- Worked with different Big Data technologies, with good knowledge of Hadoop, MapReduce, and Hive
- Developed various POCs on Hadoop and Big Data tools
- Worked on deployment and automation tasks
- Installed and configured Hadoop clusters in pseudo-distributed and fully distributed modes
- Involved in developing the data loading and extraction processes for big data analysis
- Worked on professional services engagements to help customers design and build clusters and applications, and troubleshoot network, disk, and operating system issues
- Administered Linux servers and other UNIX variants, and managed Hadoop clusters
- Installed and configured a local 3-node Hadoop cluster and set up a 4-node cluster on EC2
- Wrote MapReduce code to process and parse data from various sources and store the parsed data in HBase and Hive using HBase-Hive integration
- Worked with HBase and Hive scripts to extract, transform, and load data into HBase and Hive
- Continuously monitored and managed the Hadoop cluster
- Analyzed data by performing Hive queries and running Pig scripts to understand user behavior
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs
- Developed scripts and batch jobs to schedule a bundle (a group of coordinators) consisting of various Hadoop programs using Oozie (see the bundle sketch after this list)
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation
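
A minimal sketch of a Sqoop batch import from a SQL database into HDFS; the JDBC URL, credentials, table, and target directory are hypothetical.

```bash
# Import one table from MySQL into HDFS as text files across four
# parallel mappers; host, database, user, and paths are placeholders.
sqoop import \
    --connect jdbc:mysql://db-host.example.com:3306/salesdb \
    --username etl_user -P \
    --table orders \
    --target-dir /data/raw/orders \
    --num-mappers 4
```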
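
A minimal sketch of submitting and checking an Oozie bundle from the command line; the Oozie server URL, properties file, and job id are hypothetical.

```bash
# Submit and start a bundle (a group of coordinators) described by a
# properties file; the server URL and job id below are placeholders.
oozie job -oozie http://oozie-host.example.com:11000/oozie \
    -config bundle.properties -run

# Check the status of the submitted bundle.
oozie job -oozie http://oozie-host.example.com:11000/oozie \
    -info 0000012-140101000000000-oozie-oozi-B
```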
Environment: Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, Cloudera CDH4, HBase, Oozie, Pig, AWS EC2 cloud
Confidential, Columbus, OH
Hadoop Engineer/Admin
Responsibilities:
- Participated in design and development of scalable and custom Hadoop solutions as per dynamic data needs
- Coordinated with technical team for production deployment of software applications for maintenance
- Provided operational support services relating to Hadoop infrastructure and application installation
- Handled imports and exports of data to and from HDFS using Flume and Sqoop
- Supported technical team members in management and review of Hadoop log files and data backups
- Participated in development and execution of system and disaster recovery processes
- Formulated procedures for installation of Hadoop patches, updates and version upgrades
- Automated processes for troubleshooting, resolution and tuning of Hadoop clusters
- Set up automated processes to send alerts in case of predefined system and application level issues
- Set up automated processes to send notifications in case of any deviations from the predefined resource utilization
- Handled data preprocessing tasks during ingestion from data sources
- Scheduled workflows for performing ETL tasks on the cluster using Oozie
- Handled ad hoc requests to run Pig scripts and HiveQL for performing data transformations
- Handled ad hoc requests to execute Hive queries for extracting and loading data
- Used Sqoop to load data from Hive tables into a MySQL database (see the export sketch after this list)
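
A minimal sketch of a Sqoop export from the Hive warehouse into MySQL; the JDBC URL, tables, and warehouse path are hypothetical, and the field delimiter shown is Hive's default (Ctrl-A).

```bash
# Export a Hive table's files into a MySQL reporting table; the host,
# database, user, table names, and path below are placeholders.
sqoop export \
    --connect jdbc:mysql://db-host.example.com:3306/reports \
    --username etl_user -P \
    --table daily_summary \
    --export-dir /user/hive/warehouse/daily_summary \
    --input-fields-terminated-by '\001'
```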
Environment: Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, Oozie
Confidential, Framingham, MA
Linux/System Admin
Responsibilities:
- Supported, troubleshot, and maintained multi-site and colocated hardware, including servers, storage, and related network devices and peripherals
- Developed scripts, documentation, and operating procedures for initiatives as appropriate
- Worked with IT management to develop and support IT strategy, including upgrades, capacity planning, innovation initiatives, and replenishment
- Planned, tested, and executed a comprehensive infrastructure maintenance program, including patching, reporting, and upgrade/update management (see the cron sketch after this list)
- Identified and remediated known and emerging platform vulnerabilities, and performed threat detection/response and other systems-security tasks in a timely manner
- Researched new software, devices, and other emerging technologies to support the establishment and execution of the company's business roadmap
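
A minimal sketch of how such a maintenance program can be scheduled with cron; the script paths, log locations, and times are hypothetical.

```bash
# Illustrative crontab entries; all paths and schedules are placeholders.

# Apply security patches every Sunday at 02:00 and log the result.
0 2 * * 0 /usr/local/sbin/apply-patches.sh >> /var/log/patching.log 2>&1

# Take a nightly configuration backup at 01:30.
30 1 * * * /usr/local/sbin/backup-configs.sh

# Append an hourly disk-usage snapshot for morning review.
0 * * * * df -P >> /var/log/disk-usage.log
```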
Environment: Linux, UNIX, System maintenance, Cron jobs, Support & Troubleshoot
Confidential
System Admin
Responsibilities:
- Installation, setup, and configuration of RHEL and CentOS products on different hardware in a datacenter environment
- Performed systems administration functions for the Red Hat Linux operating system, including setting up user accounts and their security contexts, and adding packages, updates, upgrades, and patches; administered a Cisco router
- Developed scripts for inbound and outbound data transfers on servers
- Installed and configured new hard drives and memory upgrades
- Created new slices, and mounted and unmounted file systems (see the sketch after this list)
- Performed software audits. Planned, procured, and installed all software, hardware, and networking equipment
- Provided technical assistance and consulting to the users, and developed tools for managing the system
- Designed and developed server disaster recovery strategies
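
A minimal sketch of bringing a new filesystem online on Linux; the device, filesystem type, and mount point are hypothetical.

```bash
#!/bin/bash
# Partitioning is assumed already done; device and mount point are
# illustrative placeholders.
DEVICE=/dev/sdb1
MOUNTPOINT=/data01

# Create an ext4 filesystem on the new partition.
mkfs -t ext4 "$DEVICE"

# Create the mount point and mount the filesystem.
mkdir -p "$MOUNTPOINT"
mount "$DEVICE" "$MOUNTPOINT"

# Persist the mount across reboots, then verify.
echo "$DEVICE  $MOUNTPOINT  ext4  defaults  0 2" >> /etc/fstab
df -h "$MOUNTPOINT"
```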
Environment: Linux, Installation and setup of OS, System maintenance, Software upgrades, Network
Confidential
Linux/Systems Engineer
Responsibilities:
- Executed Linux server hardware and software configuration, installation, patching, integration, upgrades and performance tuning
- Developed utilities to resolve software and hardware compatibility and operability issues
- Maintained storage, hardware, and related network devices and peripherals
- Performed and supported system hardware assembly and software integration at customer location
- Applied strong scripting knowledge (Perl and C) to perform different administration tasks
- Used shell scripting (ksh, bash) to automate system administration jobs (see the sketch after this list)
- Performed automated installations of operating systems using Jumpstart for Solaris and Kickstart for Linux
- Configured various services, devices, and applications on UNIX servers and worked with application teams to customize the environment
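
A minimal sketch of the kind of shell automation involved, here bulk account creation; the input file, its "username:group" format, and the login shell are hypothetical.

```bash
#!/bin/bash
# Create accounts listed one per line as "username:group" in users.txt;
# the file name, format, and shell below are illustrative placeholders.
while IFS=: read -r user group; do
    # Skip accounts that already exist.
    if id "$user" >/dev/null 2>&1; then
        echo "skipping existing user: $user"
        continue
    fi
    groupadd -f "$group"                          # create group if missing
    useradd -m -g "$group" -s /bin/bash "$user"   # home dir + login shell
    echo "created $user in group $group"
done < users.txt
```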
Environment: Linux, System maintenance, Installation and setup of OS, Software upgrades