
Hadoop Administrator Resume


Boston, MA

SUMMARY

  • 7+ years of professional IT experience, including 4+ years in Big Data ecosystem technologies.
  • Experience in the full lifecycle development process, including planning, design, development, testing, and implementation of moderately to highly complex systems.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on development and implementation experience on big data management platforms using HDFS, MapReduce, Hive, Pig, Oozie, Apache Kite, and other Hadoop ecosystem components for data storage and retrieval.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Extended Hive and Pig core functionality by writing UDFs.
  • Experience in designing both time-driven and data-driven automated workflows using Oozie.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Pig, and Flume.
  • Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH4 and CDH5 clusters.
  • Provided support to data analysts in running Pig and Hive queries.
  • Wrote shell scripts to dump shared data from MySQL servers to HDFS (see the Sqoop sketch after this list).
  • Good knowledge of Java, J2EE, HTML, JSP, Servlets, CSS, JavaScript, and XML.
  • Familiar with the Java Virtual Machine (JVM) and multi-threaded processing.
  • Hands on experience in Agile and Scrum methodologies.
  • Extensive experience working with customers to gather the information needed to analyze and provide data or code fixes for technical problems, and to deliver technical solution documents to users.
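
A minimal sketch of the kind of shell script referenced above for dumping shared MySQL data into HDFS with Sqoop. The host, database, table, credentials, and target directory are illustrative assumptions, not details from any engagement.

    #!/usr/bin/env bash
    # Sketch: nightly Sqoop import of one MySQL table into a dated HDFS directory.
    # All names below (host, database, table, paths) are hypothetical.
    set -euo pipefail

    DB_HOST="mysql01.example.com"
    DB_NAME="shared_data"
    TABLE="transactions"
    TARGET="/data/raw/${TABLE}/$(date +%Y-%m-%d)"

    sqoop import \
      --connect "jdbc:mysql://${DB_HOST}/${DB_NAME}" \
      --username etl_user \
      --password-file /user/etl/.mysql.password \
      --table "${TABLE}" \
      --target-dir "${TARGET}" \
      --num-mappers 4 \
      --fields-terminated-by '\t'

A script like this is typically run from cron, with Hive external tables pointed at the dated target directories.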

TECHNICAL SKILLS

Big Data Ecosystem: HDFS, HBase, Hadoop MapReduce, ZooKeeper, Hive, Pig, Sqoop, Flume, Oozie, Sun Grid Engine

RDBMS/Database: SQL Server 2000/2005/2008 R2, MS Access XP/2007/2008, Oracle 9i/10g, MySQL

Scripting Languages: UNIX shell scripting, JavaScript, Python, SQL, Pig Latin

Operating Systems: UNIX, Linux, AIX, Windows XP, Windows Server 2000/2003/2008

PROFESSIONAL EXPERIENCE

Confidential, Boston, MA

Hadoop Administrator

Responsibilities:

  • Worked on analyzing the Hadoop cluster and various big data analytic tools, including Pig, HBase, and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented a nine-node CDH5 Hadoop cluster on Red Hat Enterprise Linux 5 and 6.
  • Involved in loading data from LINUX file system to HDFS.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration (a decommissioning sketch follows this list).
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop (see the second sketch after this list).
  • Implemented best income logic using Pig scripts and UDFs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with Infrastructure teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Responsible for managing data coming from different sources.
  • Provided cluster coordination services through ZooKeeper.
  • Managed jobs using the Fair Scheduler.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Involved in upgrading the cluster from CDH4 to CDH5, including upgrading Cloudera Manager.
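
A sketch of the DataNode decommissioning flow mentioned above, assuming dfs.hosts.exclude in hdfs-site.xml points at /etc/hadoop/conf/dfs.exclude; the hostname is hypothetical.

    # Add the node to the exclude file named by dfs.hosts.exclude.
    echo "dn07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Tell the NameNode to re-read its host lists and start draining blocks.
    hdfs dfsadmin -refreshNodes

    # Watch the node until it reports "Decommissioned", then stop its services.
    hdfs dfsadmin -report | grep -A 3 "dn07.example.com"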
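
Likewise, a hedged sketch of the Oracle-to-HBase transfer: the JDBC URL, table, row key, and column family below are invented for illustration.

    # Import an Oracle table straight into an HBase table via Sqoop.
    sqoop import \
      --connect jdbc:oracle:thin:@oradb01.example.com:1521:ORCL \
      --username sqoop_user \
      --password-file /user/etl/.oracle.password \
      --table SYSPRIN_INFO \
      --hbase-table sysprin \
      --column-family d \
      --hbase-row-key SYSPRIN_ID \
      --hbase-create-table \
      --num-mappers 4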

Confidential, Columbus, IN

Hadoop Administrator

Environment: Hadoop, MapReduce, Hive, Oozie, Sqoop, Flume, Cloudera Manager, Shell Script

Responsibilities:

  • Worked on performing major upgrade of cluster from CDH3u6 to CDH4.2.0.
  • Implemented NameNode High Availability on the Hadoop cluster to eliminate the single point of failure.
  • Installed Cloudera Manager on an already existing Hadoop cluster.
  • Involved in efficiently collecting and aggregating large amounts of streaming log data into the Hadoop cluster using Apache Flume (a minimal agent sketch follows this list).
  • Responsible for ongoing maintenance, expansion, and improvement of a cross-regional ESX infrastructure supporting over 400 virtual servers, as well as offshore desktop systems.
  • Worked closely with the storage team to analyze performance data and used it to plan and deploy methods for maximizing performance and reliability.
  • Studied user behavior patterns by analyzing data stored in HDFS using Hive.
  • Rewrote existing SQL queries in HiveQL.
  • Exported analysis results mined from large data volumes to MySQL using Sqoop.
  • Installed and configured Kerberos to secure the Hadoop cluster and provide user authentication (a principal-onboarding sketch also follows this list).
  • Worked with big data analysts, designers, and scientists to troubleshoot MapReduce job failures and issues with Hive, Pig, and Flume.
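
A minimal sketch of the kind of Flume agent used to stream log data into the cluster; the agent name, source log path, and HDFS path are illustrative assumptions.

    # Write a one-source/one-channel/one-sink agent config (names are hypothetical).
    cat > /etc/flume-ng/conf/weblog.conf <<'EOF'
    a1.sources  = r1
    a1.channels = c1
    a1.sinks    = k1

    a1.sources.r1.type     = exec
    a1.sources.r1.command  = tail -F /var/log/app/access.log
    a1.sources.r1.channels = c1

    a1.channels.c1.type     = memory
    a1.channels.c1.capacity = 10000

    a1.sinks.k1.type                   = hdfs
    a1.sinks.k1.channel                = c1
    a1.sinks.k1.hdfs.path              = /data/weblogs/%Y-%m-%d
    a1.sinks.k1.hdfs.fileType          = DataStream
    a1.sinks.k1.hdfs.useLocalTimeStamp = true
    EOF

    # Start the agent in the foreground for testing.
    flume-ng agent --conf /etc/flume-ng/conf \
      --conf-file /etc/flume-ng/conf/weblog.conf \
      --name a1 -Dflume.root.logger=INFO,console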
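
And a sketch of onboarding a user on the Kerberized cluster; the realm, principal, and keytab path are assumptions.

    # Create a principal and export its keytab (run against the KDC).
    kadmin -q "addprinc -randkey analyst1@EXAMPLE.COM"
    kadmin -q "xst -k /home/analyst1/analyst1.keytab analyst1@EXAMPLE.COM"

    # The user obtains a ticket before running Hadoop commands.
    kinit -kt /home/analyst1/analyst1.keytab analyst1@EXAMPLE.COM
    hdfs dfs -ls /user/analyst1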

Confidential

Big Data Analyst

Environment: Linux, Map Reduce, HDFS, Hive, Pig, Shell Scripting

Responsibilities:

  • Developed Pig scripts to report data for analysis purposes.
  • Exported and imported data between HDFS and RDBMS using Sqoop.
  • Wrote UDFs in Python.
  • Analyzed MapReduce jobs for data coordination.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with archived data stored on NFS tapes (see the HiveQL sketch after this list).
  • Enabled speedy reviews and first-mover advantage by using Oozie to automate data loading into HDFS and Pig to pre-process the data (a submission sketch also follows).
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
  • Managed data in HBase to analyze web logs.
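
A hedged sketch of the fresh-versus-archived comparison described above; the table and column names are invented for illustration.

    # Flag products whose fresh daily sales run well above the archived baseline.
    hive -e "
      SELECT f.product_id, f.daily_sales, a.avg_sales
      FROM   fresh_sales f
      JOIN   archived_sales_summary a ON f.product_id = a.product_id
      WHERE  f.daily_sales > 1.5 * a.avg_sales;
    "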
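
And a sketch of submitting the Oozie workflow that drives the Pig pre-processing step; the NameNode, JobTracker, Oozie URL, and application path are all assumptions.

    # Point Oozie at a workflow application that wraps the Pig script.
    cat > job.properties <<'EOF'
    nameNode=hdfs://nn01.example.com:8020
    jobTracker=jt01.example.com:8021
    oozie.wf.application.path=${nameNode}/user/etl/apps/pig-preprocess
    EOF

    oozie job -oozie http://oozie01.example.com:11000/oozie -config job.properties -run
    oozie job -oozie http://oozie01.example.com:11000/oozie -info <job-id>   # check status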

Confidential

Linux Administrator

Environment: Linux (Red Hat Enterprise, CentOS), Windows NT/2000, HP, IBM, Solaris, Oracle 8i, Cisco routers/switches, Dell 6400/1250, Sun E450/E250.

Responsibilities:

  • Installed and configured Red Hat Linux, Solaris, Fedora, and CentOS on new server builds as well as during upgrades.
  • Performed log management, including monitoring and cleaning old log files.
  • Produced system audit reports covering the number of logins, successes and failures, and running cron jobs.
  • Monitored system performance on an hourly or daily basis.
  • Copied files remotely using SFTP, FTP, SCP, WinSCP, and FileZilla.
  • Created user roles and groups for securing the resources using local operating System authentication.
  • Experienced in tasks like managing User Accounts and Groups, managing Disks and File systems.
  • Installed and configured intrusion detection systems (IDS) such as Tripwire, Snort, and LIDS.
  • Configured and monitored the DHCP server.
  • Took backups using tar and restored them during data loss.
  • Experience in writing bash scripts for job automation.
  • Documented the installation of third-party software.
  • Configured printers on Solaris and Linux servers and installed third-party software.
  • Maintained relationships with project managers, DBAs, developers, application support teams, and operational support teams to facilitate effective project deployment.
  • Managed system installation, troubleshooting, maintenance, performance tuning, storage resources, and network configuration to fit application and database requirements.
  • Modified and optimized backup schedules and developed shell scripts for them (a sketch follows this list).
  • Performed regular installation of patches using RPM and YUM.
  • Maintained LVM, VxVM and SVM filesystems along with NFS.
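
A minimal sketch of the kind of tar-based backup script described above; the paths and retention window are assumptions.

    #!/bin/bash
    # Sketch: archive key system paths and prune backups older than two weeks.
    SRC="/etc /home /var/spool/cron"   # hypothetical paths to protect
    DEST="/backup"                     # hypothetical backup mount
    STAMP=$(date +%Y%m%d)

    tar -czf "${DEST}/sys-${STAMP}.tar.gz" ${SRC}
    find "${DEST}" -name 'sys-*.tar.gz' -mtime +14 -delete

    # Restore example: tar -xzf /backup/sys-20150101.tar.gz -C /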
