We provide IT Staff Augmentation Services!

Hadoop Administrator Resume

NewyorK

SUMMARY:

  • Cloudera Certified Administrator for Apache Hadoop.
  • Multi - talented, cross-functional technical professional with progressive experience in Hadoop cluster Administration, Linux System, Security and technical support within large scale IT portfolios, service projects.
  • 10 years extensive IT experience that includes application development, Data warehousing, Systems Administration, monitoring and troubleshooting experience on UNIX, Linux Hadoop and Teradata environments.
  • 3+ years of experience in Hadoop Cluster Administration with Cloudera/Hortonworks distribution.
  • Adding/Removing a Node with Cloudera Manager, Data Rebalancing. Maintaining backups for name node.
  • Skilled in: Hadoop Cluster Installation, Administration and maintenance, Shell/Bash Scripting automation.
  • Cluster maintenance as well as creation and removal of nodes using tools like Cloudera Manager Enterprise, iDRAC (Integrated Dell Remote Controller), Dell Open Manage and other tools.
  • Excellent troubleshooting skills in Hardware, Software, Application and Network.
  • In depth understanding/knowledge of Hadoop Architecture and various components such as hdfs, Resource Manager, Node Manager, Name Node, Data Node and mapreduce concepts.
  • Extensively worked on commissioning and decommissioning of cluster nodes, replacing failed disks, file system integrity checks and maintaining cluster data replication.
  • Installing and Administration of various Hadoop distributions like Cloudera, Hortonworks
  • Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
  • Hadoop integration with Business intelligence tools
  • Namenode and Resource Manager High Availability
  • General system performance monitoring for machines running a Hadoop Cluster with Nagios, (for monitoring parameters like disk space, disk partitions, etc.); managing Kerberos authentication for Hadoop.
  • Management Tool- Ambari Tool, Cloudera Manager
  • Participate in the research, design, and implementation of new technologies for scaling our large and growing data sets, for performance improvement, and for analyst workload reduction.
  • End-to-end performance tuning of Hadoop clusters.
  • Monitor Hadoop cluster job performance and capacity planning. HDFS support and maintenance
  • Grown/Shrunk a Hadoop cluster by adding/removing Nodes and tuning server for optimal performance of the cluster; using Namenode and ResourceManager UI and rebalancing load in a cluster.
  • Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
  • Involved in bench marking Hadoop cluster file systems various batch jobs and workloads.
  • Experience in HDFS data storage and support for running map-reduce jobs.
  • Support the integration needs of Hadoop with ETL & Reporting tools.
  • Extensive knowledge on Data warehousing concepts, reporting, relational data bases.
  • Hadoop Upgrades - CM & CDH Upgrade i.e., CDH 5.4, CDH 5.8, CDH 5.9.

TECHNICAL SKILLS:

Hadoop/BIG Data: HDFS, HBase, Hive, Sqoop, Oozie, Flume, Pig, Python, MapReduce and Spark.

BI tools: Ab Initio.

No SQL Databases: HBase

Database: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata.

Operation System: HP-UNIX, RedHat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Scheduling tools: Autosys, Trivoli, Control-M, CA7.

Languages: SQL, C, C++, Core Java, AWK, Shell Scripting.

PROFESSIONAL EXPERIENCE

Confidential, NewYork

Hadoop Administrator

Responsibilities:

  • Installed, Configured and Maintained Hadoop cluster for application development and Hadoop ecosystem components like Hive, Pig, Hbase, Zookeeper and Sqoop.
  • Extensively worked with Cloudera Distribution Hadoop CDH 5.x
  • Extensively involved in Cluster Capacity planning, Hardware planning, Installation, Performance Tuning of the Hadoop Cluster.
  • Worked on installing cluster, commissioning & decommissioning of DataNodes, NameNode recovery etc.,
  • Installed and configured Hue interface for UI access of Hadoop components like hive,pig, oozie, sqoop, Hbase, file browser etc.,
  • Timely and reliable support for all production and development environment: deploy, upgrade, operate and troubleshoot.
  • Helped in Hive queries tuning for performance gain.
  • Configured Data lake which serves as a base layer to store and do analytics on data flowing from multiple sources into Hadoop Platform
  • Provide support to developers, Install their custom software’s, upgrade Hadoop components, solve their platform issues, and help them troubleshooting their long running jobs.
  • Implement both major and minor version upgrades to the existing cluster and also rolling back to the previous version if needed.
  • Daily status check for Oozie workflow and monitor Cloudera manager and check data node status to ensure nodes are up and running.
  • Expertise in performing Hadoop cluster tasks like commissioning and decommissioning of nodes without any effect to running jobs and data.
  • Used sqoop import and export extensively.
  • Extensively working on Spark using Python and Scala.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure- KDC server setup, crating realm /domain, managing principles, generation key tab file each service and managing keytab using keytab tools.
  • Configured NameNode high availability and Resource Manager high availability
  • Resolved various issues faced by users which are related to platform.
  • Worked directly with vendors, partners and internal clients on gathering and refining technical requirements and designs in order to develop a working solution that addressed needs.
  • Act as point of contact for workflow failure/hitches.
  • Worked round the clock especially during deployments.
  • Monitoring and maintaining Hadoop cluster Hadoop/HBase/zookeeper using these tools Ganglia.

Environment: Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Oozie, Flume, Java (jdk1.6), Cloudera CDH 5.4, CDH 5.8, CDH 5.9, CentOS 6.6, UNIX Shell Scripting.

Confidential, Hartford, CT

Hadoop Administrator

Responsibilities:

  • Responsible for setting up 24/7 Support on Big Data Clusters.
  • Involved in log file management where the Hadoop logs greater than 7 days old were removed from log folder and loaded into HDFS and stored for 2 years for Audit purpose.
  • Responsible for Availability of clusters and on boarding of the projects into Big Data Clusters.
  • Automate common maintenance and BAU activities.
  • Collaborate with cross-functional teams to ensure that applications are properly tested, configured, and deployed.
  • Cluster maintenance including adding and removing cluster nodes; cluster Monitoring and Troubleshooting
  • Extensively worked on cluster configuration files like hadoop-env.sh, core-site.xml, hdf-site.xml, Mapred-site.xml etc.
  • Strong knowledge of open source system monitoring and event handling tools like Nagios and Ganglia.
  • Experience in trouble shooting cluster problems by analyzing logs and setting log levels, fixing miss configuration, and resource exhaustion problems.
  • Worked on troubleshooting performance issues and tuning Hadoop cluster.
  • Managing the cluster and troubleshooting the issues using Cloudera manager.
  • Involved in the Complete Software development life cycle (SDLC) to develop the application.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Involved in loading data from LINUX file system to HDFS.
  • Experience in managing and reviewing Hadoop log files.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Implemented test scripts to support test driven development and continuous integration.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Analyzed large data sets by running Hive queries.
  • Mentored analyst and test team for writing Hive Queries.
  • Installed Oozie workflow engine to run multiple MapReduce jobs.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.

Confidential, Hartford, CT

ETL - Ab Initio Administrator.

Responsibilities:

  • Extensively worked on UNIX shell scripting & Autosys for automation.
  • Administration of Ab Initio project directories, data directories and the standard environment on Red Hat Linux servers.
  • Maintenance of UNIX shell scripting for creating project directories on the UNIX System Services side of Mainframe.
  • Gathering business requirements from the Business Partners and Subject Matter Experts.
  • Experience on UNIX commands and Shell Scripting.
  • Understanding the business data model and customer requirements.
  • Preparing high and low level designs of the system along with build strategy.
  • Building various components i.e. Graphs, Scripts and Oracle Queries.
  • Testing the changed and impacted components.
  • Communicating timely statuses to the Client about the on-goings of the enhancement and code-reviews.
  • Checking END-TO-END functionality of code by considering the performance and o/p relevance with the requirements.
  • Unit testing for all the solutions to ensure minimum defects.
  • Support for troubleshooting and fixing defects.
  • Performing smooth cycle execution from source to target to provide test/production data.
  • Production batch monitoring and fixing job failures.
  • Resolving the incidents raised by users.
  • Execution of IA cycle for every release.
  • Responding to the defects raised by QA team and Supporting QA queries.
  • Providing analysis report to BA queries for next releases.IES.

Hire Now