Hadoop Administrator Resume New York - Hire IT People

SUMMARY:

Cloudera Certified Administrator for Apache Hadoop.
Multi-talented, cross-functional technical professional with progressive experience in Hadoop cluster administration, Linux systems, security, and technical support within large-scale IT portfolios and service projects.
10 years of extensive IT experience spanning application development, data warehousing, systems administration, monitoring, and troubleshooting on UNIX, Linux, Hadoop, and Teradata environments.
3+ years of experience in Hadoop cluster administration with Cloudera and Hortonworks distributions.
Adding and removing nodes with Cloudera Manager, data rebalancing, and maintaining NameNode backups.
Skilled in Hadoop cluster installation, administration, and maintenance, and in Shell/Bash scripting automation.
Cluster maintenance as well as creation and removal of nodes using tools such as Cloudera Manager Enterprise, iDRAC (Integrated Dell Remote Access Controller), and Dell OpenManage.
Excellent troubleshooting skills in Hardware, Software, Application and Network.
In-depth understanding of Hadoop architecture and its components, such as HDFS, ResourceManager, NodeManager, NameNode, and DataNode, and of MapReduce concepts.
Extensively worked on commissioning and decommissioning of cluster nodes, replacing failed disks, file system integrity checks and maintaining cluster data replication.
Installation and administration of Hadoop distributions such as Cloudera and Hortonworks.
Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
Hadoop integration with Business intelligence tools
NameNode and ResourceManager High Availability.
General system performance monitoring for machines running a Hadoop cluster with Nagios (monitoring parameters such as disk space and disk partitions); managing Kerberos authentication for Hadoop.
Management tools: Ambari, Cloudera Manager.
Participate in the research, design, and implementation of new technologies for scaling our large and growing data sets, for performance improvement, and for analyst workload reduction.
End-to-end performance tuning of Hadoop clusters.
Monitoring Hadoop cluster job performance and capacity planning; HDFS support and maintenance.
Grew and shrank Hadoop clusters by adding and removing nodes, tuning servers for optimal cluster performance, and rebalancing load using the NameNode and ResourceManager UIs.
Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
Involved in benchmarking Hadoop cluster file systems across various batch jobs and workloads.
Experience in HDFS data storage and support for running MapReduce jobs.
Support the integration needs of Hadoop with ETL & Reporting tools.
Extensive knowledge of data warehousing concepts, reporting, and relational databases.
Hadoop upgrades: Cloudera Manager (CM) and CDH upgrades (CDH 5.4, CDH 5.8, CDH 5.9).

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, HBase, Hive, Sqoop, Oozie, Flume, Pig, Python, MapReduce and Spark.

BI tools: Ab Initio.

NoSQL Databases: HBase

Database: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata.

Operating Systems: HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Scheduling tools: Autosys, Tivoli, Control-M, CA7.

Languages: SQL, C, C++, Core Java, AWK, Shell Scripting.

PROFESSIONAL EXPERIENCE:

Confidential, New York

Hadoop Administrator

Responsibilities:

Installed, configured, and maintained a Hadoop cluster for application development, along with Hadoop ecosystem components such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
Extensively worked with Cloudera Distribution Hadoop CDH 5.x.
Extensively involved in cluster capacity planning, hardware planning, installation, and performance tuning of the Hadoop cluster.
Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, etc.
Installed and configured the Hue interface for UI access to Hadoop components such as Hive, Pig, Oozie, Sqoop, HBase, the file browser, etc.
Provided timely and reliable support for all production and development environments: deploy, upgrade, operate, and troubleshoot.
Helped tune Hive queries for performance gains.
Configured a data lake that serves as the base layer for storing and analyzing data flowing from multiple sources into the Hadoop platform.
Provided support to developers: installed their custom software, upgraded Hadoop components, resolved their platform issues, and helped troubleshoot their long-running jobs.
Implemented both major and minor version upgrades to the existing cluster, rolling back to the previous version when needed.
Performed daily status checks of Oozie workflows, monitored Cloudera Manager, and checked DataNode status to ensure nodes were up and running.
Performed Hadoop cluster tasks such as commissioning and decommissioning nodes without affecting running jobs or data.
Used Sqoop import and export extensively.
Worked extensively on Spark using Python and Scala.
Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service, and managing keytabs with keytab tools.
Configured NameNode High Availability and ResourceManager High Availability.
Resolved various platform-related issues faced by users.
Worked directly with vendors, partners and internal clients on gathering and refining technical requirements and designs in order to develop a working solution that addressed needs.
Acted as the point of contact for workflow failures and hitches.
Worked round the clock especially during deployments.
Monitored and maintained the Hadoop cluster (Hadoop, HBase, ZooKeeper) using Ganglia.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Oozie, Flume, Java (jdk1.6), Cloudera CDH 5.4, CDH 5.8, CDH 5.9, CentOS 6.6, UNIX Shell Scripting.

Confidential, Hartford, CT

Hadoop Administrator

Responsibilities:

Responsible for setting up 24/7 Support on Big Data Clusters.
Involved in log file management: Hadoop logs older than 7 days were removed from the log folder, loaded into HDFS, and retained for 2 years for audit purposes.
Responsible for cluster availability and for onboarding projects onto the Big Data clusters.
Automated common maintenance and BAU (business-as-usual) activities.
Collaborate with cross-functional teams to ensure that applications are properly tested, configured, and deployed.
Cluster maintenance, including adding and removing cluster nodes; cluster monitoring and troubleshooting.
Extensively worked on cluster configuration files such as hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, etc.
Strong knowledge of open source system monitoring and event handling tools like Nagios and Ganglia.
Experience in troubleshooting cluster problems by analyzing logs, setting log levels, and fixing misconfiguration and resource exhaustion problems.
Worked on troubleshooting performance issues and tuning Hadoop cluster.
Managed the cluster and troubleshot issues using Cloudera Manager.
Involved in the Complete Software development life cycle (SDLC) to develop the application.
Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, HBase, and Sqoop.
Involved in loading data from the Linux file system to HDFS.
Experience in managing and reviewing Hadoop log files.
Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Imported and exported data into and out of HDFS and Hive using Sqoop.
Implemented test scripts to support test driven development and continuous integration.
Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
Analyzed large data sets by running Hive queries.
Mentored the analyst and test teams in writing Hive queries.
Installed Oozie workflow engine to run multiple MapReduce jobs.
Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.

Confidential, Hartford, CT

ETL - Ab Initio Administrator

Responsibilities:

Extensively worked on UNIX shell scripting & Autosys for automation.
Administration of Ab Initio project directories, data directories and the standard environment on Red Hat Linux servers.
Maintenance of UNIX shell scripting for creating project directories on the UNIX System Services side of Mainframe.
Gathering business requirements from the Business Partners and Subject Matter Experts.
Experience with UNIX commands and shell scripting.
Understanding the business data model and customer requirements.
Preparing high- and low-level designs of the system along with the build strategy.
Building various components, i.e., graphs, scripts, and Oracle queries.
Testing the changed and impacted components.
Communicating timely status updates to the client about ongoing enhancements and code reviews.
Checking end-to-end functionality of code, considering performance and the relevance of the output to the requirements.
Unit testing for all the solutions to ensure minimum defects.
Support for troubleshooting and fixing defects.
Performing smooth cycle execution from source to target to provide test/production data.
Production batch monitoring and fixing job failures.
Resolving the incidents raised by users.
Execution of IA cycle for every release.
Responding to the defects raised by QA team and Supporting QA queries.
Providing analysis reports in response to BA queries for upcoming releases.
