Sr. Hadoop Administrator Resume

Franklin Lakes, NJ


  • 8+ years of experience in IT, including 5+ years of hands-on experience as a Hadoop Administrator.
  • Hands-on experience with the Hadoop stack (HDFS, MapReduce, YARN, Sqoop, Flume, Hive-Beeline, Impala, Tez, Pig, Zookeeper, Oozie, Solr, Sentry, Kerberos, HBase, Centrify DC, Falcon, Hue, Kafka, and Storm).
  • Experience with Cloudera Hadoop clusters running CDH 5.6.0 with CM 5.7.0.
  • Experience with Hortonworks Hadoop clusters running HDP 2.4 with Ambari 2.2.
  • Hands-on day-to-day operation of the environment, with knowledge of and deployment experience in the Hadoop ecosystem.
  • Configured property files such as core-site.xml, hdfs-site.xml, mapred-site.xml and hadoop-env.xml based on job requirements.
  • Installed, configured and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, Zookeeper and Sqoop.
  • Experience in installing, configuring and optimizing Cloudera Hadoop versions CDH3, CDH 4.x and CDH 5.x in a multi-cluster environment.
  • Commissioned and decommissioned cluster nodes and performed data migration. Also involved in setting up a DR cluster with BDR replication and implemented wire encryption for data at rest.
  • Implemented TLS 3 security on all CDH services along with Cloudera Manager.
  • Implemented Dataguise analytics on a secured cluster.
  • Successfully implemented Blue-Talend integration and Greenplum migration.
  • Ability to plan and manage HDFS storage capacity and disk utilization.
  • Assist developers with troubleshooting MapReduce and BI jobs as required.
  • Provide granular ACLs for local file datasets as well as HDFS URIs. Role-level ACL maintenance.
  • Cluster monitoring and troubleshooting using tools such as Cloudera Manager, Ganglia, Nagios, and Ambari Metrics.
  • Experienced in administrative tasks such as Hadoop installation in pseudo-distributed mode and on multi-node clusters, and installation of Apache Ambari in Hortonworks Data Platform (HDP 2.5).
  • Hands-on experience configuring a Hadoop cluster in a professional environment and on Amazon Web Services (AWS) using EC2 instances.
  • Manage and review HDFS data backups and restores on the production cluster.
  • Implement new Hadoop infrastructure, OS integration and application installation. Install OS (RHEL 6, RHEL 5, CentOS, and Ubuntu) and Hadoop updates, patches and version upgrades as required.
  • Implement and maintain LDAP and Kerberos security as designed for the cluster.
  • Expert in setting up Hortonworks (HDP 2.4) clusters with and without Ambari 2.2.
  • Experienced in setting up Cloudera (CDH 5.6) clusters using packages as well as parcels with Cloudera Manager 5.7.0.
  • Expertise in Red Hat Linux tasks, including upgrading RPMs using YUM, kernel upgrades, and configuring SAN disks, multipath and LVM file systems.
  • Good exposure to SAS Business Intelligence tools such as SAS OLAP Cube Studio, SAS Information Map Studio, SAS Stored Process and SAS Web Applications.
  • Creating and maintaining user accounts, profiles, security, rights, disk space and process monitoring. Handling and generating tickets via the BMC Remedy ticketing tool.
  • Configure UDP, TLS, SSL, HTTPD, HTTPS, FTP, SFTP, SMTP, SSH, Kickstart, Chef, Puppet and PDSH.
  • Strong overall experience in system administration, installation, upgrades, patching, migration, configuration, troubleshooting, security, backup, disaster recovery, performance monitoring and fine-tuning on Linux (RHEL) systems.


Sr. Hadoop Administrator

Confidential - Franklin Lakes, NJ


  • Created Hive tables and worked on them using HiveQL.
  • Installed, configured and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, Zookeeper and Sqoop.
  • Managed a 350+ node CDH 5.2 cluster with 4 petabytes of data using Cloudera Manager on Red Hat Linux 6.5.
  • Analyzed data by running Hive queries and Pig scripts to understand customer behavior.
  • Worked with COBOL data when migrating or offloading workloads from mainframe to Hadoop.
  • Strong experience working with Apache Hadoop, including creating and debugging production-level jobs.
  • Successfully upgraded the Hortonworks Hadoop distribution stack from 2.3.4 to 2.5.
  • Analyzed complex distributed production deployments and made recommendations to optimize performance.
  • Set up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Nagios and Ganglia.
  • Created and managed Splunk DB Connect identities, database connections, database inputs, outputs, lookups and access controls.
  • Installed applications on AWS EC2 instances and configured storage on S3 buckets.
  • Successfully drove HDP POCs with various lines of business.
  • Migrated the Cloudera distribution from MR1 to MR2.
  • Configured memory settings for YARN and MRv2.
  • Designed and developed an automated data archival system using Hadoop HDFS. The system has a configurable limit on archived data for efficient use of disk space in HDFS.
  • Configured Apache Hive tables for analytic jobs and created HiveQL scripts for offline jobs.
  • Designed Hive tables with partitioning and bucketing based on different use cases.
  • Developed UDFs to extend Apache Pig and Hive with client-specific data filtering logic.
  • Experience with Cloudera Navigator and Unravel Data for auditing Hadoop access.
  • Designed and implemented a stream filtering system on top of Apache Kafka to reduce stream size.
  • Wrote a Kafka REST API to collect events from the front end.
  • Implemented Apache Ranger configurations in the Hortonworks distribution.
  • Responsible for developing a data pipeline using HDInsight, Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
  • Set up, configured and managed security for the Cloudera Hadoop cluster.
  • Involved in migrating ETL processes from Oracle to Hive to test ease of data manipulation.
  • Managed log files, backups and capacity.
  • Implemented a Spark solution to enable real-time reports from Cassandra data.
  • Found and troubleshot Hadoop errors.
  • Created Ambari Views for Tez, Hive and HDFS.
  • Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
  • Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up Kerberos principals and testing HDFS and Hive access.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and the cluster status.
  • Good understanding of deploying Hadoop clusters using automated Puppet scripts.
  • Completed end-to-end design and development of an Apache NiFi flow, which acts as the agent between the middleware team and the EBI team and executes all the actions mentioned above.
  • Worked with operational analytics and log management using ELK and Splunk.
  • Created 25+ Linux Bash scripts for users, groups, data distribution, capacity planning, and system monitoring.
  • Upgraded the Hadoop cluster from CDH 4.7 to CDH 5.2.
  • Supported MapReduce programs and distributed applications running on the Hadoop cluster.
  • Continuously monitored and managed the EMR cluster through the AWS Console.

Environment: Hive, MR1, MR2, YARN, Pig, HBase, Apache NiFi, PL/SQL, Mahout, Java, Unix Shell scripting, Sqoop, ETL, Business Intelligence (DWBI), Ambari 2.0, Splunk, CentOS Linux, MongoDB, Cassandra, Ganglia and Cloudera Manager.

Hadoop Administrator

Confidential, IL


  • Installed and configured Hortonworks HDP 2.x clusters in Dev and Production environments.
  • Worked on capacity planning for the Production cluster.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Successfully upgraded the Hortonworks Hadoop distribution stack from 2.7.1 to 2.7.2.
  • Worked on configuring Oozie jobs.
  • Created a complete processing engine based on the Hortonworks distribution, tuned for performance.
  • Performed a Hadoop cluster upgrade from HDP 2.2 to HDP 2.4.
  • Worked with NiFi to manage the flow of data from source to HDFS.
  • Configured queues in the Capacity Scheduler and took snapshot backups of HBase tables.
  • Worked on fixing cluster issues and configuring High Availability for the NameNode in HDP 2.4.
  • Involved in cluster monitoring, backup, restore and troubleshooting activities.
  • Executed standard SQL queries using the Spark API in the same way as in the BigQuery web UI.
  • Built Production and QA clusters with the latest Hortonworks distribution, HDP stack 2.6.1 managed by Ambari 2.5.1, on AWS Cloud.
  • Used MirrorMaker to copy data from a Kafka cluster to the Kafka cluster on Azure.
  • Familiarity with NoSQL databases such as MongoDB.
  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Used SQL to extract data from Google BigQuery for data analysis and weekly reports.
  • Documented a tool to perform "chunk uploads" of big data into Google BigQuery.
  • Worked on MapR 5.2, maintaining the operations, installation and configuration of 150+ node clusters.
  • Performed dynamic updates of Hadoop YARN and MapReduce memory settings.
  • Worked on a POC to source data into Kudu for row-level updates using Impala and Spark.
  • Worked on creating a comprehensive MongoDB API and DocumentDB API using Storm into Azure Cosmos DB.
  • Installed a Kerberos-secured Kafka cluster with no encryption on Dev and Prod. Also set up Kafka ACLs on it.
  • Created custom spouts and bolts in a Storm application to write into Cosmos DB according to the business rules.
  • Performed tuning and troubleshooting of MR jobs by analyzing and reviewing Hadoop log files.
  • Worked on the Storm-MongoDB design to map Storm tuple values to either an update operation or an insert.
  • Involved in data integration between Talend and other enterprise systems.
  • Created AWS instances and built a working multi-node cluster in the cloud environment.
  • Experienced with Hadoop ecosystem components such as Hive, HBase, Sqoop, Kafka and Oozie.
  • Created end-to-end Spark applications using Scala to perform data cleansing, validation, transformation and summarization according to the requirements.
  • Hands-on experience in installation, configuration, management and development of big data solutions using Hortonworks distributions.
  • Experienced in adding/installing new components and removing them through Ambari.
  • Monitored systems and services through the Ambari dashboard to keep the clusters available for the business.
  • Experience with installing and configuring distributed messaging systems such as Kafka.
  • Imported and exported data between databases such as MySQL and other RDBMSs and HDFS and HBase using Sqoop.
  • Created an HDInsight cluster in Azure (Microsoft-specific tool) as part of the deployment, and performed component unit testing using the Azure Emulator.
  • Worked on configuring Kerberos authentication in the cluster.
  • Maintained the operations, installation and configuration of 150+ node clusters with the MapR distribution.
  • Experienced in provisioning and managing a multi-datacenter Cassandra cluster on the Amazon Web Services (AWS) EC2 public cloud.
  • Designed a data pipeline to migrate Hive tables into BigQuery using shell scripts. Handled casting issues in BigQuery itself by selecting from the newly written table and applying casts manually.
  • Installed Apache NiFi to make data ingestion from the Internet of Anything fast, easy and secure with Hortonworks DataFlow.
  • Using HDInsight Storm, created a topology that ingests data from HDInsight Kafka and writes it to MongoDB.
  • Created a POC to store server log data in Cassandra to identify system alert metrics and implemented the Cassandra connector for Spark in Java.
  • Worked with cloud services such as Amazon Web Services (AWS) and was involved in ETL, data integration and migration, and Kafka installation.
  • Installed Ranger in all environments as a second level of security for the Kafka brokers.
  • Gathered business requirements to configure and maintain ITSM configuration data.
  • Worked on installation of Hortonworks 2.1 on AWS Linux servers.
  • Worked on indexing HBase tables using Solr, including JSON and nested data.
  • Hands-on experience installing and configuring Spark and Impala.
  • Designed and developed a system to collect data from multiple portals using Kafka.
  • Wrote features to filter raw data with a JSON processor from BigQuery, AWS SQS, and the Publishing API.
  • Successfully installed and configured queues in the Capacity Scheduler and the Oozie scheduler.
  • Worked on configuring queues and optimizing Hive query performance, while performing cluster-level tuning and adding users to the clusters.
  • Managed AWS EC2 instances using Auto Scaling, Elastic Load Balancing and Glacier for QA and UAT environments as well as infrastructure servers for Git and Chef/Ansible.
  • Automated configuration management for several servers using Chef and Puppet.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, and managing and reviewing data backups and log files.
  • Day-to-day responsibilities included solving developer issues, deploying code from one environment to another, providing access to new users, providing quick solutions to reduce impact, and documenting issues to prevent recurrence.
  • Added/installed new components and removed them through Ambari.
  • Populated HDFS and Cassandra with large amounts of data using Apache Kafka.
  • Worked on MapR components such as MapR Streams, MapR-DB and Drill with the development team.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.

Environment: Hadoop, Map Reduce, MapR, Yarn, Hive, HDFS, PIG, Sqoop, Solr, Oozie, Impala, Spark, Hortonworks2.8, Flume, Big Query, HBase, Agile/scrum, Puppet, Chef, Zookeeper and Unix/Linux, Hue (Beeswax), AWS.

Hadoop/Cloudera Administrator

Confidential, Atlanta, GA


  • Installed and configured CDH 5.8, 5.9 and 5.10 Hadoop clusters on AWS using Cloudera Director.
  • Involved in the end-to-end process of Hadoop cluster setup, including installation, configuration and monitoring of the Hadoop cluster in Cloudera.
  • Researched and codified the Kafka consumer using the Kafka Consumer API 0.10 and the Kafka Producer API.
  • Managed, monitored and troubleshot the Hadoop cluster using Cloudera Manager.
  • Installed and configured RHEL6 EC2 instances for Production, QA and Development environments.
  • Installed MIT Kerberos for authentication of application and Hadoop service users.
  • Installed, configured and administered the Jenkins CI tool on AWS EC2 instances.
  • Configured Nagios to monitor EC2 Linux instances with Ansible automation.
  • Used cron jobs to back up Hadoop service databases to S3 buckets.
  • Used Kafka for building real-time data pipelines between clusters.
  • Supported the technical team in management and review of Hadoop logs.
  • Designed a data pipeline to migrate Hive tables into BigQuery using shell scripts.
  • Assisted in creating ETL processes for transforming data from Oracle and SAP into the Hadoop landing zone.
  • Deployed Kibana with Ansible and connected it to the Elasticsearch cluster. Tested Kibana and ELK by creating a test index and injecting sample data into it.
  • Implemented Kerberos security solutions for securing Hadoop clusters.
  • Installed Kafka Manager to track consumer lag and monitor Kafka metrics; also used it for adding topics, partitions, etc.
  • Created queues in the YARN queue manager to share cluster resources among the MapReduce jobs submitted by users.
  • Responsible for Kafka development as per the software requirement specifications.
  • Involved in monitoring and filtering data for high-speed data handling using Kafka.
  • Worked with Spark Streaming to consume live data from Kafka and store the stream data in HDFS.
  • Responsible for developing a data pipeline using HDInsight, Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
  • Utilized the AWS framework for content storage and Elasticsearch for document search.
  • Used NiFi to pull data from different sources and push it to HBase and Hive.
  • Wrote Lambda functions in Python for AWS Lambda that invoke Python scripts to perform various transformations and analytics on large data sets in EMR clusters.
  • Installed applications on AWS EC2 instances and configured storage on S3 buckets.
  • Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Installed, upgraded and managed Hadoop clusters on the Cloudera distribution.
  • Worked with developer teams on a NiFi workflow to pick up data from a REST API server, the data lake and an SFTP server and send it to the Kafka broker.
  • Troubleshot and rectified platform and network issues using Splunk and Wireshark.
  • Installed a Kerberos-secured Kafka cluster with no encryption in all environments.
  • Experience in upgrades, patches and installation of ecosystem products through Ambari.
  • Worked with Kafka on a proof of concept for log processing on a distributed system.
  • Performed manual upgrades and MRv1 installation with Cloudera Manager.
  • Coordinated Kafka operations and monitoring (via JMX) with DevOps personnel.
  • Involved in creating Hive tables, loading data, and writing Hive queries.
  • Completed a proof of concept using Apache NiFi workflows in place of Oozie to automate data-loading tasks.
  • Configured CDH Dynamic Resource Pools to schedule and allocate resources to YARN applications.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
  • Monitored the Hadoop cluster using tools such as Nagios, Ganglia, Ambari and Cloudera Manager.
  • Implemented Apache Impala for data processing on top of Hive.
  • Scheduled jobs using Oozie workflows.
  • Worked with Bitbucket, Git and Bamboo to deploy EMR clusters.
  • Worked on the disaster recovery plan for the Hadoop cluster by implementing cluster data backup in Amazon S3 buckets.
  • Installed and configured DataStax OpsCenter and Nagios for Cassandra cluster maintenance and alerts.
  • Worked with Talend to load data into Hive tables, perform ELT aggregations in Hive and extract data from Hive.
  • Worked on a POC for streaming data using Kafka and Spark Streaming.
  • Implemented a Kafka consumer with Spark Streaming and Spark SQL using Scala.
  • Used AWS S3 and local hard disk as the underlying file system (HDFS) for Hadoop.
  • Created Cluster utilization reports for capacity planning and tuning resource allocation for YARN Jobs.
  • Implemented high availability for Cloudera production clusters.
  • Worked with the Hortonworks Sandbox distribution and its versions HDP 2.4.0 and HDP 2.5.0.
  • Used Cloudera Navigator for data governance: audit and lineage.
  • Configured Apache Sentry for fine-grained authorization and role-based access control of data in Hadoop.
  • Monitoring performance and tuning configuration of services in Hadoop Cluster.
  • Worked on resolving production issues, documenting root cause analysis and updating the tickets using ITSM.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Creation of Users, Groups and mount points for NFS support.
  • As a Lead of Data Services team, built Hadoop cluster on Azure HD Insight Platform and deployed Data analytic solutions using tools like Spark and BI reporting tools.
  • Imported data from relational databases into HDFS using Sqoop.
  • Involved in creating Hive databases and tables and loading flat files.
  • Configured Apache Phoenix on top of HBase to query data through SQL.

Environment: Oozie, CDH 5.8, 5.9 and 5.10 Hadoop Cluster, Bitbucket, Git, Ansible, NiFi, AWS, EC2, S3, HDFS, Hive, Impala, Pig, YARN, Sqoop, Python, Elasticsearch, Flume, RHEL6 EC2, Teradata, Splunk, SQL.

Hadoop Administrator

Confidential - St. Louis, MO


  • Responsible for installing, configuring, supporting and managing of Hadoop Clusters.
  • Managed and reviewed Hadoop Log files as a part of administration for troubleshooting purposes.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Installed, configured and maintained Hadoop clusters in Cloudera and Hortonworks distributions.
  • Experience with Azure components and APIs.
  • Thorough knowledge of the Azure IaaS and PaaS platforms.
  • Managed an Azure-based SaaS environment.
  • Configured and installed Splunk Enterprise, Agent, and Apache Server for user and role authentication and SSO.
  • Monitored system performance, including virtual memory, swap space, disk utilization and CPU utilization, using Nagios.
  • Worked with Azure Data Lake and Data Factory.
  • Used Hortonworks versions 2.5 and 2.6.
  • Worked on configuring Hadoop cluster on AWS.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Created Hive tables and loaded data from the local file system to HDFS.
  • Created user accounts and gave users access to the Hadoop cluster.
  • Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without any effect on running nodes and data.
  • Experience on Oracle OBIEE.
  • Involved in designing Cassandra data model for cart and checkout flow.
  • Made changes to the cluster's configuration properties based on the volume of data being processed and the performance of the cluster.
  • Handled upgrades and patch updates.
  • Set up automated processes to analyze the system and Hadoop log files for predefined errors and send alerts to the appropriate groups.
  • Maintained a Hortonworks cluster with HDP stack 2.4.2 managed by Ambari 2.2.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Responsible for HBase REST server administration, backup and recovery.
  • As a Hadoop admin, monitoring cluster health status on daily basis, tuning system performance related configuration parameters, backing up configuration xml files.
  • Installed and maintained Splunk add-ons, including DB Connect 1 and Active Directory LDAP, for working with directory and SQL databases.
  • Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured that they could read the data and write to HDFS without any issues.
  • Involved in collecting metrics for Hadoop clusters using Ganglia and Ambari.
  • Prepared the Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.
  • Supported Data Analysts in running MapReduce Programs.
  • Responsible for deploying patches and remediating vulnerabilities.
  • Provided highly available and durable data using AWS S3 data store.
  • Experience in setting up Test, QA, and Prod environment.
  • Involved in loading data from UNIX file system to HDFS.
  • Conducted root cause analysis (RCA) for high-severity incidents.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.
  • Worked hands-on with the ETL process. Handled importing data from various data sources and performed transformations.
  • Documented the procedures performed for project development.
  • Assigned tasks to the offshore team and coordinated with them for successful completion of deliverables.

Environment: RedHat/SUSE Linux, EM Cloud Control, Cloudera 4.3.2, HDFS, Hive, Sqoop, Zookeeper and HBase, HDFS MapReduce, Pig, NoSQL, Oracle 9i/10g/11g RAC with Solaris/Red Hat, Exadata Machines X2/X3, HDP, Toad, MYSQL plus, Oracle Enterprise Manager (OEM), RMAN, Shell Scripting, Golden Gate, Azure platform, HDInsight.

Hadoop Administrator



  • Experience in managing scalable Hadoop cluster environments.
  • Involved in managing, administering and monitoring clusters in Hadoop Infrastructure.
  • Diligently teamed with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
  • Collaborated with application teams to install operating system and Hadoop updates, patches and version upgrades when required.
  • Installed, configured, supported and managed the Hortonworks Hadoop cluster.
  • Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured that they were able to read the data to HDFS without any issues.
  • Loading data into Splunk including syslog and log files.
  • Experience in HDFS maintenance and administration.
  • Managing nodes on Hadoop cluster connectivity and security.
  • Experience in commissioning and decommissioning of nodes from cluster.
  • Experience in Name Node HA implementation.
  • Worked on architected solutions that process massive amounts of data on corporate and AWS cloud-based servers.
  • Worked with cloud services such as Amazon Web Services (AWS) and was involved in ETL, data integration and migration, and Kafka installation.
  • Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
  • Set up and managed HA NameNode and NameNode federation using Apache Hadoop 2.0 to avoid single points of failure in large clusters.
  • Set up checkpoints to gather system statistics for critical setups.
  • Held regular discussions with other technical teams regarding upgrades, process changes, special processing and feedback.
  • Worked with data delivery teams to set up new Hadoop users.
  • Installed Oozie workflow engine to run multiple Map Reduce, Hive and pig jobs.
  • Configured Metastore for Hadoop ecosystem and management tools.
  • Worked on evaluating, designing, installation/setup of Hortonworks 2.1/1.8 Big Data ecosystem, which includes Hadoop, Pig, Hive, Sqoop etc.
  • Hands-on experience in Nagios and Ganglia monitoring tools.
  • Experience in HDFS data storage and support for running Map Reduce jobs.
  • Performing tuning and troubleshooting of MR jobs by analyzing and reviewing Hadoop log files.
  • Installing and configuring Hadoop eco system like Sqoop, Pig, Flume, and Hive.
  • Maintained and monitored clusters. Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Experience in using distcp to migrate data between and across clusters.
  • Installed and configured Zookeeper.
  • Monitored the data streaming between web sources and HDFS.
  • Monitored Hadoop cluster functioning through monitoring tools.
  • Closely monitored and analyzed MapReduce job executions on the cluster at task level.
  • Provided input to development on efficient utilization of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks.
  • Hands-on experience in analyzing log files for Hadoop ecosystem services.
  • Coordinated root cause analysis efforts to minimize future system issues.
  • Troubleshot hardware issues and worked closely with various vendors on hardware/OS and Hadoop issues.

Environment: Cloudera 4.2, HDFS, Hive, Pig, Sqoop, HBase, Mahout, Tableau, MicroStrategy, Shell Scripting, RedHat Linux.

Linux System Admin



  • Installed and maintained Linux servers.
  • Installed RedHat Linux using kickstart.
  • Responsible for managing RedHat Linux Servers and Workstations.
  • Created, cloned Linux Virtual Machines, templates using VMware Virtual Client 3.5 and migrating servers between ESX hosts.
  • Managed systems routine backup, scheduling jobs, enabling cron jobs, enabling system logging and network logging of servers for maintenance.
  • Performed RPM and YUM package installations, patching and other server management tasks.
  • Created, modified, disabled and deleted UNIX user accounts and email accounts as per the FGI standard process.
  • Quickly arranged hardware repair in the event of hardware failure.
  • Handled patch management, with patch updates on a quarterly basis.
  • Set up security for users and groups, along with firewall and intrusion detection systems.
  • Added, deleted and modified UNIX groups using the standard processes; reset user passwords and locked/unlocked user accounts.
  • Effective management of hosts and auto mount maps in NIS, DNS and Nagios.
  • Monitoring System Metrics and logs for any problems.
  • Security Management, providing/restricting login and sudo access on business specific and Infrastructure servers & workstations.
  • Installed and maintained Windows 2000 and XP Professional, DNS, DHCP and WINS for the Bear Stearns domain.
  • Used LDAP to authenticate users in Apache and other user applications.
  • Performed remote administration using Terminal Services, VNC and pcAnywhere.
  • Create/remove windows accounts using Active Directory.
  • Running crontab to back up data and troubleshooting Hardware/OS issues.
  • Involved in Adding, removing, or updating user account information, resetting passwords etc.
  • Maintained the RDBMS server and database authentication for required users.
  • Handled and debugged escalations from the L1 team.
  • Took backups at regular intervals and prepared a sound disaster recovery plan.
  • Correspondence wif Customer, to suggest changes and configuration for their servers.
  • Maintained server, network, and support documentation including application diagrams.

Environment: Oracle, Shell, PL/SQL, DNS, TCP/IP, Apache Tomcat, HTML and UNIX/Linux.


Big Data Technologies: HDFS, Hive, MapReduce, Cassandra, Pig, Hcatalog, Sqoop, Flume, Zookeeper, Kafka, Mahout, Oozie, CDH, HDP

Tools: Quality Center v11.0\ALM, TOAD, JIRA, HP UFT, Selenium, Kerberos, JUnit

Programming Languages: Shell Scripting, Puppet, Scripting, Python, Bash, CSH, Java

QA Methodologies: Waterfall, Agile, V-model.

Front End Technologies: HTML, XHTML, CSS, XML, JavaScript, AJAX, Servlets, JSP

Java Frameworks: MVC, Apache Struts2.0, Spring and Hibernate

Domain Knowledge: GSM, WAP, GPRS, CDMA and UMTS (3G)

Web Services: SOAP (JAX-WS), WSDL, SOA, RESTful (JAX-RS), JMS

Application Servers: Apache Tomcat, Web Logic Server, Web Sphere, JBoss

Databases: Oracle 11g, MySQL, MS SQL Server, IBM DB2

NoSQL Databases: HBase, MongoDB, Cassandra

Operating Systems: Linux, UNIX, MAC, Windows
