
Hadoop Security Administrator Resume


Jersey City, NJ

SUMMARY:

  • Results-driven professional with a progressive IT career and 6+ years of experience, including 3+ years in Big Data Hadoop administration.
  • Strong experience with the Hortonworks and Cloudera distributions of Hadoop and their ecosystem components.
  • Experience in installation, management and monitoring of Apache Hadoop clusters using Cloudera Manager.
  • Good troubleshooting skills across the Hadoop stack, ETL services, and the Hue and RStudio GUIs that developers and business users rely on for day-to-day activities.
  • Experience setting up a 15-node cluster in an Ubuntu environment.
  • Expert level understanding of the AWS cloud computing platform and related services.
  • Experience in managing the Hadoop Infrastructure with Cloudera Manager and Ambari.
  • Working experience importing and exporting data between MySQL and HDFS/Hive using Sqoop (see the Sqoop sketch after this summary).
  • Working experience with the ETL/data-integration tool Talend.
  • Good knowledge of scripting languages such as Shell, Python, PowerShell and Groovy.
  • Strong knowledge of Hadoop cluster capacity planning, performance tuning, monitoring and troubleshooting.
  • Experience in backup configuration and recovery from NameNode failures.
  • Strong experience on Red Hat server administration and Kerberos Security.
  • Certified Java/J2EE developer with 2+ years of experience in Java/J2EE system design, development, testing, integration, implementation and maintenance of n-tier web-based applications and server-side components in enterprise environments.
  • Strong experience in distributed server operations and ITIL, working with incident, change and problem management.
  • At PayPal, worked as lead Big Data Hadoop administrator on one of the world's largest Hadoop environments: multiple clusters totaling 1,500+ nodes with 50 petabytes of scalable storage.
  • Optimized the configuration of MapReduce, Pig and Hive jobs for better performance.
  • Advanced understanding of Hadoop architecture components such as HDFS and YARN.
  • Strong experience configuring Hadoop ecosystem tools including Pig, Hive, HBase, Sqoop, Flume, Kafka, Oozie, Zookeeper, Spark and Storm.
  • Experience in designing, installing, configuring, supporting and managing Hadoop clusters using Apache, Hortonworks and Cloudera.
  • Handled 10+ Hadoop clients with efficient delivery based on customer needs.
  • Installed and configured the monitoring tools Munin and Nagios to monitor network bandwidth and hard drive status.
  • Responsible for configuring, integrating, and maintaining all Development, QA, Staging and Production PostgreSQL databases within the organization.
  • Responsible for all backup, recovery, and upgrading of all of the PostgreSQL databases.
  • Experience with Datameer as well as big data Hadoop. Experienced in NoSQL databases such as HBase and MongoDB; stored and managed user data in MongoDB.
  • Good experience in installation and upgrades of VMware. Automated server builds using SystemImager, PXE, Kickstart and Jumpstart.
  • Experience in commissioning, decommissioning, balancing and managing nodes, and tuning servers for optimal cluster performance.
  • Experienced with DevOps tools such as Chef, Puppet, Ansible, Jenkins, Jira, Docker and Splunk.
  • Involved in cluster maintenance, bug fixing, troubleshooting and monitoring, and followed proper backup and recovery strategies.
  • In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce concepts.
  • Management of security in Hadoop clusters using Kerberos, Ranger, Knox and ACLs.
  • Excellent experience in Shell Scripting.
  • Expertise in Hadoop solution architecture and the design and delivery of data and analytics requirements.
  • Created strategies and plans for data security, disaster recovery and total business continuity.
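
Illustrative Sqoop usage (a minimal sketch only; the MySQL host, database, table names and HDFS paths below are hypothetical placeholders, not from an actual engagement):

    #!/usr/bin/env bash
    # Minimal Sqoop sketch -- connection string, tables and HDFS paths are placeholders.

    # Import a MySQL table into HDFS with 4 parallel mappers
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/salesdb \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # The same import can land directly in Hive by adding --hive-import --hive-table orders

    # Export processed results from HDFS back to MySQL
    sqoop export \
      --connect jdbc:mysql://dbhost.example.com:3306/salesdb \
      --username etl_user -P \
      --table orders_summary \
      --export-dir /data/curated/orders_summary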

TECHNICAL SKILLS:

Big Data Technologies: HDFS, MapReduce, Hive, Pig, HBase, Cassandra, HCatalog, Phoenix, Falcon, Sqoop, Zookeeper, NiFi, Mahout, Flume, Oozie, Avro, Storm; Hortonworks and Cloudera distributions.

Scripting Languages: Shell Scripting, Puppet, Python, Bash, CSH, Ruby, PHP

Databases: Oracle 11g, MySQL, MS SQL Server, Hbase, Cassandra, MongoDB

Networks: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP

Monitoring Tools: Cloudera Manager, Solr, Ambari, Nagios, Ganglia

Application Servers: Apache Tomcat, Weblogic Server, WebSphere

Security: Kerberos, Ranger, Ranger KMS, Knox.

Reporting Tools: Cognos, Hyperion Analyzer, OBIEE & BI+

ELK Stack: Elasticsearch, Logstash, Kibana

Automation Tools: Puppet, Chef, Ansible

PROFESSIONAL EXPERIENCE:

Confidential - Jersey City, NJ

Hadoop Security Administrator

Responsibilities:

  • Hadoop installation and configuration of multi-node clusters on the Hortonworks platform.
  • Designed and implemented end to end big data platform solution on AWS.
  • Manage Hadoop clusters in production, development, Disaster Recovery environments.
  • Implemented SignalHub, a data science tool, and configured it on top of HDFS.
  • Designed and implemented disaster recovery for the PostgreSQL 11.1 database; configured third-party client software (pgAdmin III, phpPgAdmin) to access the database server.
  • Provisioning, installing, configuring, monitoring and maintaining HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig, Hive, Ranger, Ranger KMS, Falcon, SmartSense, Storm and Kafka.
  • Recovering from node failures and troubleshooting common Hadoop cluster issues.
  • Complete end-to-end design and development of an Apache NiFi flow that acts as the agent between the middleware and EBI teams and executes all required actions.
  • Scripting Hadoop package installation and configuration to support fully-automated deployments.
  • Automated Hadoop deployment using Ambari blueprints and the Ambari REST API (see the blueprint sketch after this list).
  • Automated Hadoop and cloud deployment using Ansible.
  • Integrated active directory with Hadoop environment.
  • Patch, Upgrade and keep the PostgreSQL DBs current. Develop and enhance scripts to automate and execute various DBA tasks.
  • Designed and developed the User Interface module using JSP, JQuery, HTML and JavaScript.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Configured Ranger for policy-based authorization and fine-grained access control to the Hadoop cluster.
  • Implemented HDFS encryption by creating encryption zones in HDFS (see the encryption-zone sketch after this list).
  • Configured Ranger KMS to manage encryption keys.
  • Implemented a multitenant Hadoop cluster and onboarded tenants to the cluster.
  • Achieved data isolation through Ranger policy-based access control.
  • Used the YARN Capacity Scheduler to define compute capacity.
  • Responsible for building a cluster on HDP 2.5.
  • Worked closely with developers to investigate problems and make changes to the Hadoop environment and associated applications.
  • Expertise in recommending hardware configurations for Hadoop clusters.
  • Installing, upgrading and managing Hadoop clusters on Hortonworks.
  • Troubleshooting cloud-related issues such as DataNodes going down, network failures, login issues and missing data blocks.
  • Managing and reviewing Hadoop log files.
  • Proven results-oriented person with a focus on delivery.
  • Built a NiFi system for replicating the entire database.
  • Performed Importing and exporting data into HDFS and Hive using Sqoop.
  • Managed cluster coordination services through Zookeeper.
  • System/cluster configuration and health check-up.
  • Continuous monitoring and managing the Hadoop cluster through Ambari.
  • Created user accounts and granted users access to the Hadoop cluster.
  • Resolved tickets submitted by users, troubleshooting and documenting errors and their resolutions.
  • Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without affecting running jobs or data.
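
Blueprint-based deployment sketch (illustrative only; the Ambari host, credentials, blueprint/cluster names and JSON payload files are placeholders):

    #!/usr/bin/env bash
    # Sketch of automating cluster creation through the Ambari REST API.
    AMBARI=http://ambari.example.com:8080

    # Register a blueprint (stack version, host groups, configurations)
    curl -u admin:admin -H 'X-Requested-By: ambari' -X POST \
      -d @hdp25_blueprint.json \
      "$AMBARI/api/v1/blueprints/hdp25-blueprint"

    # Create a cluster from the blueprint using a host-mapping template
    curl -u admin:admin -H 'X-Requested-By: ambari' -X POST \
      -d @cluster_template.json \
      "$AMBARI/api/v1/clusters/prod-cluster"

    # Track the provisioning request returned by the previous call
    curl -u admin:admin -H 'X-Requested-By: ambari' \
      "$AMBARI/api/v1/clusters/prod-cluster/requests/1"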
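
Encryption-zone sketch (a minimal example under assumed names; the keytab path, key name and HDFS directory are placeholders):

    #!/usr/bin/env bash
    # Sketch of creating an HDFS encryption zone backed by a Ranger KMS-managed key.

    # Authenticate as the HDFS superuser on the Kerberized cluster
    kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs

    # Create an encryption key in the configured KMS (Ranger KMS)
    hadoop key create projectA_key

    # Create the directory and mark it as an encryption zone
    hdfs dfs -mkdir -p /data/projectA/secure
    hdfs crypto -createZone -keyName projectA_key -path /data/projectA/secure

    # Verify the zone was created
    hdfs crypto -listZones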

Environment: HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, HDP 2.5, Ambari 2.4, Spark, Solr, Storm, Knox, CentOS 7 and MySQL.

Confidential - Brooklyn, NY

Hadoop Administrator

Responsibilities:

  • Installed, Configured and Maintained the Hadoop cluster for application development and Hadoop ecosystem components like Hive, Pig, HBase, Zookeeper and Sqoop.
  • In depth understanding of Hadoop Architecture and various components such as HDFS, Name Node, Data Node, Resource Manager, Node Manager and YARN / Map Reduce programming paradigm.
  • Monitoring the Hadoop cluster through Cloudera Manager and implementing alerts based on error messages; providing reports to management on cluster usage metrics and charging back customers based on their usage.
  • Extensively worked on commissioning and decommissioning of cluster nodes, file system integrity checks and maintaining cluster data replication.
  • Responsible for Installing, setup and Configuring Apache Kafka and Apache Zookeeper.
  • Responsible for efficient operations of multiple Cassandra clusters.
  • Implemented a Python script that calculates cycle time from a REST API and fixes incorrect cycle-time data in the Oracle database.
  • Developed a NiFi Workflow to pick up the data from Data Lake as well as from server and send that to Kafka broker.
  • Involved in developing new workflow MapReduce jobs using the Oozie framework.
  • Collected log data from web servers and ingested it into HDFS using Flume.
  • Created NiFi flows to trigger Spark jobs and used PutEmail processors to send notifications on failures.
  • Worked on installing cluster, commissioning & decommissioning of Data Nodes, NameNode recovery, capacity planning, and slots configuration.
  • Involved and experienced in Cassandra cluster connectivity and security.
  • Very good understanding of assigning the number of mappers and reducers in a MapReduce cluster.
  • Experience migrating ETL processes from Oracle to Hive to test easier data manipulation.
  • Setting up HDFS quotas to enforce a fair share of computing resources (see the quota commands after this list).
  • Strong knowledge of configuring and maintaining YARN schedulers (Fair and Capacity).
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions (see the health-check sketch after this list).
  • Integrated Apache Storm with Kafka to perform web analytics; uploaded clickstream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
  • Experience in projects involving movement of data from other databases to Cassandra with basic knowledge of Cassandra Data Modeling.
  • Used ANT as the build tool for building the application and deploying it on the Sun Java System Application Server (SJSAS).
  • Configured explicit partitioning of messages over Kafka servers and distributed consumption over a cluster of consumer machines while maintaining per-partition ordering semantics.
  • Supported parallel data loads into Hadoop.
  • Involved in setting up HBase which includes master and region server configuration, High availability configuration, performance tuning and administration.
  • Created user accounts and provided access to the Hadoop cluster.
  • Upgraded cluster from CDH 5.3 to CDH 5.7 and Cloudera manager from CM 5.3 to 5.7.
  • Involved in loading data from UNIX file system to HDFS.
  • Worked on ETL processes, handling imports from various data sources and performing transformations.
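
HDFS quota commands (illustrative; the tenant directory and limits are placeholders):

    #!/usr/bin/env bash
    # Sketch of enforcing per-tenant fair share with HDFS quotas.

    # Cap the number of names (files + directories) under a tenant directory
    hdfs dfsadmin -setQuota 1000000 /user/tenant_a

    # Cap raw space at 10 TB (the space quota counts replicated block size)
    hdfs dfsadmin -setSpaceQuota 10t /user/tenant_a

    # Review current usage against both quotas
    hdfs dfs -count -q -h /user/tenant_a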
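
Daemon health-check sketch (a minimal example; the daemon list and alert address are illustrative, not from the actual scripts):

    #!/usr/bin/env bash
    # Check that core Hadoop daemons are running on this host and alert if any are down.

    DAEMONS="NameNode DataNode ResourceManager NodeManager"
    ALERT_TO="hadoop-ops@example.com"

    for daemon in $DAEMONS; do
      # jps lists running JVMs by main class; grep for each daemon name
      if ! jps | grep -qw "$daemon"; then
        echo "$(date): $daemon is not running on $(hostname)" \
          | mail -s "Hadoop daemon down: $daemon" "$ALERT_TO"
      fi
    done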

Environment: Hadoop, MapReduce, Shell Scripting, Spark, Pig, Hive, HDFS, YARN, Hue, Sentry, Oozie, Zookeeper, Impala, Solr, Kerberos, cluster health monitoring, Puppet, Ganglia, Nagios, Flume, Sqoop, Storm, Kafka, KMS

Confidential - New York City, NY

Hadoop Administrator

Responsibilities:

  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Extensively involved in Installation and configuration of Cloudera distribution, Namenode, Secondary Name Node, Job Tracker, Task Trackers and Data Nodes.
  • Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Installed and configured Hadoop, MapReduce, HDFS (Hadoop Distributed File System), developed multiple MapReduce jobs for data cleaning.
  • Involved in clustering of Hadoop in the network of 70 nodes.
  • Experienced in loading data from UNIX local file system to HDFS.
  • Experienced with application servers such as BEA WebLogic 8.1/9.2, JBoss 4.2, Apache Tomcat 3.0/5.5, Oracle Application Server 10.1.2 and Sun Java System Application Server.
  • Developed data pipeline using Flume, Sqoop, Pig and Java map reduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Worked on monitoring of VMware virtual environments with ESXi 4 servers and Virtual Center. Automated tasks using shell scripting for doing diagnostics on failed disk drives.
  • Developed PIG Latin scripts to extract the data from the web server output files to load into HDFS.
  • Used Pig as an ETL tool to perform transformations, event joins and some pre-aggregations before storing the data in HDFS.
  • Involved in the installation of CDH3 and up-gradation from CDH3 to CDH4.
  • Responsible for developing data pipeline using HDInsight, flume, Sqoop and pig to extract the data from weblogs and store in HDFS.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs (see the Oozie sketch after this list).
  • Use of Sqoop to import and export data from HDFS to RDBMS and vice-versa.
  • Used Hive and created Hive external/internal tables and involved in data loading and writing Hive UDFs.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports.
  • Involved in migration of ETL processes from Oracle to Hive to test the easy data manipulation.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Created Hive external tables, loaded data into them, and queried the data using HQL (see the Hive sketch after this list).
  • Created Hive queries to compare the raw data with EDW reference tables and performing aggregates.
  • Wrote shell scripts to automate rolling day-to-day processes.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop.
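
Oozie submission sketch (illustrative; the Oozie URL, job.properties contents and job id are placeholders, and the workflow.xml defining the Hive and Pig actions is assumed to already sit in HDFS):

    #!/usr/bin/env bash
    # Submit and monitor an Oozie workflow that chains Hive and Pig actions.

    OOZIE_URL=http://oozie.example.com:11000/oozie

    # job.properties typically defines nameNode, the resource manager address
    # and oozie.wf.application.path pointing at the workflow directory in HDFS.
    oozie job -oozie "$OOZIE_URL" -config job.properties -run

    # Check the status of the submitted workflow (the id is printed by -run)
    oozie job -oozie "$OOZIE_URL" -info 0000001-190101000000000-oozie-oozi-W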
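
Hive external-table sketch (a minimal example; the HiveServer2 URL, columns and HDFS location are placeholders):

    #!/usr/bin/env bash
    # Create a Hive external table over data already staged in HDFS and query it.

    beeline -u "jdbc:hive2://hiveserver.example.com:10000/default" -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
      ip     STRING,
      ts     STRING,
      url    STRING,
      status INT
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE
    LOCATION '/data/raw/web_logs';

    SELECT status, COUNT(*) AS hits
    FROM web_logs
    GROUP BY status;
    "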

Environment: Hadoop, HDFS, MapReduce, Impala, Sqoop, HBase, Hive, Flume, Oozie, Zookeeper, Solr, performance tuning, cluster health monitoring, security, Shell Scripting, NoSQL (HBase/Cassandra), Cloudera Manager.

Confidential

Hadoop Administrator

Roles & Responsibilities:

  • Hadoop administrator managing multi-node Hortonworks HDP (2.6.0.3/2.4.2) distributions across 3 clusters for Dev, Pre-Prod and Prod environments, with 200+ nodes and an overall storage capacity of 5 PB.
  • Day-to-day responsibilities include monitoring and troubleshooting incidents, resolving developer issues and Hadoop ecosystem runtime failures, enabling security policies, and managing data storage and compute resources.
  • Automating multiple Cassandra activities, such as running repairs and cleaning snapshots, using shell scripts and cron jobs (see the maintenance sketch after this list).
  • Responsible for cluster maintenance, monitoring and troubleshooting, managing and reviewing log files, and providing 24x7 on-call support on a scheduled rotation.
  • Hands on experience in installation, configuration, management and development of big data solutions using Hortonworks distributions.
  • Installed Apache NiFi (Hortonworks DataFlow) to make data ingestion from the Internet of Anything fast, easy and secure.
  • Wrote Apache Spark Scala code to move data between Cassandra and FiloDB tables.
  • Responsibilities include implementing change orders for creating HDFS folders, Hive databases/tables and HBase namespaces; commissioning and decommissioning DataNodes; troubleshooting; and managing and reviewing data backups and log files.
  • Implemented HDP upgrade from 2.4.2 to 2.6.0.3 version.
  • Implemented high availability for the NameNode, ResourceManager, HBase, Hive and Knox services.
  • Installed and configured new Hadoop components, including Atlas, Phoenix and Zeppelin, and upgraded the cluster with proper strategies.
  • Diligently teaming with the infrastructure, network, database and application teams to guarantee high data quality and availability.
  • Aligning with the systems engineering team to propose and help deploy new hardware and software environments required for Hadoop and to expand existing environments.
  • Analyzed Linux system performance to identify memory, disk I/O and network problems.
  • Troubleshot issues with Hive, HBase, Pig and Spark/Scala scripts to isolate and fix problems.
  • Screened Hadoop cluster job performance and handled capacity planning.
  • Periodically reviewed Hadoop-related logs, fixed errors and prevented recurrences by analyzing warnings.
  • Good experience troubleshooting production-level issues in the cluster and its functionality.
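
Cassandra maintenance sketch (illustrative only; the keyspace, log path and cron schedule are placeholders):

    #!/usr/bin/env bash
    # Routine Cassandra maintenance run from cron: repair this node's primary
    # ranges and clear old snapshots.

    LOG=/var/log/cassandra/maintenance.log
    {
      echo "$(date): starting primary-range repair"
      nodetool repair -pr my_keyspace      # repair only this node's primary ranges
      echo "$(date): clearing snapshots"
      nodetool clearsnapshot               # remove snapshots held on this node
    } >> "$LOG" 2>&1

    # Example crontab entry (weekly, Sunday 02:00):
    # 0 2 * * 0 /opt/scripts/cassandra_maintenance.sh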

Environment: Hortonworks HDP 2.6.0.3, HBase, Hive, Ambari 2.5.0.3, Linux, Azure Cloud.
