Hadoop Administrator Resume

FL

SUMMARY:

  • 7+ years of IT experience, including 4 years in Hadoop administration and Big Data technologies and 3+ years in Linux administration, with hands-on experience in the following areas.
  • Hands-on experience productionizing Hadoop applications (i.e., administration, configuration, management, monitoring, debugging, and performance tuning).
  • Experience in software configuration, build, release, deployment and DevOps with Windows and UNIX based operating systems
  • Installation, configuration, supporting and managing Hadoop Clusters using Hortonworks, Cloudera, MapR.
  • Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
  • Planning, Installing and Configuring Hadoop Cluster in Cloudera and Hortonworks Distributions.
  • Excellent understanding of Hadoop architecture and underlying framework including storage management.
  • Experience in building new OpenStack Deployment through Puppet and managing them in production environment.
  • Have worked extensively on Pivotal HD (3.0), Hortonworks (HDP 2.3), MapR, EMR, and Cloudera (CDH5) distributions.
  • Hands on experience in creating and upgrading Cassandra clusters
  • Experience in using various Hadoop ecosystem components such as MapReduce, Pig, Hive, Zookeeper, HBase, Sqoop, YARN 2.0, Scala, Spark, Kafka, Storm, Impala, Oozie, and Flume for data storage and analysis.
  • Experience with Oozie Scheduler in setting up workflow jobs with MapReduce and Pig jobs.
  • Knowledge of the architecture and functionality of NoSQL databases like HBase, Cassandra, and MongoDB.
  • Extending Hive and Pig core functionality by using custom UDFs.
  • Experience in troubleshooting errors in HBase Shell/API, Pig, Hive, Sqoop, Flume, Spark and MapReduce.
  • Experience in importing and exporting data using Sqoop from HDFS to relational database systems/mainframes and vice versa.
  • Collected log data from various sources and integrated it into HDFS using Flume.
  • Experienced in running MapReduce and Spark jobs over YARN.
  • Good understanding of HDFS Designs, Daemons and HDFS high availability (HA).
  • Good experience in Big data analytics tools like Tableau and Trifacta.
  • Setting up Linux environments: passwordless SSH, creating file systems, disabling firewalls, swappiness, and SELinux, and installing Java.
  • Provisioning and managing multi-tenant Hadoop clusters on public cloud (Amazon Web Services (AWS) EC2) and on private cloud infrastructure (OpenStack cloud platform).
  • Implementing a Continuous Integration and Continuous Delivery framework using Jenkins, Puppet, Maven, and Nexus in a Linux environment. Integration of Maven/Nexus, Jenkins, and UrbanCode Deploy with Patterns/Release, Git, Confluence, Jira, and Cloud Foundry.
  • Hands-on experience configuring and managing Zookeeper for NameNode failure scenarios.
  • Worked on Hadoop Security with MIT Kerberos, Ranger with LDAP.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup and creating realms/domains.
  • Extensive experience in data analysis using tools like Syncsort and HZ along with shell scripting and UNIX.
  • Experience with writing Oozie workflows and Job Controllers for job automation.

TECHNICAL SKILLS:

Big Data Technologies: HDFS, Hive, MapReduce, Cassandra, Pig, HCatalog, Phoenix, Falcon, Sqoop, Flume, Zookeeper, Mahout, Oozie, Avro, HBase, Storm, CDH 5.3, CDH 5.4

Scripting Languages: Shell Scripting, Puppet, Python, Bash, CSH, Ruby, PHP

Databases: Oracle 11g, MySQL, MS SQL Server, HBase, Cassandra, MongoDB

Networks: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP

Monitoring Tools: Cloudera Manager, Solr, Ambari, Nagios, Ganglia

Application Servers: Apache Tomcat, WebLogic Server, WebSphere

Reporting Tools: Kerberos Cognos, Hyperion Analyzer, OBIEE & BI+

ELK Stack: Elasticsearch, Logstash, Kibana

PROFESSIONAL EXPERIENCE:

Hadoop Administrator

Confidential, FL

Roles & Responsibilities:

  • Installed and configured Hadoop on YARN and other ecosystem components.
  • Configured and used HCatalog to access table data maintained in the Hive metastore and reused the same table information for processing in Pig.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning of DataNodes, troubleshooting, cluster planning, managing and reviewing data backups, and managing and reviewing log files.
  • Worked with the Data Science team to gather requirements for various data mining projects.
  • Installed 5 Hadoop clusters for different teams and developed a data lake that serves as the base layer for developers to store and analyze data. Provided services to developers: installed their custom software, upgraded Hadoop components, resolved their issues, and helped troubleshoot long-running jobs. Provided L3 and L4 support for the data lake and managed clusters for other teams.
  • Built automation frameworks for data ingestion, processing, and storage in Python and Scala, using NoSQL and SQL databases along with Chef, Puppet, Kibana, Elasticsearch, Tableau, GoCD, and Red Hat infrastructure.
  • Worked in a combined DevOps and Hadoop admin role: handled L3 issues, installed new components as requirements arose, automated as much as possible, and implemented a CI/CD model.
  • Implemented security on the Cloudera Hadoop cluster using Kerberos, working with the operations team to move the cluster from non-secured to secured.
  • Responsible for upgrading Cloudera CDH5 and MapReduce 2.0 with YARN in Multi Clustered Node environment. Handled importing of data from various data sources, performed transformations using Hive, Map Reduce, Spark and loaded data into HDFS.
  • Hadoop security setup using MIT Kerberos, AD integration (LDAP) and Sentry authorization.
  • Migrated services from a managed hosting environment to AWS including: service design, network layout, data migration, automation, monitoring, deployments and cutover, documentation, overall plan, cost analysis, and timeline.
  • Managed Amazon Web Services (AWS) infrastructure with automation and configuration management tools such as Chef, Ansible, and Puppet, as well as custom-built tooling; designed cloud-hosted solutions using the AWS product suite.
  • Configured Zookeeper to provide node coordination for clustering support.
  • Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
  • Configured Kafka for efficiently collecting, aggregating, and moving large amounts of clickstream data from many different sources to MapR-FS.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Performed a Major upgrade in production environment from CDH4 to CDH5.
  • As an admin, followed standard backup policies to ensure high availability of the cluster.
  • Monitored multiple Hadoop clusters environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Cloudera. Installed and configured Hortonworks and Cloudera distributions on single node clusters for POCs.
  • Used Trifacta for data cleansing and data wrangling.
  • Wrote MapReduce jobs using the Java API for data analysis.
  • Developed Python, Shell/Perl, and PowerShell scripts for automation.
  • Implementing a Continuous Delivery framework using Jenkins, Puppet, and Maven & Nexus in Linux environment. Integration of Maven/Nexus, Jenkins, Urban Code Deploy with Patterns/Release, Git, Confluence, Jira and Cloud Foundry.
  • Involved in running Hadoop jobs for processing millions of records of text data. Troubleshot build issues during the Jenkins build process. Implemented Docker to create containers for Tomcat servers and Jenkins.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Used ServiceNow and JIRA to track issues; managed and reviewed log files for troubleshooting as part of administration, meeting SLAs on time.

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, HBase, ANT and Maven, Chef, Puppet, DevOps, Jenkins, ClearCase.

Hadoop Administrator

Confidential - San Jose, CA

Roles & Responsibilities:

  • Worked on a live Big Data Hadoop production environment with 200 nodes.
  • Configured, installed, and monitored MapR Hadoop on 10 AWS EC2 instances and configured MapR on Amazon EMR, making AWS S3 the default file system for the cluster.
  • Developed Use cases and Technical prototyping for implementing PIG, HDP, HIVE and HBASE.
  • Analyzed alternative NoSQL data stores and produced extensive documentation comparing HBase and Accumulo.
  • Communicated with developers, using in-depth knowledge of Cassandra data modeling, to convert some applications from Oracle to Cassandra. Responsible for design and development of Big Data applications using Hortonworks Hadoop.
  • Applied module-wise data integrity and data validation practices.
  • Imported and exported data between HDFS and relational database systems using Sqoop.
  • Maintaining and troubleshooting Hadoop core and ecosystem components (HDFS, MapReduce, NameNode, DataNode, JobTracker, TaskTracker, Zookeeper, YARN, Oozie, Hive, Hue, Flume, HBase, and Fair Scheduler). Hands-on experience installing, configuring, administering, debugging, and troubleshooting Apache and DataStax Cassandra clusters.
  • Led the evaluation of Big Data software like Splunk, Hadoop for augmenting the warehouse, identified use cases and led Big Data Analytics solution development for Customer Insights and Customer Engagement teams.
  • Developing data pipeline using Flume, Sqoop, Pig and Java map reduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Worked on identifying and eliminating duplicates in datasets through IDQ 8.6.1 components.
  • Used Kafka to allow a single cluster to serve as the central data backbone for a large organization.
  • Tuned and monitored the Hadoop clusters for memory management and MapReduce job health, keeping the MapReduce jobs that push data from SQL to the NoSQL store running reliably.
  • Configuring Security with Active Directory onto Hadoop using Kerberos, perimeter defense with Knox, and granular access auditing with Ranger.
  • Successfully performed various data migration projects from Oracle to NoSQL databases, and consulting projects at customer sites, using my own Big Data migration products such as Big Data Pumper, MongoDB Pumper, Couchbase Pumper, and NoSQL Viewer.
  • Assembled Puppet Master, Agent and Database servers on Red Hat Enterprise Linux Platforms.
  • Built, stood up, and delivered a Hadoop cluster in pseudo-distributed mode with the NameNode, Secondary NameNode, JobTracker, and TaskTracker running successfully, with Zookeeper installed and configured and Apache Accumulo (a NoSQL store based on Google's Bigtable) stood up in a single-VM environment.
  • Involved in migrating a Java test framework to Python Flask.
  • Scheduled Oozie workflows to run multiple Hive and Pig jobs.
  • Addressed Data Quality Using Informatica Data Quality (IDQ) tool.

Environment: Hadoop, MapReduce, HDFS, Pig, Git, Jenkins, Puppet, Chef, Maven, Spark, YARN, HBase, CDH 5.4, Oozie, MapR, NoSQL, ETL, MySQL, Agile, Windows, UNIX Shell Scripting, Teradata.

System Engineer/Admin

Confidential

Roles & Responsibilities:

  • Wrote a command-line utility to issue commands across hundreds of hosts in parallel, similar to push but through the serial console.
  • Wrote an auditing script to populate and maintain a system that documented addresses for 4500+ hosts.
  • Wrote many small utilities to automate a variety of tasks, from pulling data out of the admin portal, to gathering data from switches, to rebooting systems via ipmitool.
  • Identified repeated issues in production by analyzing production tickets after each release, and strengthened the system testing process to keep those issues from reaching production, improving customer satisfaction.
  • Maintain, document and adhere to strict change control procedures for the automated management of RedHat RHEL, SuSE Linux and Sun Solaris Unix server environments.
  • Develop and maintain monitoring and automation framework built around CFEngine, Shell and Perl.
  • Manage SAN (HP 3Par, EMC) and NAS (Netapp) storage technologies.
  • Manage Veritas VxFS file systems and VCS cluster environment for operating high-availability Oracle databases. Wrote, optimized, and troubleshot dynamically created SQL within procedures.
  • Creating database objects such as Tables, Indexes, Views, Sequences, Primary and Foreign keys, Constraints and Triggers.
  • Responsible for creating virtual environments for the rapid development.
  • Responsible for handling tickets raised by end users, including package installation, login issues, access issues, and user management (adding, modifying, deleting, grouping).
  • Responsible for preventive maintenance of the servers on monthly basis. Configuration of the RAID for the servers. Resource management using the Disk quotas.
  • Responsible for change management release scheduled by service providers.
  • Generating weekly and monthly reports on the tickets worked and sending them to management.
  • Managing Systems operations with final accountability for smooth installation, networking, and operation, troubleshooting of hardware and software in LINUX environment.
  • Identifying operational needs of various departments and developing customized software to enhance System's productivity.
  • Established/implemented firewall rules, Validated rules with vulnerability scanning tools.
  • Proactively detecting Computer Security violations, collecting evidence and presenting results to the management.
  • Accomplished System/e-mail authentication using LDAP enterprise Database.
  • Implemented a database-enabled intranet web site using Linux, Apache, and a MySQL database backend.
  • Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method. Monitored system metrics and logs for any problems.
  • Used crontab to schedule data backups. Applied operating system updates, patches, and configuration changes.
  • Maintained the MySQL server and granted database authentication to required users. Appropriately documented various administrative and technical issues.

Environment: Red Hat Enterprise Linux, Ubuntu, CentOS, Sun Solaris 8, 9, 10, VERITAS Cluster Server, Veritas Volume Manager, SLURM, Oracle 11g, HP-UX, HP Blade, IBM AIX, HP ProLiant DL 385, 585, WebLogic, Oracle RAC/ASM, MS Windows 2008 Server.

Confidential

Junior Systems Administrator

  • Unix/Linux systems administrator responsible for configuring, monitoring, upgrading, and maintaining systems hardware, software, and related infrastructure.
  • Strong analytical skills; able to work with technicians from various engineering disciplines to troubleshoot complex system-level issues.
  • Experience providing Unix support to maintain systems in world-class production data centers.
