Hadoop Admin Resume

Oak Brook, IL

SUMMARY:

  • Around 8 years of total Information Technology experience, with expertise in administration and operations of Big Data and Cloud Computing technologies.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as HDFS, MapReduce, YARN, ZooKeeper, Sqoop, NiFi, Flume, Hive, HBase, Spark, and Oozie.
  • Experience in deploying Hadoop clusters using Hortonworks, integrated with Ambari for monitoring and alerting.
  • Experience using the Hortonworks platform and its ecosystem, with hands-on installation and configuration of components such as MapReduce, HDFS, Hive, and Flume.
  • Installed, configured, and administered Hadoop clusters on major distributions such as Hortonworks Data Platform (HDP 1 and HDP 2) and Cloudera Enterprise (CDH3 and CDH4).
  • Experience in tuning configuration files such as core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml based on job requirements.
  • Experience with Apache Phoenix, which enables OLTP and operational analytics on Hadoop.
  • Experience in installing and configuring ZooKeeper to coordinate the Hadoop daemons.
  • Working knowledge of importing and exporting data into HDFS using Sqoop (see the sketch after this list).
  • Strong experience in developing, debugging, and tuning MapReduce jobs in a Hadoop environment.
  • Experience in defining batch job flows with Oozie.
  • Experience in loading log data directly into HDFS using Flume.
  • Experienced in managing and reviewing Hadoop log files to troubleshoot issues.
  • Experience in following standard backup measures to ensure high availability of the cluster.
  • Experience in implementing rack awareness for data locality optimization.
  • Experience in scheduling volume snapshots for backup, performing root cause analysis of failures, documenting bugs and fixes, and scheduling downtimes and cluster maintenance.
  • Good experience in Hive and Phoenix data modeling and queries.
  • Experience in database imports and working with imported data to populate tables in Hive.
  • Hands-on experience in the data mining process, implementing complex business logic, optimizing queries using HiveQL, and controlling data distribution with partitioning and bucketing techniques to enhance performance.
  • Experience working with Hive data and extending the Hive library with custom UDFs to query data in non-standard formats.
  • Exposure to exporting data from relational databases to the Hadoop Distributed File System.
  • Experience in cluster maintenance, including commissioning and decommissioning data nodes.
  • Experience in monitoring systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Experience in monitoring multiple Hadoop cluster environments using Cloudera Manager and Ambari, as well as workload, job performance, and capacity planning.
  • Experience in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
  • Hands-on experience in Linux administration activities on CentOS.
  • Knowledge of cloud technologies such as AWS.
  • Experience in benchmarking, backup, and disaster recovery of NameNode metadata.
  • Experience in performing minor and major upgrades of Hadoop clusters.
  • Performed hands-on administration, monitoring, and troubleshooting of all company networks and programs, resulting in optimum performance and minimum downtime.
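
A minimal sketch of the kind of Sqoop import referenced above, assuming a hypothetical MySQL source; the connection string, credentials file, table, and HDFS paths are placeholders, not details from this resume:

    #!/usr/bin/env bash
    # Hypothetical Sqoop import: pull a table from MySQL into HDFS,
    # then expose the files to an existing Hive staging table.
    sqoop import \
      --connect jdbc:mysql://db.example.com:3306/sales \
      --username etl_user \
      --password-file /user/etl/.db_password \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4 \
      --fields-terminated-by '\t'

    # Optional follow-up: load the imported files into a Hive table.
    hive -e "LOAD DATA INPATH '/data/raw/orders' INTO TABLE staging.orders;"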

PROFESSIONAL EXPERIENCE:

Hadoop Admin

Confidential - Oak Brook, IL

Responsibilities:

  • Responsible for installing and configuring HDP and HDF clusters on-prem.
  • Responsible for maintenance, monitoring, commissioning and decommissioning data nodes, troubleshooting, cluster planning, and managing and reviewing data backups and log files.
  • Responsible for installing six Hadoop clusters (four HDF 3.0 and two HDP 2.6.1) used for collecting and analyzing logs as part of the SDL architecture.
  • Developed a data lake that serves as a base layer for storage and analytics; provided services to developers, installed their custom software, upgraded Hadoop components, resolved their issues, and helped them troubleshoot long-running jobs.
  • Worked on a live Big Data Hadoop production environment with 220 nodes.
  • Involved in implementing security on HDP and HDF Hadoop clusters with Kerberos for authentication, Ranger for authorization, and LDAP integration for Ambari, Ranger, and NiFi.
  • Responsible for upgrading Hortonworks Hadoop HDP 2.6.0 to 2.6.1 in a multi-node environment.
  • Configured Hadoop security using AD Kerberos (see the sketch below).
  • Configured and automated SSL/TLS for Ambari, HDFS, YARN, Hive, HBase, Ranger, Ambari Metrics, Oozie, Knox, Spark, NiFi, Kafka, and Solr.
  • Worked on configuring F5 load balancers for Ranger, NiFi, and Oozie.
  • Configured ZooKeeper to implement node coordination and clustering support.
  • Configured Kafka for efficiently collecting, aggregating, and moving large amounts of clickstream data from many different sources to HDFS.
  • As an administrator, followed standard backup policies to ensure high availability of the cluster.
  • Monitored multiple Hadoop cluster environments, workload, job performance, and capacity planning using Ambari; installed and configured Hortonworks.
  • Involved in running Hadoop jobs for processing millions of records of text data; troubleshot issues during the Jenkins build process; implemented Docker to create containers for Tomcat servers and Jenkins.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Used ServiceNow and JIRA to track issues; managed and reviewed log files as part of administration for troubleshooting, meeting SLAs on time.

Environment: Hadoop, HDFS, Hive, MapReduce, Sqoop, HBase, NiFi, Kafka, Kerberos, Ranger, Atlas, Knox, Solr, Shell Scripting, Python, PySpark, Red Hat Linux 7.
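
A minimal sketch, under stated assumptions, of the kind of smoke test run after enabling AD Kerberos and TLS on a cluster like this one; the principal, keytab path, hostnames, and ports are placeholders:

    #!/usr/bin/env bash
    # Hypothetical post-kerberization check: confirm ticket acquisition,
    # Kerberos-authenticated HDFS access, and TLS on the web endpoints.

    # 1. Obtain and show a ticket for the HDFS headless principal.
    kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM
    klist

    # 2. Verify Kerberos-authenticated HDFS access works.
    hdfs dfs -ls /

    # 3. Verify Ambari and the NameNode UI answer over HTTPS.
    curl -sk https://ambari.example.com:8443/api/v1/clusters -o /dev/null -w "Ambari: %{http_code}\n"
    curl -sk https://namenode.example.com:50470/ -o /dev/null -w "NameNode UI: %{http_code}\n"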

Hadoop Admin

Confidential - Dallas, TX.

Responsibilities:

  • Managed GCP instances as per requirements, creating cluster networks and load balancing.
  • Created virtual network platforms and firewall policies as per project requirements.
  • Configured Stackdriver to monitor all projects and resources to ensure 99.99% uptime in GCP.
  • Created a Looker cluster in GCP and monitored it using Stackdriver.
  • Configured Looker and created the required users in the Looker console.
  • Monitored and worked with the application team on GitLab issues on Google Cloud Platform.
  • Installed and configured Hadoop (Cloudera and Hortonworks).
  • Monitored clusters via Ambari, Cloudera Manager, and Nagios.
  • Installed and configured Apache Solr.
  • Troubleshot Solr issues and monitored them using Cloudera Manager.
  • Created work instructions (steps and methods to follow) for upgrading and patching the Hadoop cluster.
  • Created and implemented the decommissioning process for the Hadoop cluster in the data center.
  • Migrated large data sets using the DistCp (distributed copy) utility (see the sketch below).
  • Monitored Hadoop clusters using tools such as Nagios, Ganglia, and Cloudera Manager.
  • Developed shell scripts for reporting data usage on the cluster and for removing packages from the cluster.
  • Administered and optimized the Hadoop clusters, monitored MapReduce jobs, and worked with the development team to fix issues.
  • Evaluated big data tools and NoSQL databases.
  • Created cloud computing documentation for customers.
  • Managed Kafka issues.
  • Created and deleted users for Hadoop and Hue on request.
  • Worked on job performance issues and fixed them for recurring jobs.
  • Performed capacity planning for new and ongoing cluster implementations after studying current and expected data growth on a quarterly, half-yearly, and yearly basis.
  • Worked with developer and testing teams to resolve their problems.

Environment: AWS (EC2, S3), HDFS, Hortonworks, Hive, Pig, HBase, Sqoop, Python scripting, UNIX shell scripting, Nagios, Jenkins, Git, Splunk, Chef.
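
A minimal sketch of an inter-cluster copy with DistCp, as mentioned above; the NameNode hosts, ports, mapper count, and paths are placeholders, not details from this engagement:

    #!/usr/bin/env bash
    # Hypothetical DistCp migration: copy a dataset from the source cluster
    # to the target cluster, then compare sizes as a quick sanity check.
    hadoop distcp \
      -update -skipcrccheck \
      -m 20 \
      hdfs://nn-old.example.com:8020/data/warehouse/orders \
      hdfs://nn-new.example.com:8020/data/warehouse/orders

    hdfs dfs -du -s -h hdfs://nn-old.example.com:8020/data/warehouse/orders
    hdfs dfs -du -s -h hdfs://nn-new.example.com:8020/data/warehouse/orders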

Hadoop Admin

Confidential - Dublin, OH.

Responsibilities:

  • Installed, upgraded, and maintained Hadoop clusters using Hortonworks.
  • Deployed a 150-node Hortonworks Hadoop cluster (HDP 2.1) using Ambari Server 1.6.
  • Performed major and minor upgrades in large environments.
  • Hands-on experience with Apache and Hortonworks Hadoop ecosystem components such as Sqoop, HBase, and MapReduce.
  • Good understanding of and experience with the Hadoop stack: internals, Hive, Pig, and MapReduce.
  • Designed and configured the cluster with the required services (Sentry, HiveServer2, Kerberos, HDFS, Hue, Spark, Hive, HBase, ZooKeeper).
  • Involved in monitoring and support through Nagios and Ganglia.
  • Managed cluster configuration to meet the needs of analysis, whether I/O-bound or CPU-bound.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive, and Sqoop.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Configured Flume to transfer data from the web servers to HDFS.
  • Performed benchmark tests on Hadoop clusters and tuned the solution based on the results (see the sketch below).
  • Supported users in running and debugging Pig and Hive queries.
  • Responsible for troubleshooting MapReduce job execution issues by inspecting and reviewing log files.
  • Designed and maintained the NameNode and DataNodes with appropriate processing capacity and disk space.
  • Performed data scrubbing and processing with Oozie.
  • Used Tableau to visualize the analyzed data.
  • Installed and configured Kerberos for Hadoop and all of its ecosystem tools for security.
  • Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
  • Responsible for managing data coming from different sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Monitored system health and responded to any warning or failure conditions.
  • Wrote automation scripts for loading data into the cluster and for deploying and installing services.
  • Created, executed, and debugged SQL queries to perform data completeness, correctness, transformation, and quality testing.
  • Configured Sentry for appropriate user permissions when accessing HiveServer2/Beeline.
  • Performed cluster maintenance as well as creation and removal of nodes using Hortonworks.
  • Created a data migration plan from one cluster to another using BDR.
  • Monitored the cluster on a daily basis, checked error logs, and debugged issues.
  • Provided support for users and helped to resolve any job failures.

Environment: CentOS, Hortonworks, Flume, HBase, HDFS, MapReduce, Hive, Oozie, ZooKeeper, Sqoop.
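
A minimal sketch of the kind of benchmark pass referenced above, assuming the stock Hadoop example and test jars; the jar paths, data sizes, and HDFS directories are placeholders and vary by distribution:

    #!/usr/bin/env bash
    # Hypothetical benchmark run: TeraGen/TeraSort for MapReduce throughput
    # and TestDFSIO for raw HDFS I/O. Adjust jar paths for the HDP version.
    EXAMPLES_JAR=/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar
    TEST_JAR=/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar

    # Generate roughly 10 GB of rows, then sort them.
    hadoop jar "$EXAMPLES_JAR" teragen 100000000 /benchmarks/teragen
    hadoop jar "$EXAMPLES_JAR" terasort /benchmarks/teragen /benchmarks/terasort

    # HDFS write/read throughput with 10 files of 1 GB each.
    hadoop jar "$TEST_JAR" TestDFSIO -write -nrFiles 10 -size 1GB
    hadoop jar "$TEST_JAR" TestDFSIO -read -nrFiles 10 -size 1GB
    hadoop jar "$TEST_JAR" TestDFSIO -clean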

Hadoop Admin

Confidential - Pleasanton, CA

Responsibilities:

  • Worked on Hadoop clusters ranging from 30 nodes in development to 40 nodes in pre-production and 140 nodes in production.
  • Worked with NiFi to manage the flow of data from source to HDFS.
  • Used Apache NiFi to copy data from the local file system to HDP.
  • Responsible for managing data coming from different sources and importing structured and unstructured data.
  • Handled the installation and configuration of a Hadoop cluster.
  • Built and maintained scalable data pipelines using the Hadoop ecosystem and other open-source components such as Hive and HBase.
  • Handled data exchange between HDFS and different web sources using Flume and Sqoop.
  • Monitored data streaming between web sources and HDFS.
  • Monitored Hadoop cluster health through monitoring tools.
  • Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Changed cluster configuration properties based on the volume of data being processed and cluster performance.
  • Set up automated processes to analyze system and Hadoop log files for predefined errors and send alerts to the appropriate groups.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Commissioned or decommissioned data nodes from the cluster in case of problems (see the sketch below).
  • Set up automated processes to archive and clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
  • Set up and managed NameNode HA and NameNode federation using Apache Hadoop 2.0 to avoid single points of failure in large clusters.

Environment: Hadoop 1.0.0 and Hadoop 2.0.0, HDFS, MapReduce, Cloudera, Sqoop, Hive, Pig, HBase, Java, Flume 1.2.0, Eclipse IDE, CDH3.
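
A minimal sketch of decommissioning a DataNode as described above, assuming the cluster's dfs.hosts.exclude points at an exclude file; the hostname and file path are placeholders:

    #!/usr/bin/env bash
    # Hypothetical DataNode decommission: add the host to the HDFS exclude
    # file, then ask the NameNode to re-read its node lists.
    echo "worker42.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Blocks are re-replicated off the excluded host before it reports
    # as decommissioned.
    hdfs dfsadmin -refreshNodes

    # Watch progress until the node shows "Decommission Status : Decommissioned".
    hdfs dfsadmin -report | grep -A 3 "worker42.example.com"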

Linux Admin

Confidential

Responsibilities:

  • Supported Solaris/Linux servers in production, QA, and development environments, including Solaris Zones and RHEL VMs.
  • Installed ESXi 4.1 Hypervisor on HP Servers.
  • Installed, configured, and maintained Apache, Samba, WebSphere, and WebLogic application servers.
  • Worked on VMware, VMware View, and vSphere 4.0.
  • Installed systems using JumpStart for Sun servers and Kickstart for RHEL on HP hardware.
  • Configured, supported, and performed routine maintenance of hardware and software for Linux and Solaris servers.
  • Launched Amazon EC2 cloud instances using Amazon Machine Images (Linux/Ubuntu) and configured the launched instances for specific applications.
  • Supported 200+ AWS cloud instances running Ubuntu, Red Hat, and Windows environments.
  • Automated various administrative tasks on multiple servers using Puppet.
  • Deployed Puppet, Puppet Dashboard, and Puppet DB for configuration management to existing infrastructure.
  • Proficient in installation, configuration, and maintenance of applications such as Apache, LDAP, and PHP.
  • Involved in installing and managing automation and monitoring tools on Red Hat Linux, such as Nagios, Splunk, and Puppet.
  • Resolved configuration issues and problems related to the OS, NFS mounts, LDAP user IDs, and DNS.
  • Regularly applied patches to Red Hat Linux, Sun, and HP systems (see the sketch after this list).
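
A minimal sketch of a routine RHEL patching pass of the sort mentioned above; the security-only policy, logging path, and reboot handling are assumptions, not details from this role:

    #!/usr/bin/env bash
    # Hypothetical RHEL patching routine: check for updates, apply security
    # errata, and record the run. Reboot handling is site-specific.
    set -euo pipefail

    yum check-update || true          # exit code 100 simply means updates exist
    yum -y update --security          # apply security errata only
    needs-restarting -r || echo "Reboot required on $(hostname)"

    # Log the patch run for audit purposes.
    echo "$(date -Is) patched $(hostname)" >> /var/log/patch-history.log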

Software Developer

Confidential

Responsibilities:

  • Designed a system and developed a framework using J2EE technologies based on MVC architecture.
  • Involved in the iterative/incremental development of project application. Participated in the requirement analysis and design meetings.
  • Designed and developed UIs using JSP, following MVC architecture.
  • Designed and developed Presentation Tier using Struts framework, JSP, Servlets, TagLibs, HTML and JavaScript.
  • Designed the control flow, including class diagrams and sequence diagrams, using Visio.
  • Programmed the views using JSP pages with the Struts tag library; the model is a combination of EJBs and Java classes, and the web controllers are Servlets.
  • Generated XML pages with templates using XSL; used JSP, Servlets, and EJBs on the server side.
  • Developed and maintained a complete external build process using Ant (see the sketch below).
  • Implemented Home Interface, Remote Interface, and Bean Implementation class.
  • Extensive usage of XML for application configuration, navigation, and task-based configuration.
  • Used EJB features effectively: local interfaces to improve performance, abstract persistence schema, and CMRs.
  • Used Struts web application framework implementation to build the presentation tier.
  • Wrote PL/SQL queries to access data from Oracle database.
  • Set up WebSphere Application Server and used Ant to build the application and deploy it on WebSphere.
  • Implemented JMS for making asynchronous requests.

Environment: Java, J2EE, Struts, Hibernate, JSP, Servlets, HTML, CSS, UML, jQuery, Log4j, XML Schema, JUnit, Tomcat, JavaScript, Oracle 9i, UNIX, Eclipse IDE.
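
A minimal sketch of invoking the kind of Ant build referenced above; the target names and output path are assumptions about a typical build.xml, not taken from this project:

    #!/usr/bin/env bash
    # Hypothetical Ant build invocation for the J2EE application.
    # Targets (clean, compile, war) and dist/ are placeholders.
    ant -f build.xml clean compile war

    # The resulting archive would then be deployed to WebSphere through
    # the admin console or the server's own scripting tools.
    ls -lh dist/*.war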
