
Hadoop Administrator Resume


Chicago, IL


PROFESSIONAL SUMMARY:

  • Overall 7+ years of experience in system development activities, including requirement analysis, design, implementation, and support, with emphasis on Object-Oriented, SQL, and Hadoop (HDFS, MapReduce, Pig, Hive, HBase, Oozie, Flume, Sqoop, and ZooKeeper) technologies.
  • Experienced in installing, configuring, and administering Hadoop clusters of the major distributions.
  • Excellent experience with job schedulers such as Control-M and Tidal.
  • Hands-on experience with ActiveMQ, SQS, and Kafka messaging queues.
  • Designed and implemented complex HQL data migrations and performance tuning.
  • Working experience with the Hortonworks (HDP) and Cloudera distributions.
  • Excellent experience supporting production clusters and handling critical issues.
  • Coordinated with technical teams to install Hadoop and related third-party applications on systems.
  • Performed upgrades, patches, and bug fixes in HDP and CDH clusters.
  • Experience building operational dashboards from the HDFS FSImage to report existing and forecast future data growth (see the sketch after this list).
  • Built various automation plans from an operations standpoint.
  • Worked with the Tableau team to build dashboards over Hive data.
  • Participated in building Splunk dashboards for reporting access breaches.
  • Hands-on experience with security applications such as Ranger, Knox, and Kerberos.
  • Assisted in the design and architecture of Hadoop systems.
  • Excellent experience running Hadoop operations on the ETL infrastructure with other BI teams such as TD and Tableau.
  • Working experience with Teradata for data validation.
  • Worked on bringing Hadoop into SOX compliance.
  • In-depth knowledge and excellent hands-on experience with the Hadoop ecosystem tools, used to analyze platform issues, investigate job failures, and troubleshoot Hadoop application problems.
  • Very good experience implementing Kerberos.
  • Excellent experience building and supporting large-scale Hadoop environments, including design, configuration, installation, performance tuning, and monitoring.
  • Excellent experience with Pig, Hive, Sqoop, HBase, YARN, Oozie, and MapReduce jobs to support distributed processing of large data sets on the Hadoop cluster.
  • Experience in Agile engineering practices.
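
The FSImage-based dashboard work above can be illustrated with a minimal bash sketch, assuming a daily dump feeds a Hive external table; the paths and drop location are placeholders rather than details from the actual environment:

    # Download the current FSImage from the active NameNode, then convert it
    # to delimited text with the offline image viewer so per-directory sizes
    # and file counts can feed the growth dashboard (paths are illustrative).
    mkdir -p /tmp/fsimage
    hdfs dfsadmin -fetchImage /tmp/fsimage
    hdfs oiv -p Delimited -delimiter ',' -i /tmp/fsimage/fsimage_* -o /tmp/fsimage.csv

    # Stage the dump in a dated HDFS directory for a Hive external table
    # (assumed location) that the dashboard queries for trend lines.
    hdfs dfs -mkdir -p /ops/fsimage/dt=$(date +%Y-%m-%d)
    hdfs dfs -put -f /tmp/fsimage.csv /ops/fsimage/dt=$(date +%Y-%m-%d)/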

TECHNICAL SKILLS:

Big Data: Hadoop (MapReduce, HBase, HDFS, Sqoop, Hive, Oozie, Flume, Pig, ZooKeeper).

Other Big Data tools: Storm, NiFi, Kafka, ActiveMQ, Scala.

Distributions: Cloudera, Hortonworks (HDP).

AWS: SQS, SNS, RedShift, Lambda.

NoSQL Data Stores: HBase.

Scripts: bash, Python.

Query Languages: HiveQL, SQL, Pig Latin, PL/SQL.

Operating Systems: Linux, UNIX, macOS, Windows XP/7/8.

PROFESSIONAL EXPERIENCE:

Confidential, Chicago, IL

Hadoop Administrator

Responsibilities:

  • Responsible for driving and fixing any production severity-one issue from a technical standpoint.
  • Managed ~2,500 Hadoop ETL jobs in production.
  • Managed a production cluster comprising 220 nodes.
  • Involved in deploying a Hadoop cluster using Hortonworks Ambari (HDP 2.2), integrated with SiteScope for monitoring and alerting.
  • Launched and set up Hadoop clusters on AWS as well as physical servers, including configuring the different Hadoop components.
  • Created a local YUM repository for installing and updating packages (see the sketch after this list).
  • Responsible for building a system that ingests terabytes of data per day into Hadoop from a variety of data sources, providing high storage efficiency and a layout optimized for analytics.
  • Developed data pipelines that ingest and process data from multiple data sources.
  • Used Sqoop to connect to Oracle, MySQL, SQL Server, and Teradata and move the pivoted data into Hive or HBase tables (a sample import is sketched after this list).
  • Configured Kerberos for authentication, Knox for perimeter security, and Ranger for granular access in the cluster (a keytab sketch also follows this list).
  • Configured and installed several Hadoop clusters on both physical machines and in the AWS cloud for POCs.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Involved in creating Hive tables and loading and analyzing data using Hive queries.
  • Extensively used Sqoop to move the data from relational databases to HDFS.
  • Used Flume to move the data from web logs onto HDFS.
  • Used Pig to apply transformations, validation, cleaning, and deduplication to data from raw data sources.
  • Used Storm extensively to consume from ActiveMQ and Kafka and push data to HBase and Hive tables.
  • Used NiFi to pull data from different sources and push it to HBase and Hive.
  • Worked on installing Spark and on performance tuning.
  • Integrated schedulers Tidal and Control-M with the Hadoop clusters to schedule the jobs and dependencies on the cluster.
  • Worked closely with the Continuous Integration team to set up tools such as GitHub, Jenkins, and Nexus for scheduling automatic deployments of new or existing code.
  • Actively monitored the 220-node Hadoop cluster running the Hortonworks distribution (HDP 2.4).
  • Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and passwordless SSH login.
  • Performed a minor upgrade from HDP 2.2.2 to HDP 2.2.4.
  • Upgraded the Hadoop cluster from HDP 2.2 to HDP 2.4.
  • Integrated the BI tool Tableau to run visualizations over the data.
  • Worked on installing Hadoop services in the cloud, integrated with AWS.
  • Provided 24x7 on-call support as part of a scheduled rotation with other team members.
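
A minimal sketch of the local YUM repository setup mentioned above; the host name, paths, and repo id are illustrative assumptions:

    # Install the repo tooling and stage the RPMs under a web-served directory
    # (assumes Apache httpd is already serving /var/www/html).
    yum install -y createrepo httpd
    mkdir -p /var/www/html/hdp-repo
    cp /tmp/downloaded-rpms/*.rpm /var/www/html/hdp-repo/
    createrepo /var/www/html/hdp-repo

    # Point the cluster nodes at the local repository (illustrative URL).
    cat > /etc/yum.repos.d/hdp-local.repo <<'EOF'
    [hdp-local]
    name=Local HDP repository
    baseurl=http://repohost.example.com/hdp-repo
    enabled=1
    gpgcheck=0
    EOF
    yum clean all && yum repolist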
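
A minimal sketch of the Sqoop-to-Hive imports described above; the connection string, credentials, and table names are placeholders:

    # Import a pivoted table from MySQL into a Hive table (placeholder
    # connection details); --hive-import creates and loads the Hive table.
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/sales \
      --username etl_user -P \
      --table daily_orders \
      --hive-import \
      --hive-table analytics.daily_orders \
      --num-mappers 4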
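
A small sketch of the kind of Kerberos principal and keytab setup such a configuration involves; the realm, principal, and keytab path are illustrative assumptions:

    # Create a service principal and export its keytab with kadmin
    # (illustrative realm EXAMPLE.COM), then verify with kinit/klist.
    kadmin -p admin/admin -q "addprinc -randkey etl/gateway01.example.com@EXAMPLE.COM"
    kadmin -p admin/admin -q "xst -k /etc/security/keytabs/etl.keytab etl/gateway01.example.com@EXAMPLE.COM"

    kinit -kt /etc/security/keytabs/etl.keytab etl/gateway01.example.com@EXAMPLE.COM
    klist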

Environment: Java, Hadoop (HDFS, MapReduce), Hive, Pig, Oozie, Sqoop, Ambari, NiFi, Storm, AWS (S3, EC2, IAM), ZooKeeper, Splunk, Kafka.

Confidential, Houston, TX

Hadoop Administrator

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, the HBase database, and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented a nine-node CDH4 Hadoop cluster on Ubuntu Linux.
  • Involved in loading data from the Linux file system into HDFS.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration (a decommissioning sketch follows this list).
  • Provided technical assistance for configuration, administration and monitoring of Hadoop clusters.
  • Worked with application teams to install Hadoop updates, patches, version upgrades as required.
  • Involved in loading data from the UNIX file system into HDFS.
  • Provided cluster coordination services through ZooKeeper.
  • Managed and reviewed Hadoop log files.
  • Installed Oozie and built workflows to run multiple Hive, Pig, and MapReduce jobs (a CLI sketch follows this list).
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
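
A minimal sketch of the DataNode decommissioning mentioned above, assuming dfs.hosts.exclude in hdfs-site.xml points at the excludes file shown; the host name is a placeholder:

    # Add the node being retired to the HDFS excludes file, then ask the
    # NameNode to re-read it and start moving the node's blocks elsewhere.
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Watch progress until the node reports "Decommissioned".
    hdfs dfsadmin -report | grep -A 2 datanode07.example.com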
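
A short sketch of driving one of the Oozie workflows above from the command line; the Oozie URL and properties file path are placeholders:

    OOZIE_URL=http://oozie-host.example.com:11000/oozie   # placeholder

    # Submit and start the workflow from its job.properties, capture the
    # returned job id, and poll its status.
    JOB_ID=$(oozie job -oozie "$OOZIE_URL" \
      -config /home/hadoop/workflows/daily_etl/job.properties -run | awk -F': ' '{print $2}')
    oozie job -oozie "$OOZIE_URL" -info "$JOB_ID"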

Environment: Hadoop, HDFS, Pig, Hive, Sqoop, HBase, shell scripting, Ubuntu, Red Hat Linux.

Confidential

Hadoop Admin

Responsibilities:

  • Gathered the business requirements from the Business Partners and Subject Matter Experts.
  • Involved in installing CDH3 Hadoop ecosystem components.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Assisted in the design, development, and architecture of Hadoop systems.
  • Used bash scripting tools such as grep, awk, and sed (see the log-parsing sketch after this list).
  • Coordinated with technical teams for installation of Hadoop and third party related applications on systems.
  • Secured the cluster with Kerberos.
  • Wrote bash scripts to automate Sqoop jobs (a sample wrapper is sketched after this list).
  • Wrote Python scripts to load data in various file formats while implementing a POC cluster.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Used Sqoop to load data from MySQL into HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Used Agile Scrum methodology to help manage and organize a team of four developers, with regular code review sessions.
  • Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
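
An illustrative example of the grep/awk/sed one-liners used when reviewing Hadoop logs; the log path is a placeholder:

    # Pull today's ERROR lines out of the NameNode log (placeholder path),
    # collapse block IDs with sed so repeated messages group together, and
    # count the most frequent ones with awk/sort/uniq.
    grep "$(date +%Y-%m-%d)" /var/log/hadoop/hdfs/hadoop-hdfs-namenode.log \
      | grep ' ERROR ' \
      | sed -E 's/blk_[0-9_]+/blk_<ID>/g' \
      | awk -F' ERROR ' '{print $2}' \
      | sort | uniq -c | sort -rn | head -20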
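
A minimal sketch of a bash wrapper automating a nightly MySQL-to-HDFS Sqoop load of the kind described above; the connection details, paths, and schedule are assumptions:

    #!/bin/bash
    # nightly_sqoop_import.sh -- illustrative wrapper around a Sqoop import
    # (placeholder database, table, and target directory), with simple
    # logging and an exit code the scheduler can alert on.
    set -euo pipefail

    LOG=/var/log/etl/sqoop_orders_$(date +%Y%m%d).log
    TARGET=/data/raw/orders/dt=$(date +%Y-%m-%d)

    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/shop \
      --username etl_user --password-file /user/etl/.mysql.pwd \
      --table orders \
      --target-dir "$TARGET" \
      --num-mappers 4 >>"$LOG" 2>&1

    echo "$(date) sqoop import finished, target=$TARGET" >>"$LOG"

    # Example crontab entry to run it every night at 1:30 AM:
    # 30 1 * * * /opt/etl/bin/nightly_sqoop_import.sh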

Environment: Hadoop, MapReduce, HDFS, Hive, Python, Pig, Linux, XML, MySQL.

Confidential

Software Programmer

Responsibilities:

  • Used the Hibernate ORM tool as the persistence layer, using the database and configuration data to provide persistence services (and persistent objects) to the application.
  • Responsible for developing the DAO layer using Spring MVC and configuration XMLs for Hibernate, and for managing CRUD operations (insert, update, and delete).
  • Implemented dependency injection using the Spring framework.
  • Developed and implemented the DAO and service classes.
  • Developed reusable services using BPEL to transfer data.
  • Participated in analysis, interface design, and development of JSPs.
  • Configured Log4j to enable/disable logging in the application.
  • Developed a rich user interface using HTML, JSP, AJAX, JavaScript, jQuery, and CSS.
  • Implemented PL/SQL queries and procedures to perform database operations.
  • Wrote UNIX shell scripts and used the UNIX environment to deploy the EAR and read the logs.
  • Implemented Log4j for logging purposes in the application.
  • Involved in code deployment activities for different environments.
  • Implemented agile development methodology.

Environment: Java, Jest, Struts, Spring, Hibernate, Web services (JAX-WS), JMS, WebLogic 10.1 Server, JDeveloper, SQL Developer, HTML, LDAP, XML, CSS, JavaScript, JSON, SQL, PL/SQL, Oracle, and UNIX/Linux.
