We provide IT Staff Augmentation Services!

Hadoop Admin Resume

3.00 Rating

VA

SUMMARY:

  • Around 6+ years of experience including 3years in Hadoop and related technologies.
  • Worked on installation configuration and maintenance of 100+ node Hadoop cluster.
  • Involved in collecting requirements from business users, designing and implementing data pipelines and ETL workflows end to end.
  • Experience in performing various major and minor Hadoop upgraded on large environments.
  • Experience wif Securing Hadoop clusters using Kerberos.
  • Experience in using Cloudera Manager for Installation and management of Hadoop clusters.
  • Monitoring and support through Nagios and Ganglia
  • Benchmarking Hadoop clusters to validate teh hardware before and after installation to tweak teh configurations to obtain better performance.
  • Experience in performing POCs to test teh usability of new tools on top of Hadoop.
  • Experience in working large environments and leading teh infrastructure support and operations.
  • Migrating applications from existing systems like MYSQL, oracle, db2 and Teradata to Hadoop.
  • Expertise wif Hadoop, Map reduces, Pig, SQOOP, OOZIE, and Hive.
  • Developed and automated Hive queries on daily basis.
  • Extensive noledge on Migration of applications from existing sources.
  • Experience in driving OS upgrades on large Hadoop clusters wifout down times.
  • Expertise in Collaborating across Multiple technology groups and getting things done.

PROFESSIONAL EXPERIENCE:

Confidential, VA

Hadoop Admin

Responsibilities:

  • Gatheird teh business requirements from teh Business Partners and Subject Matter Experts.
  • Installed and configured Apache Hadoop, Hive and Pig environment on teh prototype server.
  • Configured MYSQL Database to store Hive metadata.
  • Responsible for loading unstructured data into Hadoop File System (HDFS).
  • Maintained documentation for corporate Data Dictionary wif attributes, table names and constraints.
  • Involved in running Hadoop streaming jobs to process terabytes of xml format data.
  • Involved in managing and reviewing Hadoop log files.
  • Importing and exporting data into HDFS and Hive using SQOOP.
  • Supported Map Reduce Programs those are running on teh cluster.
  • Imported data using SQOOP to load data from MySQL to HDFS on regular basis.
  • Developed Scripts and Batch Job to schedule various Hadoop Program.
  • Created jobs to load data from MongoDB into Data warehouse.
  • Wrote Hive queries for data analysis to meet teh business requirements.
  • Extensively worked wif SQL scripts to validate teh Pre, and Post data load
  • Wrote Java MapReduce jobs to process teh tagging functionality for each chapter, sections and sub sections on teh data stored in HDFS.

Confidential, Plymouth, MN

Hadoop Administrator

Responsibilities:

  • Responsible for Building and managing Hadoop clusters. responsible on working Cloudera manager
  • Involved in source system analysis, data analysis, data modeling to ETL (Extract, Transform and Load) and Hive QL.
  • Strong Experience in Installation and configuration of Hadoop ecosystem like Yarn, HBase, Flume, Pig, SQOOP.
  • Expertise in Hadoop cluster task like Adding and Removing Nodes wifout any TEMPeffect to running jobs and data.
  • Load log data into HDFS using Flume. Worked extensively in creating MapReduce jobs to power data for search and aggregation.
  • Worked extensively wif SQOOP for importing metadata from Oracle.
  • Designed a data warehouse using Hive.
  • Created partitioned tables in Hive.
  • Mentored analyst and test team for writing Hive Queries.
  • Extensively used Pig for data cleansing.
  • Developed Pig Latin scripts to extract teh data from teh web server output files to load into HDFS.
  • Developed teh Pig UDF’S to pre - process teh data for analysis.

Confidential, Memphis, TN

JR. Hadoop Admin

Responsibilities:

  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and SQOOP.
  • Extensively involved in Installation and configuration of Cloudera distribution Hadoop 2, 3, Name Node, Secondary Name Node, Job Tracker, Task Trackers and Data Nodes.
  • Wrote teh shell scripts to monitor teh health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Installed and configured Hadoop, MapReduce, HDFS (Hadoop Distributed File System), developed multiple MapReduce jobs for data cleaning.
  • Involved in clustering of Hadoop in teh network of 70 nodes.
  • Experienced in loading data from UNIX local file system to HDFS.
  • Developed data pipeline using Flume, SQOOP, Pig and Java map reduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Involved in developing new work flow Map Reduce jobs using OOZIE framework.
  • Collected teh logs data from web servers and integrated in to HDFS using Flume.
  • Worked on installing cluster, commissioning & decommissioning of Data Nodes, Name Node recovery, capacity planning, and slots configuration.
  • Developed PIG Latin scripts to extract teh data from teh web server output files to load into HDFS.
  • Used Pig as ETL tool to do transformations, event joins and some pre-aggregations before storing teh data onto HDFS.
  • Involved in teh installation of CDH3 and up-gradation from CDH3 to CDH4.
  • Responsible for developing data pipeline using HDInsight, flume, SQOOP and pig to extract teh data from weblogs and store in HDFS.
  • Installed OOZIE workflow engine to run multiple Hive and Pig Jobs
  • Use of SQOOP to import and export data from HDFS to RDBMS and vice-versa.
  • Used Hive and created Hive external/internal tables and involved in data loading and writing Hive UDFs.
  • Exported teh analyzed data to relational databases using SQOOP for visualization and to generate reports.
  • Involved in migration of ETL processes from Oracle to Hive to test teh easy data manipulation.
  • Used Hive to analyze teh partitioned and bucketed data and compute various metrics for reporting.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using SQOOP.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra. Kafka
  • Created Hive External tables and loaded teh data in to tables and query data using HQL.
  • Created Hive queries to compare teh raw data wif EDW tables and performing aggregates.
  • Wrote shell scripts for rolling day-to-day processes and it is automated.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop.

Confidential,

Linux Admin

Responsibilities:

  • Configuring and tuning system and network parameters for optimum performance.
  • Gained noledge on troubleshooting and problem Solve skills, including application and network-level troubleshooting ability.
  • Gained noledge and experience on writing shell scripts to automate teh tasks.
  • Identifying and triaging outages monitor and remediate systems and network performance.
  • Developing tools to automate teh deployment, administration, and monitoring of a large-scale Linux environment.
  • Performing server tuning, operating system upgrades.
  • Participating in teh planning phase for system requirements on various projects for deployment of business functions.
  • Participating in 24x7 on-call rotation and maintenance windows.
  • Communication & coordination wif internal / external groups and operations.

TECHNICAL SKILLS:

Hadoop eco system components: Hadoop, Map reduce, yarn, hive, pig, SQOOP, flume, impala, OOZIE.

Tools: SVN, Tableau, Micro strategy integrations wif hive.

Programming Languages: Unix Shell scripting, JAVA, SQL.

Monitoring and Alerting: Nagios, Ganglia.

Operating Systems: Linux Centos 5,6, Redhat7.

We'd love your feedback!