
Hadoop Developer/Admin Resume


Newark, CA

SUMMARY

  • Over 5 years of experience in Hadoop and related technologies.
  • Hands-on experience with installation, configuration, and maintenance of multi-node Hadoop clusters.
  • Experience performing major and minor Hadoop upgrades on large environments.
  • Experience securing Hadoop clusters using Kerberos.
  • Experience using Cloudera Manager for installation and management of Hadoop clusters.
  • Monitoring and support through Nagios and Ganglia.
  • Benchmarked Hadoop clusters to validate hardware before and after installation and tuned configurations for better performance.
  • Experience performing POCs to evaluate new tools on top of Hadoop.
  • Experience working in large environments and leading infrastructure support and operations.
  • Migrated applications from existing systems such as MySQL, Oracle, DB2, and Teradata to Hadoop.
  • Expertise with Hadoop, MapReduce, Pig, Sqoop, Oozie, and Hive.
  • Developed and automated Hive queries on a daily basis.
  • Extensive knowledge of migrating applications from existing sources.
  • Experience driving OS upgrades on large Hadoop clusters without downtime.
  • Expertise in collaborating across multiple technology groups to get things done.

TECHNICAL SKILLS

Hadoop ecosystem components: Hadoop, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Kafka, Impala, Oozie.

Tools: Tableau and MicroStrategy integrations with Hive.

Programming Languages: Unix Shell scripting, Python, SQL

Monitoring and Alerting: Nagios, Ganglia.

Operating Systems: Linux (CentOS 5/6, Red Hat 6).

PROFESSIONAL EXPERIENCE

Confidential, Newark, CA

Hadoop Developer/Admin

Responsibilities:

  • Worked on a live 40-node Hadoop cluster running CDH 5.4.4, CDH 5.2.0, and CDH 5.2.1.
  • Developed scripts for automation and monitoring purposes (see the monitoring sketch after this list).
  • Participated in designing, building, and maintaining a build automation system for our software products.
  • Provided build configuration management for software products.
  • Performed Sqoop transfers through Cassandra tables to move data to several NoSQL databases for processing.
  • Implemented authentication using Kerberos and authorization using Apache Sentry (a Kerberos keytab sketch follows this list).
  • Designed and performed the upgrade from CDH 4 to CDH 5.
  • Examined job failures and performed troubleshooting.
  • Performance tuning of the Kafka cluster (failure and success metrics).
  • Monitored and managed the Hadoop cluster using Cloudera Manager.
  • Automated installation of Hadoop on AWS using Cloudera Director.
  • Performance tuning of HDFS and YARN.
  • Automation using AWS CloudFormation.
  • Managed users and groups using IAM (an AWS CLI sketch follows this list).
  • Hands-on experience creating Hadoop environments on Google Compute Engine (GCE).
  • Installed a Hadoop cluster on GCE. Worked on a POC recommendation system for social media using the MovieLens dataset.
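As an illustration of the monitoring scripts mentioned above, here is a minimal sketch of an HDFS capacity check written in the Nagios plugin convention (exit 0 = OK, 1 = WARNING, 2 = CRITICAL). The thresholds and the reliance on the "DFS Used%" line of hdfs dfsadmin -report are assumptions for the sketch, not details from the actual project:

    #!/usr/bin/env bash
    # Nagios-style HDFS usage check; run on (or against) the NameNode.
    # Thresholds are illustrative assumptions.

    WARN_PCT=75    # warn when HDFS usage exceeds this percentage
    CRIT_PCT=90    # critical when HDFS usage exceeds this percentage

    # "hdfs dfsadmin -report" prints a cluster-wide "DFS Used%" line.
    used_pct=$(hdfs dfsadmin -report 2>/dev/null \
      | awk '/DFS Used%/ {gsub(/%/, "", $3); print int($3); exit}')

    if [ -z "$used_pct" ]; then
      echo "UNKNOWN: could not read DFS usage (insufficient privileges?)"
      exit 3
    elif [ "$used_pct" -ge "$CRIT_PCT" ]; then
      echo "CRITICAL: HDFS ${used_pct}% used"
      exit 2
    elif [ "$used_pct" -ge "$WARN_PCT" ]; then
      echo "WARNING: HDFS ${used_pct}% used"
      exit 1
    else
      echo "OK: HDFS ${used_pct}% used"
      exit 0
    fi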
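On a Kerberized cluster, service and batch jobs authenticate non-interactively from a keytab before touching HDFS. A minimal sketch; the keytab path, principal name, and realm are illustrative assumptions:

    #!/usr/bin/env bash
    # Authenticate from a keytab on a Kerberos-secured cluster.
    # Path, principal, and realm below are hypothetical.

    KEYTAB=/etc/security/keytabs/hdfs.service.keytab
    PRINCIPAL="hdfs/$(hostname -f)@EXAMPLE.COM"

    # Obtain a ticket non-interactively from the keytab.
    kinit -kt "$KEYTAB" "$PRINCIPAL"

    # Verify the ticket cache before running HDFS commands.
    klist

    hdfs dfs -ls /    # this call is now authenticated via Kerberos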
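The IAM user/group management mentioned above can be scripted with the AWS CLI. A minimal sketch, assuming a hypothetical hadoop-ops group and user jdoe; the managed policy shown is just one example of a grant:

    #!/usr/bin/env bash
    # Create an IAM group, attach a managed policy, and add a user.
    # Group, user, and policy choices are illustrative assumptions.

    aws iam create-group --group-name hadoop-ops

    # Grant the group read access to S3 data via an AWS managed policy.
    aws iam attach-group-policy \
      --group-name hadoop-ops \
      --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

    aws iam create-user --user-name jdoe
    aws iam add-user-to-group --user-name jdoe --group-name hadoop-ops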

Confidential

Hadoop Administrator

Responsibilities:

  • Responsible for architecting Hadoop cluster.
  • Involved in source system analysis, data analysis, data modeling to ETL (Extract, Transform and Load) and HiveQL.
  • Strong Experience in Installation and configuration of Hadoop ecosystem like Yarn, HBase, Flume, Hive, Pig, Sqoop.
  • Expertise in Hadoop cluster task like Adding and Removing Nodes without any effect to running jobs and data.
  • Manage and review Hadoop Log files.
  • Load log data into HDFS using Flume. Worked extensively in creating MapReduce jobs to power data for search and aggregation.
  • Worked extensively with Sqoop for importing data.
  • Designed a data warehouse using Hive.
  • Created partitioned tables in Hive.
  • Mentored analyst and test team for writing Hive Queries.
  • Extensively used Pig for data cleansing.
  • Scheduled Oozie workflow engine to run multiple Hive and Pig jobs, which independently run with time and data availability.
  • Developed Oozie Workflows for daily incremental loads, which gets data from Teradata and then imported into hive tables.
  • Developed pig scripts to transform the data into structured format and it are automated through Oozie coordinators.
  • Worked on pulling the data from relational databases, Hive into the Hadoop cluster using the Sqoop import for visualization and analysis.
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.
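A minimal sketch of the kind of daily incremental Sqoop import described above, pulling from Teradata into a staging directory on HDFS. The host, schema, table, column names, and paths are illustrative assumptions, and the Teradata JDBC driver jar is assumed to be on Sqoop's classpath:

    #!/usr/bin/env bash
    # Daily incremental import from Teradata into HDFS via Sqoop.
    # All names and paths below are hypothetical.

    LAST_VALUE=$(cat /var/lib/etl/orders.last_value)   # high-water mark from the previous run

    sqoop import \
      --connect jdbc:teradata://td-prod/DATABASE=sales \
      --driver com.teradata.jdbc.TeraDriver \
      --username etl_user \
      --password-file /user/etl/.td_password \
      --table ORDERS \
      --incremental append \
      --check-column ORDER_ID \
      --last-value "$LAST_VALUE" \
      --target-dir /data/staging/orders \
      --num-mappers 4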
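The partitioned Hive tables mentioned above follow a standard pattern: partition the table by a date column so daily loads only touch one partition. A minimal sketch driven from the shell; the database, table, columns, and HDFS location are illustrative assumptions:

    #!/usr/bin/env bash
    # Create a date-partitioned Hive table and register one day of data.
    # Names and locations are hypothetical.

    hive -e "
    CREATE TABLE IF NOT EXISTS warehouse.web_events (
      user_id   BIGINT,
      url       STRING,
      referrer  STRING
    )
    PARTITIONED BY (event_date STRING)
    STORED AS ORC;
    "

    # Register a partition for data an upstream job already wrote to HDFS.
    hive -e "
    ALTER TABLE warehouse.web_events
    ADD IF NOT EXISTS PARTITION (event_date='2015-06-01')
    LOCATION '/data/web_events/2015-06-01';
    "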
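Submitting the Oozie coordinators mentioned above is done through the Oozie CLI. A minimal sketch, assuming a hypothetical coordinator application already deployed at the HDFS path shown; the Oozie server URL, properties, and schedule window are illustrative:

    #!/usr/bin/env bash
    # Submit and monitor a daily-load Oozie coordinator.
    # URL, paths, and dates below are hypothetical.

    cat > coord-job.properties <<'EOF'
    nameNode=hdfs://namenode:8020
    jobTracker=resourcemanager:8032
    oozie.coord.application.path=${nameNode}/user/etl/apps/daily-load
    start=2015-06-01T02:00Z
    end=2015-12-31T02:00Z
    EOF

    # Run the coordinator; Oozie materializes one action per day.
    oozie job -oozie http://oozie-host:11000/oozie \
      -config coord-job.properties -run

    # List running coordinators to confirm the schedule took effect.
    oozie jobs -oozie http://oozie-host:11000/oozie -jobtype coordinator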

Confidential

Linux Admin

Responsibilities:

  • Configuring and tuning system and network parameters for optimum performance (see the sysctl sketch at the end of this section).
  • Developing troubleshooting and problem-solving skills, including application- and network-level troubleshooting.
  • Writing shell scripts to automate routine tasks.
  • Identifying and triaging outages; monitoring and remediating system and network performance.
  • Developing tools to automate the deployment, administration, and monitoring of a large-scale Linux environment.
  • Performing server tuning, operating system upgrades.
  • Participating in the planning phase for system requirements on various projects for deployment of business functions.
  • Participating in 24x7 on-call rotation and maintenance windows.
  • Communicating and coordinating with internal and external groups and operations.
Environment: Cloudera CDH4, Hortonworks, Amazon Web Services, HBase, CDH5, Scala, Python, Hive, Sqoop, Flume, Oozie, CentOS, Ambari, EC2, EMR, S3, RDS, CloudWatch, VPC, Kafka.
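A minimal sketch of the system/network parameter tuning referenced in the Linux admin role above, persisted via sysctl. The specific keys are standard Linux tunables commonly adjusted on Hadoop worker nodes, but the values are illustrative assumptions, not recommendations from the original project (requires root):

    #!/usr/bin/env bash
    # Persist kernel/network tunables and apply them without a reboot.
    # Values are illustrative assumptions.

    cat >> /etc/sysctl.conf <<'EOF'
    # Reduce swapping on data nodes
    vm.swappiness = 1
    # Larger accept backlog for bursty shuffle traffic
    net.core.somaxconn = 1024
    # Wider socket buffers for large transfers
    net.core.rmem_max = 4194304
    net.core.wmem_max = 4194304
    EOF

    # Load the new settings immediately.
    sysctl -p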
