Hadoop Developer/ Admin Resume
Newark, CA
SUMMARY
- 5+ years of experience in Hadoop and related technologies.
- Hands-on installation, configuration, and maintenance of multi-node Hadoop clusters.
- Experience in performing major and minor Hadoop upgrades on large environments.
- Experience with securing Hadoop clusters using Kerberos.
- Experience in using Cloudera Manager for installation and management of Hadoop clusters.
- Monitoring and support through Nagios and Ganglia.
- Benchmarking Hadoop clusters to validate hardware before and after installation, and tuning configurations for better performance (see the benchmarking sketch after this list).
- Experience in performing POCs to test the usability of new tools on top of Hadoop.
- Experience working in large environments and leading infrastructure support and operations.
- Migrating applications from existing systems such as MySQL, Oracle, DB2, and Teradata to Hadoop.
- Expertise with Hadoop, MapReduce, Pig, Sqoop, Oozie, and Hive.
- Developed and automated Hive queries to run on a daily basis.
- Extensive knowledge of migrating applications from existing sources.
- Experience in driving OS upgrades on large Hadoop clusters without downtime.
- Expertise in collaborating across multiple technology groups and getting things done.
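Below is a minimal sketch of the kind of cluster benchmarking referred to above, using the standard TeraGen/TeraSort and TestDFSIO jobs that ship with Hadoop; the jar paths are CDH-style and the data sizes are only illustrative.

```sh
#!/bin/bash
# Benchmark sketch: jar locations and sizes are examples; adjust to the installed distribution.
EXAMPLES_JAR=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
TEST_JAR=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar

# Generate ~100 GB of input (1e9 rows x 100 bytes), sort it, and validate the result.
hadoop jar "$EXAMPLES_JAR" teragen 1000000000 /benchmarks/teragen
hadoop jar "$EXAMPLES_JAR" terasort /benchmarks/teragen /benchmarks/terasort
hadoop jar "$EXAMPLES_JAR" teravalidate /benchmarks/terasort /benchmarks/teravalidate

# Raw HDFS throughput: write and read 10 files of 1 GB each, then clean up.
hadoop jar "$TEST_JAR" TestDFSIO -write -nrFiles 10 -size 1GB
hadoop jar "$TEST_JAR" TestDFSIO -read -nrFiles 10 -size 1GB
hadoop jar "$TEST_JAR" TestDFSIO -clean
```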
TECHNICAL SKILLS
Hadoop ecosystem components: Hadoop, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Kafka, Impala, Oozie.
Tools: Tableau, MicroStrategy integrations with Hive.
Programming Languages: Unix Shell scripting, Python, SQL
Monitoring and Alerting: Nagios, Ganglia.
Operating Systems: Linux CentOS 5/6, Red Hat 6.
PROFESSIONAL EXPERIENCE
Confidential, Newark, CA
Hadoop Developer/ Admin
Responsibilities:
- Worked on a live 40-node Hadoop cluster running CDH 5.4.4, CDH 5.2.0, and CDH 5.2.1.
- Developed scripts for automation and monitoring purposes.
- Participated in designing, building, and maintaining a build automation system for our software products.
- Provided build configuration management for software products.
- Performed Sqoop-based file transfers through Cassandra tables for processing data into several NoSQL databases.
- Implemented authentication using Kerberos and authorization using Apache Sentry (see the security sketch after this list).
- Designed and executed the upgrade from CDH 4 to CDH 5.
- Examined job failures and performed troubleshooting.
- Performance tuning of the Kafka cluster (failure and success metrics).
- Monitoring and managing the Hadoop cluster using Cloudera Manager.
- Automated installation of Hadoop on AWS using Cloudera Director.
- Performance tuning of HDFS and YARN.
- Automation using AWS CloudFormation.
- Managed users and groups using AWS IAM (see the AWS automation sketch after this list).
- Hands-on experience creating a Hadoop environment on Google Compute Engine (GCE).
- Installed a Hadoop cluster on GCE and worked on a POC recommendation system for social media using the MovieLens dataset.
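Below is a minimal security sketch of how a kerberized, Sentry-secured Hive service is typically exercised after the setup described above; the keytab, principal, realm, database, and role/group names are all illustrative.

```sh
#!/bin/bash
# Obtain a Kerberos ticket from a keytab before talking to kerberized services.
kinit -kt /etc/security/keytabs/etl.keytab etl@EXAMPLE.COM
klist   # confirm the ticket was granted

# Define Sentry role-based access through HiveServer2 (role/group/database names are examples).
cat > /tmp/grant_analyst.sql <<'SQL'
CREATE ROLE analyst_role;
GRANT ROLE analyst_role TO GROUP analysts;
GRANT SELECT ON DATABASE sales TO ROLE analyst_role;
SQL

# The Kerberos principal of the Hive service goes in the JDBC URL.
beeline -u "jdbc:hive2://hiveserver.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM" \
        -f /tmp/grant_analyst.sql
```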
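Below is a minimal sketch of the AWS automation mentioned above (a CloudFormation stack plus IAM users and groups) using the standard AWS CLI; the stack, template, user, group, and policy names are illustrative.

```sh
#!/bin/bash
# Create the infrastructure stack the Hadoop nodes run on and wait for it to finish.
aws cloudformation create-stack \
  --stack-name hadoop-infra \
  --template-body file://hadoop-infra.yaml \
  --parameters ParameterKey=ClusterSize,ParameterValue=40 \
  --capabilities CAPABILITY_NAMED_IAM
aws cloudformation wait stack-create-complete --stack-name hadoop-infra

# Basic IAM user and group management for cluster operators (names are examples).
aws iam create-group --group-name hadoop-admins
aws iam create-user --user-name ops-user1
aws iam add-user-to-group --user-name ops-user1 --group-name hadoop-admins
aws iam attach-group-policy --group-name hadoop-admins \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess
```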
Confidential
Hadoop Administrator
Responsibilities:
- Responsible for architecting the Hadoop cluster.
- Involved in source system analysis, data analysis, and data modeling for ETL (Extract, Transform, and Load) and HiveQL development.
- Strong experience in installation and configuration of Hadoop ecosystem components such as YARN, HBase, Flume, Hive, Pig, and Sqoop.
- Expertise in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs or data.
- Managed and reviewed Hadoop log files.
- Loaded log data into HDFS using Flume (see the Flume sketch after this list). Worked extensively on creating MapReduce jobs to prepare data for search and aggregation.
- Worked extensively with Sqoop for importing data (see the Sqoop sketch after this list).
- Designed a data warehouse using Hive.
- Created partitioned tables in Hive.
- Mentored the analyst and test teams in writing Hive queries.
- Extensively used Pig for data cleansing.
- Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs, which trigger independently based on time and data availability (see the Oozie sketch after this list).
- Developed Oozie workflows for daily incremental loads, which pull data from Teradata and import it into Hive tables.
- Developed Pig scripts to transform data into a structured format, automated through Oozie coordinators.
- Worked on pulling data from relational databases into Hive on the Hadoop cluster using Sqoop import for visualization and analysis.
- Used Flume to collect, aggregate, and store web log data from different sources such as web servers and mobile and network devices, and pushed it to HDFS.
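Below is a minimal Flume sketch of the web-log ingestion described above: a single agent with a spooling-directory source, a memory channel, and an HDFS sink; the agent name, directories, and hosts are illustrative.

```sh
#!/bin/bash
# Write a one-agent Flume configuration and start the agent (names and paths are examples).
cat > weblog-agent.conf <<'EOF'
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

a1.sources.r1.type     = spooldir
a1.sources.r1.spoolDir = /var/log/weblogs/incoming
a1.sources.r1.channels = c1

a1.channels.c1.type     = memory
a1.channels.c1.capacity = 10000

a1.sinks.k1.type                   = hdfs
a1.sinks.k1.hdfs.path              = hdfs://nn.example.com:8020/data/weblogs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType          = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel                = c1
EOF

flume-ng agent --name a1 --conf /etc/flume-ng/conf --conf-file weblog-agent.conf
```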
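Below is a minimal Sqoop sketch of the imports and partitioned Hive tables described above: one day of data is pulled from a relational source into a date-partitioned Hive table; the connection string, schema, and column names are illustrative.

```sh
#!/bin/bash
LOAD_DATE=$(date +%Y-%m-%d)

# Pre-create the partitioned target table if it does not exist yet (schema is an example).
hive -e "CREATE TABLE IF NOT EXISTS sales.orders (order_id BIGINT, amount DOUBLE, updated_at STRING)
         PARTITIONED BY (load_date STRING);"

# Import today's rows straight into that partition.
sqoop import \
  --connect jdbc:mysql://dbhost.example.com:3306/sales \
  --username etl_user --password-file /user/etl/.db_password \
  --table orders \
  --where "updated_at >= '${LOAD_DATE}'" \
  --hive-import \
  --hive-table sales.orders \
  --hive-partition-key load_date \
  --hive-partition-value "${LOAD_DATE}" \
  --num-mappers 4
```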
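Below is a minimal Oozie sketch of how the time- and data-triggered jobs described above are typically submitted; the coordinator XML itself lives in HDFS and is not shown, and the hosts, paths, and dates are illustrative.

```sh
#!/bin/bash
# Properties consumed by an Oozie coordinator that runs Hive/Pig actions daily
# once the day's input data is available (values are examples).
cat > daily-load.properties <<'EOF'
nameNode=hdfs://nn.example.com:8020
jobTracker=rm.example.com:8032
oozie.coord.application.path=${nameNode}/user/etl/coordinators/daily-load
start=2015-01-01T01:00Z
end=2016-01-01T01:00Z
EOF

# Submit and start the coordinator, then list running coordinators to verify.
oozie job -oozie http://oozie.example.com:11000/oozie -config daily-load.properties -run
oozie jobs -oozie http://oozie.example.com:11000/oozie -jobtype coordinator -filter status=RUNNING
```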
Confidential
Linux Admin
Responsibilities:
- Configured and tuned system and network parameters for optimum performance (see the tuning sketch after this list).
- Gained troubleshooting and problem-solving skills, including application- and network-level troubleshooting.
- Gained knowledge and experience writing shell scripts to automate tasks.
- Identified and triaged outages; monitored and remediated system and network performance.
- Developed tools to automate the deployment, administration, and monitoring of a large-scale Linux environment.
- Performed server tuning and operating system upgrades.
- Participated in the planning phase for system requirements on various projects for deployment of business functions.
- Participated in the 24x7 on-call rotation and maintenance windows.
- Communicated and coordinated with internal/external groups and operations.
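Below is a minimal tuning sketch of the kind of kernel and network parameter changes referred to above, done with sysctl on CentOS/Red Hat; the specific keys and values are illustrative and depend on the workload.

```sh
#!/bin/bash
# Persist example kernel/network settings, then reload them (values are illustrative).
cat >> /etc/sysctl.conf <<'EOF'
# Reduce swapping on data-heavy nodes
vm.swappiness = 10
# Larger connection backlog and socket buffers for high-throughput services
net.core.somaxconn = 1024
net.core.rmem_max = 4194304
net.core.wmem_max = 4194304
EOF

sysctl -p                  # apply everything in /etc/sysctl.conf
sysctl net.core.somaxconn  # spot-check a single value
```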