Hadoop Developer Resume

San Jose, CA

SUMMARY:

  • Around 8 years of IT experience, including 4+ years of Big Data/Hadoop experience.
  • Experienced Big Data/Hadoop developer with good knowledge of the Hadoop Distributed File System and its ecosystem - MapReduce, Pig, Hive, HBase, Spark, Sqoop, and ZooKeeper.
  • Experience in architecting Hadoop clusters using major Hadoop Distributions - CDH4 & CDH5.
  • Experience in building and maintaining multiple Hadoop clusters of different sizes and configurations, and in setting up rack topology for large clusters, in Hadoop administration/architecture/development roles across multiple distributions such as Hortonworks and Cloudera.
  • Experienced in working with structured data using HiveQL - join operations, Hive UDFs, partitions, bucketing, and internal/external tables (see the sketch after this list).
  • Knowledge of NoSQL databases such as HBase.
  • Experience in analyzing data using Pig Latin and HiveQL.
  • Expertise with managing and reviewing Hadoop log files.
  • Knowledge of job/workflow scheduling and monitoring tools like Oozie and ZooKeeper.
  • Excellent working knowledge of Spark Core, Spark SQL, and Spark Streaming.
  • Hands-on experience in writing MapReduce jobs using Java; expert knowledge of MRv1 and MRv2.
  • Set up and managed cluster servers on AWS. Experience working with EC2 (Elastic Compute Cloud) cluster instances, setting up data buckets on S3 (Simple Storage Service), and setting up EMR (Elastic MapReduce).
  • Experience in working with different data sources like Flat files, Spreadsheet files, log files and Databases.
  • Knowledge of monitoring and managing Hadoop cluster using CDH4/5 Cloudera Manager, Ganglia, Ambari and Nagios.
  • Work experience in different layers of the Hadoop framework - storage layer (HDFS), analysis layer (Pig and Hive), and engineering layer (jobs and workflows).
  • Background with traditional databases such as Oracle, SQL Server, MySQL.
  • Good understanding of SCRUM and AGILE methodologies.
  • Working knowledge of RDBMS/Oracle 9i, SDLC, QA/UAT & Technical documentation.
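
A minimal sketch of the HiveQL partitioning/bucketing pattern referenced above; the table, column, and path names (web_logs, event_date, user_id, /data/web_logs) are hypothetical:

    -- Hypothetical external table, partitioned by date so queries can prune
    -- whole directories, and bucketed by user_id to speed up joins/sampling.
    CREATE EXTERNAL TABLE web_logs (
      user_id STRING,
      url     STRING,
      ts      TIMESTAMP
    )
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
    LOCATION '/data/web_logs';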

TECHNICAL SKILLS:

Programming Languages: SQL, PL/SQL, C, C++, Java, Shell scripting, Python

Web Technology: XML, HTML, CSS, JavaScript, jQuery, PHP, Scala

Databases: Oracle 11g/10g, MySQL, NoSQL

Operating Systems: MS-DOS, Windows 7/NT/XP, UNIX, Linux, CentOS, RHEL

Application Packages: MS Office suite

Big Data Technologies/Tools: Apache Hadoop, Cloudera Manager, Hortonworks, Ambari, Ganglia, Nagios, HDFS, MapReduce, HBase, Pig, YARN, Hive, Sqoop, Oozie, Spark, Solr, Flume, ZooKeeper, Puppet, Chef

PROFESSIONAL EXPERIENCE:

Confidential, San Jose, CA

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them using MapReduce programs.
  • Worked on Pig and HiveQL for processing and analyzing data generated by distributed IoT networks.
  • Created Hive queries for data sampling and analysis of the data generated by the CustomerIQ application.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from HDFS to MySQL using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Worked on storing/retrieving data in the SilverLink Data Platform.
  • Migrated various Hive UDFs and queries to Spark SQL for faster response times as part of a POC implementation (see the sketch after this list).
  • Used Spark for parallel data processing and better performance.
  • Worked in data warehouse schema creation and management.
  • Worked on Oozie workflows to run multiple Hive and Pig jobs.
  • Balanced and tuned HDFS, Hive, MapReduce, and Oozie workflows.
  • Worked on installing operating system and Hadoop updates, patches, version upgrades when required.
  • Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
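
A minimal sketch of the kind of HiveQL query migrated under Spark SQL in the POC; the table and column names (device_events, device_id, event_ts, sensor_value) are hypothetical:

    -- Hypothetical aggregation originally run through Hive; the same statement
    -- can be submitted unchanged through a Spark SQL session, which typically
    -- responds faster because the plan executes on Spark rather than MapReduce.
    SELECT device_id,
           to_date(event_ts)  AS event_day,
           COUNT(*)           AS readings,
           AVG(sensor_value)  AS avg_value
    FROM   device_events      -- hypothetical IoT events table in HDFS
    GROUP  BY device_id, to_date(event_ts);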

Environment: CDH 5.7.1, CDH 5.6.1, CentOS 7, RHEL 7, Ganglia, Hadoop, Hive, Oozie, Pig, Java, HDFS, MapReduce, Spark, Sqoop

Confidential, Bellevue, WA

Hadoop Developer

Responsibilities:

  • Designed and worked on a Big Data analytics platform for processing customer interface preferences and comments using Java, Hadoop, Hive, and Pig.
  • Involved in Hive-HBase integration by creating Hive external tables with storage specified as HBase format (see the sketch after this list).
  • Performance tuning of Hadoop cluster workloads, bottlenecks, and job queuing.
  • Used Oozie to automate/schedule business workflows which invoke Sqoop, MapReduce and Pig jobs as per the requirements.
  • Worked on accessing Hive tables from Java applications using JDBC to perform analytics.
  • Developed Sqoop scripts to import and export the data from relational sources and handled incremental loading on the customer and transaction data by date.
  • Worked with various HDFS file formats like Avro, SequenceFile and various compression formats like Snappy, bzip2.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Developed Hive queries for data sampling and analysis for the analysts.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Used the Solr search API and developed a custom Solr request handler.
  • Developed custom Python and UNIX shell scripts for data sampling and for pre- and post-validation of master and slave nodes, before and after configuring the name node and data nodes respectively.
  • Developed and used Pig scripts to process and query flat files in HDFS that could not be accessed using Hive.
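
A minimal sketch of the Hive-HBase integration described above; the table name and column mapping are hypothetical:

    -- Hypothetical Hive external table backed by an existing HBase table.
    -- hbase.columns.mapping ties each Hive column to the HBase row key or a
    -- column family:qualifier pair.
    CREATE EXTERNAL TABLE customer_feedback (
      rowkey      STRING,
      customer_id STRING,
      comment     STRING
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES (
      'hbase.columns.mapping' = ':key,info:customer_id,info:comment'
    )
    TBLPROPERTIES ('hbase.table.name' = 'customer_feedback');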

Environment: RedHat Linux 5, MS SQL Server, Oracle, Hadoop CDH 4, Pig, Hive, ZooKeeper, Flume, HDFS, HBase, Sqoop, Solr, Python, Oozie, UNIX Shell Scripting, PL/SQL.

Confidential, Costa Mesa, CA

Hadoop Operations Engineer

Responsibilities:

  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes, and managing data backups.
  • Supported MapReduce Programs that are running on the cluster.
  • Designed appropriate partitioning/bucketing schemas to allow faster data retrieval during analysis using Hive.
  • Involved in creating Hive tables, loading data and running hive queries.
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning, and compression-related properties in Hive (see the sketch after this list).
  • Implemented and configured High Availability Hadoop Cluster (Quorum Based).
  • Periodically reviewed Hadoop-related logs, fixed errors, and prevented errors by analyzing warnings.
  • Used Flume to stream data into HDFS from various sources. Managed interdependent Hadoop jobs and automated several types of Hadoop MapReduce and Hive jobs.
  • Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
  • Provided operational support services related to Hadoop infrastructure and application installation. Handled the imports and exports of data onto HDFS using Flume and Sqoop.
  • Supported technical team members in management and review of Hadoop log files and data backups.
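
A minimal sketch of the compression and dynamic-partition settings referenced above; the property values are typical of a CDH3/CDH4-era cluster, and the table names (transactions, staging_transactions) are hypothetical:

    -- Compress query output with Snappy and allow dynamic partition inserts.
    SET hive.exec.compress.output=true;
    SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- Load into a date-partitioned table so later analysis prunes partitions.
    INSERT OVERWRITE TABLE transactions PARTITION (txn_date)
    SELECT customer_id, amount, txn_date
    FROM   staging_transactions;    -- hypothetical staging table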

Environment: HDFS, CDH3, CDH4, HBase, NoSQL, RHEL 4/5, Hive, Pig, Perl Scripting, Sqoop, Flume

Confidential, Redmond, WA

Hadoop Admin/Developer

Responsibilities:

  • Involved in Cluster maintenance using Cloudera Manager, used JobTracker UI to analyze incomplete or failed jobs and ran file merger to consolidate small files and directories.
  • Worked with data delivery teams, Linux admin team to setup new users, user spaces, quotas, setting up Kerberos principals and testing HDFS/MapReduce access, Hive/Pig access for them.
  • Teamed with infrastructure, network, database, application and business intelligence teams to evaluate new host requests and resource management, perform updates and upgrades to the existing farm from time to time.
  • Wrote shell scripts and used Cloudera Manager to monitor the health of Hadoop daemon services and respond to warning or failure conditions.
  • Performed tuning of Hadoop MapReduce routines written in Java and provided 24x7 support for developers using the Hadoop stack. Automated MapReduce job workflows using the Oozie scheduler.

Environment: Cloudera, Hadoop, HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, Kerberos, RedHat Linux

Confidential

Oracle PL/SQL Developer

Responsibilities:

  • Analyzed all business functionality related to back end database interfaces.
  • Developed technical specifications for various back-end modules from business requirements; specifications were written according to standard specification formats.
  • Worked with the DBA on enhancements to the physical DB schema, and coordinated on creating and managing tables, indexes, tablespaces, triggers, DB links, and privileges.
  • Analyzed and designed tables based on small and large database transactions.
  • Developed back-end interfaces using PL/SQL stored packages, procedures, functions, collections, object types, and triggers, along with C and K-shell scripts (see the sketch after this list).
  • Developed screens and reports using Oracle Forms/Reports.
  • Responsible for producing Crystal Reports and SQL reports.
  • Utilized SQL*Loader to load flat files into database tables.
  • Involved in Extract, Transform, and Load (ETL) work using the Informatica tool.
  • Responsible for SQL tuning and optimization using Analyze, Explain Plan, TKPROF utility and optimizer hints.
  • Utilized SQL developer tool in developing all back end database interfaces.
  • Responsible for performing code reviews.
  • Developed user documentation for all the application modules. Also responsible for writing test plan documents and unit testing for the application modules.
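
A minimal sketch of the kind of PL/SQL packaged procedure used for back-end interfaces like those above; the package, procedure, table, and column names are hypothetical:

    -- Hypothetical package exposing one back-end interface procedure.
    CREATE OR REPLACE PACKAGE order_api AS
      PROCEDURE apply_discount(p_order_id IN NUMBER, p_pct IN NUMBER);
    END order_api;
    /
    CREATE OR REPLACE PACKAGE BODY order_api AS
      PROCEDURE apply_discount(p_order_id IN NUMBER, p_pct IN NUMBER) IS
      BEGIN
        UPDATE orders                        -- hypothetical orders table
           SET total = total * (1 - p_pct / 100)
         WHERE order_id = p_order_id;
        COMMIT;
      END apply_discount;
    END order_api;
    /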

Environment: SQL, PL/SQL, Java, Oracle 10g, SQL*Plus, Windows, SQL*Loader, Explain Plan and TKPROF tuning utility, SQL Developer, TOAD

Confidential

Oracle PL/SQL Developer

Responsibilities:

  • As a Software Developer, responsible for design and development of module specifications.
  • Analyzed, designed, optimized, and tuned Java programs, PL/SQL procedures, and Oracle stored procedures.
  • Wrote cursors and control structures using PL/SQL.
  • Created PL/SQL objects such as stored procedures, functions, packages, and cursors using optimized techniques.
  • Created various types of triggers, including DML triggers, DDL triggers, and database triggers (see the sketch after this list).
  • Involved in bug fixing of tickets.
  • Preparation of Unit Test Data.
  • Execution of Unit Test Plan Conditions and Test Cases.
  • Simulation and code walkthroughs.
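
A minimal sketch of a row-level DML trigger of the kind described above; the trigger, table, and audit-table names are hypothetical:

    -- Hypothetical row-level DML trigger auditing salary changes.
    CREATE OR REPLACE TRIGGER trg_emp_salary_audit
    AFTER UPDATE OF salary ON employees      -- hypothetical employees table
    FOR EACH ROW
    BEGIN
      INSERT INTO salary_audit (emp_id, old_salary, new_salary, changed_on)
      VALUES (:OLD.emp_id, :OLD.salary, :NEW.salary, SYSDATE);
    END;
    /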

Environment: SQL, PL/SQL, Java, Oracle 10g, SQL*Plus, Windows, SQL*Loader, Explain Plan and TKPROF tuning utility, SQL Developer, TOAD
