Lead Hadoop Developer Resume

New York

SUMMARY

  • 10 years of IT industry experience encompassing a wide range of skill sets, roles, and industry verticals.
  • Certified Big Data (Hadoop) developer; also certified as an IBM Database Developer and in Lean concepts.
  • Hands-on experience in development and data analytics using Hadoop (Big Data) tools and technologies, including HDFS, MapReduce, Hive, Pig, HBase, Spark, Flume, Sqoop, and Oozie.
  • Strong database skills in DB2, Hive, Oracle, PL/SQL, MySQL, BigSQL, and NoSQL databases such as HBase; familiar with Cassandra.
  • Experienced in installing, configuring, managing, and testing Big Data (Hadoop) ecosystem components.
  • Experienced in developing MapReduce programs in Java.
  • Used Apache Spark for large-scale data processing and real-time analytics, and designed ETL processes.
  • Experienced in Data warehouse concepts and ETL tools (Teradata).
  • Experienced in using Teradata SQL Assistant and in data import/export and data loading with utilities such as BTEQ, MultiLoad, FastLoad, and FastExport in UNIX/Mainframe environments.
  • Experienced in stored procedures, triggers, macros, and SQL*Loader.
  • Experienced in UNIX shell scripting.
  • Experienced with legacy systems based on the Mainframe platform.
  • Good knowledge of Maestro, StarTeam, and Build Forge.
  • Experienced with workflow schedulers, data architecture including data ingestion pipeline design and data modeling.
  • Possess functional knowledge in the areas of Insurance systems, Financial Systems, Banking System and Healthcare System.
  • Good experience in all phases of the systems life cycle: development, testing (unit, system, integration, and regression testing), and pre-production support.
  • Proficient in analyzing and translating business requirements to technical requirements and architecture.
  • Performed knowledge management by maintaining AIDs, project knowledge documents, and change documents.
  • Experienced in handling internal and external functional, process and data audits.

TECHNICAL SKILLS

Big Data Ecosystems: HDFS, Hive, Pig, MapReduce, Spark, Sqoop, HBase, Cassandra, Zookeeper, Flume, Oozie, Avro, and Hue

Languages: Java, PL/SQL, Python, Scala, Unix shell scripting, HiveQL, Pig scripts, and COBOL

Databases: MySQL, BigSQL, NoSQL, Oracle, DMS1100, and DB2

Operating Systems: Unix, Windows, MVS/ESA, z/OS

ETL/Reporting: Teradata

Methodologies: Waterfall, Scrum, and Agile

Tools: RPM, MPP, Test Direct, TWS Scheduler, Clarity, Quality Center, Service Center, SFTP, Teradata SQL Assistant, TOAD, SSH, HUE, Eclipse, Maven, PuTTY, BigInsights, Cloudera, Beeline Connect

PROFESSIONAL EXPERIENCE

Confidential, New York

Lead Hadoop Developer

Responsibilities:

  • Built scalable distributed data solutions using Hive, Impala, Spark, and Pig.
  • Imported and exported data into and out of HDFS and Hive using Sqoop.
  • Implemented partitioning and bucketing in HIVE.
  • Created Hive tables and worked with them using HiveQL.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Deployed algorithms in Scala with Spark on complex datasets and carried out Spark-based development in Scala.
  • Created Java UDFs in PIG and HIVE.
  • Used SequenceFile, Avro, Parquet, and ORC file formats.
  • Good working knowledge of Amazon Web Services components such as EC2, EMR, S3, EBS, and ELB.
  • Produced estimates and technical design specifications for projects.
  • Performed requirement analysis and prepared solutions for each requirement.

Environment: HADOOP, HDFS, MAPREDUCE, HIVE, PIG, Scala, HBASE, OOZIE, YARN, Spark, Core Java, Teradata, SQL, UBUNTU/UNIX, Eclipse, Maven, JDBC drivers, Mainframe, MySQL, Linux, AWS, XML, CRM, SVN, HUE, PuTTY, Cloudera, Beeline Connect, TWS Scheduler.

Confidential, Fund, CA

Lead Hadoop Developer

Responsibilities:

  • Built the project using HIVE, BIGSQL, and PIG.
  • Implemented partitioning and bucketing in HIVE.
  • Involved in data modeling in Hadoop.
  • Created Hive tables and worked with them using HiveQL.
  • Wrote Apache PIG scripts to process HDFS data.
  • Created Java UDFs in PIG and HIVE.
  • Designed an end-to-end ETL workflow using Hadoop.
  • Participated in backup and recovery of Hadoop file system.
  • Automated tasks using UNIX shell scripts.
  • Performed requirement analysis and prepared solutions for each requirement.
  • Gathered the business requirements from the Business Partners and Subject Matter Experts.

Environment: HADOOP, HDFS, MAPREDUCE, HIVE, PIG, Scala, Python, HBASE, OOZIE, YARN, Spark, Core Java, Oracle, SQL, UBUNTU/UNIX, Eclipse, Maven, JDBC drivers, Mainframe, MySQL, Linux, AWS, XML, CRM, SVN, PDSH, PuTTY, BigInsights

Confidential, NJ

Senior Developer

Responsibilities:

  • Analyzed the requirements and built the HBASE data model.
  • Loaded historical data as well as incremental customer and other data into Hadoop through Hive.
  • Applied the required business logic to the data in Hive and generated the required output as flat files.
  • Wrote complex Pig jobs.
  • Imported and exported large data sets between various data sources and HDFS using Sqoop.
  • Implemented partitioning and bucketing in HIVE.
  • Balanced data load across the cluster and tuned the performance of various jobs running on the cluster.
  • Analyzed and debugged errors occurring during job execution in the Big Data cluster environment.
  • Developed Oozie workflows for scheduling and orchestrating the ETL process.
  • Provided solutions for walk-ups and for operational and incident tickets.
  • Provided data fixes and code fixes for defects.
  • Developed queries for reporting.
  • Developed applications using Eclipse.
  • Improved processes through SQL tuning.
  • Provided low-level and high-level solution design documents.
  • Responsible for disaster recovery of systems.
  • Participated in and performed software upgrades and conversions.
  • Translated customer requirements into formal requirements and design documents, established specific solutions, and led the efforts, including programming and testing, that culminated in client acceptance of the results.

Environment: HADOOP, HDFS, MAPREDUCE, Java, HIVE, Hue, PIG, Flume, SQOOP, HBASE, OOZIE, YARN, Zookeeper, Eclipse, Maven, BigInsights

Confidential, NJ

Senior Developer

Responsibilities:

  • Performed requirement analysis and prepared solutions for each requirement.
  • Designed the TDD (low-level) from the SRS (high-level).
  • Assigned tasks and provided daily updates and weekly status updates to the client.
  • Responsible for design, data mapping analysis, and mapping rules.
  • Used Python scripts to transform the data.
  • Fixed issues in the existing FastLoad/MultiLoad scripts so that data loads into the warehouse smoothly and efficiently.
  • Loaded data from several flat-file sources to staging using MLOAD and FLOAD.
  • Created BTEQ scripts with data transformations for loading the base tables.
  • Generated reports using Teradata BTEQ.
  • Optimized and tuned Teradata SQL to improve batch performance and data response time for users.
  • Used the FastExport utility to extract large volumes of data and send files to downstream applications.
  • Created stored procedures per business requirements and was involved in performance tuning.

Environment: Teradata V2R12, Teradata SQL Assistant, MLOAD, FASTLOAD, BTEQ, Erwin, Unix Shell Scripting, Macros, Stored procedure, Db2, Cobol, Python, SAS, PL/SQL, FileZilla

Confidential

Developer

Responsibilities:

  • Created and reorganized all types of database objects including tables, views, indexes, sequences, synonyms and setting proper parameters and values for all the objects.
  • Wrote database triggers, stored procedures, stored functions, and stored packages to perform various automated tasks for better performance.
  • Created indexes on the tables for faster retrieval of the data to enhance database performance.
  • Created Shell Scripts for invoking SQL scripts.
  • Created and modified several UNIX shell scripts according to the changing needs of the project.
  • Used different joins, sub-queries, and nested queries in SQL queries.
  • Effectively made use of table functions, indexes, table partitioning, analytical functions, and materialized views.
  • Used cursor FOR loops to fetch an arbitrary number of rows.
  • Imported/exported data from/to different databases using utilities like SQL*Loader.
  • Tuned performance for Oracle RDBMS using EXPLAIN PLAN and hints.
  • Involved in the continuous enhancements and fixing of production problems.
  • Verified and validated data using SQL queries.
  • Analyzed and prepared High and low level designs.
  • Wrote clear, maintainable, efficient, and reusable code.
  • Provided post-production support for the developed modules during the QA and UAT phases.

Environment: Oracle 10g, Java, SQL, PL/SQL, UNIX, SQL*Loader, SQL Navigator, TOAD, SQL Developer.
