
Hadoop/Big Data Lead Developer Resume


Los Angeles, CA

SUMMARY

  • 10+ years of experience building high-performing Hadoop distributed systems and data warehousing applications that manage large volumes of client data, transforming and storing the data efficiently for trend analysis, billing and business intelligence.
  • Expertise in Big Data architectural design and programming, including Apache Hadoop, Hive, Impala, Pig, Sqoop, Oozie, YARN, Flume, etc.
  • Strong knowledge and hands-on experience managing cluster resources using YARN and providing group services configured with a centralized ZooKeeper service.
  • Excellent programming skills in OOP concepts, Java technologies and MapReduce.
  • Experience in working with large data sets using NoSQL databases like HBase.
  • Understanding of distributed stream processing ecosystems and tools including Spark and Kafka.
  • Strong working knowledge of the Python programming language.
  • Strong knowledge of data warehousing concepts such as Star Schema and Snowflake Schema dimensional modeling, Fact tables, Summary tables and Slowly Changing Dimensions.
  • Experience in Teradata Utilities such as FastLoad, MultiLoad, FastExport and BTEQ.
  • Expertise in implementing complex business rules by creating mappings using various Informatica transformations, Sessions and Workflows.
  • Worked extensively on tuning mappings, identifying and resolving performance problems at various stages such as sources, targets, mappings and sessions, and on Pushdown Optimization.
  • Experience in Unix Shell scripting and job scheduling tools like Autosys.
  • Experience in creating logical and physical data models for Relational and Dimensional data modeling using Erwin data modeling tool.
  • Expertise in implementing robust, reusable PL/SQL packages, procedures, functions, triggers, objects and pipelined functions using TOAD and SQL Developer.
  • Ability to understand business requirements and functional specs and translate them into design documents, data/ETL flow charts and technical specs.
  • In-depth knowledge of Agile/Scrum methodology.
  • Excellent problem solving, communication, leadership, analytic and interpersonal skills.
  • Works well independently or as part of a team; highly effective at communicating with all levels of management and coworkers, and committed to delivering superior-quality work.
  • Experience in Production support activities and communicating with offshore teams.

TECHNICAL SKILLS

Hadoop stack: Hadoop MR, HDFS, YARN, Pig, Hive, Sqoop, Impala, Oozie, Flume, etc.

Programming Languages & Frameworks: Java, MapReduce, Spark, Struts, JSP, Spring, J2EE, Hibernate

Data Warehousing Concepts: Star Schema and Snowflake Schema dimensional modeling, Fact tables, Summary tables, Slowly Changing Dimensions

NoSQL Databases: HBase

RDBMS Databases: Teradata, Oracle

ETL Tools: Informatica Power Center 8.5, 8.6, 9.1

SQL Languages: HiveQL, Teradata SQL, PL/SQL

Scheduling Tools: Autosys

Other Tools: Eclipse IDE, Toad, Oracle SQL Developer, ERwin, Visio

Operating Systems: Windows and UNIX

PROFESSIONAL EXPERIENCE

Confidential, LOS ANGELES, CA

Hadoop/Big Data Lead Developer

Responsibilities:

  • Received large volumes of clickstream data from various third parties such as Adobe and Conviva into HDFS.
  • Wrote MapReduce code to remove invalid and incomplete data (a minimal sketch follows this list).
  • Created Hive external tables on top of the validated data sets.
  • Developed complex business rules using Hive, Impala and Pig to transform and store the data in an efficient manner for trend analysis, billing and business intelligence.
  • Wrote Hive user-defined functions (UDFs) to implement critical logic.
  • Managed cluster resources using YARN.
  • Provided group services and configured them with a centralized ZooKeeper service.
  • Integrated Hadoop with Teradata and Oracle RDBMS systems by importing and exporting customer data using Sqoop.
  • Ingested tweets related to Confidential into HDFS using Flume.
  • Automated the end-to-end process using Oozie workflows and the Autosys scheduling tool.
  • Built user clickstream data warehouse aggregate tables in Hive to be accessed by QlikView.
  • Accessed the Hadoop environment using HUE.
  • Developed NoSQL scripts to read/write customer account information in HBase.
  • Implemented Spark applications to process real-time streaming data.
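
The data-cleansing step mentioned above can be illustrated with a minimal, map-only MapReduce sketch in Java. The tab delimiter, expected field count and class name are assumptions for illustration rather than the production layout; valid records pass through unchanged for the downstream Hive/Impala tables.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only job that drops invalid or incomplete clickstream records.
    // Assumes tab-delimited input with at least five fields (hypothetical layout).
    public class ClickstreamFilterMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 5; // assumed record width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t", -1);

            // Drop records that are truncated or missing mandatory fields.
            if (fields.length < EXPECTED_FIELDS) {
                context.getCounter("clickstream", "invalid").increment(1);
                return;
            }
            for (String field : fields) {
                if (field.isEmpty()) {
                    context.getCounter("clickstream", "incomplete").increment(1);
                    return;
                }
            }
            // Valid record: emit unchanged for the cleansed data set.
            context.write(NullWritable.get(), value);
        }
    }

Run as a map-only job (number of reducers set to zero) so the filtered records are written straight back to HDFS, where the Hive external tables are defined over them.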

Environment: CDH 4.X, CDH 5.X, HUE, Eclipse, Java, UNIX, Teradata, Oracle …

Confidential, Beaverton, OR

EDW/ETL Lead Developer

Responsibilities:

  • Provided support for the organization's database architecture through database design, modeling and implementation.
  • Developed plans, strategies and standards for data within the enterprise.
  • Understood the end-to-end scope of work and designed architectural solutions accordingly.
  • Designed and developed the logic to handle business rules and scenarios while moving data from source to target.
  • Integrated all data in the data management platform to enable accessibility.
  • Worked with the DBA on enhancements to physical database schemas, creating and managing tables, indexes, tablespaces, triggers, partitioning, database links and privileges.
  • Used Informatica Designer and Workflow Manager to create complex mappings and sessions.
  • Created various transformations like filter, router, lookups, stored procedure, joiner, update strategy, expressions and aggregator to pipeline data to Data Warehouse and monitored the Daily and Weekly Loads.
  • Created Mappings, Mapplets, Sessions and Workflows with effective caching and logging using Informatica Power Center 8.6/9.1.
  • Involved in Fine tuning SQL overrides in Source Qualifier and Look-up SQL overrides for performance Enhancements.
  • Created reusable Mapplets and transformations in Informatica.
  • Developed UNIX shell scripts to pick up flat files from the FTP server, convert them to generic names, match record counts against the control files before loading, and send email notifications on success or failure (a minimal sketch of the count check follows this list).
  • Knowledge of Teradata Utilities such as MultiLoad, FastExport, FastLoad and BTEQ.
  • Experience using various types of Teradata indexes efficiently to process SQL statements and access data.
  • Extensively used Autosys to run Informatica jobs.
  • Developed/Modified mappings and sessions which meet the business requirements, and carried out Unit and System testing for functionality and effective performance.
  • Migrated Informatica mappings and workflows to UAT environment and provided production support.
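
The count check described above was implemented in UNIX shell; purely as an illustration, here is a minimal Java sketch of the same idea, assuming the control file carries the expected record count on its first line. The file names and control-file format are hypothetical, not taken from the actual scripts.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    // Compares the record count of a flat file against the expected count
    // declared in its control file before allowing the load to proceed.
    public class LoadCountValidator {

        public static void main(String[] args) throws IOException {
            Path dataFile = Paths.get("customer_extract.dat");    // hypothetical data file
            Path controlFile = Paths.get("customer_extract.ctl"); // hypothetical control file

            // First line of the control file is assumed to hold the expected count.
            long expected = Long.parseLong(Files.readAllLines(controlFile).get(0).trim());

            long actual;
            try (Stream<String> lines = Files.lines(dataFile)) {
                actual = lines.count();
            }

            if (expected == actual) {
                System.out.println("Counts match (" + actual + "); OK to load.");
            } else {
                System.err.println("Count mismatch: control=" + expected
                        + ", data=" + actual + "; aborting load.");
                System.exit(1); // non-zero exit lets the scheduler flag the failure
            }
        }
    }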

Environment: Informatica Power Center 8.6/9.1, Teradata 13.X, Oracle 11g, Autosys, UNIX operating system, Toad 8.6, ERwin, MS Visual Source Safe.

Confidential, Beaverton, OR

Sr. EDW/ETL Developer

Responsibilities:

  • Worked on Informatica Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, Transformation Developer and Workflow Manager.
  • Involved in requirement analysis and OLTP system analysis.
  • Designed mappings using transformations such as Source Qualifier, Joiner, Aggregator, Lookup, Expression, Router, Sequence Generator, Update Strategy and Filter.
  • Developed Slowly Changing Dimension methodology for loading dimension tables.
  • Used Informatica Power Center for extraction, transformation and loading (ETL) of data into the data warehouse.
  • Worked on session parameters and mapping variables/parameters, and created parameter files for flexible runs of workflows based on changing variable values.
  • Created mappings, tasks, worklets and workflows.
  • Performed mapping optimizations to ensure maximum efficiency.
  • Involved in optimizing and performance-tuning logic on targets, sources, mappings and sessions to increase session efficiency.
  • Designed and Developed pre-session, post-session routines for Informatica sessions to drop and recreate indexes and key constraints for Bulk Loading.
  • Identified performance issues in existing sources, targets and mappings by analyzing the dataflow, evaluating transformations and tuned accordingly for better performance.
  • Work with Business Analyst and Business Users to understand the requirement and translate the requirement into technical specification.
  • Reviewing ETL design document and Test cases.
  • Conducted peer design and code reviews and extensive documentation of standards, best practices, and ETL procedures.
  • Developed UNIX shell scripts using the pmcmd utility and scheduled ETL loads (see the sketch following this list).
  • Involved in code reviews, preparation of test cases, unit testing and working with XML source files.
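
The scheduled loads above were driven by UNIX shell scripts calling the pmcmd utility; as an illustration only, here is a minimal Java wrapper around a pmcmd startworkflow call. The integration service, domain, credentials, folder and workflow names are placeholders, and the flags reflect typical pmcmd usage rather than the actual scripts.

    import java.io.IOException;

    // Launches an Informatica workflow through the pmcmd command-line utility
    // (assumed to be on the PATH). All names and credentials are placeholders.
    public class WorkflowLauncher {

        public static void main(String[] args) throws IOException, InterruptedException {
            ProcessBuilder pb = new ProcessBuilder(
                    "pmcmd", "startworkflow",
                    "-sv", "INT_SVC",        // integration service (placeholder)
                    "-d", "DOMAIN_NAME",     // Informatica domain (placeholder)
                    "-u", "etl_user",        // credentials (placeholders)
                    "-p", "etl_password",
                    "-f", "ETL_FOLDER",      // repository folder (placeholder)
                    "-wait",                 // block until the workflow finishes
                    "wf_daily_load");        // workflow name (placeholder)
            pb.inheritIO();                  // stream pmcmd output to the console

            int exitCode = pb.start().waitFor();
            System.exit(exitCode);           // non-zero exit signals a failed load
        }
    }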

Environment: Informatica 7.X/8.X, Oracle 10g, UNIX, PL/SQL, Omniture (Data Analytics), Cognos 8.4

Confidential, New York, NY

Java Developer

Responsibilities:

  • Followed a well-defined and disciplined process to understand, analyze and solve the cases.
  • Understood the case and gathered requirements from the Clarify requester.
  • Created stored procedures and packages to effectively handle complex business logic. Effectively used exception handling to take care of erroneous scenarios.
  • Tuned SQL queries, PL/SQL stored procedures, Functions and Packages for better performance and efficiency.
  • Created database Tables, Indexes, Views, Materialized Views, Sequences in Development and Production environment using PL/SQL, SQL*Plus and Toad.
  • Extensively worked on Eclipse IDE and Toad.
  • Involved in Unit, Integration and System testing to validate the data.
  • Responsible for designing and creating database objects.
  • Recommended the ETL changes to improve maintainability, data quality, best practices and performance.
  • Developed a thick-client user interface using the Java Swing API.
  • Implemented business logic in Java.
  • Worked on JDBC programming to connect to the Oracle database (a minimal sketch follows this list).
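
As an illustration of the JDBC work mentioned above, here is a minimal sketch that connects to Oracle and invokes a stored procedure. The connection URL, credentials and procedure name are placeholders, and the Oracle JDBC driver (ojdbc) is assumed to be on the classpath; with a JDBC 4+ driver no explicit Class.forName call is needed.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Types;

    // Connects to Oracle over JDBC and calls a stored procedure.
    // URL, credentials and procedure name are placeholders.
    public class OracleProcedureCall {

        public static void main(String[] args) throws SQLException {
            String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL"; // placeholder connection URL

            try (Connection conn = DriverManager.getConnection(url, "app_user", "app_password");
                 CallableStatement stmt = conn.prepareCall("{call pkg_billing.process_account(?, ?)}")) {

                stmt.setLong(1, 12345L);                      // hypothetical account id (IN)
                stmt.registerOutParameter(2, Types.VARCHAR);  // status message (OUT)
                stmt.execute();

                System.out.println("Procedure returned: " + stmt.getString(2));
            }
        }
    }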

Environment: Oracle 9i, PL/SQL, Toad, Java
