Bigdata Developer Resume

Waltham, MA

SUMMARY:

  • Around 8 years of extensive experience in ETL, Confidential Data Integration, Confidential Big Data, data migration, and data warehousing.
  • Experience in using AWS cloud components and connectors to make API calls for accessing data from cloud storage (Amazon S3, Redshift) in Confidential Enterprise Edition.
  • Experience in Hadoop Big Data integration with DataStage ETL, performing data extraction, loading, and transformation for automobile ERP data. Experience with Big Data, Hadoop, HDFS, MapReduce, Spark, and Hadoop ecosystem (Pig and Hive) technologies.
  • Experience with Hadoop architecture and its components such as HDFS, NameNode, DataNode, JobTracker, TaskTracker, YARN, and MapReduce.
  • Experience in creating Joblets in Confidential for processes that are reused in most jobs of a project, such as Start job and Commit job.
  • Extensively used ETL methodology for data migration, extraction, transformation, and loading using Confidential, and designed data conversions from a wide variety of source systems.
  • Created sub jobs that run in parallel to maximize performance and reduce overall job execution time, using the parallelize component of Confidential and multithreaded execution in TOS.
  • Extensively worked on Data Mapper to map complex JSON formats to XML.
  • Extensive work experience on system analysis, design, development, testing and implementation of projects with full SDLC (Software Development Life Cycle).
  • Implemented complex business rules in Informatica by creating reusable transformations and robust mappings/mapplets. Experience in loading data and in troubleshooting, debugging, and tuning Informatica mappings.
  • Designed and developed complex mappings using varied transformation logic such as Unconnected and Connected Lookups, Source Qualifier, Sorter, Normalizer, Sequence Generator, Router, Filter, Expression, Aggregator, Joiner, Rank, Update Strategy, Stored Procedure, XML Source Qualifier, and Input and Output transformations.
  • Expertise in data modeling concepts including dimensional modeling and Star and Snowflake schemas. Experience in CDC and daily load strategies for data warehouses and data marts, Slowly Changing Dimensions (Type 1, Type 2, and Type 3), surrogate keys, and data warehouse concepts.
  • Hands-on experience in performance tuning of Informatica ETL queries. Experience in writing UNIX shell scripts and handling files with UNIX scripts.
  • Expertise in coding with SQL and in developing PL/SQL stored procedures, packages, functions, triggers, and indexes, as well as shell scripting.

WORK EXPERIENCE:

Bigdata Developer

Confidential, Waltham, MA

Responsibilities:

  • Interacted with Solution Architects and Business Analysts to gather requirements and update the Solution Architect Document.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce, Spark and loaded data into HDFS.
  • Submitted Confidential jobs for scheduling using Confidential scheduler. Worked extensively in an Agile software development approach using JIRA.
  • Leveraged Confidential to ingest data into the datalake of datafabric. Unit tested Confidential workflows for the correctness of the data.
  • Developed jobs to expose HDFS files to Hive tables and views depending on the schema versions.
  • Created Hive tables, partitions and implemented incremental imports to perform ad-hoc queries on structured data.
  • Developed jobs to move inbound files to HDFS file location based on monthly, weekly, daily and hourly partitioning.
  • Optimized Hive queries and improved performance by configuring Hive query parameters.
  • Implemented ORC data format for Apache Hive computations to handle the custom business requirements.
  • Imported data from RDBMS (MySQL, Oracle) to HDFS and vice versa using Sqoop (Big Data ETL tool) for Business Intelligence, visualization and report generation.
  • Mapped source data to target data and converted JSON to XML (229 Accord format) using Confidential data mapper. Created Hive external tables, loaded data into them, and queried the data using HQL.
  • Implemented a centralized Data Lake in Hadoop with data from various sources.
  • Analyzed the legacy code and made recommendations on how the new system could be designed.
  • Stored MapReduce program output in Amazon S3 and developed a script to move the data to Redshift for generating a dashboard using QlikView.
  • Worked on Confidential Administrator Console (TAC) for scheduling jobs and adding users.

BigData Developer

Confidential, Tampa, FL

Responsibilities:

  • Interacted with Solution Architects and Business Analysts to gather requirements and update the Solution Architect Document.
  • Excellent experience working on tHDFSInput, tHDFSOutput, tPigLoad, tPigFilterRow, tPigFilterColumn, tPigStoreResult, tHiveLoad, tHiveInput, tHbaseInput, tHbaseOutput, tSqoopImport and tSqoopExport.
  • Submitted Confidential jobs for scheduling using Confidential scheduler.
  • Leveraged Confidential to ingest data into the datalake of datafabric. Unit tested Confidential workflows for the correctness of the data.
  • Experience on cloud configuration in Amazon web services (AWS).
  • Copied data to AWS S3 for storage and used the COPY command to transfer it to Redshift. Used Confidential connectors integrated with Redshift.
  • Analyzed the legacy code and made recommendations on how the new system could be designed.
  • Worked on Confidential Administrator Console (TAC) for scheduling jobs and adding users.
  • Developed jobs to move inbound files to HDFS file location based on monthly, weekly, daily and hourly partitioning.
  • Developed jobs to expose HDFS files to Hive tables and views depending on the schema versions.
  • Created Hive tables, partitions and implemented incremental imports to perform ad-hoc queries on structured data. Optimized Hive queries and improved performance by configuring Hive query parameters.
  • Implemented ORC data format for Apache Hive computations to handle the custom business requirements.
  • Ran Hive queries using spark-sql to take advantage of in-memory processing.
  • Worked extensively on the design, development, and deployment of Confidential jobs to extract data, filter it, and load it into the datalake.
  • Designed the Hive jobs and scheduled them using the framework (an in-house scheduling framework in the organization).
  • Deployed Confidential workflows to dev, test, and production environments.

ETL Developer

Confidential, Alpharetta, GA

Responsibilities:

  • Worked closely with Business Analysts to review the business specifications of the project and also to gather the ETL requirements.
  • Created Confidential jobs to copy files from one server to another using Confidential FTP components. Created and managed source-to-target mapping documents for all Fact and Dimension tables.
  • Analyzed the quality of source data using Confidential Data Quality. Wrote SQL queries with joins to access data from Oracle and MySQL.
  • Prepared ETL mapping documents for every mapping and a data migration document for smooth transfer of the project from the development environment to testing and then to production.
  • Designed and implemented ETL for data loads from heterogeneous sources to SQL Server and Oracle target databases, and for Fact tables and Slowly Changing Dimensions (SCD Type 1 and SCD Type 2).
  • Utilized Big Data components like tHDFSInput, tHDFSOutput, tPigLoad, tPigFilterRow, tPigFilterColumn, tPigStoreResult, tHiveLoad, tHiveInput, tHbaseInput, tHbaseOutput, tSqoopImport and tSqoopExport.
  • Used the most common Confidential components (tMap, tDie, tConvertType, tFlowMeter, tLogCatcher, tRowGenerator, tSetGlobalVar, tHashInput, tHashOutput, and many more).
  • Created many complex ETL jobs for data exchange from and to Database Server and various other systems including RDBMS, XML, CSV, and Flat file structures. Experienced in using debug mode of Confidential to debug a job to fix errors.
  • Responsible for development, support, and maintenance of the ETL (Extract, Transform and Load) processes using Confidential Integration Suite.
  • Conducted JAD sessions with business users and SMEs for better understanding of the reporting requirements. Developed Confidential jobs to populate the claims data into the data warehouse star schema.
  • Used Confidential Admin Console Job conductor to schedule ETL Jobs on daily, weekly, monthly and yearly basis.
  • Worked on various Confidential components such as tMap, tFilterRow, tAggregateRow, tFileExist, tFileCopy, tFileList, tDie etc.
  • Worked extensively on Confidential Admin Console and scheduled jobs in Job Conductor.

ETL/Informatica Developer

Confidential, Irvine, CA

Responsibilities:

  • Interacted with Business Analysts and Users for gathering and analyzing the Business Reports Requirements.
  • Closely worked with ETL team to configure Informatica connections, source DB connection and target DB connection in DAC.
  • Worked closely with the Project Manager and Data Architect. Assisted Data Architect in design by doing source data analysis, rectifying the requirement documents, creating source to target mappings.
  • Gathered requirements for gap analysis, translated them into technical design documents, and made recommendations to close the gaps.
  • Worked with PowerCenter tools such as Designer, Workflow Manager, Workflow Monitor, and Repository Manager.
  • Extensively used Informatica transformations such as Source Qualifier, Rank, SQL, Router, Filter, Lookup, Joiner, Aggregator, Normalizer, and Sorter, along with their transformation properties. Developed ETL mappings and transformations using Informatica PowerCenter.
  • Responsible for developing the ETL logic and the data maps for loading the tables. Designed processes for populating tables for one-time and incremental loads.
  • Worked on Informatica tools like Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer and Transformation Developer.
  • Created and scheduled sessions and jobs to run on demand, on schedule, or only once. Monitored workflows and sessions using Workflow Monitor.
  • Created complex mappings that implemented business logic to load data into the staging area. Used Informatica reusability at various levels of development.
  • Monitored incremental and full loads of data through Data Warehouse Administration Console (DAC) and Informatica. Created, launched, and scheduled sessions.
  • Developed interfaces for loading the lookup and transactional data. Delivered assignments before their deadlines.
  • Performed error analysis to improve the system by catching data anomalies and code issues.
  • Wrote unit test cases, test scenarios, unit test documents and also verified the test scenarios of the QA team. Unit tested for all the possible test scenarios.

ETL Developer

Confidential, Melbourne, VIC

Responsibilities:

  • Gathered requirements, then designed and developed ETL solutions as required. Studied the requirements and data model to develop the mappings accordingly. Created High Level Design and Low Level Design documents.
  • Designed and Developed Informatica Mappings from Scratch to Load the Data from Source System to Staging system and Warehouse System.
  • Developed many complex full and incremental Informatica objects (workflows/sessions, mappings/mapplets) with various transformations. Wrote ETL code, UNIX scripts, and SQL queries to verify the results.
  • Delivered code on schedule. Worked with the data modelling team and provided suggestions for creating the data model. Took end-to-end ownership of the system for all technical aspects.
  • Implemented best practices and raised delivery standards. Owned design and code reviews and proactively improved system scalability, robustness, performance, and flexibility.
  • Provided inputs to the Project Manager for project planning, including robust estimates for development tasks.
  • Mentored the team and ensured that design, code, and technical deliverables were of high quality. Ensured high-quality, zero-defect deliveries.
  • Involved in coding, testing, and preparing jobs as per the requirements specification. Performed unit testing for each development and enhancement.
  • Responsible for debugging defects found in the developed mappings and explaining them to the testing team.
