Bigdata Consultant Resume

Richardson, TX

SUMMARY:

  • An IT professional with 6+ years of experience in requirements gathering and writing, data analysis, design, development, testing, and implementation of enterprise applications on Teradata, DB2 and Big Data platforms.
  • Working expertise in Hive, Teradata, Pig, Spark, Scala and Sqoop.
  • Experience in ETL Testing. Coordinated and managed test efforts with QA teams across the Software Testing Life Cycle: Smoke Testing, Functional Testing, Regression Testing, Integration Testing, User Acceptance Testing and Release Planning.
  • Worked in various business domains, including Retail, Healthcare, Financing, Banking, Investing, Logistics and Telecom.
  • Experience in Analysis, Design, Development and Implementation of Relational Database (OLTP) and Data Warehousing Systems (OLAP)
  • Expertise in tuning the performance of mappings and transformations in DataStage and determining the performance bottlenecks.
  • Extensive experience in the Data Modeling phases: conceptual, logical and physical
  • Strong experience in Data warehouse development life cycle and Design of Data marts with Star Schemas, Snowflake Schemas and Integrated Schemas
  • Used the DataStage Designer to develop processes for extracting, transforming, and loading data into data warehouse databases
  • Experience in Pipeline and Partitioning Concept
  • Extensively worked with Parallel Extender for parallel processing to improve job performance while working with bulk data sources
  • Experience in integrating various data sources like Flat files, Oracle, Teradata into staging area, and then into Datamarts / EDW.
  • Experience in migrating DataStage 8.5 to 11.5 versions.
  • Worked extensively with different file stages of DataStage like Sequential, Dataset, File set, Lookup File set
  • Experience in analyzing performance bottlenecks, root cause analysis, monitoring end-to-end performance and fixing performance issues.
  • Skillful in identifying Release requirements, Functional Specifications Document, Product Backlog, Iteration Backlog, Acceptance Criteria, Test Artifacts, Code coverage, Test coverage and delivery of Release work items using Traceability Matrix.
  • Worked with Local Containers, Shared Containers and Job Sequences
  • Good knowledge in SQL & PL/SQL and expertise in writing Stored Procedures, Functions and Packages
  • Expert in unit testing, system integration testing, implementation and maintenance of database jobs.
  • Extensively worked on Job Sequences to control the flow of job execution using various activities like Job Activity, email Notification, Sequencer, Routine activity and Exec Command
  • Excellent problem-solving and trouble-shooting capabilities. Quick learner, highly motivated, result-oriented and an enthusiastic team player. Good interpersonal skills, experience in handling communication and interactions between different teams

TECHNICAL SKILLS:

ETL Packages: Confidential InfoSphere/WebSphere DataStage and Quality Stage 11.5, 11.3, 9.1, 8.7, 8.5, 8.0.1, Ascential DataStage 11.5/9.1 (Designer, Director, Manager, Administrator), Information Analyzer/Profile Stage, Quality Stage, Audit Stage

Big Data Tools: Hive, HBase, Pig, MapReduce, Sqoop, Spark

Databases: Teradata 13.11 client, MS Access, Oracle 10g/11g, MS SQL Server 2005/2008, DB2 UDB 8.1/7.2, OBIEE 10g/11g, MDM (Oracle CDH)/ Oracle Apps 11i/R12 (OM & AR modules)

Database Tools: SQL*Plus, SQL*Loader, Toad, Autosys, Zena, Serena

Database Modeling: Anchor Modeling, Star Schema Modeling, Snowflake Modeling, Integrated Schema, E-R Modeling, Fact and Dimension Tables, Microsoft Visio, ERWIN 4/7.1/8

Operating Systems: Windows NT / 2000 / XP Pro / Vista, 7, Windows Server 2000 / 2003 / 2008, UNIX - Solaris/AIX/UX, Linux - Mint, Ubuntu

Languages: SQL, PL/SQL, UNIX Shell scripting, Python

PROFESSIONAL EXPERIENCE:

BigData Consultant

Confidential, Richardson, TX

Responsibilities:

  • Handled testing, integration, data migration and implementation of customer-oriented business applications, including creating the test plan and strategy document, critical scenarios, test scripts and the testing schedule.
  • Worked with technical designers and business analysts to understand the requirements for a test environment to set up ETL.
  • Built automated Teradata and Hive scripts for source-to-target validation and generated reports.
  • Extensively used Unix Commands to perform Data Validation.
  • Worked on Claims, Pharmacy, Membership and Provider data.
  • Built automated Hive and Spark scripts to validate ingested data against consumption data from multiple sources in the Data Lake and Teradata.
  • Performed source-to-target validations on Hive tables, delimited text files, fixed-width files, XML files and JSON files according to the transformation rules in the mapping documents.
  • Prepared Spark scripts to load data from Hive to Teradata and Teradata to Hive for data analysis and data quality checks.
  • Migrated DataStage jobs from version 8.5 to version 11.5 using the DataStage Connector Migration tool.
  • Performed root cause analysis to identify bugs, created defects in JIRA and assigned them to the appropriate resource.
  • Uploaded and automated all test cases in qTest for reporting purposes.
  • Created analysis and metric reports and published them to the team.

Environment: Hadoop/Big Data, Teradata, Spark, Hive, JIRA, Teradata SQL Assistant, DataStage 11.5.

BigData Tester

Confidential, Austin, TX

Responsibilities:

  • Experience in dealing with Apache Hadoop components like HDFS, MapReduce, HiveQL, HBase, Pig, Hive, Sqoop and Spark.
  • Excellent understanding and knowledge of NoSQL databases like HBase and Cassandra.
  • Experience in writing HiveQL queries to store processed data into Hive tables for data analysis and data quality checks.
  • Performed data validation on millions of records in Hive and Pig.
  • Imported and tested data from HBase using Pig.
  • Extensive experience in ETL Architecture, Development, Enhancement, Maintenance, Production Support, Data Modeling, Data Profiling and Reporting, including business and system requirements gathering.
  • Hands-on experience in shell scripting. Knowledge of cloud services Confidential Cloud.
  • Proficient in using RDBMS concepts with Oracle, SQL Server and MySQL.
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Skilled in leadership, self-motivated and able to work in a team effectively. Possess excellent communication and analytical skills along with a can-do attitude.
  • Strong work ethics with the desire to succeed and make significant contributions to the organization. Experience in processing different file formats like XML, JSON and sequence file formats.
  • Good experience in creating Business Intelligence solutions and designing ETL workflows using Tableau.
  • Hands-on experience in Object-Oriented Analysis and Design (OOAD), and development of software using UML Methodology.
  • Exposure to Java development projects.
  • Hands-on experience in database design using PL/SQL to write Stored Procedures, Functions, Triggers and strong experience in writing complex queries using Oracle, DB2 and MySQL.
  • Good working experience on different operating systems, including UNIX/Linux, Apple Mac OS X and Windows.

Environment: Hadoop/Big Data, Teradata, MapReduce, Spark, Pig, Hive, JIRA, and Teradata SQL Assistant.

Confidential, Detroit, MI

Hadoop Tester

Responsibilities:

  • Worked closely with the Data Modeler team to produce a proper mapping document and ensured the transformation rules needed to complete testing were in place.
  • Built Source tables with Complex queries to meet business requirements
  • Extensively used Hive scripts to validate data in ingestion to ensure data quality.
  • Validated source to target Java code to parse JSON files to delimited text files.
  • Extensively worked on Hive tables with multiple sources and multiple target tables containing array and struct columns.
  • Expertise in creating various business-related test cases by mocking up the test data when the data is unavailable for all the scenarios.
  • Validated Pig UDFs for data transformations as per the business rules.
  • Validated UNIX script to automate the process of loading membership, HR data into HBase.
  • Validated Audit Control Framework on Membership, HR data that are received from different systems to ensure the data landed properly on the Data Lake.
  • Validated secondary indexes designed and implemented for querying range-based columns in HBase to reduce the response time drastically.
  • Used Spark to enrich and transform data into internal data models powering search, data visualization and analytics.
  • Created test scenarios and test cases, executed the test cases and thoroughly documented the process.
  • Worked with the systems engineering team to deploy and test new Hadoop environments and expand the existing Hadoop cluster architecture from RAW to Data Lake.

Environment: Hadoop/Big Data Ecosystems with HDFS, MapReduce, Spark, Pig, Hive, HBase, Teradata SQL Assistant, JIRA, Hue, Apache Phoenix.

Confidential

QA Tester (DataStage migration project)

Responsibilities:

  • Involved with Application and Database Support Teams to maintain system and database which supports current data scenarios and easily adapts to changing business needs.
  • Created efficient, reusable DataStage components.
  • Used Parallel Extender extensively by using Processing, Development/Debug, File and Real Time Stages.
  • Involved in all phases including Requirement Analysis, Design, Coding, Testing, Support and Documentation
  • Designed the ETL jobs using Confidential InfoSphere DataStage 9.1 and 11.3 to Extract, Transform and Load the data into staging, ODS and EDW.
  • Analyzed business requirements, documented business requirements specifications, and wrote Test Plans and Test Cases.
  • Worked with other teams to understand technical design and architecture for test planning.
  • Used different types of DataStage stages like Modify, Sequential File, Copy, Aggregator, Surrogate Key, Transformer, Dataset, Lookup, Join, Remove Duplicates, CFF, Sort, Column Generator, CDC and Funnel.
  • Extracted data from Heterogeneous source systems like Oracle, Teradata, SQL Server and flat files.
  • Developed processes on both Oracle and Teradata using shell scripting and RDBMS utilities such as MultiLoad, FastLoad, FastExport, BTEQ (Teradata) and SQL*Plus, SQL*Loader (Oracle).
  • Developed MultiLoad and FastLoad scripts to load data from flat files into the Teradata staging area.
  • Worked on error handling in Teradata SQL scripts using ERRORCODE and ACTIVITYCOUNT.
  • Worked with service manager to review schedules, objectives and priorities and also initiated suggestions for improvement to DataStage jobs.
  • Used database stages like DB2/UDB Enterprise, DB2 Bulk Load, DRS, Sybase and ODBC Connector.
  • Involved in performance fine-tuning of ETL programs; tuned DataStage jobs and stored procedure code. Worked on a migration project from DataStage version 9.1 to 11.3 and was responsible for moving all the code from Dev to QA and then to Prod.
  • Worked with Data Modelers, Technical Architects, Customer/end user, Business analysts, and Data analysts to design Technical Specification documents / Mapping documents.
  • Extensively tested all the developed jobs based on the requirements of the project and deployed the code into production environment.
  • Used Quality Stage plug-in stage to call Quality Stage jobs into the DataStage Designer.
  • Extensively implemented Slowly Changing Dimension (SCD) types.
  • Created and used Shared Containers to reuse the logic in another Job.
  • Developed staging and Data Mart DataStage jobs using DataStage Designer in a parallel environment.
  • Used Cognos to generate reports.
  • Improved job sequences for the designed jobs using Exec Command, Job Activity, Triggers and E-mail Notification activities.
  • Experience in writing different types of complex SQL queries for validating the data between different stages within the data warehouse.
  • Very good team player with excellent communication skills.

Environment: Confidential InfoSphere DataStage 11.3, DataStage Administrator, DB2 z/OS, MS SQL Server 2008, Oracle 11g/12c, Rapid SQL 8.1.1, HP Quality Center 11/10, Mainframes, Autosys, OLAP, OLTP, Cognos.

Confidential

DataStage Consultant

Responsibilities:

  • Designed Parallel jobs using various stages like Join, Remove Duplicates, FTP, Filter, Dataset, Lookup File Set, Modify, Transformer, ODBC and Funnel
  • Extensively used TOAD for analyzing data, writing SQL and PL/SQL scripts, and performing DDL operations
  • Used Quality Stage for various data cleansing stages to get complete visibility of the actual condition of data, to reformat data from multiple systems to ensure that the data has the correct specified content and format, and to ensure that the best available data survives and is correctly prepared for the target
  • Used Quality Stage components like Match Designer for designing and testing match passes
  • Used Information Analyzer to automate source data analysis: profiled samples of data to determine their quality and structure, which expedited comprehensive data profiling and minimized overall costs and resources for critical data integration projects
  • Used development/debug stages to test the environment by creating samples of data from given high volume data or by creating mock data
  • Ensured timely delivery of work items to the Client
  • Involved in Implementing ETL standards and Best practices within our portfolio
  • Used the DataStage Designer to develop processes for extracting, cleansing, transforming, integrating, and loading data into data warehouse database
  • Tuned DataStage transformations and jobs to enhance their performance.
  • Developed jobs using different types of stages -- Sequential File, Transformer, Aggregator, Merge, Link Partitioner, Link Collector and Hashed File
  • Performed lookups for faster access of data
  • Used DataStage Designer for importing metadata from repository, new job categories and creating new data elements
  • Extensively used Autosys and DataStage Director for Job Scheduling, Emailing production support for Troubleshooting from LOG files
  • Exported the project from Development to Test environment using DataStage Manager.
  • Developed parallel Jobs in DataStage Designer to Extract data from the Sources Oracle and Complex Flat Files, cleanse it, transform it by applying business rules, stage it in Data marts and Load it (Initial/Incremental) into the Target DWH Teradata
  • Worked on the code fixes and on the tickets raised due to the job failures
  • Provided 24/7 support on rotation basis
  • Supported unit, integration, and end user testing by resolving identified defects

Environment: Confidential WebSphere DataStage 8.5/7.5.3 (Designer, Director, Administrator, Manager), Quality Stage, Audit Stage, Information Analyzer/Profile Stage, Microsoft Visio, Oracle OBIEE 10g, Teradata, UNIX AIX 6.1, Windows Server 2003, SQL*Loader, Toad, Autosys, SQL, PL/SQL, Oracle SQL*Plus, UNIX Shell Scripting, ERWIN 8
