
Data Engineer Resume

SUMMARY:

  • 12 years of IT experience in the In-Flight Entertainment & Connectivity (IFEC), Finance, Retail, Banking, Insurance, Industrial, Manufacturing and Health Care (US) domains.
  • Over 12 years of progressive IT experience in analysis, design, development, implementation and testing of software applications using Big Data and ETL-based technologies.
  • Full product life cycle (SDLC) experience covering system analysis, technical design, development, testing, deployment and support of medium to large-scale business applications using Agile Scrum and iterative development methodologies.
  • Experience designing and building Big Data, Data Science and data transformation implementations in the Healthcare and Financial industries.
  • Experience in the end-to-end design and build of near-real-time and batch data pipelines.
  • Skilled with the Spark framework for both batch and real-time data processing.
  • Worked on integrating Spark Streaming with Kafka for real-time processing of streaming data (a minimal sketch of this pattern follows this summary).
  • Experienced in working with file formats such as Parquet, ORC, SequenceFile, text, XML, XLSX, CSV, JSON and Avro.
  • Experience with build management tools such as Maven, Gradle and sbt, along with continuous integration tools such as Jenkins.
  • Experience designing and building the data management lifecycle covering data ingestion, integration, consumption and delivery, together with reporting, analytics and system integration.
  • Experience in importing and exporting data between relational database systems (RDBMS) and HDFS using Sqoop.
  • Experienced in writing Ad-hoc queries for moving data between HDFS and Hive and analyzing the data using HQL.
  • Experienced in writing and extending user-defined functions (UDFs) in Spark SQL and Hive, and in optimization and performance tuning of Spark/Hive applications.
  • Experienced in building data pipelines for analyzing structured and unstructured data using tools such as HDFS, Hive, HBase, Pig, Spark, Kafka, Scala, Control-M and StreamSets ETL.
  • Experienced in fetching data into the Hadoop data lake from databases such as Sybase, SQL Server, DB2, Oracle and Teradata using Sqoop.
  • Thorough knowledge of search engines such as AWS Elasticsearch and Apache Solr and their capabilities.
  • Knowledge of AWS components such as EC2, SQS, SNS, Kinesis, Firehose, CloudFormation, Elastic Beanstalk, Redshift, RDS, Lambda and S3.
  • Working knowledge of application and system availability, scalability and distributed data platforms with CPU and GPU high-performance compute clusters.
  • Experienced working with Git and Bitbucket for code repository management and Jenkins for continuous build and deployment automation.
  • Good knowledge of PL/SQL and experience troubleshooting procedures and packages.
  • Verified and validated missing records, duplicate records, nulls and default records as per the design specifications.
  • Verified data at the back-end level for data integrity, including default values, null checks, data cleaning, sampling, scrubbing, aggregation and data merging operations.
  • Worked in Agile Scrum methodologies and participated in daily stand-up meetings. Performed root cause analysis of defects and worked with the developers, BSAs and business to resolve issues arising out of the testing process.
  • Contributed to defect management throughout various phases of testing and re-testing; facilitated defect review meetings covering defect/bug prioritization, planning, effort estimates and status updates, working closely with the project manager to keep project releases on track.
  • Strong at root cause analysis and fast defect resolution, thereby contributing to productivity improvement in the project.
  • Demonstrated ability to handle multiple tasks both by working independently and in a team.
  • Ability to mentor and provide knowledge transfer to team members, support teams and customers.
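
As a concrete illustration of the Spark Streaming and Kafka integration mentioned above, the fragment below is a minimal PySpark Structured Streaming sketch. The broker address, topic name, payload schema and HDFS paths are illustrative assumptions rather than actual project configuration, and the job requires the spark-sql-kafka package on the classpath.

    # Minimal sketch: consume a Kafka topic with Spark Structured Streaming and
    # land the parsed events in HDFS as Parquet. All names below are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

    # Assumed JSON layout of the events on the topic.
    event_schema = StructType([
        StructField("event_id", StringType()),
        StructField("event_type", StringType()),
        StructField("amount", DoubleType()),
    ])

    # Read the raw Kafka stream; the value column arrives as bytes.
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
           .option("subscribe", "events")                      # placeholder topic
           .load())

    # Parse the JSON payload and keep only the event columns.
    events = (raw.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), event_schema).alias("e"))
              .select("e.*"))

    # Write the stream to HDFS; the checkpoint lets the query resume its offsets.
    query = (events.writeStream
             .format("parquet")
             .option("path", "/data/lake/events")
             .option("checkpointLocation", "/data/checkpoints/events")
             .outputMode("append")
             .start())

    query.awaitTermination()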

INTERPERSONAL COMPETENCIES:

  • Ability to interact successfully with multiple teams across the global organization, including services and support for all regions.
  • Strong mathematical and analytical background with innovative thinking and creative action.
  • Sound technical skills, quick grasp of business flows, and a self-learning mindset.
  • Value-adding attitude.
  • Zeal to learn new technologies.

TECHNICAL SKILLS:

Operating Systems: Windows 10/8, Windows XP/2000/2007, Unix, Linux Red Hat, Mainframe & AS400, Ubuntu

Database: Oracle, DB2, SQL Server, MySQL, NoSQL (MongoDB), HBase

Big Data Ecosystems: HDFS, MapReduce, HBase, Hive, Pig, Sqoop, Oozie, Phoenix, Flume, Cassandra, Apache NiFi, StreamSets, Kafka, Impala, ZooKeeper, MongoDB

DWH Tools: Ab Initio, Informatica, Teradata, DataStage, Cognos, QlikView

Languages: Java, Ruby, Python, SQL, JCL, Scala, Shell Script

Testing Tools: HP LoadRunner 12.x

Mainframe Tools: TSO/ISPF, File-AID, SPUFI, EZCOPY, QMF, IMS, VSAM

Other Tools: AutoSys, Teradata SQL Assistant, SQL Developer, IBM BPM, Aqua Data Studio, TOAD, PuTTY, WinSCP, SoapUI, UML, MS Project, MS Visio

WORK EXPERIENCE:

Confidential

Data Engineer

Environment: Python, HDFS, Hive, HBase, Spark, KAFKA, Shell Script, Scala, SQL, VersionOne

Responsibilities:

  • Responsible for creating data stores and datasets in the lake and then creating Spark and Hive refiners.
  • Used the Sqoop/TDCH connector import and export functionality to handle large data set transfers between the Teradata database and HDFS.
  • Involved in developing Spark applications to perform ETL-style operations on the data (a minimal sketch of such a refiner follows this list).
  • Created reconciliation jobs for validating data between source and lake.
  • Worked on Hive partitioning and bucketing concepts and created external and internal Hive tables with partitions.
  • Worked on developing applications with Hadoop Big Data technologies: Pig, Hive, Oozie, Kafka and PySpark.
  • Involved in the data loading process into HDFS and used PySpark to reprocess data.
  • Regularly tuned Hive and Pig query performance to improve data processing and retrieval.
  • Worked with Spark RDDs, DataFrames, Spark SQL APIs, accumulators and broadcast variables.
  • Implemented StreamSets pipelines/topologies to perform cleansing operations before moving data into HDFS.
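
A minimal sketch of the kind of Spark refiner referenced in this list: read a raw dataset from the lake, apply simple cleansing with the DataFrame API, and write the result into a partitioned Hive table. The paths, database, table and column names are hypothetical placeholders.

    # Minimal sketch of a Spark/Hive refiner; all names are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("spark-hive-refiner-sketch")
             .enableHiveSupport()   # required to create/write Hive tables
             .getOrCreate())

    # Raw dataset previously landed in the lake (placeholder path and format).
    raw = spark.read.parquet("/data/lake/raw/transactions")

    # Simple cleansing/derivation: trim the key, derive a partition column,
    # and drop duplicate transactions.
    refined = (raw
               .withColumn("account_id", F.trim(F.col("account_id")))
               .withColumn("load_date", F.to_date(F.col("event_ts")))
               .dropDuplicates(["transaction_id"]))

    # Persist as a partitioned Hive table for downstream consumption.
    (refined.write
     .mode("overwrite")
     .partitionBy("load_date")
     .format("parquet")
     .saveAsTable("analytics.transactions_refined"))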

Confidential

Big Data Engineer

Environment: Python, HDFS, Hive, HBase, Spark, Shell Script, Streamsets, SQL, Oozie, Bit Bucket, Bamboo, JIRA

Responsibilities:

  • Interacted with the infrastructure, admin, database, DQ and BA teams to ensure data quality and availability.
  • Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop suitable programs.
  • Responsible for creating data stores and datasets in the lake and then creating Spark and Hive refiners to implement the existing SQL stored procedures.
  • Implemented the Big Data Fabric data lake, which provides a platform to manage data in a central location so that anyone in the firm can rapidly query, analyze or refine the data in a standard way.
  • Involved in moving legacy data from RDBMS, mainframe, Teradata and external source system data warehouses to the Hadoop data lake and migrating the data processing to the lake.
  • Created Hive tables, loading and analyzing data using Hive scripts; implemented partitions, dynamic partitions and buckets in Hive.
  • Worked on creating views for Data Exchange and other downstream teams by masking proprietary fields, PHI columns and other sensitive information in the respective databases based on the mapping document (see the masking sketch after this list).
  • Involved in developing Spark applications to perform ETL-style operations on the data.
  • Used the Sqoop/TDCH connector import and export functionality to handle large data set transfers between the Teradata database and HDFS.
  • Implemented StreamSets pipelines/topologies to perform cleansing operations before moving data into HDFS.
  • Involved in story-driven agile development methodology and actively participated in Agile Scrum ceremonies (grooming, planning and daily stand-up meetings).
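
A minimal illustration of the masked-view approach referenced in this list, expressed as Spark SQL over Hive tables. The database, view and column names are hypothetical; the actual masking rules were defined in the project's mapping document.

    # Minimal sketch of a masked view for downstream consumers; names are placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("masked-view-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # The view hashes the member identifier so downstream teams can still join on it,
    # and simply omits PHI and other sensitive columns.
    spark.sql("""
        CREATE OR REPLACE VIEW analytics.claims_masked AS
        SELECT
            sha2(member_id, 256) AS member_key,   -- masked join key
            claim_id,
            claim_amount,
            service_date
            -- ssn, dob and similar PHI columns are intentionally omitted
        FROM analytics.claims
    """)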

Confidential, Melbourne, FL

Big Data Engineer

Environment: MapReduce, HDFS, Hive, HBase, Spark, Shell Script, NIFI, Apache Phoenix, Cassandra, Flume, SQL, Oozie, Bamboo, JIRA, JAMA, SQL Server

Responsibilities:

  • Responsible for creating data stores and datasets in the lake and then creating Spark and Hive refiners.
  • Involved in developing Spark applications to perform ETL-style operations on the data.
  • Created reconciliation jobs for validating data between source and lake (a minimal sketch follows this list).
  • Worked on Hive partitioning and bucketing concepts and created external and internal Hive tables with partitions.
  • Worked on developing applications with Hadoop Big Data technologies: Pig, Hive, Oozie, Kafka and PySpark.
  • Involved in the data loading process into HDFS and used PySpark to reprocess data.
  • Regularly tuned Hive and Pig query performance to improve data processing and retrieval.
  • Worked with Spark RDDs, DataFrames, Spark SQL APIs, accumulators and broadcast variables.
  • Implemented StreamSets pipelines/topologies to perform cleansing operations before moving data into HDFS.
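
A minimal sketch of a source-to-lake reconciliation job of the kind referenced in this list: compare a row count and a simple hash total between the landed source extract and the refined lake table. The file path, table name and amount column are illustrative assumptions.

    # Minimal reconciliation sketch; all names are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("source-lake-reconciliation-sketch")
             .enableHiveSupport()
             .getOrCreate())

    source = spark.read.option("header", "true").csv("/data/landing/orders.csv")
    lake = spark.table("analytics.orders_refined")

    def profile(df, amount_col="order_amount"):
        """Row count plus a hash total over the amount column."""
        return df.agg(
            F.count("*").alias("row_count"),
            F.sum(F.col(amount_col).cast("decimal(18,2)")).alias("hash_total"),
        ).collect()[0]

    src, tgt = profile(source), profile(lake)

    # Fail the job if either measure disagrees between source and lake.
    if (src["row_count"], src["hash_total"]) != (tgt["row_count"], tgt["hash_total"]):
        raise RuntimeError(f"Reconciliation failed: source={src.asDict()} lake={tgt.asDict()}")
    print("Reconciliation passed:", src.asDict())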

Confidential, Reston, VA

QA Lead - Hadoop/Automation (Cucumber)

Environment: MapReduce, HDFS, Hive, HBase, Spark, Shell Script, SQL, Oozie, Autosys, Jenkins

Responsibilities:

  • Involved in test planning; prepared a traceability matrix for test coverage and high-level business scenarios.
  • Involved in automation testing (Cucumber) of REST APIs for the existing regression test suite.
  • Involved in performance testing (JMeter) of REST APIs.
  • Worked in an Agile Scrum (Kanban) environment and participated in Agile Scrum meetings (grooming, planning and daily stand-up meetings).
  • Worked in the CI/CD process and was involved in build deployments and validation for each build release using Jenkins.
  • Validation of input files in Landing area.
  • Validation of Dataflow from landing area to HIVE tables.
  • Validation of Incremental extracts.
  • Validation of batch functionality for any jobs failure or partial success by simulating the changes in input files.
  • Validation of Autosys Jobs
  • Validation of email notification(EMM) to source system on batch/job failures.
  • Validation of folder structure of files in Hadoop clusters (HDFS).
  • Validation of audit logs, record counts and hash totals between source and Hadoop.
  • Simulation of scenarios to check the restartability of batch jobs, logs, error log tables and reject logs.
  • Validation of data completeness and accuracy (a minimal sketch of such checks follows this list).
  • Validation of transformation logic as per mapping.
  • Validation of batch functionality for any job failures.
  • Involved in System Testing, System Integration Testing.
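
The completeness and accuracy checks above were automated in the project's Cucumber and Jenkins setup; the fragment below is only an illustrative PySpark stand-in showing the kind of duplicate and null validations involved. Table, key and column names are hypothetical.

    # Minimal data-quality sketch (duplicates and nulls); names are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("dq-validation-sketch")
             .enableHiveSupport()
             .getOrCreate())

    df = spark.table("stage.customer")      # placeholder Hive table
    key_cols = ["customer_id"]              # placeholder business key

    # Duplicate check: any business key appearing more than once is a defect.
    duplicates = df.groupBy(*key_cols).count().filter(F.col("count") > 1)

    # Null check on mandatory columns (placeholder column list).
    mandatory = ["customer_id", "first_name", "country_code"]
    null_counts = df.select(
        [F.sum(F.col(c).isNull().cast("int")).alias(c) for c in mandatory]
    )

    print("duplicate keys:", duplicates.count())
    null_counts.show()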

Confidential, Atlanta, GA

QA Lead - Hadoop/ETL

Environment: MapReduce, HDFS, Hive, Java, Shell Script, SQL, Sqoop

Responsibilities:

  • Involved in test planning; prepared a traceability matrix for test coverage and high-level business scenarios.
  • Understanding of Big Data Hadoop architecture, EDW data model and scope of the system.
  • Validation of input files in Landing area.
  • Validation of Dataflow from landing area to HDFS.
  • Validation of Incremental extracts.
  • Validation of batch functionality for any jobs failure or partial success by simulating the changes in input files.
  • Validation of email notification to source system on batch/job failures.
  • Validation of folder structure of files in Hadoop clusters (HDFS).
  • Validation of audit logs, record counts and hash totals between source and Hadoop.
  • Simulation of scenarios to check the restartability of batch jobs, logs, error log tables and reject logs.
  • Validation of data completeness and accuracy.
  • Validation of transformation logic as per mapping.
  • Validation of batch functionality for any job failures.
  • Validation for data loss and duplicate data.
  • Involved in System Testing, System Integration Testing.
  • Sending daily/weekly status reports to the client and arranging defect tracking calls with all stakeholders.

Confidential

Test Lead - ETL/BI

Environment: HP Quality Center, AbInitio, QlikView, MapReduce, HDFS, Hive, Java, Shell Script, SQL, Sqoop

Responsibilities:

  • Ran the ETL plans and tests provided by the dev team in the UNIX environment to load data into the database.
  • Validated the data loaded into the target as per the business requirements using SQL queries.
  • Prepared test data as per the business requirements and business rules given by the client.
  • Managed and reported on progress for own and team deliverables.
  • Reported in the supplier's weekly test team meeting on progress, risks and issues.
  • Responsible for developing Test Scenarios, Test data and Test cases from Functional documents.
  • Executed test cases using UNIX environment and used Cognos for executing reporting test cases.
  • Attended and arranged requirements meetings, defect calls and status calls.
  • Acted as coordinator to bridge functional gaps in deliverables.
  • Coordinated and communicated effectively with the client on a regular basis to provide estimates, planning updates and reports.
  • Provided estimations for functional requirements and handled all STLC phases individually.
  • Prepared Traceability Matrix, Detailed Test Plan.
  • Analyzed the user/business requirement and functional specification documents and created test scripts in Quality Center.
  • Validated ETL jobs and troubleshot SQL procedures based on errors displayed in the error log.
  • Checked source data (tables, columns, data types and constraints) against target data (tables, columns, data types and constraints).
  • Wrote queries for data scrubbing, aggregation, merging and cleansing transformation logic.
  • Prepared DOU (Document of Understanding) after completing Test Execution.
  • Mentoring and knowledge transfer to new entrants.
  • Performed performance testing using JMeter.

Confidential, Plano, TX

Sr. Test Analyst

Environment: Oracle RMS, Toad, Teradata SQL Assistant, Informatica, SOAP UI

Responsibilities:

  • Carried out System, Interface and End to End testing.
  • Involved in comparison testing as part of the data migration from the legacy applications into Oracle RMS environment.
  • Involved in web service testing for the web services created as part of the integration.
  • Involved in Test Planning, Test Execution, Test Result Reporting, Status tracking and reporting to the management.
  • Involved in Test Scenario and Test Case Review, Test Results Review and signoff process.
  • Extensively involved in test environment set up and test data identification.
  • Performed System Testing which includes validation of inbound data feed from EBS and other systems.
  • Tested the 40 plus interfaces developed as part of these implementations.
  • Identified End to End test scenarios by identifying the impacted legacy applications and carried out the testing by coordinating with different IT/ Business teams in JCPenney.
  • Performed End to End Test Planning, Data Identification and End to End test execution.
  • Involved in Usability testing of the new screens developed for JCPenney.
  • Organized Daily Defect Triage with the development teams and Business Analyst.
  • Held weekly status meetings to facilitate communication and maximize productivity.
  • Provided Demo to the Business Users after every Sprint.
  • Coordinated UAT testing.

Confidential

Mainframe/ BPM/ETL Tester

Environment: BPM 8.0, DB2, Java, HP Quality Center, Informatica

Responsibilities:

  • Successfully delivered the R1 release at Confidential.
  • Attended and arranged requirements meetings, defect calls and status calls.
  • Acted as coordinator to bridge functional gaps in deliverables.
  • Successfully delivered various Work Packages in the Pensions Reform programme during various releases.
  • Successfully delivered WP22.2/23.2 MI Payment (ETL Testing).
  • Leading the team of 3-5 members for Work Packages in Pensions Reform programme during various releases.
  • Coordinated and communicated effectively with the client on a regular basis to provide estimates, planning updates and reports.
  • Handled the role of scrum lead to discuss progress, risks and issues and monitor the health of the WPs on a regular basis.
  • Provided estimations for functional requirements and handled all STLC phases individually.
  • Prepared Traceability Matrix, Detailed Test Plan.
  • Analyzed the user/business requirement and functional specification documents and created test scripts in Quality Center.
  • Prepared presentations for client visits to demonstrate the Pension Reforms Eligibility Calculator and the Pension Reforms case study for the R1, R2 and R3 releases.
  • Created a test condition template useful for drafting test cases.
  • Created and executed test scripts and scenarios to determine optimal system performance according to specifications, and prepared the required test data.
  • Prepared Regression Test cases for Pension Reforms in various releases.
  • Involved in System/System Integration Testing for Eligibility Calculator & Waiting Period.
  • Created a conditional parameterized template in QC which saved time in the test preparation phase.
  • Validating the data as per the business rules from source to target systems.
  • Validating that data transformed correctly from OLTP to OLAP, ensuring the expected data was loaded and that all data was transformed according to design specifications.
  • Execution of plans/batches to load the data into various target systems.
  • Created test data setups for different member statuses (Active, In-force, Opt-Out and Terminated) using Mainframe Online Transaction Processing (CICS), DB2 and other GUI applications.
  • Running JCL for Batch Testing & ACR process to update Deferral Period and for Eligibility Calculator.
  • Verifying system logs using JESMSGLOG, JESSYSLOG and SYSOUT under the status of submitted jobs, and verifying the message queue using MQB.
  • Keeping track of new product requirements and discussing them with the team in a timely manner.
  • Communicating test progress, test results and other relevant information to UK counterparts.
  • Reporting and tracking defects using HP Quality Center.
  • Proactively worked with developers to ensure timely bug resolution.
  • Working closely with the team to perform extensive smoke, functional and regression testing.
  • Involved in peer reviews, weekly status meetings and weekly client status meetings.
  • Prepared DOU (Document of Understanding) after completing Test Execution.

Confidential, Minneapolis, MN

Environment: Mainframe, HP Quality Center, File Aid, QMF, SPUFI & JCL

Sr. Test Engineer - Lead

Responsibilities:

  • Providing estimations for change requests and handling all STLC phases individually.
  • Involved in preparing Traceability Matrix, Test Plan.
  • Analyzed the user/business requirement and functional specification documents and created test cases.
  • Create and execute test cases and scenarios that determine optimal system performance according to specifications, and prepare the required test data.
  • Create different types of pharmacy claims (manual, electronic, paper) and different plan setups with different members depending on the requirement, using Mainframe Online Transaction Processing (CICS).
  • Running JCL for Batch Testing & ACR process to verify claim adjustments.
  • Verifying system logs using JESMSGLOG, JESSYSLOG and SYSOUT under the status of submitted jobs, and verifying the message queue using MQB.
  • Keeping track of new product requirements and discussing them with the team in a timely manner.
  • Communicating test progress, test results and other relevant information to US counterparts.
  • Reporting and tracking defects using HP Quality Center and Rational Clear Quest.
  • Involved in peer reviews, weekly status meetings and weekly client status meetings.

Confidential, Richardson, TX

Mainframe Testing

Environment: Mainframe, DB2 & HP Quality Center

Responsibilities:

  • Providing estimations for change requests and handling all STLC phases individually.
  • Tracking work allocation using Rational Team Concert tool.
  • Involved in preparing Traceability Matrix, Test Plan.
  • Analyzed the user/business requirement and functional specification documents and created test cases.
  • Create and execute test cases and scenarios that determine optimal system performance according to specifications, and prepare the required test data.
  • Create different types of pharmacy claims (manual, electronic, paper) and different plan setups with different members depending on the requirement in the AS400 application.
  • Running PDE process, FIR transaction, R&R transaction to verify claim adjustments.
  • Running EZTEST for batch claims.
  • Keeping track of new product requirements and discussing them with the team in a timely manner.
  • Communicating test progress, test results and other relevant information to US counterparts.
  • Reporting and tracking defects using HP Quality Center and Rational Clear Quest.
  • Proactively worked with developers to ensure timely bug resolution.
  • Working closely with the team to perform extensive smoke, functional and regression testing.
  • Involved in peer reviews, weekly status meetings and weekly client status meetings.
  • Mentoring and knowledge transfer to new entrants.
