Data Engineer Resume

SUMMARY

Over 15 years of professional experience in Data Engineering, Data Analytics, Business Intelligence and Software Development using Open Source, Microsoft (Azure) and Amazon Web Services (AWS) technologies.
Industry-standard expertise in both MPP and OLAP cube-based large-scale Data Warehousing methodologies, ETL processing, and Fact and Dimensional Data Modeling
Successfully delivered Big Data & Business Intelligence solutions for Fortune 500 companies in Healthcare, Telecom, Fintech, Supply Chain, and other industry verticals.
Experience with both on-premises and Cloud Data Integrations
Strong understanding of MuleSoft APIs; led platform architecture
Hands-on experience with MuleSoft components, Mule Expression Language (MEL) workflows, Anypoint Studio, Enterprise Service Bus (ESB), API Manager, RAML, REST and SOAP
Exposure to MuleSoft CloudHub and on-premises deployment models
Extensive knowledge of Amazon Redshift, with real-world implementation experience in Sort Key, Distribution Key, DISTSTYLE and WLM (Workload Management)
Experience in Database Design, Complex T-SQL Scripts, TDD, In-Memory OLTP, Resource Governor, Indexing, Partitioning, Dynamic Data Masking of MS SQL Server / Azure SQL Database
In-depth understanding of the SQL Server query life cycle, Transaction Manager, Buffer Manager and Lazy Writer
Worked with SQL Server Profiler / Extended Events for query optimization and deadlock detection
Practical experience with PostgreSQL (MVCC, WAL, OOP, Foreign Data Wrapper), MySQL and MariaDB
Demonstrable expertise in building Big Data processing pipelines using Python, Spark (PySpark) and Shell Script, and in handling various data formats (Parquet, CSV, XML, JSON) on both Linux and Windows platforms.
Knowledge of real-time data processing using Kafka and Spark Streaming
Practical experience designing complex SSIS packages and enabling analysis through SSAS in both Tabular and Multi-Dimensional models
Implemented cube and Analysis Services model partitioning and dynamic cube refresh using Analysis Management Objects (AMO) / Tabular Model Scripting Language (TMSL)
Professional expertise in complex ETL development using Informatica
Applies DevOps techniques within an Agile methodology; experienced with tools including GitHub, Jira, DocuSign, JAMA and Box
Developed comprehensive dashboards with Tableau, SSRS, Microsoft Power BI, Business Objects, TIBCO JasperSoft, QlikSense and QlikView
Integrated reports and dashboards into web portals using HTML, CSS and JavaScript (AngularJS)
Implemented Predictive Analytics/ Machine Learning using Azure Machine Learning (Azure ML) Studio, AWS ML & Python
Hands-on experience with the Hadoop ecosystem (Spark, HDFS, YARN, Zookeeper)
Excellent communication and interpersonal skills.

TECHNICAL SKILLS

DATABASE: SQL Server, Azure SQL Database, PostgreSQL, MySQL

DATA WAREHOUSE: Redshift, Azure Synapse, SSAS (Tabular, Multi-Dimensional), SSIS

BIG DATA: Spark, HDFS, Python

ETL/API: MuleSoft, Anypoint Studio, Mule (ESB), Mule API Manager, Mule Runtime Manager

DATA VISUALIZATION: SSRS, Power BI, JasperSoft, QlikSense, QlikView

MACHINE LEARNING: Azure ML, AWS ML, Jupyter Notebook

DEVOPS: JAMA, Jira, DocuSign, Box, Bitbucket

OTHERS: Windows, Linux, Shell Script, SQL Server Management Studio (SSMS)

PROFESSIONAL EXPERIENCE

Confidential

Data Engineer

Responsibilities:

Designed and developed a custom application for a Big Data ingestion platform for the clinical trial industry using MuleSoft 4, defining cloud data integration, API scope and specifications
Built a software application that collects data from SQL Server and flat files through a complex MuleSoft ETL process and stores it in Azure SQL Data Warehouse, Amazon Redshift and Azure SQL Database
Built complex Mule flows using Scopes, Error Handling, Message Filters, Validation, DataWeave, Payload Transformation, Scatter-Gather, Flow Reference, Message Enricher and Flow Controls in Anypoint Studio.
Used Anypoint Runtime Manager for deployment and scheduling. Developed reusable flows for complex data processing
Created REST APIs to expose data to sponsors using MuleSoft, C#, ASP.NET Web API, HTML, CSS and JavaScript
Built data pipelines and complex ETL to process external client data using Python and Spark
Built dashboards using QlikView and QlikSense and integrated them into the portal (Mashup and Dev Hub)
Built an interactive, dynamic action-trigger platform using Python and Azure ML to notify trial sites of potentially abnormal behavior
Developed a Machine Learning model with Azure Machine Learning Service to flag possible incorrect subject assessments in clinical trials
Developed and deployed measures and KPIs in Azure Analysis Services. Refreshed Azure Analysis Services models through REST API calls from MuleSoft
Designed unit test plans and test scripts to ensure data correctness in the Data Warehouse and ETL application
Managed, maintained and optimized SQL Server and PostgreSQL in Windows and Linux environments

Environment: SQL Server, Azure SQL Data Warehouse, Azure Analysis Services, Azure Blob Storage, MuleSoft, QlikView, QlikSense, Azure Machine Learning Studio, Python, C#, Spark, Linux, Jupyter Notebook, SQL Server Management Studio (SSMS), Anypoint Studio, Anypoint Runtime Manager, Anypoint API Manager, DataWeave

Confidential

Sr. Database Engineer

Responsibilities:

Created and maintained an optimal distributed data pipeline architecture
Developed a robust Change Data Capture (CDC) solution using Kafka, Debezium and Shell Script
Consumed CDC data using Spark and saved it as Parquet in HDFS for later analysis
Processed and transformed the Parquet data to build the Data Warehouse
Implemented visualization tools such as Metabase and Tableau on top of the data pipeline to provide actionable insights and key business performance metrics.
Administered, optimized and ensured availability of PostgreSQL and MariaDB

Environment: Kafka, PostgreSQL, MariaDB, Debezium, Spark, PySpark, Python, Metabase, HDFS, Tableau, Shell Script, Linux

Confidential

Principal Data Engineer

Responsibilities:

Built an AWS cloud-based data ingestion platform that collects, transforms and loads data from 5 different data sources using SSIS and MuleSoft and ingests it into Amazon Redshift
Troubleshot Mule ESB, including working with debuggers, flow analyzers and configuration tools
Routed and orchestrated JSON and XML message payloads in MuleSoft
Developed Mule services to connect both internal and cloud SOAP and REST web services.
Designed and developed a robust data warehouse architecture and visualizations in Power BI and TIBCO JasperSoft
Managed and maintained production databases, mostly in SQL Server; implemented TDE, High Availability, Backup and Recovery, and partitioning of large tables in existing production databases
Designed interactive Network Signal Monitoring dashboards that drill down from a high-level Google Maps overview to street view
Designed and built ML models using the AWS Machine Learning service

Environment: SQL Server, SSIS, Microsoft Power BI, TIBCO JasperSoft, Amazon Redshift, AWS Machine Learning, MuleSoft, Mule ESB, SOAP, REST
