
Sr. Big Data Lead Engineer Resume


  • Senior Lead Data Engineer with 14 years’ experience designing and building big data analytics applications and pipelines, including developing and deploying enterprise-level ETL solutions for data warehouse applications and data visualization dashboards


  • PowerShell and shell programming, T-SQL
  • Spark RDDs, DataFrames, Spark contexts, UDFs, and notebooks
  • Azure Data Factory, Data Lake, Storage
  • Linux and Windows Server platforms
  • Hive, Beeline, Ambari, Azure HDInsight, Data Lake, Blob Storage
  • Databricks, PySpark, JSON, Parquet
  • Power BI, ADF, U-SQL
  • Python
  • Azure DevOps, Git
  • Salesforce Marketing Cloud, data extracts and campaign data
  • Webhooks
  • Data warehouse modeling, SQL DWH, PolyBase
  • Build releases & integration
  • Compute and scale application processing





  • Databricks & ADF: Develop solutions for data pipeline flow for inventory, loyalty, POS data marts
  • Handle Agile user stories and sprint planning
  • Continuous Integration and Continuous Delivery (CI/CD), Test Automation, Infrastructure as Code, Secure Coding Practices
  • Build, test, and release software SDKs and UDF utilities; maintain best practices, source code, and the Git repository
  • Use Azure Databricks and DataFrames for transformation; load Parquet and SQL tables; provide Hive external tables for data scientists
  • Created the BI data layer for building the warehouse. Developed dashboards for sales analytics and the financial data hub
  • Created Hive/Parquet tables using Hive and SQL contexts
  • Define facts and dimensions; process ETL loads for SQL DW data marts
  • Build data extension extracts from Salesforce Marketing Cloud, including data extracts and campaign data
  • Migrated several on-premises solutions to the Azure cloud, including infrastructure and network cloud integration (IaaS)
  • Technical experience with cloud and hybrid infrastructures, architecture design, migrations, and technology management at enterprise scale
  • Experience with scalable architectures using Azure App Service, API management, serverless technologies

Languages and Technologies: Azure Data Factory, Web API, VSTF/TFS, Git, Databricks, Azure Functions, Azure SQL PolyBase, Blob, Data Lake, U-SQL, Logic Apps, Event Hubs, Azure Key Vault, and Python




  • Data warehouse and data lake analytics: Built the MS finance analytics warehouse using Azure services such as Data Lake Storage, U-SQL, and PowerShell to deliver MS product sales insights and KPIs, ingesting data from multiple source areas with ADF and Spark/Python
  • Surface Device Telemetry & Analytics: Prepared Asimov data for Surface devices. Metrics are stored in an Azure database. Data is processed and loaded using Cosmos scripts and Xflow configuration. Wrote DAX queries to derive datasets and built Power BI dashboards to visualize them
  • Web Analytics: The Office Max team supports the Office product website through web analytics; metrics such as page searches and legitimate page visits help guide development of the site. Data is tracked via Fiddler events in the COSMOS cluster. Developed and built ETL pipelines using SSIS and stored procedures to track the history of those events. Once the data was loaded to the SQL data warehouse, built a Tabular cube for analyzing trends
  • Device Configuration: Used PowerShell scripts to re-home store devices, including installing and configuring Windows and database tools, for use by the RISK analytics application

Technologies: Azure Data Factory (ADF), ETL, data warehouse, Python, SSAS, Power BI, Web API, COSMOS, VSTF/TFS, Git, SQL Server 2012/2014, Blob, Data Lake, U-SQL, Logic Apps, Event Hubs




  • MSG Campaign Tools: Designed and developed a tool, the OSCAR database, for tracking machine vulnerabilities; it uses Global Foundation Services (GFS) scanned systems and inventory portals for regular tracking and email campaigns. The project includes a front end for campaigns, a data warehouse middle tier, and an SSRS-based reporting dashboard
  • ETL Architecture: Developed a source-to-target data mapping (STDM) document defining transformation logic for SAP data using SSIS and SQL stored procedures. Translated business logic into transformation logic to generate pseudocode for the ETL process
  • Developed the Online Product Approval (OPA) application database for the Product Lifecycle Management (PLM) system used to manage the approval lifecycle for Disney, Marvel, and ESPN consumer products

Technologies: SQL Server 2010, T-SQL, SSIS, data warehouse, SSAS, SSRS, PowerShell
