Sr. Data Engineer Resume
California
SUMMARY
- 10+ years of experience in SQL, MySQL, BI (SSRS, SSIS, and SSAS), Power BI development, and data warehousing software applications.
- Hands-on experience working with SQL Server 2008 R2/2012/2016 databases, Azure SQL, Azure Data Factory, Databricks, Azure Synapse, and Azure Data Lake.
- Major focus on configuration, SCM, build/release management, Infrastructure as Code (IaC), and Azure DevOps operations in production and cross-platform environments.
- Experienced in DevOps/Agile operations processes and tooling (code review, unit test automation, build and release automation, and environment, incident, and change management).
- Experience in developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
- Experience in writing T-SQL (DDL, DML, and DCL) and creating new database objects such as tables, views, indexes, complex stored procedures, user-defined functions (UDFs), cursors, and triggers; also experienced with locking issues, BCP, common table expressions (CTEs), backups, recovery, SQL Server Agent, and Profiler using SQL Server 2008 R2/2012/2016.
- Extensive knowledge in dealing with RDBMS including E/R Diagrams, Normalization, Constraints, Joins, Keys and Data Import/Export.
- Used the pandas API to put data into time series and tabular formats for easy timestamp data manipulation and retrieval.
- Worked with Python OO Design code for manufacturing quality, monitoring, logging, and debugging code optimization.
- Experience in Hadoop cluster performance tuning by gathering and analyzing the existing infrastructure.
- Managed large datasets using pandas DataFrames and MySQL.
- Proficient in the Extraction, Transformation, and Loading (ETL) process, pulling large volumes of data (VLDB) from various data sources using SSIS.
- Performance tuning in SQL Server using SQL Profiler, DTA (Tuning Advisor), hints, effective use of DMVs/DMFs, optimization for ad hoc workloads, plan guides, parameter sniffing, DBCC commands, partitions and partitioning, bulk loading data (BCP), indexed views and schema binding, and clustered/non-clustered indexes.
- Continually enhancing technical expertise and applying transferable skill sets in SSIS, SSRS, PerformancePoint Server (PPS), and SQL.
- Strong experience in creating, designing, and processing multi-dimensional analysis cubes (configured cubes, dimensions, measures, MDX queries, DAX, data mining, data sources, and data source views, following star and snowflake schemas).
- Expertise in creating Perspectives, Partitions and Design Aggregations in cubes using SSAS.
- Experience working with Azure cloud platform components including ADF, ADL, Azure SQL, and Azure Synapse.
- Experience in analyzing and working on requirement gathering from business.
- Experience working in an offshore/onsite model.
- Knowledge of the Power BI reporting tool and the Azure cloud space.
- Experience working with Excel Power Pivot and Power BI reports.
- Experience with programming languages and platforms such as Java, Node.js, and PHP.
- Hands-on experience in creating scorecards and dashboards using PPS.
- Good knowledge of ETL concepts such as lookup caches, package configurations, and variables.
- Hands-on experience in creating stored procedures, functions, indexes, views, and CTEs.
- Experience in Extraction, Transformation, and Loading of data directly from different heterogeneous source systems like Flat File, CSV, XLS and Relational Database.
- Good problem-solving and analytical skills. Highly motivated professional and team player with the ability to work with, communicate with, and mentor people effectively.
- Experience with Agile methodology and Scrum approaches in the SDLC.
- Experience with HDFS and Hadoop/Hive, and familiarity with big data technologies (HBase, Spark).
- Very good knowledge of PostgreSQL.
- Excellent communication skills: Able to interact with users and business owners.
TECHNICAL SKILLS
Data Warehousing & BI: SQL Server Business Intelligence Studio (SSIS, SSRS, SSAS)
Azure Cloud: Azure cloud platform (Azure Data Factory, Azure Synapse, Databricks, Azure Data Lake, Azure SQL, Azure DevOps, Spark)
Reporting Tools: SSAS 2016/2008R2, Crystal Reports XI/X/9/8, Excel, Power BI.
Database: MS SQL Server 2016/2008R2, MySQL, PostgreSQL
Programming: C, C#.Net, VB.Net, PL/SQL, T-SQL, Python
IT Processes: Software Development Life Cycle (SDLC), Project Management, Agile process.
Languages: SQL, UNIX, C# 4.0, Shell Scripting, LINQ, ASP.NET, UML, XML, HTML, JavaScript, CSS
Productivity Applications: MS Word, MS Excel, MS Access, MS Project, Visio, VSTF.
Operating System: UNIX, Windows
PROFESSIONAL EXPERIENCE
Confidential
Sr. Data Engineer
Responsibilities:
- Analyze, design and build Modern data solutions using Azure PaaS service to support visualization of data. Understand current Production state of application and determine the impact of new implementation on existing business processes.
- Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics). Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
- Implemented proofs of concept for SOAP and REST APIs.
- Used REST APIs to retrieve analytics data from different data feeds.
- Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data between different sources such as Azure SQL, Blob storage, Azure SQL Data Warehouse, and the write-back tool, in both directions.
- Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
- Responsible for estimating the cluster size and for monitoring and troubleshooting the Spark Databricks cluster.
- Performance-tuned Spark applications by setting the right batch interval time, the correct level of parallelism, and memory tuning.
- Developed JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process the data using the SQL Activity.
- Hands-on experience in developing SQL scripts for automation purposes.
- Created Build and Release for multiple projects (modules) in production environment using Visual Studio Team Services (VSTS).
Environment: MS SQL Server 2014/2016, Power BI, SQLDB, Spark, Python, Azure Data Factory, Azure Synapse, Azure Data Lake, Azure Databricks
Confidential - California
Sr. Data Engineer
Responsibilities:
- Worked on a PaaS application that provides services on Azure Cloud.
- Designed and developed on-premises-to-cloud migration projects.
- Designed and developed Azure functions for data management.
- Transferred data using Azure Synapse and PolyBase.
- Responsible for building the Confidential data cube using the Spark framework by writing Spark SQL queries in Scala to improve the efficiency of data processing and reporting query response time.
- Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data between different sources such as Azure SQL, Blob storage, Azure SQL Data Warehouse, and the write-back tool, in both directions.
- Responsible for transporting and processing real-time stream data sourced to Azure Data Factory, managing data from different data sources and data transfer.
- Responsible for estimating the cluster size and for monitoring and troubleshooting the Spark Databricks cluster.
- Used SQL Managed Instances for storing relational databases.
- Used Azure Data Lake Gen2 for storing IoT data from GPS devices.
- Used Spark for streaming and processing data from different data sources to support real-time analytics.
- Used Service Fabric microservices to support the mobile application.
Confidential - Atlanta, GA
Sr. Data Engineer
Responsibilities:
- Participated in the development of the Properties Auditing solution in Azure SQL Data Warehouse by bringing data from on-prem using ADF and PolyBase.
- Proactively participated in the development of Digital Marketing for Rentpath, which serves as a source for most of the BI projects in the cloud, using Azure Data Lake, Kafka, Azure Databricks (to handle all complex transformations), ADF pipelines, ADW, Azure SQL DB, etc.
- Analyzed and automated the ADF pipelines built to optimize the existing resources, and studied the end-to-end flow in order to retire a few of the complex SQL stored procedures by rebuilding them in Databricks.
- Currently analyzing customer data from social media platforms (Facebook, Instagram, Google, Apple, etc.) and digital platforms (Apartment.com and Apartment Guide) built in Big Data, which consumes data from 30+ sources and, after complex transformations, pushes the data to Blob storage for consumption by the Data Science team. Working on the design to implement the transformations in the cloud using Databricks, thereby removing an extra hop and saving the licensing cost of Big Data.
- Created a transformation layer in ADW using SQL stored procedures and optimized its performance using replicated and hash-distributed tables for dimension and fact data. The consumption layer was then used for reporting in Power BI.
- Currently working as a Data Engineer on the Property Management Company Social Network Data reporting project, MyInsights, to move the current logic built in Azure SQL DB with complex stored procedures to Databricks. This reduces SQL DB storage and uses the cluster to compute the complex transformations in Azure Databricks, so that processing time is greatly reduced and the data is available to end users as early as possible.
- Optimized complex stored procedures by implementing partitioning on KPI tables and splitting the huge intermediate base table by time frame to hold precalculated measures before performing a complex set of cross joins across stores, location, product hierarchy, and time frame, in order to reduce the prolonged weekly processing for one of the markets. This also reduces the response time at the UI end.
Environment: MS SQL Server 2014/2016, Power BI, SQLDB, Spark, Python, Azure Data Factory, Azure Synapse, Azure Data Lake, Azure Databricks
Confidential - Wayne, MI
Sr. Data Engineer
Responsibilities:
- Involved in Technical and Business decisions for Business requirement, Interaction with Business Analysts, Client team, and Development team through Agile Kanban process.
- Created Azure Data Factories for loading data into Azure SQL Database from the Cosmos platform.
- Responsible for estimating the cluster size, monitoring and troubleshooting of the Spark Databricks cluster.
- Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics).
- Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
- Acted as build and release engineer and deployed the services through the VSTS (Azure DevOps) pipeline. Created and maintained pipelines to manage the IaC for all the applications.
- Learned the latest features introduced by Microsoft Azure (Azure DevOps, OMS, NSG rules, etc.) and applied them to existing business applications.
- Experience in developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
- Created complex Power BI dashboards.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Performed the role of Scrum Master in daily stand-up calls.
- Moved data into Azure Synapse staging tables using PolyBase in ADF.
- Experience migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises databases to Azure Data Lake Store using Azure Data Factory.
- Good understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver node, worker node, stages, executors, and tasks.
- Good understanding of Big Data Hadoop and YARN architecture along with various Hadoop daemons such as JobTracker, TaskTracker, NameNode, DataNode, and Resource/Cluster Manager, and of Kafka (distributed stream processing).
- Experience in database design and development with Business Intelligence using SQL Server 2014/2016, Integration Services (SSIS), DTS packages, SQL Server Analysis Services (SSAS), DAX, OLAP cubes, star schema, and snowflake schema.
- Performed Column Mapping, Data Mapping and Maintained Data Models and Data Dictionaries.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Python.
- Involved in retrieving multi-million records for data loads using SSIS and by querying against Heterogeneous Data Sources like SQL Server, Oracle, Text files and some Legacy systems.
- Built data pipelines with Python and Bash for uploading data into the warehouse.
- Expertise in using different Transformations like Lookups, Derived Column, Merge Join, Fuzzy Lookup, For Loop, For Each Loop, Conditional Split, Union all, Script component etc.
- Transferred data from various data sources/business systems including MS Excel, MS Access, and Flat Files to SQL Server using SSIS/DTS packages using various features.
- Involved in Performance tuning of ETL transformations, data validations and stored procedures.
- Strong experience in designing and implementing ETL packages using SSIS for integrating data using OLE DB connection from heterogeneous sources.
- Created complex ETL packages using SSIS that upsert data from staging tables to database tables.
- Experience in creating reports from scratch using Power BI from Excel workbook that contains Power View sheets.
- Created roles using SSAS to restrict cube properties.
- Implemented cell level security in cubes using MDX expressions to restrict users of one region seeing data of another region using SSAS.
- Created calculated measures using MDX to implement business requirements.
- Worked with star and snowflake schemas, using the fact and dimension tables to build the cubes, perform processing, and deploy them to the SSAS database.
- Designed aggregations and pre-calculations in SSAS.
- Involved in designing Partitions in Cubes to improve performance using SSAS.
- Experienced in Developing Power BI Reports and Dashboards from multiple data sources using Data Blending.
- Used recently introduced Power BI to create self-service BI capabilities and use tabular models.
- Developed Python batch processors to consume and produce various feeds. Developed merge jobs in Python to extract and load data into a MySQL database.
- Wrote Python code to consume data from Swagger UI.
- Responsible for creating and changing the visualizations in Power BI reports and Dashboards on client requests.
- Experienced in working on Big Data integration and analytics based on Hadoop and Kafka.
- Created Calculated Columns and Measures in Power BI and Excel depending on the requirement using DAX queries.
- Created hierarchies in Power BI reports using visualizations like Bar chart, Line chart, etc.
- Worked with both live and imported data in Power BI for creating reports.
- Managed relationship between tables in Power BI using star schema.
- Used the different types of slicers available in Power BI for creating reports.
Environment: MS SQL Server 2014/2016, Power BI, SSIS, SSRS, SSAS, SQL Profiler, SQL, C#, web Hadoop, Spark, Python, Azure Data Factory, Azure Synapse, Azure Data Lake, Azure Databricks
Confidential - Greater Chicago, IL
ETL/SQL/Power BI Developer
Responsibilities:
- Built business intelligence and data visualization dashboards using various technologies such as Power BI.
- Designed SSIS packages to transfer data between servers, load data into databases, and archive data files from different DBMSs using SQL Enterprise Manager/SSMS on SQL Server.
- Involved in Normalization and De-Normalization of existing tables for faster query retrieval.
- Excellent understanding and knowledge of Hadoop Distributed file system data modelling, architecture and design principles.
- Involved in designing Parameterized Reports for generating Ad-Hoc reports as per the client requirements.
- Gathered business requirements and defined and designed the data sources and data flows.
- Wrote SQL statements for data retrieval and was involved in performance tuning of T-SQL.
- Executed queries using Hive and developed MapReduce jobs to analyze data.
- Monitored and provided front-line support of daily processes.
- Helped create process logging and new monitoring tools, integrity reports, and mapping tools.
- Created SSIS packages for File Transfer from one location to the other using FTP task.
- Created and maintained SSIS packages to extract, transform, and load data into SQL Server.
- Created scheduled jobs, alerts, SQL Mail Agent, and scheduled DTS packages.
- Used psycopg2 from Python to make connections to RDS.
- Involved in ETL architecture enhancements to increase the performance using query optimizer.
- Configured the loading of data into slowly changing dimensions using Slowly Changing Dimension wizard.
- Designed various SSIS modules in order to fetch the data in the data staging environment based on the different types of incoming data.
- Designed and created Report templates, bar graphs and pie charts based on the financial data.
- Delivered enterprise, web-enabled reports using SSRS.
- Implemented Database project solution for deploying database objects.
- Created report models using the report model designer to support Ad hoc reporting.
- Involved in requirement gathering, analysis, design, development & deployment.
- Provided support to the application developers on the application databases.
- Configured and deployed different reports to the servers using SQL Server 2012 SSRS.
- Created complex stored procedures, triggers, cursors, tables, and views, using SQL.
- Experienced in SQL design, coding, testing and implementation.
- Developed and optimized stored procedures, views, and user-defined functions for the application.
- Developed custom scripts and stored procedures for data import and manipulation in SQL.
- Used cascaded parameters to generate reports from different data sets.
- Involved with SQL query optimization to increase the performance of the report.
Environment: MS SQL Server 2014/2016, Power BI, SSIS, SSRS, SSAS, SQL Profiler, SQL, C#, VB.NET, Visual Studio.
Confidential
SQL/SSIS/REPORTING Developer
Responsibilities:
- Involved in writing stored procedures, triggers, user-defined functions, views, and cursors for both online and batch requests, handling the business logic and functionality of various modules.
- Did SQL performance monitoring and tuning of reporting data by optimizing indexes and stored procedures.
- Wrote procedures and database triggers for the validation of input data and to implement business rules. Developed several SQL reports for the business users to check the post-processing rules.
- Designed and developed SSIS packages for loading data from text and CSV files into SQL Server databases.
- Used different data flow elements like Flat File, OLEDB, Excel Sources, Destinations and Data Flow Transformations like Data Conversion, Conditional Split, Derived Column etc.
- Implemented Event Handlers and Error Handling in SSIS packages and notified process results to various user environments.
- Upgraded existing packages developed using SSIS 2008 to SSIS packages.
- Scheduled the SSIS jobs using SQL server agent for daily, weekly and monthly loads.
- Created SSRS report templates for ease of report development.
- Generated multiple Enterprise reports using SSRS from SQL Server Database (OLTP) and SQL Server Analysis Services Database (OLAP) and included various reporting features such as group by, drilldowns, drill through, sub-reports, navigation reports (Hyperlink) etc.
- Created different parameterized reports (SSRS 2008/) consisting of report criteria, to minimize report execution time and to limit the number of records required.
- Involved in the development and deployment of SSAS cubes; monitored full and incremental loads and supported any issues.
- Implemented dashboards and scorecards using PerformancePoint Server and integrated them with SharePoint.
- Provided operational support to modify existing tabular SSAS models to satisfy new business requirements.
- Advanced knowledge of Excel for using Pivot tables, Power Pivot, Complex formulas.
- Developed Star schema using SSAS cubes.
- Created parallel period, filter, ancestors and cross join MDX queries.
- Developed Aggregations, partitions and calculated members for cube as per business requirements.
- Involved in analyzing and designing disaster recovery/replication strategies with business managers to meet the business requirements.
- Designed Dimensional Modeling using SSAS packages for End-User. Created Hierarchies in Dimensional Modeling.
- Created shared dimension tables, measures, hierarchies, levels, cubes and aggregations on MS OLAP/ Analysis Server (SSAS).
- Created a cube using multiple dimensions, modified the relationship between a measure group and a dimension, and created calculated members and KPIs using SSAS.
- Experience in designing highly interactive visualizations using Tableau software and publishing and presenting dashboards on web and desktop platforms.
- Combined views and reports into interactive dashboards in Tableau Desktop that were presented to Business Users, Program Managers and End Users.
- Reviewed basic SQL queries and modified inner, left and right joins in Tableau Desktop by connecting Live/dynamic and static datasets.
- Worked on generating various dashboards in Tableau Server using different data sources. Used Data Blending to merge data from different sources.
- Created multiple dashboards with Drill Down capabilities using dashboard prompts, Local and Global Filters.
- Scheduled data refresh on Tableau Server for weekly and monthly increments based on business changes to ensure that the views and dashboards were reflected with modified data.
- Created Calculated Columns and Measures in Power BI and Excel depending on the requirement using DAX queries.
- Responsible for creating and changing the visualizations in Power BI reports and Dashboards on client requests.
- Responsible for building reports in Power BI from scratch.
Environment: SQL Server, SSIS (2008), SSRS (2008), Visual Studio 2010/, MS Excel 2010, VB.Net, TFS.