Data Engineer Resume
SUMMARY
- 7+ years of IT experience across the Manufacturing, Mortgage, Insurance, and Banking sectors as an MSBI Full Stack Developer and Data Engineer.
- Experience in data modeling, database design, SQL scripting, and the development and implementation of client-server, Business Intelligence (SSIS, SSAS, SSRS), and Azure platform service applications.
- Experience migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises databases to Azure Data Lake Store using Azure Data Factory.
- Experience developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
- Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics). Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) or Snowflake and processed the data in Azure Databricks.
- Experience in database design and development with Business Intelligence using SQL Server 2014/2016, Integration Services (SSIS), DTS packages, SQL Server Analysis Services (SSAS), DAX, OLAP cubes, and Star and Snowflake schemas.
- Strong skills in visualization tools: Power BI and Confidential Excel (formulas, Pivot Tables, charts, and DAX commands).
- Experience in data extraction, transformation, and loading (ETL) between homogeneous and heterogeneous systems using SQL tools (SSIS, DTS, Bulk Insert, BCP, and XML).
- Extensive experience with the Power BI platform: Power BI Desktop, visualization, modeling, DAX, M, Dataflows, Power BI Service, dashboards, workspace administration, performance improvements, and Paginated Reports.
- Well-versed in understanding customer requirements and ad-hoc and planned reporting needs, and converting them into reports and dashboards using Power BI, Excel, SSRS, Paginated Reports, etc. Worked on migrating Tableau, MicroStrategy, and SSRS reports to Power BI.
- Worked with data from various sources such as Excel, SharePoint folders/lists, SQL Server, CSV, Amazon Redshift, Netezza, and Azure Data Lake.
- Exceptional knowledge in Power Pivot, Power Query, Advanced formulas, and Reporting using Excel.
- Excellent hands-on experience with data modeling tools such as Erwin and strong knowledge of relational and dimensional database modeling concepts, including Star and Snowflake schema design and implementation.
- Experienced in creating parameterized, cascading, drill-through/drill-down, conditional, table, matrix, chart, and ad-hoc reports. Specialized in using relational data warehouses and OLAP cubes.
- Experienced with version control tools such as Microsoft TFS and Git.
- Experienced in business reviews with customers and skilled at customer engagement.
TECHNICAL SKILLS
Databases: MS SQL Server 2016/2014, ORACLE, Amazon RedShift, IBM Netezza, Azure SQL Database
SQL Server Tools: SSMS, Enterprise Manager, Query Analyzer Profiler, SSIS, SSAS, SSRS, Database Tuning Advisor, SQL* Plus, Azure Databricks, Azure SQL Datawarehouse
Cloud Environments: Azure, AWS, GCP
ETL-Tools: SQL Server Integration Services (SSIS), Azure Data Factory (ADF), Informatica
Reporting Tools: SSRS 2016/2014/2012, Excel, MS Access, Tableau, Power BI, MicroStrategy
Data Modeling: Fact and dimension tables, physical and logical data modeling, Star and Snowflake schemas, relational, dimensional, and multidimensional modeling, de-normalization techniques, MS Access
Programming: T-SQL, Dynamic SQL, MDX, XML, .NET, Java, Python, Unix, Shell Scripting, Spark
Version Control: Microsoft TFS and GitHub
PROFESSIONAL EXPERIENCE
Confidential
Data Engineer
Responsibilities:
- Responsible for researching, consulting, analyzing, and evaluating system program needs and various types of computer application software.
- Worked on migrating data from on-premises SQL Server to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB/Snowflake).
- Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data between sources such as Azure SQL, Snowflake, Blob Storage, Azure SQL Data Warehouse, and a write-back tool, in both directions.
- Developed Spark applications using Spark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
- Responsible for estimating cluster size and for monitoring and troubleshooting the Spark Databricks cluster.
- Experienced in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and tuning memory.
- Developed JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process data using the SQL activity.
- Hands-on experience developing SQL scripts for automation purposes.
- Analyzed traditional SQL scripts and converted them to Spark SQL (PySpark) for faster performance.
- Developed a prototype for the design interface and estimated the performance criteria.
- Utilized Power Query in Power BI to pivot and unpivot the data model for data cleansing and massaging.
- Integrated custom visuals based on business requirements using Power BI Desktop.
- Designed Power BI data visualizations utilizing cross tabs, maps, scatter plots, and pie, bar, and density charts.
- Developed parameterized, drill-down, drill-through, and sub-reports per the report specifications and functionality.
- Developed analysis reports and visualizations using DAX functions such as table, aggregation, and iteration functions.
- Created conditional filters and action links to filter the data on dashboards with Power BI Desktop.
- Designed and developed reports according to Tableau and Power Map requirements.
- Designed calculated columns and measures in Power BI and Excel as required, using DAX queries.
- Created several user roles and groups for end users and provided row-level security.
- Created workspaces and content packs for business users to view the developed reports.
Confidential
Data Engineer
Responsibilities:
- Involved in complete project life cycle starting from design discussion to production deployment.
- Worked closely with the business team to gather requirements and new support features.
- Built a 16-node cluster in designing the Data Lake with the Cloudera Distribution.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented and configured High Availability Hadoop Cluster.
- Installed and configured Hadoop Clusters wif required services (HDFS, Hive, HBase, Spark, Zookeeper).
- Developed Hive scripts to analyze data and categorize PHI into different segments, with promotions offered to customers based on their segment.
- Extensive experience in writing Pig scripts to transform raw data into baseline data.
- Developed UDFs in Java as and when necessary to use in Pig and HIVE queries.
- Worked on Oozie workflow engine for job scheduling.
- Created Hive tables and partitions and loaded the data for analysis using HiveQL queries.
- Created different staging tables like ingestion tables and preparation tables in Hive environment.
- Optimized Hive queries and used Hive on top of Spark engine.
- Integrated Custom Visuals based on business requirements using Power BI desktop.
- Designed Power BI data visualization utilizing Cross Tabs, Maps, Scatter plots, Pie, Bar and Density charts.
- Developed parameterized, drill-down, drill-through, and sub-reports per the report specifications and functionality.
- Embedded Power BI reports on SharePoint portal page and managed access of reports and data for individual users using Roles.
- Created conditional filters and action links to filter the data on dashboards with Power BI Desktop.
- Created DAX queries to generate computed columns in Power BI.
- Implemented complex business logic through T-SQL stored procedures, functions, views, and advanced query concepts.
- Developed analysis reports and visualizations using DAX functions such as table, aggregation, and iteration functions.
- Created reports using time intelligence calculations and functions.
Confidential
Data Engineer
Responsibilities:
- Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data between sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse, and a write-back tool, in both directions.
- Developed Spark applications using Scala and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
- Responsible for estimating cluster size and for monitoring and troubleshooting the Hadoop cluster.
- Used Zeppelin and Jupyter notebooks and the Spark shell to develop, test, and analyze Spark jobs before scheduling customized Spark jobs.
- Undertook data analysis and collaborated with the downstream analytics team to shape the data to their requirements.
- Experienced in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and tuning memory.
- Wrote UDFs in PySpark to meet specific business requirements.
- Used Kusto Explorer for log analytics and better query response.
- Replaced existing MapReduce programs and Hive queries with Spark applications written in Scala.
- Deployed and tested (CI/CD) our developed code using Visual Studio Team Services (VSTS).
- Involved in migrating existing data science projects to GCP.
- Conducting code reviews for team members to ensure proper test coverage and consistent code standards.
- Responsible for documenting the process and cleaning up unwanted data.
- Responsible for data ingestion and for maintaining the PROD pipelines for real business needs.
- Expertise in creating HDInsight clusters and Storage Accounts with an end-to-end environment for running jobs.
- Developed JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process data using the Cosmos activity.
- Implemented ETL functionality to process real-time data using Spark's built-in APIs and Spark SQL.
- Hands-on experience developing PowerShell scripts for automation purposes.
- Created Build and Release for multiple projects (modules) in production environment using Visual Studio Team Services (VSTS).
- Programmed all ETL loading processes and converted the files into Parquet in the Hadoop File System.
- Migrated the DW to Redshift and the data lake layers to S3 storage, and migrated the PySpark applications to AWS EMR jobs in the cloud.
- Hands-on experience with Spark SQL queries and DataFrames: importing data from data sources, performing transformations and read/write operations, and saving the results to an output directory in HDFS.
- Involved in running Cosmos scripts in Visual Studio 2017/2015 to check diagnostics.
- Worked in Agile development environment in sprint cycles of two weeks by dividing and organizing tasks.
Confidential, Michigan
Data Reporting Analyst
Responsibilities:
- Facilitated interview sessions to identify business rules and requirements, then documented them in a format that could be reviewed and understood by the whole team.
- Provided mock-up visuals to speed up requirements visualization.
- Responsible for creating and changing data visualizations in Power BI reports and dashboards at client request.
- Performed data analysis and data profiling using T-SQL and data profiling tasks.
- Performed business analysis for the team's current project and created Excel-based reports for quick sharing.
- Analyzed data extracted from spreadsheets to support management-level decisions.
- Advanced knowledge of self-service analytics/ETL tools, specifically Alteryx Designer.
- Responsible for creating and maintaining SSRS reports for shift-wise incident queue reporting needs.
- Participated in weekly Scrum calls and delivered tasks within sprints.
- Expertise in creating Reports, Sub Reports, drill down reports using various features like Charts, filters etc.
- Created report models for ad-hoc reports using SSRS 2012.
- Used the SSRS Report Catalog to store configuration, security, and caching information for the operation of the Report Server.
- Developed cascading reports for employee reports and ad-hoc reports using Report Builder (SSRS 2008).
- Created action filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
- Developed Tableauvisualizations and dashboards using Tableau Desktop.
- Developed Tableau workbooks from multiple data sources using Data Blending.
- Designed, developed, tested, and implemented BI/Data Warehousing projects using Tableau, MicroStrategy, Alteryx, etc.
- Responsible for designing and executing application objects and schema objects using MicroStrategy Desktop.
- Created and defined attributes from fact tables in the appropriate fields under the Attribute Definition section for each dimension in MicroStrategy.
- Troubleshot high-level issues to identify and resolve problems caused by the database, source data, ETL, or the Business Intelligence software.
- Delivered enhancements and feature additions to improve dashboard quality and user experience.
Confidential
Data Analyst/ETL Developer
Responsibilities:
- Involved in gathering business requirements, interacting with the manager and development team to build a solution implemented in SQL Server 2008.
- Filtered and cleansed dirty data from the legacy system using complex T-SQL statements in the staging area, and implemented various constraints and triggers for data consistency.
- Created SSIS packages to load data from flat file to flat file and from flat file to SQL Server using Lookup, Fuzzy Lookup, Derived Column, Conditional Split, Term Extraction, Aggregate, Pivot, and Unpivot transformations.
- Successfully migrated old data from legacy systems (FoxPro) and external systems, including Excel and Oracle, into SQL Server 2008 using SSIS packages.
- Created scripts to verify and reconcile the migrated data from the legacy system to SQL Server.
- Developed interface stored procedures to upload insurance records from XML format to the Customer Insurance System.
- Extensively used joins and sub queries to simplify complex queries involving multiple tables.
- Designed the forms and templates and created reports using SSRS 2008 Reporting Services.
- Created views to facilitate easy user-interface implementation, and triggers on them to facilitate consistent data entry into the database.
Confidential
Java - SQL Developer
Responsibilities:
- Designed and developed various SSIS packages (ETL) to extract and transform data and involved in Scheduling SSIS Packages.
- Created ETL metadata reports using SSRS, including execution times for the SSIS packages and failure reports with error descriptions.
- Created OLAP applications with OLAP Services in SQL Server and built cubes with many dimensions using both Star and Snowflake schemas.
- Used complex query statements such as sub-queries, correlated queries, derived tables, and CASE functions to insert data into the tables depending on the criteria.
- Created scripts to verify and reconcile the migrated data from the legacy system to SQL Server.
- Developed interface stored procedures to upload insurance records from XML format to the Customer Insurance System.
- Extensively used joins and sub-queries to simplify complex queries involving multiple tables.
- Designed the forms and templates and created reports using SSRS 2008 Reporting Services.
- Created views to facilitate easy user-interface implementation, and triggers on them to facilitate consistent data entry into the database.