Data Engineer/ETL Developer Resume
Yardley, PA
SUMMARY
- 6 years of professional experience as a Tableau/BI developer and data engineer designing, developing and implementing T-SQL queries, ETL packages and reporting solutions using SSMS, SSIS, and Tableau
- Solid experience in logical and physical database (OLTP) design and development, data conversion and normalization concepts
- Strong understanding of enterprise data warehouse concepts (OLAP), including dimension/fact table design and development as well as Star/Snowflake schemas
- Proficient in T-SQL, including query optimization/tuning, creation of views, triggers, user-defined functions (UDFs), stored procedures, dynamic SQL, common table expressions (CTEs), temp tables, table variables and a variety of joins
- Extensive experience in query optimization/tuning and database performance tuning using a reverse engineering approach of reviewing execution plans, and using SQL Server Profiler and Database Engine Tuning Advisor
- Extensive experience in data extraction, transformation and loading (ETL), including development of ETL packages, incremental loading, and data cleaning in SSIS
- In-depth understanding of building and publishing interactive report solutions with customized parameters, user filters and easy-to-read dashboards in Tableau
- Adept at creating Tableau dashboards/stories using heat maps, treemaps, circle views, bar charts, line charts, pie charts, area charts, bubble charts, highlight tables, and symbol and filled maps according to deliverable specifications
- Proficient in creating different types of reports, such as parameterized, ad hoc, dashboard and drill-down reports, using T-SQL, Tableau and Power BI
- Experienced in Python programming for descriptive, inferential and predictive data analyses, using Python libraries such as NumPy, Pandas, SciPy and scikit-learn, as well as data visualization packages such as Matplotlib and Seaborn
- Familiar with the Hadoop ecosystem, including HDFS, MapReduce, Spark, Hive, Pig, Sqoop and Kafka, for big data streaming and processing
- Experienced in cloud services, such as AWS (EC2, EMR, RDS, S3, Redshift) and Azure (Blob Storage, SQL Data Warehouse, HDInsight, Virtual Machine), for better data security, scalability and computing efficiency
TECHNICAL SKILLS
Programming Languages: T-SQL, Python (NumPy, Pandas, Matplotlib, Seaborn, scikit-learn), R
Databases: MS SQL Server 2008/2012/2017, MySQL, PostgreSQL, Teradata, Netezza, MS Access
SQL Server Tools: SQL Server Management Studio (SSMS), SQL Server Integration Services (SSIS), SQL Server Profiler, Database Engine Tuning Advisor
Cloud Services: Azure (SQL Database/Blob Storage, SQL Data Warehouse, HDInsight, Virtual Machine); AWS (S3, RDS, Athena, Redshift, EC2, EMR); Databricks (Spark/Hive)
Hadoop Ecosystem: HDFS, MapReduce, Spark (PySpark), Hive, Pig, HBase
Data Visualization: Tableau/Tableau Prep, Power BI, MS Excel (Pivot Tables)
Others: MS Visio, Lucidchart, Slack, GoToMeeting, MS Office
PROFESSIONAL EXPERIENCE
Confidential, Yardley, PA
Data Engineer/ETL Developer
Responsibilities:
- Designed and created SSIS packages in Visual Studio to extract, transform and load (ETL) medical and pharmaceutical data, such as formulary, drug coverage, and medical restriction data, from the legacy databases and from the company’s data collecting and processing tools into the production data warehouse
- Transformed and processed the data from different formats (text file, XML, HTML) into a uniform format, depending on the needs of each client
- Developed complex SQL logic using case statements, stored procedures, CTEs, user-defined functions (UDFs), and views to translate business requirements, such as medical restrictions, and embedded the code into SSIS ETL pipelines
- Parameterized all possible SQL identity keys in SSIS to make the packages dynamically configurable
- Tuned and optimized packages and SQL by changing table lock types, avoiding functions that consume excessive resources, and adding indexes to the staging tables
- Scheduled data refresh jobs in SQL Server Agent using T-SQL for daily and weekly incremental loading, daily database backup and restoring, and error handling mechanisms
- Created QC queries and worked closely with the QA team to ensure the data presented correct values before pushing the SSIS project into production
- Built Tableau dashboards that include text tables, donut charts and bar charts for quality control purposes and presented business findings to the client managers and business analysts using Tableau
Environment: MS SQL Server, SQL Server Management Studio (SSMS), Visual Studio SSIS, Tableau 2018.x/2019.x (Desktop/Server), DevOps, MS Azure, JIRA, Team Foundation Server (TFS)
Confidential, East Hanover, NJ
Data Engineer/Data Analyst
Responsibilities:
- Used T-SQL in SQL Server Management Studio (SSMS) to develop complex stored procedures, CTEs, triggers, user-defined functions (UDFs), and views to facilitate ETL processes
- Designed a data warehouse in AWS Redshift using a snowflake schema and defined a variety of fact and dimension tables, such as appointment, insurance claim and payment fact tables, and location, DateTime and insurance company dimension tables, to meet the business requirements and KPIs
- Designed and built SSIS packages to extract, transform and load (ETL) existing data from the legacy systems and other sources into AWS S3 and then into the data warehouse
- Utilized control flow and data flow components in SSIS, such as Pivot Transformation, Fuzzy Lookup, Derived Column, Conditional Split, Aggregate, Execute SQL Task and Execute Package Task, to build ETL pipelines
- Created SSIS Packages that dealt with different data formats (Text files, XML, Database Tables) from various sources
- Developed, tested and maintained the initial load and incremental load processes
- Implemented error handling mechanisms and fine-tuned SQL queries in the packages to ensure ETL performance, data consistency and data integrity
- Built Tableau dashboards with bar charts, line charts, and text tables, and applied quick/context/global filters, parameters and calculated fields to help clinical staff monitor their patients and payments
- Scheduled data refresh on Tableau Server for weekly and monthly increments based on business change to ensure that the dashboards displayed the most updated information
- Assisted in Tableau administration activities, including granting access, managing extracts and installations, and configuring caching to support large-volume usage
Environment: MS SQL Server, SQL Server Management Studio, SQL Server Data Tools (SSIS), Tableau 2018.x/2019.x (Desktop/Server), AWS S3, Redshift, Office 365, SharePoint
Confidential, Philadelphia, PA
Data Engineer/Tableau Developer
Responsibilities:
- Developed Tableau dashboards according to client’s specifications and business requirements
- Extensively used data blending techniques in dashboard development in Tableau to create reports with data from different sources based on their related features
- Created interactive worksheets and dashboards using advanced techniques such as drill-down and drill-through
- Created calculated fields, mapping, and hierarchies, and used statistical models such as linear regression and forecasting functions for in-depth data analysis
- Developed dual-axis charts with multiple measures and developed pie charts and donut charts for percentage analysis
- Rebuilt extraction, transformation and loading (ETL) pipelines in SSIS using techniques such as initial/incremental loading, large-table partitioning and data caching
- Designed and developed ETLs using cloud services in Azure HDInsight and used Spark and Hive to process large datasets
- Participated in feature engineering, such as combining sparse classes, removing unused features, generating dummy variables and label encoding, using scikit-learn in Python
- Participated in building machine learning models for classification, such as KNN, Random Forests, Gradient Boosting, and SVM, and in model evaluation and hyperparameter tuning
Environment: Tableau 10.x/2018.x (Desktop/Server), Python 3.X, T-SQL (SSIS/SSMS), MS Azure Data Lake Storage/SQL Data Warehouse/HDInsight, Databricks (Spark/Hive), SQL Server Code
Confidential, Kansas City, MO
Data Analyst (SSMS/SQL/Tableau)
Responsibilities:
- Collected customer data, including geolocations and demographics, through mobile apps and loaded the data into a SQL Server 2012 database for further analysis
- Applied complex SQL queries to join tables, union datasets, select specific data, format categorical data, aggregate and filter values in SSMS (SQL Server Management Studio) to preprocess the data
- Applied EDA (Exploratory Data Analysis) to the preprocessed data to find patterns, distributions, and correlations among different features of the customer data using Python Jupyter Notebook
- Designed data visualizations such as stacked bar charts, donut charts, stream flow charts and histograms using Tableau 9.2 to further identify customer behaviors and customer flows at different locations, times, and under different weather conditions
- Published Tableau workbooks and extract data sources to Tableau Server, implemented row-level security and scheduled automatic extract refreshes
- Created documentation for each visual report, which consists of the report purposes, data sources, columns mapping, transformations, and user groups
Environment: MS SQL Server, SQL Server Management Studio, Tableau 9.x, Power BI, Python 2.x/3.x
Confidential, St. Louis, MO
BI Developer/Data Analyst
Responsibilities:
- Worked with business analysts, sales representatives and other business units to identify existing business challenges suitable for BI (Tableau) solutions and new reporting requirements
- Reverse engineered reports that had limited documentation and built detailed data mapping documentation
- Developed a variety of reporting solutions from simple, standard reports for regular subscriptions to complex, multi-purpose BI reports in Tableau
- Developed Tableau data visualizations using crosstabs, heat maps, scatter plots, geographic maps, pie charts, bar charts and density charts
- Tested and maintained new and updated features in Tableau Desktop/Server
- Participated in meetings, reviews, and user group discussions, and communicated with stakeholders and business groups
Environment: MS SQL Server, SQL Server Management Studio, Tableau, Office 365, SharePoint