- Seeking a Sr. Microsoft SQL DBA, SQL Developer, Data Warehousing Architect/Specialist, or ETL Developer role
- 18+ years hands-on experience in SQL Server 2019/2017/2016/2014/2012/2008 R2 administration
- 12+ years hands-on experience in SQL development (TSQL, Microsoft BI suite, ETL, PowerShell, VB Script)
- 10+ years hands-on experience in Data Warehouse modelling, design, development and implementation
- 10+ years hands-on experience in ETL development using Microsoft BI suites (SSIS, SSRS and SSAS)
- 8+ years hands-on experience in Data Warehouse technologies (Redshift, Azure DW, Snowflake, Data Lakes)
- 6+ years hands-on experience in managing IT services in AWS (VPC, EC2, RDS, EBS, S3, networking, security, and monitoring)
- 6+ years hands-on experience in migration from on-premises data centers and Rackspace to AWS and Azure
- 3+ years hands-on experience in managing Azure (virtual machine, virtual networks, SQL database, data warehouse, storage)
- 5+ years hands-on experience in Tableau development and administration
- 2+ years hands-on experience in the Informatica PowerCenter and Talend ETL platforms
- 5+ years hands-on experience in EMC storage, HP storage, AWS EBS and S3 storage management and optimization
- 3+ years hands-on experience in Oracle 10g and 11g administration
- 2+ years hands-on experience in Sybase ASE administration
- Solid ETL development experience, hands on development experience using Microsoft SQL BI suite
- Extensive experience in SQL database upgrade from lower version to upper SQL version
- Solid hands-on experience in SQL Server 2019/2017/2016/2014 Failover Clustering technology
- Substantial hands-on experience in SQL Always-ON, Replication, Log-Shipping and Mirroring technologies
- Expertise in SQL database environment Disaster Recovery (DR) design, implementation and validation
- Hands-on experience with many SQL 3rd-party tools, including but not limited to: Red Gate tools, Quest tools, Idera tools, SolarWinds DPA, New Relic, SQL Sentry, IBM Guardium, and Microsoft tools (SCOM and SCCM)
PROFESSIONAL INDUSTRY EXPERIENCE:
Confidential, New York, New York
Managing Consultant / SQL Developer / SQL DBA / Data Warehousing Developer
- For the most recent project, I led an onshore/offshore team of 6 SQL resources. Over the past 18 months, we evaluated and compared AWS, Microsoft Azure, and Google Cloud Platform in all relevant technical aspects and chose AWS as the primary cloud environment and Azure as the secondary cloud provider. I led the project to build the AWS and Azure database environments: in AWS we ran about 150 EC2 and RDS instances across 3 isolated VPCs; in Azure we ran about 60 virtual machines and SQL databases. I led the database migration from Rackspace to AWS and Azure via a lift-and-shift approach. Through this migration project, I gained hands-on experience in both cloud environments, including but not limited to: AWS VPC, EC2, RDS, S3, EBS and EBS performance, CloudWatch, networking, security, VPN, Route 53, and performance monitoring; and Azure virtual machines, Azure SQL databases, Azure storage, security, and Azure Active Directory. I managed the mission-critical OLTP systems as well as the large data warehouse systems, and built and managed the Tableau reporting platform and legacy SSRS sites. Evaluated, designed, and implemented the MPP cloud data warehouse system and gradually migrated data reports from the legacy data warehouse to the cloud data warehouse. Also supported the local Active Directory, VMware vSphere servers, and legacy EMC, Dell, and HP storage systems.
- For one small e-commerce client, I was the core technical lead for an in-house data warehouse project from beginning to end. Within a 12-month time frame, I designed, developed, and implemented the in-house data warehouse. I spent a considerable amount of time meeting with each department in the company to understand its daily data requirements, performed in-depth analysis of the existing data systems, and designed and developed the ETL processes to consolidate all the dispersed data sources into a centralized Operational Data Store (ODS). I led the data modelling in Erwin Data Modeler based on a star schema, developed and implemented the ETL processes using SQL stored procedures and SSIS packages to load fact and dimension tables into the data warehouse, and designed and calculated multiple layers of data aggregates based on each department's requirements. To guarantee data accuracy, I developed clear data lineage diagrams, defined and developed the data validation rules to check accuracy against those diagrams, and automated the process to alert the team to data accuracy or overnight ETL execution problems. I was also responsible for installing, configuring, and maintaining the on-premises Tableau reporting platform: created large data extracts cached on the Tableau server for fast retrieval, developed the processes for quarterly Tableau upgrades, daily Tableau site backups, and routine maintenance, and developed the Tableau scripts to refresh the cached data extracts.
- Upon completion of the data warehouse project, I took ownership of defining the data governance policies, clarifying the processes and procedures for the organization to better manage, utilize, maintain, and protect its data, and clarifying the roles and responsibilities of each department and key individual in assessing, improving, managing, and protecting organizational data.
- One recent notable data warehouse tuning project was a 3-month stabilization and optimization engagement on a 20 TB legacy relational data warehouse. The finance client's large SMP data warehouse constantly had ETL performance issues, and the overnight ETL processes regularly missed their SLAs. I was tasked with stabilizing and optimizing this environment before the replacement product could be implemented. Upon taking this project, I performed thorough data flow and data lineage analysis and identified the processes contributing the most delay to the overnight ETL runs. I took the following steps to fine-tune this large legacy data warehouse system.
- Rewrote and simplified the complex SQL queries within the ETL processes to make them SARGable, removing functions and column calculations from columns used in WHERE clauses and JOIN predicates.
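A SARGable rewrite of this kind can be sketched as follows (the table and column names are hypothetical, not from an actual client system):

```sql
-- Original, non-SARGable form: wrapping the indexed column in a
-- function forces a scan of every row.
--   WHERE YEAR(OrderDate) = 2023
-- SARGable rewrite: move the calculation to the literal side so the
-- optimizer can seek on an OrderDate index.
SELECT OrderID, CustomerID, TotalDue
FROM dbo.Orders
WHERE OrderDate >= '20230101'
  AND OrderDate <  '20240101';
```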
- Used SQL temp tables instead of table variables and CTEs within the ETL processes to stage large intermediate data sets, then created indexes and statistics on the temp tables so the database engine could generate better execution plans. Focusing on the execution plans, performed index tuning, adding new indexes and modifying existing ones to improve index usage, and checked index usage stats and physical stats to make sure the index structures were in good shape.
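The temp-table staging pattern looks roughly like this (object names are illustrative):

```sql
-- Stage a large intermediate result set in a temp table rather than a
-- table variable, then index it so the optimizer gets real statistics
-- instead of a fixed low-cardinality guess.
SELECT o.CustomerID, SUM(o.TotalDue) AS TotalSpend
INTO #CustomerTotals
FROM dbo.Orders AS o
GROUP BY o.CustomerID;

CREATE CLUSTERED INDEX IX_CustomerTotals ON #CustomerTotals (CustomerID);

-- Downstream joins can now use accurate cardinality estimates.
SELECT c.CustomerName, t.TotalSpend
FROM dbo.Customers AS c
JOIN #CustomerTotals AS t ON t.CustomerID = c.CustomerID;

DROP TABLE #CustomerTotals;
```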
- Removed excessive triggers on very busy tables to improve DML performance.
- Improved the data insert strategy for certain large fact tables: dropped all the indexes before the data inserts, then recreated the same index structure afterwards.
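The drop-load-rebuild pattern can be sketched like this (the fact table, staging table, and index names are hypothetical; DROP INDEX IF EXISTS assumes SQL Server 2016+):

```sql
-- Drop the nonclustered index so the bulk insert avoids per-row
-- index maintenance.
DROP INDEX IF EXISTS IX_FactSales_DateKey ON dbo.FactSales;

INSERT INTO dbo.FactSales (DateKey, ProductKey, SalesAmount)
SELECT DateKey, ProductKey, SalesAmount
FROM staging.DailySales;

-- Recreate the identical index structure after the load completes.
CREATE NONCLUSTERED INDEX IX_FactSales_DateKey
    ON dbo.FactSales (DateKey) INCLUDE (SalesAmount);
```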
- Implemented SQL partitioning on the large SQL tables, enabling certain queries to use partition elimination for further performance gains. Implemented a data archival process using SQL partition switching to keep the main fact tables small. Implemented page compression on large fact and dimension tables.
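A minimal sketch of the partition-switch archival and page compression described above (table, partition function, boundary value, and partition numbers are illustrative; the archive table must share the fact table's schema and partition scheme):

```sql
-- Switch the oldest partition out of the fact table into the archive
-- table: a metadata-only operation, nearly instantaneous.
ALTER TABLE dbo.FactSales
    SWITCH PARTITION 1 TO dbo.FactSales_Archive PARTITION 1;

-- Merge away the now-empty boundary so the main table stays compact.
ALTER PARTITION FUNCTION pf_SalesDate()
    MERGE RANGE ('20190101');

-- Apply page compression across the fact table.
ALTER TABLE dbo.FactSales
    REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);
```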
- Implemented incremental statistics updates with full scan on the recent data partitions of the large fact and dimension tables to keep statistics up to date. Implemented regular online index rebuilds on the recent data partitions of the large tables to eliminate physical fragmentation.
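Per-partition statistics refresh and online single-partition rebuilds look roughly like this (object names and partition numbers are illustrative; both features assume SQL Server 2014+):

```sql
-- Enable incremental (per-partition) statistics for the database.
ALTER DATABASE CURRENT SET AUTO_CREATE_STATISTICS ON (INCREMENTAL = ON);

-- Refresh statistics only on the recent partitions; RESAMPLE reuses
-- the existing sample rate (a full scan, if originally built that way).
UPDATE STATISTICS dbo.FactSales (IX_FactSales_DateKey)
    WITH RESAMPLE ON PARTITIONS (22, 23);

-- Rebuild just the most recent partition online to clear fragmentation
-- without blocking readers and writers.
ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales
    REBUILD PARTITION = 23 WITH (ONLINE = ON);
```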
- Created multi-column statistics to feed the database engine accurate density and histogram information for queries with multi-column predicates, helping it generate more efficient execution plans.
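Creating such a statistic is a one-liner (the column pair is hypothetical):

```sql
-- A multi-column statistic on (StoreID, ProductKey) gives the
-- optimizer density information for queries that filter on both,
-- which single-column auto-created statistics cannot provide.
CREATE STATISTICS st_FactSales_Store_Product
    ON dbo.FactSales (StoreID, ProductKey)
    WITH FULLSCAN;
```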
- Performed SQL wait analysis, rebalanced the storage system, and adjusted certain instance-level configurations, such as max server memory, MAXDOP, and cost threshold for parallelism.
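The wait analysis and instance-level adjustments can be sketched as follows (the excluded wait list is abbreviated, and the configuration values are illustrative, not recommendations):

```sql
-- Top waits, filtering out a few common benign wait types.
SELECT TOP (10) wait_type, wait_time_ms, waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('CLR_AUTO_EVENT', 'LAZYWRITER_SLEEP',
                        'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
                        'BROKER_TASK_STOP')
ORDER BY wait_time_ms DESC;

-- Instance-level settings of the kind mentioned above.
EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'max server memory (MB)', 122880;
EXEC sys.sp_configure 'max degree of parallelism', 8;
EXEC sys.sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
```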
- I led the project to perform comprehensive tuning to ensure the backend DB system could handle intensive data writes. I built an 8-node mission-critical active-active SQL 2016 failover cluster to host the OLTP databases, replaced all HDD hard disks with high-performance SSDs, performed thorough storage system I/O performance tests, used local PCIe SSDs to host tempDB to guarantee dedicated I/O bandwidth for tempDB, and ensured all application writes were committed via small transactions. I also built a 10 TB relational data warehouse for OLAP reporting, paired with a standalone Tableau server for better data visualization and reporting.
- For one e-commerce client, I was tasked with designing, implementing, and supporting the mission-critical OLTP backend DB system hosted in AWS. This OLTP DB system was the backend of very busy online shopping websites that could have tens of thousands of concurrent users during the holiday shopping season, so system performance was the top priority from the beginning. Due to budgetary constraints, SQL 2016 Standard Edition was chosen: synchronous SQL mirroring provided high availability and automatic application failover, and log shipping provided secondary warm-standby read-only databases for reporting and other intensive ad hoc readers. I performed thorough OLTP system tuning across all relevant aspects to ensure the backend DB system could handle the large volume of concurrent data reads and writes during the holiday shopping season.
- For one financial client, I led the project to implement a Microsoft Parallel Data Warehouse (PDW) system with the help of onsite Microsoft consultants. This Massively Parallel Processing (MPP) data warehouse system comprised a group of servers with different roles working together as one big data processing appliance: a landing zone node, a control node, two full data racks with 40 computing nodes, 2 management nodes, and one backup node, with more than 60 terabytes of storage in total. I led the SQL tasks to design and implement the overnight processes to extract data from the OLTP DB system and load it into PDW. The nightly ELT process loaded about 200 GB of data from the OLTP system into PDW through a series of steps: 1) extract the daily OLTP data into compressed zip files; 2) move the zip files to the PDW landing zone node; 3) unzip the files; 4) import the files into the PDW system after the necessary data transformations.
- I also created detailed data lineage diagrams showing every step of the data flows, developed the data verification rules and scripts to verify data accuracy following those diagrams, and created automated notifications to alert the team to data accuracy or overnight ELT issues.
- Other than that, I also developed a set of TSQL and PowerShell scripts to perform regular PDW administration tasks, such as full/differential backup, data restores, table partitioning, partition maintenance, and general PDW environment monitoring.
- For one financial client, I was tasked with building a 4-node active-active SQL 2016 cluster on Windows 2012 R2 for business-critical transactional SQL databases, and led the entire planning and implementation phases of this failover clustering project. Specific tasks included: evaluating the physical hardware and SAN storage vendors, conducting I/O performance tests on the storage system, creating and configuring the Windows Server cluster, allocating the shared SAN mount points to the Windows cluster, creating and configuring the SQL cluster, relocating tempDB to dedicated local SSD drives, and planning the HA and DR procedures. The design used the SQL failover cluster for high availability in the local New York data center, and storage-level replication to synchronize the SAN drives to the remote New Jersey data center for disaster recovery.
- For one retail client, I helped architect and implement a SQL Server 2016 HA/DR AlwaysON cluster as the backend OLTP DB cluster: a 4-node multi-site geo-distributed SQL Server 2016 AlwaysON cluster behind the company's online shopping websites, with 3 nodes in the local New York data center and 1 node in the remote disaster recovery data center. I installed and configured the Windows cluster and SQL instances, configured the AlwaysON availability groups with 2 replicas in synchronous mode with automatic failover, and configured the third replica to offload read-only workloads, backup operations, physical CHECKDB maintenance, and SSRS reports from the primary SQL OLTP instance.
- I also conducted a thorough evaluation of multiple storage vendors, including the EMC VMAX, Dell Compellent, and HP 3PAR storage arrays; we picked EMC VMAX as the primary database storage. Evaluated and tested storage performance and storage-layer features such as auto-tiering, deduplication, and fat and thin provisioning; performed storage pool stress tests using the SQLIO utility to compare IOPS and throughput metrics; and configured and optimized storage pools and RAID groups and provisioned LUNs to the SQL Servers based on the database I/O performance requirements.
- Deployed the Double Take (DT) disk replication software to replicate the SQL databases at storage layer to offsite Data Center for Disaster Recovery. Deployed the SolarWinds DPA to monitor SQL waits and SQL activities in SQL Server production environment. Deployed SQL Sentry to monitor SQL databases at the transaction and query layer.
- Led the project for SQL OLTP platform performance tuning. The specific tuning tasks included but were not limited to: use the premium SSDs to improve the I/O performance, split large database data file into multiple ndf files and place them to the SSD drives, move transactional log files to faster rotational HDD disk drives to improve sequential I/O performance, split tempDB data file into multiple ndf files and move to SSD drive, review and adjust the SQL instance and database level settings, enable SQL backup and database table level compression, improve SQL index maintenance strategy, improve SQL statistics update strategy, and implement SQL table partitioning for large tables for better manageability.
- Led the initiative to utilize SQL transactional replication to synchronize a set of SQL tables to multiple SQL subscribers in real time to meet the business requirements. Implemented snapshot replication to push a set of SQL tables to subscribers on a pre-defined schedule. Troubleshot and monitored SQL replication using the replication monitor log, agent output logging, and replication system stored procedures.
- Led the SQL Server consolidation project to build a centralized multi-node SQL Server production cluster, which required extensive prior capacity planning and hardware sizing. After the SQL clustered instances were built, I coordinated with the business areas to gradually migrate all the dispersed SQL databases to the centralized SQL cluster using a tier-by-tier, in-parallel approach. I also developed PowerShell scripts to monitor and alert on the health of the SQL cluster, covering service status, free disk space, CPU usage, RAM usage, and SQL instance failover status, and developed stored procedures to perform routine SQL maintenance tasks such as backups, index rebuilds, index reorganization, statistics updates, and database integrity checks.
- Managed the project to migrate 1000+ SSRS reports from a SQL 2008R2 SSRS site to a SQL 2016 SSRS site. I was tasked with building and configuring the new SQL 2016 SSRS site and migrating the reports using a scripting tool. I developed a daily RDL backup job and SQL scripts to track and report on report execution and security access history. I was also tasked with designing and developing RDLs based on the business reporting requirements, and have solid hands-on experience in report design, development, deployment, access control, report auditing, and report subscriptions.
- Led the project to review and enforce the SQL environment security access control policy: developed the TSQL scripts to report the logins/users with elevated permissions at the instance/database/object level, and removed elevated permissions to enforce the security access policy. Developed and implemented an instance-level DDL trigger to enforce the password policy for all SQL authenticated logins.
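A server-level DDL trigger of the kind described could look like this (a minimal sketch; the trigger name is hypothetical, and a production version would likely also cover ALTER_LOGIN):

```sql
-- Roll back creation of any SQL login that bypasses the Windows
-- password policy (CHECK_POLICY = OFF).
CREATE TRIGGER trg_EnforceLoginPasswordPolicy
ON ALL SERVER
FOR CREATE_LOGIN
AS
BEGIN
    DECLARE @data xml = EVENTDATA();
    DECLARE @login sysname =
        @data.value('(/EVENT_INSTANCE/ObjectName)[1]', 'sysname');

    -- The new login is visible inside the DDL transaction.
    IF EXISTS (SELECT 1 FROM sys.sql_logins
               WHERE name = @login AND is_policy_checked = 0)
    BEGIN
        RAISERROR('SQL logins must be created with CHECK_POLICY = ON.', 16, 1);
        ROLLBACK;
    END
END;
```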
- Led the SSAS project, which involved installing, patching, and configuring the SQL 2008R2 Analysis Services cluster, and designing, developing, and deploying SSAS cubes based on the business analytical needs. Hands-on experience in aggregation, hierarchy, measure group, and cube partition design and deployment, and in automating cube partition processing and storage management. Led the initiative to develop a set of administrative XMLA and MDX scripts to perform cube backup, restore, cube partitioning, partition processing, and Analysis Services (AS) database monitoring and performance tuning.
- Designed and implemented the SQL backup and recovery strategy based on the business needs, utilizing a combination of full, differential, and log backups to ensure all business data was backed up and retained on the storage system for the required retention period. Deployed the third-party tool LiteSpeed to perform SQL backups with encryption and a better compression ratio. Designed, implemented, and validated the SQL database environment Disaster Recovery (DR) procedure; the DR procedure was fully tested during the annual enterprise-wide DR exercise to ensure the DR protocols remained reliable and up to date in case of a real disaster.
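The full/differential/log combination reduces to three BACKUP statements, typically scheduled through SQL Agent (the database name and backup paths below are illustrative):

```sql
-- Weekly full backup: the recovery baseline.
BACKUP DATABASE SalesDB
    TO DISK = N'\\backupshare\SalesDB\SalesDB_full.bak'
    WITH COMPRESSION, CHECKSUM, INIT;

-- Nightly differential: everything changed since the last full.
BACKUP DATABASE SalesDB
    TO DISK = N'\\backupshare\SalesDB\SalesDB_diff.bak'
    WITH DIFFERENTIAL, COMPRESSION, CHECKSUM, INIT;

-- Frequent log backups: bound data loss to the backup interval
-- (requires the FULL recovery model).
BACKUP LOG SalesDB
    TO DISK = N'\\backupshare\SalesDB\SalesDB_log.trn'
    WITH COMPRESSION, CHECKSUM;
```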
- Designed and developed SSIS packages based on the business ETL needs. Solid hands-on experience in SSIS package design, development, deployment, configuration, logging, troubleshooting, and management. The following lists some typical package development projects I worked on:
- Extract data from various data sources to compressed csv files and send the csv files to the intended recipient.
- Retrieve the flat files from remote server and load the files into the SQL tables after applying necessary transformations.
- Transfer database objects between SQL Server instances.
- Loop the OS folder to load the files with the specific file extension.
- Execute TSQL scripts against SQL Server data sources.
- Execute windows process following the specific sequence.
- Process the SSAS Cube measure group partitions.
- Execute XMLA scripts against the SSAS database source.
Sr. SQL Server DBA / Data warehouse technologist
- Led the upgrade project from SQL 2005 to SQL 2008R2, acting as the liaison between the SQL team and the business areas to coordinate the tier-by-tier upgrade process. Developed the procedures to automate the SQL upgrade process, defined and scripted the SQL instance configuration golden standard, and utilized SysPrep for SQL Server auto-deployment.
- Defined the security control model covering OS file system access control, SQL instance- and database-level access control, and object-level access control. For SOX-audited DBs, no direct table access was granted; users had to go through SQL views and stored procedures to interact with the database objects.
- Developed a utility to capture the SQL application account usage outside of the application, developed a set of SQL stored procedures to invoke the SQL profiler trace to run in the background to capture the details of each incoming SQL connection. Designed and deployed a set of SQL application account usage reports to SSRS report server, utilized the report subscription to deliver reports to the security auditing team.
- Set up transactional replication to synchronize a subset of tables to multiple subscribers in real time for reporting purposes. Troubleshot and fine-tuned replication using the replication monitor log, output verbose logging, and replication stored procedures. Utilized log shipping and SQL mirroring to create warm-standby SQL DBs for the business areas. Built an SSIS package using the Copy Database task to copy databases to a secondary server on a schedule.
- Define a set of SQL development best practices and query tuning steps shared with SQL developers, covering normalization, table schema design, indexing, transaction isolation level, locking, join algorithm, dynamic SQL, etc.
- Develop T-SQL scripts to monitor the health of SQL server, such as top expensive queries, current blockings, long running transactions, missing index, unused indexes, I/O stalls and pending I/O requests, top wait types, top fragmented tables, and top sessions with the most tempDB space usage.
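Two representative snippets from such a monitoring script set, built on the standard DMVs (a sketch; thresholds and TOP counts would be tuned per environment):

```sql
-- Top CPU-consuming statements from the plan cache.
SELECT TOP (10)
    qs.total_worker_time / qs.execution_count AS avg_cpu_us,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY avg_cpu_us DESC;

-- Current blocking: sessions waiting on another session's locks.
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;
```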
- Summarize a group of key perfmon counters to measure and monitor CPU, memory, I/O system, network, and SQL process performance. Define SQL profiler templates to capture different types of SQL activities, such as long running queries, execution plans and deadlocking.
- Develop T-SQL scripts to create SQL agent jobs to perform SQL routine activities, including full backup, differential backup, log backup, SQL login backup, DB integrity check, re-indexing, index-reorganize, statistics update and history purging.
- Develop an automation process that enables SQL developers to check-in and execute SQL DML/DDL scripts themselves.
- Programming on SQL Server SMO interfaces using VB.NET to build a utility to push SQL maintenance jobs to SQL agent.
Sr. SQL Server DBA
- Create Very Large database with multiple data files, and spread ndf files across different physical disks
- Create and configure database file-group to separate system metadata from application data
- Configure Very Large database environment (split and spread tempDB data file, pre-size tempDB, master and msdb file size)
- Backup Very Large database with multiple parallel backup threads to multiple physical disks to utilize parallel I/O
- Create and schedule database maintenance jobs, including full backup, log backup, index rebuild, and integrity check
- Execute DB re-indexing and index defragmentation on indexes with different fragmentation levels
- Update the statistics of the given database based on the sampling rate
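The fragmentation-driven maintenance and sampled statistics updates in the two bullets above can be sketched as follows (index and table names are illustrative; the 5%/30% thresholds are the common rule of thumb, not a fixed standard):

```sql
-- Find fragmented indexes in the current database.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 5
  AND i.name IS NOT NULL;

-- Then, per index, based on the fragmentation level:
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REORGANIZE;  -- 5-30%
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD;     -- > 30%

-- Statistics refresh at a chosen sampling rate.
UPDATE STATISTICS dbo.Orders WITH SAMPLE 25 PERCENT;
```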
- Configure and support SQL transactional replication for the SCCM environment, which uses transactional replication to synchronize datasets across three geo-distributed data centers. Implement log shipping and SQL mirroring with snapshots to create secondary warm-standby reporting databases.
SQL Server DBA
- 24x7 on-call support for mission-critical database systems
- Installation, Upgrade, Patching, Configuration, Backup and Restore
- Monitoring (backup job status, disk space, SQL error log, Windows event log error messages)
- Performance Troubleshooting and Tuning (SQL profiler, Performance monitor and custom scripts)