There aren’t many benchmarks which allow you to test your systems against a hybrid OLTP and OLAP workloads. CH-BenCHmark fills that gap by combining TPC-C and TPC-H. You can download the source from the linked site or you can use something like OLTPBench (a collection of benchmarks). However, the TPC-H modified queries are not written for SQL server. In this article, I will add the modified CH-BenCHmark OLAP queries for SQL Server.
This article demonstrates the physical design of a multi-column non-clustered index with include-columns. Many examples on the internet only demonstrate the most simple version of an index with a single column. This article gives a proper view of an index with multiple columns through a simple example. Furthermore, you can see how the include-columns are stored, only at the leaf level of the tree.
Here we use a simple table ‘People’ with 6 columns (ID, First Name, Last Name, Age, Sex, Address). We assume we already have a clustered index created on the ID column (it will be almost no difference if there is no clustered index as well, explained at the end). Now we are going to create the non-clustered index as defined below.
1
2
3
4
CREATE NONCLUSTERED INDEX IX_NAME ON People
(FirstName,LastName)
INCLUDE(Age,Sex)
GO
Below diagram shows the structure of this non-clustered index.
Recently I wanted to run the JOB benchmark for an experiment. This benchmark uses an IMDB dataset, published in 2013. Initially, I had some trouble running the benchmark as it was designed for a PostgreSQL database. And the dataset was created in a UNIX system which can create issues when used in a Windows system. So I decided to share the exact steps you need to take to take in order to create a Microsoft SQL Server database with IMDB dataset. All the scripts used in the project can be found in this Git repo.
If you are a Database administrator or a developer working with a transaction database, you might have come across this problem
“Is it worthy to build that index?”
Exact answer for that question is only known once you build it. However, luckily SQL server provides you with functionality to check the workload performance under hypothetical indexes (without actually creating them)
You can find more information about hypothetical indexes here.
I will just provide you with a simple python code that will help you with the hypothetical index creation. Example code will compose of 3 parts
Index creation
Enabling the index (unlike the normal indexes you need to enable them before using)