26 Oct

DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees

Abstract:

“Automating physical database design has remained a long-term interest in database research due to substantial performance gains afforded by optimised structures. Despite significant progress, a majority of today’s commercial solutions are highly manual, requiring offline invocation by database administrators (DBAs) who are expected to identify and supply representative training workloads. Unfortunately, the latest advancements like query stores provide only limited support for dynamic environments. This status quo is untenable: identifying representative static workloads is no longer realistic; and physical design tools remain susceptible to the query optimiser’s cost misestimates (stemming from unrealistic assumptions such as attribute value independence and uniformity of data distribution). We propose a self-driving approach to online index selection that eschews the DBA and query optimiser, and instead learns the benefits of viable structures through strategic exploration and direct performance observation. We view the problem as one of sequential decision making under uncertainty, specifically within the bandit learning setting. Multi-armed bandits balance exploration and exploitation to provably guarantee average performance that converges to a fixed policy that is optimal with perfect hindsight. Our comprehensive empirical results demonstrate up to 75% speed-up on shifting and ad-hoc workloads and 28% speed-up on static workloads compared against a state-of-the-art commercial tuning tool.” [1]

[1] Full Paper: https://arxiv.org/abs/2010.09208
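
To make the exploration/exploitation idea in the abstract concrete, here is a minimal sketch of a classic multi-armed bandit (UCB1) choosing between candidate indexes. This is not the algorithm from the paper; it is a textbook bandit used only for illustration. The index names and the reward probabilities are hypothetical, and the "reward" stands in for an observed speed-up after creating an index.

```python
import math
import random


def bernoulli(p):
    """Return a reward function that pays 1.0 with probability p, else 0.0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


def ucb1(reward_fns, rounds, seed=0):
    """Run UCB1 and return (play counts, estimated mean reward) per arm."""
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Play every arm once so each has an initial estimate.
            arm = t - 1
        else:
            # Pick the arm with the highest optimistic estimate:
            # observed mean plus an exploration bonus that shrinks
            # as the arm is played more often.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Hypothetical candidate indexes with made-up chances of paying off.
index_names = ["idx_orders_date", "idx_customer_region", "idx_lineitem_part"]
arms = [bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)]

counts, means = ucb1(arms, rounds=5000)
for name, c, m in zip(index_names, counts, means):
    print(f"{name}: played {c} times, estimated reward {m:.2f}")
```

The bandit spends a few rounds on every candidate, then concentrates on whichever one keeps paying off, while still occasionally revisiting the others in case the workload shifts.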

23 Aug

CH-BenCHmark for MS SQL Server – HTAP benchmarking

There aren’t many benchmarks that let you test a system against a hybrid OLTP and OLAP workload. CH-BenCHmark fills that gap by combining TPC-C and TPC-H. You can download the source from the linked site, or you can use something like OLTPBench (a collection of benchmarks). However, the modified TPC-H queries are not written for SQL Server. In this article, I will add the modified CH-BenCHmark OLAP queries for SQL Server.

03 Apr

Split brain issue in Elasticsearch: 2-node Cassandra for Titan graph database

Split brain issue in Elasticsearch

I recently configured a 2-node Cassandra cluster with Elasticsearch installed on both nodes, which created a second 2-node cluster (the Cassandra cluster was built using DataStax OpsCenter). Soon after configuring the cluster, I found out that there are x + 1 issues when you try to do this with only 2 nodes.

Split brain problem in Elasticsearch

When 2 or more nodes act as master nodes, we call it a split brain. This can happen when one node does not know that it is connected to a cluster that already has a master. When there is more than one master, indexing requests are completed by both nodes, and after some time the two copies of a shard diverge. We might have to do a full re-index to recover from this. More information about the split-brain problem is available here.
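
To see why 2 nodes are awkward here, a small sketch of the usual quorum arithmetic helps. It assumes an Elasticsearch version before 7.0 using Zen discovery, where the `discovery.zen.minimum_master_nodes` setting is commonly set to (master-eligible nodes / 2) + 1; the post itself does not say which version or setting was used, and the function names below are made up for illustration.

```python
def minimum_master_nodes(master_eligible):
    """Quorum size: a strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while keeping a quorum."""
    return master_eligible - minimum_master_nodes(master_eligible)


for n in (2, 3, 5):
    print(
        f"{n} nodes: quorum = {minimum_master_nodes(n)}, "
        f"can lose {tolerated_failures(n)}"
    )
```

With 2 nodes the quorum is 2, so the cluster cannot elect a master if either node is unreachable. Lowering the quorum to 1 keeps it available, but then each node can elect itself, which is exactly the split-brain case described above.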

