- Optimistic strategy is always beneficial (as expected given we have a hint about the previous value).
- Implicit strategy wins against Explicit one with no secondary indexes (as expected).
- Implicit vs Explicit choice is not as clear with addition of secondary indexes - I'd expect that the latter would be better since we should be losing 1PC optimization of Implicit. The tradeoff appears to be lower latency (with Implicit) vs lower CPU usage and higher foreground QPS (with Explicit).
- We should consider introducing some pacing whenever the replication job is started or resumed if there is data that needs replication - the hit on the foreground workload is massive.
-- Disable ingestion queries attribution to user traffic to see more easily the foreground QPS
SET CLUSTER SETTING sql.internal_executor.attribute_to_user.enabled = false;
-- Disable auto stats collection to have cleaner profiles (focused on the foreground workload and replication)
SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false;
CPU cluster: 33% QPS: 10.6k P99 latency: 3.2ms
22:20 CPU spiked to 93%, QPS dropped to 1.3k at the bottom 22:22 CPU stabilized at 54%, QPS stabilized at 7.3k-7.8k
SQL CPU workload: 11.0% SQL CPU replication: 16.6% CPU cluster: 54% QPS: 7.7k P99 latency: 7.0ms
SET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled = false;
Slight slowdown overall.
SQL CPU workload: 9.7% SQL CPU replication: 22.5% CPU cluster: 57% QPS: 6.5k P99 latency: 9.8ms
SET CLUSTER SETTING logical_replication.consumer.use_implicit_txns.enabled = false; RESET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled;
CPU lower than with defaults, but QPS is lower and latency is higher.
SQL CPU workload: 10.2% SQL CPU replication: 14.9% CPU cluster: 51% QPS: 7.4k P99 latency: 12.0ms
SET CLUSTER SETTING logical_replication.consumer.use_implicit_txns.enabled = false; SET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled = false;
Slight slowdown overall.
SQL CPU workload: 8.6% SQL CPU replication: 22.9% CPU cluster: 56% QPS: 6.3k P99 latency: 16.4ms
RESET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled; RESET CLUSTER SETTING logical_replication.consumer.use_implicit_txns.enabled;
-- there were some issues at the storage level on one node of one cluster, so I wiped everything to start from scratch
CREATE INDEX ON ycsb.usertable(field0); SET CLUSTER SETTING sql.internal_executor.attribute_to_user.enabled = false; SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false;
CPU cluster: 63% QPS: 7.6k P99 latency: 4.2ms
23:06 CPU spiked to 93%, QPS dropped to 470 at the bottom 23:11 cluster B stabilized at 70% CPU, 6.8k QPS, two clusters appear to "fight" a bit 23:14 cluster A stabilized at 76% CPU, 4.4k QPS, cluster B is at 75%, 4.8k
SQL CPU workload: 5.5% SQL CPU replication: 7.5% CPU cluster: 75% QPS: 4.6k P99 latency: 16.2ms
SET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled = false;
Slight slowdown overall.
SQL CPU workload: 4.6% SQL CPU replication: 12.5% CPU cluster: 75% QPS: 4.2k P99 latency: 22.8ms
SET CLUSTER SETTING logical_replication.consumer.use_implicit_txns.enabled = false; RESET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled;
23:26-23:31 spikes in WAL Fsync Latencies on cluster A 23:33-23:34 spikes in WAL Fsync Latencies on cluster B
CPU lower than with defaults with QPS being higher and latency slightly higher.
SQL CPU workload: 5.9% SQL CPU replication: 7.8% CPU cluster: 71% QPS: 4.9k P99 latency: 20.8ms
SET CLUSTER SETTING logical_replication.consumer.use_implicit_txns.enabled = false; SET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled = false;
Slight slowdown overall.
SQL CPU workload: 5.5% SQL CPU replication: 10.8% CPU cluster: 72% QPS: 4.2k P99 latency: 26.0ms
RESET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled; RESET CLUSTER SETTING logical_replication.consumer.use_implicit_txns.enabled;
SET CLUSTER SETTING logical_replication.consumer.use_implicit_txns.enabled = false; RESET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled;
-- about 113GiB of live bytes
CREATE INDEX ON ycsb.usertable(field1);
-- about 167GiB of live bytes
CREATE INDEX ON ycsb.usertable(field2);
Things stabilized.
SQL CPU workload: 3.1% SQL CPU replication: 5.0% CPU cluster: 79% QPS: 2.6k P99 latency: 54ms
SET CLUSTER SETTING logical_replication.consumer.use_implicit_txns.enabled = true; SET CLUSTER SETTING logical_replication.consumer.try_optimistic_insert.enabled = true;
SQL CPU workload: 3.5% SQL CPU replication: 3.9% CPU cluster: 81% QPS: 2.6k P99 latency: 31ms