Skip to content

Instantly share code, notes, and snippets.

View yuzefovich's full-sized avatar

Yahor Yuzefovich yuzefovich

  • Cockroach Labs
  • San Francisco, CA
View GitHub Profile
name old time/op new time/op delta
ExternalDistinct/spilled=false/ordering=false/shuffled/newTupleProbability=0.100/rows=1024/ordCols=0/type=int-24 42.8µs ± 2% 39.3µs ± 4% -7.98% (p=0.000 n=10+10)
ExternalDistinct/spilled=false/ordering=false/shuffled/newTupleProbability=0.001/rows=1024/ordCols=0/type=int-24 37.6µs ± 2% 34.9µs ± 1% -7.27% (p=0.000 n=10+10)
ExternalDistinct/spilled=false/ordering=false/shuffled/newTupleProbability=0.001/rows=1024/ordCols=0/type=bytes-24 63.2µs ± 2% 59.9µs ± 0% -5.23% (p=0.000 n=10+10)
ExternalDistinct/spilled=true/ordering=false/shuffled/newTupleProbability=0.001/rows=1024/ordCols=0/type=bytes-24 1.53ms ± 9% 1.45ms ± 9% -5.20% (p=0.015 n=10+10)
ExternalDistinct/spilled=true/ordering=true/shuffled/newTupleProbability=0.100/rows=65536/ordCols=0/type=bytes-24 32.5ms ± 3% 31.1ms ± 2% -4.21% (p=0.00

QA Plan - DistSQL with PCR

Background

Reads from stand-by virtual clusters (VCs) are supported by creating another read-only virtual cluster. The original replicating VC is offline and cannot be accessed while we can connect to the reader VC to read data that is "external" to it (but "internal" to the replicating one). This makes it so that some of the observability features (e.g. crdb_internal.ranges virtual table) can be misleading because they show the state of the reader key space (which is empty) and not of the actual stand-by VC (which is non-empty).

Add the AOST information into the DistSQL diagram (cockroachdb/cockroach#127583) to make it easier to confirm that the AOST is added implicitly.

CREATE TABLE t (k INT PRIMARY KEY);
INSERT INTO t SELECT generate_series(1, 100);

Test that in miscellaneous scenarios the inconsistent scan is used:

EXPLAIN (DISTSQL) CREATE STATISTICS s FROM t;

Takeaways

  1. Optimistic strategy is always beneficial (as expected given we have a hint about the previous value).
  2. Implicit strategy wins against Explicit one with no secondary indexes (as expected).
  3. Implicit vs Explicit choice is not as clear with addition of secondary indexes - I'd expect that the latter would be better since we should be losing 1PC optimization of Implicit. The tradeoff appears to be lower latency (with Implicit) vs lower CPU usage and higher foreground QPS (with Explicit).
@yuzefovich
yuzefovich / ldr_experiments
Last active June 25, 2024 04:22
LDR experiments with ingestion strategies
# Default YCSB schema.
Using c6d2f7d87e7fa3edc392d32ffe73662c5aa02abe + #126181
SET CLUSTER SETTING sql.internal_executor.attribute_to_user.enabled = false;
3:14 workload run
3:15 jobs start - 1m dip in QPS from 11k to 1.2k
CPU spiked from 33% up to 93%, then settled at 55%
name old time/op new time/op delta
BoundAccountGrow-24 2.07ns ± 0% 2.08ns ± 1% +0.47% (p=0.003 n=10+10)
TraverseTree/levels=2/children=2-24 79.3ns ± 4% 62.0ns ± 5% -21.88% (p=0.000 n=10+10)
TraverseTree/levels=2/children=4-24 122ns ± 1% 102ns ± 4% -16.28% (p=0.000 n=9+10)
TraverseTree/levels=2/children=8-24 214ns ± 2% 185ns ± 0% -13.40% (p=0.000 n=9+9)
TraverseTree/levels=4/children=2-24 445ns ± 1% 307ns ± 1% -31.12% (p=0.000 n=8+10)
TraverseTree/levels=4/children=4-24 2.23µs ± 0% 1.73µs ± 0% -22.34% (p=0.000 n=10+9)
TraverseTree/levels=4/children=8-24 15.9µs ± 0% 12.2µs ± 0% -23.59% (p=0.000 n=9+10)
TraverseTree/levels=8/children=2-24 8.01µs ± 0% 5.29µs ± 0% -33.90% (p=0.000 n=9+10)
TraverseTree/levels=8/children=4-24 669µs ± 1% 478µs ± 1% -28.54% (p=0.000 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=1/numInputRows=1-24 13.2µs ± 6% 11.2µs ± 5% -15.16% (p=0.000 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=1/numInputRows=32-24 18.1µs ± 5% 15.0µs ± 4% -17.08% (p=0.000 n=10+9)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=2/numInputRows=32-24 17.2µs ± 4% 13.8µs ± 3% -19.90% (p=0.000 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=32/numInputRows=32-24 14.9µs ± 2% 11.9µs ± 7% -20.27% (p=0.000 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=1/numInputRows=1024-24 169µs ± 2% 164µs ± 2% -2.91% (p=0.000 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=2/numInputRows=1024-24 144µs ± 2% 140µs ± 2% -2.29% (p=0.000 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=32/numInputRows=1024-24 62.1µs ± 5% 60.7µs ± 3% -2.17% (p=0.043 n=10+10)
Aggregator/ANY_NO
Select1/Cockroach-24 114µs ± 4% 119µs ± 5% +4.24% (p=0.004 n=10+10)
Select1/SharedProcessTenantCockroach-24 141µs ± 3% 143µs ± 2% ~ (p=0.182 n=10+9)
Select1/MultinodeCockroach-24 121µs ± 2% 127µs ± 3% +5.16% (p=0.000 n=10+10)
Select2/Cockroach-24 547µs ± 2% 562µs ± 1% +2.83% (p=0.000 n=9+9)
Select2/SharedProcessTenantCockroach-24 594µs ± 1% 611µs ± 1% +2.89% (p=0.000 n=10+9)
Select2/MultinodeCockroach-24 1.09ms ± 2% 1.11ms ± 2% +2.38% (p=0.000 n=10+9)
Select3/Cockroach-24
Aggregator/ANY_NOT_NULL/hash/int/groupSize=1/numInputRows=1-24 12.3µs ± 3% 11.5µs ± 1% -6.80% (p=0.000 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=1/numInputRows=32-24 16.8µs ± 2% 15.8µs ± 4% -6.36% (p=0.000 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=2/numInputRows=32-24 15.3µs ± 2% 14.8µs ± 5% -3.44% (p=0.043 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=32/numInputRows=32-24 13.0µs ± 1% 13.0µs ± 4% ~ (p=0.853 n=10+10)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=1/numInputRows=1024-24 153µs ± 3% 153µs ± 1% ~ (p=0.905 n=10+9)
Aggregator/ANY_NOT_NULL/hash/int/groupSize=2/numInputRows=1024-24
SELECT pg_catalog.setval('public.actor_actor_id_seq', 200, true);
SELECT pg_catalog.setval('public.address_address_id_seq', 605, true);
SELECT pg_catalog.setval('public.category_category_id_seq', 16, true);
SELECT pg_catalog.setval('public.city_city_id_seq', 600, true);
SELECT pg_catalog.setval('public.country_country_id_seq', 109, true);