Skip to content

Instantly share code, notes, and snippets.

@ZacharyPatten
Created January 31, 2021 05:54
Show Gist options
  • Save ZacharyPatten/8de188b2bd358ab5c3517cbb55e83632 to your computer and use it in GitHub Desktop.
Save ZacharyPatten/8de188b2bd358ab5c3517cbb55e83632 to your computer and use it in GitHub Desktop.
Random Generation (with efficient exclusions)

Note: code examples are in C#, but the theory should be applicable to most programing languages

"How do I generate random numbers except for certain values?" - This is a relatively common question that I aim to answer with this post. I wrote some extension methods for the topic that look like this:

Random random = new();

int[] randomNumbers = random.Next(
    count: 5,
    minValue: 0,
    maxValue: 100,
    excluded: stackalloc[] { 50, 51, 52, 53 });

// if you want to prevent duplicate values
int[] uniqueRandomNumbers = random.NextUnique(
    count: 5,
    minValue: 0,
    maxValue: 100,
    excluded: stackalloc[] { 50, 51, 52, 53 });

There are two algorithms you can use:

  1. Pool Tracking: you can dump the entire pool of possible values in a data structure (such as an array) and randomly generate indices of that data structure. See Example Here
  2. Roll Tracking: you can track the values that that need to be excluded, reduce the range of random generation, and then apply an offset to the generated values. See Example Here

Which algorithm is faster? It depends... Here are estimated runtime complexities of each algorithm:

  1. Pool Tracking: O(range + count + excluded)
  2. Roll Tracking: O(count * excluded + excluded ^ 2)

Notice how algorithm #1Pool Tracking is dependent on the range of possible values while algorithm #2 Roll Tracking is not. This means if you have a relatively large range of values, then algorithm #2 is faster, otherwise algorithm #1 is faster. So if you want the most efficient method, you just need to compare those runtime complexities based on the parameters and select the most appropriate algorithm. Here is what my "Next" overload currently looks like: See Source Code Here

public static void Next<Step, Random>(int count, int minValue, int maxValue, ReadOnlySpan<int> excluded, Random random = default, Step step = default)
	where Step : struct, IAction<int>
	where Random : struct, IFunc<int, int, int>
{
	if (count * excluded.Length + .5 * Math.Pow(excluded.Length, 2) < (maxValue - minValue) + count + 2 * excluded.Length)
	{
		NextRollTracking(count, minValue, maxValue, excluded, random, step);
	}
	else
	{
		NextPoolTracking(count, minValue, maxValue, excluded, random, step);
	}
}

Notes:

Specifically to point out one benchmark in particular:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.1316 (1909/November2018Update/19H2)
Intel Core i7-4790K CPU 4.00GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=5.0.200-preview.20614.14
  [Host]     : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT
  Job-JNMXTF : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT

InvocationCount=1  UnrollFactor=1  
Method MinValue MaxValue Count Exclued Mean Error StdDev Median
Towel 0 1000000 1: sqrt(sqrt(range)) 1: sqrt(sqrt(range)) 3.575 μs 0.2149 μs 0.6025 μs 3.300 μs
HashSetAndArray 0 1000000 1: sqrt(sqrt(range)) 1: sqrt(sqrt(range)) 6,172.451 μs 128.1054 μs 369.6132 μs 6,091.300 μs
SetHashLinkedAndArray 0 1000000 1: sqrt(sqrt(range)) 1: sqrt(sqrt(range)) 6,975.152 μs 344.0002 μs 1,014.2922 μs 6,633.050 μs
RelativelySimpleCode 0 1000000 1: sqrt(sqrt(range)) 1: sqrt(sqrt(range)) NA NA NA NA

In that benchmark the range was 1,000,000 and the count was sqrt(sqrt(1,000,000)) ~= 31 and the number of excluded values was sqrt(sqrt(1,000,000)) ~= 31 so it is a rather extreme example but it demonstrates the difference between algorithm #1 (3.575 μs) and #2 (6,172.451 μs).

Thanks for reading. Feedback appreciated. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment