This should give an idea of the relative throughput of the models. I could not tell which would be fastest from the names alone.
This is only a speed test. Larger models generally score better on evaluation benchmarks, at the cost of speed. Find the models that meet your throughput requirements, then benchmark those for quality on the task you are doing.
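As a rough illustration of that two-step workflow, here is a minimal timing sketch. It is not the harness used for these numbers. The names `measure_throughput` and `shortlist` are hypothetical, and `encode` stands in for whatever callable runs your model on a batch.

```python
import time
from typing import Callable, Dict, List, Sequence


def measure_throughput(encode: Callable[[Sequence[str]], object],
                       batch: Sequence[str],
                       warmup: int = 2,
                       repeats: int = 5) -> float:
    """Return items processed per second for a single callable.

    `encode` is a placeholder for your model's inference call (assumption).
    If it launches asynchronous GPU work, make it block until the work is
    done (for PyTorch, call torch.cuda.synchronize() inside it), otherwise
    the clock stops too early.
    """
    if not batch:
        raise ValueError("batch must not be empty")
    # Warm-up runs absorb one-time costs such as lazy initialisation.
    for _ in range(warmup):
        encode(batch)
    start = time.perf_counter()
    for _ in range(repeats):
        encode(batch)
    # Guard against a zero reading on very fast callables.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (len(batch) * repeats) / elapsed


def shortlist(throughputs: Dict[str, float], minimum: float) -> List[str]:
    """Keep the models that meet the minimum rate, fastest first."""
    keep = [name for name, rate in throughputs.items() if rate >= minimum]
    return sorted(keep, key=lambda name: throughputs[name], reverse=True)


if __name__ == "__main__":
    # Dummy workload so the sketch runs without any model installed.
    sentences = ["an example sentence"] * 64
    rate = measure_throughput(lambda b: [len(s) for s in b], sentences)
    print(f"dummy encoder: {rate:.0f} items/s")

    # Placeholder names and rates, not measured results.
    rates = {"model_a": 900.0, "model_b": 400.0, "model_c": 120.0}
    print(shortlist(rates, minimum=300.0))
```

Only the models that survive `shortlist` then need the slower, task-specific quality evaluation.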
Tested on an NVIDIA RTX 3090. The CPU is an AMD Ryzen 9 7950X, though that should not affect the benchmark much.