Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
---|---|---|---|---|---|
mlabonne/OmniTruthyBeagle-7B-v0 | 57.8 | 45.72 | 77.49 | 76.16 | 50.18 |
mlabonne/NeuralOmniBeagle-7B-v2 | 57.75 | 45.86 | 77.31 | 75.34 | 50.09 |
mlabonne/OmniBeagle-7B | 57.72 | 45.64 | 77.48 | 75.03 | 50.03 |
mlabonne/NeuralOmniBeagle-7B | 57.71 | 45.85 | 77.26 | 76.06 | 50.03 |
mlabonne/NeuralOmni-7B | 57.7 | 45.8 | 77.5 | 74.51 | 49.8 |
mlabonne/OmniTruthyBeagle-7B | 57.69 | 45.65 | 77.22 | 75.77 | 50.21 |
mlabonne/Omnarch-7B | 57.64 | 45.88 | 77.28 | 74.07 | 49.76 |
mlabonne/BeagleB-7B | 57.61 | 45.19 | 77.75 | 73.19 | 49.88 |
mlabonne/Monarch-7B | 57.56 | 45.48 | 77.07 | 78.04 | 50.14 |
mlabonne/Beyonder-4x7B-v3 | 57.55 | 45.85 | 76.67 | 74.98 | 50.12 |
abideen/AlphaMonarch-daser | 57.55 | 45.48 | 76.95 | 78.46 | 50.21 |
mlabonne/AlphaMonarch-7B | 57.53 | 45.37 | 77.01 | 78.39 | 50.2 |
mlabonne/NeuralMonarch-7B | 57.53 | 45.31 | 76.99 | 78.35 | 50.28 |
mlabonne/Beagle4 | 57.53 | 45.5 | 77.38 | 73.84 | 49.7 |
shadowml/BeagSake-7B | 57.53 | 45.9 | 77.36 | 72.82 | 49.32 |
shadowml/WestBeagle-7B | 57.52 | 46.19 | 77.23 | 72.25 | 49.15 |
abideen/AlphaMonarch-dora | 57.51 | 45.42 | 76.93 | 78.48 | 50.18 |
abideen/AlphaMonarch-laser | 57.51 | 45.39 | 77.0 | 78.4 | 50.15 |
mlabonne/Monarch-7B-dare | 57.44 | 45.16 | 77.22 | 77.98 | 49.95 |
shadowml/WestBeagle-7B-gen3 | 57.42 | 45.74 | 77.28 | 72.29 | 49.23 |
mlabonne/ArchBeagle-7B | 57.41 | 45.56 | 77.32 | 73.36 | 49.36 |
shadowml/OmnixBeagle-7B | 57.38 | 45.3 | 77.64 | 75.24 | 49.2 |
shadowml/BeagleSempra-7B | 57.38 | 45.56 | 77.44 | 73.35 | 49.15 |
mlabonne/Monarch-7B-slerp | 57.26 | 45.13 | 77.09 | 78.63 | 49.56 |
shadowml/FoxBeagle-7B | 57.26 | 45.46 | 77.42 | 72.08 | 48.91 |
shadowml/BeagleX-7B | 57.18 | 45.39 | 77.52 | 72.91 | 48.63 |
flemmingmiguel/MBX-7B-v3 | 57.14 | 45.08 | 77.72 | 72.61 | 48.63 |
mlabonne/Zebrafish-7B | 57.13 | 44.92 | 77.18 | 78.25 | 49.28 |
shadowml/Beaglake-7B | 57.1 | 45.03 | 77.8 | 72.58 | 48.48 |
shadowml/TurdusBeagle-7B-gen3 | 57.1 | 45.08 | 77.52 | 70.36 | 48.69 |
shadowml/TurdusBeagle-7B | 57.1 | 45.08 | 77.52 | 70.36 | 48.69 |
shadowml/MBTrix-7B | 57.08 | 44.92 | 77.14 | 77.26 | 49.18 |
mlabonne/Zebrafish-slerp-7B | 57.07 | 44.83 | 77.13 | 78.27 | 49.25 |
shadowml/Beagwake-7B | 57.04 | 45.03 | 77.54 | 72.37 | 48.56 |
shadowml/MBeagleX-7B | 57.02 | 45.02 | 76.87 | 78.04 | 49.18 |
mlabonne/Zebrafish-linear-7B | 56.98 | 44.58 | 77.12 | 78.25 | 49.24 |
mlabonne/Zebrafish-dare-7B | 56.96 | 44.68 | 77.0 | 78.28 | 49.21 |
mlabonne/UltraMerge-7B | 56.95 | 44.36 | 77.15 | 78.47 | 49.35 |
mlabonne/NeuralBeagle14-7B | 56.9 | 46.06 | 76.77 | 70.32 | 47.86 |
mlabonne/FrankenMonarch-11b | 56.89 | 44.01 | 76.45 | 76.7 | 50.22 |
yam-peleg/Experiment26-7B | 56.85 | 44.49 | 77.06 | 78.58 | 49.0 |
mlabonne/NeuBeagle-7B | 56.81 | 44.43 | 76.62 | 79.13 | 49.38 |
bardsai/jaskier-7b-dpo-v3.3 | 56.77 | 44.57 | 76.53 | 80.0 | 49.22 |
mlabonne/DareBeagle-7B-v2 | 56.75 | 45.6 | 76.58 | 69.48 | 48.07 |
CultriX/NeuralTrix-7B-dpo | 56.73 | 44.61 | 76.33 | 79.8 | 49.24 |
shadowml/DareBeagle-7B | 56.72 | 45.47 | 76.63 | 69.48 | 48.05 |
CultriX/NeuralTrix-bf16 | 56.7 | 44.43 | 76.43 | 80.18 | 49.23 |
mlabonne/UltraMerge-v2-7B | 56.69 | 44.16 | 76.72 | 79.58 | 49.2 |
argilla/distilabeled-Marcoro14-7B-slerp | 56.68 | 45.38 | 76.48 | 65.68 | 48.18 |
flemmingmiguel/MBX-7B-v2 | 56.66 | 44.23 | 77.27 | 71.04 | 48.47 |
mlabonne/NeuralDaredevil-7B | 56.65 | 45.23 | 76.2 | 67.61 | 48.52 |
shadowml/DareBeagel-2x7B | 56.63 | 45.51 | 76.56 | 69.45 | 47.82 |
mlabonne/FrakenBeagle14-11B | 56.58 | 45.08 | 76.08 | 70.93 | 48.58 |
occultml/CatMarcoro14-7B-slerp | 56.14 | 45.21 | 75.91 | 63.81 | 47.31 |
shadowml/mibe-7B | 56.13 | 44.22 | 76.9 | 71.25 | 47.27 |
mlabonne/NeuralDarewin-7B | 56.08 | 45.6 | 74.29 | 63.15 | 48.35 |
mlabonne/Beagle14-7B | 56.05 | 44.38 | 76.53 | 69.44 | 47.25 |
shadowml/Daredevil-7B | 56.0 | 44.85 | 76.07 | 64.89 | 47.07 |
mlabonne/Darewin-7B | 55.96 | 45.08 | 75.36 | 60.94 | 47.44 |
mlabonne/NeuralMarcoro14-7B | 55.89 | 44.59 | 76.17 | 65.94 | 46.9 |
mlabonne/Beyonder-4x7B-v2 | 55.88 | 45.29 | 75.95 | 60.86 | 46.4 |
OpenPipe/mistral-ft-optimized-1218 | 55.84 | 44.74 | 75.6 | 59.89 | 47.17 |
SanjiWatsuki/Kunoichi-DPO-v2-7B | 55.83 | 44.79 | 75.05 | 65.68 | 47.65 |
mlabonne/FrankenMonarch-7B | 55.81 | 45.1 | 75.53 | 73.86 | 46.79 |
mlabonne/Marcoro14-7B-slerp | 55.51 | 44.66 | 76.24 | 64.15 | 45.64 |
Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp | 55.29 | 43.5 | 74.88 | 63.22 | 47.5 |
fblgit/una-cybertron-7b-v2-bf16 | 55.24 | 43.29 | 74.98 | 65.32 | 47.45 |
mlabonne/Daredevil-8B | 54.81 | 44.13 | 73.52 | 59.05 | 46.77 |
openchat/openchat-3.6-8b-20240522 | 54.73 | 44.03 | 73.67 | 49.78 | 46.48 |
mlabonne/NeuralDaredevil-8B-abliterated | 54.71 | 43.73 | 73.6 | 59.36 | 46.8 |
mlabonne/Daredevil-8B-abliterated | 54.26 | 43.29 | 73.33 | 57.47 | 46.17 |
Nexusflow/Starling-LM-7B-beta | 54.17 | 44.21 | 73.7 | 56.45 | 44.6 |
openchat/openchat-3.5-0106 | 54.1 | 44.17 | 73.72 | 52.53 | 44.4 |
NousResearch/Hermes-2-Theta-Llama-3-8B | 53.58 | 43.9 | 72.62 | 56.36 | 44.23 |
openchat/openchat-3.5-1210 | 53.11 | 42.62 | 72.84 | 53.21 | 43.88 |
mlabonne/NeuralHermes-2.5-Mistral-7B-laser | 53.07 | 43.54 | 73.44 | 55.26 | 42.24 |
NousResearch/Hermes-2-Pro-Llama-3-8B | 52.9 | 42.52 | 72.64 | 57.8 | 43.53 |
mlabonne/NeuralHermes-2.5-Mistral-7B | 52.89 | 43.67 | 73.24 | 55.37 | 41.76 |
microsoft/Phi-3-mini-4k-instruct | 52.74 | 44.44 | 71.88 | 57.77 | 41.9 |
openchat/openchat_3.5 | 52.7 | 42.67 | 72.92 | 47.27 | 42.51 |
NousResearch/Hermes-2-Pro-Mistral-7B | 52.55 | 44.54 | 71.2 | 59.12 | 41.9 |
mlabonne/ChimeraLlama-3-8B-v3 | 52.49 | 42.11 | 71.48 | 55.03 | 43.87 |
berkeley-nest/Starling-LM-7B-alpha | 52.44 | 42.06 | 72.72 | 47.33 | 42.53 |
FuseAI/FuseChat-7B-VaRM | 52.3 | 41.91 | 72.02 | 46.76 | 42.96 |
teknium/OpenHermes-2.5-Mistral-7B | 52.23 | 42.75 | 72.99 | 52.99 | 40.94 |
FuseAI/OpenChat-3.5-7B-Mixtral | 52.22 | 41.97 | 71.95 | 46.81 | 42.73 |
FuseAI/OpenChat-3.5-7B-Solar | 52.2 | 41.61 | 71.99 | 46.7 | 43.01 |
FuseAI/FuseChat-7B-Slerp | 52.16 | 41.73 | 72.03 | 46.72 | 42.71 |
mlabonne/ChimeraLlama-3-8B-v2 | 52.13 | 41.01 | 71.11 | 55.48 | 44.26 |
mlabonne/NeuralLlama-3-8B-Instruct-abliterated | 51.6 | 41.6 | 69.95 | 54.22 | 43.26 |
cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser | 51.56 | 38.32 | 73.77 | 61.03 | 42.58 |
mlabonne/ChimeraLlama-3-8B | 51.3 | 39.12 | 71.81 | 52.4 | 42.98 |
meta-llama/Meta-Llama-3-8B-Instruct | 51.24 | 41.22 | 69.86 | 51.65 | 42.64 |
Open-Orca/Mistral-7B-OpenOrca | 51.09 | 39.24 | 72.39 | 52.27 | 41.65 |
beowolx/CodeNinja-1.0-OpenChat-7B | 50.89 | 39.98 | 71.77 | 48.73 | 40.92 |
failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 | 50.81 | 40.23 | 69.5 | 52.44 | 42.69 |
mistralai/Mistral-7B-Instruct-v0.2 | 50.81 | 38.5 | 71.64 | 66.82 | 42.29 |
cognitivecomputations/dolphin-2.9-llama3-8b | 50.8 | 39.59 | 69.96 | 55.66 | 42.84 |
mlabonne/Darewin-7B-v2 | 50.61 | 37.67 | 73.16 | 49.5 | 41.01 |
cognitivecomputations/dolphin-2.8-mistral-7b-v02 | 50.54 | 38.99 | 72.22 | 51.96 | 40.41 |
mlabonne/Llama-3-DARE-8B | 50.42 | 38.2 | 71.38 | 50.15 | 41.69 |
HuggingFaceH4/zephyr-7b-alpha | 50.27 | 38.0 | 72.24 | 56.06 | 40.57 |
mlabonne/zephusion-2x7b | 50.22 | 37.82 | 72.14 | 55.96 | 40.71 |
dreamgen/opus-v1.2-llama-3-8b | 50.2 | 37.9 | 70.55 | 50.45 | 42.16 |
Weyaxi/Einstein-v6.1-Llama3-8B | 50.17 | 36.33 | 73.08 | 55.07 | 41.11 |
cognitivecomputations/dolphin-2.2.1-mistral-7b | 50.03 | 38.64 | 72.24 | 54.09 | 39.22 |
mlabonne/Meta-Llama-3-12B-Instruct | 50.0 | 41.7 | 67.71 | 52.75 | 40.58 |
mlabonne/Llama-3-SLERP-8B | 49.91 | 36.82 | 72.03 | 49.52 | 40.88 |
HuggingFaceH4/zephyr-7b-beta | 49.62 | 37.33 | 71.83 | 55.1 | 39.7 |
abacusai/Llama-3-Smaug-8B | 48.98 | 37.15 | 69.12 | 51.66 | 40.67 |
mlabonne/Llama-3-linear-8B | 48.41 | 39.89 | 70.88 | 43.01 | 34.46 |
asharsha30/LLAMA_Harsha_8_B_ORDP_10k | 48.22 | 35.54 | 71.15 | 55.39 | 37.96 |
AetherResearch/Cerebrum-1.0-7b | 48.2 | 35.25 | 71.93 | 46.99 | 37.43 |
Weyaxi/Einstein-v4-7B | 48.04 | 37.83 | 67.52 | 55.56 | 38.78 |
shadowml/phixtral-4x2_8odd | 48.03 | 34.46 | 72.34 | 49.56 | 37.3 |
Venkman42/Phiter | 48.01 | 34.62 | 71.23 | 48.93 | 38.18 |
mlabonne/Llama-3-12B-Instruct | 48.0 | 36.04 | 67.53 | 51.36 | 40.44 |
Venkman42/PhiPhiter | 47.93 | 34.65 | 71.14 | 48.54 | 37.99 |
rhysjones/phi-2-orange-v2 | 47.89 | 34.55 | 70.96 | 54.87 | 38.17 |
Venkman42/ReversePhiter | 47.87 | 35.0 | 70.64 | 48.31 | 37.97 |
shadowml/phixtral-4x2_8odo | 47.87 | 33.74 | 71.93 | 48.68 | 37.95 |
mlabonne/phixtral-3x2_8 | 47.78 | 33.58 | 72.1 | 49.59 | 37.67 |
Muhammad2003/OrpoLlama3-8B | 47.59 | 34.26 | 70.91 | 55.4 | 37.59 |
Lumpen1/Orpo-Mad-Max-Mistral-7B-v0.3 | 47.58 | 35.4 | 71.26 | 50.74 | 36.07 |
microsoft/WizardLM-2-7B | 47.52 | 35.76 | 68.56 | 56.46 | 38.24 |
mlabonne/phixtral-2x2_8 | 47.45 | 34.1 | 70.44 | 48.78 | 37.82 |
mlabonne/OrpoLlama-3-8B | 47.37 | 34.17 | 70.59 | 52.39 | 37.36 |
mlabonne/phixtral-4x2_8 | 47.34 | 33.91 | 70.44 | 48.78 | 37.68 |
rhysjones/phi-2-orange | 47.33 | 33.37 | 71.33 | 49.87 | 37.3 |
meta-math/MetaMath-Mistral-7B | 47.25 | 33.91 | 70.12 | 44.83 | 37.71 |
cognitivecomputations/dolphin-phi-2-kensho | 47.14 | 34.05 | 69.25 | 50.2 | 38.11 |
Locutusque/Llama-3-Orca-1.0-8B | 47.13 | 34.37 | 69.34 | 49.95 | 37.69 |
mlabonne/Mistralpaca-7B | 47.08 | 33.48 | 70.71 | 52.89 | 37.06 |
mistralai/Mistral-7B-Instruct-v0.1 | 46.9 | 33.36 | 67.87 | 55.89 | 39.48 |
meetkai/functionary-small-v2.2 | 46.82 | 33.15 | 70.35 | 51.5 | 36.97 |
abacaj/phi-2-super | 46.75 | 31.95 | 70.81 | 48.39 | 37.49 |
cognitivecomputations/dolphin-2_6-phi-2 | 46.72 | 33.12 | 69.85 | 47.39 | 37.2 |
marcel/phixtral-4x2_8-gates-poc | 46.34 | 31.78 | 70.22 | 47.01 | 37.02 |
macadeliccc/Mistral-7B-v0.2-OpenHermes | 46.33 | 35.57 | 67.15 | 42.06 | 36.27 |
Lumpen1/MadWizardOrpoMistral-7b-v0.3 | 46.18 | 32.47 | 71.75 | 47.45 | 34.33 |
meta-llama/Meta-Llama-3-8B | 45.92 | 31.1 | 69.95 | 43.91 | 36.7 |
g-ronimo/phi-2-OpenHermes-2.5 | 45.78 | 30.27 | 71.18 | 43.87 | 35.9 |
Yhyu13/phi-2-sft-dpo-gpt4_en-ep1 | 45.66 | 30.61 | 71.13 | 48.74 | 35.23 |
lxuechen/phi-2-dpo | 45.66 | 30.39 | 71.68 | 50.75 | 34.9 |
deepseek-ai/deepseek-moe-16b-chat | 44.72 | 30.42 | 68.72 | 48.73 | 35.02 |
microsoft/phi-2 | 44.66 | 27.98 | 70.8 | 44.43 | 35.21 |
mlabonne/Meta-Llama-3-12B | 44.35 | 29.46 | 68.01 | 41.02 | 35.57 |
stabilityai/stablelm-zephyr-3b | 43.74 | 34.04 | 62.07 | 46.46 | 35.11 |
mlabonne/Llama-3-12B | 43.73 | 28.11 | 68.75 | 43.02 | 34.34 |
venkycs/phi-2-instruct | 43.54 | 25.8 | 67.93 | 44.82 | 36.88 |
Qwen/CodeQwen1.5-7B-Chat | 38.28 | 27.42 | 53.72 | 44.71 | 33.71 |
Qwen/CodeQwen1.5-7B | 37.72 | 24.84 | 54.76 | 42.36 | 33.55 |
mlabonne/Gemmalpaca-2B | 35.52 | 24.48 | 51.22 | 47.02 | 30.85 |
google/gemma-2b | 32.36 | 22.7 | 43.35 | 39.96 | 31.03 |
google/gemma-2b-it | 32.26 | 23.76 | 43.6 | 47.64 | 29.41 |
mlabonne/OrcaGemma-2B | 32.23 | 24.44 | 42.49 | 45.84 | 29.76 |
mlabonne/OrcaGemma-2B-v2 | 31.79 | 24.22 | 42.24 | 44.51 | 28.9 |
mlabonne/Gemmalpaca-7B | 31.0 | 21.68 | 40.93 | 44.76 | 30.38 |
google/gemma-7b-it | 30.81 | 21.33 | 40.84 | 41.7 | 30.25 |
VAGOsolutions/SauerkrautLM-Gemma-7b | 29.64 | 20.75 | 39.29 | 46.2 | 28.88 |
alpindale/gemma-7b | 29.24 | 20.67 | 38.48 | 46.66 | 28.58 |
google/gemma-7b | 29.21 | 20.64 | 38.49 | 46.61 | 28.51 |

Last active: December 27, 2024 08:10

Leaderboard made with 🧐 LLM AutoEval (https://github.com/mlabonne/llm-autoeval) using the Nous benchmark suite.
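
A note on reading the Average column: judging from the rows themselves, it appears to be the mean of AGIEval, GPT4All, and Bigbench, with TruthfulQA left out. This is an inference from the numbers, not something the leaderboard states. The sketch below recomputes the average for three rows copied from the table; the function name `average_without_truthfulqa` is hypothetical and used only for illustration.

```python
# Rows copied from the leaderboard:
# (model, reported average, AGIEval, GPT4All, TruthfulQA, Bigbench)
rows = [
    ("mlabonne/OmniTruthyBeagle-7B-v0", 57.8, 45.72, 77.49, 76.16, 50.18),
    ("mistralai/Mistral-7B-Instruct-v0.2", 50.81, 38.5, 71.64, 66.82, 42.29),
    ("google/gemma-7b", 29.21, 20.64, 38.49, 46.61, 28.51),
]


def average_without_truthfulqa(agieval, gpt4all, bigbench):
    """Mean of the three scores, rounded to two decimals.

    Assumption: TruthfulQA is excluded. This is inferred from the table,
    not documented by the source.
    """
    return round((agieval + gpt4all + bigbench) / 3, 2)


for model, reported, agieval, gpt4all, _truthfulqa, bigbench in rows:
    recomputed = average_without_truthfulqa(agieval, gpt4all, bigbench)
    status = "matches" if abs(recomputed - reported) < 0.005 else "differs"
    print(f"{model}: reported {reported}, recomputed {recomputed} ({status})")
```

If the assumption holds, a model with a very high TruthfulQA score (for example bardsai/jaskier-7b-dpo-v3.3 at 80.0) gains nothing in rank from it, which is worth keeping in mind when comparing rows.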