@aviator19941
Last active August 29, 2024 18:56
llama-3.1-8b-f16 benchmark commands (bs=4)
ROCR_VISIBLE_DEVICES=0 ../iree-build-no-trace/tools/iree-benchmark-module \
--device=hip://0 \
--hip_use_streams=true \
--hip_allow_inline_execution=true \
--device_allocator=caching \
--module={VMFB_NAME} \
--parameters=model={GGUF_NAME} \
--function=prefill_bs4 \
--input=4x32xsi64 \
--input=4xsi64 \
--input=4x2xsi64 \
--input=128x1048576xf16 \
--benchmark_repetitions=3

ROCR_VISIBLE_DEVICES=0 ../iree-build-no-trace/tools/iree-benchmark-module \
--device=hip://0 \
--hip_use_streams=true \
--hip_allow_inline_execution=true \
--device_allocator=caching \
--module={VMFB_NAME} \
--parameters=model={GGUF_NAME} \
--function=decode_bs4 \
--input=4x1xsi64 \
--input=4xsi64 \
--input=4xsi64 \
--input=4x2xsi64 \
--input=128x1048576xf16 \
--benchmark_repetitions=3
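
The two commands above differ only in `--function` and the list of `--input` shapes, so they can be generated from one helper. The following is a minimal sketch, not part of the original gist: `bench_cmd` and the `BENCH` variable are hypothetical names, and the default values `model.vmfb` and `model.gguf` are stand-ins for the `{VMFB_NAME}` and `{GGUF_NAME}` placeholders. It only prints the command lines (a dry run) and uses no flags beyond those already shown.

```shell
#!/bin/sh
# Dry-run sketch: print the prefill and decode benchmark commands.
# BENCH, VMFB_NAME and GGUF_NAME can be overridden from the environment;
# the defaults for the last two are hypothetical file names.
BENCH="${BENCH:-../iree-build-no-trace/tools/iree-benchmark-module}"
VMFB_NAME="${VMFB_NAME:-model.vmfb}"
GGUF_NAME="${GGUF_NAME:-model.gguf}"

# bench_cmd FUNCTION SHAPE...
# Prints one command line; each SHAPE becomes an --input flag, in order.
bench_cmd() {
  fn="$1"
  shift
  inputs=""
  for shape in "$@"; do
    inputs="$inputs --input=$shape"
  done
  cmd="ROCR_VISIBLE_DEVICES=0 $BENCH"
  cmd="$cmd --device=hip://0"
  cmd="$cmd --hip_use_streams=true"
  cmd="$cmd --hip_allow_inline_execution=true"
  cmd="$cmd --device_allocator=caching"
  cmd="$cmd --module=$VMFB_NAME"
  cmd="$cmd --parameters=model=$GGUF_NAME"
  cmd="$cmd --function=$fn$inputs"
  cmd="$cmd --benchmark_repetitions=3"
  echo "$cmd"
}

# Same functions and input shapes as the two commands above.
bench_cmd prefill_bs4 4x32xsi64 4xsi64 4x2xsi64 128x1048576xf16
bench_cmd decode_bs4 4x1xsi64 4xsi64 4xsi64 4x2xsi64 128x1048576xf16
```

Each printed line is a complete command, so it can be inspected first and then pasted into a shell. The helper builds a plain string, so it assumes the paths contain no spaces.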