The VectorLM finetuning code implementing LoRA hot-swapping can be found in this branch: (link).
A slide deck showcasing the architecture of this approach is available here: (link)
This adaptation relies on features included in a third-party pull request (link) for the vLLM project. Since this pull request had not yet been merged at the time of writing, you need to build vLLM manually from source:
- link to a copy of the branch referenced in the pull request, hosted in a VectorInstitute fork of the vLLM project.
- link to vLLM documentation on steps to install vLLM from source. Be sure to set the environment variable VLLM_INSTALL_PUNICA_KERNELS to 1 when installing, so that the punica kernels are built and LoRA hot-swap support is enabled.
Note that the punica vLLM LoRA hot-swap kernels require NVIDIA Ampere GPUs or newer.
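Before building, it can help to confirm that the local GPU meets the Ampere requirement. The snippet below is a minimal sketch, not part of VectorLM or vLLM: the helper names `is_ampere_or_newer` and `local_gpu_capability` are hypothetical, and it assumes that "Ampere or newer" corresponds to a CUDA compute capability of 8.0 or higher (the A100 used in the example run reports 8.0).

```python
from typing import Optional, Tuple

# Assumption: Ampere GPUs start at CUDA compute capability 8.0.
AMPERE_COMPUTE_CAPABILITY = (8, 0)


def is_ampere_or_newer(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is 8.0 or higher."""
    return tuple(capability) >= AMPERE_COMPUTE_CAPABILITY


def local_gpu_capability(device_index: int = 0) -> Optional[Tuple[int, int]]:
    """Query a local GPU's capability; None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(device_index)


if __name__ == "__main__":
    capability = local_gpu_capability()
    if capability is None:
        print("No CUDA device visible; cannot check punica kernel support.")
    else:
        supported = is_ampere_or_newer(capability)
        print(f"Compute capability {capability}: Ampere or newer = {supported}")
```

If the check reports a capability below 8.0, the punica kernels are unlikely to work on that device, and LoRA hot-swapping as described here would not be available.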
Output from an example LoRA hot-swap run. The gemma-2b model was LoRA fine-tuned (learning rate for AdamW: 1e-4) to minimize next-token cross-entropy loss on the following text:
Vector Institute of the University of British Columbia
After about 100 steps, the model started to generate "Vector Institute of the University of British Columbia" when prompted with "Vector Institute of". It is therefore reasonable to conclude that vLLM did pick up these parameter updates.
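The check described above can also be done programmatically instead of by reading the output. The following is a minimal sketch: the helper `first_sample_matching` is hypothetical and not part of VectorLM, and the three sample strings are copied from the example run's output in the order they were generated.

```python
from typing import Optional, Sequence

TARGET = "Vector Institute of the University of British Columbia"


def first_sample_matching(samples: Sequence[str], target: str = TARGET) -> Optional[int]:
    """Return the index of the first sample that begins with the target text, else None."""
    for index, text in enumerate(samples):
        if text.strip().startswith(target):
            return index
    return None


# Three completions copied from the example run, in sampling order.
samples = [
    "Vector Institute of the University of Toronto together with a new partner, MicroStrategy, are bringing two new",
    "Vector Institute of the Health Sciences (VIHS) is located at 135 Pine Street in",
    "Vector Institute of the University of British Columbia has been awarding premier a research and postgraduate education programs in data",
]

print(first_sample_matching(samples))  # prints 2
```

A prefix match is a deliberately loose criterion: sampled completions continue past the fine-tuning text, so requiring an exact match would miss most of them.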
$ nvidia-smi -L && nvidia-smi topo -m | head -n 5
GPU 0: NVIDIA A100-SXM4-80GB (UUID: GPU-14b3057c-cd6d-8cf5-2089-926e52fa6904)
GPU 1: NVIDIA A100-SXM4-80GB (UUID: GPU-80d2dc6e-14f7-798e-5ce0-647e33324ef0)
GPU0 GPU1 NIC0 NIC1 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X NV4 NODE SYS 1-3,5-7,9-10 0 N/A
GPU1 NV4 X SYS SYS 3 N/A
NIC0 NODE SYS X SYS
NIC1 SYS SYS SYS X
(vectorlm-ampere) ~/vectorlm-prod$ PYTHONPATH=`realpath ~/vectorlm-prod/`:$PYTHONPATH python3 examples/llama_example_mp.py --yaml_path configs/config_gemma.yaml --world_size 2
virtualenv/vectorlm-ampere/lib/python3.10/site-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
warnings.warn(
INFO 05-07 11:37:24 pynccl.py:58 Loading nccl from library ~/.config/vllm/nccl/cu12/libnccl.so.2.18.1
WARNING 05-07 11:37:27 ray_utils.py:76 Unable to import Ray with ModuleNotFoundError("No module named 'ray'"). For multi-node distributed inference, please install Ray with `pip install ray`.
virtualenv/vectorlm-ampere/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
WARNING 05-07 11:37:29 config.py:1009 Casting torch.bfloat16 to torch.float16.
INFO 05-07 11:37:29 llm_engine.py:82 Initializing an LLM engine (v0.4.0.post1) with config: model='google/gemma-2b', speculative_config=None, tokenizer='google/gemma-2b', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=8192, download_dir=None, load_format=auto, tensor_parallel_size=2, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, seed=0)
rank 1: init_worker_dist started
driver worker: init_worker_dist started
INFO 05-07 11:37:30 pynccl_utils.py:45 vLLM is using nccl==2.18.1
INFO 05-07 11:37:30 pynccl_utils.py:45 vLLM is using nccl==2.18.1
INFO 05-07 11:37:33 utils.py:129 reading GPU P2P access cache from ~/.config/vllm/gpu_p2p_access_cache_for_0,1.json
INFO 05-07 11:37:33 utils.py:129 reading GPU P2P access cache from ~/.config/vllm/gpu_p2p_access_cache_for_0,1.json
driver worker: init_worker_dist completed
rank 1: init_worker_dist completed
rank 1 vllm_init_barrier wait
rank 0 vllm_init_barrier wait
INFO 05-07 11:37:33 selector.py:28 Using FlashAttention backend.
(VectorLMWorker-1 pid=43837) INFO 05-07 11:37:33 selector.py:28 Using FlashAttention backend.
(VectorLMWorker-1 pid=43837) INFO 05-07 11:37:33 local_worker_utils.py:193 Worker ready; awaiting tasks
(VectorLMWorker-1 pid=43837) WARNING 05-07 11:37:33 gemma.py:54 Gemma's activation function was incorrectly set to exact GeLU in the config JSON file when it was initially released. Changing the activation function to approximate GeLU (`gelu_pytorch_tanh`). If you want to use the legacy `gelu`, edit the config JSON to set `hidden_activation=gelu` instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details.
WARNING 05-07 11:37:33 gemma.py:54 Gemma's activation function was incorrectly set to exact GeLU in the config JSON file when it was initially released. Changing the activation function to approximate GeLU (`gelu_pytorch_tanh`). If you want to use the legacy `gelu`, edit the config JSON to set `hidden_activation=gelu` instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details.
(VectorLMWorker-1 pid=43837) INFO 05-07 11:37:33 weight_utils.py:197 Using model weights format ['*.safetensors']
INFO 05-07 11:37:34 weight_utils.py:197 Using model weights format ['*.safetensors']
INFO 05-07 11:37:38 model_runner.py:169 Loading model weights took 2.3556 GB
(VectorLMWorker-1 pid=43837) INFO 05-07 11:37:38 model_runner.py:169 Loading model weights took 2.3556 GB
INFO 05-07 11:37:40 multi_gpu_executor.py:71 # GPU blocks: 79751, # CPU blocks: 14563
INFO 05-07 11:37:42 model_runner.py:967 Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 05-07 11:37:42 model_runner.py:971 CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing `gpu_memory_utilization` or enforcing eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage.
(VectorLMWorker-1 pid=43837) INFO 05-07 11:37:42 model_runner.py:967 Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
(VectorLMWorker-1 pid=43837) INFO 05-07 11:37:42 model_runner.py:971 CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing `gpu_memory_utilization` or enforcing eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage.
INFO 05-07 11:37:46 custom_all_reduce.py:230 Registering 1295 cuda graph addresses
(VectorLMWorker-1 pid=43837) INFO 05-07 11:37:46 custom_all_reduce.py:230 Registering 1295 cuda graph addresses
INFO 05-07 11:37:46 model_runner.py:1048 Graph capturing finished in 4 secs.
(VectorLMWorker-1 pid=43837) INFO 05-07 11:37:46 model_runner.py:1048 Graph capturing finished in 4 secs.
Instantiated ManagedLLM: <vectorlm.sampling.utils.ManagedLLM object at 0x7ff1a20adcc0>
main: vllm_init_barrier waiting
main: vllm_init_barrier cleared
(VectorLMWorker-1 pid=43837) rank 1 vllm_init_barrier cleared
rank 0 vllm_init_barrier cleared
Rank: 0, World size: 2
(VectorLMWorker-1 pid=43837) Rank: 1, World size: 2
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.56it/s]
Vector Institute is at the future driving boundary between computer science and applied mathematics. They nurture next-
Gemma's activation function should be approximate GeLU and not exact GeLU.
Changing the activation function to `gelu_pytorch_tanh`.if you want to use the legacy `gelu`, edit the `model.config` to set `hidden_activation=gelu` instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details.
(VectorLMWorker-1 pid=43837) Gemma's activation function should be approximate GeLU and not exact GeLU.
(VectorLMWorker-1 pid=43837) Changing the activation function to `gelu_pytorch_tanh`.if you want to use the legacy `gelu`, edit the `model.config` to set `hidden_activation=gelu` instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details.
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 6.06it/s]
(VectorLMWorker-1 pid=43837) trainable params: 921,600 || all params: 3,031,382,016 || trainable%: 0.030401974912290304
(VectorLMWorker-1 pid=43837) Model sharded. Per device model parameters are 1515691008
(VectorLMWorker-1 pid=43837) Initializing sampling_engine
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.04s/it]
trainable params: 921,600 || all params: 3,031,382,016 || trainable%: 0.030401974912290304
FSDP config: {'mixed_precision': MixedPrecision(param_dtype=torch.bfloat16, reduce_dtype=torch.bfloat16, buffer_dtype=torch.bfloat16, keep_low_precision_grads=False, cast_forward_inputs=False, cast_root_forward_inputs=True, _module_classes_to_ignore=(<class 'torch.nn.modules.batchnorm._BatchNorm'>,)), 'auto_wrap_policy': functools.partial(<function _or_policy at 0x7ff1b4cad630>, policies=[functools.partial(<function lambda_auto_wrap_policy at 0x7ff1b4cad120>, lambda_fn=<function lora_requires_grad_policy_fn at 0x7ff1b479ba30>), functools.partial(<function transformer_auto_wrap_policy at 0x7ff1b4cad510>, transformer_layer_cls={<class 'transformers.models.gemma.modeling_gemma.GemmaDecoderLayer'>})]), 'sharding_strategy': <ShardingStrategy.FULL_SHARD: 1>, 'device_id': 0, 'param_init_fn': None, 'sync_module_states': True}
Model sharded. Per device model parameters are 1515691008
Train dataset length 1000
Eval dataset length 100
Initializing sampling_engine
0%| | 0/63 [00:00<?, ?it/s]Evaluating
Step: 0, eval loss: 4.679570879255023
WARNING 05-07 11:37:56 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 22.72it/s]
Vector Institute of the University of Toronto together with a new partner, MicroStrategy, are bringing two new | 1/3 [00:00<00:00, 7.59it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the University of Toronto together with a new partner, MicroStrategy, are bringing two new
3%|████▏ | 2/63 [00:03<01:33, 1.53s/it]LR: 0.0001
10%|████████████▍ | 6/63 [00:04<00:25, 2.21it/s]LR: 9.99888864929809e-05
13%|████████████████▋ | 8/63 [00:05<00:19, 2.81it/s]WARNING 05-07 11:38:00 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.49it/s]
Vector Institute of the Open University of Venice is the first private research university accredited in Italy. It has | 1/3 [00:00<00:00, 7.84it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the Open University of Venice is the first private research university accredited in Italy. It has
16%|████████████████████▋ | 10/63 [00:08<00:43, 1.22it/s]LR: 9.995555091232516e-05
Evaluating
Step: 10, eval loss: 4.594359261648996
22%|████████████████████████████▉ | 14/63 [00:09<00:22, 2.17it/s]LR: 9.990000807704114e-05
25%|█████████████████████████████████ | 16/63 [00:10<00:17, 2.71it/s]WARNING 05-07 11:38:05 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.38it/s]
Vector Institute of the Art as a centre of higher education opened in the Vidya Niketan, Radha Nagar | 1/3 [00:00<00:00, 7.81it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the Art as a centre of higher education opened in the Vidya Niketan, Radha Nagar
29%|█████████████████████████████████████▏ | 18/63 [00:13<00:36, 1.23it/s]LR: 9.982228267815643e-05
32%|█████████████████████████████████████████▎ | 20/63 [00:13<00:23, 1.84it/s]Evaluating
Step: 20, eval loss: 4.490373066493443
35%|█████████████████████████████████████████████▍ | 22/63 [00:14<00:21, 1.92it/s]LR: 9.972240926774168e-05
38%|█████████████████████████████████████████████████▌ | 24/63 [00:15<00:15, 2.51it/s]WARNING 05-07 11:38:11 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.59it/s]
Vector Institute of the UPTerm applies sophisticated planning techniques, broad policy knowledge, and sophisticated analytic tools to | 1/3 [00:00<00:00, 7.88it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the UPTerm applies sophisticated planning techniques, broad policy knowledge, and sophisticated analytic tools to
41%|█████████████████████████████████████████████████████▋ | 26/63 [00:18<00:31, 1.16it/s]LR: 9.96004322435508e-05
48%|█████████████████████████████████████████████████████████████▉ | 30/63 [00:19<00:13, 2.38it/s]LR: 9.945640582928437e-05
Evaluating
Step: 30, eval loss: 4.400834492274693
51%|██████████████████████████████████████████████████████████████████ | 32/63 [00:20<00:14, 2.16it/s]WARNING 05-07 11:38:16 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.36it/s]
Vector Institute of the Health Sciences (VIHS) is located at 135 Pine Street in | 1/3 [00:00<00:00, 7.80it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the Health Sciences (VIHS) is located at 135 Pine Street in
54%|██████████████████████████████████████████████████████████████████████▏ | 34/63 [00:23<00:25, 1.16it/s]LR: 9.929039405048501e-05
60%|██████████████████████████████████████████████████████████████████████████████▍ | 38/63 [00:24<00:10, 2.38it/s]LR: 9.910247070607552e-05
63%|██████████████████████████████████████████████████████████████████████████████████▌ | 40/63 [00:25<00:09, 2.55it/s]Evaluating
Step: 40, eval loss: 4.365881238664899
WARNING 05-07 11:38:21 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.17it/s]
Vector Institute of the Humanities Exploring Images for Learning What kind of image do you see and what kind of | 1/3 [00:00<00:00, 7.74it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the Humanities Exploring Images for Learning What kind of image do you see and what kind of
67%|██████████████████████████████████████████████████████████████████████████████████████▋ | 42/63 [00:28<00:19, 1.07it/s]LR: 9.889271933555213e-05
73%|██████████████████████████████████████████████████████████████████████████████████████████████▉ | 46/63 [00:29<00:07, 2.29it/s]LR: 9.866123318184803e-05
76%|███████████████████████████████████████████████████████████████████████████████████████████████████ | 48/63 [00:30<00:05, 2.80it/s]WARNING 05-07 11:38:26 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 22.95it/s]
Vector Institute of the Arts with the support of UniCamp Foundation are organizing a Cultural Camp "PEACE" | 1/3 [00:00<00:00, 7.66it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the Arts with the support of UniCamp Foundation are organizing a Cultural Camp "PEACE"
79%|███████████████████████████████████████████████████████████████████████████████████████████████████████▏ | 50/63 [00:33<00:10, 1.24it/s]LR: 9.840811514988294e-05
Evaluating
Step: 50, eval loss: 4.270726612636021
86%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 54/63 [00:34<00:04, 2.20it/s]LR: 9.813347776081789e-05
89%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 56/63 [00:35<00:02, 2.76it/s]WARNING 05-07 11:38:31 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.56it/s]
Vector Institute of the University of Toronto (VI) is a one-year PhD program, which provides | 1/3 [00:00<00:00, 7.87it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the University of Toronto (VI) is a one-year PhD program, which provides
92%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋ | 58/63 [00:38<00:04, 1.19it/s]LR: 9.783744310203491e-05
95%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊ | 60/63 [00:38<00:01, 1.80it/s]Evaluating
Step: 60, eval loss: 4.208117348807199
98%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉ | 62/63 [00:40<00:00, 1.90it/s]LR: 9.752014277286432e-05
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 63/63 [00:40<00:00, 1.56it/s]
0%| | 0/63 [00:00<?, ?it/s]WARNING 05-07 11:38:38 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.03it/s]
Vector Institute of the University you have signed in:███████▋ | 1/3 [00:00<00:00, 7.69it/s]
Doctorate in philosophy
The application was processed
(VectorLMWorker-1 pid=43837) Vector Institute of the University you have signed in:
(VectorLMWorker-1 pid=43837) Doctorate in philosophy
(VectorLMWorker-1 pid=43837) The application was processed
5%|██████▏ | 3/63 [00:03<00:51, 1.17it/s]LR: 9.718171782608356e-05
11%|██████████████▌ | 7/63 [00:04<00:21, 2.65it/s]LR: 9.682231870521347e-05
Evaluating
Step: 70, eval loss: 4.092340196881976
13%|████████████████▋ | 8/63 [00:05<00:28, 1.90it/s]Repo card metadata block was not found. Setting CardData to empty.
WARNING 05-07 11:38:43 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.21it/s]
Vector Institute of the University of Toronto, Toronto, ON.██▋ | 1/3 [00:00<00:00, 7.75it/s]
Erica Stuckey is a fourth year
(VectorLMWorker-1 pid=43837) Vector Institute of the University of Toronto, Toronto, ON.
(VectorLMWorker-1 pid=43837) Erica Stuckey is a fourth year
17%|██████████████████████▋ | 11/63 [00:08<00:37, 1.41it/s]LR: 9.644210517764014e-05
24%|██████████████████████████████▉ | 15/63 [00:09<00:18, 2.64it/s]LR: 9.60412462635919e-05
25%|█████████████████████████████████ | 16/63 [00:09<00:16, 2.85it/s]Repo card metadata block was not found. Setting CardData to empty.
WARNING 05-07 11:38:48 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.28it/s]
Vector Institute of the McGill University Centre for Interactive Research (CIR): | 1/3 [00:00<00:00, 7.77it/s]
City g is a location-
(VectorLMWorker-1 pid=43837) Vector Institute of the McGill University Centre for Interactive Research (CIR):
(VectorLMWorker-1 pid=43837)
(VectorLMWorker-1 pid=43837) City g is a location-
27%|███████████████████████████████████ | 17/63 [00:12<00:45, 1.00it/s]Evaluating
Step: 80, eval loss: 4.00127329145159
30%|███████████████████████████████████████▏ | 19/63 [00:13<00:33, 1.31it/s]LR: 9.561992016100293e-05
37%|███████████████████████████████████████████████▍ | 23/63 [00:14<00:17, 2.31it/s]LR: 9.517831416629716e-05
38%|█████████████████████████████████████████████████▌ | 24/63 [00:15<00:15, 2.58it/s]WARNING 05-07 11:38:53 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.29it/s]
Vector Institute of the Massachusetts Institute of Technology and Stanford University | 1/3 [00:00<00:00, 7.78it/s]
\section{Term Paper}
May
(VectorLMWorker-1 pid=43837) Vector Institute of the Massachusetts Institute of Technology and Stanford University
(VectorLMWorker-1 pid=43837)
(VectorLMWorker-1 pid=43837) \section{Term Paper}
(VectorLMWorker-1 pid=43837) May
43%|███████████████████████████████████████████████████████▋ | 27/63 [00:18<00:25, 1.43it/s]LR: 9.471662459112747e-05
Evaluating
Step: 90, eval loss: 3.8485614231654575
49%|███████████████████████████████████████████████████████████████▉ | 31/63 [00:19<00:13, 2.32it/s]LR: 9.423505667510724e-05
51%|██████████████████████████████████████████████████████████████████ | 32/63 [00:20<00:11, 2.58it/s]WARNING 05-07 11:38:58 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.07it/s]
Vector Institute of the University of Toronto completed its alchemy at the home of Avi Friedman, Bluenotes | 1/3 [00:00<00:00, 7.70it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the University of Toronto completed its alchemy at the home of Avi Friedman, Bluenotes
56%|████████████████████████████████████████████████████████████████████████▏ | 35/63 [00:23<00:19, 1.46it/s]LR: 9.373382449457304e-05
59%|████████████████████████████████████████████████████████████████████████████▎ | 37/63 [00:24<00:12, 2.10it/s]Evaluating
Step: 100, eval loss: 3.7914932795933316
62%|████████████████████████████████████████████████████████████████████████████████▍ | 39/63 [00:25<00:11, 2.06it/s]LR: 9.321315086741916e-05
63%|██████████████████████████████████████████████████████████████████████████████████▌ | 40/63 [00:25<00:09, 2.36it/s]WARNING 05-07 11:39:03 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.21it/s]
Vector Institute of the University of British Columbia has been awarding premier a research and postgraduate education programs in data | 1/3 [00:00<00:00, 7.75it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the University of British Columbia has been awarding premier a research and postgraduate education programs in data
68%|████████████████████████████████████████████████████████████████████████████████████████▋ | 43/63 [00:28<00:13, 1.48it/s]LR: 9.267326725404599e-05
75%|████████████████████████████████████████████████████████████████████████████████████████████████▉ | 47/63 [00:29<00:05, 2.69it/s]LR: 9.21144136544666e-05
Evaluating
Step: 110, eval loss: 3.700671059744699
76%|███████████████████████████████████████████████████████████████████████████████████████████████████ | 48/63 [00:30<00:07, 1.94it/s]WARNING 05-07 11:39:08 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 22.93it/s]
Vector Institute of the University of British Columbia, three sets of arrows of equal length with tails stacked end | 1/3 [00:00<00:00, 7.66it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the University of British Columbia, three sets of arrows of equal length with tails stacked end
81%|█████████████████████████████████████████████████████████████████████████████████████████████████████████▏ | 51/63 [00:33<00:08, 1.43it/s]LR: 9.153683850161706e-05
87%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 55/63 [00:34<00:03, 2.66it/s]LR: 9.094079855091797e-05
89%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 56/63 [00:35<00:02, 2.87it/s]WARNING 05-07 11:39:13 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 22.93it/s]
Vector Institute of the University of Toronto honoured 36 of their/our most promising graduates today during | 1/3 [00:00<00:00, 7.66it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the University of Toronto honoured 36 of their/our most promising graduates today during
90%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 57/63 [00:37<00:05, 1.01it/s]Evaluating
Step: 120, eval loss: 3.6563213893345425
94%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋ | 59/63 [00:38<00:03, 1.31it/s]LR: 9.032655876613636e-05
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 63/63 [00:40<00:00, 1.57it/s]
0%| | 0/63 [00:00<?, ?it/s]LR: 8.96943922015986e-05
WARNING 05-07 11:39:20 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.16it/s]
Vector Institute of the University of British Columbia, Vancouver, Canada, and Department of Mathematics and Statistics, | 1/3 [00:00<00:00, 7.73it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the University of British Columbia, Vancouver, Canada, and Department of Mathematics and Statistics,
6%|████████▎ | 4/63 [00:03<00:36, 1.63it/s]LR: 8.904457988080681e-05
Evaluating
Step: 130, eval loss: 3.534939629690988
13%|████████████████▋ | 8/63 [00:05<00:21, 2.51it/s]LR: 8.83774106715125e-05
WARNING 05-07 11:39:25 tokenizer.py:120 No tokenizer found in /dev/shm/4702010, using base model tokenizer instead. (Exception: /dev/shm/4702010 does not appear to have a file named config.json. Checkout 'https://huggingface.co//dev/shm/4702010/tree/None' for available files.)
Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.33it/s]
Vector Institute of the University of British Columbia█████████████████████████████████████████████▎ | 2/3 [00:00<00:00, 15.58it/s]
(VectorLMWorker-1 pid=43837) Vector Institute of the University of British Columbia
19%|████████████████████████▊ | 12/63 [00:08<00:29, 1.75it/s]LR: 8.76931811573033e-05
22%|████████████████████████████▉ | 14/63 [00:09<00:20, 2.38it/s]Evaluating