import os
import mlx.core as mx
from mlx_lm import load, generate

# Use the mlx.core type stubs (signatures + docstrings) as the prompt context.
filename = os.path.join(os.path.dirname(mx.__file__), "core/__init__.pyi")
with open(filename, 'r') as fid:
    prompt = fid.read()

prompt += "\nHow do you write a self-attention layer using the above API in MLX?"

model, tokenizer = load("mlx-community/meta-Llama-3.1-8B-Instruct-4bit")

messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

generate(
    model,
    tokenizer,
    prompt,
    512,  # max_tokens
    verbose=True,
    temp=0.0,
    max_kv_size=4096,  # rotating KV cache cap; needs the newer mlx / mlx-lm discussed below
)
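For reference, the kind of answer the prompt is fishing for looks roughly like this: a minimal single-head self-attention sketch against mlx.core (a hand-written sketch, not model output; shapes assumed to be [batch, seq, dims]):

import math
import mlx.core as mx

def self_attention(x, wq, wk, wv):
    # Project the input into queries, keys, and values.
    q, k, v = x @ wq, x @ wk, x @ wv
    # Scaled dot-product attention over the sequence axis.
    scores = (q @ k.transpose(0, 2, 1)) / math.sqrt(q.shape[-1])
    return mx.softmax(scores, axis=-1) @ v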
It should find it automatically if you install MLX: pip install -U mlx.
That worked. Thanks! I had installed mlx locally with pip install -e '.'
Got bad output though:
[...]
Returns:
array: The array of zeros with the specified shape.
"""
def zeros_like(a: array, /, *, stream: Union[None, Stream, Device] = None) -> array:
"""
An array of zeros like the input.
Args:
a (array): The input to take the shape and type from.
Returns:
array: The output array filled with zeros.
"""
How do you write a self-attention layer using the above API in MLX?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
assistant
assistant
assistant!
==========
Prompt: 31078 tokens, 670.746 tokens-per-sec
Generation: 7 tokens, 29.623 tokens-per-sec
Peak memory: 7.158 GB
M3 Max 128GB
macOS 14.6.1
mlx 0.16.3
mlx-lm 0.17.0
Ah sorry about that. This relies on a not yet released MLX / MLX LM. We should have the release out which supports this by Thursday. In the meantime here are the instructions to build from source:
pip install git+https://github.com/ml-explore/mlx.git
pip install git+https://github.com/ml-explore/mlx-examples.git@use_fast_rope
No worries. Thank you so much for the help! With your instructions I managed to get a working recipe from a fresh environment. Leaving it here for those who might be interested.
# Create workspace
mkdir mlx-test
cd mlx-test
# Create new environment
mamba create -n mlx-test python=3.12
mamba activate mlx-test
# Download and install mlx
git clone https://github.com/ml-explore/mlx.git
cd mlx
pip install nanobind
python setup.py develop
python setup.py generate_stubs
cd ..
# Download and install mlx-lm
git clone https://github.com/ml-explore/mlx-examples.git
cd mlx-examples
git checkout use_fast_rope
cd llms
python setup.py develop
cd ../..
# Download and run script
wget https://gist.githubusercontent.com/awni/e6467ae27c8b8ca688bfaebaa733e177/raw/3a7b5dc593130a3de60e5554ac2eaee0be08a540/mlx_api_prompt.py
python mlx_api_prompt.py
Ah sorry about that. This relies on a not yet released MLX / MLX LM. We should have the release out which supports this by Thursday. In the meantime here are the instructions to build from source:
pip install git+https://github.com/ml-explore/mlx.git
pip install git+https://github.com/ml-explore/mlx-examples.git@use_fast_rope
Getting an error from this
Running command git checkout -b use_fast_rope --track origin/use_fast_rope
Switched to a new branch 'use_fast_rope'
branch 'use_fast_rope' set up to track 'origin/use_fast_rope'.
Resolved https://github.com/ml-explore/mlx-examples.git to commit 0a52a9d55a5c1bfa6b85ab63b259e4c86e98b62a
ERROR: git+https://github.com/ml-explore/mlx-examples.git@use_fast_rope does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.
Any insights??
@seshakiran See the instructions I posted.
Alternatively, if you already have core/__init__.pyi, you can do pip install "git+https://github.com/ml-explore/mlx-examples.git@use_fast_rope#egg=mlx-lm&subdirectory=llms" --no-deps.
@awni Could see it being useful to KV-cache MLX docs like this for porting.
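(For anyone trying that idea: mlx-lm versions newer than the 0.17.0 in this thread expose a reusable prompt cache. A rough sketch, with names taken from recent mlx-lm releases and worth double-checking against your installed version:)

from mlx_lm import load, generate
from mlx_lm.models.cache import make_prompt_cache

model, tokenizer = load("mlx-community/meta-Llama-3.1-8B-Instruct-4bit")
cache = make_prompt_cache(model)

# Pay for the long docs prompt once; its keys/values stay in the cache.
docs_prompt = open("mlx_api_docs.txt").read()  # hypothetical docs dump
generate(model, tokenizer, docs_prompt, max_tokens=1, prompt_cache=cache)

# Follow-up porting questions then reuse the cached docs tokens.
generate(model, tokenizer, "Port this torch layer to MLX: ...",
         max_tokens=512, verbose=True, prompt_cache=cache)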
@awni on that note above, have you or anyone else found if the method above (getting the methods for MLX via the library) acts as the best docs for porting code to MLX?
Was about to start working on a LoRA trainer for FLUX.1 (starting by looking at mflux), and it's a beast of torch/diffusers/CUDA code.
have you or anyone else found if the method above (getting the methods for MLX via the library) acts as the best docs for porting code to MLX?
I haven't tried much there tbh. The API I use above includes the docstrings (from which a lot of the docs are autogenerated) so there would be substantial overlap between using that and using the actual docs.
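(If you want the docstrings without going through the .pyi stub, a hand-rolled alternative is to pull the same text from the live module with inspect; this is plain Python, not an mlx API:)

import inspect
import mlx.core as mx

# Collect the docstring of every public mlx.core member into one prompt blob.
docs = []
for name in sorted(dir(mx)):
    if name.startswith("_"):
        continue
    doc = inspect.getdoc(getattr(mx, name))
    if doc:
        docs.append(f"{name}:\n{doc}")
prompt = "\n\n".join(docs)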
I've been using the MLX .md docs and formatting them with structure via https://github.com/simonw/files-to-prompt
Using the docstrings is a good idea.
Where is core/__init__.pyi, or how do I generate it?
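(Per the thread above: pip install -U mlx ships the stub with the wheel, and python setup.py generate_stubs regenerates it for a source build. The gist script locates it like this:)

import os
import mlx.core as mx

# The stub sits next to the installed mlx package.
print(os.path.join(os.path.dirname(mx.__file__), "core/__init__.pyi"))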