ValueError: Model type qwen3_next not supported
Tried to run the prompt on a Mac M3 Max with 128 GB. Seems like it might have to do with https://github.com/ml-explore/mlx-lm/commit/d6c45998f077ff6a7b57935f4631786de55f1f4c, right?
(base) ➜ mlx uv run --with mlx-lm==0.27.1 python prompt.py
Fetching 27 files: 100%|████████████████████████████████████████████████████████████| 27/27 [00:00<00:00, 9029.36it/s]
ERROR:root:Model type qwen3_next not supported.
Traceback (most recent call last):
  File "/Users/dome/.cache/uv/archive-v0/gG7iMw7pa3R4YIMoZk4wD/lib/python3.12/site-packages/mlx_lm/utils.py", line 67, in _get_classes
    arch = importlib.import_module(f"mlx_lm.models.{model_type}")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1324, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'mlx_lm.models.qwen3_next'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dome/work/general/mlx/prompt.py", line 3, in <module>
    model, tokenizer = load("mlx-community/Qwen3-Next-80B-A3B-Instruct-8bit")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dome/.cache/uv/archive-v0/gG7iMw7pa3R4YIMoZk4wD/lib/python3.12/site-packages/mlx_lm/utils.py", line 266, in load
    model, config = load_model(model_path, lazy)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dome/.cache/uv/archive-v0/gG7iMw7pa3R4YIMoZk4wD/lib/python3.12/site-packages/mlx_lm/utils.py", line 184, in load_model
    model_class, model_args_class = get_model_classes(config=config)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dome/.cache/uv/archive-v0/gG7iMw7pa3R4YIMoZk4wD/lib/python3.12/site-packages/mlx_lm/utils.py", line 71, in _get_classes
    raise ValueError(msg)
ValueError: Model type qwen3_next not supported.
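For context, mlx-lm resolves the model_type from the model's config.json by importing mlx_lm.models.<model_type>, which is exactly what fails above: 0.27.1 predates the qwen3_next architecture. A quick sanity check (just a sketch, not part of my original script):

```python
import importlib.util
import mlx_lm

print(mlx_lm.__version__)
# None means the installed mlx-lm does not ship the qwen3_next architecture module yet.
print(importlib.util.find_spec("mlx_lm.models.qwen3_next"))
```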
Got it running with:
mlx uv run --with git+https://github.com/ml-explore/mlx-lm.git python prompt.py
and now I get around 45-50 tokens/s!
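For reference, prompt.py is just a minimal load-and-generate script, something like the sketch below (only the load() line is taken from the traceback; the rest is the standard mlx_lm API and the prompt text is a placeholder):

```python
from mlx_lm import load, generate

# Line 3 from the traceback: download and load the 8-bit Qwen3-Next checkpoint.
model, tokenizer = load("mlx-community/Qwen3-Next-80B-A3B-Instruct-8bit")

# Format a chat prompt and generate; verbose=True streams the output and reports tokens/s.
messages = [{"role": "user", "content": "Write a haiku about Apple Silicon."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```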
Wrote a simple one-liner to run this model: https://www.reddit.com/r/LocalLLaMA/comments/1ng7lid/run_qwen3next80ba3binstruct8bit_in_a_single_line/
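If you don't want to click through, the gist is the same uv trick, just calling the mlx_lm CLI directly instead of a script; roughly (my paraphrase, not the exact command from the post):

```
uv run --with git+https://github.com/ml-explore/mlx-lm.git \
  python -m mlx_lm.generate \
  --model mlx-community/Qwen3-Next-80B-A3B-Instruct-8bit \
  --prompt "Write a haiku about Apple Silicon." \
  --max-tokens 256
```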
Great work folks! :)