yabs.io

Yet Another Bookmarks Service

Search

Results

[https://huggingface.co/docs/optimum/index] - - public:mzimmerm
ai, doc, huggingface, llm, model, optimum, repo, small, transformer - 9 | id:1489894 -

Optimum is an extension of Transformers that provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency. It is also the repository of small, mini, tiny models.

[https://www.reddit.com/r/LocalLLaMA/comments/12vxxze/most_cost_effective_gpu_for_local_llms/] - - public:mzimmerm
ai, doc, llm, model, optimize, perform - 6 | id:1489804 -

GGML quantized models. They would let you leverage CPU and system RAM, instead of having to rely on a GPU’s. This could save you a fortune, especially if go for some used AMD Epyc platforms. This could be more viable for the larger models, especially the 30B/65B parameters models which would still press or exceed the VRAM on the P40.

Follow Tags


Export:

JSONXMLRSS