yabs.io

Yet Another Bookmarks Service

Viewing mzimmerm's Bookmarks


[https://www.xda-developers.com/local-llm-clarifying-questions-system-prompt/] - - public:mzimmerm
llm, local, model - 3 | id:1546855 -

Ask local models to put clarifying questions to you, the user, so that you can confirm the model understands the task; don't just give them instructions.

FROM llama4
SYSTEM """
When tasked with coding, writing, editing, or summarizing, ask the user up to three targeted clarifying questions. Proceed with the task once you've received answers and understand the prompt fully. If the task is a simple factual question or conversational message, respond directly.
"""
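The FROM/SYSTEM snippet quoted above looks like Ollama Modelfile syntax; that reading is an assumption, since the bookmark does not name the tool. As a minimal sketch under that assumption, the following Python builds the same file as a string so the prompt can be reused with other base models. The helper name `build_modelfile` is hypothetical, not part of any library.

```python
# Sketch only: assumes the quoted snippet is an Ollama-style Modelfile.
# build_modelfile is a hypothetical helper introduced for illustration.

CLARIFY_PROMPT = (
    "When tasked with coding, writing, editing, or summarizing, ask the user "
    "up to three targeted clarifying questions. Proceed with the task once "
    "you've received answers and understand the prompt fully. If the task is "
    "a simple factual question or conversational message, respond directly."
)


def build_modelfile(base_model: str, system_prompt: str) -> str:
    """Return Modelfile text with a FROM line and a triple-quoted SYSTEM block."""
    if '"""' in system_prompt:
        # A literal triple quote inside the prompt would end the block early.
        raise ValueError('system prompt must not contain """')
    return f'FROM {base_model}\nSYSTEM """\n{system_prompt}\n"""\n'


modelfile = build_modelfile("llama4", CLARIFY_PROMPT)
print(modelfile)
```

If the text is saved to a file named Modelfile, something like `ollama create clarifier -f Modelfile` followed by `ollama run clarifier` should register and start the customized model; the model name `clarifier` is only a placeholder.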

[https://huggingface.co/docs/optimum/index] - - public:mzimmerm
ai, doc, huggingface, llm, model, optimum, repo, small, transformer - 9 | id:1489894 -

Optimum is an extension of Transformers that provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency. It is also the repository of small, mini, tiny models.

[https://www.reddit.com/r/LocalLLaMA/comments/12vxxze/most_cost_effective_gpu_for_local_llms/] - - public:mzimmerm
ai, doc, llm, model, optimize, perform - 6 | id:1489804 -

GGML quantized models. They let you run on the CPU and system RAM instead of relying on a GPU's VRAM. This could save you a fortune, especially if you go for a used AMD Epyc platform. It could be more viable for the larger models, especially the 30B/65B-parameter models, which would still press or exceed the VRAM on the P40.
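To make the VRAM argument concrete, here is a rough back-of-the-envelope sketch of weight storage alone. It assumes memory is simply parameter count times bits per weight, ignores the KV cache, activations, and per-block quantization overhead, and takes the Tesla P40's 24 GB of VRAM as the budget. The function name and the chosen bit widths are illustrative assumptions, not figures from the thread.

```python
# Rough estimate of weight storage only (assumption: no KV cache, no
# activations, no quantization block overhead). Names are illustrative.

P40_VRAM_GB = 24  # Tesla P40 memory capacity


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in decimal GB: parameters * bits / 8."""
    return params_billion * bits_per_weight / 8


for params_billion in (30, 65):
    for bits in (16, 4):
        size_gb = weight_footprint_gb(params_billion, bits)
        verdict = "fits in" if size_gb <= P40_VRAM_GB else "exceeds"
        print(f"{params_billion}B at {bits}-bit: ~{size_gb:.1f} GB, {verdict} {P40_VRAM_GB} GB")
```

Under these assumptions a 4-bit 30B model needs about 15 GB for weights and a 4-bit 65B model about 32.5 GB, so the larger one does not fit on a single P40 even when quantized, which is where CPU inference from system RAM becomes attractive.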
