Most cost effective GPU for local LLMs? : LocalLLaMA
[https://www.reddit.com/r/LocalLLaMA/comments/12vxxze/most_cost_effective_gpu_for_local_llms/] - - public:mzimmerm
GGML quantized models. They would let you leverage the CPU and system RAM instead of having to rely on a GPU's VRAM. This could save you a fortune, especially if you go for a used AMD Epyc platform. It could be more viable for the larger models, especially the 30B/65B-parameter ones, which would still press against or exceed the VRAM on a P40.
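For context, a minimal sketch of what CPU-only inference with a GGML-quantized model might look like, using the llama-cpp-python bindings; the model filename and thread count are placeholder assumptions, not from the post:

```python
# Sketch: CPU-only inference with a GGML-quantized model via llama-cpp-python
# (pip install llama-cpp-python). No GPU required; RAM holds the weights,
# which is the appeal of a cheap, many-core used Epyc platform.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-30b.ggmlv3.q4_0.bin",  # hypothetical local 4-bit file
    n_ctx=2048,    # context window
    n_threads=16,  # roughly match physical cores on the box
)

out = llm("Q: Why run large LLMs on CPU and system RAM? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

A 4-bit quant of a 30B model needs roughly 20 GB of memory, so it fits easily in server RAM while overflowing a 24 GB P40 once context is included.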