Ask local models to ask you, the user, clarifying questions that confirm the model understands the task; don't just give them instructions. Example Modelfile (Ollama syntax):
FROM llama4
SYSTEM """
When tasked with coding, writing, editing, or summarizing, ask the user up to three targeted clarifying questions. Proceed with the task once you've received answers and understand the prompt fully. If the task is a simple factual question or conversational message, respond directly.
"""
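A minimal Python sketch of how that FROM/SYSTEM snippet could be generated, assuming it is meant as an Ollama Modelfile. The helper name `render_modelfile` and the model name "clarifier" in the comments are made up for illustration; only `llama4` and the prompt text come from the note.

```python
# Sketch: render an Ollama-style Modelfile from a base model and a system prompt.
# render_modelfile is a hypothetical helper, not part of any library.

SYSTEM_PROMPT = (
    "When tasked with coding, writing, editing, or summarizing, ask the user "
    "up to three targeted clarifying questions. Proceed with the task once "
    "you've received answers and understand the prompt fully. If the task is "
    "a simple factual question or conversational message, respond directly."
)


def render_modelfile(base_model: str, system_prompt: str) -> str:
    """Return Modelfile text: a FROM line plus a triple-quoted SYSTEM block."""
    quotes = '"' * 3  # Modelfiles need straight quotes, not curly ones
    return f"FROM {base_model}\nSYSTEM {quotes}\n{system_prompt}\n{quotes}\n"


if __name__ == "__main__":
    # Print the Modelfile; redirect it to a file named Modelfile to use it.
    print(render_modelfile("llama4", SYSTEM_PROMPT))
    # Then, in a shell ("clarifier" is an example name):
    #   ollama create clarifier -f Modelfile
    #   ollama run clarifier
```

Baking the prompt into a Modelfile means every session with that model starts with the clarifying-question behavior, without pasting the instructions each time.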
This describes how to download models from Hugging Face. It has to be done through the command line, and you need a Hugging Face login, which I have.
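A rough sketch of what the command-line download can look like, assuming the `huggingface-cli` tool from the `huggingface_hub` package is installed and `huggingface-cli login` has already been run. The linked page may describe a different method. The repo ID and target directory below are placeholders, and `hf_download_command` is a made-up helper that only builds the command.

```python
import shlex


def hf_download_command(repo_id, local_dir, filename=None):
    """Build the argv for `huggingface-cli download`.

    repo_id   -- model repository in "owner/name" form
    local_dir -- directory to place the files in
    filename  -- optional single file; omit to fetch the whole repository
    """
    cmd = ["huggingface-cli", "download", repo_id]
    if filename is not None:
        cmd.append(filename)
    cmd += ["--local-dir", local_dir]
    return cmd


if __name__ == "__main__":
    # Placeholder repo ID; replace with the model you actually want.
    cmd = hf_download_command("some-org/some-model", "models/some-model")
    print(shlex.join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```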
Whisper is a set of speech recognition models that support Czech, among other languages. It seems to be a sort of standard. Try it on photo keyboard.
This code represents the very core of AI training and inference. All in 200 lines of code.
A model that uses reinforcement learning.
Train and use my model
Optimum is an extension of Transformers that provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency. It is also the repository of small, mini, tiny models.
Repository of all BERT models, including small ones. Start using this model for testing.
Comparison of the efficiency of all LLMs on Hugging Face
GGML quantized models. They would let you leverage the CPU and system RAM instead of having to rely on a GPU. This could save you a fortune, especially if you go for a used AMD EPYC platform. This could be more viable for the larger models, especially the 30B/65B-parameter models, which would still press or exceed the VRAM on the P40.
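A back-of-the-envelope sketch of why the 30B/65B models strain a single P40. It assumes the P40's 24 GB of VRAM and counts weights only. Real GGML formats also store per-block scales, so effective bits per weight run somewhat above the nominal figure, and the KV cache needs memory on top. The function name is made up for illustration.

```python
P40_VRAM_GB = 24  # Tesla P40 memory size


def weight_memory_gb(n_params_billion, bits_per_weight):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params_billion * bits_per_weight / 8


for size_b in (30, 65):
    for bits in (16, 4):
        gb = weight_memory_gb(size_b, bits)
        verdict = "fits" if gb < P40_VRAM_GB else "exceeds"
        print(f"{size_b}B at {bits}-bit: {gb:.1f} GB ({verdict} {P40_VRAM_GB} GB, weights only)")
```

With CPU inference the same numbers apply to system RAM instead, which is much cheaper to scale up than VRAM.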
High-level overview of how to train a model
Research community developing various code models, small and big. The models may not be instruction-tuned.
They have a 1.3B version!!! This may be the best one to start with for Newspeak. It should be possible to train it even on Hugging Face.
Another possible model. For coding capabilities, DeepSeek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
With the bitsandbytes optimizers (such as 8-bit AdamW), you need 2 bytes per parameter; the 14 GB of GPU memory quoted here corresponds to a 7B-parameter model (2 bytes × 7 billion parameters).
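The same rule of thumb as a small calculator, useful for checking which model sizes fit a given GPU. This is only a sketch: it takes the 2 bytes per parameter figure from the note at face value and ignores activations, batch size, and framework overhead. Both function names are made up for illustration.

```python
def training_memory_gb(n_params_billion, bytes_per_param=2.0):
    """GPU memory in GB implied by a bytes-per-parameter rule of thumb."""
    return n_params_billion * bytes_per_param


def max_params_billion(gpu_memory_gb, bytes_per_param=2.0):
    """Largest model (in billions of parameters) that fits the given memory."""
    return gpu_memory_gb / bytes_per_param


print(training_memory_gb(7))    # 14.0, the figure quoted above
print(training_memory_gb(1.3))  # 2.6, for a 1.3B model
print(max_params_billion(24))   # 12.0, for a 24 GB card
```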
Another potential model to use for Newspeak, but it is NOT open source. Advantage: 2.5B params, so it should be usable on small GPUs.
Comparison of LLM models for coding
Open source, with lots of information. Uses multiple underlying models. Not sure how I would train for it.
The Mixtral model is new and seems to be good. Click on “Demo” to test it.
The article has a comparison with other code LLMs
Review of LLMs specialized for code generation
List of LLM models on Wikipedia