Ettore Di Giacinto
|
295f3030a9
|
feat: add typical_p to model parameters (#598)
Signed-off-by: mudler <mudler@mocaccino.org>
|
2 years ago |
Ettore Di Giacinto
|
10ddd72b58
|
fix: set default batch size (#597)
|
2 years ago |
Ettore Di Giacinto
|
e37361985c
|
deps: update gpt4all bindings, fix search path on new versions (#592)
|
2 years ago |
Ettore Di Giacinto
|
84946e9275
|
feat: display download progress when installing models (#543)
|
2 years ago |
Ettore Di Giacinto
|
c9bbba4872
|
tests: add llama tests with openllama (#538)
Signed-off-by: mudler <mudler@mocaccino.org>
|
2 years ago |
Ettore Di Giacinto
|
5abbb134d9
|
feat: extend model configuration for llama.cpp (#536)
|
2 years ago |
Ettore Di Giacinto
|
d62aef2016
|
feat: add experimental support for falcon-7b (#516)
Signed-off-by: mudler <mudler@mocaccino.org>
|
2 years ago |
Ettore Di Giacinto
|
b503725dc7
|
fix: downgrade gpt4all (#503)
Signed-off-by: mudler <mudler@mocaccino.org>
|
2 years ago |
Samuel Maynard
|
96794851b3
|
feat: add support for `Stream: true` to completionEndpoint (#465)
|
2 years ago |
Ettore Di Giacinto
|
78ad4813df
|
feat: Update gpt4all, support multiple implementations in runtime (#472)
Signed-off-by: mudler <mudler@mocaccino.org>
|
2 years ago |
Aisuko
|
c8a4a4f4e9
|
feat: Add new test cases for LoadConfigs (#447)
Signed-off-by: Aisuko <urakiny@gmail.com>
|
2 years ago |
Pavel Zloi
|
3ba07a5928
|
feat: add LangChainGo Huggingface backend (#446)
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
|
2 years ago |
Aisuko
|
49ce24984c
|
feat: Add more test-cases and remove dev container (#433)
Signed-off-by: Aisuko <urakiny@gmail.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
|
2 years ago |
Ettore Di Giacinto
|
f401181cb5
|
fix: switch back to upstream for rwkv bindings (#432)
|
2 years ago |
Ettore Di Giacinto
|
aacb96df7a
|
fix: correctly handle errors from App constructor (#430)
Signed-off-by: mudler <mudler@mocaccino.org>
|
2 years ago |
Ettore Di Giacinto
|
217dbb448e
|
feat: allow to set a prompt cache path and enable saving state (#395)
Signed-off-by: mudler <mudler@mocaccino.org>
|
2 years ago |
Ettore Di Giacinto
|
76c881043e
|
feat: allow to preload models before startup via env var or configs (#391)
|
2 years ago |
Ettore Di Giacinto
|
bf54b78270
|
feat: add /healthz and /readyz endpoints for kubernetes (#374)
|
2 years ago |
Ettore Di Giacinto
|
9decd0813c
|
feat: update go-gpt2 (#359)
Signed-off-by: mudler <mudler@mocaccino.org>
|
2 years ago |
Robert Hambrock
|
4aa78843c0
|
fix: spec compliant instantiation and termination of streams (#341)
|
2 years ago |
Ettore Di Giacinto
|
6f54cab3f0
|
feat: allow to set cors (#339)
|
2 years ago |
Ettore Di Giacinto
|
05a3d569b0
|
feat: allow to override model config (#323)
|
2 years ago |
Ettore Di Giacinto
|
4e381cbe92
|
feat: support shorter urls for github repositories (#314)
|
2 years ago |
Ettore Di Giacinto
|
1fade53a61
|
feat: minor enhancements to /models/apply (#297)
|
2 years ago |
Ettore Di Giacinto
|
cc9aa9eb3f
|
feat: add /models/apply endpoint to prepare models (#286)
|
2 years ago |
Ettore Di Giacinto
|
3f739575d8
|
Minor fixes (#285)
|
2 years ago |
Ettore Di Giacinto
|
9d051c5d4f
|
feat: add image generation with ncnn-stablediffusion (#272)
|
2 years ago |
Ettore Di Giacinto
|
acd03d15f2
|
feat: add support for cublas/openblas in the llama.cpp backend (#258)
|
2 years ago |
Ettore Di Giacinto
|
a035de2fdd
|
tests: add rwkv (#261)
|
2 years ago |
Ettore Di Giacinto
|
2488c445b6
|
feat: bert.cpp token embeddings (#241)
|
2 years ago |
Ettore Di Giacinto
|
b4241d0a0d
|
tests: enable whisper (#239)
|
2 years ago |
Ettore Di Giacinto
|
8250391e49
|
Add support for gptneox/replit (#238)
|
2 years ago |
Ettore Di Giacinto
|
fd1df4e971
|
whisper: add tests and allow to set upload size (#237)
|
2 years ago |
Ettore Di Giacinto
|
4413defca5
|
feat: add starcoder (#236)
|
2 years ago |
Ettore Di Giacinto
|
85f0f8227d
|
refactor: drop code dups (#234)
|
2 years ago |
Ettore Di Giacinto
|
59e3c02002
|
make use of new bindings for gpt4all (#232)
|
2 years ago |
Matthew Campbell
|
032dee256f
|
Keep whisper models in memory (#233)
|
2 years ago |
Matthew Campbell
|
6b5e2b2bf5
|
Upload transcription API wasn't reading the data from the post (#229)
|
2 years ago |
Ettore Di Giacinto
|
11675932ac
|
feat: add dolly/redpajama/bloomz models support (#214)
|
2 years ago |
Ettore Di Giacinto
|
f8ee20991c
|
feat: add bert.cpp embeddings (#222)
|
2 years ago |
Ettore Di Giacinto
|
9f426578cf
|
feat: add transcript endpoint (#211)
|
2 years ago |
Ettore Di Giacinto
|
89dfa0f5fc
|
feat: add experimental support for embeddings as arrays (#207)
|
2 years ago |
Dave
|
07ec2e441d
|
mini fix - OpenAI documentation url (#200)
|
2 years ago |
mudler
|
8c8cf38d4d
|
tests: use 1 core
|
2 years ago |
mudler
|
009ee47fe2
|
Don't allow 0 as thread count
|
2 years ago |
mudler
|
ec2adc2c03
|
tests: use 3 cores
|
2 years ago |
mudler
|
e62ee2bc06
|
fix: remove trailing 0s from embeddings
This happens when max_tokens is not set: by default go-llama then allocates
a larger slice than needed, and the unused tail is padded with zeros.
|
2 years ago |
mudler
|
b49721cdd1
|
fix: respect config from file for backends settings
|
2 years ago |
mudler
|
64c0a7967f
|
fix: pass prediction options when using the model
|
2 years ago |
mudler
|
e96eadab40
|
feat: support deprecated embeddings API
|
2 years ago |