@@ -150,19 +169,22 @@ rebuild: ## Rebuilds the project
	$(MAKE) -C go-gpt2 clean
	$(MAKE) -C go-rwkv clean
	$(MAKE) -C whisper.cpp clean
	$(MAKE) -C go-stable-diffusion clean
	$(MAKE) -C go-bert clean
	$(MAKE) -C bloomz clean
	$(MAKE) build
prepare: prepare-sources gpt4all/gpt4all-bindings/golang/libgpt4all.a $(OPTIONAL_TARGETS) go-llama/libbinding.a go-bert/libgobert.a go-gpt2/libgpt2.a go-rwkv/librwkv.a whisper.cpp/libwhisper.a bloomz/libbloomz.a ## Prepares for building
clean: ## Remove build-related files
	rm -rf ./go-llama
	rm -rf ./gpt4all
	rm -rf ./go-stable-diffusion
	rm -rf ./go-gpt2
	rm -rf ./go-rwkv
	rm -rf ./go-bert
	rm -rf ./bloomz
	rm -rf ./whisper.cpp
	rm -rf $(BINARY_NAME)
## Build:
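For context, a minimal local-build sketch that exercises the targets above (assuming a recent Go toolchain and a C/C++ compiler are installed, since the bindings are compiled from source):

```sh
git clone https://github.com/go-skynet/LocalAI
cd LocalAI
make build     # runs prepare first, compiling each binding, then the binary
make rebuild   # cleans every binding and builds again from scratch
```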
@@ -170,7 +192,8 @@ clean: ## Remove build-related files
**LocalAI** is a drop-in replacement REST API compatible with OpenAI API specifications for local inferencing. It lets you run models locally or on-prem with consumer-grade hardware, and supports multiple model families compatible with the `ggml` format. For a list of the supported model families, see [the model compatibility table below](https://github.com/go-skynet/LocalAI#model-compatibility-table).
- OpenAI drop-in alternative REST API (see the example request after this list)
- Supports multiple model families and features: text generation with GPTs, audio transcription, and image generation with Stable Diffusion (experimental)
- Once loaded the first time, it keeps models in memory for faster inference
- Support for prompt templates
- Doesn't shell out, but uses C++ bindings for faster inference and better performance.
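As a taste of the API, a minimal sketch of a chat-completion request (assuming LocalAI is listening on `localhost:8080` and a `ggml-gpt4all-j` model file is present in the models directory):

```sh
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "ggml-gpt4all-j",
  "messages": [{"role": "user", "content": "How are you?"}],
  "temperature": 0.9
}'
```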
@@ -23,6 +23,7 @@ LocalAI uses C++ bindings for optimizing speed. It is based on [llama.cpp](https
See [examples on how to integrate LocalAI](https://github.com/go-skynet/LocalAI/tree/master/examples/).
### How does it work?
<details>
@@ -33,6 +34,14 @@ See [examples on how to integrate LocalAI](https://github.com/go-skynet/LocalAI/
## News
- 16-05-2023: 🔥🔥🔥 Experimental support for CUDA (https://github.com/go-skynet/LocalAI/pull/258) in the `llama.cpp` backend and Stable Diffusion CPU image generation (https://github.com/go-skynet/LocalAI/pull/272) in `master`.
- 13-05-2023: __v1.11.0__ released! 🔥 Updated `llama.cpp` bindings: this update includes a breaking change in the model files (https://github.com/ggerganov/llama.cpp/pull/1405) - old models should still work with the `gpt4all-llama` backend.
- 12-05-2023: __v1.10.0__ released! 🔥🔥 Updated `gpt4all` bindings. Added support for GPTNeox (experimental), RedPajama (experimental), Starcoder (experimental), Replit (experimental), and MosaicML MPT. The `embeddings` endpoint now also accepts token arrays (a request sketch follows below) - see the [langchain-chroma](https://github.com/go-skynet/LocalAI/tree/master/examples/langchain-chroma) example! Note: this update does NOT include https://github.com/ggerganov/llama.cpp/pull/1405, which makes models incompatible.
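Since the `embeddings` endpoint follows the OpenAI request shape, a minimal sketch of both input forms (the model name and token ids here are illustrative placeholders, not values shipped with LocalAI):

```sh
# plain-text input
curl http://localhost:8080/v1/embeddings -H "Content-Type: application/json" -d '{
  "model": "text-embedding-ada-002",
  "input": "The food was delicious"
}'

# token-array input (the ids are illustrative)
curl http://localhost:8080/v1/embeddings -H "Content-Type: application/json" -d '{
  "model": "text-embedding-ada-002",
  "input": [2323, 1690, 27439]
}'
```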
@@ -106,7 +115,7 @@ Depending on the model you are attempting to run might need more RAM or CPU reso
@@ -204,6 +216,8 @@ To build locally, run `make build` (see below).
### Other examples
![Screenshot from 2023-04-26 23-59-55](https://user-images.githubusercontent.com/2420543/234715439-98d12e03-d3ce-4f94-ab54-2b256808e05e.png)
For other examples of how to integrate with other projects, for instance for question answering or for use with chatbot-ui, see: [examples](https://github.com/go-skynet/LocalAI/tree/master/examples/).
@@ -294,6 +308,73 @@ Specifying a `config-file` via CLI allows to declare models in a single file as
See also [chatbot-ui](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) as an example of how to use config files.
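As a usage sketch, starting LocalAI with a config file (the flag name follows the `config-file` option mentioned above; the paths are illustrative):

```sh
./local-ai --models-path ./models --config-file ./models/config.yaml
```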
### Full config model file reference
```yaml
name: gpt-3.5-turbo
# Default model parameters
parameters:
  # Relative to the models path
  model: ggml-gpt4all-j
  # temperature
  temperature: 0.3
  # all the OpenAI request options here..
  top_k:
  top_p:
  max_tokens:
  batch:
  f16: true
  ignore_eos: true
  n_keep: 10
  seed:
  mode:
  step:

# Default context size
context_size: 512
# Default number of threads
threads: 10
# Define a backend (optional). By default it will try to guess the backend the first time the model is interacted with.