LocalAI
LocalAI is a drop-in replacement REST API compatible with the OpenAI API specifications, for local inferencing. It lets you run LLMs (and other models) locally or on-prem on consumer-grade hardware, and supports multiple model families compatible with the ggml format. No GPU is required.
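As a rough sketch, LocalAI can be started from its repository with Docker Compose; the exact compose file layout may differ between LocalAI versions, so treat this as an outline rather than exact commands:

```shell
# Minimal sketch of bringing up LocalAI locally (assumes Docker and
# Docker Compose are installed; paths and file names may vary by version).
git clone https://github.com/go-skynet/LocalAI
cd LocalAI

# Place a ggml-format model file into the models directory.
# "your-model.bin" is a placeholder, not a real model name.
# cp your-model.bin models/

# Start the LocalAI service in the background.
docker compose up -d
```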
Dify allows integration with LocalAI for local deployment of large language model inference and embedding capabilities.
Here, we choose two smaller models that are compatible across all platforms: serves as the default LLM model, and serves as the default Embedding model, enabling quick local deployment.
NOTE: Ensure that the value of the THREADS variable in does not exceed the number of CPU cores on your machine.
The LocalAI request API endpoint will be available at http://127.0.0.1:8080.
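Because the API is OpenAI-compatible, you can verify the endpoint with plain HTTP requests. The snippet below is a sketch that assumes LocalAI is already running on 127.0.0.1:8080; the model name "ggml-gpt4all-j" is an example and must match a model you have actually installed:

```shell
# List the models LocalAI has loaded (OpenAI-compatible /v1/models endpoint).
curl http://127.0.0.1:8080/v1/models

# Send a chat completion request; replace "ggml-gpt4all-j" with the name
# of a model present in your models directory.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ggml-gpt4all-j", "messages": [{"role": "user", "content": "Hello"}]}'
```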
It provides two models, namely:
If you deploy Dify with Docker, pay attention to the network configuration so that the Dify container can reach the LocalAI endpoint. Inside the container, localhost refers to the container itself, not your machine, so you need to use the host's IP address instead.
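One way to obtain a host address reachable from inside the container is sketched below; which option works depends on your Docker setup, so these are assumptions to verify, not guaranteed values:

```shell
# On Docker Desktop (macOS/Windows), the special hostname
# host.docker.internal resolves to the host, so the endpoint becomes:
#   http://host.docker.internal:8080

# On Linux, one option is the docker0 bridge IP of the host:
ip addr show docker0 | grep -Po 'inet \K[\d.]+'
# Use the printed address, e.g. http://172.17.0.1:8080 (address varies).
```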
Click "Save" to use the model in the application.
For more information about LocalAI, please refer to: https://github.com/go-skynet/LocalAI