_ _ _
| |__ __ _ ___ __ _| |__ _ _ _ _| | ____ _
| '_ \ / _` / __|/ _` | '_ \| | | | | | | |/ / _` |
| |_) | (_| \__ \ (_| | |_) | |_| | |_| | < (_| |
|_.__/ \__,_|___/\__,_|_.__/ \__,_|\__,_|_|\_\__,_|
This code is licensed under the License of Redistribution 1.0. A copy of the license is available on this instance at
https://code.basabuuka.org/alpcentaur/license-of-redistribution
The whole prototype is nested in a tensorflow/tensorflow:2.2.0-gpu Docker container. To make the TensorFlow Docker container work, install the graphics card drivers appropriate for your hardware and your OS.
Also get Ollama running in a Docker container that shares the network protoRustNet, so that it is reachable from the ollama-cli-server at http://ollama:11434/api/generate.
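To check that Ollama is reachable on that network, you can send a request to the generate endpoint by hand. This is only a sketch: the model name "llama3" is an assumption, so substitute whichever model you have pulled.

```shell
# Run this from a container attached to protoRustNet (sketch; the
# model name "llama3" is an assumption, use the model you pulled):
curl -s http://ollama:11434/api/generate \
  -d '{"model": "llama3", "prompt": "Hello", "stream": false}'
```

With "stream": false the endpoint returns a single JSON object instead of a stream of chunks, which is easier to inspect by eye.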
I run my Ollama container separately, together with Open WebUI. That way I can administer the models through the web UI, and then use them by changing the code in the fastapi_server.py file of the ollama-cli-server container.
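One possible way to run Ollama and Open WebUI separately on the shared network is sketched below. The image tags, volume name, and published port are assumptions, not part of this repository; only the network name protoRustNet comes from this README.

```shell
# Create the shared network once (name from this README):
docker network create protoRustNet

# Ollama with GPU access; volume name "ollama" is an assumption:
docker run -d --name ollama \
  --network protoRustNet \
  --gpus all \
  -v ollama:/root/.ollama \
  ollama/ollama

# Open WebUI, pointed at the Ollama container by its network alias;
# host port 3000 is an arbitrary choice:
docker run -d --name open-webui \
  --network protoRustNet \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://ollama:11434 \
  ghcr.io/open-webui/open-webui:main
```

Because both containers sit on protoRustNet, the hostname "ollama" resolves inside the network, which is what makes http://ollama:11434/api/generate reachable from the ollama-cli-server.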
After having set up the Ollama container and the GPU Docker drivers, start the whole project from the compose directory with
docker compose up -d
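A quick way to confirm the stack came up (a sketch; it assumes you are on the host that runs the compose project):

```shell
# List the services started by the compose project:
docker compose ps

# The interface should answer on port 1030:
curl -sI http://localhost:1030 | head -n 1
```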
The deb-rust-interface will be running on port 1030.
For instructions on how to set up a web server as a reverse proxy, you can contact basabuuka. My nginx configuration for the basabuuka prototype is the following:
upstream protointerface {
    server 127.0.0.1:1030;
}

server {
    server_name example.org;

    # access logs consume resources
    access_log /var/log/nginx/example.org.access.log;

    location / {
        proxy_pass http://protointerface;
    }

    # set high timeouts for using 14b model on 200 euro gpu
    proxy_connect_timeout 300;
    proxy_send_timeout    300;
    proxy_read_timeout    300;
    send_timeout          300;
    keepalive_timeout     300;
}
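After placing this server block in your nginx configuration (the exact path depends on your distribution; sites-available below is an assumption), validate the syntax before reloading:

```shell
# Check the configuration, then reload nginx only if the check passes:
sudo nginx -t && sudo systemctl reload nginx
```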