Serge - LLaMA made easy 🦙

Serge is a chat interface crafted with llama.cpp for running GGUF models. No API keys, entirely self-hosted!

  • 🌐 SvelteKit frontend
  • 💾 Redis for storing chat history & parameters
  • ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the python bindings

🎥 Demo:

demo.webm

⚡️ Quick start

🐳 Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest

🐙 Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:

Then visit http://localhost:8008. The API documentation is available at http://localhost:8008/api/docs.
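The container can take a moment to start before the page loads. As a minimal sketch, the helper below polls a URL with curl until it answers successfully. The function name `wait_for_url` and its defaults are hypothetical and not part of Serge, and it assumes the web UI returns a success status at the root URL once it is ready.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Serge.
# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as URL answers with a success status, 1 if every attempt fails.
wait_for_url() {
    url=$1
    attempts=${2:-30}   # number of tries before giving up
    delay=${3:-2}       # seconds to sleep between tries
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: treat HTTP errors as failures, -sS: quiet but still report errors
        if curl -fsS --max-time 5 -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_url http://localhost:8008/ && echo "Serge is up"` prints the message once the port published in the commands above starts responding.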

🖥️ Windows

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.

☁️ Kubernetes

Instructions for setting up Serge on Kubernetes can be found in the wiki.

🧠 Supported Models

| Category | Models |
| --- | --- |
| Alfred | 40B-1023 |
| BioMistral | 7B |
| Code | 13B, 33B |
| CodeLLaMA | 7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python |
| Gemma | 2B, 2B-Instruct, 7B, 7B-Instruct |
| Falcon | 7B, 7B-Instruct, 40B, 40B-Instruct |
| LLaMA 2 | 7B, 7B-Chat, 7B-Coder, 13B, 13B-Chat, 70B, 70B-Chat, 70B-OASST |
| LLaMA Pro | 8B, 8B-Instruct |
| Med42 | 70B |
| Medalpaca | 13B |
| Medicine | Chat, LLM |
| Meditron | 7B, 7B-Chat, 70B |
| Mistral | 7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca |
| MistralLite | 7B |
| Mixtral | 8x7B-v0.1, 8x7B-Dolphin-2.7, 8x7B-Instruct-v0.1 |
| Neural-Chat | 7B-v3.3 |
| Notus | 7B-v1 |
| Notux | 8x7b-v1 |
| Nous-Hermes 2 | Mistral-7B-DPO, Mixtral-8x7B-DPO, Mistral-8x7B-SFT |
| OpenChat | 7B-v3.5-1210 |
| OpenCodeInterpreter | DS-6.7B, DS-33B, CL-7B, CL-13B, CL-70B |
| OpenLLaMA | 3B-v2, 7B-v2, 13B-v2 |
| Orca 2 | 7B, 13B |
| Phi 2 | 2.7B |
| Python Code | 13B, 33B |
| PsyMedRP | 13B-v1, 20B-v1 |
| Starling LM | 7B-Alpha |
| TinyLlama | 1.1B |
| Vicuna | 7B-v1.5, 13B-v1.5, 33B-v1.3, 33B-Coder |
| WizardLM | 7B-v1.0, 13B-v1.2, 70B-v1.0 |
| Zephyr | 3B, 7B-Alpha, 7B-Beta |

Additional models can be requested by opening a GitHub issue. Other models are also available at Serge Models.

⚠️ Memory Usage

LLaMA will crash if you don't have enough available memory for the model.
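One way to check before loading a model is to compare the memory the host reports as available against the amount you expect the model to need. The sketch below is an illustration only: it assumes a Linux host with /proc/meminfo, the function names `available_mem_kib` and `has_enough_memory` are hypothetical, and the required size is a number you supply yourself, not a figure published by Serge.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Serge. Assumes Linux (/proc/meminfo).

# available_mem_kib [MEMINFO_FILE]
# Prints the MemAvailable value in KiB.
available_mem_kib() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# has_enough_memory REQUIRED_GIB [MEMINFO_FILE]
# Succeeds if available memory is at least REQUIRED_GIB (whole GiB).
has_enough_memory() {
    required_kib=$(( $1 * 1024 * 1024 ))
    avail_kib=$(available_mem_kib "${2:-/proc/meminfo}")
    [ -n "$avail_kib" ] && [ "$avail_kib" -ge "$required_kib" ]
}
```

For example, `has_enough_memory 8 || echo "less than 8 GiB available"` warns before you try a model you estimate at 8 GiB. When Serge runs inside Docker Desktop or WSL2, the relevant limit is the memory assigned to that environment, so the check is best run there.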

💬 Support

Need help? Join our Discord

🧾 License

Nathan Sarrazin and Contributors. Serge is free and open-source software licensed under the MIT License and Apache-2.0.

🤝 Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

git clone https://github.com/serge-chat/serge.git
cd serge/
docker compose -f docker-compose.dev.yml up --build