> ## Documentation Index
> Fetch the complete documentation index at: https://docs.kubox.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Self-hosted LLM Chatbot

> An end-to-end example for a self-hosted chatbot with open-source LLM.

## Problem

When adopting Large Language Models (LLMs), businesses face a critical question: *Should you self-host the model or simply access it through an API?*

The right choice depends on several key factors, for example:

* Do you operate in a regulated industry where owning and protecting proprietary data is critical?
* How will it impact or augment existing business processes?
* Do you have the necessary engineering resources and expertise to build, deploy, and maintain an in-house LLM solution from scratch?

Self-hosting LLMs requires advanced DevSecOps and Kubernetes expertise, placing a high operational burden on organisations. The shortage of platform engineering skills raises the barrier to entry, often diverting data teams from integrating LLMs into business systems and delaying real-world impact.

## Solution

With Kubox, your teams can deploy a self-hosted LLM in minutes — without needing Kubernetes or cloud expertise. This end-to-end example includes full source code to deploy an open-source chatbot solution, bringing the power of Large Language Models (LLMs) into your own secure environment, while giving you full control and significantly reducing operational costs.

<img height="200" src="https://mintcdn.com/theenigmaco/7DJClRTbpxSycmG9/examples/chatbot/00-self-hosted-llama.png?fit=max&auto=format&n=7DJClRTbpxSycmG9&q=85&s=dd26bf5ec24c9aac7784f7da6e97766e" data-path="examples/chatbot/00-self-hosted-llama.png" />

### Data Infrastructure

The system uses [Ray.io](https://www.ray.io) and [vLLM](https://github.com/vllm-project/vllm) to efficiently serve a `Meta-Llama-3.1` model on NVIDIA L4 GPUs instance on AWS.

<img height="200" src="https://mintcdn.com/theenigmaco/7DJClRTbpxSycmG9/examples/chatbot/design.svg?fit=max&auto=format&n=7DJClRTbpxSycmG9&q=85&s=6f995b697b59ea45ac2d4102cb154238" data-path="examples/chatbot/design.svg" />

## Source Code

> Visit our Github repository at [https://github.com/kubox-ai/chatbot](https://github.com/kubox-ai/chatbot)

## What's next

In future examples, we’ll dive into practical challenges and solutions for optimising LLM inference performance to build cost effective, scalable and high-performant AI systems.
