
Showing posts with the label ollama

Improving LLMs CAG with accumulative knowledge using RabbitMQ, Ollama and Gemma3

Why RabbitMQ? An LLM that receives many messages from many sources needs to keep track of those messages and its own responses, so that context carries across the whole conversation. It can't be treated as one continuous stream: users sometimes come back after days or months, and what the model knows about that person should be stored. Storing is only one part of retaining context, though. We have all heard about RAG and CAG, but what happens if I need to store the information in real time? Users keep sending messages, so saving and retrieving them would block the AI's response time. Therefore, I figured: why not use a queue system that notifies a consumer about new messages, stores them in a DB, so t...
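To make the idea concrete, here is a minimal sketch of the decoupling described above: the request path only enqueues a message, and a separate consumer writes it to a database. It is not the post's implementation. As a stand-in, it uses Python's standard-library `queue.Queue` in place of RabbitMQ and an in-memory `sqlite3` database in place of the real DB; the table name `history`, the `STOP` sentinel, and the functions `consume` and `history_for` are hypothetical names chosen for illustration.

```python
import queue
import sqlite3
import threading

STOP = None  # hypothetical sentinel that tells the consumer to shut down


def consume(messages, conn):
    """Take (user_id, text) pairs off the queue and persist each one."""
    while True:
        item = messages.get()
        if item is STOP:
            break
        user_id, text = item
        conn.execute(
            "INSERT INTO history (user_id, text) VALUES (?, ?)",
            (user_id, text),
        )
        conn.commit()


def history_for(conn, user_id):
    """Return one user's stored messages in arrival order."""
    rows = conn.execute(
        "SELECT text FROM history WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return [row[0] for row in rows]


# In-memory database standing in for the real store (an assumption).
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute(
    "CREATE TABLE history (id INTEGER PRIMARY KEY, user_id TEXT, text TEXT)"
)

# queue.Queue stands in for the RabbitMQ queue (an assumption).
messages = queue.Queue()
worker = threading.Thread(target=consume, args=(messages, conn))
worker.start()

# The request path only enqueues; it never waits on the database write.
messages.put(("alice", "Hi, I'm Alice."))
messages.put(("bob", "Hello"))
messages.put(("alice", "Remember me?"))

# Shut the consumer down and wait until everything has been stored.
messages.put(STOP)
worker.join()

print(history_for(conn, "alice"))
```

With a real broker, the `messages.put(...)` calls would become publishes to a queue and `consume` would run as a separate process subscribed to it; the shape of the flow stays the same.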