I've been a bit snowed under with work. I'm just about to do a release that also adds On Start and On Stop functions (I have been doing a lot of integration with LangGraph AI). I just want to do a bit more testing; I will post here this week.
For what it’s worth, at FlowFuse we’ve just published a new RAG Blueprint too, which uses LangGraph within the flow: OpenAI RAG Chat Agent • FlowFuse
Interesting. It would be good to see a follow-up article using a fully offline model as well. I suspect this would be even more valuable in many specialist and niche markets, because there are plenty of people who need specialist models and will want to be totally certain that their data cannot end up in public models.
The nodes can be fairly “easily” swapped out, as only the OpenAI nodes are the “online” sections. I’ve also been exploring Transformers.js and Hugging Face models for more offline AI capabilities generally.
I have worked for a magic circle law firm, and they definitely don’t want their data going out to the public web. I use Ollama a lot locally.
If you use LangChain, use this in your function instead:
// npm install @langchain/ollama @langchain/core langchain
const { ChatOllama } = require("@langchain/ollama");
const { BufferWindowMemory } = require("langchain/memory");
const { ConversationChain } = require("langchain/chains");

// Define the LLM for the agent to use
const model = new ChatOllama({
    model: "llama3.1", // or your preferred Ollama model
    temperature: 0,
    baseUrl: "http://localhost:11434" // default Ollama endpoint
});

// Initialize memory with a window of the 5 previous exchanges
const memory = new BufferWindowMemory({
    k: 5, // keep the last 5 exchanges
    returnMessages: true,
});

// Create the conversation chain
const chain = new ConversationChain({
    llm: model,
    memory: memory,
});

// Store the chain in node context so later messages can reuse it
// (in a Function node, context is available directly, not via this)
context.set('chain', chain);
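Once the chain is stored in context (e.g. from the Function node's On Start tab), the On Message code can pull it back out and call it. A minimal sketch, assuming a standard Function node where `msg.payload` carries the user's text; `askChain` is an illustrative helper, not part of LangChain:

```javascript
// Illustrative helper: fetch the stored chain and return the model's reply.
// ConversationChain's call() resolves to an object whose default output
// key is "response".
async function askChain(context, input) {
    const chain = context.get('chain');
    const { response } = await chain.call({ input });
    return response;
}

// In the Function node's On Message tab you would then do something like:
//   msg.payload = await askChain(context, msg.payload);
//   return msg;
```

Because the memory object lives inside the stored chain, each call automatically carries the last five exchanges along with it.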
If you are using the endpoint over HTTP, here are the endpoints in curl. Don't forget (I sometimes do) to pull the model first with ollama pull <MODEL_NAME>:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Compose a poem on LLMs",
  "stream": false
}'
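The same /api/generate call can be made from Node.js without any LangChain dependency. A sketch assuming Node 18+ (global fetch) and a local Ollama on the default port; `buildGenerateRequest` and `generate` are illustrative helpers, not Ollama APIs:

```javascript
// Illustrative helper: build the JSON body for Ollama's /api/generate.
// stream: false asks for a single JSON response instead of a stream of chunks.
function buildGenerateRequest(model, prompt) {
    return { model, prompt, stream: false };
}

// Illustrative helper: POST the request and return the generated text.
async function generate(baseUrl, model, prompt) {
    const res = await fetch(`${baseUrl}/api/generate`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(buildGenerateRequest(model, prompt)),
    });
    if (!res.ok) throw new Error(`Ollama returned ${res.status}`);
    const data = await res.json();
    return data.response; // the generated text
}

// Usage (with Ollama running and llama3.2 pulled):
//   const poem = await generate("http://localhost:11434", "llama3.2",
//                               "Compose a poem on LLMs");
```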