AI home automation digital assistant

Hi guys,

I wanted to play around a bit with using AI as a digital assistant for my home automation setup, mostly as a learning experience and a fun project, and (of course) for that elusive Tony Stark feeling...

I opted to expand my old home automation project, node-red-contrib-hal2, instead of starting from scratch with something new. I also wanted to try out MCP (Model Context Protocol), and this little project felt like it would fit the bill.

For a bit of context: hal2 is a small home-automation framework for Node-RED. Every device is a Thing made of typed items — a thermostat has a target temperature item, a plug has an on item and a power item, and so on. That little abstraction turned out to be the key to the whole MCP experiment.

Getting devices in is mostly Matter these days for me. I wrote a companion bridge (node-red-contrib-matterjs-bridge) that talks to a matterjs-server and turns each commissioned Matter node into a hal2 Thing; Zigbee/Shelly/ESPHome come in through their own integrations. The important part is that everything ends up as Things — so the MCP layer never has to know what a device actually is. It just sees items with types, names and tags.
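To make the "Things made of typed items" idea a bit more concrete, here is a rough sketch of what such a structure could look like. The field names and values are my own illustration and not hal2's actual schema; the point is only that a layer sitting on top needs nothing but types, names and tags.

```javascript
// Hypothetical shapes, for illustration only; not hal2's real schema.
const cabinThermostat = {
  id: "cabin-thermostat",
  name: "Cabin thermostat",
  items: [
    { id: "target", name: "Target temperature", type: "temperature", tags: ["setpoint"], value: 19.5 },
    { id: "current", name: "Current temperature", type: "temperature", tags: ["indoor"], value: 18.2 },
  ],
};

const hallPlug = {
  id: "hall-plug",
  name: "Hall plug",
  items: [
    { id: "on", name: "On", type: "switch", tags: [], value: true },
    { id: "power", name: "Power", type: "power", tags: [], value: 42 },
  ],
};

// What a protocol-agnostic layer gets to see: no Matter, Zigbee or MQTT
// details, just items with a type, a name and some tags.
function describeThing(thing) {
  return thing.items.map((item) => ({
    thing: thing.name,
    item: item.name,
    type: item.type,
    tags: item.tags,
  }));
}

console.log(describeThing(cabinThermostat));
console.log(describeThing(hallPlug));
```

Both devices go through the same describeThing call even though one is a thermostat and the other a plug, which is the property the MCP layer relies on.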

The MCP server

The MCP server is embedded in hal2's Event handler config node. Flip it on and Node-RED starts speaking MCP over HTTP. Point an MCP client at it (I've been using Claude) and you can ask things like "is anyone home, and what's the cabin temperature?" or "dim the living room to 30%" — it reads state and sends commands through the same tool catalog the rest of hal2 uses. Each server also carries a per-location id ("Home" / "Cabin") so an assistant connected to several houses can tell them apart.

Three things turned out to be more interesting than I expected:

1. Auth that isn't tied to one provider.
The MCP spec wants OAuth, and I didn't want to hard-wire a single identity provider. So the server presents itself as an OAuth protected resource and proxies the authorization-server metadata to whatever OIDC provider you point it at: it reads the provider's /.well-known/openid-configuration, uses the advertised endpoints, and verifies the JWT access tokens locally against the provider's JWKS. A tiny dynamic-client-registration shim hands the client a confidential client you registered once. In practice you expose four paths through your reverse proxy:

  • /.well-known/oauth-protected-resource
  • /.well-known/oauth-authorization-server
  • /oauth/register
  • /mcp

…and any standard OIDC provider issuing JWT access tokens should work. I've been running it behind Caddy with PocketID.
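For anyone wondering what those endpoints roughly return, here is a minimal sketch of the two metadata documents plus the claim checks done after a token's signature has been verified. The field names come from RFC 9728 and RFC 8414; the function names, the example URLs and the choice to list the server itself under authorization_servers are my assumptions, not necessarily what hal2 does. Signature verification against the JWKS is deliberately left out, since that belongs to a JOSE library.

```javascript
// Sketch only. Function names and URLs are placeholders, not hal2's code.

// Document served at /.well-known/oauth-protected-resource (RFC 9728).
// Assumption: the MCP server also serves the authorization-server metadata,
// so it lists its own base URL here.
function protectedResourceMetadata(baseUrl) {
  return {
    resource: `${baseUrl}/mcp`,
    authorization_servers: [baseUrl],
    bearer_methods_supported: ["header"],
  };
}

// Fields copied through from the provider's openid-configuration, if present.
const PASSTHROUGH = [
  "issuer",
  "authorization_endpoint",
  "token_endpoint",
  "jwks_uri",
  "scopes_supported",
  "response_types_supported",
  "grant_types_supported",
  "code_challenge_methods_supported",
];

// Document served at /.well-known/oauth-authorization-server (RFC 8414):
// the provider's endpoints, with registration pointed at the local shim.
function authorizationServerMetadata(oidcConfig, baseUrl) {
  const doc = {};
  for (const key of PASSTHROUGH) {
    if (oidcConfig[key] !== undefined) doc[key] = oidcConfig[key];
  }
  doc.registration_endpoint = `${baseUrl}/oauth/register`;
  return doc;
}

// Claim checks to run AFTER the signature has been verified.
// Returns null when the claims are acceptable, otherwise a reason string.
function checkClaims(claims, { issuer, audience, nowSeconds }) {
  if (claims.iss !== issuer) return "wrong issuer";
  const aud = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!aud.includes(audience)) return "wrong audience";
  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) return "expired";
  return null;
}
```

Whether a given client is happy with the proxied issuer value depends on how strictly it validates the metadata, so treat this as a starting point rather than a recipe.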

2. Designing tools an LLM can actually hit.
This was the biggest lesson. The model only ever sees your tool descriptions and your error messages, so both have to do real work:

  • The catalog is hardware-gated — the server only advertises tools that match configured devices. No covers? No control_cover. Smaller surface, fewer wrong turns.
  • Errors that teach. If the model picks the wrong item, the tool doesn't just say "not found", it returns the device's available_items (ids, names, ha_type, tags) so the next call is right. Things and items are separate namespaces, and that genuinely confused the model until the errors started spelling out the choices.
  • Flexible resolution. You can address an item by id, by name, by ha_type ("the temperature of this sensor"), or by tag to disambiguate (e.g. indoor vs outdoor). Clear errors beat silent guessing every time.
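The three points above can be sketched in a few lines. This is a simplified illustration under my own assumptions: the data, the helper names and the control_light tool are made up for the example, and only control_cover, get_history and the available_items idea come from the description above.

```javascript
// Hypothetical data and helpers; not hal2's real API.
const weatherStation = {
  id: "weather-station",
  items: [
    { id: "t_in", name: "Indoor temperature", ha_type: "temperature", tags: ["indoor"] },
    { id: "t_out", name: "Outdoor temperature", ha_type: "temperature", tags: ["outdoor"] },
    { id: "hum", name: "Humidity", ha_type: "humidity", tags: ["indoor"] },
  ],
};

// Hardware gating: a tool is only advertised if a matching item type exists.
// null means "always available". control_light is an invented example.
const TOOL_REQUIREMENTS = { control_cover: "cover", control_light: "light", get_history: null };

function advertisedTools(things) {
  const present = new Set(things.flatMap((t) => t.items.map((i) => i.ha_type)));
  return Object.entries(TOOL_REQUIREMENTS)
    .filter(([, needs]) => needs === null || present.has(needs))
    .map(([name]) => name);
}

// Flexible resolution: match by id, name or ha_type, optionally narrowed by tag.
// On failure the error carries available_items so the next call can be right.
function resolveItem(thing, { item, tag } = {}) {
  const wanted = String(item ?? "").toLowerCase();
  let matches = thing.items.filter(
    (i) => i.id === item || i.name.toLowerCase() === wanted || i.ha_type === wanted
  );
  if (tag) matches = matches.filter((i) => i.tags.includes(tag));
  if (matches.length === 1) return { ok: true, item: matches[0] };
  return {
    ok: false,
    error:
      matches.length === 0
        ? `No item matches "${item}" on thing "${thing.id}".`
        : `"${item}" is ambiguous on thing "${thing.id}"; add a tag to pick one.`,
    available_items: thing.items.map(({ id, name, ha_type, tags }) => ({ id, name, ha_type, tags })),
  };
}

console.log(advertisedTools([weatherStation]));
console.log(resolveItem(weatherStation, { item: "temperature" }));
console.log(resolveItem(weatherStation, { item: "temperature", tag: "outdoor" }));
```

Asking for "temperature" without a tag fails on purpose, because two items match; the error lists the choices instead of picking one silently.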

3. Doing the work on the server, not in the model.
Classic case: "show me the cabin water temperature over the last week as a graph." The naive path is to dump every logged sample to the model and let it average, which costs a lot of tokens and a lot of busywork. So get_history can downsample server-side: ask for bucket=hour or bucket=day and you get avg/min/max/count per bucket. A week that's ~2000 raw rows becomes ~168 hourly buckets (a few hundred bytes); the model just renders it. The aggregation belongs on the server; the model should get the finished answer.
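Server-side bucketing of this kind is only a few lines. The sketch below is not the actual get_history code; the function name and the { ts, value } sample shape are assumptions for the example. It groups samples into fixed-width time buckets and returns avg/min/max/count for each, so a week at hourly resolution yields at most 7 × 24 = 168 rows.

```javascript
// Sketch of time-bucket downsampling. Assumes samples look like
// { ts: <epoch milliseconds>, value: <number> }; that shape is an assumption.
function downsample(samples, bucketMs) {
  const buckets = new Map();
  for (const { ts, value } of samples) {
    // Start of the bucket this sample falls into.
    const start = Math.floor(ts / bucketMs) * bucketMs;
    const b = buckets.get(start) ?? { start, sum: 0, min: Infinity, max: -Infinity, count: 0 };
    b.sum += value;
    b.min = Math.min(b.min, value);
    b.max = Math.max(b.max, value);
    b.count += 1;
    buckets.set(start, b);
  }
  // Emit buckets in time order; empty buckets are simply absent.
  return [...buckets.values()]
    .sort((a, b) => a.start - b.start)
    .map(({ start, sum, min, max, count }) => ({ start, avg: sum / count, min, max, count }));
}

const HOUR = 60 * 60 * 1000;
const rawSamples = [
  { ts: 0, value: 10 },
  { ts: 30 * 60 * 1000, value: 14 },
  { ts: HOUR + 1000, value: 20 },
];

const hourly = downsample(rawSamples, HOUR);
console.log(hourly);
// [ { start: 0, avg: 12, min: 10, max: 14, count: 2 },
//   { start: 3600000, avg: 20, min: 20, max: 20, count: 1 } ]
```

Keeping count in the output lets the client tell a bucket built from one reading apart from one built from fifty.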

A couple more bits

If you want to extend it, you can define your own MCP tools as plain Node-RED flows (a node registers the tool and fires a message when the assistant calls it; another returns the result — text, or even an image like a camera snapshot). And the exact same tool catalog is exposed as a plain JSON request/response API, so non-MCP things can use it too.

Caveats

Big disclaimer: this is experimental and very much a for-fun / learning project. I've only really tested it with Claude.ai as the client, PocketID as the identity provider and Caddy as the reverse proxy — mileage with other combinations may vary.

Links

Would love thoughts from this crowd — especially on the auth side (I'm sure there are sharper ways to do the OAuth dance), and on which tools actually make an AI assistant useful rather than just a gimmick. Happy to go deeper on any part if there's interest.

Here are a few examples:

Get temperature history

Play some music (using the MCP In/MCP Out node and Music Assistant in a separate container)

Not sure about the AI part given the absolute s%it-show that AI costs are coming to.

Also, the Matter part is not yet of real interest to me, since I haven't had time to get into it and I'm not sure we have much in the way of Matter-supported hardware in the house. Though I did recently purchase a SONOFF Max dongle, which includes Matter support.

However, HAL2 looks interesting. Added to my list of things to look at. :smiley:

I've recently been thinking about whether I need a new iteration of my home automation setup, so it will be interesting to see whether HAL2 could simplify things. Though I do have 433 MHz devices as well as Zigbee and Wi-Fi.

433 MHz shouldn't be a problem; I've built hal2 to be as hardware-agnostic as I could. I built it primarily for MQTT input/output, but it doesn't really care what the input is as long as it's a msg object. Take a look at this example for some simple device definitions and flows.

This is like a weird mindbender - Node-RED is deterministic. MCP is deterministic. AI is probabilistic, but you expect a deterministic outcome.

This is the crux of all this AI-based engineering: people are using and abusing AI with all kinds of guardrails in order to make it behave like a deterministic system, when it will never act as one. It is really getting out of control, and people seem to be losing the plot, IMO.

Just trying your hal2 node with one of my sensors that sends MQTT messages.


Would appreciate a bit of guidance on how to fill in the properties for the Thing node.

Part of my sample readings from MQTT

homie/ird/node41/readings : msg.payload : Object

object
humidity: 62
temperature: 28.4
etc.

This is how I have configured the MQTT-In node...