How can I define a Pydantic schema for heterogeneous Node-RED workflows when using OpenAI structured outputs?

Hi, I'm new to Node-RED. I have a question I need help with. I already posted it on Stack Overflow, but I'm pasting it here as well:

I’m experimenting with generating Node-RED workflows using an LLM. As expected, the unstructured JSON responses are often messy or invalid.

To improve reliability, I'm trying to use OpenAI's Structured Outputs feature. It works nicely with simple, fixed schemas, for example:

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

# A flat, fixed schema -- the easy case for structured outputs
class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

response = client.responses.parse(
    model="gpt-4o-2024-08-06",
    input=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    text_format=CalendarEvent,  # the SDK constrains and parses the output to this model
)

event = response.output_parsed  # a validated CalendarEvent instance

However, in my case, the target structure — a Node-RED flow — is much more complex and heterogeneous.

A Node-RED flow (or subflow) can contain many types of nodes, each with its own attributes. The Node-RED admin API docs describe the general structure, e.g.:

{
  "id": "1234",
  "label": "Sheet1",
  "nodes": [ ... ],
  "configs": [ ... ],
  "subflows": [ ... ]
}

My question is:
How should I define a Pydantic model (or a hierarchy of models) that can represent this flexible workflow structure, so that an LLM’s structured output can conform to it?

I understand that each node type may have its own schema, but I'm not sure how to model this polymorphism in a way that still works well with OpenAI's Structured Outputs or responses.parse. The only pattern I could think of is a tagged union on the type field, sketched below.
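
Here is that sketch. The node classes and their attributes are only illustrative, not the real node schemas, and I don't know whether this is the right pattern or whether responses.parse will accept the anyOf schema it produces:

from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field

class InjectNode(BaseModel):
    type: Literal["inject"]
    id: str
    wires: list[list[str]]

class FunctionNode(BaseModel):
    type: Literal["function"]
    id: str
    func: str  # JavaScript body of the function node
    wires: list[list[str]]

class DebugNode(BaseModel):
    type: Literal["debug"]
    id: str
    wires: list[list[str]]

# "type" is the discriminator, so Pydantic validates each node against the
# matching model instead of trying every variant
AnyNode = Annotated[
    Union[InjectNode, FunctionNode, DebugNode],
    Field(discriminator="type"),
]

class Flow(BaseModel):
    id: str
    label: str
    nodes: list[AnyNode]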

Any examples or design patterns for handling this kind of heterogeneous JSON structure would be appreciated.