How to build a video surveillance system from scratch?

BTW in the mp4frag node config screen, you can see keyframe segments and playlist segments. If stuff is buffered in memory or on disk, it would be nice to have that visualised in the flow as well.

And I'm not sure if it is clearly visualized, but the 4 file icons (at the bottom of the drawing) represent mp4 files stored on disk: recorded footage to view later on.

Frigate looks good; I may need to learn about docker to package up my system.

From my experience over the past few years, the idea of using some simple "motion detection" to crop out a region to be sent to the object detector, while logical, may in fact be counterproductive. I run 7 4K cameras and 7 1080p cameras, and I just resize the full image to the object detector's input size and run it. Counter-intuitively, it works exceptionally well.

Here is a non-alert detection, rejected because the object is outside the borders of my property:

Like Frigate, I use a Coral TPU and MobilenetSSD_v2_coco; the biggest limitation is decoding multiple rtsp streams.

A Jetson Nano with Coral TPU can process five 3 fps 4K cameras and hit an aggregate frame rate of 14.6 fps -- effectively running the AI on every frame.

My design goal was very different. I consider the 24/7 video recording problem solved by things like Zoneminder, Motion, MotionEye, etc. in the open source world, along with many reasonably priced commercial offerings. I wanted the AI to monitor the video streams, alert only when a person is detected on my property, and work with any recording setup that can forward me the rtsp streams or image snapshots.

While I fully understand Bart's dedication to node-red and DIY, I wish he'd focus his magnificent talent on something other than re-inventing this particular wheel.

I do welcome a thread to clarify how to install and use Kevin's magnificent work, as making this easier to use opens up a lot of possibilities. I've been following it but haven't had time to actually play with it, and its development has been so rapid and dynamic that I've lost track of how to actually get started, beyond a couple of Krambriw's sample flows and sub-flows that I've saved.

My final point is that monitoring live security camera feeds is incredibly boring, so much so that I never look at them unless something has happened, at which point the AI person detection images provide a great index into where the video might actually be interesting. A few days ago our mail carrier was attacked a block away from my house; luckily she escaped unharmed. I reviewed my street-facing camera video in hopes I could get a good high quality image of the fleeing suspect. No such luck, as apparently he went in the opposite direction, but I did see several police cars circling the neighborhood looking for the perp. I never knew any of this had happened until yesterday.


Hi @wb666greene,
Well in fact I know that you are right... Damn you! But it is near the top of my bucket list, so I can't easily remove it :slight_smile:

In fact, that was my goal for this discussion. But I should have said that a bit more explicitly in the introduction...

The first thing we have to do is figure out what type of video/audio streams are available. Whenever you get a chance to get to the command line, run ffprobe on your input. Be sure to use quotes around the input if it has special characters in it. Change the input url to whatever your cam's docs recommend. Also, try this on the sub stream, if there is one.

kevinGodell ~ $ ffprobe -hide_banner -i "rtsp://username:password@"
Input #0, rtsp, from 'rtsp://admin:Purple@2026@':
    title           : Media Server
  Duration: N/A, start: 0.128000, bitrate: N/A
    Stream #0:0: Video: h264 (Main), yuv420p(progressive), 2688x1520, 100 tbr, 90k tbn, 180k tbc
    Stream #0:1: Audio: aac (LC), 8000 Hz, mono, fltp

edit: a better cmd to show more details:

ffprobe -hide_banner -i "rtsp://admin:Orange2021@" -show_format -show_streams -print_format json

That is indeed a good one!!! They are all rather old cameras, so I don't have high hopes. But I am going to buy new ones as soon as I have a decent setup running in Node-RED.
But first I need to find some time to get my new RPI 4 running with a Samsung SSD that I bought last week. As soon as I manage to do that, I will get back here ...

Since you are still shopping, I will give a little recommendation of products that have worked well for me so far.

I recently added an external drive to my pi 4 for a place to put my 24/7 recordings. It was tricky to find a good external enclosure that had its own separate power supply and could also be recognized after a power loss or reboot. After much tinkering and a firmware update, I was able to make this Fideco and 6TB WD Purple work for me. The original problem it had was that the stock firmware was so old that if I rebooted the pi while leaving the enclosure powered on, the usb communications would fail.

As for cameras, I have been playing with Amcrest lately. I have moved on from using the cheapest Chinese no-brand cams to something that costs a little bit more, around $60 USD. I was pleasantly surprised at the quality and performance. They actually have a web interface that does not require internet explorer or flash player for viewing. They can also handle multiple connections. The cams with audio already encode it as aac, which makes stream copying them much easier and causes less cpu load in ffmpeg. Cams tested: 5mp poe no audio, 5mp poe with audio.


I concur about Amcrest IP cams. Another good source for cameras is "Urban Security Group". They sell direct or on Amazon (the selection is larger direct), and they have a 90 day no questions asked money back guarantee, which I can verify that they honor. They refunded my money on a rather expensive 4K 5-50X zoom camera I'd bought for a plate reader project; it worked great in daylight but was unusable at night, and when a firmware update they sent me didn't fix it, they refunded my money without hassle.

A post was split to a new topic: Problem with video feed

Hi mister @kevinGodell,
I only had 2 hours available yesterday to play with your nodes, but I'm very impressed ...
Nice work!!


Hey Kevin,
I have one Hikvision cam, so going to test that one now.
Your ffprobe command gives me the following video information (no audio):

    title           : Media Presentation
  Duration: N/A, start: 0.300000, bitrate: N/A
    Stream #0:0: Video: h264 (Main), yuvj420p(pc, bt709, progressive), 2688x1520 [SAR 1:1 DAR 168:95], 20 fps, 20 tbr, 90k tbn, 40 tbc

When I log in to the web interface of the camera, I see these settings (i.e. no AAC audio):

When I change the stream's video type to both video and audio:


Then indeed your probe command gives me both video and audio information:

    title           : Media Presentation
  Duration: N/A, start: 0.000000, bitrate: N/A
    Stream #0:0: Video: h264 (Main), yuvj420p(pc, bt709, progressive), 2688x1520 [SAR 1:1 DAR 168:95], 20 fps, 20 tbr, 90k tbn, 40 tbc
    Stream #0:1: Audio: pcm_mulaw, 8000 Hz, 1 channels, s16, 64 kb/s

Do you have any recommendations for the audio/video settings?
And can you please guide me a bit on how to compose a good ffmpeg command that is able to decode the rtsp stream?

In the camera's gui, I would set the I-frame interval to the lowest setting that it allows. Since we will be stream copying, that setting has a direct effect on the segment duration. Shorter segments are usually better for live streaming. For example, if your fps is set to 20 and the I-frame interval is set to 40, then you will have segments with a duration of approximately 2 seconds.
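To make that arithmetic concrete, here is a tiny shell sketch; the fps and I-frame values are just the example numbers from above, not read from any camera:

```shell
# Segment duration in seconds = I-frame interval / frames per second.
# Example numbers only -- substitute your camera's actual settings.
fps=20
iframe_interval=40
duration=$(awk -v g="$iframe_interval" -v f="$fps" 'BEGIN { printf "%g", g / f }')
echo "$duration"
```

With an I-frame interval of 50 at the same 20 fps, the same formula gives 2.5-second segments.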

Another thing to consider is the byte size of the segments. A higher bitrate means bigger segments, which are then slower to deliver over a network. Lower the quality settings to the least that is acceptable. That is something you will have to tweak many times to get right.
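As a rough back-of-the-envelope check (the bitrate here is a made-up example, not a measured value), segment size is just bitrate times duration divided by 8:

```shell
# Rough segment size: (bits per second * seconds) / 8 = bytes.
# Hypothetical example: a 2048 kbit/s stream with 2-second segments.
bitrate_kbps=2048
segment_seconds=2
segment_bytes=$(( bitrate_kbps * 1000 * segment_seconds / 8 ))
echo "$segment_bytes"
```

So halving the bitrate in the camera's gui roughly halves the bytes each segment pushes over the network.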

The profile is something that I never really noticed making much of a difference. Higher profiles add extra data, which allows more compression, but then becomes less compatible with older devices.

Your audio cannot be stream copied and played in the browser since there is no AAC option. It will have to be encoded if you want audio with the video in the browser, which might cost about 5% additional cpu load on the ffmpeg instance.

We should start off with the simplest ffmpeg command and only add extras as needed. If you are going to pipe the content into node-red-contrib-mp4frag, then you can use something like this:

ffmpeg -f rtsp -rtsp_transport tcp -i rtsp://your.cameras.url -f mp4 -c:v copy -c:a aac -movflags +frag_keyframe+empty_moov+default_base_moof pipe:1

Let me know if you get that working, and then we can get a little more advanced.


Splendid noob-level info. Just what I needed to get started step by step and gain understanding...
Will get back to you later on. It is my little boy's birthday today. And he has planned a lot of activities :wink:


As you can see in the screenshot above of my camera configuration, I have 20 fps and an I-frame interval of 50. So I expect every segment to contain 2.5 seconds of video footage? But when I use a node-red-contrib-msg-speed node, I see that I have on average 4 messages per 5 seconds:


Based on your explanation I would have expected only two messages to arrive every 5 seconds...
Do you have an idea what I am doing wrong?

Is there an easy way to display those segments e.g. using a node-red-contrib-image-output node (as jpeg images...)? Just to make sure that my segments contain uncorrupted images, and to be able to determine whether the quality is good (e.g. while experimenting to find an acceptable bitrate).

Message !== segment

Remember, you are piping a buffer from external software into node-red. Your system's pipe limitation on a pi is about 65k. Your segments are bigger than that, which means you will have more messages than segments, since the segments are broken into chunks of buffer.
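A quick sketch of why the message count doubles (the segment size here is hypothetical; yours will vary with bitrate):

```shell
# A pipe read on the pi returns at most ~64 KiB at a time, so a segment
# larger than that arrives as multiple messages in node-red.
segment_bytes=100000   # hypothetical ~100 KB segment
pipe_chunk=65536       # typical pipe buffer size
messages=$(( (segment_bytes + pipe_chunk - 1) / pipe_chunk ))  # ceiling division
echo "$messages"
```

With 2.5-second segments that each split into 2 chunks, you would see 4 messages every 5 seconds, which matches the msg-speed reading.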


That makes sense!
Not sure how to continue from here on...
My messages each contain a chunk of a segment, and the segments contain both audio and video. So I cannot determine whether the content is correct. It would be nice if I could extract the images at this point (and display them in a node-red-contrib-image-output node), to test whether my segments are ok. Not sure whether something like that is possible?

I believe the first chunk of each segment should be playable, but you can't play the second unless you join it to the first.

Wow, when I feed it to your mp4frag nodes, I can see the stream in my dashboard :champagne:
That is good news already...

And still would be nice if I could extract the images, e.g. for testing. Or for example to do license plate recognition, or whatever image processing...

Hey Colin,
I see buffers arriving in my messages, but I have no idea what the content is (format, ...). So it would be nice to have some insight into what kind of data is running through my Node-RED wires.

Now that we know you can connect to the cam and stream copy the rtsp content into an mp4 container, we can move on to also creating jpegs. The same situation will occur when outputting other video/images: if the size is larger than the system's pipe buffer, the content will come out in chunks. So, if you want to output a jpeg without catching the chunks and re-assembling them, you would have to lower the jpeg quality or resolution so that it fits within 65k. That would also require tweaking the settings to get right, or you can simply use node-red-contrib-pipe2jpeg to catch the chunks and ensure they are complete.

If you are using the exec node, then you will either have to stop outputting mp4 on pipe:1 or use pipe:2, but then you will have to tell ffmpeg not to do error logging, which normally goes to that pipe. If using node-red-contrib-ffmpeg-spawn, then you can output as much as you want by selecting more pipes than the standard stdio[1] and stdio[2].

silencing logging to be able to use the stderr output for jpegs:
ffmpeg -loglevel quiet -f rtsp -rtsp_transport tcp -i rtsp://your.cameras.url -f mp4 -c:v copy -c:a aac -movflags +frag_keyframe+empty_moov+default_base_moof pipe:1 -f image2pipe -c mjpeg -vf fps=fps=1 pipe:2

or only outputting jpegs:
ffmpeg -f rtsp -rtsp_transport tcp -i rtsp://your.cameras.url -f image2pipe -c mjpeg -vf fps=fps=1 pipe:1

The -vf option specifies a filter and can also be used to change the resolution (w x h) of the output.
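For example, an untested sketch (the url is a placeholder, as in the commands above); combining the fps filter with scale shrinks the jpegs to 640px wide:

```shell
# scale=640:-1 -- the -1 lets ffmpeg pick the height that preserves
# the input's aspect ratio.
ffmpeg -f rtsp -rtsp_transport tcp -i rtsp://your.cameras.url \
  -f image2pipe -c mjpeg -vf "fps=fps=1,scale=640:-1" pipe:1
```

Smaller jpegs also make it more likely each one fits within a single 65k pipe chunk.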

Because you will be decoding h264 and encoding jpeg, this will cause high cpu load. Depending on your system and version of ffmpeg and also the input video's resolution, you may be able to decode the h264 encoded video using hardware to reduce some cpu load.
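As an untested sketch of hardware decoding (the decoder name is an assumption that depends entirely on your ffmpeg build; check `ffmpeg -decoders` for what yours offers):

```shell
# Select a hardware h264 decoder for the input by placing -c:v before -i.
# h264_v4l2m2m is a common choice on newer Raspberry Pi OS ffmpeg builds;
# older 32-bit builds may offer h264_mmal instead.
ffmpeg -f rtsp -rtsp_transport tcp -c:v h264_v4l2m2m -i rtsp://your.cameras.url \
  -f image2pipe -c mjpeg -vf fps=fps=1 pipe:1
```

Note that only the h264 decode moves to hardware here; the jpeg encode still runs on the cpu.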


So if you need separate images in your Node-RED flow, you would create an (extra) rtsp stream to get those images directly from the camera. I had expected you would propose converting the segments (that are currently running through my Node-RED wires) to images, via an extra Exec node in my flow (which uses ffmpeg).

But that would perhaps be a bit too heavy for a Raspberry?

[EDIT] I think I am talking nonsense now... Your first command gets both the segments AND separate images? Or not?

Yes, but you will face limits using the exec node instead of the experimental node-red-contrib-ffmpeg-spawn node. If you can confirm which node, exec or ffmpeg-spawn, then I can give a better answer on how to proceed.

It depends on many factors. If you can take advantage of the sub streams from your cam and use one as the source for generating a jpeg, while also using hardware accelerated h264 decoding, then you might be able to get away with minimal cpu load. Personally, I have 28 ffmpeg instances running on a pi4 4gb with node-red v1.3 and it runs pretty well, but I am not often creating jpegs, so there is very little decoding/encoding, just stream copying.
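To illustrate the sub-stream idea (untested sketch; the url path is an assumption -- /Streaming/Channels/102 is a common Hikvision sub-stream path, but check your camera's docs):

```shell
# Use the camera's low-resolution sub stream as the jpeg source, leaving
# the full-resolution main stream untouched for stream copying/recording.
ffmpeg -f rtsp -rtsp_transport tcp \
  -i "rtsp://user:pass@your.cameras.url/Streaming/Channels/102" \
  -f image2pipe -c mjpeg -vf fps=fps=1 pipe:1
```

Decoding a small sub stream costs far less cpu than decoding 2688x1520 h264 just to make jpegs.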

That would still give a high cpu load from re-encoding, plus the extra overhead of another ffmpeg instance running. I suspect that you want a fairly high fps for your jpeg stream, or is it just 1 jpeg per 5 minute period?