Dear @kevinGodell, thanks for posting the ffmpeg module. I am testing it in the context of a home-made security system (learning process).
My current workflow is something like this:
-In Scrypted the rtsp streams are processed and motion is detected. This triggers MQTT events that I receive in Node-RED. Scrypted also provides some features like exposing the cameras to Apple HomeKit and rebroadcasting camera traffic via rtsp.
-The second layer is Node-RED. Today I am getting a picture from every camera via http and processing it with TensorFlow to detect objects of interest.
Instead of fetching images via http (which causes a lot of timeouts), I would rather use the rtsp stream to grab one picture at a time via your ffmpeg node.
I have used ffmpeg from the command line, and it provides the frames, which I could save to disk or pipe into another ffmpeg command for further processing. But that leaves the image at the operating system level, so to speak, and I would rather have the image processed in Node-RED.
Could you please expand the examples so that one of the outputs shows a single frame?
I am currently running the code on a Mac Mini (Intel, 2014).
Just to be clear so that I understand: you have 2 separate systems that are working together. The Scrypted system connects to your rtsp cams, does basic motion detection, and notifies your node-red system of the event via mqtt. Then you want to try to grab the current frame that triggered the motion detection and perform image analysis on it using tensorflow in node-red?
Are you going to be connecting to the rtsp stream being re-broadcast from Scrypted?
If you are truly going to connect just to grab a single image and then disconnect, i would suggest using the exec node running in exec mode. This will help you to avoid some issues that arise when piping data that is too large to fit inside a single chunk. The exec mode buffers all of the output and gives it all at once when the process exits.
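To make the exec-node approach concrete, here is a minimal sketch of a function node that builds the single-frame ffmpeg command for a downstream exec node (exec mode). The rtsp URL and the `buildGrabCommand` name are illustrative placeholders, not part of any node's API; substitute your own camera's URL.

```javascript
// Build the ffmpeg command string that an exec node (exec mode) can run.
// Exec mode buffers all of stdout and emits it once when ffmpeg exits,
// so the single jpeg arrives intact in msg.payload.
function buildGrabCommand(rtspUrl) {
  return [
    'ffmpeg',
    '-rtsp_transport tcp',        // tcp tends to be more reliable than udp for one-shot grabs
    `-i ${rtspUrl}`,
    '-an',                        // skip audio
    '-frames:v 1',                // stop after exactly one video frame
    '-f image2pipe -c:v mjpeg',   // write a single jpeg to stdout
    'pipe:1'
  ].join(' ');
}

// Placeholder URL for illustration only.
const cmd = buildGrabCommand('rtsp://192.168.1.10:554/cam1');
console.log(cmd);
```

In a real flow you would put the fixed command in the exec node itself, or send it in `msg.payload` with the exec node configured to append it.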
When you capture frames triggered by motion detection, maybe you should capture a sequence instead of just one single frame. Otherwise you might miss the frame of highest interest. Here I think you have to experiment and fine-tune according to your needs and camera characteristics.
@kevinGodell showed a fine example, and in that you can also define how many frames you would like to have, e.g. 15: `ffmpeg -i rtsp://somevideostream -f image2pipe -c mjpeg -frames 15 pipe:1`
With rtsp you also have a delay relative to live view, but in this case that might be beneficial compared to using http, since the frames with relevant motion will "arrive a bit delayed", so the risk of losing the event is lower, I think & hope. You have to test and try.
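When a multi-frame grab like the command above comes back through an exec node, the 15 jpegs arrive concatenated in one buffer. A sketch of splitting them on the JPEG start/end markers (SOI `FF D8`, EOI `FF D9`) — note this is a heuristic, since jpegs with embedded thumbnails contain nested markers; the function name is illustrative:

```javascript
// Split a buffer of concatenated jpegs (e.g. ffmpeg image2pipe output)
// into individual frames by scanning for SOI/EOI marker pairs.
function splitJpegs(buf) {
  const frames = [];
  let start = -1;
  for (let i = 0; i < buf.length - 1; i++) {
    if (buf[i] === 0xff && buf[i + 1] === 0xd8 && start === -1) {
      start = i;                                 // start-of-image found
    } else if (buf[i] === 0xff && buf[i + 1] === 0xd9 && start !== -1) {
      frames.push(buf.subarray(start, i + 2));   // include the EOI marker
      start = -1;
    }
  }
  return frames;
}

// Tiny synthetic example: two minimal "jpegs" back to back.
const fake = Buffer.from([0xff, 0xd8, 0x01, 0xff, 0xd9,
                          0xff, 0xd8, 0x02, 0xff, 0xd9]);
console.log(splitJpegs(fake).length); // 2
```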
The teachable machine node looks very interesting, you can train your own models with images from your own cameras. I think such a model should perform better than those generic pre-trained models publicly available. Most of the models around are trained on images with a frontal view of objects but security cameras are normally mounted higher up - looking down. Training a model with images from such mounted cameras should give a good result - in theory. Need to test that at some time with my own images...
Thanks a lot @kevinGodell , you are 100% right on what I would like to achieve.
The solution proposed works really well, although it takes several seconds to grab an image and return it to node-red. I guess it is related to the slow keyframe rate I have set up on the cameras.
In order to save processing time for tensorflow, I had made an attempt to reduce the frame rate to 3 fps, with 1 keyframe every 4 seconds if I'm not mistaken.
Lots of parameters to tweak.
I have found other threads where you have helped other friends with info, I will take some time to read before asking for more help. Thanks for driving me to this wonderful forum, never been here before, we always learn something new. Take care Kevin!
Hi @krambriw, you read my mind. I have on the roadmap to use a Teachable Machine model trained to recognize, for instance, when the garage door is open. This could be exported to Scrypted and shown in Apple's world as a sensor. Infinite options.
On grabbing several images at once you are right; my constraint is CPU power. I currently have 4 cameras at 2 megapixels each, and I am planning to add a couple of old phones with an rtsp camera app. As a result I came to the conclusion that a reduced frame rate on the cameras, with a 4-second keyframe interval, may let me analyze more images.
Thanks for sharing!
There are many intelligent and creative people here and there is always somebody willing to help.
I have a follow-up question based on your confirmation.
In Scrypted, are you using the camera's built-in motion detection notification, opencv, or pam-diff?
A possible workaround would be to maintain a long-running rtsp connection from node-red using ffmpeg to the rtsp source at the Scrypted server. You could have it generate a new jpeg once per second and always keep a fresh jpeg available for when you want to send it to tensorflow for image analysis. Of course, it would use more resources, but if you can use hardware acceleration on the node-red box, then the cpu load would be manageable. Personally, I keep 22 ffmpeg rtsp connections running on my little pi4. Each cam has a main and sub stream. On the 11 sub streams, I decode the h264 encoded video using hardware acceleration to generate 1 jpeg per second so that I can access the jpeg anytime or even stream it as mjpeg, or use it as the poster image for an html5 video element. Your mac mini may have hardware accel available to ffmpeg.
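One way to keep that "fresh jpeg" on hand is a small stateful helper fed by the long-running ffmpeg process's stdout. This is a sketch under assumptions: the class and method names (`LatestFrame`, `push`, `latest`) are invented for illustration, and the SOI/EOI marker scan is a heuristic rather than a full jpeg parser.

```javascript
// Accumulates stdout chunks from a long-running ffmpeg pipe and always
// exposes the most recent complete jpeg, so a flow can grab a fresh
// frame on demand.
class LatestFrame {
  constructor() {
    this.pending = Buffer.alloc(0);
    this.latest = null;                 // most recent complete jpeg, or null
  }
  push(chunk) {
    this.pending = Buffer.concat([this.pending, chunk]);
    // Find the last complete SOI..EOI span in the pending buffer.
    const eoi = this.pending.lastIndexOf(Buffer.from([0xff, 0xd9]));
    if (eoi === -1) return;
    const soi = this.pending.lastIndexOf(Buffer.from([0xff, 0xd8]), eoi);
    if (soi === -1) return;
    this.latest = this.pending.subarray(soi, eoi + 2);
    this.pending = this.pending.subarray(eoi + 2);  // keep the partial tail
  }
}

// Demo: a frame arriving split across two chunks.
const lf = new LatestFrame();
lf.push(Buffer.from([0xff, 0xd8, 0x01]));  // partial frame
lf.push(Buffer.from([0xff, 0xd9]));        // completes it
console.log(lf.latest !== null);           // true
```

In node-red you could keep such an object in flow context and wire the ffmpeg node's stdout output into `push`.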
If you want to check your ffmpeg, run this command to list the codecs:
For the decoder, I use h264_v4l2m2m and that ffmpeg process uses a small amount of cpu. FYI, there are limitations on how big the video can be, such as width and height. That's why I use this on my smaller resolution sub streams.
The other possibility is that your camera's substream is outputting mjpeg. In that case, there would be a completely different solution for keeping the jpegs on the ready.
Thanks a lot @kevinGodell, I do like your idea of having a long-lasting ffmpeg connection and receiving the stream at 1 frame per second.
I can actually change the second streams to mjpeg and stream at 1 frame per second. I am not sure, though, how to access that stream from node-red and extract the frame every second. I would assume that I could have an HTTP request node requesting the mjpeg URL, but from that point on I would have to guess how to extract those frames. Thanks for the idea!
I have run the command on the Mac Mini and here is the output:
ffmpeg version 5.0.1-tessus static FFmpeg binaries for macOS 64-bit Copyright (c) 2000-2022 the FFmpeg developers
built with Apple clang version 11.0.0 (clang-122.214.171.124)
configuration: --cc=/usr/bin/clang --prefix=/opt/ffmpeg --extra-version=tessus --enable-avisynth --enable-fontconfig --enable-gpl --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libmysofa --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvmaf --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-version3 --pkg-config-flags=--static --disable-ffplay
libavutil 57. 17.100 / 57. 17.100
libavcodec 59. 18.100 / 59. 18.100
libavformat 59. 16.100 / 59. 16.100
libavdevice 59. 4.100 / 59. 4.100
libavfilter 8. 24.100 / 8. 24.100
libswscale 6. 4.100 / 6. 4.100
libswresample 4. 3.100 / 4. 3.100
libpostproc 56. 3.100 / 56. 3.100
DEV.LS h264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (decoders: h264 libopenh264 ) (encoders: libx264 libx264rgb libopenh264 h264_videotoolbox )
I have read a bit about ffmpeg but am not certain which of these would let me use the Mac Mini's GPU.
I am sure Scrypted and node-red could run on their own hardware, but right now everything runs on my old Mac Mini.
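From the codec listing above, `h264_videotoolbox` is the Mac's hardware path on the encode side; for hardware-accelerated decode, ffmpeg on macOS exposes the `-hwaccel videotoolbox` option. A hedged sketch of the argument list for a long-running 1-fps jpeg pipe (the rtsp URL and function name are placeholders; whether videotoolbox decode actually engages depends on your build and stream, so test it first):

```javascript
// Argument array for spawning ffmpeg with macOS VideoToolbox hardware
// decoding, emitting 1 jpeg per second to stdout.
function hwDecodeArgs(rtspUrl) {
  return [
    '-hwaccel', 'videotoolbox',   // macOS hw decode (assumption: supported by your ffmpeg build)
    '-rtsp_transport', 'tcp',
    '-i', rtspUrl,
    '-an',                        // no audio
    '-vf', 'fps=1',               // downsample to one frame per second
    '-f', 'image2pipe', '-c:v', 'mjpeg',
    'pipe:1'
  ];
}

const args = hwDecodeArgs('rtsp://example.local/cam1');  // placeholder URL
console.log(args.join(' '));
```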
Regarding hardware acceleration, I need to buy one of those TPUs, but I still don't have it (those come with 100% or more in taxes in Brazil, where I am living right now).
With this flow I am able to process 1 FPS for 4 cameras in a sustained way, which is enough for now.
My next challenge is to get the motion detection to work. It has stopped working in Scrypted; they update stuff very often. They use OpenCV to detect motion, and my old Mac Mini is not so friendly with that.
My plan B then is detecting motion in node-red. I have found a node, "node-red-contrib-camera-motion", that works fine when there is only one instance, but behaves weirdly when more than one instance is in place.
It is thanks to @BartButenaers that we have the "node-red-contrib-multipart-stream-decoder" node. It works very well; I use it myself in some applications. For http streams it is great!
I briefly looked at Scrypted, it looks very nice and modern! Strange if they would remove such an important feature as motion detection. Maybe it is on your side, you need to update OpenCV?? I don't know.
Motion detection normally starts with comparing frames and counting pixel changes, but it is so much more. It has to handle many things like fast light changes, trees moving in the wind, supporting masks and much more.
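The "compare frames, count pixel changes" starting point can be illustrated in a few lines. This is a minimal sketch only (the function name and threshold are illustrative); real detectors add the noise filtering, masks, and light-change handling mentioned above.

```javascript
// Given two grayscale frames as byte arrays of equal size, return the
// fraction of pixels whose brightness changed by more than a threshold.
function motionScore(prev, curr, threshold = 25) {
  if (prev.length !== curr.length) throw new Error('frame size mismatch');
  let changed = 0;
  for (let i = 0; i < prev.length; i++) {
    if (Math.abs(prev[i] - curr[i]) > threshold) changed++;
  }
  return changed / prev.length;   // 0 = identical, 1 = everything changed
}

// Toy 2x2 frames: two of four pixels change a lot.
const a = new Uint8Array([10, 10, 10, 10]);
const b = new Uint8Array([10, 200, 10, 200]);
console.log(motionScore(a, b)); // 0.5
```

A flow would then fire a motion event when the score exceeds some tuned fraction for a few consecutive frames.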
Alternatives to Scrypted? Maybe you could check out Shinobi, which @kevinGodell also works/worked with, but I do not know if it meets all your requirements or how it integrates with Node-RED.
Myself, I use the pretty old but still maintained Motion; it also works very well as "middleware & motion detector" with my usb cameras. My "residential" setup is pretty customized to my needs:
-usb cameras are connected to a number of distributed RPi's, all running Node-RED and Motion
-when motion is detected, local recording starts and frames are sent for analysis via MQTT
-the analyzer is a Python script with YOLO v4 for object detection, running on an NVIDIA Jetson Nano
-if selected objects are found, those images are stored and notifications are sent via Telegram
-live viewing of cameras, stored images from analyses, and playback of recordings are all handled by Node-RED and its dashboard running on the same Jetson Nano
Object detection using YOLO is very accurate but resource intensive. That's why I use the Jetson Nano, so YOLO can utilize the power of the GPU. Analyzing an image takes in my case some 0.3 seconds. Running the same script on an RPi4, the analysis takes almost 10 seconds.
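The "frames sent for analysis via MQTT" step in the setup above could be packaged along these lines. This is a hypothetical sketch: the topic layout, field names, and `frameMessage` helper are invented for illustration, not from Motion or any particular node.

```javascript
// Package a jpeg frame as a node-red msg for an MQTT-out node, bound for
// an external analyzer (e.g. a YOLO script subscribed to cameras/+/frame).
function frameMessage(cameraId, jpegBuf) {
  return {
    topic: `cameras/${cameraId}/frame`,   // hypothetical topic scheme
    payload: JSON.stringify({
      camera: cameraId,
      ts: Date.now(),
      // mqtt payloads are bytes; base64 keeps the jpeg JSON-safe
      jpeg: jpegBuf.toString('base64')
    })
  };
}

const msg = frameMessage('garage', Buffer.from([0xff, 0xd8, 0xff, 0xd9]));
console.log(msg.topic);
```

The analyzer side decodes the base64 field back to bytes, runs detection, and can publish results on a sibling topic.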