I have already started a couple of times with the mp4frag related nodes from @kevinGodell, to get live video streaming/recording from my cameras.
But every time I also had to stop, because I lacked time and needed to switch to other tasks. And when I afterwards come back to the mp4frag stuff, I need to study everything from scratch again...
Therefore I would like to create some kind of overview drawing, so I have my own mp4frag cheat sheet.
This first version will be full of mistakes and will lack important stuff. But I hope to get some feedback, so I can improve it until I have something decent to get me and others started quickly with streaming.
So all constructive input is very welcome! It can be written text, or you can just open my drawing in Paint and quickly draw some stuff on it. That makes it easy for me to understand, and I will add it to my drawing.
Some things that are not clear to me, for example:
The mp4frag node has settings for 3 buffers on the readme page. I'm not entirely sure if I have all these buffers in my drawing, and what rules of thumb I could use to set an optimal length for them.
Suppose I use the websocket part. I had thought that the segment data would be sent via the output messages to the ui node, and that the ui node would push it to the dashboard (via the same websocket used for all other dashboard related data). But it seems to work entirely differently? Is that correct, and is there a reason to do it like that?
If I push the segments to the dashboard, does that mean that the data is also pushed to my smartphone, even if I am not viewing any cameras in my dashboard?
When my dashboard fetches the stream from the webserver, what kind of communication is there? I mean, is the playlist fetched continuously, or something else perhaps?
If I want to show past fragments (e.g. when my PIR detects motion), does this mean they need to be fetched from the pre-buffer?
And how does the webserver know that it needs to send those past frames, or frames from the playlist?
Nice illustration. As you say, there are a few things to change. Currently, I don't have access to any type of paint software to update the drawing. I will try to clarify some things.
It depends on the duration, resolution, bitrate and all the other things that contribute to the byte size of the video and how much memory you have available on your system. The defaults might be good enough for you. I will show an example screenshot of one of my higher resolution video feeds and some of its details:
When tweaking your settings to find optimum playback, you can set the status output to include the additional mp4 metadata, which will give you extra details that you can use. In the pic, the settings show that 4 segments are buffered. On the front end, that equates to a buffered duration of 3.9 seconds, with the last segment being 0.9 seconds. The total buffer size is 1537 MB. I am using a pi 4 8gb, so I have 22 video feeds (11 cams, each using the main and sub streams) and there is plenty of unused memory still available.
The settings for the buffer output apply to the output that shows the little word 'buffer' when you mouse over it. That is used if you want to dump the initialization fragment and media segments and save them to disk or relay them via mqtt. It does not affect the playback in the browser.
Unfortunately, managing the mediasource's sourcebuffer in the browser is delicate and you can't dump the data into it too quickly. Sometimes it is in a busy state and you have to hold the segment until the mediasource is done eating the previous buffers. Sometimes, we have to toss out some old buffers before they are used, simply because the browser is choking and can't keep up.
I deliberately do not push the segments to the dashboard because they would still be getting sent even when not being viewed. That works ok for sending small things such as numbers and strings, but does not seem like a good idea for large quantities of video/image buffers. Also, your smart phone may not support mediasource extensions and then it would be pointless for it to receive video buffer that is essentially useless to it. It is consumed on the client by using a fallback system, which is why you can order the players in ui-mp4frag.
I believe that any widget on the dashboard should request its own data. The only data that should be pushed is the configuration for how the widget should connect. Obviously, most or all dashboards use the pushing technique. That is why I only send the playlist, so the client side can determine how it wants to consume it. Also, node-red-mp4frag and node-red-ui-mp4frag do not have to be used together. I do not want to tightly couple them. As the guys have shown with the HSS project, they use mp4frag WITHOUT ui-mp4frag and made their own ui template to consume the video client side.
The playlist is only sent when the video stream becomes available in the backend and when it is turned off, using the normal node.send(). The video stream itself is delivered via http if using hls.js or native hls on Safari, and via socket.io if mediasource is compatible with the browser. Of course, this depends on the order of the players. It will try them in order until it finds the first compatible option.
Show past fragments where and how? If you send the write command to mp4frag to start the buffer output, you could then use that recording technique I shared a while back to host your hls m3u8 playlist, at which point you could play it using hls.js and go back as far as you want through the recorded video segments. That is outside the capabilities or design of mp4frag, but I did add the buffer output option for the people that wanted a way to access the buffer directly for recording or relaying it via mqtt.
The webserver does not send the past frames or segments. Those are strictly used for the buffer output directly from the node. You would have to save them to disk to deliver them later on.
If you want to see the details after changing settings, view the api at /mp4frag or /mp4frag/your_cam and there is lots of data to pick through. Maybe this can clarify what changes with some of the settings.
Why is a buffer required for relaying via MQTT and recording to the filesystem? Probably a stupid question, but can't this data be sent immediately to MQTT/the filesystem?
I have drawn a dashed line from the filesystem to hls.js. Not sure if that is correct. Or is it better to push the recorded playlist (via some node or another) as an input message to node-red-ui-mp4frag, so that hls.js can fetch the data? Although that won't be possible, since the recorded files are not available via your mp4frag node.
Despite your good explanation, my brain doesn't understand the following two sections:
I assume the buffer output is related to the buffer towards the MQTT relay in my drawing? If so, why do you call it the pre-buffer?
The "Extra" segments is about past segments. That is what I meant in my previous post. Is that somehow related to the playlist buffer (which is accessible via the webserver) or is it related to the buffer towards the mqtt relay? I assume this must be something very logical, but I can't see it unfortunately
I completely understand your opinion about not pushing segment data via the standard dashboard websocket to the frontend. But I don't understand how your websocket channel is different? Does this somehow eliminate the choking effect of the browser?
Just lost my really long post here. Not sure how it happened. I am writing this again, but it won't be as thorough. Always feel free to ask any more follow-ups if I am not clear.
Do you mean that you could directly send the output from ffmpeg to mqtt? Yes, you can do that. But a single segment will be broken into many pieces, and if a single piece gets lost, the segment cannot be re-assembled on the receiving side. It's best to only send whole segments. Just my opinion. Also, it was a feature request from somebody whose service provider did not allow incoming requests, etc., and he needed to pass the video from one node-red to another so he could view the video from the cam. As for the file system and saving to disk, you could do that also. In fact, you don't really need mp4frag for anything. You could have ffmpeg write an hls.m3u8 playlist and segments directly to /dev/shm and then host http routes serving from there. That might be a little trickier on windows, as you will have to get creative without /dev/shm. The buffer output feature is there if you want to also do something else with the pieces that you are already using for live viewing. That's why it only outputs when you give it the start command.
Playback of the recordings is tricky. If you save it just as an mp4 and don't create an hls.m3u8 to accompany it, it is harder to play in the browser, especially if it has a long duration. Also, desktop players seem to work better when the playlist comes with it. This is due to the segment timestamps missing the time from the previous segments. The playlist negates that issue. Also, hls.js can easily move back and forward as you move the play head in the video element. It automatically requests the old segments in the list to fill the buffer in the browser. It will do byte range requests to get only what it needs from the recorded file, and the express static host can easily handle those requests.
Personally, I push the playlist from my recordings to ui-mp4frag. I have a tab set up with all of my current recordings ready to be viewed. And because mp4frag and ui-mp4frag are not tightly coupled, you can serve the recordings from a static dir using node-red and then play them in ui-mp4frag without needing mp4frag. Also, ui-mp4frag is not needed, because you can simply make a ui template with a video player and host hls.js somewhere.
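For anyone going the ui template route, here is a minimal sketch of what such a template could contain, assuming the playlist is reachable at /mp4frag/front_porch/hls.m3u8 (the path, element id and CDN url are just placeholders, and this is not the actual ui-mp4frag implementation):

```html
<!-- Minimal hls.js player sketch, not the actual ui-mp4frag code -->
<video id="cam_player" muted autoplay playsinline style="width:100%"></video>
<script src="https://cdn.jsdelivr.net/npm/hls.js@latest"></script>
<script>
  const url = '/mp4frag/front_porch/hls.m3u8'; // assumed playlist path
  const video = document.getElementById('cam_player');
  if (Hls.isSupported()) {
    const hls = new Hls();   // many tuning options can be passed in here
    hls.loadSource(url);     // hls.js keeps re-requesting the live playlist itself
    hls.attachMedia(video);
  } else if (video.canPlayType('application/vnd.apple.mpegurl')) {
    video.src = url;         // native HLS fallback, e.g. Safari
  }
</script>
```

The same idea works for a recorded hls.m3u8 served from a static directory, which is exactly why ui-mp4frag is optional.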
I guess I am using the words past/previous interchangeably, since they seem synonymous to me. The little yellow box above tries to explain. I don't know what else to call a previously buffered segment. The label is shortened to fit without wrapping; it looks like trash when it wraps.
Typically, when you have a motion detection event and want to trigger a recording, you will want to capture a few seconds of video from before the event. Sometimes there was something of interest already happening before you detected it. You may not need previous segments when relaying via mqtt; it is up to the end user to select 0 or up to 5. Of course, it is a best attempt to retrieve them and they may not be available if your video feed has just started or you deliberately set the size/extra to low values. It's up to you to make it NOT work. I could try to check it against the values of the other settings, but I absolutely hate jquery and waste far too much time trying to create the settings page. I am just not a front end guy. jquery should be sealed in concrete and buried deep in the earth.
The extra segments are there to help out hls.js or native hls. Sometimes it is requesting a segment that has just fallen off the hls.m3u8 playlist. If we clean it up too quickly and allow it to be garbage collected, you will end up with a 404 in the browser due to the segment no longer existing on the server. This happens due to network latency, browser overload, etc. I am really just mimicking the many settings of how ffmpeg can output an hls.m3u8 playlist and segments.
Eliminate? No. But it can help a bit by only sending the video buffer when it is being viewed. A user request long ago was to automatically stop the video from playing when it is out of view. It was tricky, but it seems to work cross-browser. Most other ui nodes probably don't deal with a high amount of data, so it is probably not a problem for them and they can get away with always pushing, even when out of view. You can try it and see. Scroll it out of view until the percentage set in Unload and Threshold is reached. See the info panel for explanations.
Also, a clarification about the mp4 buffer is needed. The initialization fragment must always be sent before any media segments, otherwise the media segments are useless and not playable. This is true whether outputting from the node's 2nd output and saving to disk, or serving via http or socket.io.
When sending to the browser and playing using mediasource, we must first ask the browser if it supports our codec. Then there is the issue of dealing with the mediasource's sourcebuffer, which is sometimes too busy to accept a new media segment, and you have to try again later when it is available. It is a delicate dance and it all has to be in the right order at the right time. Obviously, much more complicated than mjpeg.
p.s. I can't remember if I made it clear, but there is only 1 list of media segment buffers and its total count is determined by the size and extra values. So size 4 and extra 2 = 6 total media segments buffered, for example. The pre-video setting can only grab from the already buffered segments and does not keep a separate set of buffered media segments. I try to keep the absolute minimum number of segments in memory at any given time. This is also why I don't copy the msg from input to output. It is too risky to accidentally keep a buffer alive longer than it needs to be by passing it along carelessly to the following nodes. I know some have complained about this, but node-red was created long before we ever started pushing large quantities of video buffer through it and I don't want to cause anybody to run out of memory.
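For anyone curious what that dance looks like in code, here is a bare-bones sketch of the kind of queueing a client has to do with the MediaSource API (this is not ui-mp4frag's actual code; the codec string and the onSegment hook are placeholders):

```javascript
// Sketch: append fMP4 data to a SourceBuffer only when it is not busy.
const video = document.querySelector('video');
const mime = 'video/mp4; codecs="avc1.64001f, mp4a.40.2"'; // example codec string
const queue = []; // segments waiting while the SourceBuffer is updating

if (!('MediaSource' in window) || !MediaSource.isTypeSupported(mime)) {
  console.error('mediasource/codec not supported, fall back to another player');
} else {
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);
  mediaSource.addEventListener('sourceopen', () => {
    const sourceBuffer = mediaSource.addSourceBuffer(mime);
    // When an append finishes, feed the next queued segment.
    sourceBuffer.addEventListener('updateend', () => {
      if (queue.length && !sourceBuffer.updating) {
        sourceBuffer.appendBuffer(queue.shift());
      }
    });
    // Placeholder hook: call this with the init fragment first,
    // then each media segment (e.g. as they arrive over socket.io).
    window.onSegment = (arrayBuffer) => {
      if (sourceBuffer.updating || queue.length) {
        queue.push(arrayBuffer); // hold it until the buffer is free
      } else {
        sourceBuffer.appendBuffer(arrayBuffer);
      }
    };
  });
}
```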
I always say that one picture tells more than 1000 words. But this is one of those cases where I cannot draw that picture without you telling me how it works with lots of words
Here is a new version of it, based on what I learned here so far:
I must admit that I have now been able to solve some issues in my setup, based on this...
That seems a bit too modest, to be honest. Such a convenience node is a really nice wrapper that makes this kind of complex stuff available to a larger audience...
The deeper I dive into this, the more questions bubble up:
You say that a recorded mp4 on disk should be accompanied by a playlist file. I had always thought that a playlist referred to multiple mp4 files. But do you mean that the playlist contains timing information about the segments inside that single mp4 file, i.e. timing information which is not available inside the mp4 file itself?
From the docs of your mp4frag node, I had understood that - on the status output - there would be an "initialization" output message, and N "segment" output messages. However, I only get one single message when my stream starts:
Did I misunderstand this? I'm not sure how the player can show a continuous live stream when we only push a single playlist to it.
When I use hls.js it works very smoothly. I had a quick look at the 'Network' tab in my Chrome developer tools, and then I see this kind of traffic:
Does this mean that a playlist contains a link to 1 next m4s file and 3 previous m4s files? Is that what you mean by hls.js being able to play forward and backwards, perhaps?
One of my colleagues at work is using your nodes, including your recording subflow. He showed me that the subflow also generates a listing.js file, which seems to contain all the m3u8 filenames in the folder:
Is this because an m3u8 file contains the playlist for the segments inside a single mp4 file, and your listing file is some kind of custom-made playlist of all playlist files? Is this a universal format that you can use somehow?
I think my drawing is still not correct regarding the buffers. Initially I thought you had separate buffers for the output (towards e.g. mqtt) and for the ExpressJs webserver (i.e. the playlist that can be fetched). But you say - which absolutely makes sense - that you do not keep a separate list of buffered media segments in memory. So I assume both ExpressJs and the buffer output are getting their data from the same buffer? And if hls.js fetches file xxx.m3u8 from your webserver, then there is no real physical file, but you compose the file from the data inside those buffers?
If you could draw a "quick" sketch on paper of which buffers are available and from where they are written and read, that would REALLY help me to understand the internals of this. A simple quick hand-drawn sketch is more than enough!!
Normally I don't go into such depth for other nodes, but understanding your implementation is important for me, since a small error in my settings will ruin my entire performance due to the high data rates. Which I might be able to avoid with some extra insight into your internal kitchen...
I can only explain it with a real world situation. I have 11 cams and record them to disk 24/7, broken into 15 minute groups and kept on disk for 14 days. So, 4 recordings per hour, 96 per day, etc. After some time has passed, maybe 12 days or 234 years, regardless, the timing of the previous recordings is not present in the current recordings. If you just save the initialization fragment and combine it with the media segments and try to play it back as a regular old mp4, some players get confused and will be looking for the previous missing video and the play head will be screwed up. I really didn't feel like trying to fix VLC, etc., so I figured out an alternative to make the video very playable. When feeding the accompanying hls.m3u8 playlist to the various players, they seemed to like it very much, including hls.js in the browser. I can't explain the science behind it, but it works. Also, there are at least 2 types of hls.m3u8 playlists: some point to multiple segment files and some point to a single file at various byte ranges. Keeping it as a single file with the extra playlist made it easier to not have too many files. An alternative would be to use ffmpeg to remux the mp4 recording afterwards and reset the timestamps, or you can manually fix the buffer if you can figure out such things and have endless time.
The playlist itself is not being pushed, but a path to the playlist(s), so that the front end can use that info to connect. View it in your browser at /mp4frag/your_cam_name. Also, you can view a browser-friendly version as hls.m3u8.txt and keep refreshing the page to see the updates.
HLS.js is capable of much, which is why it is a gazillion lines of code. It can play back the recorded hls.m3u8 very nicely and allow you to shuffle through it very quickly. It doesn't really matter when playing from a live hls.m3u8, since the segments are designed to be deleted after a short amount of time.
Yeah, that needs some work. Of course, it would be best if all of the recordings were tracked in a database. This was a poor man's alternative that allows me to quickly shuffle through recordings. It needs more brainstorming. I was waiting for you to work on it.
Yes. All of the buffer is kept inside the mp4frag dependency and is accessible using some getters.
Yes. It was the best way to be cross-platform compatible without writing to disk and causing unnecessary wear and tear on our devices.
Well, I am not so organized in my house that I could actually find paper and pencil to make a drawing. I really wouldn't know what to put on it. It's all spaghetti in my mind.
I almost forgot the main selling point of having the accompanying hls.m3u8 with a recording. It allows you to view the recording while it is being written to disk. If you try viewing the mp4 directly while it still has segment buffers being appended, express static is not aware that the file is still growing. It will simply give it to you as-is. Since the playlist is being updated and refreshed in the browser by hlsjs with additional byte range details, it can keep requesting the correct byte ranges to keep filling its buffers and keep the video playing, while still allowing you to move the play head back and forward to scan the video with your eyes.
Ok, I think I am getting to understand the theory behind your developments.
Sounds all very logical...
I have not had time yet to experiment and test your theory in practice, but it would be nice if you could review my diagram and let me know if something relevant is missing or if something is incorrect:
Hopefully others can also benefit from this, and they will start downloading your nice nodes. And one day you will die as a poor man, but at least your nodes will continue to run in the homes of our community members
That's the best possible outcome that I could hope for
About the diagram, it's getting very close to perfection, but...
there should be an initialization fragment that first comes out of ffmpeg (and maybe the other rtsp client?)
the init frag is also parsed by mp4frag to get codec and timing info and held in memory until some request is made for it
when delivering via /video.mp4 or socket.io, or sending on the buffer output, the init frag always goes first
for hlsjs and native hls reading the live hls.m3u8 from mp4frag, the init frag is listed there for it to request, such as #EXT-X-MAP:URI="init-hls.mp4"
for hls consuming the recorded hls.m3u8, the init frag is part of the whole mp4, with its byte range listed something like #EXT-X-MAP:URI="00h.00m.00s.mp4",BYTERANGE="785@0" (starts at offset 0, size is 785 bytes); see the playlist sketch after this list
the recorded hls.m3u8 is set as #EXT-X-PLAYLIST-TYPE:EVENT. This tells the client that the video is currently live, but will come to an end at some point and will then have a final line as #EXT-X-ENDLIST. The client uses this info to determine if it should keep requesting an updated playlist or not.
if turning on the mp4 metadata for the status output, the node's status output should not be directly connected to ui-mp4frag. If you need to use the extra data when fine tuning or monitoring your setup, there should be a function node in between to separate the playlist from the other stuff. It is like this to not have an extra output on the node and to be backwards compatible while still allowing the extra data to be handled.
for the http routes, there is no /mp4frag/base_name/status, but the status can be viewed at /mp4frag/base_name/
the live hls.m3u8 can more easily be viewed in the browser by appending .txt for debugging, for example /mp4frag/base_name/hls.m3u8.txt, and you can keep refreshing it to see it update
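Putting those recording-related points together, a recorded single-file hls.m3u8 might look roughly like this (durations, sizes and byte offsets are invented for illustration; the real playlists produced by the recording flow will differ):

```
#EXTM3U
#EXT-X-VERSION:7
#EXT-X-TARGETDURATION:5
#EXT-X-PLAYLIST-TYPE:EVENT
#EXT-X-MAP:URI="00h.00m.00s.mp4",BYTERANGE="785@0"
#EXTINF:4.200,
#EXT-X-BYTERANGE:523000@785
00h.00m.00s.mp4
#EXTINF:4.200,
#EXT-X-BYTERANGE:498000@523785
00h.00m.00s.mp4
#EXT-X-ENDLIST
```

While the recording is still running, the playlist keeps growing and has no #EXT-X-ENDLIST yet; that final line is only appended once the recording stops.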
Ok, digested your last feedback and updated the cheat sheet.
I have also used two colors, to highlight the separate recording (orange) and live viewing (green) chains:
I think this is, if not the most, at least one of the most complex and deep-diving discussions that has so far appeared on this forum. And very interesting to follow, I think. Biased by a personal interest, of course.
Keep it up, this is so very informative and mind-challenging!!
The larger pieces, as you already know by now, are the media segments. Based on your screenshot, they were arriving in 4 second intervals and were still smaller than your pipe size limit of 65k. That must have been from your sub stream. Ideally, the fewer pieces it is broken into, the less work is needed by mp4frag for re-assembling later, the less memory usage overall, and then less garbage collection, etc.
Don't forget the socket_io connection info. The 3 options give the client side a choice of what is available, and it can pick which one to use based on browser compatibility or user preference. For example, Chrome and Firefox can consume the live video.mp4 (not recommended because there are browser limits on persistent http connections), but Safari cannot because it will only try to do byte range requests, which cannot work in this situation. Use the hls playlist to be able to use hlsjs or native hls, which gives smoother playback in the browser, or the socket_io stuff for the least latency at the sacrifice of being much less smooth.
That is strictly for when you give the start/restart write command and cause the buffer to start being sent on the 2nd output. If you start with preBuffer set to 0, then the buffer will start to be sent using the most current media segment and then follow that with the future media segments until it is told to stop, restart, or the internal timer stops it. For example, if you choose preBuffer 3, it will grab the past media segments (if they are available) and send those first before sending the current or future media segments. If you view the accompanying topic, you will see they are tagged with pre vs seg to differentiate, in case you need that info.
I will add some info here about the write (start, restart, and stop) commands, some of which may be undocumented since I am not sure if they are good enough and this is still a beta version. This only affects the 2nd output (labeled as buffer when you mouse over it).
start - init fragment plus media segments start getting sent to downstream wired node
values set in config panel are always used unless overridden in command input
preBuffer - try to get past/older/previous segments and output them first (after the init frag output)
timeLimit - creates an internal timer that will later call stop()
if currently outputting from a previous start command and timeLimit is not set, it has no effect
if currently outputting from a previous start command and timeLimit is set, it changes the current output's timeLimit to the new value, effectively extending the recording
stop - causes the buffer output to stop
restart - internally calls stop first and then start.
Why use start vs restart?
recording 24/7 (use restart)
Personally, I am doing 24/7 recording at the moment. There was no combination of settings that I could give to ffmpeg to make it do this for me in a way that I wanted, so I had to do this in node.js. I use an inject node that calls the restart command every 15 minutes at :00, :15, :30, :45. The restart command tells it to stop the existing output and start again, which my downstream function node can detect and then generate new filenames and make the new separate recording. Like I previously mentioned, 4 recordings per hour, all day, every day. This is a perfect situation for restart.
event recording (use start)
Another situation is that you may want to make recordings based on motion detection events or some other trigger. And because you want to have a little bit of video from before the event, you give it a pre buffer value of 3. Also, you don't want the video to record forever, and you don't want to have to remember to stop it yourself, so you give it a time limit of 30 seconds. The video will stop outputting after the time limit is reached.
extended event recording (use start)
What if you receive another motion detection event while there is already a recording happening? Sending the start command with a new time limit will update the internal time limit, and the single recording will continue until that new time limit is reached (the new time limit starts from the current time, not from the beginning of the original start). You may have an overly sensitive trigger that may send many messages, and you wouldn't want to end up with a bunch of tiny recordings.
To drive it home even further with an example, you set the pre buffer to 3 (and you already know your segments are 2 seconds in duration), which is already going to give you a 6 second recording. And you had set a time limit of 10 seconds. After it runs, you will end up with approximately a 16+ second recording. And if another triggering event also occurred 8 seconds after the original start, you will now have 6 + 8 + 10, giving you a recording of 24+ seconds.
And there may be other scenarios that I have not considered yet...
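To make the event-recording scenario concrete, here is a rough sketch of a function node that could sit between a motion trigger (e.g. a PIR) and the mp4frag node's input. The shape of the command message is my assumption, not a confirmed API; check the node's readme/info panel for the exact property names and units it expects:

```javascript
// Hypothetical function node: turn a PIR motion event into an mp4frag write command.
// NOTE: the message structure below is assumed for illustration only.
if (msg.payload === 'motion') {
    return {
        action: {
            command: 'start',   // 'start' also extends an already running output
            preBuffer: 3,       // try to prepend up to 3 already buffered segments
            timeLimit: 30000    // assumed milliseconds (~30 s); units not verified
        }
    };
}
return null; // ignore anything that is not a motion event
```

A second motion event while the output is still active would then simply push the time limit further out, as described in the extended event recording scenario above.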
The substream is MJPEG, which I had used to test the new release of my node-red-contrib-multipart-stream-decoder node recently...
I had not looked at the media segment buffer sizes. No clue to be honest how big they should be at this resolution/fps...
Ah yes, of course that makes sense. So you mean that people should adjust their cam settings to make sure they have no chunked segments, i.e. the segment buffers should be smaller than the pipe limit (in my case < 65K)?
I had interpreted this incorrectly when looking at the diagram, e.g. when I configure it to keep 3 past segments:
The live streaming (green) arrow always fetches from the current segments via the m3u8 file, never from the past segments. Is that correct?
Although you wrote above: "The extra segments are there to help out hls.js or native hls. Sometimes it is requesting a segment that has just fallen off the hls.m3u8 playlist. If we clean it up too quickly and allow it to be garbage collected, you will end up with a 404 in the browser due to the segment no longer existing on the server. This happens due to network latency, browser overload, etc." Do you mean that by keeping some past segments, this is also useful for the live viewing, to avoid those 404 errors from older links in the playlist?
The recording only uses the past segments (if available) if you specify preBuffer=3 in the input message or in the config screen. I assume you have implemented it like that just to make it as generic as possible?
Thanks also for explaining the different recording scenarios!!!
Started some experiments to see if my diagram is complete enough to start from scratch. But there is something missing in it. I want to have a few switches in my dashboard, and one of those is to start/stop viewing:
I use hls.js because, after all your info, I became more a fan of fetching instead of pushing. So when I want to stop live viewing, this means I want the ui-mp4frag node to stop fetching the playlist address. But the ui-mp4frag node keeps going:
Is there any way to stop this? Because I don't want my smartphone to keep on fetching the playlist continuously, even when I am not watching it.
Yes, I know what you mean. But I would like to run a series of high resolution streams on my Raspberry Pi, similar to Kevin's setup. I won't get it up and running without a decent understanding of all the details. Even a tiny mistake will ruin the entire performance. Therefore I am biting the bullet...
Yes. Also consider that there are many settings that you can pass to hlsjs to change its behavior, which is beyond my knowledge at this point.
Yes. Set preBuffer in the config or override it in the input command message. It seems like a value that should be able to be set dynamically.
Hide it from view after setting the Unload and Threshold values. Also, add this simple ui template so that you can make your own controls and click the eye button to toggle it from view. Or, on the backend, send it an empty playlist. Or put it in a group and toggle the group closed. It's really up to you. The pause button only pauses the playing of the native video player, not the loading of the video.
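For the "send it an empty playlist" option, a hedged sketch of the backend side: a dashboard switch feeding a function node that is wired to ui-mp4frag. I'm assuming here that the playlist info travels in msg.payload and that an empty payload unloads the player; verify this against the ui-mp4frag readme:

```javascript
// Hypothetical function node between a dashboard switch and ui-mp4frag.
// Switch ON  -> forward the stored playlist info so the player starts fetching.
// Switch OFF -> send an empty payload so the player unloads and stops fetching.
const playlist = flow.get('cam_playlist'); // assumed: stashed earlier from mp4frag's status output

msg.payload = (msg.payload === true) ? playlist : '';
return msg;
```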