Video stills capture

I've used your "master SD card" technique pretty much from the beginning of my playing with these small computers.

One thing you do need to be careful with for the AI is that the models that will run on the Pi, either with the Movidius co-processor or software-only, are trained on small images -- 300x300 pixels for MobileNetSSD. So more capture pixels eventually becomes a problem, as the "squish" factor of downscaling makes people much less detectable unless they are very close to the camera and fill a good percentage of the frame.

One trick I use for 1920x1080 images is to break them up into four quadrants and process each quadrant separately (a sketch of the idea is below). This of course reduces the frame rate by a factor of four, so I don't think it's viable for software-only openCV dnn on the Pi.
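A minimal sketch of the quadrant idea; the `detect()` helper is hypothetical here (one possible implementation is sketched a little further down):

```python
# Split a 1920x1080 frame into four quadrants, run the detector on each,
# and offset the returned boxes back into full-frame coordinates.
def detect_quadrants(frame, detect):
    h, w = frame.shape[:2]          # e.g. 1080, 1920
    results = []
    for y in (0, h // 2):
        for x in (0, w // 2):
            quad = frame[y:y + h // 2, x:x + w // 2]
            for (x1, y1, x2, y2, conf) in detect(quad):
                results.append((x1 + x, y1 + y, x2 + x, y2 + y, conf))
    return results
```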

I'm curious as to what you mean by using openCV for video streaming.

It's only during the development, setup, and troubleshooting phases that I use the openCV image display; for normal operation I have other options. With the help of this forum, I have a nice node-red dashboard UI to display every image during setup (again, only about one frame every 2 seconds with software-only AI on the Pi3) or only the detected frames.

I resize to 300x300, detect, scale the detection box points back to the original image, and draw the detection.
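In code, that pipeline looks roughly like this -- a sketch using cv2.dnn with the Caffe MobileNetSSD files from Adrian's tutorials (the file names and confidence threshold are assumptions):

```python
import cv2
import numpy as np

net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")

def detect(frame, conf_threshold=0.4):
    (h, w) = frame.shape[:2]
    # "squish" to the 300x300 size the model was trained on
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()
    boxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > conf_threshold:
            # boxes come back normalized 0..1, scale back to the original image
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (x1, y1, x2, y2) = box.astype("int")
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            boxes.append((x1, y1, x2, y2, confidence))
    return boxes
```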

I think it's great we are discussing this, as it's a game-changing technology for video security systems and "makers" can be at the forefront for now. I'd be surprised if most of these cheap IP cams don't have a Movidius-like chip embedded in a few years.

Yeah, well, it was about six months ago that I was investigating how I could make a solution with two cameras on a single Pi, basically using the ideas from Adrian's blog for detection.

To stream the videos via http, I defined two threaded web services (the whole thing written in a python script) that served my gui with "live" pictures from the same cameras. So I think I was wrong: I was not streaming using openCV, only analyzing; the streaming was handled by the web services.
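For anyone curious, the same effect needs very little code today. A minimal sketch using Flask (my original used hand-rolled threaded web services, so Flask, the camera source, and the route here are assumptions):

```python
from flask import Flask, Response
import cv2

app = Flask(__name__)
cap = cv2.VideoCapture(0)   # camera index/URL is an assumption

def mjpeg():
    # endless multipart JPEG stream that browsers render as live video
    while True:
        ok, frame = cap.read()
        if not ok:
            continue
        ok, jpg = cv2.imencode(".jpg", frame)
        yield (b"--frame\r\nContent-Type: image/jpeg\r\n\r\n"
               + jpg.tobytes() + b"\r\n")

@app.route("/cam")
def cam():
    return Response(mjpeg(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080, threaded=True)
```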

The analysis in that solution was, at the time, done with HOG:

import cv2

# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
# then per frame: (rects, weights) = hog.detectMultiScale(frame, winStride=(8, 8))

The solution worked ok, but the HOG analysis loaded the cpu too much. I abandoned it in favor of Motion and a "smarter" analysis using openCV with the MobileNetSSD dnn ONLY for pictures where Motion detected movement.

Maybe it would be less cpu-hungry if I re-wrote the script using MobileNetSSD instead, but for now I will stick with Motion as the base since it has web services built in that support my gui with live video.

Yes & No. You remember the discussions a couple of years ago when everyone was talking about how much processing & local storage should be done in the edge devices -- to minimize load on the network, to provide ready-analyzed data to the security management systems, etc.

I like the Movidius even though I do not have one; it is a cool idea. On the other hand, or closing in from the other side, you have the cloud services becoming really competent & fast.

I have experimented a bit with one of them, Amazon Rekognition, and it is really impressive. In terms of person detection it is very reliable and fast, I must say. You can try it out very easily: just create an account (free for one year) and upload one of your pictures with a person. I also tried the "Text in image" feature they host when I wrote a python script to detect cars and read number plates. It was surprisingly accurate.
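For reference, trying it from python is only a few lines with boto3 (assuming AWS credentials are already configured; the file name and thresholds below are just examples):

```python
import boto3

client = boto3.client("rekognition")

with open("snapshot.jpg", "rb") as f:
    image_bytes = f.read()

# person/object detection
labels = client.detect_labels(Image={"Bytes": image_bytes},
                              MaxLabels=10, MinConfidence=75)
for label in labels["Labels"]:
    print(label["Name"], label["Confidence"])

# the "Text in image" feature, e.g. for number plates
text = client.detect_text(Image={"Bytes": image_bytes})
for det in text["TextDetections"]:
    print(det["DetectedText"], det["Confidence"])
```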

So I'm not, anymore, that convinced that edge devices will be "more intelligent". The cloud services could in the future very well be hosted and available only on the intranet to make them secure and non-public. Edge devices could then be made cheaper, without "too much overhead" added locally, instead focusing on the main purpose: producing high quality pictures at high resolution and a high rate.

Besides, when it comes to upgrades with new features or bug fixes, it is so much easier with a cloud or central service than having to re-flash each device, even when that can be done OTA.

There are so many professional manufacturers around, most likely working on or already having solutions, that I would be surprised if none of them provides, if not today then soon, Movidius- or Rekognition-like features included in their video management software.

Lots of good stuff here, but to me Cloud services and personal at-home security are a non-starter unless you have cellular internet data access, which is too expensive (and slow) for us (we have Gigabit fiber). I have considered a minimal plan for emergency backup -- it is very easy to add "fail over" with most modern "home routers" -- but I just haven't been able to justify yet another monthly bill :slight_smile:

It's just too easy to knock down the Internet connection and defeat cloud services, YMMV. Maybe one cloud camera to catch who messed with the connection makes sense (you have to walk through my camera views to get to my Internet connection unless you climb the pole on my neighbor's property), but I want low latency, high priority push notifications so I can take action as it happens, instead of just being one of the zillions of after-the-crime security camera clips -- "call the TIPS line if you know this guy" -- shown on the nightly news. The downside is that the false alarm rate has to be darn near zero, and so far I am there. My experiments with Onvif cameras are getting the latency down to about one second. My main issue here is the inevitable, but not common, temporary connection failures.

I do use a cheap Android phone with a $10/month plan to push SMS notifications from the AI system via MQTT to the phone, which runs node-red with termux and termux-api. If I get SMS alerts and no Emailed photos, I know something serious is up!
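The phone-side idea, sketched in plain python instead of node-red (the broker address, topic, and phone number are made up; termux-api provides the termux-sms-send command):

```python
import subprocess
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    # forward each MQTT alert as an SMS via termux-api
    subprocess.run(["termux-sms-send", "-n", "+15555551234",
                    msg.payload.decode()])

client = mqtt.Client()
client.on_message = on_message
client.connect("192.168.2.1")       # MQTT broker (assumption)
client.subscribe("ai/alerts")       # alert topic (assumption)
client.loop_forever()
```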

The beauty of my system is that I never look at the video once I've set up the cameras, but I have about 17 days' worth should there be a serious event, with the AI detections as the index of where to look. Obviously I depend on my (and my wife's) cell phone for the notifications while away, and on speech synthesis for alerts when at home -- this is what really proves the fantastically low false alarm rate.

My other bias against "third party services": my ex-wife moved into an upscale, "gentrifying", but rather high-crime area. Monitored alarms were standard equipment in all the homes there. The alarm company had a "scheduled, short outage" and, sure enough, her house was robbed while the alarm was out -- criminal gangs have infiltrated these providers.

I think the high resolution may be over-rated. Sure, the initial wave of 320x240 video security images was barely usable, although way better than nothing, but once you get a good D1 or 720p snapshot I'm not sure that 1080p or 4K is worth the extra computer power and bandwidth required, especially if you are going to be looking at it on a cell phone. For 24/7 monitoring and after-the-crime investigations, though, I'd consider 4K the necessary minimum, as many looters and rioters have walked away from arrests without convictions because "poor quality" security video didn't meet the "reasonable doubt" standard.

I will definitely play with Amazon Rekognition as soon as I hit a stopping point with this Onvif stuff -- thanks for the link, I didn't know they have a free trial. Darknet YOLO was really impressive, but 17 seconds per image on my i7 desktop (without CUDA) made it of academic interest only. There are some "lightweight" versions of it I plan to evaluate down the road as a possible enhancement.

Here is a fairly typical news story:
attempted home invasion
Don't know if they still have it up, but I downloaded it, cut out the security camera footage, ran it through the AI, and uploaded it to youTube:
AI run on news footage.

The take-home lesson is that physical barriers are your first and perhaps most important line of defense. The guy had recently beefed up his front door, and the bad guys ran off when the door failed to "kick in", figuring they'd lost the element of surprise. He was totally oblivious to what had happened until he saw the marks on the front door the next morning. It would likely have been a very unhappy outcome had the door not held! An audio alert at the first green box detection could literally make a life or death difference.

We may be getting off topic, but after a bunch of Googling I did sort of get openCV to read an RTSP x264 stream. The frame rate is pretty decent calling imshow in a tight loop, despite the errors/warnings that get printed. It seems terribly fragile, as it only works for me with python 2.7.12 and opencv 2.4.9.1 -- not on the Raspberry Pi3 with openCV 3.4.2, nor with newer versions of openCV with either python 2.7 or 3.5 on my desktop.

I haven't had time to see if the errors are camera related, but I tested with my worst camera (in terms of Onvif compliance); the others are in use at the moment with some Movidius code trying to repeat the errors I had last night, to see if my try block is working to make the system soldier on in the event of intermittent network issues.

Here is the minimal demo:

import cv2

# Basic web-search sample code. Works (with errors) in python 2.7.12 with cv 2.4.9.1. Errors:
#     Invalid UE golomb code
#     [h264 @ 0x2b3f220] error while decoding MB 35 37, bytestream -5
# Fails with Error: '[rtsp @ 0x1897400] Nonmatching transport in server reply' in python3 with cv 3.4.1

cap = cv2.VideoCapture("rtsp://192.168.2.92:554/onvif1")
while True:
    ret, frame = cap.read()
    # a stackoverflow comment suggested this makes no difference:
    # if not ret:
    #     cap = cv2.VideoCapture("rtsp://192.168.2.92:554/onvif1")
    #     continue
    cv2.imshow('RTSP VIDEO', frame)
    cv2.waitKey(30)

Terrible situation in those videos. Almost looks "arranged" -- did they give up that easily??? But I guess, as you say, once they lost the momentum they'd better give up quickly, otherwise they could have met a man ready with his shotgun. Anyway, having video monitoring & analyzing with an audio alert (indoor and outdoor, I assume) makes sense now that we have found a solution with such a low false alarm rate. I guess, over time, even better and improved trained models will become available.

In the meantime, I have tested further with openCV and dnn (MobileNetSSD), streaming video from two cameras, i.e. dnn-analyzing each frame continuously. It really loads the cpu on the RPi3 heavily; it's actually getting too hot, at 82.7 C right now. I will have to terminate the script very soon, I think.
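For anyone repeating this experiment, a simple thermal guard is easy to add; a sketch reading the Pi's CPU temperature from sysfs (the 80 C threshold is an arbitrary choice):

```python
def cpu_temp_c():
    # the Pi exposes the SoC temperature in millidegrees C here
    with open("/sys/class/thermal/thermal_zone0/temp") as f:
        return int(f.read()) / 1000.0

# inside the capture/analyze loop:
if cpu_temp_c() > 80.0:
    print("CPU too hot, pausing analysis")
```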

This reminds me that you can't really use openCV to run the analysis this way on a normal CPU, checking every single frame. You need something "less cpu hungry" (or a GPU??) to decide which frames to analyze for objects. In my case, I have found that Motion runs with a very low CPU load, and when movement is detected, frames can be analyzed using openCV & dnn (see the sketch below). PIR detectors or similar could very well do as well.
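A sketch of how that hand-off can work: Motion's on_picture_save hook runs a command with %f replaced by the saved picture's path, so a line like `on_picture_save /home/pi/analyze.py %f` in motion.conf (the paths and model files are assumptions) can feed each motion-triggered frame to the dnn:

```python
#!/usr/bin/env python
# analyze.py -- called by Motion with the saved picture path as argv[1]
import sys
import cv2

net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")

frame = cv2.imread(sys.argv[1])
blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                             0.007843, (300, 300), 127.5)
net.setInput(blob)
detections = net.forward()
# ... filter for "person" hits and publish an alert (e.g. via MQTT)
```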

A question related to your Movidius stick: does it get hot when you analyze at full speed?

I don't know how, but it would be interesting to run openCV using the GPU on the Pi for the number crunching. It would require some more things, like compiling openCV with GPU support using CUDA. The question is also whether the methods and functions used by cv2.dnn.readNetFromCaffe have a GPU implementation or not.

> OpenCV can be compiled with GPU support using CUDA. Some methods like SURF point extraction have a gpu implementation. See here for more information.
>
> In order to use the GPU support, you need to have CUDA installed and compile the OpenCV source code with the USE_CUDA flag set in CMake.

But I guess this wanted efficiency is easily reached, by far, using the Movidius stick with its 16 cores?

The Movidius stick on the 9-camera system doesn't have any thermal issues that I'm aware of; the one currently in my i7 desktop, which has been running more or less continuously round-robin sampling two IP cameras for three days, is barely warmer to the touch than the computer case around where it's plugged in.

My Pi3B+ temp in the 9-channel system is right now about 53C; it does have a heatsink and I leave the top off the case. It's not in the best location in terms of airflow. This is the peak of our summer and so far no signs of any thermal issues. It's got a lot of node-red running to pass the data from the security DVR to the python running the Movidius. I do have code to reduce the data flow when I won't care about what the AI finds, but it's in audio alert mode now while I'm in the back room working on this stuff, so all the cameras are fed to the AI whenever the PIRs get triggered. Which, on a day like today, is a lot more than you might think reasonable -- it certainly proves why the PIRs have too high a false alarm rate for any kind of priority alerting.

The NCS has a pretty substantial heatsink built in, which on the Pi means you need a USB "pigtail" unless the Movidius is your only USB device.

My stand-alone single-PiCamera-module CPU-only prototypes using the Pi2B needed heat sinks and extra holes in the case; now they typically run 68-70C, with the GPU usually a bit hotter than the CPU. I've recently added a node-red watchdog to reboot them, as I've observed some random failures where they quit working or responding to ssh logins. Time will tell if this is a real solution or not. My goal with these is to loan them to friends and neighbors when they ask us to "watch their house" while they are away.

I'm not sure, but I think CUDA is only for NVidia graphics chips, unless other vendors have adopted the API and made drivers for it. My desktop has some low-end CUDA capability, but I've not bothered exploring it. I'm still in the "make it work" phase of development; optimization is down the road, if required.

When I first saw the local TV newscast, I thought it might be a "setup" to promote the door brace the guy had installed, but these things have been around for a long while and I doubt the local TV station would leave it up on their website if they were duped. If it's "fake news to promote a product", I think they'd be in violation of some US FCC requirements about differentiating "programming" from "advertising".

Google has recently re-released their AIY Vision Kit, and I've one on the way. It uses the camera interface to feed data to a version of the Movidius chip and supposedly can process a frame in 38 milliseconds when fed jpeg files (instead of a live camera image) from a PiZeroW. I'm hoping it'll also work with a Pi3B+. The price of the kit is competitive with the Movidius stick; hopefully the AI "bonnet" will be sold separately, as the kit comes with a PiZeroW-H, a v2 camera module, and a few other supporting things, so subtracting the price of the camera module and PiZeroW, the AI bonnet could be about half the price of the NCS if sold stand-alone. The "bonnet" is like half a "Pi Hat" :smile:

This is Texas; they were very smart to run off after possibly losing their initial element of surprise.

Edit:
I've now got a version of the Movidius AI doing round-robin sampling of two Onvif IP cameras on a Pi3B+, drawing every image in two openCV windows on the GUI, and it's using 12-14% cpu. I should be getting more cameras to test with on Thursday. The key for me is that the latency to detection is now less than a second, with sequential detection images written to a USB stick about 0.7 seconds apart.
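The round-robin part is conceptually just alternating reads between the capture objects; a CPU-only sketch (the camera URLs are assumptions, and the real code sends frames to the NCS instead of only displaying them):

```python
import itertools
import cv2

caps = [cv2.VideoCapture("rtsp://192.168.2.92:554/onvif1"),
        cv2.VideoCapture("rtsp://192.168.2.93:554/onvif1")]

# alternate between the cameras forever
for i in itertools.cycle(range(len(caps))):
    ret, frame = caps[i].read()
    if not ret:
        continue
    # run the AI on `frame` here, then display in a per-camera window
    cv2.imshow("Camera %d" % i, frame)
    if cv2.waitKey(1) == 27:    # Esc to quit
        break
```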

Very interesting indeed. Seems like that Google box could very well serve as your "neighbour house watcher", especially with the addition described here (saw later that you had already been there):

https://webrtchacks.com/aiy-vision-kit-tensorflow-uv4l-webrtc/

For other projects, it would be interesting to hook the Bonnet to an RPi3 (I think I read this is possible using an adapter of some kind) and to be able to stream from multiple ip cameras instead of using a picam, like you have done now with the NCS. Or just stick with an NCS -- it looks more professional. The only reason for testing is that someone wrote that Google claims their analyzing software is faster.

So, I wonder, is an architecture based on a single RPi3 with an NCS and multiple ip-cameras to be recommended as a rather simple and straightforward setup?

At this point making "optimum" choices is premature IMHO, but I've consolidated the variations of my Python code so it runs on my i7 desktop and Pi3B+, with automatic fallback to CPU-only AI if the NCS stick is not found, and a choice of using Onvif IP cameras, a USB camera, or the PiCamera module.

On the i7, CPU-only is actually significantly faster (Adrian at PyImageSearch found similar results with his Mac). I was surprised that USB3 vs USB2 (Ubuntu 16.04) made only a small difference:

NCS, 1280x720 USB camera: 130-150 ms/frame
USB3 NCS, 1280x720 USB camera: 100-120 ms/frame
USB3 NCS, Onvif IP 1280x720 camera: 140-230 ms/frame
CPU, Onvif IP 1280x720 camera: 80-130 ms/frame
CPU, 1280x720 USB camera: 40-60 ms/frame

Take these with a large grain of salt, as my desktop is running four desktops with more apps and browser windows & tabs open than most folks would think reasonable, on a 4K UHD TV used as a monitor. These timings are with every frame (including no detection) displayed using cv2.imshow("Detector", image).
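For what it's worth, the ms/frame numbers are just wall-clock time around the grab/detect/display cycle, roughly like this (reusing the hypothetical detect() helper and camera URL from the sketches above):

```python
import time
import cv2

cap = cv2.VideoCapture("rtsp://192.168.2.92:554/onvif1")
while True:
    t0 = time.time()
    ret, frame = cap.read()
    if not ret:
        continue
    detect(frame)                     # hypothetical detector from above
    cv2.imshow("Detector", frame)
    if cv2.waitKey(1) == 27:          # Esc to quit
        break
    print("%.0f ms/frame" % ((time.time() - t0) * 1000.0))
```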

My take-home lesson is that Onvif IP cams are the easiest way to go for multiple cameras on a single system; the two-camera system works very well on the Pi3B+, with under a second latency to detection -- my most important parameter. YMMV. But yes, at this point Onvif IP cameras, an NCS, and a Pi3B+ make a very nice and rather straightforward solution for ~$200-250. I'm planning on using 4 IP cameras, and I could always add another NCS later :slight_smile:

I will post this updated code in the "share your projects" section thread once I've added in the MQTT and updated the node-red flow to start and control it.

I plan to start replacing some Lorex cameras with Onvif netcams ASAP. The USAVision (aka Geovision) camera has been perfectly well behaved for the week I've been running it. The Besder "straight from China" cameras IMHO should be avoided: the WiFi one isn't Onvif-compatible enough for anything, and the one without WiFi has bursts of network errors about every 10-12 hours. Usually my (crude) error handling code lets it recover after a few seconds to a few minutes, but several times I've had to cycle the camera power to get it back. The USAVision only costs like $3 more from Amazon.

For me the optimum is one of the commercial systems for 24/7 recording, which they do extremely well for a very reasonable price, plus AI cameras in key locations for early detection and warning alerts. If not for the latency, my node-red-contrib-ftp-server Pi3B+ NCS add-on solution could be a viable product.

Costco has a 4-channel, 4-IP-camera, 1920x1080 network DVR for like $300, but I'm afraid the network DVR acts as a NAT router and I would not be able to talk directly to the cameras. And since I can get four of the USAVision cameras for a bit over $100, I stopped looking for on-line documentation about the commercial system.

I'm going to sign up for the Amazon "free trial" today or tomorrow.

Trying the Google AIY Vision bonnet on a Pi3B+ will be a priority for me next week assuming it arrives when promised.

Edit:
I need to get a longer PiZero camera module cable to run the AIY Vision on the Pi3B+; reports are that it does work, so I'm starting with it.

Ran a few CPU-only AI tests on the Pi3B (non-+):
Onvif IP cams at 1280x720 and 704x480: ~1 sec per frame; each image's time overlay counts up by 2 seconds. CPU is ~66-80%. After about 10 minutes the little red status-bar thermometer pops up with about two of four ticks red, and the time overlay sometimes counts up by 3. After 45+ minutes it seems stable, with about every 3rd or 4th frame counting up by 3 seconds instead of 2. Room temp here is about 28C at the moment. The thermometer notification went away a few seconds after I closed the program.

Pi Camera Module at 960x544: CPU usage is about 90% and the thermometer notification pops back up; it's running about one frame every 2.5 seconds. I'm surprised the PiCamera module slows things down this much. I'm using the PyImageSearch imutils VideoStream camera interface. All these tests were run with openCV displaying every frame on the gui.