Standalone Raspberry Pi Security Camera with AI "person detection"

My first implementation used AWS Lambda and S3; later I changed to Rekognition. It worked very well and is fast, no doubt, and it off-loads the edge devices, which can be made simpler (they just push the image). It's true that it is free up to a certain usage level, but only for the first year.

I also agree with @wb666greene; I prefer a local solution, but that is a personal choice. From a technology point of view, they all do the job.

By chance, do you have a script that does items 1-3 in your "Centralized DNN analyzer" solution diagram? I just ordered my Movidius, and once I get it up and running I would like to do the same thing as described in items 1-3, except send the results to Home Assistant. Also, I was thinking I could use Wyze cameras with the Dafang hack and get those to save pictures to my Pi or NAS?

Let me look into that; my scripts are a bit personalized, and I think I need to add some more comments while making them more generic.

Since you have ordered a Movidius, the actual analysis will then take place in that device, but my analysis is done in software only, so the part supporting and using the stick you will need to build yourself. You can most likely use my script as input for such a modification (and you can keep mine as-is and compare performance if you run it on a reasonably powerful laptop or so).

As mentioned, I am running the DNN analyzer on a Lenovo laptop that is a couple of years old, and I think it is very fast at analyzing. Maybe not fully up to speed for self-driving cars, but for my application it is fast enough.

I'll be back with the scripts, but it might take a day or so for me to finish, considering other things on the "to-do" list.

Kind regards, Walter

That would be amazing. I have several items on my "to-do" list as well, including just getting the Movidius set up, so take your time. Right now I am most concerned with extracting the analysis results and getting them into MQTT, so anything that helps me in that regard is greatly appreciated.

For this setup, to be able to give adequate support if needed, I assume that the reader is somewhat familiar with Python and is able to install the required Python packages using pip.

I also assume that you are able to install OpenCV on the computer running the analyzer. I did follow this guide:
https://milq.github.io/install-opencv-ubuntu-debian/

Furthermore, I assume that we have a common computer available with "decent" performance where the DNN analysis takes place. I have tried running it on an RPi3 and it works, but the performance was not satisfying, so instead I now use an older Lenovo Intel Core i5 laptop with Debian, 12 GB RAM and an SSD disk; all in all, pretty good performance.

1) Sending pictures via MQTT
The Python script "pic_send_via_mqtt.py" reads a picture file and sends it to a camera-specific topic on an MQTT broker. The script must be called with two parameters:

  • the path to and name of the picture file to be sent
  • a camera id to tell the DNN analyzer from which camera the picture was taken

Typically you call the script like this:

python pic_send_via_mqtt.py -p /home/pi/Pictures/picname.jpg -c 32

In my setup I use the great Motion software, so this call is simply executed when Motion detects motion. If you use some other software, you have to arrange for the call to be made in a way that software supports.
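For reference, in Motion the hook is the on_picture_save event, where the %f specifier expands to the full path of the saved picture (camera id 32 here is just the example from above):

on_picture_save python /home/pi/pic_send_via_mqtt.py -p %f -c 32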

Just a few words about the camera id: I use numbers for my cameras, and these are defined in the DNN analyzer script, but this can of course be modified. The only thing is that you need to modify both ends if you decide to do so.

This script shall be installed on every entity that will send pictures for analysis. In my case I have a number of distributed RPi3's handling 2 cameras each, so the script is installed on each of them.

The Python script does have some dependencies that are required as well; see the import section of the script.
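To give an idea of what such a sender involves, here is a minimal sketch. It uses paho-mqtt rather than the bundled mosquitto.py (both are discussed further down in this thread), and the broker host and the topic layout ("image/&lt;camera id&gt;") are assumptions you will need to adapt to your own setup:

# Minimal sketch of a picture sender over MQTT (assumes paho-mqtt is installed)
import argparse
import paho.mqtt.publish as publish

parser = argparse.ArgumentParser()
parser.add_argument('-p', '--picture', required=True, help='path to the picture file')
parser.add_argument('-c', '--camera', required=True, help='camera id')
args = parser.parse_args()

# Read the picture file as raw bytes; the analyzer decodes it on the other side
with open(args.picture, 'rb') as f:
    payload = f.read()

# One-shot publish to a camera specific topic, e.g. image/32
publish.single('image/' + args.camera, payload, hostname='localhost')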

2) DNN analysis
The Python script "centralized_dnn_analyzer.py" (yes, the name can be discussed; change it if you like) currently subscribes to all camera-specific topics on the MQTT broker.

When a picture arrives, the script starts analyzing it, looking for objects using the MobileNetSSD Caffe model. If one or more persons are detected with the required confidence level, a text message and a picture with those persons framed are created and sent to a mobile phone using Telegram (if you plan to use Telegram, you need to create an account to get a chat id and a token). A sketch of this detection step follows below.

[Image: example notification picture with the detected persons framed]

To avoid filling up the phone, only two pictures (configurable) are sent per event, assuming all the many more pictures taken by the cameras are saved locally for later investigation. Furthermore, the pictures sent to the phone are compressed to make them lightweight, still good enough to give a good overview.
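In outline, the detection step looks roughly like the sketch below, using OpenCV's dnn module. The model file names and the ~0.3 default threshold match what is discussed later in this thread, but treat this as a sketch, not the exact analyzer script:

# Sketch of person detection with the MobileNetSSD Caffe model in OpenCV
import cv2

PERSON = 15  # class index of 'person' in the standard MobileNetSSD (VOC) labels

net = cv2.dnn.readNetFromCaffe('MobileNetSSD_deploy.prototxt',
                               'MobileNetSSD_deploy.caffemodel')

def detect_persons(frame, conf_threshold=0.3, size=400):
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (size, size)),
                                 0.007843, (size, size), 127.5)
    net.setInput(blob)
    detections = net.forward()  # shape (1, 1, N, 7)
    boxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if int(detections[0, 0, i, 1]) == PERSON and confidence > conf_threshold:
            # Coordinates are relative, so scale them to the original frame
            box = (detections[0, 0, i, 3:7] * [w, h, w, h]).astype(int)
            boxes.append(box)
            cv2.rectangle(frame, (box[0], box[1]), (box[2], box[3]), (0, 255, 0), 2)
    return boxes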

The script also has some additional features:

Heartbeat: used to check that the script is running. I let Node-RED (NR) send the keyword "Heartbeat" every five minutes and, if the script is running, it responds with an answer that is analyzed. If no answer is given, NR can be configured to take the necessary actions (see the sketch after these items).

Terminate: to stop the script, I let NR send the key phrase "Stop-DNN-Analyzer", which terminates the script instead of it just being killed.
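Conceptually, both features live in the MQTT on_message callback. A sketch, shown here with a paho-style callback; the topic "image/hb-in" is the one named further down in this thread, while the response topic "image/hb-out" is an assumption:

# Sketch of heartbeat/terminate handling in the MQTT on_message callback
import sys

def on_message(client, userdata, msg):
    if msg.topic == 'image/hb-in':
        text = msg.payload.decode('utf-8')
        if text == 'Heartbeat':
            # Reply so that Node-RED can verify the analyzer is alive
            client.publish('image/hb-out', 'Alive')  # response topic is an assumption
        elif text == 'Stop-DNN-Analyzer':
            # Controlled shutdown instead of killing the process
            client.disconnect()
            sys.exit(0)
        return
    # ...otherwise the payload is a picture from one of the camera topics...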

Installation is simple, but you will have to install the required Python packages before the script will run; see the import section of the script. I also recommend installing and using the MQTT broker on this computer to get the best possible performance.

To start the script (it should run all the time) there are several ways; I personally run it as a service using systemd. See this "how to": https://www.raspberrypi-spy.co.uk/2015/10/how-to-autorun-a-python-script-on-boot-using-systemd/

Or why not install Node-RED in the computer and let it start the script using the Exec node?

The content of the zip file can be extracted to /home/pi; that is how I did it. Then you can copy the files "pic_send_via_mqtt.py" and "mosquitto.py" to each entity that shall send pictures.

(I had to provide an external link since uploading zip-files here is not allowed)
https://drive.google.com/drive/folders/1vd2LbJB7WcEUwnWlEhDxkn3S6P5y4tmc?usp=sharing

We will see if this works out well or if I have missed mentioning something...


Thank you very much for the thorough and detailed write-up! This is a massive help as I set off on the journey of building my own object detection system. I receive my Movidius this afternoon, so I'm actively looking forward to diving into this. Thank you again for your help!

I thought I should just add something about starting & running the DNN analyzer script.

To start & run it for testing, simply use the following command in a terminal program (like gnome-terminal in Debian):

python /home/pi/centralized_dnn_analyzer.py

That's it; the script will start and wait for you to send pictures for analysis. To terminate the script, you just send the key phrase "Stop-DNN-Analyzer" mentioned above from Node-RED to the MQTT broker, topic "image/hb-in".

For a more "professional" setup, you might prefer to run the script as a service that automatically starts if and when you boot the computer.

This is how I did that configuration:

Create a new unit file for the service with the content as below:

sudo nano /lib/systemd/system/dnnanalyzer.service

[Unit]
Description=My Script DNN Analyzer Service
After=multi-user.target

[Service]
Type=idle
ExecStart=/usr/bin/python /home/pi/centralized_dnn_analyzer.py

[Install]
WantedBy=multi-user.target

The permissions on the unit file need to be set to 644:

sudo chmod 644 /lib/systemd/system/dnnanalyzer.service

Next steps:

sudo systemctl daemon-reload

Now tell systemd to start it during the boot sequence:

sudo systemctl enable dnnanalyzer.service

From here you can use the commands below to start, stop, restart and check the status of the service:

sudo systemctl start dnnanalyzer.service
sudo systemctl stop dnnanalyzer.service
sudo systemctl restart dnnanalyzer.service
sudo systemctl status dnnanalyzer.service

If everything is working as expected, you now have a DNN analyzer server with rather good performance, hopefully framing those unexpected visitors in a timely manner.

But please, just remember to check the local regulations for video monitoring in your region. Where I live, we are not allowed to have cameras monitoring any part of public areas without specific permission.

Thanks so much for the additional information. This is fantastic work! I'm very anxious to get started on this project and hope to get it working as seamlessly as yours. And no worries about the regulations; for now, I'm going to try to deploy it inside my house for presence detection/body count.

I can't thank you enough; I was just starting to stumble around the Pillow docs to figure out how to do this.

The need arose because a friend is running my system on Windows 10, but since that machine also hosts an externally accessible web site, I advised him not to install Node-RED on it and to run the notification on a Raspberry Pi (which he already had on hand) instead -- which of course meant I "volunteered" to modify my code to support this.

Your Python pic_send_via_mqtt.py script solves the issue completely: inserting the four lines that are the "guts" of composing the buffer to be sent via MQTT into my code did the trick.

Thanks again.

I've "unified" my code to run on Linux or Windows: it can use multiple NCS sticks if available, fall back to the DNN module if no NCS is available, or run one CPU AI thread in addition to one thread for each NCS installed.

I'm finding that a single i3, software only, is about as fast as the NCS on the Raspberry Pi 3B+, ~6.5 fps. The i3 with one CPU AI thread and one NCS is getting ~16 fps.

On my i7 desktop, with everything else I do still running, 1 CPU AI thread and 1 NCS thread hit ~38 fps getting frames from 15 rtsp camera threads. These can often be "camera limited", as the AI can be starved for frames, lowering the composite frame rate; it took 15 rtsp streams to apparently saturate the AI subsystem on my i7 -- I didn't have enough http snapshot cameras to do it.

I've found that rtsp mp4 streams can feed frames to the AI at a higher rate than http "snapshots" or ftp, but the latency is very much worse.

I will share my code after I fix a few things, but the NCS stuff has had a monkey wrench thrown into it by Intel. They introduced an NCS SDK v2 which broke all existing code; I wasted some time with it, but it offered no performance benefits that I could see. Then they came out with the Movidius NCS2, but the NCS SDK v2 didn't support it, only their "OpenVINO" code. This is where I plan to move my code, as it "seamlessly" supports CPU AI and NCS/NCS2.

I don't have a timeline, but your send-images-via-MQTT code was one of the "more time consuming" items on my TODO list! I'm still debating if there is a need to allow mixing rtsp stream cameras with http snapshot cameras; right now it's either/or via a command line option.


I'm glad to hear it! I really enjoy this type of discussion; it is so good to exchange ideas and experiences.
BR & Thank You!
Walter

I've had a chance to go through your dnn_analyzer.py code and I've learned a few things (always looking to improve my Python understanding, which is pretty basic). I'm more of a "domain" expert than a language expert. I've always said: "it's easier to teach a biochemist enough about programming to solve a biochemistry problem than it is to teach a programmer enough biochemistry to solve a biochemistry problem." :)

For example, your cn = { '41':0, '42':0, ...} confused me initially, until I realized it was a Python dictionary being used as an "array" indexed by strings instead of numerical indices (0, 1, ... N-1). Nice to know you can do such a thing.

I'm not seeing how your timer gets started; I thought it needed Timer.start() somewhere. I assume this logic is to limit the Telegram messages to 2/minute. With your code as a starting point, I may give Telegram another try; I bogged down in BotFather and adding users to a group. I did send messages from Node-RED but never could get multiple recipients to work. I didn't put a lot of effort into it, as email and SMS were working fine for me, although recent Android updates seem to have made those notifications a lot less reliable on my phone :(

What I find most interesting is your use of Contrast Limited Adaptive Histogram Equalization (CLAHE) pre-processing of your images. I may have to give that a try, as the bulk of my static (background clutter) false detections come under IR illumination or during the dusk-to-dawn transitions and vice versa. But I'm wondering why you are converting the grayscale back to color afterwards, as this seems to do nothing other than make the image size (h,w,3) instead of (h,w,1) -- or are three planes required for the Caffe model? (I've not looked into the details, just a luser at this point :) ). I thought the traditional way to do this on color images was to convert BGR to LAB, run CLAHE on the L channel and then convert back to BGR.
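For anyone who wants to try that LAB variant, a minimal sketch with OpenCV (clipLimit and tileGridSize are the two knobs to experiment with):

# CLAHE applied to the L channel of a LAB conversion, then back to BGR
import cv2

def clahe_bgr(frame, clip_limit=2.0, grid=(8, 8)):
    lab = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=grid)
    l = clahe.apply(l)  # equalize only the lightness channel
    return cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)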

We seem to be using the exact same MobileNetSSD_deploy.caffemodel and prototxt files. You are sending 500x500 images in the blob to net.forward(), whereas I'm using 300x300 because I thought that was the size the model was trained on and thus had to be the input size. I can't help but think the results would be better if the AI started with higher resolution images, but 500x500 is a lot more coefficients to configure in the training. If it has been done, I'd be a fool for not using it!

I too am enjoying and benefiting immensely from this discussion.

One other trivial question: why did you use the included mosquitto.py instead of paho-mqtt? Their functions/methods seem very similar.

Also, I just noticed your confidence thresholds seem pretty low, ~0.3; I usually use ~0.8. Or do you have a higher tolerance for false positives than I do?

I'm not seeing how your timer gets started

This happens in the on_message function:

if not tmrs[cnbr].is_alive():  # the timer for this camera is not currently running
    tmrs[cnbr].start()         # so the first picture in a new window starts it

And you are correct, this is to limit the number of pictures sent from the same camera, in this case to 2/minute.
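One detail worth noting for anyone adapting this: a threading.Timer is single-shot, so once it has fired it cannot be start()ed again; the object in tmrs[cnbr] has to be replaced with a fresh Timer. A sketch of the overall pattern (tmrs, cn and cnbr are the names from the script; the rest is illustrative, not the exact code):

from threading import Timer

tmrs = {}  # one Timer per camera id
cn = {}    # pictures sent per camera in the current one-minute window

def reset(cnbr):
    # Timer callback: reopen the window; a fired Timer cannot be restarted,
    # so a fresh instance replaces the old one
    cn[cnbr] = 0
    tmrs[cnbr] = Timer(60.0, reset, args=[cnbr])

def allow_picture(cnbr):
    cn.setdefault(cnbr, 0)
    tmrs.setdefault(cnbr, Timer(60.0, reset, args=[cnbr]))
    if not tmrs[cnbr].is_alive():
        tmrs[cnbr].start()  # the first picture in a new window starts the clock
    cn[cnbr] += 1
    return cn[cnbr] <= 2    # only the first two pictures per window are sent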

But I'm wondering why you are converting the grayscale back to color afterwards as this seems to do nothing other than make the image size (h,w,3) instead of (h,w,1) or is three planes required for the caffe model?

I'm not 100% certain about this. If I remember correctly, the createCLAHE function needs a grayscale image, but blobFromImage then requires BGR. Anyway, if I do not convert to BGR before doing the blob, I get error messages.
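In code terms, the flow is roughly this (the GRAY2BGR conversion just replicates the single gray plane into three identical channels, which satisfies blobFromImage):

# Grayscale CLAHE, then back to 3 channels before building the blob
import cv2

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # frame is an already-loaded BGR image
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
frame = cv2.cvtColor(clahe.apply(gray), cv2.COLOR_GRAY2BGR)  # (h, w) -> (h, w, 3)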

You are sending 500x500 images in the blob to net.forward()

I'm actually down to 400x400 now, but this is just personal tuning on my part. It seems, for some strange reason, that I am able to detect persons better if I have an "overlap", so to say. It's because my cameras' view angles are different from what I suspect was used when the model was trained (several of my cameras are mounted high up under the roof, watching down, whereas the model was most likely trained with more frontal/horizontal views). Anyway, I detect better with 400x400 than with 300x300.

why did you use the included mosquitto.py

Just old habit, I guess. They are much the same, but paho is newer I think. Both work fine, and paho was not installed in the raspi images I used at that time. Regardless, you need to import some Python MQTT library; mosquitto.py works fine, but if you install paho, that works as well.

Also I just noticed your confidence thresholds seem pretty low, ~0.3

This is also just personal tuning, due to some of my cameras' view angles. Most of the time the detected confidence is much higher, well above 90%, but for some cameras looking from above, with certain light or weather conditions and poor object contrast against the background, the confidence level sometimes drops to 35-40%. Anyway, this is configurable per camera in the conf_lev object.
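So the check in the detection loop is per camera, along these lines (camera ids and values here are just illustrative examples):

# Per-camera confidence thresholds, keyed by camera id string
conf_lev = {'32': 0.8, '41': 0.35, '42': 0.35}

def accept(confidence, cnbr):
    return confidence > conf_lev[cnbr]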

Kind regards, Walter

I wonder what the dnn module is doing internally? If the first layer is expecting 300x300 and you send it more, does it crop? Resize? Analyze multiple overlapped 300x300 "tiles" (which it seems should slow it down a lot)? If you send it less, does it interpolate or zero-pad? I'm going to see if searching comes up with some info on what is going on internally. I'm kind of surprised it doesn't give errors if the size is not 300x300 or whatever it was trained on. I'm pretty sure the NCS version chokes if the input is not 300x300, although if I recall, the NCS SDK v2 API would do an automatic resize if necessary (I didn't pursue it, as I only saw different results, not better ones).

I've got some mp4 files (from a real crime at an industrial site) that are particularly hard to detect; I do get some detections, but most of the time that the perp is in the frame he is not detected. I plan to add CLAHE to the pre-processing to see if it helps.

Doh! My object-oriented stupidity: I searched on Timer, not on the tmrs objects created from it.

I've played a bit with CLAHE, and pre-processing the images with it before sending them to the AI can definitely help. Unfortunately, I've found little guidance on how to set its parameters with my google-fu.

Here is an ~10 second clip of the security camera footage I mentioned, from an actual crime where the people paid to monitor the cameras totally missed it; the entire event lasted ~30 minutes, and the thefts were discovered the next day.

CLAHE enhanced AI performance comparison

Dual-frame view: "normally" processed by the MobileNet-SSD AI on the left; on the right, the result of images pre-processed with CLAHE before going to the AI subsystem. The images certainly look better to the naked eye, but I doubt it would have been enough to make a difference to the people who were obviously not paying attention.

If you want to play with this, I'll share my Python code on request. It reads from a video file or a "live" camera on /dev/video0 using OpenCV's cv2.VideoCapture(). This code uses the Movidius NCS, so as far as I know it will not run on Windows. Analyzing the input file twice per frame gave me ~4.5 fps, meaning the NCS was processing ~9 fps.
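For orientation, the read loop for either source is only a few lines in OpenCV (an integer index selects a live device, so 0 corresponds to /dev/video0; everything else here is a generic sketch, not my actual code):

# Read frames from a video file or a live camera with OpenCV
import cv2
import sys

source = sys.argv[1] if len(sys.argv) > 1 else 0  # mp4 path, or 0 for /dev/video0
cap = cv2.VideoCapture(source)
while True:
    ok, frame = cap.read()
    if not ok:
        break  # end of file or camera error
    # ...hand the frame to the AI subsystem here...
cap.release()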


Wow, that is not too bad I think!

(I just noticed your posting now, missed it for some unknown reason)

Clearly, the picture is enhanced on the right side, and good enough for the DNN detection to be triggered (5 times, if I counted correctly). If the security guards had had an alarm triggered by that, they should have managed to "wake up". I think this would be a great add-on to their existing video monitoring system.

Very good and interesting!

Yes indeed, I'm working with the principal of that company to add AI to their monitoring system. It has been a bit of a side-track since their system runs on Windows, but I'm extraordinarily impressed by the Python portability between Windows and Linux.

The Movidius NCS is not supported on Windows, so I'm going up the OpenVINO learning curve. So far with OpenVINO, the NCS2 is showing ~2.5X the frame rate of the original NCS (running on OpenVINO). The OpenVINO code transparently handles the difference between the NCS and NCS2.

CPU-only AI is a possibility, as his systems run on an i3, for which CPU-only is about the same speed as with the NCS -- although that is not with his normal workload running.

It has been fun, and I'm getting some new toys to play with. He's also looking into integrating lidar -- the idea is that the lidar detects motion (the appearance of a new object against the map's background), directs a high-power PTZ camera to it, and the AI decides if it's a person or livestock.

I'm learning a lot!

And you are helping them a lot to get a more intelligent system; it is a joy to see. You know, I worked with, and was globally responsible for, large security solutions all my life until I retired two years ago. With what I know today, we could have made magic by enhancing with AI. I had a lot of colleagues in the US as well, since the company I worked for had operations there too. But we used systems from Pelco, Genetec, Lenel, Bosch and lately Milestone. None of them had any of this intelligence a few years ago, but things are moving fast and a lot has happened in recent years; I imagine they have caught up by now.

When you mention OpenVINO, is it still dependent on the stick?

I am also thinking about how performance can improve. My i5 laptop is doing really well; within a second I have the picture analyzed, with the detected person in Telegram, which is good enough for a family house I suppose. I assume the RPi's will continue to get faster with coming generations. I found that the Odroid-XU4 (now at $49, ODROID-XU4 Special Price – ODROID) could be an interesting candidate; in various benchmarks it is measured to be about 7 times faster than the RPi3, which means it would perform at the same level as my laptop. I expect much will happen in the coming years, so we might not need to optimize on the software side for performance reasons, more for accuracy in detection.

OpenVINO tries to "unify" the program logic, with modules for CPU, Myriad (NCS & NCS2), Intel HD Graphics GPU, Movidius VPA and FPGA. The main difference is that the NCS models need to be compiled for fp16, versus fp32 for the CPU; I've no experience with the other "inference engine" modules.
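As an illustration of that unification, with an OpenCV build that includes the Inference Engine, the very same dnn code can be retargeted between devices. A sketch (the backend/target constants are standard OpenCV, but this assumes OpenCV compiled with OpenVINO/Inference Engine support):

# Retarget the same network to the CPU or a Myriad stick (NCS/NCS2)
import cv2

net = cv2.dnn.readNetFromCaffe('MobileNetSSD_deploy.prototxt',
                               'MobileNetSSD_deploy.caffemodel')
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)  # or cv2.dnn.DNN_TARGET_CPU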

There are some, IMHO, backward steps -- no way to probe for how many NCS are installed, and no way to get the stick's temperature.

Thanks for the tip about the Odroid; I've just started to investigate "enhanced" RPi-type, IoT-like small computers.

Another alternative could eventually be the ASUS Tinker Board S (RK3288) SBC. However, it is a bit more expensive. Anyway, I finally just ordered the Odroid-XU4 from Hardkernel's web shop in South Korea; I'm curious how it will perform. Let's see how it goes with delivery and time in customs.