Standalone Raspberry Pi Security Camera with AI "person detection"


Is there a way to update the files I've uploaded, or should I just make a new post with new uploaded files?

I've improved the node-red flow to show images on the dashboard and made a few minor changes to the python code to support it.

Since krambriw hasn't seen the issue with opencv-3.4.1 that I've worked around, where the opencv-3.3.0 dnn function returns the previous detection instead of the results for the current frame, I'm in the process of compiling opencv-3.4.2, which seems to be the latest at the moment. It is taking hours to complete on the Pi 3B+; I failed miserably at my attempt to cross-compile :frowning:

Getting openCV with the dnn module running on the Pi is the most difficult part.

Edit. I've uploaded the "final" versions of my flow, the AI python script, and the bash script that the flow uses to start the python script initially. You need to remove the .txt extensions I had to add for the upload. To keep the USB stick from filling up I added this to root crontab:
3 1 * * * /usr/bin/find /media/pi/AI/detect/ -maxdepth 1 -type d -mtime +3 -exec rm -rf {} \; >/dev/null 2>&1
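
For anyone who would rather do the cleanup from Python than from find, here is a minimal sketch of roughly the same idea. The function name prune_old_dirs is made up for illustration, and the age test is a plain "older than N days" comparison, so it will not match find's -mtime rounding exactly. It only looks at directories directly under the given folder.

```python
import shutil
import time
from pathlib import Path


def prune_old_dirs(root, max_age_days=3, now=None):
    """Delete immediate subdirectories of root not modified for max_age_days.

    Returns the sorted names of the directories that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for entry in Path(root).iterdir():
        # Only real directories directly under root are candidates.
        if entry.is_dir() and not entry.is_symlink():
            if entry.stat().st_mtime < cutoff:
                shutil.rmtree(entry)
                removed.append(entry.name)
    return sorted(removed)
```

Calling prune_old_dirs("/media/pi/AI/detect/") from a small script started by cron would then take the place of the find command.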

To mount the USB stick on boot without auto login enabled when running "headless" I added this to /etc/fstab:
/dev/sda1 /media/pi/AI ext4 defaults,noatime 0 3

You still need to download the MobileNetSSD_deploy.caffemodel and MobileNetSSD_deploy.prototxt.txt files, I believe they originally came from here:

I initially got them from the PyImageSearch tutorial:
It's worth reading to help understand the python script. (206 Bytes) (10.6 KB)
PiCam_notification.json.txt (26.7 KB)

At this point I consider this thread completed, unless someone trying to use it has questions.

I want to thank again all the contributors on this site who have helped me get the dashboard UI showing images. It makes the headless systems easy to set up and control from a browser on a cell phone, once you have solved the "chicken or egg" WiFi SSID/Passphrase issue. My attempt in the JSON file is incomplete.


Just a little update (not so little in fact)

I had some thoughts about improving the efficiency of the DNN analysis. In my current "production system", each RPi3 is responsible for handling 2 cameras and doing the DNN analysis locally, i.e. distributed computing. In general, the functionality is fine & very stable, it simply just works. The time it takes from a detected intruder until I have a notification with a DNN analyzed thumbnail on my iPhone is around 5-7 seconds. This is acceptable for my application so I should be happy.

So why think about changing something like that?

Well, being a tech guy, there should always be room for a challenge to overcome and to improve performance.

I thought about purchasing a Movidius stick. But since I have a distributed system with 4 RPi3's, I should then look for a solution where I only need to buy one stick and not one for each RPi. So a kind of DNN Analyzing Server would be nice to have. Each RPi would then push pictures over MQTT for centralized analysis instead of doing it locally and distributed.

I dived into this and created some Python scripts, one for the server and one for each RPi (client). Further looking around revealed that I had an older Lenovo laptop. Not in use any longer. An idea came to my mind, could this "thing" be powerful enough to improve performance and how much better could it be?

I made a fresh Debian install, wiped the old Windows 10 away, and installed OpenCV and all the other stuff required to make the DNN analysis possible. The laptop booted up perfectly fine and very fast (partly due to the SSD, I presume).

And then, wow, when I started testing and pushed pictures for analysis, now we are talking, it is really fast. I get the same notification as earlier on my phone in less than a second. Almost instant.

At the moment this is just my "development system". I haven't decided yet if I want to have the laptop as a server; maybe buying a barebone is better. The idea here was to see & evaluate how much faster a laptop or similar could do the analysis compared to the RPi3's. It was a substantial difference.


I've been impressed with how much faster the openCV dnn is on my i7 desktop than it is on the Pi3 with Movidius. Software-only dnn is still faster on the i7 than the Movidius NCS is on the i7 using USB3. I was surprised by this.

If I had an old i5 or better laptop lying around, I'd certainly use it! I don't have CUDA installed on the i7.

OpenCV 4 is on the horizon. I don't know if it's true or not, but I've read some claims it will (or could) use the GPU on the Intel integrated graphics chips, so CUDA would no longer be the only "acceleration" option. This could make old laptops even more attractive. I haven't investigated CUDA as I have a fairly low-end NVidia card, chosen for its cost-effectiveness at driving a 4K HDTV as a monitor. I sure feel crippled when forced to use a computer with only a 1920x1080 display :slight_smile:

Thanks for the update.

I'm in the process of switching key camera views to Onvif net-cameras. Found some 720p that work well on Amazon for ~$25 each (USAVision, aka Geovision). I've found the AI lets me completely ignore the video motion and PIR detection. Just run the AI and alert/notify as necessary.

I've also found some DirecTV adapters that turn an RF cable into an Ethernet link. This will let me reuse my existing analog HD "Siamese" video cables for the netcams; <$20 per cam beats rooting around in the attic to pull new Cat6 cables.


Exactly on that theme, I found this and I sent a link & question to Adrian asking if this could be something useful to improve performance for these things. No reply yet though.


I have been following this thread for a couple of weeks and I was thinking to myself: why are they doing the image processing on the Pi? Throw it across to a VM running somewhere and let it at it!

I intend to read up on what you have done a lot more and try to implement this. I use VMware extensively and have a small cluster at home. I also have an always-on desktop machine running VMware, which would be the perfect candidate as it has just been installed with the latest i7 etc.



Fwiw, another possibility that we've used is to set up an AWS Lambda function to do the image processing... the trick is to get the image into Lambda. It's probably not the best way, but we just pass a node-red url endpoint into the function, that pulls the latest image and analyzes it in the cloud.


YMMV, but I for one sure don't want images from my security system uploaded to any "cloud", especially Amazon or Google so they could do god knows what with face recognition or whatever on it.

Amazon will sell you their AI on a per image basis:

I will give their "free trial" a try when I have some time to investigate using it, but it's something I want to learn about, not a service I have any plans to ever actually use.

A big part of my reason for using the Pi is power. The Pi3 and the cameras and router/internet access will run a lot longer on a UPS than will my i7 desktop.


My first implementation was using AWS Lambda and S3, then I changed to Rekognition, it worked very well, is fast, no doubt, and off-loads edge devices that can be made simpler (just push the image). It's true, it is free up to a certain usage level but only for the first year.

I also agree with @wb666greene, I prefer a local solution but it is a personal choice. From a technology point of view, they all do the job.


By chance, do you have a script that is doing items 1-3 on your Centralized DNN analyzed solution diagram? I just ordered my Movidius and once I get it up and running I would like to do the same thing as described in items 1-3 except send the results to Home Assistant. Also, I was thinking I could use Wyze Cameras with the Dafang hack and get those to save pictures to my pi or NAS?


Let me look into that, my scripts are a bit personalized and I think I need to add some more comments while making them more generic

Since you have ordered a Movidius, the actual analysis will then take place in that device. My analysis is done in software only, so the part that supports & uses the stick is something you need to build yourself. You can most likely use my script as a starting point for such a modification (and you can keep mine as is and compare performance if you run it on a decently powerful laptop or so).

I am, as mentioned, running the DNN analyzer on a Lenovo laptop that is a couple of years old, and I think it is very fast at analyzing. Maybe not fully up to speed for self-driving cars but anyway, for my application, fast enough.

I'll be back with scripts but it might take a day or so for me to finish considering other things on the "to-do-list"

Kind regards, Walter


That would be amazing. I have several items on my "to-do" as well, including just get the Movidius setup, so take your time. Right now I am most concerned with extracting the analysis results and getting them into MQTT, so anything that helps me in that regard is greatly appreciated.


For this setup, to be able to give adequate support if needed, I assume that the reader is somewhat familiar with Python and is able to install the required Python packages using pip

I also assume that you are able to install OpenCV on the computer running the analyzer. I did follow this guide:

Furthermore I assume that we have a common computer available with "decent" performance where the DNN analysis takes place. I have tried running it on a RPi3 and it works, but the performance was not satisfying, so instead I now use an older Lenovo Intel Core i5 laptop with Debian, 12GB RAM and an SSD. All in all, pretty good performance.

1) Sending pictures via MQTT
The Python script, "", reads a picture file and sends it to a camera-specific topic on an MQTT broker. The script must be called with some parameters:

  • the path to and name of the picture file to be sent
  • a camera id to tell the DNN analyzer from which camera the picture was taken

Typically you call the script like this:

python -p /home/pi/Pictures/picname.jpg -c 32

In my setup I use the great Motion software so this call is simply executed when Motion detects motion...if you use some other software, you have to arrange the call to be made in a way that is supported by that software

Just some words about the camera id; I use numbers for my cameras that are defined in the DNN analyzer script but this can of course be modified. Only thing, you need to modify on both ends if you decide to do it.

This script shall be installed on all entities that will send pictures for analysis. In my case I have a number of distributed RPi3's handling 2 cameras each. So the script is installed on each of them.

The Python script does have some dependencies that are required as well, see the import section of the script
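
To give an idea of what such a sender can look like, here is a minimal sketch. It is not the script from the zip file: the function names, the broker address and the topic layout image/<camera id> are assumptions for illustration only. It uses paho-mqtt's publish.single helper, imported inside the function so the argument parsing can be tried without paho installed.

```python
import argparse

BROKER_HOST = "localhost"  # assumption: address of the MQTT broker
TOPIC_PREFIX = "image"     # assumption: pictures go to image/<camera id>


def parse_args(argv=None):
    """Parse the -p (picture path) and -c (camera id) parameters."""
    parser = argparse.ArgumentParser(description="Send one picture over MQTT")
    parser.add_argument("-p", "--picture", required=True,
                        help="path to and name of the picture file")
    parser.add_argument("-c", "--camera", required=True,
                        help="camera id known to the DNN analyzer")
    return parser.parse_args(argv)


def camera_topic(camera_id):
    """Build the camera-specific topic (hypothetical layout)."""
    return "{}/{}".format(TOPIC_PREFIX, camera_id)


def send_picture(path, camera_id, host=BROKER_HOST):
    """Read the picture file and publish its raw bytes to the broker."""
    import paho.mqtt.publish as publish  # requires the paho-mqtt package
    with open(path, "rb") as f:
        payload = bytearray(f.read())
    publish.single(camera_topic(camera_id), payload, qos=1, hostname=host)


def main(argv=None):
    args = parse_args(argv)
    send_picture(args.picture, args.camera)
```

Put a call to main() under the usual if __name__ == "__main__": guard and the file can be called with -p and -c in the same way as described above.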

2) DNN analyze
The current functionality of the Python script "" (yes, the name can be discussed, change them if you like) subscribes to all camera specific topics on the MQTT broker.

When a picture arrives, the script starts analyzing, looking for objects using MobileNetSSD caffe model. If one or more persons are detected with a required confidence level, a text message and a picture with those persons framed is created and sent to a mobile phone using Telegram (if you plan to use Telegram you need to create an account to get chat id and a token).
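
The detection step can be sketched roughly as below. This is not the script from the zip file, just an illustration under some assumptions: the function names are made up, the scale factor 0.007843 and mean 127.5 are the values used in the PyImageSearch tutorial, and class index 15 is "person" in the class list that tutorial uses with MobileNetSSD. The filtering is kept separate from the OpenCV part so it can be tried on plain lists.

```python
PERSON_CLASS_ID = 15  # "person" in the 21-class list used with MobileNetSSD


def find_persons(detections, width, height, min_conf=0.3):
    """Return (confidence, (x1, y1, x2, y2)) for each person above min_conf.

    detections: rows of [image_id, class_id, confidence, x1, y1, x2, y2]
    with box corners normalised to 0..1, i.e. net.forward()[0, 0].
    """
    persons = []
    for row in detections:
        class_id, conf = int(row[1]), float(row[2])
        if class_id == PERSON_CLASS_ID and conf >= min_conf:
            box = (int(row[3] * width), int(row[4] * height),
                   int(row[5] * width), int(row[6] * height))
            persons.append((conf, box))
    return persons


def analyze(image_path, prototxt, caffemodel, size=300, min_conf=0.3):
    """Run MobileNetSSD on one picture and frame the detected persons."""
    import cv2  # needs OpenCV built with the dnn module
    net = cv2.dnn.readNetFromCaffe(prototxt, caffemodel)
    image = cv2.imread(image_path)
    height, width = image.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(image, (size, size)),
                                 0.007843, (size, size), 127.5)
    net.setInput(blob)
    detections = net.forward()
    persons = find_persons(detections[0, 0], width, height, min_conf)
    for conf, (x1, y1, x2, y2) in persons:
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
    return persons, image
```

In a long-running analyzer the net would be loaded once at start-up rather than for every picture; it is loaded inside analyze here only to keep the sketch short.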


To avoid filling up the phone only two pictures (configurable) are sent per event, assuming all and many more pictures taken by the cameras are saved locally for later investigations. Furthermore, pictures sent to the phone are compressed to make them lightweight, still good enough to give a good overview

The script also has some more additional features:

Heartbeat; that is used to check that the script is running. I let NR send the keyword "Heartbeat" every five minutes and, if the script is running, it responds with an answer that is analyzed. If no answer is given, NR can be configured to take the necessary actions

Terminate; to stop the script, I let NR send the key phrase "Stop-DNN-Analyzer" that will terminate the script instead of just killing it
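
A rough sketch of how the two control keywords could be handled in a paho-mqtt style on_message callback. The topic "image/hb-in" is the one used in this thread; the reply topic "image/hb-out" and the reply text "Alive" are assumptions made up for this example, as are the function names.

```python
CONTROL_TOPIC = "image/hb-in"   # control topic used in this thread
REPLY_TOPIC = "image/hb-out"    # hypothetical topic for the heartbeat answer
STOP_PHRASE = "Stop-DNN-Analyzer"


def control_action(payload):
    """Map a control payload (bytes or str) to 'heartbeat', 'stop' or None."""
    if isinstance(payload, bytes):
        text = payload.decode("utf-8", "ignore")
    else:
        text = str(payload)
    text = text.strip()
    if text == "Heartbeat":
        return "heartbeat"
    if text == STOP_PHRASE:
        return "stop"
    return None


def on_message(client, userdata, msg):
    """Callback for messages arriving on CONTROL_TOPIC."""
    action = control_action(msg.payload)
    if action == "heartbeat":
        client.publish(REPLY_TOPIC, "Alive")  # hypothetical answer text
    elif action == "stop":
        # Disconnecting lets client.loop_forever() return, so the script
        # can end normally instead of being killed.
        client.disconnect()
```

On the Node-RED side an inject node feeding an mqtt out node is enough to send the two keywords.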

To install it is simple but you will have to install the required Python packages first before the script will run, see the import section of the script. I also recommend you to install and use the MQTT broker on this computer to get best possible performance.

To start the script (it should run all the time) there are several ways. I personally run it as a service using systemd. See this link "how to":

Or why not install Node-RED in the computer and let it start the script using the Exec node?

The content of the zip file can be extracted to /home/pi, that is how I did it. Then you can copy the files "" and "" to each entity that shall send pictures.

(I had to provide an external link since uploading zip-files here is not allowed)

We will see if this works out well or if I have missed mentioning something...


Thank you very much for the thorough and detailed write-up! This is a massive help as I set off on my journey for my own object detection system. I receive my Movidius this afternoon, so I'm actively looking forward to diving into this. Thank you again for your help!


I thought I should just add something about starting & running the DNN analyzer script.

To start & run it for testing is simply to use the following command in a terminal program (like gnome-terminal in Debian):

python /home/pi/

That's it, the script will start waiting for you to send pictures for analyzing. To terminate the script, you just send the key phrase "Stop-DNN-Analyzer" mentioned above from Node-RED to the MQTT broker, topic "image/hb-in"

For a more "professional" setup, you might prefer to run the script as a service that automatically starts if and when you boot the computer.

This is how I did that configuration

Create a new unit file for the service with the content as below:

sudo nano /lib/systemd/system/dnnanalyzer.service

[Unit]
Description=My Script DNN Analyzer Service

[Service]
ExecStart=/usr/bin/python /home/pi/

[Install]
WantedBy=multi-user.target

The permission on the unit file needs to be set to 644 :

sudo chmod 644 /lib/systemd/system/dnnanalyzer.service

Next steps:

sudo systemctl daemon-reload

Now tell systemd to start it during the boot sequence:

sudo systemctl enable dnnanalyzer.service

From here you can use the commands below to start, stop, restart and check the status of the service:

sudo systemctl start dnnanalyzer.service
sudo systemctl stop dnnanalyzer.service
sudo systemctl restart dnnanalyzer.service
sudo systemctl status dnnanalyzer.service

If everything is working as expected, you now have a DNN Analyzer server with a rather good performance, hopefully framing those unexpected visitors in a timely manner

But please, just remember to check the local regulations for video monitoring in your region. Where I live, we are not allowed to have cameras monitoring any part of public areas without specific permissions


Thanks so much for the additional information. This is fantastic work! I'm very anxious to get started on this project, and hope to get it to work as seamlessly as you. And no worries on the regulations. For now, I'm going to try and deploy inside my house for presence detection/body count

How to display CCTV camera in dashboard (RTSP)

I can't thank you enough, I was just starting to stumble around the Pillow docs to figure out how to do this.

The need arose as a friend is running my system on Windows 10 but since it also hosts an externally accessible web-site I advised him not to install the node-red, but to run the notification on a Raspberry Pi (that he already had on hand), which of course meant I "volunteered" to modify my code to support this.

Your python script solves the issue completely by inserting the four lines that are the "guts" of composing the buffer to be sent via MQTT into my code.

Thanks again.

I've "unified" my code to run on Linux or Windows, use multiple NCS sticks if available and fall back to the DNN module if NCS is not available or run one CPU AI thread in addition to one thread for each NCS installed.

I'm finding a single i3 software only is about as fast as the NCS on the Raspberry Pi3B+ ~6.5 fps. The i3 with one CPU AI and one NCS is getting ~16 fps.

On my i7 desktop with everything else I do still running, 1 CPU AI thread and 1 NCS thread hits ~38 fps getting frames from 15 rtsp camera threads. These can often be "camera limited" as the AI can be starved for frames, lowering the composite framerate. It took 15 rtsp streams to apparently saturate the AI subsystem on my i7 -- I didn't have enough http snapshot cameras to do it.

I've found rtsp mp4 streams can feed frames to the AI at a higher rate than http "snapshots" or ftp but the latency is very much worse.

I will share my code after I fix a few things, but the NCS stuff has had a monkey wrench thrown into it by Intel. They introduced an NCS SDK v2 which broke all existing code. I wasted some time with it, but it offered no performance benefits I could see. Then they came out with the Movidius NCS2, but the NCS SDK v2 didn't support it, only their "OpenVINO" code. This is where I plan to move my code, as it "seamlessly" supports CPU AI and NCS/NCS2.

I don't have a timeline, but your send images via MQTT was one of the "more time consuming" items on my TODO list! I'm still debating if there is a need to allow mixing rtsp stream cameras with http snapshot cameras; right now it's either/or via a command line option.


I'm glad to hear! I really enjoy this type of discussion, so good to exchange ideas and experiences
BR & Thank You!


I've had a chance to go through your code and I've learned a few things (always looking to improve my Python understanding, which is pretty basic). I'm more of a "domain" expert than a language expert. I've always said: "it's easier to teach a biochemist enough about programming to solve a biochemistry problem, than it will be to teach a programmer enough biochemistry to solve a biochemistry problem." :slight_smile:

For example, your cn = { '41':0, '42':0, ...} confused me initially until I realized it was making an "array" to be indexed by strings instead of numerical indices (0,1, ... N-1). Nice to know you can do such a thing.

I'm not seeing how your timer gets started, I thought it needed Timer.start() somewhere. I assume this logic is to limit the Telegram messages to 2/minute. With your code as a starting point, I may give Telegram another try, I bogged down in Botfather and adding users to a group, I did send messages from node-red but never could get multiple recipients to work. I didn't put a lot of effort into it, as Email and SMS was working fine for me, although recent Android updates seem to have made these notifications a lot less reliable for my phone :frowning:

What I find most interesting is your use of Contrast Limited Adaptive Histogram Equalization pre-processing of your images. I may have to give that a try as the bulk of my static (background clutter) false detections come under IR illumination or during the dusk-to-dawn or vice-versa transitions. But I'm wondering why you are converting the grayscale back to color afterwards, as this seems to do nothing other than make the image size (h,w,3) instead of (h,w,1), or are three planes required for the caffe model? (I've not looked into the details, just a luser at this point :slight_smile: ). I thought the traditional way to do this on color images was to convert BGR to LAB, run CLAHE on the L channel and then convert back to BGR.

We seem to be using the exact same MobileNetSSD_deploy.caffemodel and prototxt files. You are sending 500x500 images in the blob to net.forward(), whereas I'm using 300x300 because I thought that was the size the model was trained on and thus had to be the input size. I can't help but think the results would be better if the AI started with higher resolution images, but 500x500 is a lot more coefficients to configure in the training. If it has been done, I'd be a fool for not using it!

I too am enjoying and benefiting immensely from this discussion.


One other trivial question, why did you use the included instead of paho-mqtt? Their functions/methods seem very similar.

Also I just noticed your confidence thresholds seem pretty low, ~0.3, I usually use ~0.8 or do you have a higher tolerance for false positives than I do?


I'm not seeing how your timer gets started

This happens in the on_message function:

if not tmrs[cnbr].is_alive():

And you are correct, this is to limit the number of pictures from the same camera, in this case 2/minute
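
For readers following along, the idea of a per-camera counter plus a per-camera timer can be sketched like this. It is not Walter's code, just an illustration with made-up names: a dict keyed by the camera id string holds the counters, and a threading.Timer per camera marks the window during which the counter applies.

```python
import threading


class PictureLimiter:
    """Allow at most max_pictures per camera within each time window."""

    def __init__(self, camera_ids, max_pictures=2, window_s=60.0):
        self.max_pictures = max_pictures
        self.window_s = window_s
        # Dicts indexed by camera id strings, like cn = {'41': 0, '42': 0}.
        self.counts = {cid: 0 for cid in camera_ids}
        self.timers = {cid: None for cid in camera_ids}

    def allow(self, camera_id):
        """Return True if another picture from this camera may be sent."""
        timer = self.timers[camera_id]
        if timer is None or not timer.is_alive():
            # No window running: reset the counter and start a new window.
            self.counts[camera_id] = 0
            timer = threading.Timer(self.window_s, lambda: None)
            timer.daemon = True
            timer.start()
            self.timers[camera_id] = timer
        if self.counts[camera_id] < self.max_pictures:
            self.counts[camera_id] += 1
            return True
        return False
```

In an on_message handler you would call allow(cnbr) before sending the picture on to the phone; with the defaults that gives 2 pictures per camera per minute.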

But I'm wondering why you are converting the grayscale back to color afterwards as this seems to do nothing other than make the image size (h,w,3) instead of (h,w,1) or is three planes required for the caffe model?

I'm not 100% certain about this. If I remember correctly, the createCLAHE function needs a grayscale image but blobFromImage then requires BGR. Anyway, if I do not convert to BGR before doing the blob, I get error messages
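
The two variants discussed here can be sketched as follows. Both are only illustrations with made-up function names and default CLAHE settings (clip limit 2.0, 8x8 tiles) picked for the example, not values from either script. The first does what is described above: grayscale, CLAHE, then back to three identical channels so blobFromImage gets a 3-channel picture. The second is the LAB route mentioned in the question, which keeps the colours.

```python
def clahe_gray_as_bgr(image_bgr, clip_limit=2.0, tile=(8, 8)):
    """Grayscale + CLAHE, then expand back to a 3-channel BGR image."""
    import cv2  # imported lazily; requires OpenCV
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile)
    equalized = clahe.apply(gray)
    # Shape goes from (h, w) back to (h, w, 3); colour information is lost.
    return cv2.cvtColor(equalized, cv2.COLOR_GRAY2BGR)


def clahe_on_lightness(image_bgr, clip_limit=2.0, tile=(8, 8)):
    """Run CLAHE on the L channel of LAB and convert back to BGR."""
    import cv2  # imported lazily; requires OpenCV
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB)
    l_chan, a_chan, b_chan = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile)
    merged = cv2.merge((clahe.apply(l_chan), a_chan, b_chan))
    return cv2.cvtColor(merged, cv2.COLOR_LAB2BGR)
```

Either result can be passed to blobFromImage in place of the original frame, so it is easy to compare detections from both on the same night-time pictures.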

You are sending 500x500 images in the blob to net.forward()

I'm actually down to 400x400 now but this is just a personalized tuning I have. It seems, for some strange reason, that I am able to detect persons better if I have an "overlap" so to say. It's because my cameras' view angles are different from what I suspect was used when the model was trained (several of my cameras are mounted higher up under the roof and looking down, whereas the model was most likely trained with more frontal/horizontal views). Anyway, I detect better with 400x400 than with 300x300

why did you use the included

Just old habit I guess. They are the same but paho is newer I think. Both work fine and paho was not installed in the raspi images I used at that time. Regardless, you need to import some py library, works fine but if you install paho, it works as well

Also I just noticed your confidence thresholds seem pretty low, ~0.3

This is also just a personal tuning due to some of my cameras' view angles. Most of the time the detected confidence is much higher, well above 90%, but for some cameras looking from above, and with certain light or weather conditions or poor object contrast to the background, the confidence level sometimes drops to 35-40%. Anyway, this is configurable by camera in the conf_lev object
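
As a small illustration of such a per-camera setting: the name conf_lev is from the script, but the ids, the values and the helper functions below are made up for the example.

```python
DEFAULT_CONF = 0.3  # assumption: fallback for cameras without an entry

# Hypothetical values; keys are camera ids as strings.
conf_lev = {"41": 0.35, "42": 0.8}


def threshold_for(camera_id, table=conf_lev, default=DEFAULT_CONF):
    """Look up the required confidence for one camera."""
    return table.get(str(camera_id), default)


def is_person_hit(camera_id, confidence):
    """True if a person detection is confident enough for this camera."""
    return confidence >= threshold_for(camera_id)
```

A camera mounted high under the roof can then get a lower value than one with a frontal view, without touching the rest of the code.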

Kind regards, Walter