Let's discuss how we might use AI to detect people in stills, for low CPU users like me

Greetings,

I am trying to solve the problem of false motion alarms using Node-RED, and one of the ways I want to do that is with locally hosted AI.

This is my use case.

I live on a farm, and have experienced four burglaries this year. We lost chainsaws, brush cutters, spanner sets, power tools, the list breaks my heart, it really does.

We're OK, because some of it was insured, and also we're farmers, so we'll figure this out with what we have. It's kind of what farming is, if you're not corporate.

In any case, what I need is valid early warning, and Node-RED is the tool, because I have a whole bunch of disparate systems and some bespoke stuff too, and I have to glue it together because there's no budget for anything beyond the eight new cameras I bought.

I am the A-Team, locked in a shed with everything I need, I just have to figure out how. Here we go.

Essentially what we are doing is presence detection, and we are improving it with AI. I would never use this method inside any room or area that I consider to be private, because just no. Yes to patios, no to bedrooms.

AI person detection is a game changer because it removes any doubt that a human is present. It does, however, require processing, and if this is going to be a thing I can share with my neighbours if I get it working, well, they can't buy GPUs either. GPUs also eat power, which is a hard thing if you run off an inverter (the grid is a joke out here).

So the gig is to integrate AI in a way that is reliable, yet super light on processor. My thinking is that it has to be possible to do AI human detection with still images instead of a live stream. That way, I only have to ask an AI server if this 720p jpg contains a humanoid, and even then, only if a bunch of other stuff says {"state": true} first.

Because I am limited to what I have, I will be dedicating a refurb Lenovo ThinkCentre M93P Tiny, which has an i7-4785T (quad-core, eight threads) and 16GB of RAM, to being an AI server.

So step one is to get every motion sensor or line crossing sensor or whatever the hell I have into Node-RED, which I've done in a sane way using MQTT topics.

The logic flow is to use a Bayesian filter (thank you @mixu_78) to decide if the motion warrants verification. I'm still fiddling with functions to make it work; it looks promising.

For each room/space, I'm using an array stored in global context. When a new presence event comes in (looking at mmwave too with great interest), the global array is updated with the new value, and then the entire array is passed to the Bayesian filter, which has the relevant conditions in place, YMMV.
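If anyone wants to copy the pattern, here's a rough standalone sketch of the rolling-array idea. The room names and window size are made up for illustration; in a real function node you'd use `global.get()`/`global.set()` instead of the plain object below:

```javascript
// Minimal sketch of a per-room rolling presence array, as you might
// keep it in Node-RED global context. Room names and window size are
// hypothetical; adapt to your own MQTT topics.
const store = {}; // stand-in for Node-RED global context

function recordPresence(room, value, windowSize = 10) {
    const arr = store[room] || [];
    arr.push(value ? 1 : 0);                      // newest reading at the end
    while (arr.length > windowSize) arr.shift();  // drop the oldest readings
    store[room] = arr;
    return arr;                                   // pass this to the Bayesian filter
}

// Example: three motion events for a (hypothetical) tractor shed
recordPresence("tractorShed", true);
recordPresence("tractorShed", false);
const latest = recordPresence("tractorShed", true);
console.log(latest); // [ 1, 0, 1 ]
```

The array then feeds the Bayesian filter unchanged, so swapping in an mmwave sensor later is just another publisher on the same topic.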

If the filter says the room is occupied when it shouldn't be, at that point I'm using a HikvisionUltimate node to pull a jpg. The idea is to then send it to an AI server somehow, and get at least a boolean value back.
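The "somehow" can probably just be an HTTP Request node. A function node upstream of it might package the snapshot like this; the endpoint URL is a pure placeholder, since whatever detection server I end up with will define its own:

```javascript
// Function-node sketch: take the jpg the camera node produced and turn
// it into an HTTP request for a local detection server. The address,
// port and path are placeholders, not a real API.
function toDetectionRequest(msg) {
    if (!msg.payload || !Buffer.isBuffer(msg.payload)) return null; // no image, drop
    msg.url = "http://192.168.1.50:5000/detect";   // placeholder endpoint
    msg.method = "POST";
    msg.headers = { "Content-Type": "image/jpeg" };
    return msg;  // wire this into an HTTP Request node
}

// Standalone check with a fake jpg buffer
const out = toDetectionRequest({ payload: Buffer.from([0xff, 0xd8, 0xff]) });
console.log(out.method); // POST
```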

If it's a person, wake me the **** up with a voice notification and a strobe or something, so I can radio my neighbour and get my ass down to the tractor shed with a paintball gun loaded with pepper gas rounds, instead of watching the video the next day and appreciating how professional the thieves are.

My neighbour has a million guns and a great need to fire them, and on some occasions that's a help. We drive each other's farms at night during picking season; it's like that.

Anyhow, I'm hoping that there are other scrappy people like me trying to make it happen with duct tape. Let's figure it out together. I would love to hear your ideas.


I've never bothered to do visual presence detection, never needed it, so not sure I can help with that side of things. Though I know that people are doing good stuff just with Raspberry Pis, so your server is probably overpowered. :grinning:

Some questions around security though, more my thing. It would surely be better if you could detect intrusions long before they get to a building? By the time they are at a building, they could be in and out before you can respond. If there is any way to funnel them down a couple of possible routes, you could have simple early warning just by using an IR tripwire, very cheap and easy. Depending on the layout of your farm(s), you might be able to pre-signal most intrusions at the boundary? I realise that some farms would be too big/open for that though.

You might also consider whether there are high-risk times or days. If you can identify those, again, simpler presence-detection solutions may be very effective, though if you have lots of animals wandering around, that will likely complicate things unless you can keep them away from entrances.

While AI image categorisation can be pretty good, it isn't going to be perfect by any means, especially in rain, snow or dust. So combining with more traditional approaches is going to be better. Further combining with knowledge about whether anything should be in a location makes it stronger still.

I'm assuming that you will put up some loudspeakers and bright strobe lights if you haven't already. Letting thieves know they've been spotted and telling them what they are about to experience (several angry, gun-laden farmers, and being filmed), along with a loud siren and bright strobe, will put off many. Internal sirens should also be fitted, at the maximum safe level possible, because that causes people to panic or become confused.

One final thing, you haven't said what country you are in but if it is a country with prolific firearms, you might need to consider having a firearm backup. I believe that shotguns with birdshot are pretty safe but effective deterrents. I also know that in the UK at least, many arable farmers use bird scarers that use shotgun blank rounds, one of those in an enclosed space would probably be quite an effective deterrent!

(Not to confuse the subject.)

There is/was a recent thing I saw on YouTube of a small board for presence detection that can detect you breathing.

Using microwaves.

It is cheap.

It is better at detecting people, with fewer false positives.

But (Sod's law) I can't find the link.

This is one I just found.

Hi @TotallyInformation, thanks for your feedback.

If it matters, I'm in the wilds of Limpopo, in South Africa. I'll try to keep the conversation related to Node-RED and AI, but I hear you on a lot of your security considerations.

I also hear you that AI categorisation won't be perfect, but it doesn't have to be, because it is simply an additional way of confirming a thing. Other triggers are mostly motion sensors, but also line crossing sensors where the cameras allow for it.

So we still need to follow best practice for presence detection in all other ways; we're just looking for easy ways to get an AI boolean back.

It now occurs to me that the boolean value I want from an AI server means that I could run a second Bayesian filter that uses the AI boolean as an additional consideration.

But even if I didn't, if the AI agrees with the calculated output of multiple sensors, I'll take the odd false positive, particularly at certain times of day. You're right, we tend to get hit between midnight and 2am, so I think that's something that would be pretty easy to do with the Big Timer node.
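For the record, a toy version of that second-stage combination might look like this. Every likelihood number in it is invented for illustration and would need calibrating against real false-alarm history:

```javascript
// Toy second-stage filter: combine the stage-one sensor score, the AI
// boolean and a high-risk time window into one posterior probability.
// All the multipliers here are made up for illustration only.
function posteriorIntruder({ sensorScore, aiSaysPerson, hour }) {
    let odds = sensorScore / (1 - sensorScore);  // prior odds from stage one
    odds *= aiSaysPerson ? 8 : 0.2;              // AI agreement is strong evidence
    if (hour >= 0 && hour < 3) odds *= 2;        // midnight-2am is when we get hit
    return odds / (1 + odds);                    // back to a probability
}

const p = posteriorIntruder({ sensorScore: 0.7, aiSaysPerson: true, hour: 1 });
console.log(p.toFixed(2)); // 0.97
```

Working in odds rather than raw probabilities keeps each piece of evidence a simple multiplier, which is handy when you're tuning by hand.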


Hi @Trying_to_learn, thank you, yes, those are mmwave sensors, and there's probably a conversation to be had elsewhere on those. I have four of them sitting in a box, and this is a perfect use case.

The bugger for me is all my sensors came with a 1.25mm pitch header, which is tiiiiiny. Can't find a breakout over here yet.

But that sensor kicks out a 1 for positive, so it can be fed into the Bayesian filter too, with appropriate settings.

The bugger to solve now is how we use Node-RED to interact with an AI that we host locally somewhere on our network.

I'll update once I've got Proxmox and Docker happening on the i7.

You may well have found several of the older projects already on here — but *AI enhanced video security system with Node-RED controller/viewer* and *Standalone Raspberry Pi Security Camera with AI "person detection"*, for example. Many of the folk that appear in those threads are still very active here, so they have probably got newer and better setups in use already.


Can I suggest that you simplify a lot of this for yourself?

Change your Lenovo (or get a second one) to use Docker and deploy Frigate. Purchase a cheap USB Tensor Processing Unit; I use the Coral USB unit, which is about $150 AUD and uses next to no power and CPU on the host box.

You use Frigate to do the categorisation (and maybe your cameras as well, depending on what you have), and it then spits out MQTT topics telling you what it has found (Dog, Cat, Person, etc.), which you can then play with in Node-RED.
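To give you an idea, a function node along these lines could reduce a Frigate event to a simple person flag. The payload shape below matches what recent Frigate versions publish on the frigate/events topic, but check the docs for your version, as field names have shifted between releases:

```javascript
// Sketch of a Node-RED function-node body that reduces a Frigate MQTT
// event to a simple person-present flag. Field names are based on the
// frigate/events JSON in recent Frigate releases; verify against your
// version's documentation.
function frigateEventToAlert(payload) {
    const evt = typeof payload === "string" ? JSON.parse(payload) : payload;
    const after = evt.after || {};
    if (after.label !== "person") return null;   // ignore dogs, cats, cars...
    return {
        camera: after.camera,
        score: after.top_score,                  // Frigate's best confidence so far
        person: true,
    };
}

const sample = { type: "update", after: { label: "person", camera: "barn", top_score: 0.82 } };
console.log(frigateEventToAlert(sample));
```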

This is much easier and more supportable moving forward, and it scales much more easily, as you can add multiple Coral units if required. You can subscribe to their Frigate+ service if you wish and then send your own datasets to them for better model training.

Craig


Seeing as you are in SA - why not get a Boerboel and solve the problem at the source.

Our boy was the son of an SA Champion; ours was 75kg and his father was 92kg!!

We were told a story by a South African breeder at a show day of three thieves who tried to come onto a property guarded by a team of Boerboels. Apparently only one made it back out under his own steam, one had an ambulance called, and the other needed a hearse!!

Craig

@no_mpimpi
You seem to have some stuff to start experimenting with. I'll give you my story and what I use today.

For years there have been libraries available that you can use on a local computer. Whether you would define them as AI or not, well, I think you can. They are trained on real data and learned from it. YOLO and SSD are examples, freely available in various versions.

If you prefer, and have access to the internet, you could also use cloud services for the analysis, like "Amazon Rekognition" on AWS.

Some will, as you say, load the CPU harder and run best with GPUs. YOLO in later versions is the most accurate, but I'd say you need a computer with a GPU to get short response times.

Another rather good one that runs well on a CPU is based on SSD detectors. There is a great working node for Node-RED available: node-red-contrib-tfjs-coco-ssd. It works fine and is worth testing in your setup.

Very important for all monitoring is the camera view: how you mount the cameras. If they are too high up and looking down, detection will not be as good as if you can mount them a bit lower, looking more towards the intruders (frontal view).

In my own setup I run both YOLO and SSD on two separate computers, just for comparison so to speak. Both computers receive the same images from cameras via MQTT when motion is detected. In the following analysis I filter out all objects that are of no interest, keeping just "persons". Below are some sample images of myself being detected when arriving home. I have had my system running like this for years and it has been working rock solid.
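The filtering step is really just a one-liner over the detection list. Assuming the output shape that tfjs's coco-ssd model returns (an array of { class, score, bbox } objects; check the node's README to confirm), something like:

```javascript
// Keep only "person" detections above a confidence floor, from a
// coco-ssd style result: an array of { class, score, bbox } objects,
// which is the shape tfjs's coco-ssd model returns.
function keepPersons(detections, minScore = 0.5) {
    return detections.filter(d => d.class === "person" && d.score >= minScore);
}

const detections = [
    { class: "person", score: 0.91, bbox: [10, 20, 80, 200] },
    { class: "dog",    score: 0.88, bbox: [120, 150, 60, 40] },
    { class: "person", score: 0.32, bbox: [300, 10, 50, 120] }, // too weak, dropped
];
console.log(keepPersons(detections).length); // 1
```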

The computers I use for the analysis are maybe a bit overkill: NVIDIA Jetson Nanos. At least the SSD, I think, you can run with decent response time on an RPi4 as well; it's worth trying!

Best regards, Walter

SSD version:

YOLO version:

I set myself the task of building a system using the least expensive (but still reliable) components I could find in terms of outdoor object detection.

I have been "testing" a few Raspberry Pi Zero 2 systems (with RPi cameras). That is all the hardware I am using. The firmware is Python running TensorFlow Lite (without OpenCV).

Performance is very good considering everything is done on the RPi.

Reliability is also very good. Every time an object is detected, a notification is sent to the NR server along with a still image.

Cost is literally what a RPi Zero 2 + RPi camera cost. And something to house it in naturally.

Overall, works very well.

Cheers
Bob


The best way to get decent AI on low-spec hardware like a Raspberry Pi is with the Coral TPU AI co-processor; it is very good with MobilenetSSD_v2. It costs ~$60 in its USB3 version (using USB2 will cut your frame rate considerably) and about $30 in an "M.2" form factor, but few of the IoT-class systems have those slots. The OP has an i7, so with OpenVINO (free download) and Python, my Lenovo IdeaPad (~3 years old, bottom-of-the-line i3, cost me $160 as an "open box" closeout when the new models came out) does about 28 fps with CPU AI and MobilenetSSD_v2. I'd wager his older i7 would do better, but it might not have a new enough iGPU to run the yolo8 model; I know that a 6700K does and a 4500U doesn't, and his is in between.

My next project will be trying yolo10 and MobilenetSSD_v3, each of which is supposed to have about the same "accuracy" with 20-30% lower inference times.


I'm addressing the actual coding in a different thread because this discussion seems more general, and my bugs are mundane.

I agree with @wb666greene that a Coral USB would be ideal. It's not going to work for me, though, simply because getting stuff into this country requires me to purchase in USD and then get it through customs, which is a dice roll. You have, however, made me realise I've a friend coming to visit in December, and it'll reliably arrive in his luggage.

So I've got CodeProject.AI up on the i7, running in a Proxmox CT, Ubuntu 22.04; much RAM, such disk space, wow. In testing so far, YOLOv5 6.2 crashes and burns. YOLOv5 .NET runs stable, so alrighty then.
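In case it helps the next scrappy person, here's roughly how I'm planning to call it from Node 18+. The port (32168) and the /v1/vision/detection route are the CodeProject.AI Server defaults as far as I can tell; verify against your own install. The host IP is a placeholder:

```javascript
// Sketch of querying a CodeProject.AI Server instance from Node 18+
// (which has fetch, FormData and Blob as globals). Port 32168 and the
// /v1/vision/detection route are the documented defaults as I
// understand them; the host address is a placeholder.
async function detectPersons(jpgBuffer, host = "192.168.1.50") {
    const form = new FormData();
    form.append("image", new Blob([jpgBuffer]), "frame.jpg");
    const res = await fetch(`http://${host}:32168/v1/vision/detection`, {
        method: "POST",
        body: form,
    });
    return parseDetection(await res.json());
}

// Pure helper: reduce the server's JSON to the boolean we want.
// Assumes a response carrying a predictions array of
// { label, confidence, ... } objects.
function parseDetection(json, minConfidence = 0.6) {
    return (json.predictions || []).some(
        p => p.label === "person" && p.confidence >= minConfidence
    );
}

console.log(parseDetection({ predictions: [{ label: "person", confidence: 0.74 }] })); // true
```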

For context, I record 34 cameras at 10fps. There are three separate buildings monitored, as well as farm gates. I still laugh that I ever thought 6 cameras would be remotely enough. I iterate in all things.


Really? I had it running - via pure JS code on a Coral TPU stick - in a new node from @dceejay (after a LOT of headache), but my results were very bad: the scores of a real human and a false positive were very close to each other. So I at last abandoned the project.

@no_mpimpi
I have always been a fan of a pure JavaScript solution (since Node-RED is all about JavaScript), so tfjs was the way to go. But if you look at the code commits of the tfjs repo, you will see that Google no longer has any interest in the JavaScript variant of TensorFlow:

So if you want to have a future proof solution, I would NOT follow that road. Although it breaks my heart to say it, because I really loved tfjs ...


Google hasn't done much with the Coral TPU either since 2021. A third party recompiled the TPU code for Python 3.10 and Ubuntu 22.04 and put it on GitHub: https://igor.technology/installing-coral-usb-accelearator-python-3-10-ubuntu-22/

Running yolo4 or yolo8 was not all that much better than SSD_v2 in terms of initial false positives, and much lower in frame rate. The key to my system has been a three-step process.

First, full-frame detection with SSD_v2; I find it amazing that a 4K frame resized to 300x300 pixels detects anything. My confidence threshold is pretty high, ~0.65.

Second, once a person is detected, I zoom in on the detection box, resize it to 300x300, and run the inference again with a confidence of about 0.70 or higher (I'd have to look at the code to remember what I settled on). This gets rid of a lot of false positives but was still leaving me a few every month.

The third step is to crop (zoom) the detection for the yolo model (608x608 for yolo4, 640x640 for yolo8) and run the final detection with a confidence threshold of ~0.75. This has gotten me to a few false positives per year on 26 cameras running about 3 fps each, split across two systems for a bit of redundancy.

As far as I can tell, the TPU lacks some compute blocks needed for yolo, so it doesn't seem to be much help with yolo models. My yolo inference rates are only about 10 fps on the GTX1060 in my old "gaming" laptop, bought circa 2017, but that has been enough because of all the filtering done by the SSD detections. The only time it really starts dropping frames is when a group of dog walkers stops by my corner in view of multiple cameras (I have overlapping fields of view) and fills up the yolo inference queue.
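Stripped of all the image handling, the control flow of that cascade is just a chain of detectors with rising thresholds. A sketch, with stub detectors standing in for the real SSD/YOLO calls:

```javascript
// Control-flow sketch of the three-stage cascade described above: each
// stage is a detector returning a confidence (or nothing), and each
// must clear a higher threshold than the last. The real stages would
// be SSD full-frame, SSD on the cropped box, then yolo verification.
function cascade(frame, stages) {
    let region = frame;
    for (const { detect, threshold } of stages) {
        const result = detect(region);
        if (!result || result.confidence < threshold) return false; // reject early
        region = result.box || region;  // zoom in for the next stage
    }
    return true;  // every stage agreed: alert
}

// Stub detectors standing in for the real models
const alwaysPerson = conf => () => ({ confidence: conf, box: "crop" });
const stages = [
    { detect: alwaysPerson(0.70), threshold: 0.65 },  // full-frame SSD
    { detect: alwaysPerson(0.75), threshold: 0.70 },  // zoomed SSD re-detect
    { detect: alwaysPerson(0.80), threshold: 0.75 },  // yolo verification
];
console.log(cascade("full-frame", stages)); // true
```

The cheap stages reject almost everything, so the expensive yolo stage only ever sees a trickle of candidates; that's why the frame-rate hit stays small.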


Ah really, 0.65 is high? Indeed I got similar confidence values (in pure JavaScript) and I thought they should be much higher (like e.g. 0.9). I thought I was doing something wrong in my code, so I stopped my developments...
The main problem I had with those values: when the SSD model incorrectly identified a shrub in my garden as a human, the confidence was not much lower (e.g. 0.55). That way it becomes very hard to set a minimum required confidence value to eliminate false positives...

That is one of the SSD flaws: it often calls plants "people", particularly as the sun comes up, goes down, or around noon. I have a node-red function node implementing a per-camera filter to remove fixed detection boxes, with about a 20% tolerance, and not alert. This has worked well, but yolo8 verification has worked better and doesn't need to be set up or tweaked as the plants grow.
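The box filter is nothing fancy either. A sketch of the idea, with the [x, y, w, h] box format being my own assumption for illustration:

```javascript
// Sketch of the per-camera "known shrub" filter: a detection whose box
// roughly matches a stored false-positive box (within a tolerance,
// ~20% as described above) is suppressed. The [x, y, w, h] box format
// is an assumption for this sketch.
function matchesKnownFalsePositive(box, knownBoxes, tolerance = 0.2) {
    return knownBoxes.some(kb =>
        box.every((v, i) => {
            const scale = Math.max(kb[2], kb[3]);   // tolerance relative to box size
            return Math.abs(v - kb[i]) <= tolerance * scale;
        })
    );
}

const shrub = [100, 200, 40, 90];                   // a box that keeps false-alarming
console.log(matchesKnownFalsePositive([105, 195, 42, 88], [shrub])); // true
console.log(matchesKnownFalsePositive([400, 50, 40, 90], [shrub]));  // false
```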

If you look, most "demo" images showing how "good" a model works use a confidence threshold of about 0.2, and the images generally lack the objects that often false-detect. Nothing malicious, just what I have observed.
The GitHub for my current system, which has been running for about two years, has the four false-detection images that I've had in that time; three of the four were with yolo4 verification, and only one was a yolo8-verified false positive. https://github.com/wb666greene/AI-Person-Detector-with-YOLO-Verification

I'm close to starting a "share your project" thread for my newest version 2; you can have a preview: https://github.com/wb666greene/AI-Person-Detector-with-YOLO-verification-Version-2/tree/main It is not quite ready, as the models are too big to upload to GitHub. The new code automatically downloads yolo8 and converts it to OpenVINO format for iGPU if it is not there at startup. It is also easy to download the TPU model for MobilenetSSD_v2 from Google, but for the OpenVINO version of it, which I converted in 2021, I need to write instructions for converting it with the current OpenVINO version.

The detect, zoom, and re-detect was the first big step in lowering the false-positive rate; the yolo verification really gets me close to where I want to be.

This version 2 is much easier to set up and can easily handle four 4K cameras with a three-year-old, bottom-of-the-line i3 laptop and iGPU, which can be found used/refurbished for ~$130. An i5 or i7, or adding a TPU, increases the number of cameras it can handle.


Hi @BartButenaers,

I'm anticipating having the same problem that you have with tricky objects in frame, especially with cameras that see a lot of vegetation (I'm on a farm; this is a thing). The solution @wb666greene mentions is very interesting, and I hope to be sampling it soon.

But in the spirit of this discussion, I'd like to talk about how people might optimise camera angles and/or lighting in order to give the software an easier job. I find the great thing about Node-RED is that it allows me to use very disparate strategies to achieve a result, and I think that should include making sure the sensors work in the best way they can. We are using AI as a sensor, at the end of the day.

So maybe let's talk about physical things we can do too.

The first question I have: do the models we use cope better with visible LED light or with IR modes? I've got a couple of old IR emitters lying around in a box, and if the AI enjoys that sort of thing, I could reinstall them. Any ideas there?

Or would turning on more lights in response to motion give the AI a better shot? I imagine some shenanigans would be required regarding bulb placement, but since these are agricultural buildings, I can pretty much do whatever without regard for cosmetics.

Visible light is better than IR illumination. PIR-activated floodlights will definitely help; I get my "best" nighttime detections on cameras whose field of view is partially under streetlamps.

Here is an early morning detection under mostly IR illumination:

And here is one around the corner mostly under streetlamp illumination about fifteen minutes earlier this morning:

My experience is that the claims of "night viewing" for consumer-priced cameras are greatly exaggerated; the newer cameras with built-in PIR-activated white-light LEDs are probably worth the price premium.


How much of that difference would be down to the AI model only being trained on daylight images?

Impossible for me to tell. I think it works relatively well on IR images, but the image quality under IR illumination, the different focal point, and motion blur from the slow shutter make the images good for little more than notifying you that a person is in the field of view.