RasPi Zero wlan0 connection detection

I'm looking at how to write some code for my RasPi Zero to detect its WiFi connection.

It connects to my WAP.

Originally there was the problem that if the WAP goes down and comes back up, the Zero doesn't reconnect.

I dug around, was told the default way is flawed, and was shown a better way.

I did it and it kind of seems to be working. But - and there's always one of those - it doesn't seem to be much better.

Today I was doing things with the modem and in doing so the WAP was taken down and restarted a couple of times. The Zero didn't reconnect as it should.

I need some code that monitors the wlan0 connection.

If it is down for longer than some time n, it reboots.
But I don't want it rebooting ad infinitum if the WAP is down.
Maybe 5 reboots and then it stops.
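
One way the "down for more than n, but only a few reboots" idea could be sketched in Python. It assumes the Pi exposes the link state in /sys/class/net/wlan0/operstate (standard Linux sysfs). The function and class names, the time limit and the default of 5 reboots are illustrative choices, not something taken from an existing flow.

```python
import time
from pathlib import Path


def wlan_is_up(iface="wlan0", sysfs_root="/sys/class/net"):
    """Return True when the kernel reports the interface as operationally up."""
    try:
        state = (Path(sysfs_root) / iface / "operstate").read_text().strip()
    except OSError:
        # Interface missing or unreadable: treat it as down.
        return False
    return state == "up"


class DownTimer:
    """Tracks how long the link has been continuously down."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._down_since = None

    def update(self, is_up):
        """Feed one observation; return seconds down so far (0.0 when up)."""
        if is_up:
            self._down_since = None
            return 0.0
        if self._down_since is None:
            self._down_since = self._clock()
        return self._clock() - self._down_since


def should_reboot(down_seconds, limit_seconds, reboots_done, max_reboots=5):
    """Reboot only when down longer than the limit and the budget is not spent."""
    return down_seconds > limit_seconds and reboots_done < max_reboots
```

A caller would poll `wlan_is_up()` every few seconds, pass the result to `DownTimer.update()`, and act on `should_reboot()`. The count of reboots already made has to be stored somewhere that survives the reboot, such as a small file.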

But that's a bit down the track/line.

Sorry if it has been done before. I'm a bit stressed out just now.
Thanks in advance.

Can you use a ping node? Then create some if loops to make it all work?
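
Outside Node-RED, the same ping idea can be sketched in a few lines of Python. This assumes the Linux `ping` command is available (`-c` sets the count, `-W` the reply timeout in seconds). The function name and the example address are hypothetical.

```python
import subprocess


def host_reachable(host, timeout_s=2, run=subprocess.run):
    """Send one ICMP echo request; True if ping exits with status 0."""
    cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


# Example (address is an assumption; use the WAP's own IP):
# if not host_reachable("192.168.1.1"):
#     print("WAP not answering")
```

Pinging the WAP itself rather than an internet host keeps the test about the WiFi link and not about the modem's upstream connection.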

Is it actually the router that needs to be rebooted? In which case, run Node-RED somewhere else, and when you see that the other device has not reconnected, force a reboot of the router.

Brew,

Of course! Thanks. Blonde moment.

Colin,

Well, I am hoping not. If it is the modem which needs to be rebooted, that's a whole other story.
It's just that I was changing some WiFi settings in the modem, and when they were applied the WAP was taken offline and brought back up (by the modem). After that, an RPZ didn't reconnect.

So it needed to be either rebooted or.......

With what was said in mind, I will have to do some more research.

It was the fact that you indicated that sometimes even a reboot of the pi doesn't fix it that made me ask. There must be some reason.

The first thing to do is to wait till the pi fails to connect again and look in /var/log/syslog on the pi, in case there is something helpful there. There is always a reason for this sort of thing, though working out what that is may be tricky.

Ok. Thanks.

I'm not sure about the:

"It was the fact that you indicated that sometimes even a reboot of the pi doesn't fix it"

It hangs and I can't reboot it, because it is not on the network and is headless, so I have to power cycle it.

Yes, finding the exact reason is going to be tricky.
There may in fact be several causes.

It is just they are so infrequent it is hard to trap them.

Would /var/log/syslog be valid after a reboot?

Just one of the many things I am chasing.

I thought you said that you wanted the flow to try rebooting a few times but if that didn't work then to shut down. From that I thought you meant that sometimes even a reboot didn't fix it. Perhaps that is not what you meant.

syslog is continually added to, with a new file being started occasionally and the last 5 or so files are kept before being deleted. If you run
ls -l /var/log/syslog*
you will see them. So after a reboot the end of the last session will still be in /var/log/syslog or possibly in the previous file (/var/log/syslog.1). Look back through the log for the start of the current session, you will see near the start of the session a whole series of kernel lines with timestamps since boot at the front looking something like
....kernel: [ 0.001868] ....
which means that message was logged 0.001868 seconds after boot. Go back to the start of those and then a bit further to find the end of the previous session. Note though that the timestamps at the start of the line can be confusing, as the clock may not be correct until later in the startup sequence.
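
As a rough aid for finding those session boundaries, here is a Python sketch that scans log lines for the kernel's seconds-since-boot stamp and reports where it starts again from near zero. It is a heuristic under that assumption only. The function name is made up, and the 1.0 second threshold for the first kernel line in a file is an arbitrary choice.

```python
import re

# Matches the seconds-since-boot stamp in lines like "...kernel: [    0.001868] ..."
KERNEL_UPTIME = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def boot_start_lines(lines):
    """Return indices of lines where a new boot session appears to begin."""
    starts = []
    previous = None
    for index, line in enumerate(lines):
        match = KERNEL_UPTIME.search(line)
        if not match:
            continue
        uptime = float(match.group(1))
        first_and_early = previous is None and uptime < 1.0
        dropped = previous is not None and uptime < previous
        if first_and_early or dropped:
            starts.append(index)
        previous = uptime
    return starts
```

The lines just before each returned index are the tail of the previous session, which is where any clue about a failure would be.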

Ok, sorry. To clarify:

The RPZ is headless.

Now and then it locks(locked) up.

I did a lot of digging and someone on the RPI forum said the supplied system for connecting to a WAP is problematic - if not deprecated.

The older problem was shown also by the LED blinking crazily on the RPZ.
I could not SSH into it and it wasn't talking to NR.
So I had to power reboot it.

Anyway, they showed me the better way. I applied it but can't really say it is any better.
About that time I also updated my WAP to which it connects. (Yeah, silly me changing two things in a problem at once.)

I think I saw this once after the update and thought it may be kernel panicking.
Luckily I could send an MQTT message to it, and it rebooted and was ok. (LED wasn't blinking crazily.)

It has been a lot more stable since then. So: whatever.

Alas recently when I was changing settings on the modem/router (the new WAP) to apply the changes it took the WAP down and back up again.

The RPZ didn't reconnect.
The LED wasn't crazy, so......
As the other machines were on, I couldn't connect; nothing worked; so I had to power reset it.

So..... I need to do more studying on what to do.
I think first off I need to get the interface back up.
Failing that working, I will need to reboot the machine.

Worst case scenario:
Everything else dead, I don't want it to continually reboot waiting for the WAP if the WAP is dead.

So maybe 3 reboots. If it doesn't connect, just shut down (or not reboot again.)

But that is (really) a few steps down the list of things done, because really if the WAP is down, I would like to try getting it up before rebooting.

My fault. Sorry.
Is that a bit clearer?

I had a similar problem which I described at;

I don't understand what you mean by using a better way to connect to wifi. If you have messed with something in the way the OS uses wifi then all bets are off.
So you don't actually know that it is a wifi problem at all. Rather than adding flows to attempt to sort a problem when you don't even know that it is the wifi, I suggest looking at syslog first. If the pi is completely locked then you will need some external system to sort it.

well, from what I have found:

/etc/network/interfaces, ifup/ifdown, and the "hotplug" keyword are all obsolete, even in Jessie.

What you want is the modern way of configuring network interfaces, which is event-driven: when an interface is brought up, an event is fired and dhcpcd configures networking (see /etc/dhcpcd.conf). When an interface disappears, an event is fired and dhcpcd deconfigures networking.
In a sense, it is all "hotplug" by default now.

Forget the old commands "ifup wlan0" / "ifdown wlan0".
Instead get in the habit of using "ifconfig wlan0 up" / "ifconfig wlan0 dow

Somewhere I was also shown a newer way of setting up the WLAN.

Originally if the WAP went down/up the RPZ would just not reconnect.

The way I was shown (in another thread, with a whole new way of setting up the WLAN) is supposed to cope with the WAP going down and coming back up, and reconnect.

I'm wanting to add flows to monitor what is going on.
If the WAP goes down, on the RPZ it logs the event ASAP.
Then if the WAP comes back up (which may not be known to the RPZ) I want the RPZ to be looking for the WAP.
If the WAP is there, it connects.
If it isn't, it reboots. Just to be sure.
It looks for the WAP. (Repeat above a couple of times)
Once it has done this a few times (say 3 - like I think I said earlier) it stops doing that and logs this fact also.
Then it is up to me to note this event.
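
The escalation described above (try the interface first, then reboot, then stop after about 3 attempts) could be sketched like this in Python. The state file location, the JSON layout and the action names are all assumptions made for illustration. The caller decides what each action really does and does the logging.

```python
import json
from pathlib import Path


def load_attempts(state_file):
    """Read how many recovery reboots were already made; 0 if no usable state."""
    try:
        return int(json.loads(Path(state_file).read_text())["reboots"])
    except (OSError, ValueError, KeyError, TypeError):
        return 0


def save_attempts(state_file, reboots):
    """Persist the reboot count so it survives the reboot itself."""
    Path(state_file).write_text(json.dumps({"reboots": reboots}))


def next_action(connected, restart_tried, state_file, max_reboots=3):
    """Pick the next step: 'ok', 'restart_interface', 'reboot' or 'give_up'."""
    if connected:
        save_attempts(state_file, 0)  # a good connection clears the budget
        return "ok"
    if not restart_tried:
        return "restart_interface"  # cheapest step first
    reboots = load_attempts(state_file)
    if reboots < max_reboots:
        save_attempts(state_file, reboots + 1)  # record before rebooting
        return "reboot"
    return "give_up"
```

Because the counter is written before the reboot and only cleared on a successful connection, a dead WAP leads to at most `max_reboots` reboots and then a steady "give_up" that can be logged for later attention.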

No, I can't say if it is a WiFi problem. But if it detects the WiFi down, it tries to get it back up.

If a Kernel Panic sets in, I have a watchdog running that I hope will reboot the machine.

Yes, I shall have to look at the log file.

Here's what I see:

pi@TelePi:~ $ ls -l /var/log/syslog*
-rw-r----- 1 root adm  3665 Jan 17 21:00 /var/log/syslog
-rw-r----- 1 root adm 70678 Jan 17 06:25 /var/log/syslog.1
-rw-r----- 1 root adm  1170 Jan 16 06:25 /var/log/syslog.2.gz
-rw-r----- 1 root adm  1112 Jan 15 06:25 /var/log/syslog.3.gz
-rw-r----- 1 root adm  2116 Jan 14 06:25 /var/log/syslog.4.gz
-rw-r----- 1 root adm  1187 Jan 13 06:25 /var/log/syslog.5.gz
-rw-r----- 1 root adm  1215 Jan 12 06:25 /var/log/syslog.6.gz
-rw-r----- 1 root adm 15450 Jan 11 06:25 /var/log/syslog.7.gz
pi@TelePi:~ $ usb
pi@TelePi:/media/pi/9020-9C27 $ cd logs
pi@TelePi:/media/pi/9020-9C27/logs $ lf
last_alive.db
Rebooted at 2019-12-23T234534.db
Rebooted at 2019-12-29T221036.db
Rebooted at 2019-12-30T221932.db
Rebooted at 2020-01-09T033921.db
Rebooted at 2020-01-09T034400.db
Rebooted at 2020-01-09T201957.db
Rebooted at 2020-01-12T200053.db
Rebooted at 2020-01-16T045928.db
pi@TelePi:/media/pi/9020-9C27/logs $ cd ~

So the machine was rebooted 16 Jan. at T04:59:28
Alas that seems to be in syslog.2.gz
From the file:

Jan 16 00:05:22 TelePi wpa_supplicant[316]: wlan0: WPA: Group rekeying completed with 24:f5:a2:b2:2a:07 [GTK=CCMP]
Jan 16 00:17:01 TelePi CRON[25137]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan 16 00:46:11 TelePi systemd[1]: Starting Daily apt download activities...
Jan 16 00:46:19 TelePi systemd[1]: Started Daily apt download activities.
Jan 16 00:46:19 TelePi systemd[1]: apt-daily.timer: Adding 28min 39.461111s random time.
Jan 16 00:46:19 TelePi systemd[1]: apt-daily.timer: Adding 6h 6min 20.488228s random time.
Jan 16 01:05:22 TelePi wpa_supplicant[316]: wlan0: WPA: Group rekeying completed with 24:f5:a2:b2:2a:07 [GTK=CCMP]
Jan 16 01:17:01 TelePi CRON[31252]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan 16 02:05:22 TelePi wpa_supplicant[316]: wlan0: WPA: Group rekeying completed with 24:f5:a2:b2:2a:07 [GTK=CCMP]
Jan 16 02:17:01 TelePi CRON[4966]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan 16 02:57:27 TelePi systemd[1]: Starting Certbot...
Jan 16 02:58:06 TelePi systemd[1]: Started Certbot.
Jan 16 02:58:06 TelePi systemd[1]: certbot.timer: Adding 3h 18min 54.543480s random time.
Jan 16 02:58:06 TelePi systemd[1]: certbot.timer: Adding 11h 21min 46.836520s random time.
Jan 16 03:05:22 TelePi wpa_supplicant[316]: wlan0: WPA: Group rekeying completed with 24:f5:a2:b2:2a:07 [GTK=CCMP]
Jan 16 03:17:01 TelePi CRON[11045]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan 16 04:05:22 TelePi wpa_supplicant[316]: wlan0: WPA: Group rekeying completed with 24:f5:a2:b2:2a:07 [GTK=CCMP]
Jan 16 04:17:01 TelePi CRON[17131]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan 16 05:05:22 TelePi wpa_supplicant[316]: wlan0: WPA: Group rekeying completed with 24:f5:a2:b2:2a:07 [GTK=CCMP]
Jan 16 05:17:01 TelePi CRON[23201]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan 16 06:05:22 TelePi wpa_supplicant[316]: wlan0: WPA: Group rekeying completed with 24:f5:a2:b2:2a:07 [GTK=CCMP]
Jan 16 06:17:01 TelePi CRON[29254]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan 16 06:25:02 TelePi CRON[30076]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))
Jan 16 06:25:09 TelePi systemd[1]: Stopping Make remote CUPS printers available locally...
Jan 16 06:25:09 TelePi systemd[1]: Stopped Make remote CUPS printers available locally.
Jan 16 06:25:09 TelePi systemd[1]: Stopping CUPS Scheduler...
Jan 16 06:25:09 TelePi systemd[1]: Stopped CUPS Scheduler.
Jan 16 06:25:09 TelePi systemd[1]: Stopped CUPS Scheduler.
Jan 16 06:25:09 TelePi systemd[1]: Stopping CUPS Scheduler.
Jan 16 06:25:09 TelePi systemd[1]: Started CUPS Scheduler.
Jan 16 06:25:09 TelePi systemd[1]: Closed CUPS Scheduler.
Jan 16 06:25:09 TelePi systemd[1]: Stopping CUPS Scheduler.
Jan 16 06:25:09 TelePi systemd[1]: Listening on CUPS Scheduler.
Jan 16 06:25:09 TelePi systemd[1]: Started CUPS Scheduler.
Jan 16 06:25:09 TelePi systemd[1]: Started Make remote CUPS printers available locally.
Jan 16 06:25:11 TelePi liblogging-stdlog:  [origin software="rsyslogd" swVersion="8.24.0" x-pid="206" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
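
Since the older rotations in the listing are gzip-compressed, a small Python helper can read either kind of file the same way. This is only a convenience sketch with a made-up function name. With the permissions shown in the listing, it would need to run as root or as a member of the adm group.

```python
import gzip


def read_log_lines(path):
    """Yield text lines from a log file, handling .gz rotations transparently."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


# Example: print wpa_supplicant lines from one rotated file.
# for line in read_log_lines("/var/log/syslog.2.gz"):
#     if "wpa_supplicant" in line:
#         print(line)
```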

Because the time in the "Rebooted at" file is Zulu I'm not exactly sure when it happened.
I'm guessing 15:59 local.
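
A small Python sketch for that Zulu-to-local conversion. The UTC+11 offset is only inferred from the 04:59 Zulu / 15:59 local guess above, and the function names are made up. Traditional syslog timestamps are normally written in the system's local time, so it may help to convert the file-name time first and then compare like with like.

```python
from datetime import datetime, timedelta, timezone


def parse_rebooted_at(filename):
    """Extract the UTC time from a name like 'Rebooted at 2020-01-16T045928.db'."""
    stamp = filename[len("Rebooted at "):-len(".db")]
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def to_local(utc_time, offset_hours):
    """Convert an aware UTC datetime to a fixed-offset local time."""
    return utc_time.astimezone(timezone(timedelta(hours=offset_hours)))


# Assumed offset of +11 hours:
# to_local(parse_rebooted_at("Rebooted at 2020-01-16T045928.db"), 11)
```

A fixed offset ignores daylight-saving changes, so it is only good for a quick check around one date.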

But keeping to Zulu time, 04:59..... It is weird.

I can't correlate what is going on.
Every 20 seconds it writes a time stamp to the file: Last alive.
On bootup, it moves Last alive to Rebooted at....... with the time stamp.
But it doesn't make sense.
Rebooted at says 04:59.xx 16 Jan.
Assuming the times in the syslog are also Zulu, they go beyond 04:59.xx
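
For reference, the last-alive scheme described above might look roughly like this in Python. Two assumptions are baked in. The file holds a plain-text UTC timestamp (the real format of the .db files is not known here), and the archived name carries the last heartbeat written before the machine stopped, not the time of the new boot. If the real flow stamps the file at boot time instead, the name means something different.

```python
from datetime import datetime, timezone
from pathlib import Path


def write_heartbeat(log_dir, now=None):
    """Overwrite last_alive.db with the current UTC time; call every 20 s or so."""
    now = now or datetime.now(timezone.utc)
    path = Path(log_dir) / "last_alive.db"
    path.write_text(now.strftime("%Y-%m-%dT%H%M%S"))
    return path


def archive_on_boot(log_dir):
    """Rename last_alive.db after the time it holds; return the new path or None."""
    path = Path(log_dir) / "last_alive.db"
    if not path.exists():
        return None
    stamp = path.read_text().strip()
    target = path.with_name(f"Rebooted at {stamp}.db")
    path.rename(target)
    return target
```

Under these assumptions the name records when the Pi was last known to be alive, so any log entries after that time belong to the gap before the next boot or to the next session.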

I'm confused in what I am seeing.

Thanks Paul.

Nice idea.

The problem I have is that I was editing something on the modem/router (I think it was parental control for devices, and alas they were WiFi ones too). To apply the change the modem/router had to reset, which took the WAP down while it was rebooting.

Subsequently the RPZ didn't reconnect. Yeah, I know - or probably guess - there are further gremlins awaiting me.

I need a way to tell the RPZ to reconnect to the WAP after the WAP goes down.
There are a few layers to it - like what your router-reboot has.

Then as the next layer up, if the RPZ can't connect to the WAP - after a certain time - it reboots the RPZ.
That happens a few times and then stops, because if the WAP is down/dead/the world ends, I see it as pointless rebooting the RPZ over and over.

I have opened a whole new can of worms with Parental control on my router and DHCP stuff.
I won't go into it here. It is outside the scope of the thread.

dhcpcd.conf isn't a better way, it is the way and has been for years. You should not need to mess about with ifup/down or anything like that, it should all be completely automatic. Just configure the SSID and leave it.

The log you show looks like a perfectly normal shutdown. It doesn't show any wifi failures or anything untoward at all, and is not a crash or lockup, otherwise it would not show it shutting down the services. So it looks like something told it to reboot.

Presumably the startup is in the next file; it will often start a new file at startup, depending on the size of the previous file, I think.

Ok, thanks.

I shall have to keep an eye on it.