Saturday, May 2, 2020

Building a hardware watchdog timer for a kiosk or other system that needs to run 24x7

Alan Turing proved back in 1936 that the halting problem is undecidable: no general algorithm can determine, for every program and every input, whether that program will eventually halt. Our contemporary computing machines are no exception.

If predicting the crash of an algorithm or program were a computable function, we would know the edge cases that cause an application to fail or enter an endless loop without having to explore the actual scenarios that reveal them. Put simply, we would only have to ask under which conditions the program loops forever or ends unexpectedly, and by never providing those inputs we would avoid these scenarios with absolute certainty.
In the real world, however, computability theory doesn't give us that. If we assume that the software will fail at some point, and it nevertheless needs to keep functioning, we need reactive measures that let us control its flow of execution: restarting it, or moving its current state to a different execution point, when an external observer detects incorrect behavior.

The watchdog timer is a mechanism that can be seen as an element of external control. A software loop periodically tells the watchdog timer that it is still running. If the watchdog stops receiving this feedback, it takes control of the machine running the software, for example by issuing a command that causes the machine to restart the software.
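The principle can be sketched as a small simulation. This is only an illustration of the idea, not the firmware running on the actual device: the class name, the tick-based timeout, and the `on_timeout` callback are all assumptions made for the example.

```python
class WatchdogTimer:
    """Simulated watchdog: fires a recovery action when heartbeats stop."""

    def __init__(self, timeout_ticks, on_timeout):
        self.timeout_ticks = timeout_ticks  # ticks allowed without a heartbeat
        self.on_timeout = on_timeout        # recovery action (hypothetical callback)
        self.elapsed = 0                    # ticks since the last heartbeat

    def kick(self):
        """Called by the monitored software loop to signal that it is alive."""
        self.elapsed = 0

    def tick(self):
        """Called once per time unit by the watchdog's own clock."""
        self.elapsed += 1
        if self.elapsed >= self.timeout_ticks:
            self.on_timeout()
            self.elapsed = 0  # start a new observation window after recovery


restarts = []
wd = WatchdogTimer(timeout_ticks=3, on_timeout=lambda: restarts.append("restart"))

# Healthy software: a heartbeat follows every tick, so the timeout is never reached.
for _ in range(5):
    wd.tick()
    wd.kick()

# Hung software: heartbeats stop, and the third silent tick triggers the recovery action.
for _ in range(3):
    wd.tick()
```

The important design point is that the counter lives outside the monitored software, so a hang in the software cannot prevent the timeout from firing.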

That is precisely the principle I followed a few years ago, when I needed to build one such device to make sure that my IP camera, based on an obsolete smartphone, could be brought back to an operational state after a software crash or similar problem:

While building the Home Assistant Kiosk in my last couple of posts, I found that occasional system crashes would again be an issue. A watchdog timer could be an appropriate way to maximize the device's availability.

Using it with the Android system that runs on this Kiosk is, at the time of this writing, still under analysis, because it is necessary to determine whether there are any GPIO pins available on the board, and whether they can be controlled from the system:

This watchdog timer consists of a PIC12F683 programmed with firmware that I wrote, a 2N2222 transistor for controlling the smartphone power button, and a couple of passive components. Its source code can be found here:

It only uses two GPIO pins from the PIC, one serving as an input and the other as an output. The input is connected to one of the target device's GPIO pins.
For example, in the smartphone I removed the vibration motor and used its signal pin for this purpose. The motor state could be controlled from the /sys filesystem by writing the desired state to the appropriate file.
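On the monitored device, the heartbeat could look roughly like the sketch below. It assumes that writing "1" and "0" to a control file toggles the pin. The actual file path and the accepted values depend on the device and its kernel driver, so the path is left as a parameter, and the function names and timings are hypothetical.

```python
import tempfile
import time
from pathlib import Path


def send_heartbeat(control_file, pulse_seconds=0.1):
    """Pulse the pin high and then low by writing to a sysfs-style control file."""
    path = Path(control_file)
    path.write_text("1")        # assumed "on" value
    time.sleep(pulse_seconds)
    path.write_text("0")        # assumed "off" value


def heartbeat_loop(control_file, interval_seconds, beats):
    """Emit a fixed number of heartbeats; a real service would loop indefinitely."""
    for _ in range(beats):
        send_heartbeat(control_file)
        time.sleep(interval_seconds)


# Demo against a temporary file so the sketch runs without the hardware.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "enable"
    heartbeat_loop(demo_file, interval_seconds=0.0, beats=2)
    final_state = demo_file.read_text()
```

On the real device the loop would run as part of the software being monitored, so that a crash or hang of that software also stops the heartbeat.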

Once I have more details on how the integration with the Kiosk will be done, I will post them here. As a last resort, if a usable GPIO pin cannot be found, there is a UART present on the board. The liveness data can be sent to the watchdog timer through it, but the device will then need to be slightly more sophisticated, capable of receiving serial data, or at least of interpreting a specific pattern in the serial stream.
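Interpreting a specific pattern in a serial stream could work along the lines of the sketch below, which keeps a sliding window of the most recent bytes and compares it against the expected pattern. The pattern `ALIVE\n` and the class name are made up for the example, and actual firmware for a microcontroller would be written differently. The sketch only shows the matching logic.

```python
from collections import deque


class PatternDetector:
    """Consumes serial bytes one at a time and reports each complete heartbeat pattern."""

    def __init__(self, pattern):
        self.pattern = pattern
        # Sliding window holding only the most recent len(pattern) bytes.
        self.window = deque(maxlen=len(pattern))

    def feed(self, byte):
        """Return True when `byte` completes the pattern."""
        self.window.append(byte)
        if bytes(self.window) == self.pattern:
            self.window.clear()  # do not reuse these bytes for the next match
            return True
        return False


detector = PatternDetector(b"ALIVE\n")  # hypothetical heartbeat pattern
stream = b"boot log noise...ALIVE\nmore noiseALIVE\n"

# Each True result would reset the watchdog's timeout counter.
heartbeats = sum(detector.feed(b) for b in stream)
```

Matching a pattern rather than any serial activity means that a device stuck printing boot messages or kernel errors is not mistaken for a healthy one.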
