Watchdog Reset through terminal DOS

johu
Site Admin
Posts: 5799
Joined: Thu Nov 08, 2018 10:52 pm
Location: Kassel/Germany
Has thanked: 160 times
Been thanked: 1031 times

Watchdog Reset through terminal DOS

Post by johu »

Today I found 90% of the cause of an issue that has been plaguing me for months now: random reboots of my VCU mid-drive. The whole story is here: viewtopic.php?p=66887&hilit=reset#p66887

Lately I had the VCU restart during charging, but there was one important difference to a normal charging session: I had the dashboard running over the entire session. The dashboard polls settings and spot values at a 500 ms interval, so two lengthy "get" commands per 500 ms.

So I conducted an experiment here on the desk: connected a USB/TTL adapter, set the baud rate to 921600 and then spammed it with the same get commands as the dashboard but at a much higher rate. And indeed the replies become more and more garbled and after fewer than 10 requests the watchdog resets. The longer the commands, the sooner it resets.

Then I had a first lucky shot, commented out the echo:

Code:

   // Echo every newly received character back to the sender
   while (lastIdx < currentIdx)
      usart_send_blocking(usart, inBuf[lastIdx++]);
With that done I can stream data indefinitely at the highest possible rate.

SPOILER: this is not the root cause, read on

The strange thing is, even with the terminal task completely locked up (e.g. by running "stream 10000 something,something") the watchdog doesn't trigger, because naturally the scheduler (or in fact any interrupt) has higher priority than the terminal, which runs in the main loop.

usart_send_blocking() calls usart_wait_send_ready() which contains

Code:

// Busy-wait until the TXE flag is set, i.e. the transmit data register is empty
while ((USART_SR(usart) & USART_SR_TXE) == 0);
So it keeps polling the status register until the send buffer is empty. Is polling the USART register somehow uninterruptible? I don't know yet, and that's why I said 90%.
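
To get a feel for how long that polling loop can hold up the main loop, here is a rough back-of-the-envelope sketch. It assumes 8N1 framing (10 bits on the wire per character), which is an assumption and not stated in the post. echo_block_time_us is a hypothetical helper for illustration, not part of libopencm3.

```c
#include <stdint.h>

/* Bits on the wire per character, ASSUMING 8N1 framing:
 * 1 start bit + 8 data bits + 1 stop bit. */
#define BITS_PER_CHAR 10u

/* Hypothetical helper: approximate time in microseconds that a blocking
 * echo of n_chars characters spends waiting for the transmitter at the
 * given baud rate. Returns 0 for a baud rate of 0 to avoid dividing by zero. */
static uint32_t echo_block_time_us(uint32_t n_chars, uint32_t baud)
{
    if (baud == 0u) {
        return 0u;
    }
    uint64_t bits = (uint64_t)n_chars * BITS_PER_CHAR;
    return (uint32_t)((bits * 1000000u) / baud);
}
```

Under these assumptions, echoing a 100-character command at 921600 baud keeps the main loop busy for roughly 1.1 ms. That is blocking time in the main loop only. Interrupts should still preempt it, so this number by itself does not explain a watchdog reset.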

The next challenge is to fix the issue in a backward-compatible manner. Simply not echoing does not play well with the existing ESP8266 or ESP32 code that runs on 100s of devices by now. Also, for most people it is not really an issue, because at the normal poll rates of the web interface everything runs fine for hours.

Maybe an "echo off" command would be the solution.
Support R/D and forum on Patreon: https://patreon.com/openinverter - Subscribe on odysee: https://odysee.com/@openinverter:9

Re: Watchdog Reset through terminal echo

Post by johu »

I have added an echo 0|1 command now and also allow turning off echo in the constructor.
So now for the car project I turn it off by default. Upcoming versions of the ESP firmware can turn it off via the command.
https://github.com/jsphuebner/libopenin ... e838746db2
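
A minimal sketch of what a switchable echo could look like, in plain C so it can be tried on a PC. This is not the code from the commit above: terminal_t, terminal_init, terminal_handle_echo_cmd, terminal_echo and the capture_send stand-in are all hypothetical names. Only the "echo 0|1" command and the default set at construction are taken from the post.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical terminal state (illustrative, not libopeninverter's class). */
typedef struct {
    bool echo;             /* default chosen at init, changed by "echo 0|1" */
    void (*send)(char c);  /* stand-in for usart_send_blocking() */
} terminal_t;

/* Host-side stand-in for the UART: records what would have been sent. */
static char captured[64];
static size_t captured_len;

static void capture_send(char c)
{
    if (captured_len < sizeof(captured)) {
        captured[captured_len++] = c;
    }
}

/* Constructor-style init: the caller picks the echo default. */
static void terminal_init(terminal_t *t, void (*send)(char), bool echo)
{
    t->send = send;
    t->echo = echo;
}

/* Handle "echo 0" / "echo 1". Returns true if the command was recognised. */
static bool terminal_handle_echo_cmd(terminal_t *t, const char *cmd)
{
    if (strcmp(cmd, "echo 0") == 0) { t->echo = false; return true; }
    if (strcmp(cmd, "echo 1") == 0) { t->echo = true;  return true; }
    return false;
}

/* Echo newly received characters only if echo is enabled.
 * The index always advances, so nothing is echoed late when echo
 * is switched back on. Returns the new lastIdx. */
static size_t terminal_echo(const terminal_t *t, const char *inBuf,
                            size_t lastIdx, size_t currentIdx)
{
    while (lastIdx < currentIdx) {
        if (t->echo) {
            t->send(inBuf[lastIdx]);
        }
        lastIdx++;
    }
    return lastIdx;
}
```

Keeping echo on by default and letting the client switch it off is what keeps older ESP firmware working, since that firmware still expects the echo.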

I still found a way to crash the STM via the terminal: execute a very long-running terminal command like "stream 10000 xxx" and then hit the terminal with new commands at a high rate while it is executing. It's a bit more academic but should be solved as well.

Re: Watchdog Reset through terminal echo

Post by johu »

Alright, more changes. Now I can no longer manage to lock up the terminal or the entire processor.

The USART has an overrun flag and apparently while that is active it no longer accepts data. So now I
a) disable the RX DMA while running a command - that stops the watchdog reset
b) reset the resulting overrun flag before enabling the RX DMA
https://github.com/jsphuebner/libopenin ... e7973a82df
Right, and as I was writing this it stopped accepting commands again so now I always clear the overrun flag. No reason not to.
https://github.com/jsphuebner/libopenin ... f32845af0d
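
The sequence of steps a) and b) can be sketched as a small host-side model. This is not the code from the commits above. It assumes an STM32F1-style USART, where the overrun flag ORE (bit 3 of USART_SR) is cleared by reading USART_SR and then USART_DR, and where RX DMA is gated by the DMAR bit (bit 6 of USART_CR3). The struct and all function names are hypothetical. On real hardware the two reads would go to USART_SR(usart) and USART_DR(usart).

```c
#include <stdint.h>

#define SR_ORE   (1u << 3)  /* overrun error flag in USART_SR (STM32F1) */
#define CR3_DMAR (1u << 6)  /* "DMA enable receiver" bit in USART_CR3 */

/* Hypothetical host-side model of the three registers involved. */
typedef struct {
    uint32_t sr;
    uint32_t dr;
    uint32_t cr3;
    int sr_read_with_ore;  /* models: SR was read while ORE was set */
} usart_model_t;

static uint32_t read_sr(usart_model_t *u)
{
    if (u->sr & SR_ORE) {
        u->sr_read_with_ore = 1;
    }
    return u->sr;
}

static uint32_t read_dr(usart_model_t *u)
{
    /* ORE is only cleared by an SR read followed by a DR read. */
    if (u->sr_read_with_ore) {
        u->sr &= ~SR_ORE;
        u->sr_read_with_ore = 0;
    }
    return u->dr;
}

/* b) clear the overrun flag: read SR, then DR. */
static void clear_overrun(usart_model_t *u)
{
    (void)read_sr(u);
    (void)read_dr(u);
}

/* a) keep RX DMA off while a command runs, then always clear the
 * overrun flag before turning RX DMA back on. */
static void run_command_guarded(usart_model_t *u,
                                void (*cmd)(usart_model_t *))
{
    u->cr3 &= ~CR3_DMAR;
    cmd(u);
    clear_overrun(u);
    u->cr3 |= CR3_DMAR;
}

/* Stand-in for a long-running command during which incoming bytes
 * pile up and the receiver overruns. */
static void fake_long_command(usart_model_t *u)
{
    u->sr |= SR_ORE;
}
```

Clearing the flag unconditionally costs two register reads, which fits the "no reason not to" remark above. The model says nothing about why step a) prevents the lockup.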

I do understand why b) recovers the terminal but I don't understand why a) stops the processor from locking up.

EDIT: and waddayaknow - with those changes I can turn the echo back on :twisted: