OnTimerDlg ---- Thread timeout, restarts

Started by Jimbocous, May 03, 2019, 08:27:23 AM

Jimbocous

Low priority issue to me, definitely. More a matter of curiosity.
I see this 2-3 times per day. Seems to happen when there's a fair amount of traffic caused by activity from project servers (e.g. downloading a slug of new tasks, change of work venue, etc).

03 May 2019 - 01:40:54 OnTimerDlg ---- Thread timeout, will restart.
03 May 2019 - 01:40:54 BoincTasks is closing down all threads, this may take a couple of minutes.
03 May 2019 - 01:40:58 BoincTasks has closed down all threads

Occasionally, one of the computers on the network apparently fails to respond:

03 May 2019 - 01:40:58 ERROR: Not closed down properly ---- [machine name]  Suspended

No real problems result from this, other than the crash itself, and everything restarts in a minute or two. Just wondering what the cause is, and if there's something I can do to minimize occurrences.
At present, given the configuration detailed below, BT is tracking 2000 tasks on SETI, and sometimes additional tasks if other projects are active.

Current configuration is as follows:
1 - general use PC
     Xeon 3ghz hexacore
     running BT 1.78 on Win10, BOINC 7.12.1, monitoring dedicated crunchers
     Avg. CPU utilization < 90%
     plenty of memory and SSD disk space
     3 NVidia GPUs
     LAN stats:
         DL Rate:  5.3 MB/s avg, 8.9 MB/s max
         UL Rate:  112 KB/s avg, 6.0 MB/s max
4 - dedicated SETI crunchers
     2 - Xeon 3ghz hexacore,         Win10, 1 - BOINC 7.12.1, 1 - BOINC 7.14.2, 3 NVidia GPUs ea.
     1 - dual Xeon 3ghz hexacore, Win10, BOINC 7.12.1, 3 NVidia GPUs
     1 - Xeon 3 ghz Core2Quad,     Win7,   BOINC 7.12.1, 4 NVidia GPUs
     Avg. CPU utilization > 90%
     plenty of memory and SSD disk space, LAN stats insignificant
Network - all 1 Gb wired connections via 2 gigabit switches
    No indication of network blockage or other LAN issues.

While utilization is high, none of the machines show sluggish mouse or keyboard response, or any other issues, when used at the console.
Just wondering if there's anything I can do to tune this, or if it's just a hard-coded timer that a machine intermittently fails to meet due to load, or something else?

Would be interested in any thoughts?
Many thanks!

fred

Quote from: Jimbocous on May 03, 2019, 08:27:23 AM
Low priority issue to me, definitely. More a matter of curiosity.
I see this 2-3 times per day. Seems to happen when there's a fair amount of traffic caused by activity from project servers (e.g. downloading a slug of new tasks, change of work venue, etc).

Would be interested in any thoughts?
Many thanks!

BoincTasks checks whether the thread that handles each machine responds within a certain time.
If a thread doesn't respond in time, BT restarts all threads.
That does indicate that something is going wrong. My guess is that BT doesn't get enough resources to handle things in time.

The number of tasks isn't very high; BT should be able to handle many more.

In itself this isn't a crash, just a safety timer that kicks in.
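
For anyone curious what such a safety timer looks like in general, here is a minimal sketch in Python. It is not BoincTasks' actual code: the class name ThreadWatchdog, the 30-second timeout, the machine names and the restart callback are all made up for illustration. Each per-machine worker reports a heartbeat, and a periodic timer restarts everything if any worker has been silent for longer than the timeout.

```python
import time


class ThreadWatchdog:
    """Tracks the last heartbeat of each per-machine worker (hypothetical design)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s   # assumed value; the real timer length is unknown
        self.clock = clock           # injectable so the demo can use a fake clock
        self.last_seen = {}          # machine name -> time of last heartbeat

    def heartbeat(self, machine):
        # Called by a worker each time it finishes a round of work.
        self.last_seen[machine] = self.clock()

    def stalled(self):
        # Machines whose worker has been silent for longer than the timeout.
        now = self.clock()
        return sorted(m for m, t in self.last_seen.items()
                      if now - t > self.timeout_s)

    def on_timer(self, restart_all):
        # Periodic check: if any worker is late, restart everything once
        # and reset all heartbeats so the restart is not repeated at once.
        late = self.stalled()
        if late:
            restart_all(late)
            now = self.clock()
            for machine in self.last_seen:
                self.last_seen[machine] = now
        return late


class FakeClock:
    """Deterministic clock so the demo does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def demo():
    clock = FakeClock()
    watchdog = ThreadWatchdog(timeout_s=30.0, clock=clock)
    restarts = []                        # records which machines triggered a restart
    watchdog.heartbeat("cruncher-1")
    watchdog.heartbeat("cruncher-2")
    clock.advance(20)
    watchdog.heartbeat("cruncher-1")     # cruncher-2 stays silent from here on
    first = watchdog.on_timer(restarts.append)
    clock.advance(15)                    # cruncher-2 has now been silent for 35 s
    second = watchdog.on_timer(restarts.append)
    return first, second, restarts
```

In the demo, the first timer check finds nothing late. The second finds that the hypothetical cruncher-2 worker has exceeded the timeout and triggers a single restart of everything, which is the same pattern as the "Thread timeout, will restart." log entry.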

Go to BoincTasks Settings -> Expert and check "Enable Thread runtime".
That shows how busy a thread is.
Restart BoincTasks and after a while check the graph in the same dialog.

Check if there are any recent crashes: https://efmer.com/boinctasks-crashes/

Jimbocous

Thanks for the info. I'll do as you suggest and post updates, if any. Appreciate it!