Boinctasks stuck on "updating"

Started by hucker, February 18, 2020, 04:43:50 PM

hucker

I often see Boinctasks not refreshing the screen.  Sometimes it refreshes only every several seconds (instead of the 1 second I set), sometimes every 30 seconds, and sometimes never again until I restart it.  The status bar at the bottom says "updating.....73" where 73 is counting the number of seconds since the last update.  It's monitoring itself and 3 other computers on a local network which has no other problems and is fast and not overloaded.  The computer Boinctasks is on is not overloaded.  Any idea why?  Boinctasks does have immense difficulty connecting to new computers or ones I've rebooted, it seems a rather flimsy communication system.

fred

Quote from: hucker on February 18, 2020, 04:43:50 PM
The computer Boinctasks is on is not overloaded.  Any idea why?  Boinctasks does have immense difficulty connecting to new computers or ones I've rebooted, it seems a rather flimsy communication system.
I would say it's having problems communicating with the computers.
Seems like the computers on the other side aren't responding to a request.
If it takes too long, the thread that handles the computer will try to establish a new connection.

Firewall/virus scanner might be the cause, or the BOINC client refuses to connect.
Problems while rebooting might be because the IP address changed.

You can go to show->log, click on Debug setup and check Connecting. Also check Enable debug mode.
This might show what's going on.

hucker

It's one of my computers that's playing up and doesn't always respond.  But why does Boinctasks not continue to run and show the other three?  It seems odd that one connection halts the whole program.  Imagine if I had 50 computers....

fred

Quote from: hucker on February 18, 2020, 08:30:51 PM
It's one of my computers that's playing up and doesn't always respond.  But why does Boinctasks not continue to run and show the other three?  It seems odd that one connection halts the whole program.  Imagine if I had 50 computers....
All connections run in an isolated thread, so they are not supposed to influence each other.
But everyone waits for the slowest computer.
What I suggest is to put the 3 good computers in a group to isolate them from the bad computer.
I have no idea why this happens, maybe something low level.

Maybe you should figure out why that one computer isn't responding as it should.
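
To illustrate the behaviour being discussed (this is not BoincTasks code, just a minimal Python sketch): each computer is queried in its own thread, and the display takes whatever has arrived when a deadline passes, so one slow machine shows up as "no answer yet" instead of delaying the others. The `fetch` callable, the host names and the timeout value are all hypothetical stand-ins.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def poll_all(hosts, fetch, timeout):
    """Query every host in its own thread and return what answered in time.

    `fetch` is a hypothetical callable(host) -> status. Hosts that miss the
    deadline are reported as None instead of holding up the rest.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(hosts)))
    futures = {pool.submit(fetch, host): host for host in hosts}
    done, _not_done = wait(futures, timeout=timeout)
    # Do not block on stragglers; their threads finish in the background.
    pool.shutdown(wait=False)

    results = {host: None for host in hosts}
    for future in done:
        host = futures[future]
        try:
            results[host] = future.result()
        except Exception as exc:  # a failed host must not hide the others
            results[host] = f"error: {exc}"
    return results


def fake_fetch(host):
    # Stand-in for a real status request; "slowpc" simulates a busy machine.
    time.sleep(0.5 if host == "slowpc" else 0.01)
    return f"{host}: ok"
```

Calling `poll_all(["pc1", "pc2", "slowpc"], fake_fetch, timeout=0.2)` returns the status of the two fast hosts and `None` for the slow one, which a display loop could render as "updating" for that row only.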

hucker

Quote from: fred on February 18, 2020, 11:24:55 PM
All connections run in an isolated thread, so they are not supposed to influence each other.
But everyone waits for the slowest computer.

Which is it?

Quote from: fred on February 18, 2020, 11:24:55 PM
What I suggest is to put the 3 good computers in a group to isolate them from the bad computer.
I have no idea why this happens, maybe something low level.

Maybe you should figure out why that one computer isn't responding as it should.

With many computers, there is always going to be one that's having problems.  I don't expect this to annoy Boinctasks, making it even harder for me to spot the problem, since the whole thing stops working!

fred

Quote from: hucker on February 18, 2020, 11:48:39 PM
With many computers, there is always going to be one that's having problems.  I don't expect this to annoy Boinctasks, making it even harder for me to spot the problem, since the whole thing stops working!
Hmm, the connection is there, but it's doing something unexpected.
It doesn't actually stop but waits; the other computer made a connection and is sending something.
But you are right, it might be avoided if I knew what the problem was.

hucker

Quote from: fred on February 19, 2020, 02:40:49 AM
Quote from: hucker on February 18, 2020, 11:48:39 PM
With many computers, there is always going to be one that's having problems.  I don't expect this to annoy Boinctasks, making it even harder for me to spot the problem, since the whole thing stops working!
Hmm, the connection is there, but it's doing something unexpected.
It doesn't actually stop but waits; the other computer made a connection and is sending something.
But you are right, it might be avoided if I knew what the problem was.

I've sorted the dodgy computer and it now runs 24/7 without problems, but it's still causing Boinctasks to stick every so often.  I get the status bar saying "Updating...." with a counter going up to about 100 seconds before it recovers.  Sometimes it changes to "Closing...." with a counter.  Not sure what that means.

Sorry if I sounded rude earlier, I've just been getting rather annoyed with the dodgy computer - I think I've narrowed it down to crappy power supplies that don't like me using the 12V rail without the 5V rail.  Fixed it with a car lightbulb!  CIT power supplies are rubbish!

If I can provide some logs or something to help you find the problem, let me know.

hucker

Ok, I now have the "offending" computer running perfectly.  But Boinctasks still sticks for no reason.  It's monitoring four computers: itself, and three others on a fast local network.  There is no reason for this sticking.  Why does it keep stopping responding?  There's spare CPU time on all computers, yet it just stops talking to the other machines for no reason.  Any other communication between the computers works fine, eg file transfers, remote desktop, etc.  Boinctasks seems to have a very flimsy connection, often requiring the toolbox to persuade communication to work.  Can't this be reprogrammed to function better?

fred

Quote from: hucker on February 26, 2020, 10:09:50 PM
Boinctasks seems to have a very flimsy connection, often requiring the toolbox to persuade communication to work.  Can't this be reprogrammed to function better?
BoincTasks relies on the BOINC client to communicate.
But from what you are telling me, something isn't right.
Do you use a fixed IP address or a dynamic one?
Doesn't the offending BOINC computer show anything in its message log? It might be refusing the connection.

hucker

Quote from: fred on March 08, 2020, 02:42:08 AM
Quote from: hucker on February 26, 2020, 10:09:50 PM
Boinctasks seems to have a very flimsy connection, often requiring the toolbox to persuade communication to work.  Can't this be reprogrammed to function better?
BoincTasks relies on the BOINC client to communicate.
But from what you are telling me, something isn't right.
Do you use a fixed IP address or a dynamic one?
Doesn't the offending BOINC computer show anything in its message log? It might be refusing the connection.

The problem seems to occur when one computer is a bit bogged down with something, either the one running Boinctasks (it did it last night when I was backing up the whole system to a slow hard disk with cloning software), or one of the ones it's monitoring.  Perhaps if it can't get a response in a certain timeframe due to the CPU being overloaded then it gets upset?  This usually results in a couple of the computers disappearing from the task list, or sometimes the whole thing stops responding.  Even right clicking on the tray icon for Boinctasks and clicking exit does nothing.

The IPs are handed out by the ISP router automatically by DHCP, but they always remain the same.  As far as I can remember, I gave Boinctasks the IP of each machine, so I don't think that's the problem.  Ipconfig on a machine shows they only have a 3 day lease, but they won't lose the IP even if switched off for longer, as the router cycles through new IPs first if more machines are connected.

Do you want me to turn on any specific debug logging?  I had a quick glance at the stdoutdae.txt on the computer that usually causes the problems because it's slowest, but I can't spot anything.  The stdoutdae.txt on this computer (the one running boinctasks) has no entries for the past few months!  I've deleted the file to see if it will start again.

fred

If the computers don't have a fixed IP set, then you should use the MAC address and remove the IP address.
You should set up the remote computer in BOINC to allow connections from any IP address.
In the BoincTasks settings, make sure the Expert tab shows Reconnect every xx seconds, e.g. 120, and the Connection timeout, e.g. 120.

hucker

Quote from: fred on March 09, 2020, 03:29:53 PM
If the computers don't have a fixed IP set, then you should use the MAC address and remove the IP address.
You should set up the remote computer in BOINC to allow connections from any IP address.
In the BoincTasks settings, make sure the Expert tab shows Reconnect every xx seconds, e.g. 120, and the Connection timeout, e.g. 120.

It's as good as fixed, they haven't changed IP since I bought them.  It's technically DHCP and COULD change, but doesn't.

If I run Efmer Boinc Toolbox on the remote computer, it has a tick against "allow remote (RPC) access for all computers" (and all the other 3 ticks) - is that good enough?  I don't have the toolbox running all the time, only when I need to create or fix a connection.  Should it run all the time?

My expert tab says reconnect every 30 seconds and timeout in 120.  Are there better numbers to use than those?

hucker

Quote from: fred on March 09, 2020, 03:29:53 PM
If the computers don't have a fixed IP set, then you should use the MAC address and remove the IP address.
You should set up the remote computer in BOINC to allow connections from any IP address.
In the BoincTasks settings, make sure the Expert tab shows Reconnect every xx seconds, e.g. 120, and the Connection timeout, e.g. 120.

I've fixed it.  A faulty ethernet cable was causing some packets to be lost - I only spotted it by running a continuous ping between computers and swapping different cables in.  I didn't suspect a wiring problem before, since I can easily transfer terabytes of data using Windows file sharing.  I assume this must be more robust than whatever method you're using.  Perhaps you should program Boinctasks to cater for the odd lost packet?
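
The continuous-ping check described above can be approximated in Python without raw ICMP sockets, which usually need administrator rights. The sketch below instead times repeated TCP connects and counts the ones that fail or time out. Port 31416 is the default BOINC GUI RPC port; the function names, the 0.5 s timeout and the probe count are assumptions chosen for illustration. Since TCP retransmits lost packets, a flaky cable tends to show up here as occasional timeouts or unusually slow connects rather than as a hard failure.

```python
import socket
import time


def tcp_probe(host, port=31416, timeout=0.5):
    """Return the seconds one TCP connect took, or None if it failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers refused connections and timeouts
        return None


def summarize(samples):
    """Summarize probe results; None entries count as lost."""
    ok = [s for s in samples if s is not None]
    lost = len(samples) - len(ok)
    return {
        "sent": len(samples),
        "lost": lost,
        "loss_pct": 100.0 * lost / len(samples) if samples else 0.0,
        "worst_s": max(ok) if ok else None,
    }


def run_probes(host, count=20, interval=1.0, probe=tcp_probe):
    """Probe `host` repeatedly and return the summary dictionary."""
    samples = []
    for _ in range(count):
        samples.append(probe(host))
        time.sleep(interval)
    return summarize(samples)
```

Running something like `run_probes("192.168.1.20", count=100)` (the address is a placeholder) before and after swapping a cable gives a rough comparison; a non-zero `lost` count or a large `worst_s` on an otherwise idle LAN is a hint that the link deserves a closer look.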

hucker

This is still causing problems.  Mainly if a disk on one computer I'm monitoring is busy - loading new Vbox tasks for example.  It doesn't happen with SSD computers.  Can you please work out why Boinctasks waits for every computer to respond before displaying the others? 

fred

Quote from: hucker on February 01, 2022, 10:51:32 PM
Can you please work out why Boinctasks waits for every computer to respond before displaying the others?
This is how it works: all connected computers should respond; if they don't, they are not connected.
A bad connection can cause problems outside of BoincTasks.

Try removing the bad computer and using it in BoincTasks Js, or give BoincTasks Js a try for all computers.