eFMer - BoincTasks and TThrottle forum

BoincTasks For Window, Mac & Linux => Beta Testing => Topic started by: Pepo on May 28, 2010, 11:08:11 AM

Title: BT 0.58
Post by: Pepo on May 28, 2010, 11:08:11 AM
I have 2 computers defined (localhost+extern). While starting, BT was trying to connect to them, Computers/Status for the extern one was showing "Not connected, Connected". After a while (and possibly a couple of unsuccessful connection retries - the host is in a different network) the text changed to "Not connected" AND "1.80" appeared in its TThrottle column!
Note that the machine does not run TTh and is not reachable at all (switched off and in a different part of the city), so it was a very, very wild guess by BT ;D
I guess the connection status should be something like "Not connected, Connecting..."?
Title: Re: BT 0.58
Post by: fred on May 28, 2010, 11:46:17 AM
Quote from: Pepo on May 28, 2010, 11:08:11 AM
I have 2 computers defined (localhost+extern). While starting, BT was trying to connect to them, Computers/Status for the extern one was showing "Not connected, ". After a while (and possibly a couple of unsuccessful connection retries - the host is in a different network) the text changed to "Not connected" AND "1.80" appeared in its TThrottle column!
Note that the machine does not run TTh and is not reachable at all (switched off and in a different part of the city), so it was a very, very wild guess by BT ;D
I guess the connection status should be something like "Not connected, Connecting..."?
If the status is the same only one should appear.

Status,
Connected,
Not connected,
Connected, Not connected,
Not connected, Connected,

Are the options.

Noted a possible bug: Computers: Shows a TThrottle version number even when the remote computer is not connected or does not have TThrottle installed.
Title: Re: BT 0.58
Post by: Pepo on May 28, 2010, 12:05:47 PM
Quote from: fred on May 28, 2010, 11:46:17 AM
Quote from: Pepo on May 28, 2010, 11:08:11 AM
[...]While starting, BT was trying to connect to them, Computers/Status for the extern one was showing "Not connected, ".
You've stolen "Not connected, Connected" from my quote :o (copy/paste with care ;))

Quote
QuoteI guess the connection status should be something like "Not connected, Connecting..."?
If the status is the same only one should appear.

Status,  <--
Connected,
Not connected,
Connected, Not connected,  <--
Not connected, Connected,  <--

Are the options.
Not that I understand this ??? especially the first one (typo? I guess just 4 options are available, Status is the column name) and the latter two...
What do you mean by "If the status is the same"?
Title: Re: BT 0.58
Post by: fred on May 28, 2010, 01:11:21 PM
Quote from: Pepo on May 28, 2010, 12:05:47 PM
Quote from: fred on May 28, 2010, 11:46:17 AM
Quote from: Pepo on May 28, 2010, 11:08:11 AM
[...]While starting, BT was trying to connect to them, Computers/Status for the extern one was showing "Not connected, ".
You've stolen "Not connected, Connected" from my quote :o (copy/paste with care ;))

Quote
QuoteI guess the connection status should be something like "Not connected, Connecting..."?
If the status is the same only one should appear.

Status: 
Connected,
Not connected,
Connected, Not connected,  <--
Not connected, Connected,  <--

Are the options.
Not that I understand this ??? especially the first one (typo? I guess just 4 options are available, Status is the column name) and the latter two...
What do you mean by "If the status is the same"?
"Connected, Connected" and "Not connected, Not connected" are each shown only once, as "Connected" and "Not connected".
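
A minimal sketch of that display rule, assuming the Status cell is built from two separate connection states (the thread does not say which two things they are, and the function name is made up for illustration):

```python
def format_status(first: str, second: str) -> str:
    """Combine two connection states into one Status cell.

    Hypothetical helper: identical states collapse to a single entry,
    different states are listed in order, separated by a comma.
    """
    if first == second:
        return first
    return f"{first}, {second}"


# The four combinations listed above.
for pair in [
    ("Connected", "Connected"),
    ("Not connected", "Not connected"),
    ("Connected", "Not connected"),
    ("Not connected", "Connected"),
]:
    print(format_status(*pair))
```

With this rule only four different texts can ever appear in the column.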
Title: Re: BT 0.58
Post by: fred on May 28, 2010, 03:06:19 PM
Quote from: Pepo on May 28, 2010, 11:08:11 AM
I have 2 computers defined (localhost+extern). While starting, BT was trying to connect to them, Computers/Status for the extern one was showing "Not connected, Connected". After a while (and possibly a couple of unsuccessful connection retries - the host is in a different network) the text changed to "Not connected" AND "1.80" appeared in its TThrottle column!
Note that the machine does not run TTh and is not reachable at all (switched off and in a different part of the city), so it was a very, very wild guess by BT ;D
I guess the connection status should be something like "Not connected, Connecting..."?
I found the problem; that part was unnecessarily complicated.
Title: Re: BT 0.58
Post by: jjwhalen on May 28, 2010, 05:57:52 PM
I probably just never noticed this before :-[  When BT starts up, it apparently sends a "reread preference override file" to all attached clients.  Is this by design?  Certainly a good idea when you're starting/restarting a client, but is it necessary to send this to 1...n clients that are already running ???  I don't immediately see a down side, but what's the up side?  Just a thought.

P.S. Thanks for the project filter in Messages 8)  I haven't tested it yet for all projects I'm attached to, but I will.  If you don't hear anything more, please assume the filter works OK, at least at my house.
Title: Re: BT 0.58
Post by: Pepo on May 28, 2010, 10:23:11 PM
Quote from: jjwhalen on May 28, 2010, 05:57:52 PM
If you don't hear anything more, please assume the filter works OK, at least at my house.
This morning I was thinking of saying something similar: if I do not complain, possibly everything is working satisfactorily or perfectly :D (sorry for not saying so explicitly).
...
Except when I've no spare time to complain, or am looking for the right words ;D
Title: Re: BT 0.58
Post by: fred on May 29, 2010, 09:41:27 AM
Quote from: jjwhalen on May 28, 2010, 05:57:52 PM
I probably just never noticed this before :-[  When BT starts up, it apparently sends a "reread preference override file" to all attached clients.  Is this by design?  Certainly a good idea when you're starting/restarting a client, but is it necessary to send this to 1...n clients that are already running ???  I don't immediately see a down side, but what's the up side?  Just a thought.

P.S. Thanks for the project filter in Messages 8)  I haven't tested it yet for all projects I'm attached to, but I will.  If you don't hear anything more, please assume the filter works OK, at least at my house.
That comes from the Days work and Wanted tasks fields in the Computers tab.
E.g. when you set Days work = 10 and Wanted tasks = 1000:
if that computer has fewer than 1000 tasks at startup, the days are set to 10, and that means setting the preferences.

The log will show: 29 mei 2010 - 11:12:48 Regulator, set work buffer ---- Host: localhost,This, WU now: 790, WU needed: 1000, Set days: 10

The check is repeated every 2 hours or so; when the count is above 1000 the buffer is set to 1 day.

But only use this on a computer that is running and will keep on running while the remote hosts are running.
I only use this on the localhost myself, with a copy of BT running on several localhosts.

So disabling this feature means clearing the 2 columns.
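
A rough sketch of the regulator decision described above. The class and function names are hypothetical, not taken from BT, and what happens when the task count is exactly equal to Wanted tasks is an assumption (treated here as "enough"):

```python
from dataclasses import dataclass
from typing import Optional

LOW_BUFFER_DAYS = 1.0  # buffer used once enough tasks are present


@dataclass
class HostSettings:
    """The two Computers-tab columns; None means the column is cleared."""
    days_work: Optional[float]
    wanted_tasks: Optional[int]


def choose_buffer_days(settings: HostSettings, tasks_now: int) -> Optional[float]:
    """Return the work buffer (in days) to set, or None if the feature is off."""
    if settings.days_work is None or settings.wanted_tasks is None:
        return None  # both columns must be filled in for the regulator to act
    if tasks_now < settings.wanted_tasks:
        return settings.days_work  # too few tasks: ask for the large buffer
    return LOW_BUFFER_DAYS  # enough tasks: fall back to 1 day


# Values from the log line above: WU now 790, WU needed 1000, Set days 10.
print(choose_buffer_days(HostSettings(10, 1000), 790))
```

Any non-None result means the preferences are written again, which is why the clients get a "reread preference override file" at startup.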
Title: Re: BT 0.58
Post by: jjwhalen on May 30, 2010, 03:58:22 PM
Old problem :'( I just got a couple of entries in History with no Elapsed Time--I haven't seen that one in awhile.  One task ran only a few minutes, but the other was several hours long.  Both were on the same host.  The client had the <report results immediately> switch set (in case that makes a difference).
Title: Re: BT 0.58
Post by: Corsair on May 30, 2010, 06:21:36 PM
Quote from: jjwhalen on May 30, 2010, 03:58:22 PM
Old problem :'( I just got a couple of entries in History with no Elapsed Time--I haven't seen that one in awhile.  One task ran only a few minutes, but the other was several hours long.  Both were on the same host.  The client had the <report results immediately> switch set (in case that makes a difference).

Same for me, but I have a lot of WUs without elapsed time, and not on just one host
or one project: it happens on different hosts and across several of the projects I'm running,
e.g. Ibercivis, Milkyway, Enigma@Home and SETI@Home too.
:'( :'(
Title: Re: BT 0.58
Post by: fred on May 30, 2010, 06:38:19 PM
Quote from: jjwhalen on May 30, 2010, 03:58:22 PM
Old problem :'( I just got a couple of entries in History with no Elapsed Time--I haven't seen that one in awhile.  One task ran only a few minutes, but the other was several hours long.  Both were on the same host.  The client had the <report results immediately> switch set (in case that makes a difference).
Hmm I have a pretty long log with close to 1000 entries.
The only thing that could cause this is that the following sequence
Upload -> Ready to report -> Gone
takes less than 10 seconds, which is the history interval timer.
If that happens the task is simply gone before it can be detected.
And report results immediately makes this more likely.

I could make the history interval shorter.... but that means a lot more overhead.
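
To illustrate the timing, here is a small model of a fixed-interval poll. It is only a sketch: the 10-second interval is the value mentioned above, while the function name and the idea that polls happen at exact multiples of the interval are assumptions.

```python
import math

HISTORY_INTERVAL = 10.0  # seconds, the history interval mentioned above


def is_observed(visible_from: float, visible_until: float,
                interval: float = HISTORY_INTERVAL,
                first_poll: float = 0.0) -> bool:
    """True if some poll at first_poll + k * interval (k >= 0) falls inside
    the half-open window [visible_from, visible_until)."""
    # Index of the first poll that is not earlier than visible_from.
    k = max(0, math.ceil((visible_from - first_poll) / interval))
    next_poll = first_poll + k * interval
    return next_poll < visible_until


# A task that is visible for 5 seconds between two polls is never seen...
print(is_observed(11.0, 16.0))
# ...while the same 5-second window straddling a poll is seen.
print(is_observed(8.0, 13.0))
```

In this model a window shorter than the interval is missed only some of the time, depending on where it falls relative to the polls, and a window at least as long as the interval is always caught.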
Title: Re: BT 0.58
Post by: fred on May 30, 2010, 06:42:16 PM
Quote from: Corsair on May 30, 2010, 06:21:36 PM
Quote from: jjwhalen on May 30, 2010, 03:58:22 PM
Old problem :'( I just got a couple of entries in History with no Elapsed Time--I haven't seen that one in awhile.  One task ran only a few minutes, but the other was several hours long.  Both were on the same host.  The client had the <report results immediately> switch set (in case that makes a difference).

Same for me, but I have a lot of WUs without elapsed time, and not on just one host
or one project: it happens on different hosts and across several of the projects I'm running,
e.g. Ibercivis, Milkyway, Enigma@Home and SETI@Home too.
:'( :'(
One of the other problems is that the BOINC client doesn't have a dedicated communication thread.
That means if it's busy with something else it's simply not responding at all.
E.g. when Seti is down (often) the client sometimes freezes, probably until there is a communication timeout. And then the client responds again.
But I will try to figure out why there is no time at all, because there should be a time from the running state.
Title: Re: BT 0.58
Post by: jjwhalen on May 30, 2010, 09:59:33 PM
Quote from: fred on May 30, 2010, 06:38:19 PM
Quote from: jjwhalen on May 30, 2010, 03:58:22 PM
Old problem :'( I just got a couple of entries in History with no Elapsed Time--I haven't seen that one in awhile.  One task ran only a few minutes, but the other was several hours long.  Both were on the same host.  The client had the <report results immediately> switch set (in case that makes a difference).
Hmm I have a pretty long log with close to 1000 entries.
The only thing that could cause this is that the following sequence
Upload -> Ready to report -> Gone
takes less than 10 seconds, which is the history interval timer.
If that happens the task is simply gone before it can be detected.
And report results immediately makes this more likely.

I could make the history interval shorter.... but that means a lot more overhead.

This makes sense, which is why I mentioned the switch being set 8)  (I set this config option when I go to bed in order to meet the cutoff for BOINCstats' daily stats update, which happens in the wee hours of my morning, California time).

The only issue I have with this explanation is that I'd expect to see all, or nearly all tasks reported under <report results immediately> with no Elapsed Time, and that's just not the case--it's maybe 1:25 ???  Obviously Corsair has a much bigger problem with this than I do, for who knows what different set of variables.
Title: Re: BT 0.58
Post by: jjwhalen on May 30, 2010, 10:11:45 PM
Cosmetic change:

Now that
Quote- Changed: Menu: Extra->update all projects: Only updates projects that have work to report.
has been implemented, you might want to rename the menu item to something more appropriate like "Extra>Report all completed tasks" ;)
Title: Re: BT 0.58
Post by: jjwhalen on May 30, 2010, 10:34:08 PM
In History, Einstein@home tasks of the 3.02 Global Correlations S5 search #1 (S5GCESSE2) subproject are listed with the "Reported GPU" color even though this is a CPU executable :o  I'm guessing this means there's something screwed up in the XML from the client?

EDIT -- Forgot to mention: they are NOT shown with GPU colors on the Tasks tab.
Title: Re: BT 0.58
Post by: Pepo on May 30, 2010, 10:54:51 PM
I do not have that many entries in my history (fewer than 400 now) and also do not use RRI, but I have also noticed one incomplete line:

Proj | App | Name | Elapsed time | Completed | Reported | Status | Computer
WUProp@Home | 1.32 Data collect (nci) | wu_1274387826_15435 | 11:59:34 (00:00:08) | 29.05.2010 18:24:59 | 29.05.2010 18:26:00 | Reported: Ok | (localhost)
PrimeGrid | 5.11 PPS LLR | pps_llr_extended_50134541 | | 29.05.2010 17:56:21 | 29.05.2010 17:56:21 | Reported: Ok | (localhost)
PrimeGrid | 5.11 PPS LLR | pps_llr_extended_50351349 | 00:42:14 (00:38:02) | 29.05.2010 16:11:53 | 29.05.2010 17:36:11 | Reported: Ok | (remotehost)


The corresponding messages:
29. 5. 2010 17:37:06 PrimeGrid Starting task pps_llr_extended_50134541_0 using llrPPS version 511
29. 5. 2010 17:54:47 WUProp@Home [task_debug] result wu_1274387826_15435_1 checkpointed
29. 5. 2010 17:56:11 PrimeGrid [task_debug] Process for pps_llr_extended_50134541_0 exited
29. 5. 2010 17:56:11 PrimeGrid [task_debug] task_state=EXITED for pps_llr_extended_50134541_0 from handle_exited_app
29. 5. 2010 17:56:11 PrimeGrid Computation for task pps_llr_extended_50134541_0 finished
29. 5. 2010 17:56:11 PrimeGrid [task_debug] result state=FILES_UPLOADING for pps_llr_extended_50134541_0 from CS::app_finished
29. 5. 2010 17:56:11 CPDN Beta [cpu_sched] Starting famous_t073_599_200_000098719_1(resume)
29. 5. 2010 17:56:11 CPDN Beta [task_debug] task_state=EXECUTING for famous_t073_599_200_000098719_1 from start
29. 5. 2010 17:56:11 CPDN Beta Restarting task famous_t073_599_200_000098719_1 using famous version 607
29. 5. 2010 17:56:13 PrimeGrid Started upload of pps_llr_extended_50134541_0_0
29. 5. 2010 17:56:14 PrimeGrid Finished upload of pps_llr_extended_50134541_0_0
29. 5. 2010 17:56:14 PrimeGrid [task_debug] result state=FILES_UPLOADED for pps_llr_extended_50134541_0 from CS::update_results
29. 5. 2010 17:56:15 PrimeGrid [sched_op_debug] Starting scheduler request
29. 5. 2010 17:56:15 PrimeGrid Sending scheduler request: To report completed tasks.
29. 5. 2010 17:56:15 PrimeGrid Reporting 1 completed tasks, not requesting new tasks
29. 5. 2010 17:56:15 PrimeGrid [sched_op_debug] CPU work request: 0.00 seconds; 0.00 CPUs
29. 5. 2010 17:56:15 PrimeGrid [sched_op_debug] NVIDIA GPU work request: 0.00 seconds; 0.00 GPUs
29. 5. 2010 17:56:16 PrimeGrid Scheduler request completed
29. 5. 2010 17:56:16 PrimeGrid [sched_op_debug] Server version 611
29. 5. 2010 17:56:16 PrimeGrid Project requested delay of 7 seconds
29. 5. 2010 17:56:16 PrimeGrid [sched_op_debug] handle_scheduler_reply(): got ack for result pps_llr_extended_50134541_0
29. 5. 2010 17:56:16 PrimeGrid [sched_op_debug] Deferring communication for 7 sec
29. 5. 2010 17:56:16 PrimeGrid [sched_op_debug] Reason: requested by project
29. 5. 2010 17:56:21 Pirates@Home [sched_op_debug] Starting scheduler request


As can be seen, this was (also) just a case of a task being reported immediately after it finished and uploaded, most probably because the deadline was only some 16 hours away. (BTW, the task (http://www.primegrid.com/result.php?resultid=173428984) took 19:03.77 (10:24.11) minutes.)
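
As a quick check of the timing, the two relevant timestamps from the messages above ("Computation for task ... finished" and "got ack for result ...") can be subtracted. This is only a sketch; the 10-second value is the history interval mentioned earlier in the thread and is assumed to apply here.

```python
from datetime import datetime

# Timestamp layout used in the message log above, e.g. "29. 5. 2010 17:56:11".
FMT = "%d. %m. %Y %H:%M:%S"
HISTORY_INTERVAL_S = 10  # assumed: the history interval mentioned earlier

finished = datetime.strptime("29. 5. 2010 17:56:11", FMT)  # computation finished
acked = datetime.strptime("29. 5. 2010 17:56:16", FMT)     # server acked the result

window = acked - finished
print(window.total_seconds())                       # seconds from finish to ack
print(window.total_seconds() < HISTORY_INTERVAL_S)  # shorter than one interval?
```

The whole finish, upload and report sequence fits inside a single history interval, so it can fall entirely between two checks.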

IIRC the BT's Refresh rate was set to Normal.
----
Note to buglist and/or wishlist: the Use and Computer columns are not exported at all and (in my case) the local computer's name (plus one empty line afterwards) is inserted in front of the copied text, regardless of which computer's history lines I'm copying. -> The computer name appears the same way when copying text from Messages, but it makes more sense in this case because just one computer's messages are being displayed.
Title: Re: BT 0.58
Post by: Corsair on May 30, 2010, 11:07:06 PM
Well, about the log: I've activated <report results immediately> too.
Title: Re: BT 0.58
Post by: Pepo on May 30, 2010, 11:18:05 PM
Quote from: fred on May 30, 2010, 06:42:16 PM
But I will try to figure out why there is no time at all, because there should be a time from the running state.
Yes, this would often be sufficient.

(Is it worth some indication that it was not exactly the reported time? Maybe Report: Missed or so in the Status column, because claiming Reported: OK is a pure lie or at least a wished fiction ;) I could also have aborted the task in the refresh time window as well...)
Title: Re: BT 0.58
Post by: fred on May 31, 2010, 07:39:30 AM
Quote from: jjwhalen on May 30, 2010, 10:11:45 PM
Cosmetic change:

Now that
Quote- Changed: Menu: Extra->update all projects: Only updates projects that have work to report.
has been implemented, you might want to rename the menu item to something more appropriate like "Extra>Report all completed tasks" ;)
Noted, but renaming gets more complicated with all the languages in place.
Title: Re: BT 0.58
Post by: fred on May 31, 2010, 07:46:40 AM
Quote from: jjwhalen on May 30, 2010, 10:34:08 PM
In History, Einstein@home tasks of the 3.02 Global Correlations S5 search #1 (S5GCESSE2) subproject are listed with the "Reported GPU" color even though this is a CPU executable :o  I'm guessing this means there's something screwed up in the XML from the client?

EDIT -- Forgot to mention: they are NOT shown with GPU colors on the Tasks tab.
Hmm, it has to do with the plan class. In Tasks, the plan class is shown in the application column between ( ), like (cuda); e.g. cuda, ati, etc. Is it empty, i.e. no ( ), or did you override the plan class?
The history works the same way.
Even better, you can send me a copy of the history file of that machine (the cvs is the most recent one), so I can see the plan class myself.
Title: Re: BT 0.58
Post by: fred on May 31, 2010, 07:50:40 AM
Quote from: Pepo on May 30, 2010, 10:54:51 PM
Note to buglist and/or wishlist: the Use and Computer columns are not exported at all and (in my case) the local computer's name (plus one empty line afterwards) is inserted in front of the copied text, regardless of which computer's history lines I'm copying. -> The computer name appears the same way when copying text from Messages, but it makes more sense in this case because just one computer's messages are being displayed.
Done
Title: Re: BT 0.58
Post by: fred on May 31, 2010, 07:51:58 AM
Quote from: Corsair on May 30, 2010, 11:07:06 PM
Well, about the log: I've activated <report results immediately> too.
That's what gets you into these troubles.
Title: Re: BT 0.58
Post by: fred on May 31, 2010, 07:54:41 AM
Quote from: Pepo on May 30, 2010, 11:18:05 PM
Quote from: fred on May 30, 2010, 06:42:16 PM
But I will try to figure out why there is no time at all, because there should be a time from the running state.
Yes, this would often be sufficient.

(Is it worth some indication that it was not exactly the reported time? Maybe Report: Missed or so in the Status column, because claiming Reported: OK is a pure lie or at least a wished fiction ;) I could also have aborted the task in the refresh time window as well...)
Added to the list: Missed.
I will also lock the history to the wall clock and add an option to adjust the currently fixed frequency.
Title: Re: BT 0.58
Post by: jjwhalen on May 31, 2010, 04:14:30 PM
Quote from: fred on May 31, 2010, 07:46:40 AM
Quote from: jjwhalen on May 30, 2010, 10:34:08 PM
In History, Einstein@home tasks of the 3.02 Global Correlations S5 search #1 (S5GCESSE2) subproject are listed with the "Reported GPU" color even though this is a CPU executable :o  I'm guessing this means there's something screwed up in the XML from the client?

EDIT -- Forgot to mention: they are NOT shown with GPU colors on the Tasks tab.
Hmm, it has to do with the plan class. In Tasks, the plan class is shown in the application column between ( ), like (cuda); e.g. cuda, ati, etc. Is it empty, i.e. no ( ), or did you override the plan class?
The history works the same way.
Even better, you can send me a copy of the history file of that machine (the cvs is the most recent one), so I can see the plan class myself.

The plan class is as quoted above (S5GCESSE2).  In Tasks the short name is 3.02 einstein_S5GC1 (S5GCESSE2).  The executable is ProgramData\BOINC\Projects\einstein.phys.uwm.edu\einstein_S5GC1_3.02_windows_intelx86__S5GCESSE2.exe.  No, I'm not overriding plan class.

BTW Einstein's other subproject 3.08 Arecibo Binary Pulsar Search (STSP) is listed both in Tasks and History with regular CPU colors as are (as far as I can tell) all other non-GPU tasks that I'm running.

B/w :)
Title: Re: BT 0.58
Post by: fred on May 31, 2010, 04:33:44 PM
Quote from: jjwhalen on May 31, 2010, 04:14:30 PM
Quote from: fred on May 31, 2010, 07:46:40 AM
Quote from: jjwhalen on May 30, 2010, 10:34:08 PM
In History, Einstein@home tasks of the 3.02 Global Correlations S5 search #1 (S5GCESSE2) subproject are listed with the "Reported GPU" color even though this is a CPU executable :o  I'm guessing this means there's something screwed up in the XML from the client?

EDIT -- Forgot to mention: they are NOT shown with GPU colors on the Tasks tab.
Hmm, it has to do with the plan class. In Tasks, the plan class is shown in the application column between ( ), like (cuda); e.g. cuda, ati, etc. Is it empty, i.e. no ( ), or did you override the plan class?
The history works the same way.
Even better, you can send me a copy of the history file of that machine (the cvs is the most recent one), so I can see the plan class myself.

The plan class is as quoted above (S5GCESSE2).  In Tasks the short name is 3.02 einstein_S5GC1 (S5GCESSE2).  The executable is ProgramData\BOINC\Projects\einstein.phys.uwm.edu\einstein_S5GC1_3.02_windows_intelx86__S5GCESSE2.exe.  No, I'm not overriding plan class.

BTW Einstein's other subproject 3.08 Arecibo Binary Pulsar Search (STSP) is listed both in Tasks and History with regular CPU colors as are (as far as I can tell) all other non-GPU tasks that I'm running.

B/w :)

Ok I will try to fix this in the next version.