eFMer - BoincTasks and TThrottle forum

BoincTasks For Window, Mac & Linux => Beta Testing => Topic started by: Pepo on October 26, 2011, 07:39:27 AM

Title: BT 1.25
Post by: Pepo on October 26, 2011, 07:39:27 AM
The changeset for BT 1.25 seems to be promising :) going to be a wonderful release!
Hope this thread will be as short as possible ;)
Title: Re: BT 1.25
Post by: Pepo on October 26, 2011, 08:11:35 AM
Title: Re: BT 1.25
Post by: fred on October 26, 2011, 11:31:17 AM
I've seen the collapse problem once, but when you restart BT it should go away, I'm not sure why this happens, have to check.

The multiple selection option is the default Windows setting, this is how these things should work.
Remove the check and you get behavior that is more in line with the regular list.
Title: Re: BT 1.25
Post by: Pepo on October 26, 2011, 01:31:54 PM
Quote from: fred on October 26, 2011, 11:31:17 AM
I've seen the collapse problem once, but when you restart BT it should go away, I'm not sure why this happens, have to check.
In my case it persists. On the other machine, three notices related to both remote_hosts.cfg and cc_config.xml. In BT 1.24 a few minutes ago they were closing just fine. I've also noticed that the trailing numbers in URLs are being updated all three at once (like 144+145+143 -> 147+148+146), regardless of which [-] is being clicked on.

As in TTh's Temp graph dialog, BT's temp graph also has these raised color boxes next to the Core/Max/Gpu checkboxes.

Interesting or weird: until the BOINC client on a particular machine (e.g. localhost, not tried with remotes) connects to BT, BT does not display the machine's temperatures, although it is connected to its TTh. Once everything is running, restarting the client does not stop the temperature from being continuously updated.
I think the temperature could be displayed immediately after BT and TTh are connected. This could aid troubleshooting while trying to connect to a remote BOINC client: when TTh is connected and temperatures get updated, then the network is just fine and the problem is with authentication, etc.
Title: Re: BT 1.25
Post by: Pepo on October 26, 2011, 05:34:42 PM
Quote from: Pepo on October 26, 2011, 01:31:54 PM
Quote from: fred on October 26, 2011, 11:31:17 AM
I've seen the collapse problem once, but when you restart BT it should go away, I'm not sure why this happens, have to check.
In my case it persists. On the other machine, three notices related to both remote_hosts.cfg and cc_config.xml. In BT 1.24 a few minutes ago they were closing just fine. I've also noticed that the trailing numbers in URLs are being updated all three at once (like 144+145+143 -> 147+148+146), regardless of which [-] is being clicked on.
With the same BT instance, I've updated BOINC 6.12.34 -> 6.12.41 - I can suddenly close these three notices. Perhaps their format changed slightly in between and BT 1.25 does not support the older format correctly? (A wild guess.)
Title: Re: BT 1.25
Post by: Pepo on October 27, 2011, 12:05:45 AM
Title: Re: BT 1.25
Post by: fred on October 27, 2011, 06:09:58 AM
Quote from: Pepo on October 27, 2011, 12:05:45 AM

  • "-Add: Graph toolbar: Data transfer (BOINC V7)" - there is no BOINC V7 yet to try out, just the 6.13.x series, but BT 1.25 refuses to consider them being "V7". So, how to test it in advance?
6.13 is the V7 beta. As soon as the feature is implemented in a 6.13 version, I will implement it.
Title: Re: BT 1.25
Post by: Pepo on October 27, 2011, 09:30:34 AM
Quote from: fred on October 27, 2011, 06:09:58 AM
Quote from: Pepo on October 27, 2011, 12:05:45 AM
"-Add: Graph toolbar: Data transfer (BOINC V7)" - there is no BOINC V7 yet to try out, just the 6.13.x series, but BT 1.25 refuses to consider them being "V7". So, how to test it in advance?
6.13 is the V7 beta. As soon as the feature is implemented in a 6.13 version, I will implement it.
:) I've thought it already is, just could not trigger it ;D
Title: Re: BT 1.25
Post by: fred on October 27, 2011, 10:13:18 AM
Quote from: Pepo on October 27, 2011, 09:30:34 AM
:) I've thought it already is, just could not trigger it ;D
Can only do so much .. :o... V 1.26
Title: Re: BT 1.25
Post by: Pepo on October 28, 2011, 10:44:04 PM
(Fred, I suspect you'll hate me and will try some killing voodoo on my doll :P)

Connected to local Win x64 6.13.9 client. After testing PrimeGrid + TTh, I've noticed that under the BOINC client process, there are two orphaned PrimeGrid tasks I've aborted and updated some 20 minutes ago. They were heavily using CPU, thus TThrottle was throttling all known valid payload to nearly zero. So far all correct.

I've aborted these 2 processes and then observed the throttle% graph, slowly coming from 99% upwards to maybe 40%. The remaining tasks' CPU usage slowly rose (observed in process monitoring tools), but BT somehow got out of order. These running tasks were suddenly not green-highlighted, and their Status column described them as "New" or "New - Suspended by user", instead of "Running" or anything else. The Throttle column did not show any number or bar. In the Computers tab, the connection to TThrottle 5.40 seemed to be OK. In the Messages tab, everything was pink-highlighted, and in the message ID column weird numbers were jumping around, flashing between zero and something huge; the date-time column contained just "--":

0 -- Starting BOINC client version 6.13.9 for windows_x86_64
0 -- This a development version of BOINC and may not function properly
825309449 -- Config: report completed tasks immediately
1633951847 -- Config: GUI RPC allowed from:
1633951847 -- Config:   pavilon6
778658670 -- Config:   vetroplach
778658670 -- Config:   localhost
[.....]
537542255 WUProp@Home -- [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
1920299838 PrimeGrid -- [checkpoint] result pps_llr_104435382_0 checkpointed
1647262730 WUProp@Home -- [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
537542255 PrimeGrid -- [checkpoint] result pps_llr_104435382_0 checkpointed
1869770812 surveill@home -- Computation for task wu_1319568902_127067_0 finished
1847617390 surveill@home -- Started upload of wu_1319568902_127067_0_data
537542249 surveill@home -- Started upload of wu_1319568902_127067_0_uris
0 surveill@home -- Finished upload of wu_1319568902_127067_0_data
0 surveill@home -- Finished upload of wu_1319568902_127067_0_uris
0 surveill@home -- Sending scheduler request: To report completed tasks.
0 surveill@home -- Reporting 1 completed tasks, requesting new tasks for CPU
0 surveill@home -- Scheduler request completed: got 0 new tasks
0 surveill@home -- Not sending work - max number of probes in your network exceeded
0 WUProp@Home -- [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
0 PrimeGrid -- [checkpoint] result pps_llr_104435382_0 checkpointed
0 WUProp@Home -- [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
0 PrimeGrid -- [checkpoint] result pps_llr_104435382_0 checkpointed
0 -- Suspending computation - user request
1701995878 WUProp@Home -- [cpu_sched] Preempting wu_v3_1319038646_293619_0 (left in memory)
0 PrimeGrid -- [cpu_sched] Preempting pps_llr_104435382_0 (left in memory)
0 -- Resuming computation
0 WUProp@Home -- [cpu_sched] Resuming wu_v3_1319038646_293619_0
896081961 PrimeGrid -- [cpu_sched] Resuming pps_llr_104435382_0
1701474162 PrimeGrid -- task pps_llr_104435382_0 resumed by user
0 WUProp@Home -- [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
0 PrimeGrid -- [checkpoint] result pps_llr_104435382_0 checkpointed


Some failed attempt to reconnect to the client?

5 minutes later all was back to normal. Here comes the correct log - matching lines:

1 27.10.11 10:45 Starting BOINC client version 6.13.9 for windows_x86_64
2 27.10.11 10:45 This a development version of BOINC and may not function properly
3 27.10.11 10:45 Config: report completed tasks immediately
4 27.10.11 10:45 Config: GUI RPC allowed from:
5 27.10.11 10:45 Config:   pavilon6
6 27.10.11 10:45 Config:   vetroplach
7 27.10.11 10:45 Config:   localhost
[.....]
1181 WUProp@Home 29.10.11 00:12 [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
1182 PrimeGrid 29.10.11 00:12 [checkpoint] result pps_llr_104435382_0 checkpointed
1183 WUProp@Home 29.10.11 00:13 [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
1184 PrimeGrid 29.10.11 00:13 [checkpoint] result pps_llr_104435382_0 checkpointed
1185 surveill@home 29.10.11 00:13 Computation for task wu_1319568902_127067_0 finished
1186 surveill@home 29.10.11 00:13 Started upload of wu_1319568902_127067_0_data
1187 surveill@home 29.10.11 00:13 Started upload of wu_1319568902_127067_0_uris
1188 surveill@home 29.10.11 00:13 Finished upload of wu_1319568902_127067_0_data
1189 surveill@home 29.10.11 00:13 Finished upload of wu_1319568902_127067_0_uris
1190 surveill@home 29.10.11 00:13 Sending scheduler request: To report completed tasks.
1191 surveill@home 29.10.11 00:13 Reporting 1 completed tasks, requesting new tasks for CPU
1192 surveill@home 29.10.11 00:13 Scheduler request completed: got 0 new tasks
1193 surveill@home 29.10.11 00:13 Not sending work - max number of probes in your network exceeded
1194 WUProp@Home 29.10.11 00:14 [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
1195 PrimeGrid 29.10.11 00:14 [checkpoint] result pps_llr_104435382_0 checkpointed
1196 WUProp@Home 29.10.11 00:15 [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
1197 PrimeGrid 29.10.11 00:15 [checkpoint] result pps_llr_104435382_0 checkpointed
1198 29.10.11 00:15 Suspending computation - user request
1199 WUProp@Home 29.10.11 00:15 [cpu_sched] Preempting wu_v3_1319038646_293619_0 (left in memory)
1200 PrimeGrid 29.10.11 00:15 [cpu_sched] Preempting pps_llr_104435382_0 (left in memory)
1201 29.10.11 00:15 Resuming computation
1202 WUProp@Home 29.10.11 00:15 [cpu_sched] Resuming wu_v3_1319038646_293619_0
1203 PrimeGrid 29.10.11 00:15 [cpu_sched] Resuming pps_llr_104435382_0
1204 PrimeGrid 29.10.11 00:15 task pps_llr_104435382_0 resumed by user
1205 WUProp@Home 29.10.11 00:16 [checkpoint] result wu_v3_1319038646_293619_0 checkpointed
1206 PrimeGrid 29.10.11 00:16 [checkpoint] result pps_llr_104435382_0 checkpointed


While BT was in this weird state, everything seemed to be fine in BOINC Manager.
Title: Re: BT 1.25
Post by: fred on October 29, 2011, 09:04:14 AM
0         --   Starting BOINC client version 6.13.9 for windows_x86_64   
0         --   This a development version of BOINC and may not function properly  :-X

You may want to try 6.13.10.

Almost certainly a client problem.

New is 0 also.
Title: Re: BT 1.25
Post by: Pepo on October 29, 2011, 06:51:24 PM
Quote from: fred on October 29, 2011, 09:04:14 AM
You may want to try 6.13.10. Almost certainly a client problem.
If indeed, then it is almost certainly not fixed enough in 6.13.10 (http://boinc.berkeley.edu/trac/changeset/24497/trunk/boinc/client/client_types.cpp).

Or did the client send such havoc to BoincTasks? It happened once more. BT was pretty CPU-busy at the moment (well, for a couple of minutes, until I stopped the BOINC client). I'm now trying 6.13.10 - with the PrimeGrid tests and disabling GPU, I've fulfilled 6.13.9's prerequisites for resetting my few local projects  ;D
Title: Re: BT 1.25
Post by: idahofisherman on November 03, 2011, 08:35:00 PM
I have been having trouble with BT using too much CPU (98%) since about BT 1.23. I finally changed the history to 120 seconds from 15. This seems to have fixed the CPU problem, but now I have more "Missed" instead of "Report OK".

Is there a time limit for missed?

Title: Re: BT 1.25
Post by: Pepo on November 03, 2011, 09:36:47 PM
Quote from: idahofisherman on November 03, 2011, 08:35:00 PM
I finally changed the history to  120 seconds from 15.  [...] but now I have increased "Missed" instead of Report OK. 
Is there a time limit for missed?
From my experience with nCi tasks, they are often completely reported (with already a new task downloaded and started) 4-7 seconds after finishing previous task. With 4+10 sec. setting, I have 70% OK and 30% OK* + Missed report states, with 2-5 sec. I get at least 90% OK + OK* and just an occasional complete miss.
Title: client shutdown problem
Post by: 3216842 on November 04, 2011, 11:56:04 AM
Hello to all, and congratulations on this great software.
I have a small, annoying problem with BT: when stopping the BOINC client via the menu, BT freezes briefly, and after ~15-20 sec. BT shows the error message "The BOINC client couldn't be shut down". A further ~10 sec. later the client stops, and after it all running tasks/WUs (in this order) :-( .
I have seen this behavior in BT 1.23/.24/.25. Running XP Pro SP3, BOINC 6.12.34; BOINC Manager is not running.

Log:
04 November 2011 - 12:18:00 Shut down BoincTasks ---- The BOINC client is shutting down
04 November 2011 - 12:18:30 Shut down BoincTasks ---- The BOINC client has shut down
04 November 2011 - 12:18:30 Shut down BoincTasks ---- Der BOINC-Client konnte nicht beendet werden
04 November 2011 - 12:18:30 Connect ---- The connection was lost, because the client stopped

Hope for a fix.

__W__
Title: Re: BT 1.25
Post by: fred on November 04, 2011, 11:57:57 AM
Quote from: idahofisherman on November 03, 2011, 08:35:00 PM
Is there a time limit for missed?
Missed means the task was not seen in an upload or reported state.
It depends on the project. The history only works properly on projects with a dependable time to completion.
If that's more than 50% off it may cause a missed history entry.
A project going from 60 to 0 in one second will probably produce a missed.
Setting 120 seconds means that the tasks shouldn't be gone within 120 seconds.
The sampling time is: time to completion / 2.
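
A minimal sketch of why an unreliable time to completion produces a "Missed", assuming the next sample is scheduled at (time to completion / 2) and capped by the history setting. The function names and the cap are illustrative assumptions, not BT's actual code.

```python
def sample_delay(estimated_time_left_s, max_interval_s=120.0):
    """Seconds until the next history sample.

    Follows the rule quoted above (time to completion / 2).
    Capping at max_interval_s is an assumption for this sketch.
    """
    return min(max(estimated_time_left_s, 0.0) / 2.0, max_interval_s)


def would_be_missed(estimated_time_left_s, actual_time_until_gone_s,
                    max_interval_s=120.0):
    """True if the task disappears before the next sample is taken."""
    return actual_time_until_gone_s < sample_delay(estimated_time_left_s,
                                                   max_interval_s)


if __name__ == "__main__":
    # The project claims 60 s left, but the task is gone after 1 s.
    print(would_be_missed(60.0, 1.0))
    # The estimate is dependable: the task is still there at the next sample.
    print(would_be_missed(60.0, 45.0))
```

With a dependable estimate the task is still present when the next sample is taken; when the estimate drops from 60 to 0 in a second, the task has already gone by then.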

On the other hand, a 15 seconds setting shouldn't give you  98% load.
Make sure BT is closed
Title: Re: client shutdown problem
Post by: fred on November 04, 2011, 12:30:57 PM
Quote from: 3216842 on November 04, 2011, 11:56:04 AM
Hello to all, and congratulations on this great software.
I have a small, annoying problem with BT: when stopping the BOINC client via the menu, BT freezes briefly, and after ~15-20 sec. BT shows the error message "The BOINC client couldn't be shut down". A further ~10 sec. later the client stops, and after it all running tasks/WUs (in this order) :-( .
I have seen this behavior in BT 1.23/.24/.25. Running XP Pro SP3, BOINC 6.12.34; BOINC Manager is not running.

Log:
04 November 2011 - 12:18:00 Shut down BoincTasks ---- The BOINC client is shutting down
04 November 2011 - 12:18:30 Shut down BoincTasks ---- The BOINC client has shut down
04 November 2011 - 12:18:30 Shut down BoincTasks ---- Der BOINC-Client konnte nicht beendet werden
04 November 2011 - 12:18:30 Connect ---- The connection was lost, because the client stopped

Hope for a fix.

__W__
I will fix the freeze on the main program, while it's waiting for the client to shut down.
"Der BOINC-Client konnte nicht beendet werden" ("The BOINC client couldn't be shut down") means the client is still running even though it was ordered to shut down.
The reason may be a running BOINC Manager that restarted the client. Or a failure on the client itself to properly shut down.
This is by no means a BT problem.
Title: Re: client shutdown problem
Post by: Pepo on November 04, 2011, 01:32:58 PM
Quote from: 3216842 on November 04, 2011, 11:56:04 AM
When stopping the BOINC client via the menu, BT freezes briefly, and after ~15-20 sec. BT shows the error message "The BOINC client couldn't be shut down". A further ~10 sec. later the client stops, and after it all running tasks/WUs (in this order) :-(
You could also check the BOINC event log to see whether it took the client some additional unexpected time to stop the task applications. (But I've already complained to D.A. that the client is not fair and verbose enough on this.)

Quote from: fred on November 04, 2011, 12:30:57 PM
The reason may be [...] a failure on the client itself to properly shut down.
I do not remember anymore how responsive the client was while it was waiting for unresponsive tasks to finish, until it finally killed them. This could have been the 20-30 sec. communication delay.
Title: Re: client shutdown problem
Post by: fred on November 04, 2011, 03:02:51 PM
Quote from: Pepo on November 04, 2011, 01:32:58 PM
I do not remember anymore how responsive the client was while it was waiting for unresponsive tasks to finish, until it finally killed them. This could have been the 20-30 sec. communication delay.
That could be it, I raised the timeout to 1 minute.
As it is in a separate thread now, a wait isn't an issue.
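A rough sketch of the idea (wait for the client in a separate thread, with a 1 minute timeout, so the main program does not freeze). This is only an illustration in Python, not BT's code; `wait_for_shutdown`, `is_running` and `on_done` are hypothetical names.

```python
import threading
import time


def wait_for_shutdown(is_running, on_done, timeout_s=60.0, poll_s=0.5):
    """Poll is_running() in a background thread until the client has stopped
    or timeout_s has elapsed, then call on_done(stopped).

    is_running: hypothetical callable returning True while the client runs.
    on_done:    hypothetical callback receiving True if the client stopped.
    """
    def worker():
        deadline = time.monotonic() + timeout_s
        stopped = not is_running()
        while not stopped and time.monotonic() < deadline:
            time.sleep(poll_s)
            stopped = not is_running()
        on_done(stopped)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread  # the caller stays responsive while the worker waits


if __name__ == "__main__":
    started = time.monotonic()
    # Pretend the client needs 2 seconds to shut down.
    t = wait_for_shutdown(lambda: time.monotonic() - started < 2.0,
                          lambda ok: print("client stopped:", ok),
                          timeout_s=60.0, poll_s=0.2)
    t.join()
```

Because the wait happens off the main thread, a longer timeout only delays the error message, not the user interface.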
Title: Re: client shutdown problem
Post by: 3216842 on November 04, 2011, 10:07:47 PM
Quote from: fred on November 04, 2011, 12:30:57 PM
I will fix the freeze on the main program, while it's waiting for the client to shut down.
"Der BOINC-Client konnte nicht beendet werden" ("The BOINC client couldn't be shut down") means the client is still running even though it was ordered to shut down.
The reason may be a running BOINC Manager that restarted the client. Or a failure on the client itself to properly shut down.
This is by no means a BT problem.
To clarify some points, I have done some testing:
- BOINC Manager is not running
- the BOINC client shuts down much later than expected, with the error message shown as a popup and in the logs, and no other related log entries are listed
- the shutdown problem shows up even if no WUs/tasks are running (all tasks halted)
- shutting down the BOINC client from BOINC Manager is no problem and works fast (whether or not any tasks are running), so I don't think this is a problem with the client shutting down ?!
- no problems with shutting down the client with BT versions before .23

Happy debugging
__W__

Title: Re: client shutdown problem
Post by: Pepo on November 04, 2011, 11:26:16 PM
Quote from: 3216842 on November 04, 2011, 10:07:47 PM
- shutting down the BOINC client from BOINC Manager is no problem and works fast (whether or not any tasks are running), so I don't think this is a problem with the client shutting down ?!
Maybe a side note - recent BOINC versions appear to shut down very fast - in reality, as soon as the Manager gets a notification from the client that "it understood it should shut down", it simply disappears into nirvana. The client then slowly tries to finish the tasks, while no one is aware or bothers anymore :-X :-\
Title: Re: client shutdown problem
Post by: 3216842 on November 05, 2011, 01:22:02 AM
Quote from: Pepo on November 04, 2011, 11:26:16 PM
Maybe a side note - recent BOINC versions appear to shut down very fast - in reality, as soon as the Manager gets a notification from the client that "it understood it should shut down", it simply disappears into nirvana. The client then slowly tries to finish the tasks, while no one is aware or bothers anymore :-X :-\
The shutdown via BOINC Manager is much faster than the shutdown via BT.
I know about Windows "fooling" around with running/not running programs, and besides it's only XP Pro, so I cross-checked it with some system tools like Sysinternals Process Explorer and some other tools  ;D .

__W__
Title: Re: client shutdown problem
Post by: fred on November 05, 2011, 09:49:49 AM
Quote from: 3216842 on November 05, 2011, 01:22:02 AM
The shutdown via BOINC Manager is much faster than the shutdown via BT.
Let's see how you like 1.26.
Title: only cosmetic
Post by: 3216842 on November 05, 2011, 11:52:22 AM
Quote from: fred on November 05, 2011, 09:49:49 AM
Let's see how you like 1.26.
I am sure that I will like it 8) .
Just a little cosmetic thing: the "WWW" rollout menu is twice as long as it should be, and at the end of the "Extras" menu there is a surplus separator line :o .
;D ;D ;D Uhhh, what a horror, this is looking bad ;D ;D ;D
:P but much better than the BOINC Manager :P

__W__
Title: Re: BT 1.25
Post by: fred on November 05, 2011, 03:05:02 PM
Quote from: idahofisherman on November 03, 2011, 08:35:00 PM
This seems to have fixed the cpu problem, but now I have increased "Missed" instead of Report OK. 
Which project(s) are causing these problems?
Title: Re: BT 1.25
Post by: fred on November 06, 2011, 06:35:16 PM
This is an example:

11777   PrimeGrid   06-11-2011 19:16   Computation for task llrCUL_104726483_4 finished   
11778   PrimeGrid   06-11-2011 19:16   Restarting task llrCUL_105073299_1 using llrCUL version 609   
11779   PrimeGrid   06-11-2011 19:16   Started upload of llrCUL_104726483_4_0   
11780   PrimeGrid   06-11-2011 19:16   Finished upload of llrCUL_104726483_4_0   
11781   PrimeGrid   06-11-2011 19:16   Sending scheduler request: To report completed tasks.   
11782   PrimeGrid   06-11-2011 19:16   Reporting 1 completed tasks, not requesting new tasks   
11783   PrimeGrid   06-11-2011 19:16   Scheduler request completed   

PrimeGrid   6.09 Cullen (LLR)   llrCUL_104726483_4   03d,15:44:09 (02d,04:17:26)   06-11-2011 19:17   06-11-2011 19:17      Missed

Missed by 1 second  :-X

As you can see, the upload->ready->gone is within 1 second. So this is quite impossible to catch, at least not without extreme overhead.
Title: Re: BT 1.25
Post by: Pepo on November 06, 2011, 11:11:35 PM
Quote from: fred on November 06, 2011, 06:35:16 PM
This is an example:

11777   PrimeGrid   06-11-2011 19:16   Computation for task llrCUL_104726483_4 finished      
11779   PrimeGrid   06-11-2011 19:16   Started upload of llrCUL_104726483_4_0   
11781   PrimeGrid   06-11-2011 19:16   Sending scheduler request: To report completed tasks.   
11783   PrimeGrid   06-11-2011 19:16   Scheduler request completed   

PrimeGrid   6.09 Cullen (LLR)   llrCUL_104726483_4   03d,15:44:09 (02d,04:17:26)   06-11-2011 19:17   06-11-2011 19:17      Missed

Missed by 1 second  :-X

As you can see, the upload->ready->gone is within 1 second. So this is quite impossible to catch, at least not without extreme overhead.
Unfortunately, Fred, I can personally not see it - your example is not that obvious, it is missing ":seconds" in the time values  ;)
(The same happens to me when posting logs - I often have to turn seconds on and repost them.)


BTW, if the upload phase started a minute after the task finished, what would the History guess when looking at the task some 10 seconds after it finished - that it is already in the Uploading phase? (I would guess so.)
Title: Re: BT 1.25
Post by: fred on November 07, 2011, 08:41:56 AM
Quote from: Pepo on November 06, 2011, 11:11:35 PM
Unfortunately, Fred, I can personally not see it - your example is not that obvious, it is missing ":seconds" in the time values  ;)
I will make 2 changes for 1.26.
1) If a task's running state has changed (running -> upload), the next history fetch will be without any delay. Now it is after the minimum cycle. Thus gaining 4 seconds.
2) A check "Time left not very accurate". In this mode the interval will change from timeleft / 2 to timeleft / 4. This shouldn't add too much extra overhead, as only the running tasks are read back and not everything.
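The two changes can be sketched roughly like this. It is only an illustration: the function name, its parameters and the 4 second minimum cycle are assumptions, not the actual BT implementation.

```python
def next_history_fetch_delay(time_left_s, state_changed=False,
                             time_left_inaccurate=False, min_cycle_s=4.0):
    """Seconds to wait before the next history fetch of the running tasks.

    state_changed:        a task went e.g. running -> upload (change 1).
    time_left_inaccurate: the "Time left not very accurate" check (change 2).
    min_cycle_s:          assumed minimum cycle of 4 seconds.
    """
    if state_changed:
        return 0.0  # change 1: fetch again without any delay
    divisor = 4.0 if time_left_inaccurate else 2.0  # change 2
    return max(max(time_left_s, 0.0) / divisor, min_cycle_s)


if __name__ == "__main__":
    print(next_history_fetch_delay(60.0))                             # timeleft / 2
    print(next_history_fetch_delay(60.0, time_left_inaccurate=True))  # timeleft / 4
    print(next_history_fetch_delay(60.0, state_changed=True))         # no delay
```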
Title: Re: BT 1.25
Post by: Pepo on November 07, 2011, 10:50:09 AM
Yes, both could help to catch them. Just these Surveills are monitoring-unfriendly ;D - their last second ETA (at some 87%) is often more than 2 minutes :-X there are apparently no tricks possible.



Just seen something different regarding timing in Messages:
5298 PrimeGrid 07.11.11 10:48:18 [checkpoint] result LLR_SGS_107255402_0 checkpointed
5301 PrimeGrid 07.11.11 10:49:07 Computation for task LLR_SGS_107255402_0 finished
5302 PrimeGrid 07.11.11 10:49:07 Starting task LLR_SGS_106902280_0 using llrTPS version 609
5303 PrimeGrid 07.11.11 10:49:08 Started upload of LLR_SGS_107255402_0_0
5304 PrimeGrid 07.11.11 10:49:10 Finished upload of LLR_SGS_107255402_0_0
5305 PrimeGrid 07.11.11 10:49:15 Sending scheduler request: To report completed tasks.
5307 PrimeGrid 07.11.11 10:49:18 Scheduler request completed: got 0 new tasks

and no more notes on LLR_SGS_107255402_0 - completed.
But the History says
PrimeGrid LLR_SGS_107255402_0 00:31:58 (00:26:39) 07.11.11 10:49:14 07.11.11 10:51:19 Reported: OK
where the times are Elapsed / Finished / Reported: it is shown as reported 2 minutes after the scheduler request finished in the event log.

Actually, when I look now at tasks just being finished, if the History notices the tasks' "Ready to report" state, they all get a "Reported" timestamp in History 1-2 minutes after finishing their scheduler report (although the tasks disappear immediately from the Tasks tab). I've also seen the transition from "Sending" to "Ready to report" happening with a similar delay of more than 1 minute.
5387 PrimeGrid 07.11.11 11:11:08 Computation for task LLR_SGS_106902280_0 finished
5390 PrimeGrid 07.11.11 11:11:10 Finished upload of LLR_SGS_106902280_0_0
5393 PrimeGrid 07.11.11 11:11:15 Scheduler request completed: got 0 new tasks
PrimeGrid LLR_SGS_106902280_0 00:22:00 (00:21:09) 07.11.11 11:11:14 07.11.11 11:12:55 Reported: OK

5475 surveill@home 07.11.11 11:29:14 Computation for task wu_1320320103_162664_0 finished
5476 surveill@home 07.11.11 11:29:15 Started upload of wu_1320320103_162664_0_data
5479 surveill@home 07.11.11 11:29:17 Finished upload of wu_1320320103_162664_0_data
5480 surveill@home 07.11.11 11:29:21 Sending scheduler request: To report completed tasks.
5482 surveill@home 07.11.11 11:29:22 Scheduler request completed: got 0 new tasks
surveill@home wu_1320320103_162664_0 00:16:13 (00:00:02) 07.11.11 11:29:21 07.11.11 11:31:29 Reported: OK

On the machine, History sampling is set to 4-10 seconds. Why the delay? If some task gets "missed", its timestamp is immediately (i.e. 1-5 sec.) after noticing it was gone.
Title: Re: BT 1.25
Post by: fred on November 07, 2011, 11:39:24 AM
Quote from: Pepo on November 07, 2011, 10:50:09 AM
On the machine, History sampling is set to 4-10 seconds. Why the delay? If some task gets "missed", its timestamp is immediately (i.e. 1-5 sec.) after noticing it was gone.
Normally only the running tasks are fetched. So a state from Uploading -> Ready is not noticed.
Once every 120 seconds a full fetch is performed (very resource intensive, imagine a couple of thousand tasks). That's why there is a max 2 minute delay.
A missed shows up in the full fetch, so the time stamp is immediate.

But I did some tweaking and I think it should perform better in 1.26, but BT can't do the impossible.
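
To illustrate where the "max 2 minute" figure comes from, here is a small sketch. It assumes full fetches happen at fixed multiples of 120 seconds and that a state change such as Uploading -> Ready is only noticed by a full fetch; the names are made up for this example.

```python
import math

FULL_FETCH_PERIOD_S = 120  # from the post: a full fetch once every 120 seconds


def next_full_fetch(event_s, period_s=FULL_FETCH_PERIOD_S):
    """Time of the first full fetch at or after event_s.

    Assumes full fetches at 0, period_s, 2 * period_s, ... seconds.
    """
    return math.ceil(event_s / period_s) * period_s


def detection_delay(event_s, period_s=FULL_FETCH_PERIOD_S):
    """Seconds between a state change and the full fetch that notices it."""
    return next_full_fetch(event_s, period_s) - event_s


if __name__ == "__main__":
    for event in (1, 60, 119, 120, 125):
        print(event, detection_delay(event))
```

The delay is always below one full-fetch period, so a state change right after a full fetch waits almost the whole 2 minutes.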
Title: Re: BT 1.25
Post by: fred on November 08, 2011, 09:02:44 AM
Some good news from David:

I will implement your idea of retaining reported tasks for 1 minute;

So the new BOINC client should be better at handling the History without misses.
Title: Re: BT 1.25
Post by: Pepo on November 08, 2011, 09:51:46 AM
Good initiative!

Let's see how it will cope with 1-task nCi projects, which immediately deliver and start a new task upon reporting the previous one. If the old task just stays 1 more minute in a "Reported" state, all will be just fine.
Title: Re: BT 1.25
Post by: nanoprobe on December 03, 2011, 03:00:02 PM
I was just alerted by BoincTasks to upgrade to version 1.25. I upgraded as usual but this version will not work for me. When I try to start it I get nothing but an empty white screen. Tried uninstall and reinstall, same problem. Reinstalled version 1.16 and it works fine.
Win XP Pro SP3. BOINC version 6.10.58, 32-bit.
Title: Re: BT 1.25
Post by: Beyond on December 03, 2011, 03:28:07 PM
I'd suggest 1.21 for the most stable version or 1.28 if you're a bit more adventurous.  Everything between was troublesome here, including the symptoms you describe.  To use 1.28 I have to disable the long term history (new feature).  It also does not handle machines with flaky connections (such as weak WiFi) as well as 1.21.  1.28 often has to be restarted (and sometimes crashes) when machines lose connection for any reason (WiFi, reboot, crash, etc.).  Fred, do you want more dmp files?  I sent one a while ago but haven't heard if you found anything interesting.
Title: Re: BT 1.25
Post by: fred on December 03, 2011, 04:11:13 PM
Quote from: Beyond on December 03, 2011, 03:28:07 PM
Fred, do you want more dmp files?  I sent one a while ago but haven't heard if you found anything interesting.
I do.  ;D
Title: Re: BT 1.25
Post by: nanoprobe on December 03, 2011, 04:29:24 PM
Quote from: Beyond on December 03, 2011, 03:28:07 PM
I'd suggest 1.21 for the most stable version or 1.28 if you're a bit more adventurous.  Everything between was troublesome here, including the symptoms you describe.
Will do. Thanks
Title: Re: BT 1.25
Post by: Beyond on December 04, 2011, 01:58:25 AM
Quote from: fred on December 03, 2011, 04:11:13 PM
Quote from: Beyond on December 03, 2011, 03:28:07 PM
Fred, do you want more dmp files?  I sent one a while ago but haven't heard if you found anything interesting.
I do.  ;D
Four of the little dears should be there :o
Title: Re: BT 1.25
Post by: fred on December 04, 2011, 10:12:10 AM
Quote from: Beyond on December 04, 2011, 01:58:25 AM
Four of the little dears should be there :o
The problem is I never got them or any previous....
Probably blocked along the way.
Title: Re: BT 1.25
Post by: Beyond on December 04, 2011, 11:38:44 PM
Quote from: fred on December 04, 2011, 10:12:10 AM
Quote from: Beyond on December 04, 2011, 01:58:25 AM
Four of the little dears should be there :o
The problem is I never got them or any previous....
Probably blocked along the way.
They still show in your upload directory:
Title: Re: BT 1.25
Post by: fred on December 05, 2011, 08:09:06 AM
Quote from: Beyond on December 04, 2011, 11:38:44 PM
Quote from: fred on December 04, 2011, 10:12:10 AM
Quote from: Beyond on December 04, 2011, 01:58:25 AM
Four of the little dears should be there :o
The problem is I never got them or any previous....
Probably blocked along the way.
They still show in your upload directory:
Oops, forgot that one.
They all point to the same problem, but now the question is what causes it.
Title: Re: BT 1.25
Post by: fred on December 05, 2011, 10:43:21 AM
Move over to V 1.29.

What you can do is enable memory integrity checking.

In log.xml set <heap_check> to 1
Title: Re: BT 1.25
Post by: Beyond on December 05, 2011, 04:39:12 PM
moved post to the 1.29 beta thread...