No more "Missed" or "OK *" tasks in History

Started by Pepo, October 11, 2011, 01:41:26 PM

Pepo

As of BT 1.22 (for those who enable it), the majority of finished tasks are shifted into Long-term history, which keeps the Short-term History tab and its data much more lightweight, with less overhead. Would it be possible to relax the module and let it react faster or more often, to capture finishing tasks? At the moment, even if BT is open, with the Tasks tab active and refreshing every 3 seconds, I can observe some tiny tasks finishing, then uploading, being reported and vanishing in rapid succession, but the History nearly always lists these tasks as "Missed". That's at a 3-20 second History refresh, with Smart mode on. I've now tried lowering the upper bound to 5 seconds and will watch the History tab the next time such a tiny task finishes.

But my question (maybe already asked in the past, or just not answered, or not understood by me) is this, regardless of any timing issues and Smart mode: if BT already clearly notices (and tells me through the Tasks tab) that a task is being uploaded, why can't it trigger an event in the History code, saying that this task has transitioned from the Running into the Uploading state? When I see a task waiting to be reported (in the Tasks tab), how can it be that History says BT has missed the moment?? Why can BT then tell me (in the History tab) that it has completely missed this task's Uploading+ReadyToReport states, when it already displayed the task in these states? Is it necessary to capture the tasks' states in two separate loops? One module might notify another module about a task's state, instead of letting it temporarily snoop much faster... (And miss anyway.)

:(
Peter

fred

The Tasks and History are separate.
The reason is that Tasks is normally visible for a short while only, so it is useless for the history.
History uses a slower interval of 1/2 the expected time to completion.
So tasks that behave badly and have an incorrect time to completion don't get caught.

The only way is to turn off the smart mode and use a fixed interval, or to lower the maximum interval time.
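
A minimal sketch of the interval rule described above, assuming smart mode simply polls at half the estimated time to completion and clamps the result to the configured minimum and maximum. The function name and parameters are hypothetical and are not taken from the BoincTasks source.

```python
def smart_history_interval(remaining_s: float, min_s: float, max_s: float) -> float:
    """Seconds until the next History poll (hypothetical rule, not BT code)."""
    if min_s > max_s:
        raise ValueError("min_s must not exceed max_s")
    # Poll at half the estimated time to completion ...
    half = remaining_s / 2.0
    # ... but never outside the configured bounds.
    return max(min_s, min(half, max_s))
```

With bounds of 3 and 20 seconds, a task that still reports 60 seconds remaining yields a 20 second wait. If that estimate is wrong and the task finishes, uploads and is reported within those 20 seconds, the next poll no longer sees it.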



Pepo

Quote from: fred on October 11, 2011, 04:25:03 PM
The Tasks and History are separate.
The reason is that Tasks is normally visible for a short while only, so it is useless for the history.
History uses a slower interval of 1/2 the expected time to completion.
So tasks that behave badly and have an incorrect time to completion don't get caught.

The only way is to turn off the smart mode and use a fixed interval, or to lower the maximum interval time.
The only? Why can't the Tasks part trigger an event for the History part, into its timer event queue - "Hey, Smartie, make one more snapshot just now!"
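
A rough sketch of what such a notification could look like, assuming the Tasks refresh is running and sees every task on each pass. All class, method and state names here are hypothetical illustrations of the idea, not how BoincTasks is actually structured.

```python
from typing import Callable, Dict, List, Optional, Tuple

Listener = Callable[[str, Optional[str], str], None]


class TaskStateNotifier:
    """Hypothetical hub: the Tasks side publishes, the History side subscribes."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []
        self._last_state: Dict[str, str] = {}

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def observe(self, task_name: str, state: str) -> None:
        """Called by the Tasks refresh for each task it currently sees."""
        old = self._last_state.get(task_name)
        if old == state:
            return  # no transition, nothing to announce
        self._last_state[task_name] = state
        for listener in self._listeners:
            listener(task_name, old, state)


class HistoryRecorder:
    """Hypothetical History side: takes an extra snapshot on late-stage states."""

    def __init__(self) -> None:
        self.snapshots: List[Tuple[str, str]] = []

    def on_transition(self, task_name: str, old: Optional[str], new: str) -> None:
        if new in ("Uploading", "Ready to report"):
            self.snapshots.append((task_name, new))
```

The History side would register once with `notifier.subscribe(recorder.on_transition)`; after that, every state change seen by the Tasks refresh reaches it without a second polling loop.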
Peter

fred

Quote from: Pepo on October 11, 2011, 04:58:08 PM
The only? Why can't the Tasks part trigger an event for the History part, into its timer event queue - "Hey, Smartie, make one more snapshot just now!"
The Tasks window is only shown a short while, so there is no sense in using it. And this takes additional resources.
Let's go back to the problem: how do the missed tasks behave?
Do they have a correct est. time? They go from 1 minute to 0?

Pepo

Quote from: fred on October 12, 2011, 06:24:01 AM
how do the missed tasks behave?
Do they have a correct est. time? They go from 1 minute to 0?
ATM a good candidate to be observed here is surveill@home's crawler. It is a single tiny regular nCi task with just a few bytes to be sent fast; its estimate is a bit more than 15 minutes and is kept throughout its progress. The progress% initially updates at each screen refresh, but unfortunately only approx. once a minute above 10%, which is not frequent enough in the final steps (and hurts the ETA, which is of course what hurts the History's Smart mode). So, yes, unfortunately the est. time actually rises a bit to 2 minutes and then finally falls to zero.

Much the same happens with WUProp tasks: also nCi, around 2:58 hours crunch time, rare progress% updates in the final steps, ETA falls from 1 minute to 0, fast upload + immediate report (as usual for nCi tasks).

Sometimes (especially if I shorten the upper History snooping interval bound) the history loop will catch the tasks in uploading or uploaded state and apparently shorten its update interval, then the tasks do not get missed - either OK or OK*.

SETI cuda tasks on my home machine also do not update their progress at each screen refresh, but their upload takes substantially longer, so they do not get missed. But they are often nearly late (mostly because other users are logged in and the GPU calculation is stopped, or I am logged in and the calculation is heavily throttled (fan speed)) and tend to report immediately, so mostly they finish as OK*.
Peter

fred

The only way to catch tasks that go uncontrolled from 1 to 0 minutes is a quick refresh rate.

Try e.g. smart mode: Min update 2, Max update 4.
Or 1,3
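
To see why a shorter interval helps, here is a rough estimate, assuming a finished task stays visible (uploading or ready to report) for `visible_s` seconds and that polls are evenly spaced with a random phase relative to the task. Both are simplifying assumptions for illustration, not measured BoincTasks behaviour, and the function name is hypothetical.

```python
def miss_probability(visible_s: float, poll_interval_s: float) -> float:
    """Chance that no poll lands inside the task's visible window."""
    if visible_s < 0 or poll_interval_s <= 0:
        raise ValueError("visible_s must be >= 0 and poll_interval_s > 0")
    # A window at least as long as the interval always contains a poll;
    # otherwise the chance of a hit is the fraction visible_s / poll_interval_s.
    return max(0.0, 1.0 - visible_s / poll_interval_s)
```

Under these assumptions a task visible for 5 seconds is missed three times out of four at a 20 second interval, and never at a 4 second interval.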

Pepo

I tried 3-6 sec nearly a day ago; the ratio went from 70% missed / 30% OK*+OK to 80% OK+OK* / 20% missed. That's understandable.

I just have to observe the CPU usage longer-term. But I currently have no remote hosts to try and the network transfer usage is my concern.

But I also understand that the Smart mode cannot be smart enough if the task does not cooperate.
Peter

fred

Quote from: Pepo on October 12, 2011, 08:57:53 AM
ATM a good candidate to be observed here is surveill@home's crawler
The project is closed right now. So no way of testing it.