BT 0.62 - Rules / Temperature

Started by tlsi2000, June 28, 2010, 03:07:02 PM

tlsi2000

Temperature Rule

This is currently one value.
And I suspect that it checks the CPU.
Is there a way to check the CPU and GPU temperatures separately?

This is because some recent SETI tasks have difficulty switching into GPU mode after initialization and keep CPU usage high (and with it the CPU temperature), when the CPU should be released and the GPU should take over.

Having this breakdown would let me trap this occurrence.

Thanks.

fred

Both temperatures can be used.
When the task that matches the rule is a GPU task, the GPU temperature is checked.
You can add the rule from the Tasks view by selecting the right GPU task. This is always the best way to add a rule, as it rules out typos.

But you can just as well use the Use type rule, for example Use = 0.04 Cpu + 1 Gpu.

You could also use the CPU% rule, as the CPU% should be quite high.
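To illustrate the point about which temperature a rule looks at, here is a minimal Python sketch. It is not BoincTasks code: the `Task` class, the `is_gpu_task` helper and the substring test on the use string are assumptions made only for this example. It shows the idea that the temperature is picked from the task's designated type.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task record; 'use' mimics a string like '0.04 Cpu + 1 Gpu'."""
    name: str
    use: str


def is_gpu_task(task: Task) -> bool:
    # Assumption: a task counts as a GPU task if its use string mentions a GPU.
    return "gpu" in task.use.lower()


def temperature_for_rule(task: Task, cpu_temp_c: float, gpu_temp_c: float) -> float:
    """Pick the temperature a rule would compare against for this task.

    The choice depends on the designated task type, not on what the task
    happens to be doing at the moment.
    """
    return gpu_temp_c if is_gpu_task(task) else cpu_temp_c


gpu_task = Task("SETI GPU task", "0.04 Cpu + 1 Gpu")
cpu_task = Task("SETI CPU task", "1 Cpu")

print(temperature_for_rule(gpu_task, cpu_temp_c=77.0, gpu_temp_c=42.0))  # 42.0
print(temperature_for_rule(cpu_task, cpu_temp_c=77.0, gpu_temp_c=42.0))  # 77.0
```

So a GPU task stuck in a CPU-heavy phase is still judged by the GPU temperature in this model.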

JStateson

#2
I can't get the rule to work and am not sure what I am doing wrong. Collatz has been running for 3 days (normally only 30 minutes) and should have been suspended, as the temp was under the rule limit.

Turning logging on or off seems to show no debug activity other than throttle messages. I expected to see the temp being measured, or messages to the effect that the rule was enabled, etc.

I downloaded .63 but have the same problem. Also, the checkbox state is not being saved after a rule edit.

Is there a wildcard I can use so I don't have to spell the project name exactly? What about a wildcard for the app? For some projects the app changes, as I recall.

fred

I just checked the rule, it should work.
Did you add the rule from the Tasks view, by right-clicking the task and selecting "Add rule"? This eliminates typos.
I will add a warning message to the log if no task is found for a rule.
To check what's happening, use the rule log, not the regular log.


JStateson

#4
I can't make it work. I uninstalled and then re-installed .63 on a Vista 64 system with dual monitors. I set rules for < and >, and for GPU and CPU. Nothing showed up in the rules log. If I set the time to 10 seconds, something should have happened after 10 seconds.

I thought the [ x ] box (in the rule creation/edit/delete box) was for enabling rules. It is actually for selecting the one to edit (or delete). I think a radio button would be more appropriate for that dialog box than a checkbox. Ideally one should highlight the line item to edit, and the [ x ] should show which rules are enabled or disabled.

Unaccountably, the rules log dialog box is always empty.
---

Another problem

The gadget tooltip is always blank. In addition, when the task display under the gadget setup dialog box is updated, the cursor is moved away from the text input box and is not restored. It takes several tries before a value can be entered in, for example, the time field. I have to either be very quick to enter it, or use the mouse to position the cursor before each digit is typed.


fred

Quote from: BeemerBiker on July 01, 2010, 08:23:33 AM
1) I can't make it work. I uninstalled and then re-installed .63 on a Vista 64 system with dual monitors. I set rules for < and >, and for GPU and CPU. Nothing showed up in the rules log. If I set the time to 10 seconds, something should have happened after 10 seconds.

2) I thought the [ x ] box (in the rule creation/edit/delete box) was for enabling rules. It is actually for selecting the one to edit (or delete). I think a radio button would be more appropriate for that dialog box than a checkbox. Ideally one should highlight the line item to edit, and the [ x ] should show which rules are enabled or disabled.

Unaccountably, the rules log dialog box is always empty.
---

Another problem

3) The gadget tooltip is always blank. In addition, when the task display under the gadget setup dialog box is updated, the cursor is moved away from the text input box and is not restored. It takes several tries before a value can be entered in, for example, the time field. I have to either be very quick to enter it, or use the mouse to position the cursor before each digit is typed.
1) Do you see the rule color appear in the Tasks view?
The rule log should at least show the rule entry after you press OK in the Edit dialog. It will take up to 15 seconds for the rule to show up.
Next, all rules will be reset, and a trigger will take 10 - 25 seconds.
To check things, change the rule to Elapsed time > 10 to see whether the other entries are valid. Also remove the Event and check "Show logging".
A new or changed rule is not active while the Rule editor is still open.
2) I made a note.
3) OK, the focus moves from the settings dialog to the Gadget, thus removing the cursor. I will change that.
About the tooltip: how many computers are connected? The 9 shown?

JStateson

#6
I tried everything you suggested but it is still a no-go. I am posting links to the images, as they take up too much space in the display box. The rules were created using the add-rule mechanism you suggested.

(1) No computers are shown in the .63 gadget, whereas the .62 did show 9 computers.

(2) I set the elapsed rule to 1 hour, 2 minutes, 3 seconds for tasks that were all already over 2 hours. Time was set to 15 seconds. I also set a temp rule for > 75 C and 10 seconds, and all CPUs were >= 77 C. Nothing happened, neither in the display nor in the log. The project was not suspended, nor was there an entry in the rules log as shown here. There was no entry in the rules log whether the x was shown in the checkbox or not.

(3) The state of the Show logging checkbox is being toggled off after the Rule editor is refreshed and dismissed.  To observe the problem do the following exactly:
- Add a rule and put an x in the Show logging [ ] then click on OK to dismiss the box
- put a check in the rule just created and click on edit
- observe that the checkbox is still checked as one would expect
- Click on OK to dismiss
- put a check in the rule just edited and click on edit again
- observe that the Show logging checkbox is no longer checked

(4) The state of the "BoincTasks Settings::Rules::Computer" checkbox is not preserved after the "Rule editor " dialog box is dismissed.  This is observed when running through step (3) above.

fred

1) The gadget is coupled to the history, which is on by default.
2) Since you see no computers in the gadget, the rules don't work. Since there is no entry in the rules log, the history is not running.
3) I forgot to set something.
4) As I will change that anyway...

JStateson

Thanks - I was not aware that history had to be enabled.  Looks like it is working.  The rules log shows the rule I just added and the gadget has stuff in it :-)

fred

I suggest changing the rule and setting the time to a couple of minutes.
Otherwise, once the card has cooled down, the rule will suspend every following task, because the card doesn't have time to warm up again.
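The effect of the time setting can be illustrated with a small Python sketch. The warm-up curve, the 5-second reading interval and the `triggers` helper are all made up for this example and say nothing about how BoincTasks samples temperatures. It only shows why a short window fires on a card that is still warming up, while a window of a couple of minutes does not.

```python
def triggers(temps_c, limit_c, hold_samples):
    """Return True if temps_c contains hold_samples consecutive readings below limit_c."""
    run = 0
    for temp in temps_c:
        # Extend the run while below the limit, otherwise start over.
        run = run + 1 if temp < limit_c else 0
        if run >= hold_samples:
            return True
    return False


# Hypothetical warm-up after a cool-down: the card starts at 35 C, gains 1 C
# per 5-second reading, and levels off at 70 C.
warm_up = [min(35 + i, 70) for i in range(60)]

# Short window (2 readings, about 10 seconds): fires while the card is still cold.
print(triggers(warm_up, limit_c=45, hold_samples=2))   # True

# Longer window (24 readings, about 2 minutes): the card warms up in time.
print(triggers(warm_up, limit_c=45, hold_samples=24))  # False
```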

tlsi2000

A solution to my problem
for those who might need a similar solution.

I re-addressed the temperature issue with the init phase of the SETI GPU tasks.

I now know (from Fred, thx) that a rule uses the temperature based on the *designated* task type, and not on the phase the task is currently in.

[ For reference, the SETI GPU tasks seem to run a CPU-intensive phase that sets up the GPU task, and *sometimes* they do not complete this phase correctly and get stuck in a never-ending loop until manually reviewed and suspended: lost time, and lost CPU power that could be more productive. ]

Now the rule checks whether the temperature (the GPU temp) stays below a certain value for a few minutes, and suspends the task if it does.

Rule:  Temp < 45 C  for 2 minutes
[ plenty of time for the init phase to get the GPU going, and the GPU temp should rise above this value for the duration of the processing ]
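The logic of this rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not how BoincTasks implements rules: the `first_trigger` function, the 15-second sampling interval and the temperature readings are all invented for the example.

```python
from typing import Iterable, Optional, Tuple


def first_trigger(samples: Iterable[Tuple[float, float]],
                  limit_c: float = 45.0,
                  hold_s: float = 120.0) -> Optional[float]:
    """Return the time (in seconds) at which the temperature has stayed below
    limit_c for at least hold_s, or None if that never happens.

    samples is a sequence of (time_s, temp_c) readings in time order.
    """
    below_since = None
    for t, temp in samples:
        if temp < limit_c:
            if below_since is None:
                below_since = t          # start of a cold stretch
            if t - below_since >= hold_s:
                return t                 # cold for long enough: suspend
        else:
            below_since = None           # GPU is warm: reset the timer
    return None


# Hypothetical stuck task: the GPU never warms up (readings every 15 s).
stuck = [(t, 40.0) for t in range(0, 300, 15)]

# Hypothetical healthy task: 60 s of init at 40 C, then the GPU runs at 70 C.
healthy = [(t, 40.0 if t < 60 else 70.0) for t in range(0, 300, 15)]

print(first_trigger(stuck))    # 120
print(first_trigger(healthy))  # None
```

The stuck task is caught after two minutes, while the healthy task resets the timer as soon as the GPU temperature rises above the limit.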

This now works for what I wanted to catch.
It has caught several tasks already, and it is keeping the CPU/GPU system productively busy.

Thanks for the rules operations!

Thanks.