Topics - John C

#1
Wish List / New Action Request
July 18, 2010, 03:58:33 AM
Fred -
I can't begin to express what a benefit these rules have been.  This truly is a game changer in terms of automating BOINC.

OK, after playing with it for several weeks, there are a few things I'd like to propose. 

The first (and most important to me) is a new "pause & resume" action.

Especially for DNETC, tasks will get "stuck", but they are reset when you suspend and then resume them, so they will usually finish successfully.  Right now, I'm just suspending them when they get to a time mark and then manually restarting them later.  I'd love to have a "pause & resume" action within the rules so that I could automate that.  The key here is that I need to be able to have 2 rules with different thresholds, so that when a task gets to 8:00 it will reset (and it's fine if it does that every minute), but once it gets to 12:00, I need the second rule to engage and suspend (or eventually abort) it entirely.  Are rules processed in the order they are entered?  If I entered the 12:00 rule first and then the 8:00 rule, would a task at 9:00 fail the 12:00 rule and then be processed by the 8:00 rule?  Or would both rules be processed, so that the 8:00 rule would resume a task whose suspension was intended to be permanent?  If that were the case, I'd really need an abort action to be added now for this to work.
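One way to reason about the two-threshold question is a "highest matching threshold wins" policy, which makes the entry order irrelevant. The Python sketch below is purely illustrative and is not how BoincTasks processes rules: the `Rule` class, the `pick_action` function and the action names are all hypothetical, and the thresholds are plain numbers in the same (unspecified) unit as the elapsed time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    threshold: float  # elapsed time above which the rule matches (unit is an assumption)
    action: str       # hypothetical action name, not a real BoincTasks identifier


def pick_action(rules: List[Rule], elapsed: float) -> Optional[str]:
    """Return the action of the matching rule with the highest threshold, or None."""
    matching = [r for r in rules if elapsed > r.threshold]
    if not matching:
        return None
    # Only the most severe matching rule fires, so a lower-threshold
    # "pause & resume" can never undo a higher-threshold "suspend".
    return max(matching, key=lambda r: r.threshold).action


# The order of entry does not matter with this policy.
RULES = [
    Rule(threshold=8, action="pause_and_resume"),
    Rule(threshold=12, action="suspend"),
]
```

With these two rules, a task at 9 gets `pause_and_resume` and a task at 13 gets only `suspend`.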

I'd still like to see a rule that fired when the error count exceeded a trigger point on a project within a given time period.  So, if I had 10 errors on primegrid within a 5 minute period, as an example, then I could suspend that project.
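The "N errors within a time period" trigger can be sketched as a sliding-window counter. This is a minimal Python illustration under stated assumptions, not an existing BoincTasks feature: the `ErrorWindow` class and its parameters are hypothetical, times are in seconds, and the trigger fires once the count reaches `max_errors`.

```python
from collections import deque


class ErrorWindow:
    """Counts errors inside a sliding time window (hypothetical helper)."""

    def __init__(self, max_errors: int, window_s: float):
        self.max_errors = max_errors
        self.window_s = window_s
        self._times = deque()  # timestamps of recent errors, oldest first

    def record(self, now_s: float) -> bool:
        """Record one error at now_s; return True when the trigger point is reached."""
        self._times.append(now_s)
        # Forget errors that have fallen out of the window.
        while self._times and now_s - self._times[0] > self.window_s:
            self._times.popleft()
        return len(self._times) >= self.max_errors
```

For the PrimeGrid example, `ErrorWindow(max_errors=10, window_s=300)` would report `True` on the tenth error inside any 5 minute span, at which point the project could be suspended.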

The other rule I'd still like to see is one triggered when a machine loses its connection and fails to reconnect.  I'd only use this one with the external scripting capability, to reboot the box that wasn't connecting (and was presumably frozen).  It doesn't happen a lot and is therefore a lower priority, but this rule would really allow a bulletproof operational environment.

Thanks again for all you are doing here.  I keep BT up 24x7.  Good stuff.
#2
Beta Testing / BT 0.62
June 23, 2010, 02:45:29 PM
The "Progress % / min" rule event is exactly what I needed, but it isn't really using a minute as its sampling period, so it is firing prematurely.  DNETC only updates the % complete every 15 seconds or so, but then it will jump a couple of %.  Right now, BT notices that a task goes 3-5 seconds with no progress and extrapolates that to 0% per minute, which causes the rule to fire when it shouldn't.

Is this tied to the refresh rate?  Any way we can make this rule sample over a longer time period?  Would prefer that it be a full minute between comparisons.
It is now sampled over 28 seconds, and you can set a time. This will cause the rule to trigger only if the rule stays valid for a longer period of time, e.g. 5 minutes.
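The sampling issue discussed in this thread can be illustrated with a rate computed only from samples spanning a full window, so a few seconds without an update are never extrapolated to 0% per minute. This Python sketch is an assumption-laden illustration, not BoincTasks' implementation: the `ProgressRate` class, its method names and the 60 second default are hypothetical.

```python
from collections import deque


class ProgressRate:
    """Progress in percent per minute, measured over a full window (hypothetical)."""

    def __init__(self, window_s: float = 60.0):
        self.window_s = window_s
        self._samples = deque()  # (time_s, percent_done), oldest first

    def add(self, now_s: float, percent: float) -> None:
        self._samples.append((now_s, percent))
        # Drop the oldest sample only while the next one still covers the window.
        while len(self._samples) >= 2 and now_s - self._samples[1][0] >= self.window_s:
            self._samples.popleft()

    def rate_per_min(self):
        """Return percent per minute, or None until a full window has been observed."""
        if len(self._samples) < 2:
            return None
        (t0, p0), (t1, p1) = self._samples[0], self._samples[-1]
        if t1 - t0 < self.window_s:
            return None  # not enough history yet, so the rule should not fire
        return (p1 - p0) * 60.0 / (t1 - t0)
```

A task that jumps 3% every 15 seconds and is polled every 5 seconds then reports a steady 12% per minute instead of intermittent zeros.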
#3
Wish List / History & Messages
June 18, 2010, 04:09:59 PM
Shouldn't the "messages" roll off after some period of time?  Once you get a dozen machines connected, I have to believe all of that stuff adds up and it clearly isn't following the config for history.  I'd have a separate config on the history tab asking how long to keep messages.  Default in my mind would be two days.

Also, even one day of history is too much for me and since it is a drop down, I can't include fractional values.  Should that start with a couple of entries such as 1 hour, 4 hours, 12 hours, before counting up numbers of days?
#4
Beta Testing / BT 0.61
June 18, 2010, 03:14:17 AM
Since we don't yet have "application stalled", I just created a rule for computer "Cruncher-1" on project "DNETC@HOME" (no application) that if Elapsed Time > Value(12:00) (No Time) then color purple and suspend task.  I've had a couple tasks go over, but nothing has happened.

1.  Are the rule actions not coded yet?
2.  Is application or time required?
3.  Did I somehow misspell the computer or project name?  (I am sure they both match what is shown on the tasks screen.)
4.  Is there a bug?

Not sure what is wrong, but I'm not seeing what I was expecting to see.
#5
Wish List / Rules & Actions
June 01, 2010, 11:56:06 PM
Monitoring is great, but I'd love to have rules in BoincTasks to automate managing the farm in addition to watching it.

1.  The ability to take an action (such as running a script) when a computer loses its connection or goes a certain amount of time with no work.  When a computer freezes up, I want to be able to issue an SNMP command to the PDU to reboot it (power off & on).  If BoincTasks could do that, great!  But if that's too much to ask for, then let it at least run a batch/script and I'll embed the SNMP command there.

2.  If a task goes over a threshold (more than 150% done, for example), then I'd love to have a rule that aborts it.  Collatz has a bad habit of not erroring out and having a task consume every GPU forever.  At a minimum, allow a rule where we can suspend a task that runs longer than a given maximum allowable time, so that we can later decide if it should be aborted or allowed to continue.
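The first request above (run something when a computer stops responding) can be sketched as a small watchdog that fires a callback once per outage. This is a hypothetical Python illustration, not part of BoincTasks: the `ConnectionWatchdog` class, its methods and the timeout value are invented for the example, and the callback stands in for whatever batch/script would send the SNMP command.

```python
from typing import Callable, Dict, Set


class ConnectionWatchdog:
    """Calls on_lost(host) once when a host has been silent too long (hypothetical)."""

    def __init__(self, timeout_s: float, on_lost: Callable[[str], None]):
        self.timeout_s = timeout_s
        self.on_lost = on_lost
        self._last_seen: Dict[str, float] = {}
        self._fired: Set[str] = set()

    def seen(self, host: str, now_s: float) -> None:
        """Note a successful contact; this re-arms the watchdog for that host."""
        self._last_seen[host] = now_s
        self._fired.discard(host)

    def check(self, now_s: float) -> None:
        """Fire the callback for every host that has newly exceeded the timeout."""
        for host, last in self._last_seen.items():
            if now_s - last > self.timeout_s and host not in self._fired:
                self._fired.add(host)  # avoid rebooting the same outage repeatedly
                self.on_lost(host)
```

In practice `on_lost` could wrap `subprocess.run` on a user-supplied script; keeping it a plain callback leaves the choice of reboot mechanism outside the monitoring logic.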

I am sure there are also other benefits of rules.  I'd love to be able to send myself an email (or a text using the email gateway) if a machine goes too long with no work.  But the two above are the ones that I most need.

BIG THANKS!