Recent Posts

1
Wish List / Pass useful parameters like stuck GPU # to a batch file
« Last post by JStateson on November 17, 2019, 03:29:40 pm »
Occasionally a GPU hangs and never finishes a job, or it rejects a job within seconds of receiving it.  These events are quickly discovered using the rules mechanism.  Currently, a batch file can be executed and an email or text message can easily be sent.  However, it would be advantageous to both the project and the user to be able to handle the situation automatically.  This can only be implemented if identifying parameters can be passed from BoincTasks to the handler.  At a minimum, the following parameters might be needed:

$temp---------temperature of the device, assuming TThrottle is running, or "none"
$device-------device id of GPU (D0, D1, etc) or just "CPU" if not a co-processor
$ip_address---need to know which system has the problem
$port---------if needed to communicate with the client; some systems have multiple clients
$password-----if needed to communicate with the client
$rule_name----the name of the rule could have an identifying phrase useful to the handler
$computer-----name of the system
$platform-----handler might need to know which OS: Linux, Mac, Windows
$project------name of project would be useful to handler
$app----------name of app
$rule_count---number of times rule has been applied

Example of rule usage

if Elapsed time > 5 minutes,  project "SETI@home",  app "8.01 setiathome_v8 (cuda90)", run program:
d:\ProgramData\boinc\scripts\HandleRule.bat $rule_name $ $ip_address $device

With these additions, more useful rules can be contributed as well as 3rd party scripts or apps such as resetting the GPU, excluding it from use by the Boinc client, or shutting down the client or system.
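
As a rough illustration of what a handler could do with such arguments, here is a minimal Python sketch. It assumes the handler receives the rule name, IP address and device id as three positional arguments in that order. That calling convention, and the names RuleEvent, parse_args and device_number, are hypothetical; BoincTasks does not currently pass these values.

```python
from dataclasses import dataclass


@dataclass
class RuleEvent:
    """Values a handler might receive (hypothetical calling convention)."""
    rule_name: str
    ip_address: str
    device: str  # "D0", "D1", ... or "CPU", as proposed in the list above


def parse_args(argv):
    """Parse three positional arguments: rule name, IP address, device id."""
    if len(argv) != 3:
        raise ValueError("expected: <rule_name> <ip_address> <device>")
    rule_name, ip_address, device = argv
    return RuleEvent(rule_name, ip_address, device)


def device_number(device):
    """Return the GPU index for "D0", "D1", ...; return None for "CPU"."""
    if device.upper() == "CPU":
        return None
    if device[:1].upper() == "D" and device[1:].isdecimal():
        return int(device[1:])
    raise ValueError(f"unrecognized device id: {device!r}")


# Example with made-up values; a real handler would read sys.argv[1:].
event = parse_args(["stuck_gpu", "192.168.1.20", "D1"])
gpu_index = device_number(event.device)
```

A batch file such as HandleRule.bat could forward its arguments to a script like this, which would then decide whether to reset the GPU, exclude it, or shut the client down.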

There was a discussion back in January 2019 among the BOINC principals in which they considered adding XML files that basically duplicate a few of the BoincTasks rules.  Their XML includes, for example, instructions to enable or disable a particular NVIDIA board.
This functionality is partially present in BoincTasks but is missing the parameters required to identify the device and system having the problem.  Even if their "Computing prefs 2.0" is implemented, it would require those XML files to be present on each system.

The device_id can be 0, 1, 2, etc. for each type of GPU, so it must include a type such as nvidia, intel, amd, etc.
The naming needs to be consistent with that used by exclude_gpu, which appears to be
  [<type>NVIDIA|ATI|intel_gpu</type>]
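
To show how a device number and type could map onto an exclude_gpu entry, here is a small Python sketch that builds the XML fragment. The url and device_num element names follow the BOINC client configuration format as I understand it and should be checked against the BOINC documentation. The function name exclude_gpu_element and the example project URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Type names as quoted in the post above.
VALID_TYPES = {"NVIDIA", "ATI", "intel_gpu"}


def exclude_gpu_element(project_url, device_num=None, gpu_type=None):
    """Build an <exclude_gpu> element; device_num and gpu_type are optional."""
    if gpu_type is not None and gpu_type not in VALID_TYPES:
        raise ValueError(f"unknown GPU type: {gpu_type!r}")
    elem = ET.Element("exclude_gpu")
    ET.SubElement(elem, "url").text = project_url
    if device_num is not None:
        ET.SubElement(elem, "device_num").text = str(device_num)
    if gpu_type is not None:
        ET.SubElement(elem, "type").text = gpu_type
    return elem


# Placeholder URL; a real handler would need the project's actual URL,
# which is not among the parameters proposed above.
fragment = ET.tostring(
    exclude_gpu_element("https://example.org/project/", 1, "NVIDIA"),
    encoding="unicode",
)
```

Note that the sketch needs the project URL, while the wish list only proposes the project name, so a handler would need some way to look one up from the other.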
2
Questions / Re: BSOD when installing
« Last post by fred on November 16, 2019, 09:19:41 pm »
Tried to install latest version.
Got BSOD during the installation
From WhoCrashed:
I lost the ability to build new drivers.
It's very very expensive to release a driver with all the new restrictions.
Runs in the thousands of dollars.
3
Questions / BSOD when installing
« Last post by TomasL on November 16, 2019, 12:28:41 pm »
Tried to install latest version.
Got BSOD during the installation
From WhoCrashed:

On Fri 2019-11-15 19:05:59 your computer crashed or a problem was reported
crash dump file: C:\Windows\Minidump\111519-10546-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x1C14E0)
Bugcheck code: 0xD1 (0xFFFFF8050D8347C8, 0x2, 0x8, 0xFFFFF8050D8347C8)
Error: DRIVER_IRQL_NOT_LESS_OR_EQUAL
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This indicates that a kernel-mode driver attempted to access pageable memory at a process IRQL that was too high.
This bug check belongs to the crash dump test that you have performed with WhoCrashed or other software. It means that a crash dump file was properly written out.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.



On Fri 2019-11-15 19:05:59 your computer crashed or a problem was reported
crash dump file: C:\Windows\MEMORY.DMP
This was probably caused by the following module: ntkrnlmp.exe (nt!setjmpex+0x8149)
Bugcheck code: 0xD1 (0xFFFFF8050D8347C8, 0x2, 0x8, 0xFFFFF8050D8347C8)
Error: DRIVER_IRQL_NOT_LESS_OR_EQUAL
Bug check description: This indicates that a kernel-mode driver attempted to access pageable memory at a process IRQL that was too high.
This bug check belongs to the crash dump test that you have performed with WhoCrashed or other software. It means that a crash dump file was properly written out.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.

4
While you can't set a task to "do not disturb", you can control how many tasks of each project run at a time by using max_concurrent and project_max_concurrent in app_config.xml files as explained on the page Client configuration.
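
For illustration, a short Python sketch that generates such an app_config.xml is shown below. The element layout (an app block with name and max_concurrent, plus a top-level project_max_concurrent) reflects the Client configuration page as I understand it. The function name build_app_config, the app name and the limits are example values only, not recommendations.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_limits, project_limit=None):
    """Return app_config.xml text limiting concurrent tasks.

    app_limits maps an app name to its max_concurrent value;
    project_limit, if given, sets project_max_concurrent.
    """
    root = ET.Element("app_config")
    for app_name, limit in app_limits.items():
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(limit)
    if project_limit is not None:
        ET.SubElement(root, "project_max_concurrent").text = str(project_limit)
    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


# Example values only: at most 2 tasks of this app, 3 for the whole project.
config_text = build_app_config({"setiathome_v8": 2}, project_limit=3)
```

The resulting text would be saved as app_config.xml in the project's directory on the client machine, after which the client has to re-read its config files.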
5
Wish List / Run custom app at each system under All Computers
« Last post by JStateson on October 26, 2019, 04:12:58 pm »
Right-clicking on a computer in the "All Computers" tree would bring up a selectable list of apps to run.

Apps would be on a tab similar to "Extra" -> "BoincTask settings" -> "Messages"

Buttons such as ADD, DEL, TEST etc.

Example of what it might look like, with "Name" and "Command" columns instead of "Project" and "Message":
"Name"           "Command"
PuttyLinux       "C:\Program Files\PuTTY\putty.exe" username@$(IP_ADDRESS) -pw password
IssueMWUpdate    "D:\RUN_MW_RPC_APP.BAT" $(IP_ADDRESS) $(PORT) $(PASSWORD)

etc


The names would show up in the dropdown box
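
The $(NAME) placeholders in the commands above imply a substitution step before the command is run. A minimal Python sketch of that step follows, assuming placeholders are upper-case names wrapped in $( ). The function name expand_command and the example values are hypothetical and do not describe how BoincTasks works internally.

```python
import re

# Matches placeholders such as $(IP_ADDRESS) or $(PORT).
PLACEHOLDER = re.compile(r"\$\(([A-Z_]+)\)")


def expand_command(template, values):
    """Replace each $(NAME) in template with values["NAME"]."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder $({name})")
        return values[name]
    return PLACEHOLDER.sub(replace, template)


# Example with made-up connection details for one computer in the tree.
command = expand_command(
    '"D:\\RUN_MW_RPC_APP.BAT" $(IP_ADDRESS) $(PORT) $(PASSWORD)',
    {"IP_ADDRESS": "192.168.1.20", "PORT": "31416", "PASSWORD": "secret"},
)
```

Raising an error on an unknown placeholder, rather than leaving it in the command, avoids running a program with a literal $(NAME) argument.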
6
Wish List / Remove projects not initializing
« Last post by Roadranner on October 25, 2019, 03:49:16 pm »
I tried to install QNC Continual.
Because of an http-error the initialization doesn't complete.
With BT I'm not able to execute any action. I had to go to the (external) machine to remove the project with Boinc Manager.
It would be nice to be able to control this with BT, too.
7
Thank you for the quick reply, Fred. I realized the password Windows 10 asks for is the Microsoft online password, not the local machine user's password.
It did work opening the program with "Run as administrator".
Thank you
8
Thanks for the reply Fred. I guessed as much but just wondered if I was missing something.

I will just manually manage some CPU load balancing to get these WUs running again.
9
Questions / Re: [OT] Running x64 app only in Moo
« Last post by fred on October 12, 2019, 10:17:14 pm »
or even a rule in BOINCtask to reject such app (only x86)??
Ask the project.
Maybe a project config on your computer or a project setting.
10
I did look at the rules setting but could not see anything there that seemed like it would do what I am asking.
BOINC handles this automatically; a task with a deadline that far away will probably be bumped to the end of the list.
Not much you can do but hope BOINC works as it should.