Messages - JStateson

#1
The Berkeley supplied "all_projects_list.xml" seems to be outdated according to Milkyway, WCG, and possibly others.

For example, this is the typical warning (there are usually hundreds of them):

This project seems to have changed its URL.  When convenient, remove the project, then add http://milkyway.cs.rpi.edu/milkyway

I did the requested change for several Milkyway and WCG systems and noticed that the ones I changed no longer show up when using the project selection side bar.  The projects that show the warning message are the only ones that show up in the filter. By "no longer show up" I mean the active tasks are not listed.

I edited the "all_projects_list.xml" on one of the problem systems and changed
        <url>https://milkyway.cs.rpi.edu/milkyway/</url>
        <web_url>https://milkyway.cs.rpi.edu/milkyway/</web_url>

to

        <url>http://milkyway.cs.rpi.edu/milkyway/</url>
        <web_url>http://milkyway.cs.rpi.edu/milkyway/</web_url>

but it had no effect. I assume the project sidebar is built by collecting the active project URLs from each system in BT (I have 9), and that one of the Milkyway entries uses https and is found before the http one. Another problem I see is that the project's master URL still does not use http, even after making the change from https to http that the project requested.

From looking at the Milkyway and WCG forums, their problems, and the lack of response (mainly Milkyway), I doubt the projects are going to ask Berkeley to amend that "all_projects_list.xml", and some projects like wuprop are not on the list anyway. Wuprop shows up in the project filter but the tasks are missing, just like the Milkyway and WCG tasks.

I am going to make a guess that if "https" and "http" are not included in the lookup the tasks will be found but that is just a guess.
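
To make that guess concrete, here is a minimal sketch of a scheme-independent comparison. It assumes the project filter matches tasks by comparing URL strings, which I have not confirmed for BoincTasks; the helper name project_key is hypothetical.

```python
from urllib.parse import urlsplit


def project_key(url: str) -> str:
    """Return a lookup key that ignores http/https and a trailing slash."""
    parts = urlsplit(url.strip())
    # Host names are case-insensitive; the scheme is deliberately left out.
    return parts.netloc.lower() + parts.path.rstrip("/")


# The URL from all_projects_list.xml and the URL the project asks for.
old_url = "https://milkyway.cs.rpi.edu/milkyway/"
new_url = "http://milkyway.cs.rpi.edu/milkyway"

print(project_key(old_url) == project_key(new_url))
```

With a key like this, the https entry found first and the http entry found later would land in the same bucket instead of being treated as two different projects.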
#2
I have a rule that suspends the GPU for 20 seconds then issues a resume.  I use it to restart a hung task.

"c:\Program Files\Boinc\boinccmd.exe" --host h110btc  --set_gpu_mode never
cmd.exe /c timeout.exe /T 20
"c:\Program Files\Boinc\boinccmd.exe" --host h110btc  --set_gpu_mode always

The rule works fine if Boinctasks runs when Windows starts which is the normal case.
It does not work if I run Boinctasks from the desktop shortcut.  I need to specify "run as administrator" else none of the 3 lines of code do anything.

It is inconvenient to have to "run as administrator".  Possibly there is some security setting or ownership change to allow the apps to work without having to run as administrator.  I tried a number of things but they did not help.

Those lines of code are in a ".cmd" script file. It is not necessary to "run as administrator" if I choose to run the script manually. However, BoincTasks needs to run as administrator for the script file to work.
#3
I now believe this is a hardware problem of some type. The low CPU utilization reported is incorrect.  There is something different in the win11 OS causing this.

If I replace the win11 boot disk with my win10 boot disk, the CPU utilization is back to 100% as is normal. I have the BOINC apps on the D drive, so whether I boot with 10 or 11 the BOINC apps pick up where they left off.

My CPU is a "skylake-x" i9-7900x, which has known heating problems, and there are reports of "phantom throttling".

Recently I ran two CPU intensive apps. The CPU utilization was 50% (or less) but they finished in the same time as on a comparable Windows 10 system. I cannot run more than two or the system overheats. I am considering removing the Intel heat spreader, as is recommended by other skylake-x owners.
#4
I found the problem and the solution.  This problem only showed up in Win11 22H2.

I cannot start BOINCTASKS first. I have to start BOINC and then BOINCTASKS.

If I start BT first, it seems there is no thread or CPU switching and the throughput drops by a factor of 2 or 3, or worse, depending on how many BOINC applications are running.

Discussion over at Microsoft
https://answers.microsoft.com/en-us/windows/forum/all/terrible-performance-loss-with-win11-22h2-how-to/128d232a-6b23-4c44-b2e4-56649b590e73

[edit] Problem seems related to core isolation. If I disable core isolation, the CPU utilization starts increasing and within a few minutes** is back up to 90% like it should be. If core isolation is enabled then, regardless of the order in which BOINC is started, the CPU utilization drops and eventually the "red" warning shows up.

**It could take a while to rise if the work unit has been running at low utilization for a long time. However, any new tasks quickly get to 90%+.
#5
The problem is not the video board.  I just put the old video board back in and the problem is still there.
The problem is either the Dell BIOS upgrade from 206 -> 207 or the security changes to core and device protection enabled in Windows 11 about the same time as that 3070 was put in.
In any event it is not caused by BoincTasks or TThrottle.

I went into the BIOS, disabled speed step, and locked the CPU to the base frequency of 3.6 GHz. CPU-Z shows the CPUs running at 3.6. There is no throttling, yet the work units for CPU bound tasks take twice as long as on an older and slower Xeon. It is as if the internal cache is not working anymore, or something is stealing the CPU cycles.

It is not permitted to downgrade the bios, will have to get with Dell support.

The system showing 99% CPU and almost twice the speed is the older Xeon. I would never have noticed this problem if I was not running BOINC tasks. Windows 11 performance and resource monitoring do not show any problems.

#6
Sorry, I cut off the image.

All those tasks in red are CPU only tasks, no GPU. Most are now in the 20% range.
I have never seen CPU tasks that low.
CPU time percent is set to 100%.

windows 11, CPU has 20 threads.

Just noticed something - I disabled GPU usage and all the CPU power levels are slowly increasing.
They went up and back down, varying between about 20% and 60%, but should be at 99%.
Going to swap the 3070 out for a 1660 and see if the problem disappears.
Will run enough CPU work units to see if the run time is really different depending on which graphics board is used.

Going to run some tests using Cosmology@Home and compare to other systems, then swap out the graphics board.

#7
RTX-2080 was replaced under warranty with RTX-3070 (Gigabyte did not have any more 2080)
System seems to be running well, but the CPU usage for CPU bound tasks shows values under 50% that were normally 99% using the gtx-1660 or rtx-2080.
I removed all processors from Windows 11 then rebooted, but that did not fix the problem. The win11 resource CPU plots indicate the CPU is used a lot more than the 35% reported.
Wondering if my "free" RTX-3070 was a refurbished replacement with a problem, or maybe the problem is the express-3 motherboard with an express-4 graphics card.

#8
Thanks for the additional link.  I noticed that all of them suggested letting windows decide the size of the page or swap file.  That might be true for some users and systems.  I know that the last time I let windows manage my printers I had a large word document printing on my small label printer.

Another problem that I failed to mention: I wrote that BT is handling 7 systems and 82 apps, but actually there were over 600 tasks waiting to run that BT handles. That is a lot of traffic with updates every few seconds. World Community Grid does not feed the BOINC app like most other projects. I just went and set the number of tasks from unlimited to 20 to cut down the queue size. I assume a smaller list of tasks will mean faster updates from BT. If 20 does not work then possibly setting resource share to 0 will work. WCG seems to never run out of work, so I don't need a large queue of tasks waiting.
#9
I came here looking for why, over the last several months, my system has really slowed down. It is especially noticeable in BoincTasks "updates" but also affects other programs. Using the Windows 10 resource monitor I observed disk drive C at 100% most of the time, with a very rare drop to under 5%.

BT is handling 7 systems and 82 apps. It was getting difficult to even scroll BT and I had to wait 20-30 seconds to see a response. Sometimes the response was "Not Responding". Disk usage was always showing 100% during these times.


I used the following for ideas that worked so well I had to come here to post about it.
https://www.drivereasy.com/knowledge/100-disk-usage-windows-10-fixed/

What I did and how it worked

1.  Saw that iCloud photos was a big user, I disallowed iCloud from any access to my files.  This brought the disk usage down and a slight improvement.


2.  Went to startup services and changed sysmain from automatic to manual. This made a huge difference in startup. No more long stretch of 100% for the first 3-4 minutes after rebooting or starting Windows. I suspect that this app informs M$OFT what programs you run most of the time in addition to pre-loading them in memory. I don't need to have gridcoin research, BOINC or BT preloaded, especially gridcoin as it is huge.

3. Change system performance from "best looking" to "adjust for best performance".  This made a huge difference in boinctasks.  Some time ago after a feature update I was asked if Windows could change my display settings to improve them.  I think this caused the shift from best performance to best looking.

4. Set virtual memory to custom:  minimum 4096 max 32768 for C drive only,  Nothing for D drive.  I have 32gb ram and the recommended was 1.5 * 32 but I went with 32 instead of 48.

It is as if I have a new computer again!!!!


hope this helps someone.
#10
Questions / Re: Can't connect to BOINC client
March 21, 2020, 04:24:06 AM
Put the following into the cc_config.xml file at \programdata\boinc under "options"

<allow_remote_gui_rpc>1</allow_remote_gui_rpc>

suggest this:
<cc_config>
  <log_flags>
  </log_flags>
  <options>
   <use_all_gpus>1</use_all_gpus>
   <allow_remote_gui_rpc>1</allow_remote_gui_rpc>
  </options>
</cc_config>

Double check your password file for white space or unprintable characters. If you delete the gui_rpc_auth.cfg file it is automatically re-created with a 32 char password when BOINC starts up. Best is to use notepad, delete the line, and save an empty file. Its length should be exactly "0" unless you want a password.
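
If you would rather not eyeball the file, the check can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and it flags a trailing line ending as white space, which may or may not matter to your client version.

```python
import string

# Characters considered safe in a password: letters, digits, punctuation.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation)


def check_rpc_password(raw: bytes) -> list[str]:
    """Return a list of problems found in the raw bytes of gui_rpc_auth.cfg."""
    problems = []
    if raw == b"":
        return problems  # zero-length file: no password is set
    text = raw.decode("ascii", errors="replace")
    if text != text.strip():
        problems.append("leading or trailing white space (including line endings)")
    if any(ch not in ALLOWED for ch in text.strip()):
        problems.append("embedded white space or unprintable character")
    return problems


# Usage (path assumed from the \programdata\boinc location mentioned above):
#   from pathlib import Path
#   print(check_rpc_password(Path(r"C:\ProgramData\BOINC\gui_rpc_auth.cfg").read_bytes()))
```

An empty list means the file is either zero length or contains only printable, non-blank characters.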
#11
Questions / Re: Can't connect to BOINC client
March 19, 2020, 01:37:50 PM
Windows or Linux client?

If Linux then at /etc/boinc-client you need to edit the file remote_hosts.cfg and add the name of the boinctasks system or its IP address.  This is not needed for windows.

If the client is on Windows, make sure there is only one client running and that the manager is not running. Use Task Manager to verify. Make sure that each system can "ping" the other using system names, else it is a network problem.

From the system running the client, from an admin command prompt do the following:
c:\Program Files\boinc>boinccmd --get_host_info

do the same thing from system running boinctasks. I assume it is also running boinc (but not boincmgr)
c:\Program Files\boinc>boinccmd --host YOUR_REMOTE_SYSTEM --get_host_info

Installing boinctasks under windows should automatically ask to allow access through the firewall.

If each system can ping each other then try a telnet connection from the boinctasks system to the client

telnet YOUR_REMOTE_SYSTEM 31416

Pressing CTRL-C should generate an error message such as "<boinc_gui_rpc_reply>" plus other stuff.   If you do not see that message suspect firewall or network problem.  Use CTRL-] then "quit" to exit telnet.
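
If telnet is not installed, the same reachability test can be sketched in Python. It only checks that something accepts a TCP connection on port 31416; it does not speak the GUI RPC protocol or check the password. The function name is hypothetical.

```python
import socket


def can_reach_client(host: str, port: int = 31416, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the BOINC client port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


# Usage, run from the BoincTasks system:
#   print(can_reach_client("YOUR_REMOTE_SYSTEM"))
```

A result of False with a working ping points at the firewall or at the client not listening for remote connections.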

HTH
#12
If you have several systems running the same app, the BoincTasks history reader can now estimate the total number of work units per day you can complete.  This takes into account idle time between completion of work units.  You need to have minimum of 24 hours of BT history.

You will need to know the average credit per work unit. For some projects and apps, the amount is fixed. For example, Einstein-at-home's Gamma Ray Pulse Binary search #1 is always 3,465 credits. Other projects require the average to be calculated. That can be done using this web site

For example: this url represents one of the current board leaders at SETI. If you click on that url and then select "calculate" you will see an average of about 80 credits per work unit based on 20 work units.

Once you have the average credit per work unit you can estimate your total throughput by running the BT history reader and selecting all the histories for each system that is running the project (Einstein in the example below). The BT reader will then show all the apps that all the systems are running. You must then select only the apps for which you have the average credit. As shown below, the apps for Gamma Ray Pulse Binary search #1 have been selected. You can then click on "SAVE" to get a listing in notepad of the number of work units per day. That number can then be multiplied by the average credit per work unit. As shown below, the estimated credits per day would be around 14,000,000. Due to the way projects calculate the actual daily credit, it may take 2-3 weeks at 24/7 before that value shows up.
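
The arithmetic is simple enough to sketch. The function names are hypothetical, and the work-unit count below is back-calculated from the two figures quoted above (3,465 credits per work unit and about 14,000,000 credits per day); it is not a number reported by the history reader.

```python
def credits_per_day(work_units_per_day: float, credit_per_work_unit: float) -> float:
    """Estimated daily credit: work units completed per day times credit each."""
    return work_units_per_day * credit_per_work_unit


def work_units_needed(target_credits_per_day: float, credit_per_work_unit: float) -> float:
    """Work units per day required to reach a given daily credit."""
    return target_credits_per_day / credit_per_work_unit


# Fixed credit for Einstein's Gamma Ray Pulse Binary search #1.
CREDIT_PER_WU = 3465

# About how many work units per day the 14,000,000 estimate implies.
implied = work_units_needed(14_000_000, CREDIT_PER_WU)
print(round(implied), credits_per_day(round(implied), CREDIT_PER_WU))
```

For a project without fixed credit, substitute the calculated average (for example the roughly 80 credits per work unit in the SETI case).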



Executables are here (there is no install, just a zip file)
https://stateson.net/BTHistory/bthistory_64_32_bins.zip
All sources are at GitHub and require VS2017
https://github.com/JStateson/Gridcoin-BoincTask-HistoryReader
The above includes the web app "HostProjectStats" sources.
#13
Questions / Need clarification on interface messages
January 31, 2020, 09:05:48 PM
I ran into a problem when receiving temperatures when video boards from multiple manufacturers are being used. This mainly affects my Linux program that is sending temperature information to BoincTasks for display as a TThrottle temp.

From a windows system running tthrottle, with one each NVidia and ATI, your BT debug log shows the following:

<TThrottle><HN:JYSArea51><PV 7.72><AC 0><TC 41><TG 65><NV 1><NA 1><DC 100><DG 100><CT0 36.1><CT1 38.5><CT2 37.2><CT3 36.4><CT4 36.0><CT5 40.8><CT6 36.1><CT7 36.3><CT8 36.3><CT9 39.2><GT0 41.0><GT1 65.0><RSPJI3$0q><AA0><SC85><SG83><XC100><MC2><TX><TThrottle>


The temperature of 41.0 was the NVidia, the "<NV 1>"
The temperature of 65.0 is the ATI, the "<NA 1>"
I did not see anything for intel: was expecting an "<NI 0>" or something like that.

If my guess is correct, then if there are 6 NVidia and 3 ATI there should be 9 values: <GT0>...<GT8>
All preceded by <NV 6><NA 3>
However, that is just a guess, as I was unable to observe multiple ATI temps on systems with NVidia boards.
I then looked at a windows system that had an Intel GPU in addition to 6 ATI GPUs.

<TThrottle><HN:s9x00><PV 7.72><AC 0><TC 57><TG 59><NV 0><NA 6><DC 100><DG 100><CT0 58.3><CT1 59.6><CT2 58.0><CT3 58.9><GT0 52.0><GT1 59.0><GT2 59.0><GT3 59.0><GT4 59.0><GT5 59.0><RSSh)b+1m><AA0><SC79><SG97><XC100><MC2><TX><TThrottle>


The Intel temperature displayed by BoincTasks is 59.0 degrees, judging from the display. I am guessing that value came from one of GT1..GT5 since they are all 59.0. Since the Intel GPU is incorporated in the CPU, I would expect its temperature to be closer to one of CT0..CT3. The last 5 video boards are all identical and all run identical work units, so it is no surprise that 5 of the 6 are exactly 59.0.
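
For reference, this is roughly how I read those log lines. It is a sketch based only on the two messages above, not on any TThrottle specification: it picks up the tags of the form <NAME value>, and skips tags without a space such as <HN:...> and <RS...>. The sample string is a shortened copy of the first message and the function name is hypothetical.

```python
import re

# Matches tags such as <PV 7.72>, <NV 1>, <CT0 36.1>, <GT1 65.0>.
TAG = re.compile(r"<([A-Z]+)(\d*) ([^<>]+)>")


def parse_tthrottle(message: str) -> dict:
    """Split a TThrottle message into scalar fields, CPU temps and GPU temps."""
    fields, cpu, gpu = {}, {}, {}
    for name, index, value in TAG.findall(message):
        if name == "CT" and index:
            cpu[int(index)] = float(value)
        elif name == "GT" and index:
            gpu[int(index)] = float(value)
        else:
            fields[name + index] = value
    return {
        "fields": fields,
        "cpu_temps": [cpu[i] for i in sorted(cpu)],
        "gpu_temps": [gpu[i] for i in sorted(gpu)],
    }


sample = ("<TThrottle><HN:JYSArea51><PV 7.72><TC 41><TG 65><NV 1><NA 1>"
          "<CT0 36.1><CT1 38.5><GT0 41.0><GT1 65.0><TX><TThrottle>")
parsed = parse_tthrottle(sample)
print(parsed["fields"]["NV"], parsed["fields"]["NA"], parsed["gpu_temps"])
```

Note that the message gives the NV and NA counts and a flat list of GT values, but nothing in it says which GT index belongs to which vendor. Treating the NVidia entries as coming first is my guess from the one mixed system.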

(1)  Question: is the 59.0 displayed by BT taken from the CPU temps? If so, then that is correct for embedded Intel HD graphics.
However, CT0 shows 58.3, not 59, and I suspect that the Intel temp comes from your <TG 59>, the maximum temp. The Intel temp was associated with project collatz, which supports Intel and is labeled as "1INT". The other projects were Milkyway and d0..d5 of "(ATIs)".

That brings me to the second problem:
(2) What to have my Linux program send to BT to show temperatures when there are multiple NVidia, ATI and maybe a single Intel.  Boinc numbers coprocessors D0..Dn-1 for n NVidia and the same for AMD: D0..Dn-1.  I don't know of any intel co-processor boards that are GPUs so AFAICT there is only 1 Intel possible.

Currently, if a mixture of NVidia and ATI then I only bother to report the coprocessors that have the bigger count, as I do not know how to format the message to BT to properly identify the coprocessors.

The following shows temperatures from Linux systems running NVidia plus one Intel GPU task. The wuprop task is displayed as it allows me to check the CPU temperatures.




Both systems run Ubuntu 18.04 as shown here
https://einsteinathome.org/host/12783910

Since BOINC does not keep track of the actual board name nor do they use the same D0..Dn-1 numbering as the Linux kernel, I had to come up with a translation table to display the correct temps adjacent to the actual D0..Dn-1 boards.

For the TB85 mining rig and NVidia only:

<devmap>
<Num_GPUs>6</Num_GPUs>
<1>0 5 01:00.0 NV GTX-1070</1>
<2>1 0 02:00.0 NV GTX-1660-Ti</2>
<3>2 1 03:00.0 NV P102-100</3>
<4>3 2 04:00.0 NV P102-100</4>
<5>4 3 05:00.0 NV P102-100</5>
<6>5 4 06:00.0 NV GTX-1070-Ti</6>
</devmap>


For the BTC110

<devmap>
<Num_GPUs>9</Num_GPUs>
<1>0 1 01:00.0 NV GTX-1060-6GB</1>
<2>1 3 02:00.0 NV GTX-1060-3GB</2>
<3>2 4 03:00.0 NV GTX-1060-3GB</3>
<4>3 2 04:00.0 NV P106-100</4>
<5>4 0 05:00.0 NV GTX-1070</5>
<6>5 8 08:00.0 NV P106-090</6>
<7>6 5 0A:00.0 NV GTX-1060-3GB</7>
<8>7 6 0B:00.0 NV GTX-1060-3GB</8>
<9>8 7 0E:00.0 NV GTX-1060-3GB</9>
</devmap>
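
A table in this layout can be read with a short sketch like the one below. Element names that start with a digit are not legal XML, so a standard XML parser will reject the file and a regular expression is used instead. Which of the two integer columns is the BOINC device number and which is the kernel's ordering is not spelled out above, so the sketch keeps them under the neutral, hypothetical names first and second.

```python
import re
from typing import NamedTuple


class DevEntry(NamedTuple):
    slot: int      # the number used as the tag name, e.g. <1>
    first: int     # first integer column
    second: int    # second integer column
    bus_id: str    # PCI bus address, e.g. 01:00.0
    vendor: str    # e.g. NV
    model: str     # e.g. GTX-1070


# One row: <N>a b bus vendor model</N>, with the closing tag matching the opening one.
ROW = re.compile(r"<(\d+)>(\d+) (\d+) (\S+) (\S+) (.+?)</\1>")


def parse_devmap(text: str) -> list[DevEntry]:
    """Return one DevEntry per numbered row of a devmap block."""
    return [
        DevEntry(int(slot), int(a), int(b), bus, vendor, model)
        for slot, a, b, bus, vendor, model in ROW.findall(text)
    ]


TB85 = """<devmap>
<Num_GPUs>6</Num_GPUs>
<1>0 5 01:00.0 NV GTX-1070</1>
<2>1 0 02:00.0 NV GTX-1660-Ti</2>
<3>2 1 03:00.0 NV P102-100</3>
<4>3 2 04:00.0 NV P102-100</4>
<5>4 3 05:00.0 NV P102-100</5>
<6>5 4 06:00.0 NV GTX-1070-Ti</6>
</devmap>"""

entries = parse_devmap(TB85)
translation = {e.first: e.second for e in entries}
print(len(entries), translation)
```

The translation dictionary is the part that matters for display: look up one numbering to find the board's position in the other.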
#14
Wish List / Re: Remove projects not initializing
December 20, 2019, 01:50:41 PM
There could be a possible problem here but I have been unable to duplicate it.

Have noticed that when attaching a project a few seconds go by before the project becomes responsive.

For example: on two of my systems the CPU is not capable of running CPU tasks as it is too busy with the video boards and their OpenCL. If I try to add another project, I have to click repeatedly on the "initializing project" and when the project becomes responsive I select no-new-tasks. Once that takes effect I log onto the project, locate my "new" system and set the venue to prevent CPU tasks. There may be a way to avoid this but it is easier this way. Unfortunately, if I do not react quickly enough, one or two CPU tasks sneak in before the NNT takes effect and have to be aborted.

Thinking about this, it seems logical that if the project never initializes one would never get control. However, Boinctasks does not actually "attach" the project.  It sends a message to the boinc client and asks it to attach.  Sometimes it attaches in a few seconds, other times it can take a few minutes.  It could be you didn't wait long enough.  If the client (boinc) has a problem initializing the project it may not respond to the manager.

RR: I tried to duplicate the problem but did not find QNC listed. The only project that started with Q was quake catcher.  Is there another name this project goes under?

Dave: I put in goofygrid thinking that garbage-in garbage-out might hang up BT, but all I got was an error message and the fake project never got initialized. If you meant gpugrid then I had no problem attaching.

On Tuesdays, SETI is regularly down for maintenance.  I once tried to attach with a new Linux system and had to wait a few minutes before BT recovered but that was understandable. 
#15
Wish List / Re: BoincTasks Stealing Window Focus
December 18, 2019, 03:55:59 PM
I have been running 1.80 24/7 since it came out and have not seen anything like you mentioned.  Maybe there is some setting you have enabled that is causing it.

The only time I have ever lost focus to BoincTasks was when one of my "rules" got triggered and sent a text message to my phone that the temperature went too high on one of my systems. The focus problem was obvious as the rules log popped up on the display and I found myself typing "into it". I stopped that by not allowing the rules log to be enabled when activating a BoincTasks rule. That log was only good for diagnostics anyway.

Exactly what is the symptom you are seeing?