Provide support for 16 physical cores

Started by ChrisSibbald, July 05, 2012, 11:52:03 PM

ChrisSibbald

I have a dual Xeon 2687W workstation and it appears TThrottle cannot handle this many cores.  During initialization TThrottle just retries and retries to determine number of cores.  Is this expected?  Do you plan to support 16 physical cores?  I am a "paying" member  :D

fred

Quote from: ChrisSibbald on July 05, 2012, 11:52:03 PM
I have a dual Xeon 2687W workstation and it appears TThrottle cannot handle this many cores.  During initialization TThrottle just retries and retries to determine number of cores.  Is this expected?  Do you plan to support 16 physical cores?  I am a "paying" member  :D
TThrottle should be able to handle 12 temperature cores (sensors), which is 24 logical cores; that should be enough for now.
Are you using the latest beta version? If not, please install it; it fixed this sort of problem.
If the problems remain, I need the LATEST log file from here: C:\Users\username\AppData\Roaming\eFMer\TThrottle\log.
You can email the log, because it's a bit long.
If for some reason the log is empty, open TThrottle.xml (in C:\Program Files\eFMer\TThrottle). If the file isn't there, copy it from the example folder.

Make sure file is set to 1:

<logging>
  <file>1</file>
  <email>0</email>
</logging>
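
For anyone who would rather check this setting with a script than by eye, here is a minimal Python sketch. It assumes only the <logging> fragment shown above; the rest of the TThrottle.xml layout is not known here, so the code looks for the element wherever it sits. The function name logging_to_file_enabled is made up for illustration.

```python
import xml.etree.ElementTree as ET


def logging_to_file_enabled(xml_text: str) -> bool:
    """Return True if <logging><file> is set to 1 in the given XML text."""
    root = ET.fromstring(xml_text)
    # The fragment may be the root element itself or nested deeper in the real file.
    logging_el = root if root.tag == "logging" else root.find(".//logging")
    if logging_el is None:
        return False
    file_el = logging_el.find("file")
    return file_el is not None and (file_el.text or "").strip() == "1"


# The fragment from the post above.
sample = "<logging><file>1</file><email>0</email></logging>"
print(logging_to_file_enabled(sample))
```

To check the real file, read its contents (for example with pathlib.Path(...).read_text()) and pass the text to the function.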

ChrisSibbald

Thanks for the speedy response, Fred.  I will try the beta version and let you know how I make out.

ChrisSibbald

Hi Fred,

No luck with Beta.  Same behavior, calibration keeps restarting.  Where can I find the email address to send you the log file?

fred

Quote from: ChrisSibbald on July 06, 2012, 11:16:16 AM
Hi Fred,

No luck with Beta.  Same behavior, calibration keeps restarting.  Where can I find the email address to send you the log file?
Got your logs. I also need everything from the log tab (after TThrottle startup) to see what's going on.

ChrisSibbald

Thanks Fred.  Here is the text of the log from the TThrottle UI.  Let me know if there is anything else I can do to help.

----------------log-------------------

06 July 2012 - 13:05:58 Driver installed properly. Driver Version: 2.3
06 July 2012 - 13:05:58 Driver regulator: active

Program version: 5.70 64Bit
Microsoft Windows 7 Ultimate Edition Service Pack 1 (build 7601), 64-bit

Language: User: 1033 ENU ,System: 1033 ENU

nvidia: found 2 logical devices
nvidia: found 2 physical devices
nvidia: Temperature 58 °C, max Temperature 127 °C
nvidia: Temperature 60 °C, max Temperature 127 °C

nvidia: GeForce GTX 580, GeForce GTX 580

Vendor ID: GenuineIntel
Vendor: INTEL
HighestIntegerValue: 0000000D - Processor Signature: 000206D7
Misc. info: 02200800
Feature Flags1 1FBEE3FF
Feature Flags2 BFEBFBFF

Processor:       Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz               
Processor: Family: 6h, Model: 2D, Stepping: 7

TJunction read from CPU: 92  °C, TJunction using: 92  °C

Core Temperature: 54 °C, Raw Data: 88260000
882b0000,882b0000,882a0000,882a0000,882d0000,882d0000,882d0000,882d0000,
882e0000,882e0000,88290000,88290000,882b0000,882b0000,882c0000,882c0000,
88220000,88220000,88240000,88240000,88270000,88270000,882a0000,882a0000,
This Processor has 16 cores and  8 temperature sensors.

BOINC:
lhcathome2.cern.ch_test4theory
lhcathomeclassic.cern.ch_sixtrack
setiathome.berkeley.edu
www.worldcommunitygrid.org

You can help by reading www.efmer.eu/boinc/faq.html How can I help!
Select the send EMail button,or copy everything in this logging window and mail it to me!
boinc[At]efmer[Dot]eu. We use this information to improve this product.
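
A note on the raw data in the log above: the values look like the contents of Intel's IA32_THERM_STATUS register, where bit 31 marks the reading as valid and bits 22:16 hold the distance below TJunction in °C. That TThrottle reads this register is an assumption based on the numbers matching the reported temperature, not something confirmed in this thread. A small Python sketch of the decoding, with a made-up function name:

```python
def decode_core_temp(raw_hex: str, tjunction_c: int):
    """Decode a raw thermal status value into degrees Celsius.

    Assumes the IA32_THERM_STATUS layout: bit 31 = reading valid,
    bits 22:16 = digital readout (degrees below TJunction).
    Returns None when the valid bit is not set.
    """
    raw = int(raw_hex, 16)
    if not (raw >> 31) & 1:
        return None
    readout = (raw >> 16) & 0x7F
    return tjunction_c - readout


# "Core Temperature: 54 °C, Raw Data: 88260000" with TJunction 92 °C
print(decode_core_temp("88260000", 92))  # 54
```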


fred

Quote from: ChrisSibbald on July 06, 2012, 09:46:35 PM
This Processor has 16 cores and  8 temperature sensors.
This indicates that the calibration has succeeded.
As always, 1 sensor = 1 physical core = 2 logical cores that share it.

Does the calibration dialog still pop up?

ChrisSibbald

Hi Fred,

The calibration dialog continuously popped up.  I let it run for quite some time (many popups) then killed TThrottle.  I will try again and be more patient.

ChrisSibbald

Hi Fred

No luck.  The calibration dialog keeps popping up.  As soon as the progress bar gets to 100%, TThrottle either exits and restarts or simply minimizes to the taskbar, and the calibration starts over.  Here is the content of the log from the UI.  I noted that CPU 2 is showing up as XXX; perhaps that has something to do with it.  I will also email you the log file.

----------------UI log--------------
Number of matching Programs (Processes): 19
CPU:1 (18%) - PID:328344 (4)   Slot:17   http://www.worldcommunitygrid.org/   CMD2_2541-1HCI_A.clustersOccur-3BJI_A.clustersOccur_726
CPU:1 (14%) - PID:326968 (4)   Slot:18   http://www.worldcommunitygrid.org/   CMD2_2541-1HCI_A.clustersOccur-3BJI_A.clustersOccur_620
CPU:1 (12%) - PID:204664 (4)   Slot:27   http://www.worldcommunitygrid.org/   CMD2_2541-1HCI_A.clustersOccur-3BJI_A.clustersOccur_676
CPU:1 (8%) - PID:175288 (3)   Slot:28   http://www.worldcommunitygrid.org/   qf530_00071
CPU:1 (2%) - PID:186276 (3)   Slot:20   http://www.worldcommunitygrid.org/   qf530_00039
CPU:1 (1%) - PID:205636 (3)   Slot:29   http://www.worldcommunitygrid.org/   HFCC_target-9_01354412_target-9_0000
CPU:1 (5%) - PID:223692 (3)   Slot:9   http://www.worldcommunitygrid.org/   DSFL_000090-1_0000010_0206
CPU:1, GPU:0, PID:338072 (2)   Child:   wcgrid_dsfl_vina_prod_x86.exe.6.25
XXX: 0 (0%) - PID:240556 (3)   Slot:12   http://www.worldcommunitygrid.org/   faah34564_ZINC24193330_xJ1_xtal_00
XXX: 0 (0%) - PID:249672 (3)   Slot:19   http://www.worldcommunitygrid.org/   GFAM_x3bpmB_PfFP3_0029538_0140
CPU:1, GPU:0, PID:329808 (2)   Child:   wcgrid_gfam_vina_prod_x86.exe.6.12
XXX: 0 (0%) - PID:326408 (3)   Slot:24   http://www.worldcommunitygrid.org/   faah34564_ZINC27675489_xJ1_xtal_00
XXX: 0 (0%) - PID:312588 (3)   Slot:6   http://www.worldcommunitygrid.org/   SN2S_2X8L_1000195_0945
CPU:1, GPU:0, PID:336648 (2)   Child:   wcgrid_sn2s_vina_prod_x86.exe.6.20
XXX: 0 (0%) - PID:299780 (3)   Slot:3   http://www.worldcommunitygrid.org/   faah34570_ZINC09339383_xJ1_xtal_01
XXX: 0 (0%) - PID:274052 (3)   Slot:1   http://www.worldcommunitygrid.org/   c4cw_target06_074537065
XXX: 0 (0%) - PID:324340 (3)   Slot:5   http://www.worldcommunitygrid.org/   faah34570_ZINC09090403_xJ1_xtal_04
XXX: 0 (0%) - PID:326532 (3)   Slot:10   http://www.worldcommunitygrid.org/   c4cw_target06_074538411
XXX: 0 (0%) - PID:327184 (3)   Slot:25   http://www.worldcommunitygrid.org/   DSFL_000090-1_0000023_0227
CPU:1, GPU:0, PID:338860 (2)   Child:   wcgrid_dsfl_vina_prod_x86.exe.6.25
XXX: 0 (0%) - PID:240532 (3)   Slot:16   http://www.worldcommunitygrid.org/   GFAM_x3bpmB_PfFP3_0029595_0225
CPU:1, GPU:0, PID:332852 (2)   Child:   wcgrid_gfam_vina_prod_x86.exe.6.12
XXX: 0 (0%) - PID:328076 (3)   Slot:13   http://www.worldcommunitygrid.org/   DSFL_000090-1_0000024_0235
CPU:1, GPU:0, PID:334896 (2)   Child:   wcgrid_dsfl_vina_prod_x86.exe.6.25
XXX: 0 (0%) - PID:328196 (4)   Slot:21   http://www.worldcommunitygrid.org/   CMD2_2541-2QZU_A.clustersOccur-3BIY_A.clustersOccur_8
XXX: 0 (0%) - PID:326040 (3)   Slot:8   http://www.worldcommunitygrid.org/   X0900065551380200603231443
XXX: 0 (0%) - PID:124388 (3)   Slot:4   http://www.worldcommunitygrid.org/   SN2S_2X8L_1000195_0829
CPU:1, GPU:0, PID:339228 (1)   Child:   wcgrid_sn2s_vina_prod_x86.exe.6.20
XXX: 0 (0%) - PID:326228 (3)   Slot:30   http://lhcathome2.cern.ch/test4theory/   uc_1340953163_41242
GPU:1 0 PID:335708 (5)   Slot:15   http://setiathome.berkeley.edu/   28jn10ab.6881.16022.12.10.50
GPU:1 0 PID:336248 (5)   Slot:2   http://setiathome.berkeley.edu/   20au10af.5841.124908.14.10.241
GPU:1 0 PID:334448 (5)   Slot:7   http://setiathome.berkeley.edu/   01jl10aa.26205.23986.3.10.97
GPU:1 0 PID:334584 (5)   Slot:0   http://setiathome.berkeley.edu/   01jl10aa.26205.23986.3.10.46
GPU:1 0 PID:338868 (5)   Slot:11   http://setiathome.berkeley.edu/   01jl10aa.26205.23986.3.10.117
GPU:1 0 PID:338912 (5)   Slot:14   http://setiathome.berkeley.edu/   01jl10aa.26205.23986.3.10.115
----------------------------------------------------------------------------------------------------------------------------------------------

ChrisSibbald

One other data point...

I ran this box for a while with only one CPU and TThrottle ran fine with one Xeon E5-2687W.  When I added the second CPU I started having these issues.

fred

Quote from: ChrisSibbald on July 07, 2012, 01:22:06 PM
One other data point...

I ran this box for a while with only one CPU and TThrottle ran fine with one Xeon E5-2687W.  When I added the second CPU I started having these issues.
OK, this should be enough for some testing or a test build.

fred

Close down TThrottle
Remove the logs from: C:\Users\username\AppData\Roaming\eFMer\TThrottle\log
Unzip and move the exe to C:\Program Files\eFMer\TThrottle
http://www.efmer.eu/download/boinc/boinc_tasks/unified/TThrottle64_570_test.zip

Start TThrottle and wait until the calibration has finished.
I'd like to see the log from: C:\Users\username\AppData\Roaming\eFMer\TThrottle\log

Fire up regedit.exe and check the keys in HKEY_CURRENT_USER\Software\eFMer\TThrottle64\calibration
Send me the values of core_delta (2) and cores (16).
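
If regedit is inconvenient, the same two values can be read with a short script. This is only a sketch using Python's standard winreg module: the key path and the value names core_delta and cores come from the post above, the function name read_calibration is invented here, and on anything other than Windows it simply returns empty results.

```python
try:
    import winreg  # standard library, only available on Windows
except ImportError:
    winreg = None

# Key path as given in the post, relative to HKEY_CURRENT_USER.
CALIBRATION_KEY = r"Software\eFMer\TThrottle64\calibration"


def read_calibration(names=("core_delta", "cores")):
    """Return {name: value} for the calibration values, None where missing."""
    values = {name: None for name in names}
    if winreg is None:
        return values
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, CALIBRATION_KEY) as key:
            for name in names:
                try:
                    values[name], _value_type = winreg.QueryValueEx(key, name)
                except OSError:
                    pass  # value not present under the key
    except OSError:
        pass  # key not present (e.g. calibration never completed)
    return values


print(read_calibration())
```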

ChrisSibbald

Thanks Fred.  Just sent you an email with the log and screenshot of registry keys.

The first-time processor check completes, then TThrottle restarts and runs the first-time processor check again, and this repeats.  I let it try 5 times, then used Task Manager to kill TThrottle.

fred

Quote from: ChrisSibbald on July 05, 2012, 11:52:03 PM
I have a dual Xeon 2687W workstation and it appears TThrottle cannot handle this many cores.  During initialization TThrottle just retries and retries to determine number of cores.  Is this expected?  Do you plan to support 16 physical cores? :D
OK, now I see it: each CPU has 16 threads, so with two CPUs that's 16 threads * 2 = 32 threads, and that's more than this version supports right now (24).
Let's see what I can do. This is a lot more work, because it involves the graphs and BoincTasks as well. :-X
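
The arithmetic behind this, as a small Python sketch. The limit of 24 is the figure from the post above; the 8 cores and 2 threads per core are the E5-2687W's specification, and the helper name logical_cpus is made up for illustration.

```python
SUPPORTED_LOGICAL_CPUS = 24  # limit stated for this TThrottle version


def logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int = 2) -> int:
    """Total logical CPUs (threads) the operating system will report."""
    return sockets * cores_per_socket * threads_per_core


single = logical_cpus(sockets=1, cores_per_socket=8)
dual = logical_cpus(sockets=2, cores_per_socket=8)

print(single, single <= SUPPORTED_LOGICAL_CPUS)  # 16 True
print(dual, dual <= SUPPORTED_LOGICAL_CPUS)      # 32 False
```

This matches what was reported earlier in the thread: one CPU (16 threads) ran fine, and adding the second (32 threads) went over the limit.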

ChrisSibbald

Thanks Fred.  I would really appreciate it if you can support 32 threads.