1.27 still looping and using most of the CPU (98%)

Started by idahofisherman, November 14, 2011, 04:29:33 AM

idahofisherman

This has been happening since 1.20, but I have not been able to pinpoint it to anything specific. It also seems to cause boincmgr to crash with "invalid password supplied". When I reconnect using localhost, boincmgr comes up all right. Exiting BoincTasks and restarting it allows it to run for a while, but CPU usage slowly builds up again until the computer becomes unusable and boincmgr loses its connection to the client.

Are there any suggestions on how to track this down? It has become a pain in the butt because I cannot leave BoincTasks unattended for more than 12 hours.

fred

Quote from: idahofisherman on November 14, 2011, 04:29:33 AM
This has been happening since 1.20, but I have not been able to pinpoint it to anything specific. It also seems to cause boincmgr to crash with "invalid password supplied". When I reconnect using localhost, boincmgr comes up all right. Exiting BoincTasks and restarting it allows it to run for a while, but CPU usage slowly builds up again until the computer becomes unusable and boincmgr loses its connection to the client.

Are there any suggestions on how to track this down? It has become a pain in the butt because I cannot leave BoincTasks unattended for more than 12 hours.
It's not a good idea to run both the Manager and BoincTasks.
Since the BOINC Manager is having problems connecting, this points to some problem with the BOINC Manager or the BOINC client. But I've seen the BOINC Manager do this on my own computer if it gets too many WUs to handle.
It may be a memory problem in the Manager that affects BoincTasks as well.

What BOINC version are you using? What Windows version? 32 or 64 bit?

fred

For V 1.28 I will add some runtime graphs to better analyze the problem.

idahofisherman

I am presently running XP3 32 bit.

I can't seem to get BT to connect to the client without boincmgr running.

fred

Quote from: idahofisherman on November 14, 2011, 09:00:20 PM
I am presently running XP3 32 bit.

I can't seem to get BT to connect to the client without boincmgr running.
Probably related to: http://www.efmer.eu/forum_tt/index.php?topic=862.0
Could you answer the same question?

idahofisherman

I run in administrator mode at all times. I have also changed BT to start the BOINC client. That seems to work okay, and I can attach the client. I have also changed the BOINC folders and subfolders from read-only to allow all activities.

idahofisherman

Still looping. I noticed that BOINC is not running when I find BT looping. I have to kill BT with taskmgr and restart it.

fred

Quote from: idahofisherman on November 17, 2011, 08:42:56 AM
Still looping. I noticed that BOINC is not running when I find BT looping. I have to kill BT with taskmgr and restart it.
BOINC Version?
Why is the BOINC client not running?

idahofisherman

BOINC version 6.13.12. I don't know why it isn't running. It starts when I start BT. Is there somewhere I can look to see why it stopped?

fred

Quote from: idahofisherman on November 17, 2011, 05:12:38 PM
BOINC version 6.13.12. I don't know why it isn't running. It starts when I start BT. Is there somewhere I can look to see why it stopped?
6.13.12 is more of an alpha, a bit too buggy for me to test. I have one computer running it, but strange things happen.
That explains the strange behavior: a crashed client.

Pepo

This smells a lot like my issues with BT 1.25-1.27 and BOINC 6.13.9+6.13.10.
Please, idahofisherman, have you observed any of the issues I've been reporting?

I'm thinking of checking an even newer 6.13.x client. I'd rather find out sooner (alpha) than at later (pre-release) stages...
Peter

fred

I check out the new BOINC versions, but it's too much work to check all the changes. In V 10 I reported a problem writing the cc_config.xml; let's see if they fixed it.

idahofisherman

Here is the last part of the stdoutdae file from when BOINC wasn't running and BT was in a loop. Maybe this will be of some help.

18-Nov-2011 17:29:17 [---] failed to rename xfer history file: Error 5
18-Nov-2011 17:29:26 [Server for testing Bolpex] Scheduler request completed
18-Nov-2011 17:30:11 [EDGI Demo Project] Sending scheduler request: Requested by project.
18-Nov-2011 17:30:11 [EDGI Demo Project] Not reporting or requesting tasks
18-Nov-2011 17:30:14 [EDGI Demo Project] Scheduler request completed
18-Nov-2011 17:30:14 [---] Can't rename current state file to previous state file; The process cannot access the file because it is being used by another process. (0x20)
18-Nov-2011 17:30:14 [---] rename error: Access is denied. (0x5)


Here is the end of the BT log, just before I terminated BT while it was in a loop:

Elements,Port: 31416, connection error
18 November 2011 - 18:11:46 Update State ---- Host: 192.168.0.196, Rpc Thread ID: 5156, wu_1321525201_66099_0
18 November 2011 - 18:13:10 Connect, init ---- Host: localhost, Elements,Port: 31416, connection error
18 November 2011 - 18:14:29 Update State ---- Host: 192.168.0.100, Rpc Thread ID: 2600, wu_1321525201_66086_0
18 November 2011 - 18:14:45 Update State ---- Host: 192.168.0.199, Rpc Thread ID: 5444, wu_1321525201_66200_0
18 November 2011 - 18:15:24 Connect, init ---- Host: localhost, Elements,Port: 31416, connection error
18 November 2011 - 18:17:26 Connect, init ---- Host: localhost, Elements,Port: 31416, connection error
18 November 2011 - 18:18:14 Update State ---- Host: 192.168.0.102, Rpc Thread ID: 888, qcnc_001213_0
18 November 2011 - 18:19:27 Connect, init ---- Host: localhost, Elements,Port: 31416, connection error
18 November 2011 - 18:20:24 Update State ---- Host: 192.168.0.197, Rpc Thread ID: 4920, wu_1321525201_66325_0
18 November 2011 - 18:21:28 Connect, init ---- Host: localhost, Elements,Port: 31416, connection error
18 November 2011 - 18:23:28 Connect, init ---- Host: localhost, Elements,Port: 31416, connection error
18 November 2011 - 18:25:29 Connect, init ---- Host: localhost, Elements,Port: 31416, connection error
18 November 2011 - 18:26:04 Update State ---- Host: 192.168.0.102, Rpc
18-Nov-2011 17:30:14 [---] Couldn't write state file: rename() failed; giving up


It looks like it is trying to connect to the localhost BOINC client and just keeps looping.
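
One way to check from outside BT whether anything is answering on that port is a plain TCP probe. The following is only a minimal sketch: it assumes the client's port is 31416, as shown in the BT log above, and the function name is made up for illustration. A successful connect only shows that something is listening there, not that it is a healthy BOINC client.

```python
import socket


def client_is_listening(host="localhost", port=31416, timeout=2.0):
    """Return True if something accepts a TCP connection on host:port.

    Hypothetical helper: the port default is taken from the BT log,
    everything else is an assumption for illustration.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing answers on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Something is listening on 31416:", client_is_listening())
```

Running this at the moment BT reports "connection error" would show whether the port is really closed (client not running) or whether the client is up and BT fails for some other reason.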

fred

Quote from: idahofisherman on November 19, 2011, 01:51:56 AM
Here is the last part of the stdoutdae file from when BOINC wasn't running and BT was in a loop. Maybe this will be of some help.

18-Nov-2011 17:29:17 [---] failed to rename xfer history file: Error 5
18-Nov-2011 17:30:14 [---] Can't rename current state file to previous state file; The process cannot access the file because it is being used by another process. (0x20)
18-Nov-2011 17:30:14 [---] rename error: Access is denied. (0x5)

It looks like it is trying to connect to the localhost BOINC client and just keeps looping.
Something is locking files in the BOINC data folder.
I see 2 different files being locked.
Is there a rescheduler running or do you have a backup program running?
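
The rename failures can be pulled out of a longer stdoutdae log mechanically, which helps to see how often they occur and which files are involved. This is a rough sketch only: the line format is inferred from the excerpt quoted above, the function and variable names are made up, and "Error 5" is assumed to be a decimal Windows error code. The two code names are the standard Windows ones for 5 and 32 (0x20).

```python
import re

# Standard Windows names for the two codes seen in the excerpt.
WINDOWS_ERRORS = {
    0x5: "ERROR_ACCESS_DENIED",
    0x20: "ERROR_SHARING_VIOLATION",
}

# Line layout inferred from the excerpt: "<date> <time> [<project>] <message>"
LINE_RE = re.compile(r"^(\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(.*?)\] (.*)$")
# Either a trailing "(0x20)" style hex code or "Error 5" (assumed decimal).
CODE_RE = re.compile(r"\(0x([0-9A-Fa-f]+)\)|Error (\d+)\s*$")

SAMPLE = [
    "18-Nov-2011 17:29:17 [---] failed to rename xfer history file: Error 5",
    "18-Nov-2011 17:29:26 [Server for testing Bolpex] Scheduler request completed",
    "18-Nov-2011 17:30:14 [---] Can't rename current state file to previous state file; "
    "The process cannot access the file because it is being used by another process. (0x20)",
    "18-Nov-2011 17:30:14 [---] rename error: Access is denied. (0x5)",
]


def find_rename_failures(lines):
    """Return (timestamp, message, code) for every log line that mentions a rename."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or "rename" not in m.group(3).lower():
            continue
        code = None
        c = CODE_RE.search(m.group(3))
        if c:
            code = int(c.group(1), 16) if c.group(1) else int(c.group(2))
        hits.append((m.group(1), m.group(3), code))
    return hits


if __name__ == "__main__":
    for stamp, message, code in find_rename_failures(SAMPLE):
        print(stamp, WINDOWS_ERRORS.get(code, "unknown"), "|", message)
```

A sharing violation (0x20) means another process has the file open, which is why a backup program, virus scanner, or rescheduler is a natural suspect.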