Remote Read Error

Started by spurtle, September 22, 2010, 11:09:28 PM


spurtle

Greetings.

I have 7 (6+localhost) machines monitored by BT from my main PC.  I have now installed BT on one of the other machines on the network and can't get it to connect to one other machine - the others all connect fine.  All machines including the troublesome one are still connected OK to the 1st copy of BT.
The problem shows as being a read error -102.
Have extensively read old threads regarding connection issues but not found any solution.

Running BT .71 on Win XP SP3. All config files are identical and the password is common to all machines. I have traditionally used names in the hosts files without trouble, and have since added the IP (and rebooted), but no change.
All running the same firewall and AV settings. Don't have TThrottle on any.

Any suggestions welcome and appreciated as I've run out of ideas.

fred

Try updating to the latest beta, 0.74, first. It's stable, and then you'll have the same version as most of the testers and me.
Where do you see the -102 error? If it's in the BT log, please give me the entire line of text.

spurtle

Hi Fred,

Yes BT log....
23 September 2010 - 20:22:38 Connect, read error ---- Host: 10.1.1.9,cruncher,Port: 31416, Error Number: -102

Just updated to .74 no change.

31416 is open and boinc.exe is listening on it....

Protocol           [TCP]
Program [PID]   boinc.exe [2704]
State              Established
Local Address   10.1.1.9 (cruncher)
Port              31416
Remote Address    10.1.1.13
Remote Port   1052
Path and File   C:\Program Files\BOINC\boinc.exe   
Description   BOINC client

(NB 10.1.1.13 is main machine)
When I close BT on main machine....

Protocol           [TCP]
Program [PID]   boinc.exe [2704]
State              Listening
Local Address   0.0.0.0  (cruncher)
Port              31416
Remote Address    0.0.0.0
Remote Port   0
Path and File   C:\Program Files\BOINC\boinc.exe   
Description   BOINC client

Since my last post I have tried remote connections using Boinc Mgr. Same result.
I can connect to all other machines except cruncher from 2nd BT machine. 
Main machine connects through BM to any/all including cruncher without any problems.
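
For anyone repeating the check above from the machine that cannot connect, a bare TCP probe can be sketched in a few lines of Python. This is only a minimal sketch: the helper name can_connect is made up for illustration, and a successful probe only shows that the port is reachable. It says nothing about whether BOINC will then accept the GUI RPC request.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it does not speak the BOINC
    GUI RPC protocol, it only tests reachability of the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run on the second BT machine, can_connect("10.1.1.9", 31416) would use the address and port from the log line quoted in this post.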

spurtle

Not sure if relevant but might eliminate some possibilities...

Cruncher is a boinc dedicated machine.
Well not even a whole machine - just a mobo, psu and hdd sitting on a shelf in the basement.
Nothing else runs on it other than OS, AV and Boinc.
I operate it by remote desktop.

fred

Error -102 (read error) normally means no response at all from the remote machine.
This indicates a firewall problem or BOINC refusing to connect.
Do you see any messages in the BOINC log on the remote computer, like connection refused?

spurtle

From boinc messages...

23/09/2010 10:42:51 p.m.      GUI RPC request from non-allowed address 10.1.1.10

I have turned off the Windows firewall, uninstalled AVG and rebooted, to no avail.
What stumps me is that one machine can connect and the other can't.
If it was firewall or AV I'd expect both machines to be blocked.

(10.1.1.10 is the correct IP for the BT machine that can't connect)

fred

Quote from: spurtle on September 23, 2010, 11:51:40 AM
From boinc messages...

23/09/2010 10:42:51 p.m.      GUI RPC request from non-allowed address 10.1.1.10
This means BOINC is not responding to the BT request because it thinks it's an invalid address.
So the firewall is ok, BOINC sees the request.

Make sure the ip address is in the list.

http://www.boinc-wiki.info/GUI_RPC_request_from_non-allowed_address_%27%28ip-address%29%27

spurtle

Indeed it is.
Started out with names as mentioned in my 1st post.
Have since added IP.
Current status is name only of main machine (connects), name plus IP of second machine.
Have redone the file twice to ensure no invisible spaces or other corruption exists.
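
Rather than retyping the file, the check for invisible characters could be automated. The sketch below assumes the allowed-hosts file is plain ASCII text with one name or IP per line (in BOINC this is normally remote_hosts.cfg, which the thread does not name explicitly). The function name find_suspicious_lines is made up for illustration.

```python
def find_suspicious_lines(text):
    """Return (line_number, reason) pairs for lines that may hide stray characters.

    Assumes one host name or IP address per line, plain ASCII.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Spaces or tabs around an entry are easy to miss in an editor.
        if line != line.strip():
            problems.append((number, "leading or trailing whitespace"))
        # Flags control characters, a byte order mark, and anything outside ASCII.
        if any(ord(ch) > 126 or (ord(ch) < 32 and ch != "\t") for ch in line):
            problems.append((number, "non-printable or non-ASCII character"))
    return problems
```

Reading the file with open(path, encoding="utf-8").read() and passing the result in should return an empty list for a clean file.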

fred

There is another way to allow RPC connections.

http://boincfaq.mundayweb.com/index.php?language=1&view=91 describes it as a cc_config.xml option, if the BOINC version is not too old.
Look for allow_remote_gui_rpc. It is also available as a command line option for boinc.exe, prefixed with --.

spurtle

Already had the cc_config option in place before posting.
Obviously that didn't help.

It sure is a head scratcher.
I just don't see any logical answer.
If cruncher had a problem, main machine also shouldn't connect.
If 2nd machine had a problem it shouldn't connect to the other 5 machines.

I have stopped fetching new tasks and when cruncher runs out of work in about 6 hrs I'm going to reinstall Boinc.  Don't have any reason to think that will fix it but trying anything at this stage.

spurtle

Reinstalled Boinc for no change.
Reset router and tried several other things too.

Then I uninstalled all the network drivers on the machine which I have BT on.
Installed latest drivers.
Bingo, that fixed cruncher.  Connecting OK now.

Downside is I now have 2 other machines I can't connect to with all the same symptoms. Identical messages etc. Main machine still connects to everything fine.
Definitely a case of one step forward and two back!
Getting close to the big hammer repair.

Seems to me this is not a BT defect at all, but any further suggestions are welcomed.

Cheers.



fred

Indeed the problem is not BT related.
The best help you can get is here:
http://boinc.berkeley.edu/dev/forum_forum.php?id=10
They know a lot more about BOINC than I do.

fred

Another way to check your allowed IP addresses.

Check the first lines of the BOINC log closely after startup.

3         24-09-2010 06:03:29   Config: GUI RPC allowed from:   
4         24-09-2010 06:03:29   Config:   192.168.xxx.xxx   

It lists all allowed IP addresses.

I'm using 6.10.56
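
If the startup log has been saved to a text file, the allowed list can also be pulled out with a short script. This is a rough sketch that assumes the log looks like the two lines quoted above (a "GUI RPC allowed from:" header followed by one "Config: <address>" line per entry). Other BOINC versions may format this differently, and the function name allowed_addresses is made up for illustration.

```python
import re

HEADER = "GUI RPC allowed from:"
# One address per line, e.g. "... Config:   10.1.1.13" (format assumed from the lines above).
ENTRY = re.compile(r"Config:\s+(\S+)\s*$")


def allowed_addresses(log_lines):
    """Collect the addresses listed directly after the 'GUI RPC allowed from:' line."""
    allowed = []
    in_list = False
    for line in log_lines:
        if HEADER in line:
            in_list = True
            continue
        if in_list:
            match = ENTRY.search(line)
            if match is None:
                break  # First line that is not a single address ends the list.
            allowed.append(match.group(1))
    return allowed
```

If the address of the machine running BT is missing from the returned list, that would match the "non-allowed address" message seen earlier in the thread.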

spurtle

Solved!!  Party time! ;D
Put hammer away.

Went out for a while and on return noticed in BT that machine 5 had picked up a new task for a project it isn't allowed to run.  How strange.  Checked project website and found that task was issued to machine 9. Ah ha!
Carefully compared work in BT with work in BMgr on machines.  Oops nothing adds up in several machines.
BT computers tab -> IP addresses are wrongly assigned to machines.  eg BT shows machine 9's IP and data as being from machine 5.
Manually changed/corrected 4 IP addresses and all is well - with all computers now connected.

How this came about I have no idea.  Perhaps it was a BT error all along? I'm not even sure if this was the problem from the beginning of the troubles.

Anyway thanks for your generous efforts to help Fred.

Cheers

spurtle

I can now confirm that this was the problem all along.

On the project website I have checked the returned task results of cruncher from a couple of days ago and BT says they were returned by another machine.

It seems to me that following numerous machine boots, router boots and a BT upgrade, this instance of BT has somehow failed to recognise reassigned dynamic LAN IPs.
The BT on the main machine has clearly coped as it has not failed to connect all along.
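
A mismatch like this can be spotted by comparing the IP stored for each computer with what the name currently resolves to. The following is only a sketch under assumptions: the name-to-IP mapping has to be copied by hand from the BT computers tab, the function name find_stale_entries is made up, and resolution relies on whatever the local hosts file or DNS returns.

```python
import socket


def find_stale_entries(configured, resolve=socket.gethostbyname):
    """Return {name: (configured_ip, current_ip)} for entries that no longer match.

    configured maps a computer name to the IP stored for it.
    current_ip is None when the name cannot be resolved.
    """
    stale = {}
    for name, configured_ip in configured.items():
        try:
            current_ip = resolve(name)
        except OSError:
            # socket.gaierror (name not found) is a subclass of OSError.
            current_ip = None
        if current_ip != configured_ip:
            stale[name] = (configured_ip, current_ip)
    return stale
```

An empty result means every stored IP still matches its name. Giving the machines fixed addresses in the router would avoid the reassignment in the first place.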