Automatic detach and reattach to Seti

HDL
Volunteer tester

Joined: 20 Apr 05
Posts: 27
Credit: 11,577,352
RAC: 0
United Kingdom
Message 545407 - Posted: 13 Apr 2007, 10:36:53 UTC
Last modified: 13 Apr 2007, 10:38:03 UTC

I am using BOINC 5.9.0.32, created by Crunch3r, on 3 computers. I first installed 5.8.11 from Berkeley, then overwrote it with 5.9.0.32. It worked well for a few days. Then, over the last few days, it has automatically detached from and reattached to SETI several times. Last night in particular, one computer detached and reattached twice. Now I have 4 copies of the same computer, although all the work units are intact.

An even bigger problem: when the host is detached from SETI, Berkeley sends the detached work units to other computers. If those computers finish the work units before mine does, my computer gets no credit, because the server says the results are too late to be validated, even though the original deadline is weeks away.

Does anybody else have the same problem? Could it be BOINC 5.9.0.32?
ID: 545407 · Report as offensive
michael37

Joined: 23 Jul 99
Posts: 311
Credit: 6,955,447
RAC: 0
United States
Message 545460 - Posted: 13 Apr 2007, 13:58:05 UTC

Just happened to me. The host was fine yesterday. During the night, it detached and re-attached. I am using fairly old venerable boinc 5.4.9.

4/13/2007 8:18:49 AM	1444	Reason: To report completed tasks
4/13/2007 8:18:49 AM	1445	Reporting 1 tasks
4/13/2007 8:18:54 AM	1446	Scheduler request succeeded
4/13/2007 8:18:54 AM	1447	Message from server: Can't find host record
4/13/2007 8:18:54 AM	1448	Generated new host CPID: b7047f0face637a4cfe5b3118b716358


ID: 545460 · Report as offensive
HDL
Volunteer tester

Joined: 20 Apr 05
Posts: 27
Credit: 11,577,352
RAC: 0
United Kingdom
Message 545489 - Posted: 13 Apr 2007, 14:41:42 UTC - in response to Message 545460.  
Last modified: 13 Apr 2007, 14:42:46 UTC

Just happened to me. The host was fine yesterday. During the night, it detached and re-attached. I am using fairly old venerable boinc 5.4.9.

4/13/2007 8:18:49 AM	1444	Reason: To report completed tasks
4/13/2007 8:18:49 AM	1445	Reporting 1 tasks
4/13/2007 8:18:54 AM	1446	Scheduler request succeeded
4/13/2007 8:18:54 AM	1447	Message from server: Can't find host record
4/13/2007 8:18:54 AM	1448	Generated new host CPID: b7047f0face637a4cfe5b3118b716358


I checked my messages; they are similar to the ones reported above:

2007-04-12 21:35:33 [SETI@home] Scheduler RPC succeeded [server version 509]
2007-04-12 21:35:33 [SETI@home] Message from server: Can't find host record
2007-04-12 21:35:33 [---] General prefs: from SETI@home (last modified 2007-04-12 21:17:10)
2007-04-12 21:35:33 [---] Host location: work
2007-04-12 21:35:33 [---] General prefs: no separate prefs for work; using your defaults
2007-04-12 21:35:33 [SETI@home] Deferring communication 11 sec, because requested by project
2007-04-12 21:35:33 [SETI@home] Generated new host CPID: b438405162018a11f281f91b02edbd9b

This detach-attach phenomenon is reported in another thread. It looks like a SETI-wide problem.

Most annoyingly, a few of my finished work units did not get credit.
ID: 545489 · Report as offensive
ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21731
Credit: 7,508,002
RAC: 20
United Kingdom
Message 545525 - Posted: 13 Apr 2007, 16:54:51 UTC - in response to Message 545489.  
Last modified: 13 Apr 2007, 16:55:25 UTC

...This detach-attach phenomenon is reported in another thread. It looks like a SETI-wide problem.

Most annoyingly, a few of my finished work units did not get credit.

By how many people so far?

(I've not seen this yet but then again I'm stuck in EDF :-( ... )

There are various server and database shufflings in progress. Something may well have "Ooooopsied". Then again, are you sure that you do not have filesystem or anti-virus or even virus problems?

Happy crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 545525 · Report as offensive
HDL
Volunteer tester

Joined: 20 Apr 05
Posts: 27
Credit: 11,577,352
RAC: 0
United Kingdom
Message 545619 - Posted: 13 Apr 2007, 19:50:24 UTC - in response to Message 545525.  

...This detach-attach phenomenon is reported in another thread. It looks like a SETI-wide problem.

Most annoyingly, a few of my finished work units did not get credit.

By how many people so far?

(I've not seen this yet but then again I'm stuck in EDF :-( ... )

There are various server and database shufflings in progress. Something may well have "Ooooopsied". Then again, are you sure that you do not have filesystem or anti-virus or even virus problems?

Happy crunchin',
Martin

See this thread: http://setiathome.berkeley.edu/forum_thread.php?id=38714. They are reporting the same detach/attach problems, and WUs are not getting credit.

ID: 545619 · Report as offensive
Temujin
Volunteer tester

Joined: 19 Oct 99
Posts: 292
Credit: 47,872,052
RAC: 0
United Kingdom
Message 545715 - Posted: 13 Apr 2007, 22:55:14 UTC

I've had 4 machines detach and re-attach with no input from me, all in the last 3-4 days.
The machines involved are a mixture of WinXP and Linux, but all are running boinc-5.8.11 and Chicken2.2B.
ID: 545715 · Report as offensive
kittyman
Volunteer tester

Joined: 9 Jul 00
Posts: 51540
Credit: 1,018,363,574
RAC: 1,004
United States
Message 545784 - Posted: 14 Apr 2007, 0:56:23 UTC

Whoa.........what's all this detach and reattach crap????? The BOINC client should not be able to do this without the user's input, NO? And the application...never!!
Has anybody figured out how this is happening? I have not had the misfortune. All of the 'file not found' locked-up downloads cleared themselves when Matt fixed the database. If one of my rigs detached on its own, I might have to shoot it.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 545784 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 545815 - Posted: 14 Apr 2007, 2:11:59 UTC

I agree it's definitely strange, but there have been too many reports to just dismiss it as "nothing".

Notice the message tab snippets indicating the project side had "lost" the host ID record and/or CPID. That would explain the appearance of the spontaneous detaches in the Host Summary and wouldn't take any action from the host other than making a connection to the scheduler. Same kind of thing happens when you restore from a backup and the RPC Sequence Number is wrong.
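For anyone who wants to check their own message log for this pattern, here is a rough Python sketch. It only looks for the two messages quoted earlier in this thread ("Can't find host record" followed by "Generated new host CPID"). The function name and the sample list are made up for illustration, and the 32-hex-digit CPID format is assumed from the snippets posted above, not taken from BOINC itself.

```python
import re

# Server reply seen in the snippets posted in this thread.
HOST_LOST = "Can't find host record"
# Assumption: a CPID is 32 lowercase hex digits, as in the posted logs.
NEW_CPID = re.compile(r"Generated new host CPID: ([0-9a-f]{32})")


def find_lost_host_events(lines):
    """Return the CPIDs generated after a "Can't find host record" reply."""
    events = []
    host_lost = False
    for line in lines:
        if HOST_LOST in line:
            # Remember that the scheduler did not recognise this host.
            host_lost = True
            continue
        match = NEW_CPID.search(line)
        if match and host_lost:
            events.append(match.group(1))
            host_lost = False
    return events


# Sample lines copied from the log snippet earlier in this thread.
sample = [
    "4/13/2007 8:18:54 AM\t1446\tScheduler request succeeded",
    "4/13/2007 8:18:54 AM\t1447\tMessage from server: Can't find host record",
    "4/13/2007 8:18:54 AM\t1448\tGenerated new host CPID: b7047f0face637a4cfe5b3118b716358",
]
print(find_lost_host_events(sample))
```

A new CPID that is not preceded by the "Can't find host record" reply is ignored, so a normal first-time attach should not be counted.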

I'm guessing the troubles with the "Zombie Hunt" are the reason for this.

Alinator
ID: 545815 · Report as offensive
michael37

Joined: 23 Jul 99
Posts: 311
Credit: 6,955,447
RAC: 0
United States
Message 545938 - Posted: 14 Apr 2007, 6:55:16 UTC

I decided to reset the project on my computers affected by the automatic detach and reattach problem. It seems to clear all the after-effects.

ID: 545938 · Report as offensive
John Clark
Volunteer tester

Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 545957 - Posted: 14 Apr 2007, 8:05:53 UTC
Last modified: 14 Apr 2007, 8:38:55 UTC

I was haunted by the "client detached" issue (other detach/attach thread), and got fed up with the OK reporting and zero WU result recognition.

I did not detach from the project and then reattach. I have aborted all the WUs in my cache and started again. Problem is the 24 hour maximum number of WUs rule is restricting me. But this will clear in a day or so!

Looks like it's cleared now!

Just returned the results from the first 4 WUs, and the pending credit rose by the appropriate amount. So, it looks like things have/are returning to normal (from a first, and, very small sample).
It's good to be back amongst friends and colleagues



ID: 545957 · Report as offensive
Ivailo Bonev
Volunteer tester
Joined: 26 Jun 00
Posts: 247
Credit: 35,864,461
RAC: 2
Bulgaria
Message 545973 - Posted: 14 Apr 2007, 9:06:04 UTC
Last modified: 14 Apr 2007, 9:08:27 UTC

My main workhorse detached twice in the last month without user assistance:

12 April
30 March
and I lost many hours of work without credit for some units (here: Lost Credits, and there: Another Loss) :(
ID: 545973 · Report as offensive
Fivestar Crashtest
Volunteer tester

Joined: 10 Dec 99
Posts: 226
Credit: 5,377,978
RAC: 0
United States
Message 546163 - Posted: 14 Apr 2007, 19:10:29 UTC

I found this had happened to me today on my FX-60. :( It had been RACing up nicely until today. I looked at the result page and saw everything was "client detached" yet when I looked at the actual computer, everything looked normal. When the results get reported they get no credit. And they were the ones in the 60's, too.

I am using 5.4.11 BOINC and the 2.2B app.

Sad crunching today,

Pam
ID: 546163 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 546189 - Posted: 14 Apr 2007, 20:13:59 UTC
Last modified: 14 Apr 2007, 20:15:03 UTC

I don't know what you guys do to deserve such bad luck like that. :-(

I deliberately leave mine alone as much as possible, and sometimes even go as far as trying to set one or two up to fail in the ways people describe here in the forum, to get some better insight into what might be going on to cause it.

It seems like no matter what I do, mine just keep Timex'ing along and simply ignore my attempts to break them. ;-)

I think we can pretty well safely say this spontaneous detaching issue is a project backend issue. So there's not much you can do to prevent it on your end, except maybe reduce your CI and keep a closer eye on your hosts to minimize the "damage" if and when it happens.

Alinator
ID: 546189 · Report as offensive
HDL
Volunteer tester

Joined: 20 Apr 05
Posts: 27
Credit: 11,577,352
RAC: 0
United Kingdom
Message 546263 - Posted: 14 Apr 2007, 22:33:55 UTC - in response to Message 546163.  

I found this had happened to me today on my FX-60. :( It had been RACing up nicely until today. I looked at the result page and saw everything was "client detached" yet when I looked at the actual computer, everything looked normal. When the results get reported they get no credit. And they were the ones in the 60's, too.

I am using 5.4.11 BOINC and the 2.2B app.

Sad crunching today,

Pam

I changed from 5.9.0.32 to 5.8.15 and still have the detach problem.

For most of your detached ones, if your host can crunch them and report the results before others send in their results, you can still get credit; otherwise it will say too late to be validated.

At the moment, I am just reducing the size of my cache, and crunching and reporting the results as soon as possible.

ID: 546263 · Report as offensive
Franz Bauer

Joined: 8 Feb 01
Posts: 127
Credit: 9,690,361
RAC: 0
Canada
Message 546515 - Posted: 15 Apr 2007, 7:20:28 UTC

I had about 50 of these "client detached" work units on 1 of my computers in one big batch. I still have 11 to get rid of.

Franz
ID: 546515 · Report as offensive
Ivailo Bonev
Volunteer tester
Joined: 26 Jun 00
Posts: 247
Credit: 35,864,461
RAC: 2
Bulgaria
Message 548581 - Posted: 18 Apr 2007, 16:47:16 UTC - in response to Message 545973.  
Last modified: 18 Apr 2007, 17:01:46 UTC

My main workhorse detached twice in the last month without user assistance:

12 April
30 March
and I lost many hours of work without credit for some units (here: Lost Credits, and there: Another Loss) :(

Another (3rd) spontaneous detach in one month on the same computer; it's too much for me... I think we have a bug in the BOINC server backend...
ID: 548581 · Report as offensive
kittyman
Volunteer tester

Joined: 9 Jul 00
Posts: 51540
Credit: 1,018,363,574
RAC: 1,004
United States
Message 548587 - Posted: 18 Apr 2007, 16:51:01 UTC - in response to Message 548581.  

My main workhorse detached twice in the last month without user assistance:

12 April
30 March
and I lost many hours of work without credit for some units (here: Lost Credits, and there: Another Loss) :(

Another (3rd) spontaneous detach in one month on the same computer; it's too much for me... I think we have a bug in the BOINC server backend...


'Ya think??

"Time is simply the mechanism that keeps everything from happening all at once."

ID: 548587 · Report as offensive



 