Host with strange error

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 771932 - Posted: 22 Jun 2008, 13:40:40 UTC
Last modified: 22 Jun 2008, 13:44:42 UTC

Restarted many times at 100%, the same counts for every WU, and too much credit claimed for such a short computation time on this CPU.
http://setiathome.berkeley.edu/results.php?hostid=2046662&offset=120

Results are not validated, but...
Maximum daily WU quota per CPU 100/day
Why doesn't the quota go down?
ID: 771932 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51469
Credit: 1,018,363,574
RAC: 1,004
United States
Message 771933 - Posted: 22 Jun 2008, 13:45:28 UTC

Whatever the problem is, the host is not being granted any credit for the work....
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 771933 · Report as offensive
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 771934 - Posted: 22 Jun 2008, 13:46:46 UTC
Last modified: 22 Jun 2008, 13:49:34 UTC

Yes, but the quota is still 100 WU per day - that is the problem (it leeches WUs at a rate of 17 sec per WU!)
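
For scale, a rough back-of-the-envelope sketch in Python, using only the two figures quoted in this post (17 seconds per WU and a quota of 100 WU per day) and assuming a single CPU; the names are illustrative only:

```python
# Figures taken from the post above; names are illustrative only.
SECONDS_PER_WU = 17
DAILY_QUOTA = 100


def seconds_to_exhaust_quota(quota: int, seconds_per_wu: int) -> int:
    """Time needed to burn through one day's quota at a fixed rate per WU."""
    return quota * seconds_per_wu


total = seconds_to_exhaust_quota(DAILY_QUOTA, SECONDS_PER_WU)
print(f"{total} s, about {total / 60:.1f} minutes")
```

At that rate a full day's quota would be gone in well under an hour.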
ID: 771934 · Report as offensive
UncleVom

Joined: 25 Dec 99
Posts: 123
Credit: 5,734,294
RAC: 0
Canada
Message 771942 - Posted: 22 Jun 2008, 14:28:50 UTC - in response to Message 771934.  

Yes, but the quota is still 100 WU per day - that is the problem (it leeches WUs at a rate of 17 sec per WU!)


My guess: a virtual machine gone wild.

Why it is not stopped from getting more work may be related to this.

IIRC the work unit quota is 100 per CPU core.


UncleVom
ID: 771942 · Report as offensive
gomeyer
Volunteer tester

Joined: 21 May 99
Posts: 488
Credit: 50,370,425
RAC: 0
United States
Message 771978 - Posted: 22 Jun 2008, 16:09:57 UTC

I think the quota only goes down for errors. These are marked as "Success" even tho' he is indeed returning bad results. It looks like he got 100 for June 22, then nothing for four hours since, so he seems to have hit the quota. Glad he is not running a quad!
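
A toy model of the behaviour described here, not BOINC's actual scheduler code: it assumes the quota drops by one for each result reported as an error, never below 1, and is untouched by results reported as "Success", whether or not they later validate. The function names and outcome strings are hypothetical.

```python
MAX_DAILY_QUOTA = 100  # per-CPU figure quoted in this thread


def update_quota(quota: int, outcome: str) -> int:
    """Toy rule (an assumption): only results reported as errors lower the quota."""
    if outcome == "error":
        return max(1, quota - 1)
    # "success" covers every result that ran to completion and was reported,
    # even if validation later rejects it.
    return quota


def quota_after(outcomes, start: int = MAX_DAILY_QUOTA) -> int:
    """Apply the toy rule to a sequence of reported outcomes."""
    quota = start
    for outcome in outcomes:
        quota = update_quota(quota, outcome)
    return quota


print(quota_after(["success"] * 100))  # bad results that are still reported as success
print(quota_after(["error"] * 100))    # the same number of results reported as errors
```

Under these assumptions a host that keeps reporting "Success" never loses any quota, which matches what this host appears to be doing.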
ID: 771978 · Report as offensive
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 772005 - Posted: 22 Jun 2008, 16:56:58 UTC - in response to Message 771942.  

IIRC the work unit quota is 100 per CPU core.


It is 100 per core, but I believe BOINC still treats the number as a whole, per system. So if a single core returns a bad result, it will be lowered to 99 per core, even on a quad core.



As for a possible answer to the question "How is this person allowed to download so much while returning bad results?": could it be a bad overclock returning incorrect results instead of invalid results? I think the quota only covers a CPU returning invalid results but does nothing against incorrect results.
ID: 772005 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 772025 - Posted: 22 Jun 2008, 17:53:13 UTC - in response to Message 771978.  

I think the quota only goes down for errors. These are marked as "Success" even tho' he is indeed returning bad results.
...

That's right. Whatever has gone wrong with the host began on June 17, and the 5.8.16 core client doesn't recognize there's a problem. I think the stderr.txt stuff indicates the core client hasn't been able to clear out the slot, so the app is running WUs starting from the leftover final checkpoint. More recent versions of BOINC do checking and create a new slot if they were unable to clear an old one.
                                                                  Joe
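
A minimal sketch of the "create a new slot if the old one can't be cleared" idea, in Python rather than BOINC's own code. Everything here is hypothetical: the `pick_clean_slot` function, the numbered directory layout, the `max_slots` limit and the file name are assumptions made for illustration only.

```python
import shutil
import tempfile
from pathlib import Path


def pick_clean_slot(slots_root, clear=shutil.rmtree, max_slots=64):
    """Return an empty slot directory, skipping any slot that cannot be cleared."""
    for index in range(max_slots):
        slot = slots_root / str(index)
        try:
            if slot.exists():
                clear(slot)           # may raise OSError, e.g. on bad permissions
            slot.mkdir(parents=True)  # also fails if leftovers are still present
            return slot
        except OSError:
            continue                  # leave the stuck slot alone, try the next number
    raise RuntimeError("no usable slot directory")


def stuck(path):
    """Stand-in for a cleanup that fails, as it seems to on this host."""
    raise PermissionError(f"cannot clear {path}")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "0").mkdir()
    (root / "0" / "leftover_checkpoint.txt").write_text("99.78% done")
    slot = pick_clean_slot(root, clear=stuck)
    print(slot.name, list(slot.iterdir()))
```

With a check like this the new WU starts in an empty directory instead of picking up the leftover checkpoint.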
ID: 772025 · Report as offensive
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 772033 - Posted: 22 Jun 2008, 18:08:35 UTC - in response to Message 772025.  
Last modified: 22 Jun 2008, 18:14:44 UTC

I think the quota only goes down for errors. These are marked as "Success" even tho' he is indeed returning bad results.
...

That's right. Whatever has gone wrong with the host began on June 17, and the 5.8.16 core client doesn't recognize there's a problem. I think the stderr.txt stuff indicates the core client hasn't been able to clear out the slot, so the app is running WUs starting from the leftover final checkpoint. More recent versions of BOINC do checking and create a new slot if they were unable to clear an old one.
                                                                  Joe


I thought "success" merely meant that the WU completed and was successfully returned to the server, not necessarily meaning a 'good' or 'bad' result.
ID: 772033 · Report as offensive
Jakob Creutzfeld
Volunteer tester
Joined: 13 Oct 00
Posts: 611
Credit: 2,025,000
RAC: 0
Germany
Message 772144 - Posted: 22 Jun 2008, 21:32:49 UTC
Last modified: 22 Jun 2008, 21:34:06 UTC

Seems like BOINC is not able to clean up the working slot directory, so each unit uses an old result/state file, "restarts" at 99.78%, takes only a few seconds to complete, and claims too much credit (AFAIK credits are managed by BOINC, not the science app). Something like wrong/insufficient access/directory rights maybe?

Andy
ID: 772144 · Report as offensive
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 772282 - Posted: 23 Jun 2008, 4:31:05 UTC

So, if I understood right, the quota mechanism leaves BOINC unprotected against an intentionally modified science app that returns bad results marked as successful?
It could lead to some kind of DoS attack with great network load (new WUs could be downloaded at a very high rate in such a situation).
ID: 772282 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 772490 - Posted: 23 Jun 2008, 16:58:30 UTC - in response to Message 772282.  

So, if I understood right, the quota mechanism leaves BOINC unprotected against an intentionally modified science app that returns bad results marked as successful?
It could lead to some kind of DoS attack with great network load (new WUs could be downloaded at a very high rate in such a situation).

I doubt the difference between a quota of 100 and a reduced quota of 1 per day would matter much to someone designing that sort of DoS attack; they'd need a large number of hosts in either case to make the attack effective.

I just sent a polite PM to the system's owner, though as there are no forum posts listed for the account he probably won't see it.
                                                              Joe
ID: 772490 · Report as offensive
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 772830 - Posted: 24 Jun 2008, 10:00:38 UTC - in response to Message 772490.  
Last modified: 24 Jun 2008, 10:02:53 UTC

So, if I understood right, the quota mechanism leaves BOINC unprotected against an intentionally modified science app that returns bad results marked as successful?
It could lead to some kind of DoS attack with great network load (new WUs could be downloaded at a very high rate in such a situation).

I doubt the difference between a quota of 100 and a reduced quota of 1 per day would matter much to someone designing that sort of DoS attack; they'd need a large number of hosts in either case to make the attack effective.

I just sent a polite PM to the system's owner, though as there are no forum posts listed for the account he probably won't see it.
                                                              Joe


But why would someone turn in faulty results in the first place?
The credit claims are all the same and kind of high!
Stupid question maybe, but how could one 'prove', for instance, that this is fixed?
ID: 772830 · Report as offensive



 