Rash of Validate errors

Message boards : Number crunching : Rash of Validate errors

whawn
Joined: 11 Apr 00
Posts: 18
Credit: 1,053,191
RAC: 2
United States
Message 714586 - Posted: 18 Feb 2008, 2:17:31 UTC

I have been suffering a rash of validate errors the past few days. None of the WUs processed today succeeded, and several 'validate errors' from previous days remain on the results page, though not as many as originally occurred.

I haven't seen a lot of discussion here, so I'm wondering whether I'm in some way unique.

I know this problem has occurred in the past, and a clean-up script has been used to fix the trouble. But I'm wondering whether my WUs are being picked up by that script, or are simply disappearing from my results page?
ID: 714586
Shane Meyer
Volunteer tester
Joined: 22 Jan 00
Posts: 126
Credit: 31,280,265
RAC: 42
Australia
Message 714618 - Posted: 18 Feb 2008, 3:50:27 UTC - in response to Message 714586.  

No one will be able to help you while your computers are hidden.
They can be caused by overclocking your computer too much, too much heat, and a lot of other things, but without seeing the WUs, more targeted help cannot be supplied.

ID: 714618
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 714634 - Posted: 18 Feb 2008, 4:50:24 UTC - in response to Message 714618.  


No one will be able to help you while your computers are hidden.
They can be caused by overclocking your computer too much, too much heat, and a lot of other things, but without seeing the WUs, more targeted help cannot be supplied.


Validate Error is not the same as Compute Error.

Validate Error has been substantiated as a problem on SETI's side of the system: it occurs when the uploaded result file is not present at the time the validator goes to validate a workunit.

Unhiding the computer(s) will give access to the result list, but it will not allow anyone to provide any additional insight beyond suggesting a BOINC client that doesn't report immediately, or setting the connect interval high enough that it isn't reporting immediately, so as to reduce the occurrences of the issue.

For more information, the Validate Errors II thread can provide a wealth of information on the subject.
ID: 714634
whawn
Joined: 11 Apr 00
Posts: 18
Credit: 1,053,191
RAC: 2
United States
Message 714644 - Posted: 18 Feb 2008, 5:58:31 UTC - in response to Message 714634.  


Unhiding the computer(s) will give access to the result list, but it will not allow anyone to provide any additional insight beyond suggesting a BOINC client that doesn't report immediately, or setting the connect interval high enough that it isn't reporting immediately, so as to reduce the occurrences of the issue.

For more information, the Validate Errors II thread can provide a wealth of information on the subject.


Thanks, Brian,

However, my client is not reporting immediately. It usually reports at the next upload opportunity, so there is anywhere from one to six hours between an upload and reporting. For the past several upload attempts, I'm typically seeing this:

2/17/2008 9:50:20 PM|SETI@home|Computation for task 30dc06af.6706.15205.12.7.171_2 finished
2/17/2008 9:50:20 PM|SETI@home|Starting 05dc06ae.16424.4980.7.7.100_1
2/17/2008 9:50:20 PM|SETI@home|Starting task 05dc06ae.16424.4980.7.7.100_1 using setiathome_enhanced version 527
2/17/2008 9:50:23 PM|SETI@home|Started upload of 30dc06af.6706.15205.12.7.171_2_0
2/17/2008 9:50:24 PM|SETI@home|[error] Error on file upload: no command
2/17/2008 9:50:24 PM|SETI@home|Giving up on upload of 30dc06af.6706.15205.12.7.171_2_0: fatal upload error

And I'm now seeing several files (including the one referenced above) as 'ready to report' even though the system has 'given up on upload...'

Trying to follow the thread you speak of is both difficult and not especially enlightening.

For example, I haven't seen anything concerning whether the affected files are found and fixed or simply disappear, with a consequent waste of time and electrons.

I thought starting a new thread (the old one began way back last May) might bring a more compact view of the situation.

Thanks,
Walt
ID: 714644
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 714648 - Posted: 18 Feb 2008, 6:09:58 UTC - in response to Message 714644.  



Several people have noticed the "fatal upload error" / "no command" situation. In theory, a script is run every 24-48 hours that is supposed to find results in a Validate Error condition and "correct" the issue. Not sure how that correction is implemented. If it blanket grants credits, then that would seem "good for everyone", but what if a result should've been invalid?

Then there's the whole other kettle o' fish: What if results that should've been validated are not picked up by the script? If valid submissions are not validated, that reduces total cr/sec values based on actual work performed. This is another flaw in the credit system here that needs correction before rubber stamping this project as a "reference"...

Anyway, carry on... You may want to go ahead and unhide your computers, just so that you don't get that same request again. Unless you have some sort of Non-Disclosure Agreement that you've signed due to work issues or if you are a tester for Engineering Sample processors from Intel or AMD, there's no harm in showing your computers. The computer name and IP address are only shown to you. They are not shown to the whole world.

Brian

ID: 714648
Jason A. Countryman
Volunteer tester
Joined: 29 Aug 03
Posts: 139
Credit: 50,172,873
RAC: 2
United States
Message 714659 - Posted: 18 Feb 2008, 6:51:30 UTC

I had the same problem, which I wrote about in another thread. I just restarted BOINC and it went away.
ID: 714659
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 714660 - Posted: 18 Feb 2008, 6:55:16 UTC - in response to Message 714644.  

For the past several upload attempts, I'm typically seeing this:

2/17/2008 9:50:20 PM|SETI@home|Computation for task 30dc06af.6706.15205.12.7.171_2 finished
2/17/2008 9:50:20 PM|SETI@home|Starting 05dc06ae.16424.4980.7.7.100_1
2/17/2008 9:50:20 PM|SETI@home|Starting task 05dc06ae.16424.4980.7.7.100_1 using setiathome_enhanced version 527
2/17/2008 9:50:23 PM|SETI@home|Started upload of 30dc06af.6706.15205.12.7.171_2_0
2/17/2008 9:50:24 PM|SETI@home|[error] Error on file upload: no command
2/17/2008 9:50:24 PM|SETI@home|Giving up on upload of 30dc06af.6706.15205.12.7.171_2_0: fatal upload error

And I'm now seeing several files (including the one referenced above) as 'ready to report' even though the system has 'given up on upload...'
Thanks,
Walt



Hi Walt,

On Feb 16, I noticed a string of those same errors on my laptop that started about 18:00 UTC on Feb 14. And now that I look at the results list for that system, I see a bunch with "Validate error" from Feb 15-16.

I recall noticing a "runaway" workunit back on Feb 13 or 14, one that had exceeded its estimated runtime and was still less than 50% complete, and I had cancelled it. Coincidence? Maybe.

Anyway, the problems went away after restarting BOINC and I'm now getting credit for my completed WU's once again. Yeah:-)

Good luck,
John

ID: 714660
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 714734 - Posted: 18 Feb 2008, 10:18:37 UTC - in response to Message 714644.  

Trying to follow the thread you speak of is both difficult and not especially enlightening.

I agree wholeheartedly. Let's try and summarise the state of knowledge.

I'm now convinced that there are (at least) two quite separate and distinct causes of validate errors, with two separate and distinct outcomes.

1) The general, low-level, one or two at a time amongst good results, that we all get - but some people seem to get more than others. Reposting:
I think regular readers know my take on this thread:

  • 'Validate error' (and we're not talking about anything else here, like computation errors) tends only to happen if you try to "report" a result too soon after "uploading" it.
  • 'Reporting too soon' can happen sporadically to anyone, but is endemic to BOINC v5.10.(<=13) with a connect interval of zero, or with certain third-party 'optimised' BOINC clients.
  • The staff run a manual script once every 24 hours or so, to correct as many of these errors as they can.
  • The script does not parse this thread looking for result ID numbers! It works directly on the underlying database at Berkeley.



These 'Type 1' validate errors seem to be a simple timing problem: the uploaded result data file isn't available quickly enough. When the script runs (some hours later), the uploaded data is available, the validators can do their work, and if the work you've uploaded is valid, then you can be awarded credit for it as per the normal rules.
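The 'Type 1' timing window can be made concrete by measuring how long a client waited between finishing an upload and reporting the result. Below is a minimal Python sketch, assuming log lines in the timestamp format that BOINC 5.10 clients write (as in the log excerpts posted in this thread). The function name is made up for illustration, and nobody in the thread states how long a gap is long enough, so the sketch only reports the gap rather than judging it.

```python
from datetime import datetime

# Timestamp layout of lines such as "18-Feb-2008 08:37:09 [SETI@home] ..."
# (assumes English month abbreviations).
STAMP = "%d-%b-%Y %H:%M:%S"


def seconds_between_upload_and_report(lines):
    """Return seconds from the last finished upload to the next scheduler request.

    Returns None if the log never shows an upload followed by a request.
    """
    last_upload = None
    for line in lines:
        when = datetime.strptime(line[:20], STAMP)
        if "Finished upload of" in line:
            last_upload = when
        elif "Sending scheduler request" in line and last_upload is not None:
            return (when - last_upload).total_seconds()
    return None


sample = [
    "18-Feb-2008 08:37:09 [SETI@home] Finished upload of 22no06ae.27122.3753.12.7.16_1_0",
    "18-Feb-2008 08:37:23 [SETI@home] Sending scheduler request: Requested by user. "
    "Requesting 665658 seconds of work, reporting 17 completed tasks",
]
gap = seconds_between_upload_and_report(sample)
print(gap)  # 14.0
```

A short gap does not prove a 'Type 1' error will follow; it only shows how little time the server had to make the uploaded file available before the result was reported.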

2) Error on file upload: no command
These have also always been around, but - partly because of the other dross in the 'Validate errors II' thread - I hadn't noticed them until I got bitten by one myself last week. I think that, under normal conditions, they happen less often than 'Type 1' validate errors.

'Type 2' errors are much more serious:
a) Once a host computer gets itself into this state, every finished task seems to end with an "Error on file upload: no command"
b) Because the file upload is cancelled permanently and completely, there will never be any result data at Berkeley to validate, and the re-validation script can't help. (Brian, please note). You will never get any credit for these.

Fortunately, something as simple as restarting BOINC seems to break the cycle of upload errors, and restore normal science/credit services.

So my advice to people who notice a 'Validate error' becomes:

Look in the message log. If you see any sign of an "Error on file upload: no command", restart BOINC immediately (that's all you can do). If there are no upload errors, check your BOINC version, connect interval etc. as before, and wait for the script to run.
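The advice above amounts to a two-way check on the message log, which can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post: the function name and the returned strings are made up, and the only thing taken from the real client is the error text quoted in the thread.

```python
# Error text exactly as it appears in the message logs quoted in this thread.
NO_COMMAND = "Error on file upload: no command"


def diagnose(log_lines):
    """Apply the rule of thumb: any 'no command' upload error means Type 2."""
    for line in log_lines:
        if NO_COMMAND in line:
            return "type 2: restart BOINC"
    return "type 1: check BOINC version and connect interval, then wait for the script"


sample = [
    "2/17/2008 9:50:23 PM|SETI@home|Started upload of 30dc06af.6706.15205.12.7.171_2_0",
    "2/17/2008 9:50:24 PM|SETI@home|[error] Error on file upload: no command",
    "2/17/2008 9:50:24 PM|SETI@home|Giving up on upload of 30dc06af.6706.15205.12.7.171_2_0: fatal upload error",
]
print(diagnose(sample))
```

Because the check is a plain substring match, it works on both log layouts seen in this thread (pipe-separated and bracketed).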

--------------------------------------

I'm treating 'Type 2' validate errors as a bug in the BOINC system: I'm not sure yet whether it's a bug in the Server code or the Client code, or some subtle deadlock between the two that only rarely happens. I have received one assertion (by PM) that this should not happen with BOINC clients v5.10.30 and above: I'd be interested if anyone can substantiate a 'Type 2' validate error with the latest BOINC clients (version number and platform, please). Further detailed error reports and diagnostic information will be welcomed here - separate login required to post.
ID: 714734
Morris
Volunteer tester
Joined: 11 Sep 01
Posts: 57
Credit: 9,077,302
RAC: 29
Italy
Message 714738 - Posted: 18 Feb 2008, 10:33:28 UTC
Last modified: 18 Feb 2008, 10:37:50 UTC

I had the same problem yesterday evening while reporting ***15*** results. Not many, you might say, but that is quite a big number of WUs for me ...
What is strange on my side is that the whole bunch experienced the same problem.

[cut]
17-Feb-2008 23:19:42 [---] Resuming network activity
17-Feb-2008 23:19:43 [SETI@home] Started upload of 29dc06ae.24173.4571.7.6.2_2_0
17-Feb-2008 23:19:43 [SETI@home] Started upload of 29no06ah.15386.4571.10.6.137_0_0
17-Feb-2008 23:19:46 [SETI@home] [error] Error on file upload: no command
17-Feb-2008 23:19:46 [SETI@home] [error] Error on file upload: no command
17-Feb-2008 23:19:46 [SETI@home] Giving up on upload of 29dc06ae.24173.4571.7.6.2_2_0: fatal upload error
17-Feb-2008 23:19:46 [SETI@home] Giving up on upload of 29no06ah.15386.4571.10.6.137_0_0: fatal upload error
17-Feb-2008 23:19:46 [SETI@home] Started upload of 02mr07aa.12599.20113.3.6.45_0_0
17-Feb-2008 23:19:46 [SETI@home] Started upload of 02mr07aa.12599.22567.3.6.203_1_0
17-Feb-2008 23:19:49 [SETI@home] [error] Error on file upload: no command
17-Feb-2008 23:19:49 [SETI@home] [error] Error on file upload: no command
17-Feb-2008 23:19:49 [SETI@home] Giving up on upload of 02mr07aa.12599.20113.3.6.45_0_0: fatal upload error
17-Feb-2008 23:19:49 [SETI@home] Giving up on upload of 02mr07aa.12599.22567.3.6.203_1_0: fatal upload error
17-Feb-2008 23:19:49 [SETI@home] Started upload of 02mr07aa.12599.22567.3.6.71_0_0

[end cut]

Nothing unusual on my side: I simply re-enabled network activity, and the BOINC Manager did the whole lot. As a result, fifteen "Ready to report" WUs suddenly turned into 15 "VALIDATE ERROR" results...

I have the impression I wasted a lot of CPU time, didn't I?


M.

[added later]
I'm running Manager version 5.10.30 with the Chicken optimized client 2.14-SSE2-PM, in case that means something ...
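To see which result files were lost in a log like the one in this post, the "Giving up on upload" lines can be pulled out mechanically. Here is a small Python sketch, assuming the message wording stays exactly as quoted in this thread; the function name is hypothetical and the sample is a shortened copy of the excerpt above.

```python
import re

# Matches the client's "giving up" message and captures the result-file name.
GIVE_UP = re.compile(r"Giving up on upload of (\S+): fatal upload error")


def abandoned_uploads(log_text):
    """Return the result-file names the client gave up uploading, in log order."""
    return GIVE_UP.findall(log_text)


log_text = """\
17-Feb-2008 23:19:43 [SETI@home] Started upload of 29dc06ae.24173.4571.7.6.2_2_0
17-Feb-2008 23:19:46 [SETI@home] [error] Error on file upload: no command
17-Feb-2008 23:19:46 [SETI@home] Giving up on upload of 29dc06ae.24173.4571.7.6.2_2_0: fatal upload error
17-Feb-2008 23:19:49 [SETI@home] Started upload of 02mr07aa.12599.22567.3.6.71_0_0
"""
lost = abandoned_uploads(log_text)
print(lost)
```

Each name in the list is a result file for which, per the discussion in this thread, no data will reach the server, so the matching task can be expected to end as a validate error.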
ID: 714738
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 714740 - Posted: 18 Feb 2008, 10:45:22 UTC - in response to Message 714734.  
Last modified: 18 Feb 2008, 10:49:14 UTC


'Type 2' errors are much more serious:
a) Once a host computer gets itself into this state, every finished task seems to end with an "Error on file upload: no command"
b) Because the file upload is cancelled permanently and completely, there will never be any result data at Berkeley to validate, and the re-validation script can't help. (Brian, please note). You will never get any credit for these.


Thanks. Like I said, I wasn't sure how the script dealt with the situation. It is good to know that the result files are sent through if they are there.

That indeed would make the "type 2" more serious. Once a quorum is met, no more results would get generated and so the WU will transition and get assimilated within 24 hours or so, thus wiping out all trace of a problem, except for people who are keeping track...

I'm treating 'Type 2' validate errors as a bug in the BOINC system: I'm not sure yet whether it's a bug in the Server code or the Client code, or some subtle deadlock between the two that only rarely happens. I have received one assertion (by PM) that this should not happen with BOINC clients v5.10.30 and above: I'd be interested if anyone can substantiate a 'Type 2' validate error with the latest BOINC clients (version number and platform, please). Further detailed error reports and diagnostic information will be welcomed here - separate login required to post.


In all the time I have been obtaining work from here, the only Validate Errors I have received were of the "type 1" variety. If you suspect that the "type 2" have been around the whole time, then keep in mind that I am still using 5.8.16. It may be worth pursuing the question of whether 5.8.xx has the issue or not, considering it does have a healthy share of users still...as does 5.4.11...
ID: 714740
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 714748 - Posted: 18 Feb 2008, 11:23:52 UTC

Earliest references I can find:

Message 587697 - Joe Segur, 16 June 2007
He mentions earlier problems with BOINC v5.4.x

Message 615333 - Sutaru Tsureku, 5 August 2007
This exact problem. Several posts further on (617164), Jim-R points out that Sutaru was using BOINC v5.10.7 at the time.
ID: 714748
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 714785 - Posted: 18 Feb 2008, 13:09:52 UTC
Last modified: 18 Feb 2008, 13:21:56 UTC

I noticed and posted my first 'no command' error on 27 May 2007 at 21:53:50 UTC.
But maybe I had them before as well and didn't see them.

And then I got a lot of help here on this board. Thanks to all again!

But we didn't find or eliminate the cause.

Mauro said he uses BOINC V5.10.30 and he has the 'no command' errors too.

I use BOINC V5.10.28 with Crunch3rs V6.1.0

So for now, all we can do is wait and drink a tea or coffee, right?
So far nobody knows why we sometimes get the 'no command' errors.

But I'm happy that I'm no longer alone.
So maybe now there will be more interest on Berkeley's side in finding and eliminating this problem.
ID: 714785
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 714791 - Posted: 18 Feb 2008, 13:19:25 UTC - in response to Message 714785.  
Last modified: 18 Feb 2008, 13:23:49 UTC

I noticed and posted my first 'no command' error on 27 May 2007 at 21:53:50 UTC.
But maybe I had them before as well and didn't see them.

Can you remember which version of BOINC you were using at the time?

None of the v5.10.xx sequence were released until 29 May 2007, so unless you tried a v5.9.xx Beta Test copy, I guess it was most likely v5.8.16.

And then I got a lot of help here on this board. Thanks to all again!

But we didn't find or eliminate the cause.

Mauro said he uses BOINC V5.10.30 and he has the 'no command' errors too.

Can you post a link to Mauro's post, to save me searching? Thanks.

[Edit - found it at #714738 - this thread. Didn't look very hard, did I?]

I use BOINC V5.10.28 with Crunch3rs V6.1.0

So for now, all we can do is wait and drink a tea or coffee, right?
So far nobody knows why we sometimes get the 'no command' errors.

Yes :-)

But the more detailed bug reports we can gather here, the quicker the process should be.
ID: 714791
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 714795 - Posted: 18 Feb 2008, 13:31:49 UTC - in response to Message 714791.  
Last modified: 18 Feb 2008, 13:33:36 UTC




I don't really remember which version of BOINC I had at that time.
I never used beta versions, only final versions.
I maybe had V5.2.13 with TruXoft's V5.3.12.tx36 because of the 'CPU affinity'.
But then, after a long time, I saw that sometimes only 3 cores were running on my quad.
After that, I don't remember which version I used.
It must have been V5.8.16, like you said.


Yes, that was the post from Mauro that I read.


So how can we help now to find and eliminate the cause?
What do we need to post here?
ID: 714795
Morris
Volunteer tester
Joined: 11 Sep 01
Posts: 57
Credit: 9,077,302
RAC: 29
Italy
Message 714797 - Posted: 18 Feb 2008, 13:38:02 UTC - in response to Message 714791.  


Can you post a link to Mauro's post, to save me searching? Thanks.

[Edit - found it at #714738 - this thread. Didn't look very hard, did I?]


Richard, maybe I forgot to mention something important:
this is NOT a case of reporting too fast or nervously clicking "update", since last evening the server was not reachable from my location.

[LOG]
17-Feb-2008 23:20:03 [SETI@home] Started upload of 08ja07ad.32736.1708.7.7.251_0_0
17-Feb-2008 23:20:04 [SETI@home] [error] Error on file upload: no command
17-Feb-2008 23:20:04 [SETI@home] Giving up on upload of 22fe07aj.11174.21340.15.7.12_0_0: fatal upload error
17-Feb-2008 23:20:05 [SETI@home] [error] Error on file upload: no command
17-Feb-2008 23:20:05 [SETI@home] Giving up on upload of 08ja07ad.32736.1708.7.7.251_0_0: fatal upload error
17-Feb-2008 23:20:17 [SETI@home] Sending scheduler request: Requested by user. Requesting 642839 seconds of work, reporting 15 completed tasks
17-Feb-2008 23:20:26 [SETI@home] Scheduler request succeeded: got 0 new tasks
17-Feb-2008 23:20:26 [SETI@home] Message from server: Incomplete request received.
[/LOG]

So I went to bed, and this morning I reported the WUs.

[LOG]

18-Feb-2008 08:36:58 [---] Resuming network activity
18-Feb-2008 08:36:58 [SETI@home] Started upload of 22no06ae.22162.2117.13.7.235_0_0
18-Feb-2008 08:36:58 [SETI@home] Started upload of 22no06ae.27122.3753.12.7.16_1_0
18-Feb-2008 08:37:03 [---] Project communication failed: attempting access to reference site
18-Feb-2008 08:37:03 [SETI@home] Scheduler request failed: Couldn't connect to server
18-Feb-2008 08:37:07 [---] Access to reference site succeeded - project servers may be temporarily down.
18-Feb-2008 08:37:07 [SETI@home] Finished upload of 22no06ae.22162.2117.13.7.235_0_0
18-Feb-2008 08:37:09 [SETI@home] Finished upload of 22no06ae.27122.3753.12.7.16_1_0
18-Feb-2008 08:37:23 [SETI@home] Sending scheduler request: Requested by user. Requesting 665658 seconds of work, reporting 17 completed tasks
18-Feb-2008 08:37:39 [SETI@home] Scheduler request succeeded: got 20 new tasks
18-Feb-2008 08:37:42 [SETI@home] Started download of 30dc06ah.17223.890.15.7.135
[/LOG]

But, IMHO, the damage was already done. I mention this because this morning I reported the 15 "bad" WUs plus two more computed overnight (so no BOINC restart, nothing different on this side). As a result, the 15 WUs were invalidated and the two additional ones were granted credit as normal... sounds weird, huh?

M.
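The arithmetic in this post can be checked directly against the two scheduler-request lines in the logs above. Here is a short Python sketch, assuming the "reporting N completed tasks" wording is stable; the helper name is hypothetical.

```python
import re

# Captures N from "... reporting N completed tasks".
REPORT = re.compile(r"reporting (\d+) completed tasks")


def reported_counts(lines):
    """Return the number of tasks reported by each scheduler request, in order."""
    counts = []
    for line in lines:
        match = REPORT.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


requests = [
    "17-Feb-2008 23:20:17 [SETI@home] Sending scheduler request: Requested by user. "
    "Requesting 642839 seconds of work, reporting 15 completed tasks",
    "18-Feb-2008 08:37:23 [SETI@home] Sending scheduler request: Requested by user. "
    "Requesting 665658 seconds of work, reporting 17 completed tasks",
]
evening, morning = reported_counts(requests)
print(morning - evening)  # 2
```

The difference of 2 matches the two results computed overnight, which were the only ones granted credit. The evening request was answered with "Incomplete request received", so the same 15 tasks were evidently still waiting to be reported in the morning.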



ID: 714797
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 714801 - Posted: 18 Feb 2008, 13:55:10 UTC - in response to Message 714797.  

But, IMHO, the damage was already done. I mention this because this morning I reported the 15 "bad" WUs plus two more computed overnight (so no BOINC restart, nothing different on this side). As a result, the 15 WUs were invalidated and the two additional ones were granted credit as normal... sounds weird, huh?

M.

Thanks, that's useful additional information.

I wonder whether it was something at the servers which changed (there was some sort of update overnight - yesterday it wasn't possible to log onto these message boards if you weren't logged in already, today it's OK again), or whether 'suspending network activity', then 'resuming network activity' is all it takes.
ID: 714801
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 714804 - Posted: 18 Feb 2008, 14:00:15 UTC - in response to Message 714795.  

So how can we help now to find and eliminate the cause?
What do we need to post here?

When I get time, I'm going to go back over the logs on the server where I had the problem - see if I can spot anything unusual.

But apart from that, I think you've already posted the most useful comment:
ID: 714804
Bert
Joined: 12 Oct 06
Posts: 84
Credit: 813,295
RAC: 0
United States
Message 714808 - Posted: 18 Feb 2008, 14:10:06 UTC - in response to Message 714804.  


There is something else that started to happen recently. Previously, when you read a thread it was marked as read and the yellow would disappear. Now threads that have been read are still marked as unread.

And, something to add to a wish list, I wish there was a way to mark a thread as read without having to actually go into the thread.


ID: 714808
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 714810 - Posted: 18 Feb 2008, 14:20:03 UTC - in response to Message 714808.  

There is something else that started to happen recently. Previously, when you read a thread it was marked as read and the yellow would disappear. Now threads that have been read are still marked as unread.

Agreed - irritating, isn't it?

Actually, I find it only happens if you use the browser 'back' button to return to the thread index: then you also (new) have to do an F5 refresh to get the flags in sync (IE7). But if you click on the Number crunching link above, you can do it in one.
And, something to add to a wish list, I wish there was a way to mark a thread as read without having to actually go into the thread.

You can mark all threads as read, but it would be nicer to be able just to mark one of them.
ID: 714810
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 714935 - Posted: 18 Feb 2008, 19:29:59 UTC - in response to Message 714748.  

Earliest references I can find:

Message 587697 - Joe Segur, 16 June 2007
He mentions earlier problems with BOINC v5.4.x


Given the following quote from Joe about your diagnosis back then, it appears to me that the situation you brought up would apply to multi-core machines and not single-core ones. Is that true? OTOH, users do not currently need to abort the transfers, if I understand things correctly. It seems that BOINC gives up on its own...

Parsing for signals is later, and divided by signal type. Because it is common for there to be no signals for a particular type, having none at all would cause no difficulty. In any case, that 5.8.0 case of no signals in the result file was simply an instance of the size mismatch error which caused upload to fail, backoff, rinse and repeat forever until the user aborted the transfer. Richard Haselgrove diagnosed the problem as BOINC restarting a completed WU if another completed about simultaneously.
ID: 714935


 