Validate Errors II

Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 704547 - Posted: 26 Jan 2008, 19:00:37 UTC - in response to Message 704442.  

Here's my wu with a validate error:

http://setiathome.berkeley.edu/workunit.php?wuid=209430498

Did not see the status of the work unit update to "Ready to Report" from "Uploading". It simply disappeared from the display.

The workunit that was acknowledged approximately one minute before:

http://setiathome.berkeley.edu/workunit.php?wuid=209430637

I'm using Crunch3r's enhanced BOINC. Snippet of log file:


1/25/2008 11:52:48 PM|SETI@home|[file_xfer] Finished upload of file 28no06ah.19864.24612.15.6.223_1_0
1/25/2008 11:52:48 PM|SETI@home|[file_xfer] Throughput 10609 bytes/sec
1/25/2008 11:53:53 PM||Time passed...reporting result(s) now.
1/25/2008 11:53:53 PM|SETI@home|Sending scheduler request: To report completed tasks
1/25/2008 11:53:53 PM|SETI@home|Reporting 1 tasks
1/25/2008 11:53:58 PM|SETI@home|Scheduler RPC succeeded [server version 601]
1/25/2008 11:53:58 PM|SETI@home|Deferring communication for 11 sec
1/25/2008 11:53:58 PM|SETI@home|Reason: requested by project
[color=blue]1/25/2008 11:54:14 PM|SETI@home|Computation for task 28no06ah.19864.24612.15.6.131_0 finished[/color]
1/25/2008 11:54:14 PM|SETI@home|Starting 28no06ah.30219.5798.16.6.159_0
1/25/2008 11:54:14 PM|SETI@home|Starting task 28no06ah.30219.5798.16.6.159_0 using setiathome_enhanced version 528
[color=red]1/25/2008 11:54:14 PM|SETI@home|Sending scheduler request: To fetch work[/color]
1/25/2008 11:54:14 PM|SETI@home|Requesting 14764 seconds of new work
[color=green]1/25/2008 11:54:16 PM|SETI@home|[file_xfer] Started upload of file 28no06ah.19864.24612.15.6.131_0_0
1/25/2008 11:54:18 PM|SETI@home|[file_xfer] Finished upload of file 28no06ah.19864.24612.15.6.131_0_0[/color]
1/25/2008 11:54:18 PM|SETI@home|[file_xfer] Throughput 29714 bytes/sec
1/25/2008 11:54:19 PM|SETI@home|Scheduler RPC succeeded [server version 601]
1/25/2008 11:54:19 PM|SETI@home|Deferring communication for 11 sec
1/25/2008 11:54:19 PM|SETI@home|Reason: requested by project
1/25/2008 11:54:21 PM|SETI@home|[file_xfer] Started download of file 02mr07aa.14314.10297.9.6.57
1/25/2008 11:54:21 PM|SETI@home|[file_xfer] Started download of file 30dc06ac.19037.1710.3.6.19
1/25/2008 11:54:24 PM|SETI@home|[file_xfer] Finished download of file 02mr07aa.14314.10297.9.6.57
1/25/2008 11:54:24 PM|SETI@home|[file_xfer] Throughput 187647 bytes/sec
1/25/2008 11:54:24 PM|SETI@home|[file_xfer] Finished download of file 30dc06ac.19037.1710.3.6.19
1/25/2008 11:54:24 PM|SETI@home|[file_xfer] Throughput 189115 bytes/sec
1/25/2008 11:54:24 PM|SETI@home|[file_xfer] Started download of file 02mr07aa.14314.10297.9.6.171
1/25/2008 11:54:24 PM|SETI@home|[file_xfer] Started download of file 02mr07aa.14314.10297.9.6.177
1/25/2008 11:54:27 PM|SETI@home|[file_xfer] Finished download of file 02mr07aa.14314.10297.9.6.171
1/25/2008 11:54:27 PM|SETI@home|[file_xfer] Throughput 177918 bytes/sec
1/25/2008 11:54:27 PM|SETI@home|[file_xfer] Finished download of file 02mr07aa.14314.10297.9.6.177
1/25/2008 11:54:27 PM|SETI@home|[file_xfer] Throughput 179246 bytes/sec
1/25/2008 11:54:35 PM|SETI@home|Sending scheduler request: To fetch work
[color=purple]1/25/2008 11:54:35 PM|SETI@home|Requesting 2317 seconds of new work, and reporting 1 completed tasks[/color]
1/25/2008 11:54:40 PM|SETI@home|Scheduler RPC succeeded [server version 601]
1/25/2008 11:54:40 PM|SETI@home|Deferring communication for 11 sec
1/25/2008 11:54:40 PM|SETI@home|Reason: requested by project
1/25/2008 11:54:42 PM|SETI@home|[file_xfer] Started download of file 29dc06ae.2951.7843.15.6.19
1/25/2008 11:54:45 PM|SETI@home|[file_xfer] Finished download of file 29dc06ae.2951.7843.15.6.19
1/25/2008 11:54:45 PM|SETI@home|[file_xfer] Throughput 184757 bytes/sec

There is a possibility that the request for work (red) triggered a report for the completed task (blue) even though the task had not yet been uploaded (green).

I believe the cause was the purple report less than 20 seconds after the green upload completed. AFAIK, BOINC is very reliable at showing when it sends reports.
Joe
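
For anyone wanting to check their own message log for this pattern, here is a minimal Python sketch. It assumes the log lines look like the snippet above (timestamp|project|message, possibly with the forum's [color] tags mixed in) and that a report shows up as "Reporting N tasks" or "reporting N completed tasks". The function name and the regular expressions are illustrative only and are not part of BOINC.

```python
import re
from datetime import datetime

TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"          # e.g. "1/25/2008 11:54:18 PM"
COLOR_TAG = re.compile(r"\[/?color(?:=\w+)?\]")  # forum markup, not part of the log
UPLOAD_DONE = re.compile(r"Finished upload of file (\S+)")
REPORT = re.compile(r"[Rr]eporting \d+ (?:completed )?tasks")


def seconds_from_upload_to_report(lines):
    """Return [(file name, seconds)] from each finished upload to the next report.

    Simplification (an assumption of this sketch): every upload that has
    finished since the previous report is paired with the next report line,
    whether or not that report really covered that file.
    """
    pending = []  # (file name, time the upload finished)
    gaps = []
    for raw in lines:
        line = COLOR_TAG.sub("", raw).strip()
        if line.count("|") < 2:
            continue  # not a timestamp|project|message line
        stamp, _project, message = line.split("|", 2)
        try:
            when = datetime.strptime(stamp.strip(), TIME_FORMAT)
        except ValueError:
            continue  # timestamp in some other format; skip the line
        upload = UPLOAD_DONE.search(message)
        if upload:
            pending.append((upload.group(1), when))
        elif REPORT.search(message):
            gaps.extend((name, (when - done).total_seconds())
                        for name, done in pending)
            pending.clear()
    return gaps
```

Fed the green and purple lines from the snippet, this pairs file 28no06ah.19864.24612.15.6.131_0_0 with a gap of 17 seconds between the end of the upload and the report. Whether any particular gap is "too short" is a judgment call; the sketch only measures it.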
ID: 704547
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 707677 - Posted: 3 Feb 2008, 13:11:14 UTC



I have a 'validate error' from 1 Feb 2008 14:51:10 UTC, and it has still not been fixed.
So could someone please say at what time the script runs each day?

wuid=209397711



ID: 707677
Brian Silvers

Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 708020 - Posted: 4 Feb 2008, 14:41:06 UTC - in response to Message 704547.  

I believe the cause was the purple report less than 20 seconds after the green upload completed. AFAIK, BOINC is very reliable at showing when it sends reports.


It is interesting to note that other projects do not seem to have this vexing problem... I only see it talked about here on a routine basis... Of course, in fairness, I haven't checked out ALL the projects' fora...
ID: 708020
vtphipps

Joined: 18 May 99
Posts: 2
Credit: 128,767
RAC: 0
United States
Message 708826 - Posted: 6 Feb 2008, 16:49:47 UTC

I have found these errors here as well :)
But I only get them on one of my 4 computers.
I have 2 Linux machines on my LAN that keep on doing their thing without errors,
and a Win 98SE system also keeps on going without errors.
It is my WinXP machine, running at 1.8 GHz. I have tried all I could think of and
just detached all projects from it, as rosetta@home was also
getting the errors :(

It is running the factory memory, 256 MB; I have two 1 GB cards coming.
I'll give it another try after those are installed.
Not sure what version of BOINC it was, as I have uninstalled it now.

Doug
ID: 708826
Profile bj

Joined: 11 Oct 00
Posts: 163
Credit: 50,429,507
RAC: 0
United States
Message 708829 - Posted: 6 Feb 2008, 17:02:25 UTC

This doesn't have anything to do with validation errors, but I want to pass along that I had a power supply problem and then some software/BIOS problems with the same computer. There were only six work units there that had not uploaded due to the maintenance period, so there will be some "client detached" messages. But after checking the computer on the web site, I don't know why more than those six work units show "client detached". Perhaps it's because I reinstalled the BOINC software, and the other units that had already been uploaded are now also being shown as "client detached".
If anyone knows the reason, please post it. I've done this many times in the past and this is the first time this has come up. Sorry to all who have this on their work units.

bj
ID: 708829
vtphipps

Joined: 18 May 99
Posts: 2
Credit: 128,767
RAC: 0
United States
Message 709357 - Posted: 7 Feb 2008, 19:46:44 UTC - in response to Message 708826.  

I have found these errors here as well :)
But I only get them on one of my 4 computers.
I have 2 Linux machines on my LAN that keep on doing their thing without errors,
and a Win 98SE system also keeps on going without errors.
It is my WinXP machine, running at 1.8 GHz. I have tried all I could think of and
just detached all projects from it, as rosetta@home was also
getting the errors :(

It is running the factory memory, 256 MB; I have two 1 GB cards coming.
I'll give it another try after those are installed.
Not sure what version of BOINC it was, as I have uninstalled it now.

Doug


Well, for my debugging I have given that computer a (work) preference setting;
that way I can change the prefs for that computer only.
I did rename the program files\boinc directory, then reinstalled BOINC.
Still waiting for an error pkt.
ID: 709357
kevint
Volunteer tester

Joined: 17 May 99
Posts: 414
Credit: 11,680,240
RAC: 0
United States
Message 709369 - Posted: 7 Feb 2008, 20:34:38 UTC - in response to Message 708020.  

I believe the cause was the purple report less than 20 seconds after the green upload completed. AFAIK, BOINC is very reliable at showing when it sends reports.


It is interesting to note that other projects do not seem to have this vexing problem... I only see it talked about here on a routine basis... Of course, in fairness, I haven't checked out ALL the projects' fora...



Brian, good to see you again :)

No, the validate errors seem to be a SETI only phenomenon. Something that has been cooked up special for their dedicated contributors. The guys that run the project have a difficult time with this one, and have not been able to completely fix the issues. From my understanding of the problem, it is caused by the huge amount of traffic on the servers. Best way to fix this in my opinion is for 1/2 of the SETI crunchers to leave and crunch for other projects. :)


Crunch all you want, they will make more (validation errors).

ID: 709369
Profile criton

Joined: 28 Feb 00
Posts: 131
Credit: 13,351,000
RAC: 2
United Kingdom
Message 712384 - Posted: 14 Feb 2008, 6:19:07 UTC

more here

http://setiathome.berkeley.edu/result.php?resultid=706291640
http://setiathome.berkeley.edu/result.php?resultid=706333989
http://setiathome.berkeley.edu/result.php?resultid=706377222
http://setiathome.berkeley.edu/result.php?resultid=707837629
http://setiathome.berkeley.edu/result.php?resultid=707970077


thank you les ( greenfinger )
ID: 712384
Profile TerryG

Joined: 11 Mar 01
Posts: 16
Credit: 15,351,703
RAC: 37
United Kingdom
Message 712576 - Posted: 14 Feb 2008, 19:58:36 UTC

I've got two WUs that failed with a validate error:

http://setiathome.berkeley.edu/workunit.php?wuid=220451705
http://setiathome.berkeley.edu/workunit.php?wuid=220108370

For me, that's nearly 12 hours wasted unless there is retrospective crediting.
ID: 712576
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 712583 - Posted: 14 Feb 2008, 20:24:54 UTC - in response to Message 712576.  

I've got two WUs that failed with a validate error:

http://setiathome.berkeley.edu/workunit.php?wuid=220451705
http://setiathome.berkeley.edu/workunit.php?wuid=220108370

For me, that's nearly 12 hours wasted unless there is retrospective crediting.

I haven't seen anything to suggest that the corrective script is not still being run on a daily basis. If you don't get the credits, then let us know as it will be of interest to many.

F.
ID: 712583
Profile champ
Volunteer tester

Joined: 12 Mar 03
Posts: 3642
Credit: 1,489,147
RAC: 0
Germany
Message 712891 - Posted: 15 Feb 2008, 11:12:34 UTC

Here is my validate error: http://setiathome.berkeley.edu/result.php?resultid=749593643

ID: 712891
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 712894 - Posted: 15 Feb 2008, 11:38:18 UTC - in response to Message 712891.  


Here is my validate error: http://setiathome.berkeley.edu/result.php?resultid=749593643

That's an odd one, as your partner MrBlue has a -9 overflow; in fact, it looks like that's all he gets, for all his results, on his computer.

I'd say you were the one that's correct and his should be the erroneous result.
ID: 712894
Profile champ
Volunteer tester

Joined: 12 Mar 03
Posts: 3642
Credit: 1,489,147
RAC: 0
Germany
Message 712896 - Posted: 15 Feb 2008, 11:54:02 UTC - in response to Message 712894.  


Here is my validate error: http://setiathome.berkeley.edu/result.php?resultid=749593643

That's an odd one, as your partner MrBlue has a -9 overflow; in fact, it looks like that's all he gets, for all his results, on his computer.

I'd say you were the one that's correct and his should be the erroneous result.



Hope you are right. Thanks Winterknight.
ID: 712896
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 713038 - Posted: 15 Feb 2008, 18:17:01 UTC - in response to Message 712894.  


Here is my validate error: http://setiathome.berkeley.edu/result.php?resultid=749593643

That's an odd one, as your partner MrBlue has a -9 overflow; in fact, it looks like that's all he gets, for all his results, on his computer.

I'd say you were the one that's correct and his should be the erroneous result.

Unfortunately the "Validate error" means the result file couldn't be found, though I agree the values reported to the Scheduler look good. I think it will take two more results to reach a consensus. With luck, Eric's script to fix validate errors will be able to grant credit for 749593643 after that.
Joe
ID: 713038
Profile criton

Joined: 28 Feb 00
Posts: 131
Credit: 13,351,000
RAC: 2
United Kingdom
Message 713117 - Posted: 15 Feb 2008, 19:59:01 UTC

SOME MORE HERE
http://setiathome.berkeley.edu/result.php?resultid=742566844
http://setiathome.berkeley.edu/result.php?resultid=740560307
http://setiathome.berkeley.edu/result.php?resultid=740284266
http://setiathome.berkeley.edu/result.php?resultid=740284254
http://setiathome.berkeley.edu/result.php?resultid=740284144
http://setiathome.berkeley.edu/result.php?resultid=740284138

thank you les ( greenfinger )
ID: 713117
Profile TerryG

Joined: 11 Mar 01
Posts: 16
Credit: 15,351,703
RAC: 37
United Kingdom
Message 713187 - Posted: 15 Feb 2008, 22:20:40 UTC - in response to Message 712576.  

I've got two WUs that failed with a validate error:

http://setiathome.berkeley.edu/workunit.php?wuid=220451705
http://setiathome.berkeley.edu/workunit.php?wuid=220108370

For me, that's nearly 12 hours wasted unless there is retrospective crediting.


I've had six of these in total now. None of them (including those in my original post) have as yet been granted any credit.
ID: 713187
whawn

Joined: 11 Apr 00
Posts: 18
Credit: 1,053,191
RAC: 2
United States
Message 713194 - Posted: 15 Feb 2008, 22:55:37 UTC

And I have had six 'validate error' results in the past 24 hours. Two of them were later resolved and credit is pending, the other four remain in limbo.

Does it do any good to post WU ID numbers here?
ID: 713194
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 713454 - Posted: 16 Feb 2008, 10:05:40 UTC

I think regular readers know my take on this thread:

  • 'Validate error' (and we're not talking about anything else here, like computation errors) tends only to happen if you try to "report" a result too soon after "uploading" it.
  • 'Reporting too soon' can happen sporadically to anyone, but is endemic to BOINC v5.10.(<=13) with a connect interval of zero, or with certain third-party 'optimised' BOINC clients.
  • The staff run a manual script once every 24 hours or so, to correct as many of these errors as they can.
  • The script does not parse this thread looking for result ID numbers! It works directly on the underlying database at Berkeley.


Having said all that, why did one of my hosts (host 3150564) - only - get 14 validate errors overnight? It's running the same version of BOINC, on the same venue preference settings, as other hosts, but it's the only one with the errors.

The machine is a remote server, so I'll have to fire up the VPN and take a look. I'll let you know if I find anything interesting.

ID: 713454
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 713523 - Posted: 16 Feb 2008, 11:34:12 UTC

Well, I found the reason:

Lots of

[error] Error on file upload: no command

in the message log. (first occurrence 2008-02-14 23:58:24 - UTC)

Since each one is followed by

[file_xfer] Permanently failed upload of xxxxx
Giving up on upload of xxxxx: server rejected file

of course validation will fail - there's nothing to validate.

But the question remains: why did this only happen to one computer? BOINC was still running OK, and uploading results to other projects. Still, I've restarted it just to be on the safe side. Wait and see.
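
The pattern described above (an upload error followed by a permanent failure) can be picked out of a message log with something like the following minimal Python sketch. It assumes the message texts match the ones quoted in this post. The function name is made up for illustration, and real log lines will carry a timestamp and project prefix, which the search simply ignores.

```python
import re

UPLOAD_ERROR = re.compile(r"\[error\] Error on file upload: (.+)")
PERMANENT_FAILURE = re.compile(r"Permanently failed upload of (\S+)")


def failed_uploads(lines):
    """Return [(file name, preceding upload error or None)] for permanent failures."""
    last_error = None  # most recent upload error text not yet tied to a file
    failures = []
    for line in lines:
        error = UPLOAD_ERROR.search(line)
        if error:
            last_error = error.group(1).strip()
            continue
        failed = PERMANENT_FAILURE.search(line)
        if failed:
            failures.append((failed.group(1), last_error))
            last_error = None  # do not reuse the error for a later file
    return failures
```

Any result whose upload file appears in the returned list never reached the server, so a validate error for it is to be expected rather than a sign of reporting too soon.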
ID: 713523
Miklos M.

Joined: 5 May 99
Posts: 955
Credit: 136,115,648
RAC: 73
Hungary
Message 713803 - Posted: 16 Feb 2008, 18:54:57 UTC - in response to Message 713454.  

I think regular readers know my take on this thread:

  • 'Validate error' (and we're not talking about anything else here, like computation errors) tends only to happen if you try to "report" a result too soon after "uploading" it.
  • 'Reporting too soon' can happen sporadically to anyone, but is endemic to BOINC v5.10.(<=13) with a connect interval of zero, or with certain third-party 'optimised' BOINC clients.
  • The staff run a manual script once every 24 hours or so, to correct as many of these errors as they can.
  • The script does not parse this thread looking for result ID numbers! It works directly on the underlying database at Berkeley.


Having said all that, why did one of my hosts (host 3150564) - only - get 14 validate errors overnight? It's running the same version of BOINC, on the same venue preference settings, as other hosts, but it's the only one with the errors.

The machine is a remote server, so I'll have to fire up the VPN and take a look. I'll let you know if I find anything interesting.



I just had 5 days' worth of work wasted on validate errors. It does not feel good, that is for sure. I think I reported all of them on time.

ID: 713803