The Validation Shuffle.....
Author | Message |
---|---|
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Just a thought I had while I was out mowing the lawn...... Back some time ago when there was another major rash of validate errors, didn't Eric have a program or script he wrote and ran to automatically fix/repair/relink or whatever caused the error and issue the credit after the fact? Or is my memory incorrect? "Freedom is just Chaos, with better lighting." Alan Dean Foster |
dnolan Send message Joined: 30 Aug 01 Posts: 1228 Credit: 47,779,411 RAC: 32 |
Just to throw another voice into the fray, have a few on my boxes, too. 7 on one, a couple on another and I think 1 on a third... Hope this all gets fixed soon! -Dave |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Just to throw another voice into the fray, have a few on my boxes, too. 7 on one, a couple on another and I think 1 on a third... And it looks like at least on your top quad, you are running the stock 5.10.13 Boinc client, so I don't think we can toss this problem at the optimized clients. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
geoff Send message Joined: 25 Apr 00 Posts: 123 Credit: 34,100,351 RAC: 18 |
Just a thought I had while I was out mowing the lawn...... Mark, your memory is correct: Eric did have a script and corrected the Validation errors that happened some time ago. Geoff |
dnolan Send message Joined: 30 Aug 01 Posts: 1228 Credit: 47,779,411 RAC: 32 |
Just to throw another voice into the fray, have a few on my boxes, too. 7 on one, a couple on another and I think 1 on a third... Yup, on all my boxes, and my connect setting is 0.1, so pretty large compared to others... -Dave |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Is everyone's on the same day or range of days? Is there a time it happens more often? You noted (much) earlier in the thread that my "Data-Vac" was not picking these up. I noted that last week when it tried to divide "---" by 3600 to plot a graph! I probably missed it because there were no Validate errors on any of the hosts I was looking at during the early development. How things change!! I am still monitoring a number of hosts to get a baseline for when who? releases his super-App and I have sorted them by Date / Time reported (Data as of this morning's run - about 08:30 UTC; I expect there will be more during the day).
Columns: Result ID, Workunit ID, Sent, Reported, Server state, Outcome
648006925 173132387 30 Oct 2007 12:17:45 UTC 31 Oct 2007 10:39:04 UTC Over Validate error
646999739 173748285 29 Oct 2007 6:46:14 UTC 31 Oct 2007 22:59:16 UTC Over Validate error
646999881 173748289 29 Oct 2007 6:46:14 UTC 31 Oct 2007 23:19:19 UTC Over Validate error
646571624 173546794 29 Oct 2007 0:47:32 UTC 1 Nov 2007 0:47:02 UTC Over Validate error
646999777 173748251 29 Oct 2007 6:46:14 UTC 1 Nov 2007 1:29:40 UTC Over Validate error
646720802 173615568 29 Oct 2007 4:11:48 UTC 1 Nov 2007 1:59:40 UTC Over Validate error
647440483 173957497 29 Oct 2007 18:25:32 UTC 1 Nov 2007 11:06:28 UTC Over Validate error
647001957 173749285 29 Oct 2007 6:50:22 UTC 1 Nov 2007 12:40:42 UTC Over Validate error
647179454 173833326 29 Oct 2007 19:42:32 UTC 1 Nov 2007 13:22:47 UTC Over Validate error
647179636 173833421 29 Oct 2007 19:43:29 UTC 1 Nov 2007 14:14:06 UTC Over Validate error
647219520 173852354 29 Oct 2007 21:07:57 UTC 1 Nov 2007 15:09:54 UTC Over Validate error
647230256 173857483 29 Oct 2007 21:26:13 UTC 1 Nov 2007 15:42:54 UTC Over Validate error
647002031 173749368 29 Oct 2007 6:50:22 UTC 1 Nov 2007 16:10:31 UTC Over Validate error
647268846 173875941 29 Oct 2007 22:39:47 UTC 1 Nov 2007 16:54:42 UTC Over Validate error
648937048 174657831 1 Nov 2007 6:35:11 UTC 1 Nov 2007 22:32:22 UTC Over Validate error
647441862 173957997 30 Oct 2007 5:30:18 UTC 1 Nov 2007 23:01:28 UTC Over Validate error
646726118 173535908 29 Oct 2007 4:22:23 UTC 1 Nov 2007 3:29:21 UTC Over Validate error
646725548 173617715 29 Oct 2007 4:22:23 UTC 1 Nov 2007 3:30:53 UTC Over Validate error
646726016 173617941 29 Oct 2007 4:22:24 UTC 1 Nov 2007 4:53:44 UTC Over Validate error
646725862 173617850 29 Oct 2007 4:22:24 UTC 1 Nov 2007 5:02:06 UTC Over Validate error
647000451 173748561 29 Oct 2007 6:47:36 UTC 1 Nov 2007 6:48:08 UTC Over Validate error
647001265 173748982 29 Oct 2007 6:49:00 UTC 1 Nov 2007 6:59:35 UTC Over Validate error
647001427 173749068 29 Oct 2007 6:49:00 UTC 1 Nov 2007 8:01:04 UTC Over Validate error
647375485 173926675 29 Oct 2007 16:38:21 UTC 1 Nov 2007 9:17:40 UTC Over Validate error
639000112 169989576 19 Oct 2007 10:16:10 UTC 2 Nov 2007 1:32:27 UTC Over Validate error
647224579 173854680 29 Oct 2007 12:37:56 UTC 2 Nov 2007 10:29:11 UTC Over Validate error
647223117 173854051 29 Oct 2007 12:37:56 UTC 2 Nov 2007 10:29:11 UTC Over Validate error
647224599 173854740 29 Oct 2007 12:37:56 UTC 2 Nov 2007 10:29:11 UTC Over Validate error
647519618 173994974 30 Oct 2007 8:28:16 UTC 2 Nov 2007 13:09:32 UTC Over Validate error
647520054 173995205 30 Oct 2007 8:29:09 UTC 2 Nov 2007 14:23:41 UTC Over Validate error
639248014 169562963 19 Oct 2007 17:01:55 UTC 2 Nov 2007 14:24:50 UTC Over Validate error
647543470 174006527 30 Oct 2007 9:30:00 UTC 2 Nov 2007 16:27:33 UTC Over Validate error
639026676 170002074 19 Oct 2007 11:04:38 UTC 2 Nov 2007 18:17:14 UTC Over Validate error
639115280 170039804 19 Oct 2007 13:31:12 UTC 2 Nov 2007 19:32:24 UTC Over Validate error
639463694 170200718 19 Oct 2007 23:03:07 UTC 2 Nov 2007 19:57:10 UTC Over Validate error
639000260 169989609 19 Oct 2007 10:16:10 UTC 2 Nov 2007 2:40:59 UTC Over Validate error
639223826 170090236 19 Oct 2007 16:18:34 UTC 2 Nov 2007 20:52:59 UTC Over Validate error
639000652 169989795 19 Oct 2007 10:16:52 UTC 2 Nov 2007 20:57:07 UTC Over Validate error
648345332 173912853 31 Oct 2007 11:46:10 UTC 2 Nov 2007 23:11:04 UTC Over Validate error
648345036 174379767 31 Oct 2007 11:46:10 UTC 2 Nov 2007 23:11:04 UTC Over Validate error
647356421 173917632 29 Oct 2007 16:05:58 UTC 2 Nov 2007 3:56:13 UTC Over Validate error
647366355 173922346 29 Oct 2007 16:22:08 UTC 2 Nov 2007 6:11:39 UTC Over Validate error
639547896 170240676 20 Oct 2007 1:58:00 UTC 3 Nov 2007 2:40:12 UTC Over Validate error
639663813 169897269 20 Oct 2007 1:40:40 UTC 3 Nov 2007 2:53:24 UTC Over Validate error
647359746 173919183 30 Oct 2007 2:06:45 UTC 3 Nov 2007 7:33:15 UTC Over Validate error
So there doesn't seem to be any correlation there. Still haven't experienced one of these on my one (lonely) box. F. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I sent an email to Eric asking him to check this thread. I don't know if he will get it this weekend or if he will respond here. But I tried to make sure he is aware of the situation. Hopefully he can run his repair script and get some credit reinstated. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
I sent an email to Eric asking him to check this thread. I don't know if he will get it this weekend or if he will respond here. But I tried to make sure he is aware of the situation. Hopefully he can run his repair script and get some credit reinstated. And, more importantly, find whatever it is that is causing the problem and fix it at source: it's only worth clearing up the after-effects once you've stopped any new ones appearing. |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Just a thought I had while I was out mowing the lawn...... In the first Validate Errors thread, Eric did at one point try to grant credit. IIRC it only worked for results which were still visible, so the fix worked fully on SETI Beta but mostly only for very recent results here. For those experimenting with Network activity disabled as a workaround, I suggest setting No New Tasks before enabling network activity, waiting until all completed work has uploaded plus whatever interval you think advisable, then allowing new tasks. Under most circumstances that will delay the reporting, so if the present Validate errors are due to quick reporting it should help. You should start the sequence when there are several minutes to the next WU completion, if possible. But don't forget that there were several cases of bulk Validate errors well before BOINC 5.10.13, caused by other server problems such as dropped NFS mounts. I'm in agreement that BOINC shouldn't cause this kind of problem, and it probably only does because the developers basically assumed that servers were quite reliable. If the first attempt to validate a result fails to find the file, the system has no mechanism to try again later. Adding something like that as an option probably wouldn't be a huge amount of work, perhaps only needing changes in the Validator and Transitioner code, and if the option included a selectable amount of "later" it might be useful for many projects. Joe |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
As a slight aside, this is also going to impact on who? reaching his stable RAC. He has had quite a rash of them. F. |
Brian Silvers Send message Joined: 11 Jun 99 Posts: 1681 Credit: 492,052 RAC: 0 |
It should be noted that I was not trying to imply that "BOINC" caused the problem. The server infrastructure (NFS and whatnot) would be what I would suspect, barring any validator code changes... As several have mentioned, and as I alluded to last night as well, any of the upload / other NFS transactions could hang for a while and / or go "poof" (yes, highly technical term, I know)... Holding back on the reporting was a work-around the last time. It may or may not help this time, but it was a helpful suggestion in any case... |
kevint Send message Joined: 17 May 99 Posts: 414 Credit: 11,680,240 RAC: 0 |
And to think - after not crunching SETI for several months, and hoping that maybe Berkeley had figured out the problems - I came back again, spent several hours re-optimizing, loaded up my caches with 8 days of work, and started to get validation errors on returning work. What a complete waste of time and energy!!!!! WASTED $$$$ on POWER !!! - $$$$ I spend on power should not end up with validation errors because of this silly problem. I guess I need to go back to doing what I was doing, crunching for a project that does not have so many problems. Aborting, resetting the project, and leaving again. I am not going to crunch for a project that continues to have these sorts of problems --- continually - |
OzzFan Send message Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
... and as luck would have it, there's always someone who's going to come back during our bad times and complain about it. Bye! |
kevint Send message Joined: 17 May 99 Posts: 414 Credit: 11,680,240 RAC: 0 |
Hey dummy - check my stats - I think I have a right to complain. And for someone who is a "moderator" you sure know how to start an unneeded flame war! |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Well I am a moderator too. Sorry you chose to try coming back to the project when they are having a problem. Things have actually been more reliable around here lately. Those of us who are steady Seti supporters have been discussing the problem in this thread and trying to find a common cause so it can be addressed by the admins. If you feel that you do not want to stick around while the problem is being sorted out, I can understand that. Please come back in a week or two, or check back in this thread to see how it's going. But please do so in a polite rational manner. The folks in this forum have no control over the problem, we are just discussing it. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Keith Send message Joined: 19 May 99 Posts: 483 Credit: 938,268 RAC: 0 |
I don't suppose this will be of much help, but I have also been affected by the validation errors, although only 3 of about 40 tasks completed from 1st to 3rd November were marked as such. In each case, the task was completed successfully by another host, which would imply that there was a problem with my computer. These were all on my dual core G5 Mac, host 2881519 (it does not surprise me that this was the only one affected, as its turnover far exceeds the total of my other 2 computers running SETI). But as this has only happened recently, and is also affecting others, seemingly at random, there seems to be no doubt it is something at the SETI end. Details of the 3 work units are as follows:-
Columns: Result ID, Workunit ID (host ID under "Workunit details"), Sent, Reported or deadline, Server state, Outcome, Client state, CPU time, Claimed credit, Granted credit
==========================================================================
646603669 173561557 28 Oct 2007 21:56:09 UTC 1 Nov 2007 3:08:25 UTC Over Validate error Done 13,353.81 --- ---
Workunit details: created 28 Oct 2007 21:56:01 UTC, name 10fe07af.4661.96906.7.6.91
646603669 2881519 28 Oct 2007 21:56:09 UTC 1 Nov 2007 3:08:25 UTC Over Validate error Done 13,353.81 --- ---
649140457 2952522 1 Nov 2007 8:12:44 UTC 2 Nov 2007 23:35:39 UTC Over Success Done 23,186.73 91.4 91.4
646603670 3579571 29 Oct 2007 1:28:40 UTC 29 Oct 2007 14:30:19 UTC Over Success Done 34,396.80 91.4 91.4
--------------------------------------------------------------------------
647151159 173819936 29 Oct 2007 10:38:33 UTC 1 Nov 2007 19:50:16 UTC Over Validate error Done 12,070.10 --- ---
Workunit details: created 29 Oct 2007 9:11:45 UTC, name 10fe07ag.9206.14388.9.6.23
649705640 3961594 2 Nov 2007 7:07:34 UTC 5 Jan 2008 14:58:35 UTC In Progress Unknown New --- --- ---
647151159 2881519 29 Oct 2007 10:38:33 UTC 1 Nov 2007 19:50:16 UTC Over Validate error Done 12,070.10 --- ---
647151158 2735674 29 Oct 2007 18:43:17 UTC 30 Oct 2007 7:01:53 UTC Over Success Done 11,961.79 81.08 pending
--------------------------------------------------------------------------
649188520 174774956 1 Nov 2007 13:02:16 UTC 3 Nov 2007 14:01:23 UTC Over Validate error Done 11,377.21 --- ---
Workunit details: created 1 Nov 2007 4:31:34 UTC, name 11fe07ag.28804.25021.7.6.96
650940319 --- --- --- Unsent Unknown New --- --- ---
649188520 2881519 1 Nov 2007 13:02:16 UTC 3 Nov 2007 14:01:23 UTC Over Validate error Done 11,377.21 --- ---
649188519 3331609 1 Nov 2007 9:17:51 UTC 2 Nov 2007 8:48:54 UTC Over Success Done 13,720.05 73.6 pending
==========================================================================
Keith |
[KWSN]John Galt 007 Send message Joined: 9 Nov 99 Posts: 2444 Credit: 25,086,197 RAC: 0 |
Thanks, Mark...You took the words right out of my keyboard. I guess I don't get as worked up about this stuff as I used to. On Topic, I haven't noticed any more errors from my machines, but haven't really checked since this morning. Clk2HlpSetiCty:::PayIt4ward |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Thank you too. Those of us who are here almost every day and have seen the problems come and go tend to go into analysis mode when it happens rather than just P&M about it. I'd much rather contribute to a possible solution if I can rather than just complain about the problem. (Not that we don't all express our dismay now and then...LOL)! I'll have the kitties release the day's crunching probably after midnight some time when I get home and see what happens. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Michael T. Allison Retired USN Send message Joined: 25 Oct 07 Posts: 462 Credit: 20,802 RAC: 0 |
geeks,BFD |
Geek@Play Send message Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
Has the validation error problem been quietly resolved during the day today? My last validation error occurred this morning, actually about 9 hours ago. Are you still seeing the error occur?? |
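
Joe's suggestion earlier in the thread, that a validator which cannot find a result file on its first attempt could try again after a selectable delay, can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: it is not BOINC's Validator or Transitioner code, and every name in it (`PendingResult`, `try_validate`, `retry_delay`, `max_attempts`) is hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch only: none of these names come from BOINC itself.

@dataclass
class PendingResult:
    result_id: int
    path: str               # where the uploaded result file is expected
    attempts: int = 0       # failed lookups so far
    next_try: float = 0.0   # earliest time (seconds) for the next lookup


def try_validate(
    res: PendingResult,
    now: float,
    retry_delay: float = 3600.0,   # the selectable amount of "later"
    max_attempts: int = 3,
    exists: Callable[[str], bool] = os.path.exists,
) -> str:
    """Return 'ready', 'retry_later', or 'validate_error' for one result."""
    if now < res.next_try:
        # Still inside the back-off window: do not touch the file system.
        return "retry_later"
    if exists(res.path):
        # File is visible, so normal validation could proceed from here.
        return "ready"
    res.attempts += 1
    if res.attempts >= max_attempts:
        # Only give up after several spaced-out lookups have failed.
        return "validate_error"
    res.next_try = now + retry_delay
    return "retry_later"
```

With `max_attempts=1` the sketch behaves like the situation Joe describes, where a single failed lookup is final. Larger values give a temporarily unavailable file (for example during a dropped NFS mount) a chance to reappear before the result is marked as a Validate error.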
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.