Computation Error - Bad Workunit Header
Author | Message |
---|---|
Fred J. Verster Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
Just had 68 go through with another load still downloading. I was hoping to fill the cache as well over the weekend, never mind. Thanx Fred, for the idea, didn't know it. Just signed up. |
zoom3+1=4 Send message Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49 |
This definitely sounds like splitter problems. I received a bunch of errors like this on several occasions. One that I got to check on had a bunch of wu's sent out with the binary data intact but just no header in it. On another occasion I received a bunch of files that were zero length! They didn't contain any header or data. On both occasions it was determined that a splitter had acted up and created the defective files. I certainly hope so, I'm getting them on all 3 of My PCs, And they are as follows: 5 x 13fe08ac.6464, 4 x 13fe08ac.8515, 2 x 13fe08ac.23325, 4 x 13fe08ac.24787. And so far that is 15 total WU's that have been erroring on Me. :( I'm glad It's not My end that's causing the errors (and I thought It was), But at the same time I'm sad to see the splitters spitting out such WU's. Edit: I've had 3 more just in the last few minutes: 1 x 13fe08ac.23325 (PC3), 2 x 13fe08ac.24787 (PC2). The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Alinator Send message Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0 |
OK, here's the sitch. I have just picked a task from 13fe08ad. So far I have no reason to suspect it other than it is from the same day as the last batch of failed test data. Anybody else have some from this set to comment on yet? Alinator |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
OK, here's the sitch. Last reference I saw was Richard's post here F. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
OK, here's the sitch. No, none here. How does the header compare with my message 723462 below? Red or green? |
Keith T. Send message Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
OK, here's the sitch. It should be OK. Both Eric and Matt have indicated that it was a bad splitter, not bad "tape". The Server status page shows that 13fe08ad and 13fe08ae are currently being split, and that one splitter mb_splitter5 is currently disabled. Sir Arthur C Clarke 1917-2008 |
Alinator Send message Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0 |
OK, here's the sitch. Can't tell right now. It's on one of my remote hosts and I don't have it set up so I can look at the input files remotely. I'll check it tomorrow. @ Keith: OK, that makes sense now that I think about it in big picture terms. I was always going on the idea that it was more a larger scale test of the radar blanking technology rather than just a 'bad' splitter per se. They just chose a 2008 data set to make it easier to tell the good, from the bad, from the ugly, so to speak! :-) Alinator |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Well... 2 of my 36 13fe08ac's have crunched normally. I have done a text search and not found any with empty <data_type>, <window> or <filter> tags so I am going to let them crunch through and see what happens. Bit of a negative test on Richard's hypothesis but I suppose any additional data may help (and if one of them should crash and burn, then ...) F. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Well... 2 of my 36 13fe08ac's have crunched normally. I have done a text search and not found any with empty <data_type>, <window> or <filter> tags so I am going to let them crunch through and see what happens. Were they split (WU created) before or after Eric's "I'm on the case" message? |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Well... 2 of my 36 13fe08ac's have crunched normally. I have done a text search and not found any with empty <data_type>, <window> or <filter> tags so I am going to let them crunch through and see what happens. The oldest one was sent out on 6th March so that would make it "before". F. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Well... 2 of my 36 13fe08ac's have crunched normally. I have done a text search and not found any with empty <data_type>, <window> or <filter> tags so I am going to let them crunch through and see what happens. Ah. Not only 'before', but 'much before'. See message 724253 in the 'other' thread: Here's a funny thing. My linux box here has one of the 13fe08ac WUs, and it was crunching it at lunchtime when I checked it. When I checked the wingman, he had completed it and was also running linux. When I get home this evening I will check to see if it completed. |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Well... 2 of my 36 13fe08ac's have crunched normally. I have done a text search and not found any with empty <data_type>, <window> or <filter> tags so I am going to let them crunch through and see what happens. Right; well my text search obviously failed miserably. I have now located a good half dozen that were issued to me in the early hours (UTC) of the 9th and all have empty <data_type>, <window> and <filter> tags. I'm now debating whether to try to find them in my cache and abort them - but I think that can wait until the morning!!! F. |
perryjay Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
I just reported the one I had on my machine. My wingman and I have both finished with it but the bad thing I noticed is that the silly thing has been sent out to two more people. Too bad there isn't a way to catch these before they get reissued. Looks like these things are going to be causing headaches for some time to come before we see the last of them. :( PROUD MEMBER OF Team Starfire World BOINC |
Alinator Send message Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0 |
I just reported the one I had on my machine. My wingman and I have both finished with it but the bad thing I noticed is that the silly thing has been sent out to two more people. Too bad there isn't a way to catch these before they get reissued. Looks like these things are going to be causing headaches for some time to come before we see the last of them. :( I found this one which hasn't run yet on one of your hosts. You might want to give it the boot manually. Alinator |
perryjay Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
Thanks for the heads up Alinator, I missed that one. It's gone now. :) PROUD MEMBER OF Team Starfire World BOINC |
zoom3+1=4 Send message Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49 |
Well... 2 of my 36 13fe08ac's have crunched normally. I have done a text search and not found any with empty <data_type>, <window> or <filter> tags so I am going to let them crunch through and see what happens. I aborted every last one that I found, I just wish somebody could get rid of them before they get out and send them to the loony bin where they belong. :D As they are annoyingly empty. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
Well... 2 of my 36 13fe08ac's have crunched normally. I have done a text search and not found any with empty <data_type>, <window> or <filter> tags so I am going to let them crunch through and see what happens. But why abort them? BOINC can do this job too. |
Keith T. Send message Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
I thought that bad WU's got cancelled after 5 errors, but I've just checked my only one of these and it has now reached 5 errors with a new Unsent generated. http://setiathome.berkeley.edu/workunit.php?wuid=234337343 Sir Arthur C Clarke 1917-2008 |
W-K 666 Send message Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67 |
I thought that bad WU's got cancelled after 5 errors, but I've just checked my only one of these and it has now reached 5 errors with a new Unsent generated. http://setiathome.berkeley.edu/workunit.php?wuid=234337343 The one that has been sent was sent before the last one that reported in. This would be normal operation, because before that BOINC would be trying to get two units to form a quorum. It should now mark the unsent unit as not needed. |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
If I abort them now, then BOINC will re-send to another sap immediately so they will reach their 5 failures and be deleted from the system sooner. Otherwise, they would be hanging around in my cache for the next 3 or 4 days before being crunched - producing their errors - and then being sent out again. Just trying to speed the process up a little. F. |
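
For anyone repeating the text search described above, here is a rough Python sketch of checking cached workunit files for empty <data_type>, <window> or <filter> tags, plus the zero-length files mentioned earlier in the thread. It is only an illustration under assumptions: it assumes the header is plain text near the start of each file, and the `*.wu` file pattern, the 64 KB read limit, and the function names are hypothetical, not anything BOINC or SETI@home defines.

```python
import re
from pathlib import Path

# Header tags reported as empty in the bad workunits discussed in this thread.
TAGS = ("data_type", "window", "filter")


def empty_tags(header_text, tags=TAGS):
    """Return the tags that occur in header_text with only whitespace inside."""
    found = []
    for tag in tags:
        # Matches <tag></tag>, <tag>   </tag> and the self-closing <tag/>.
        pattern = rf"<{tag}>\s*</{tag}>|<{tag}\s*/>"
        if re.search(pattern, header_text):
            found.append(tag)
    return found


def scan_directory(directory, glob="*.wu", head_bytes=65536):
    """Map file name -> list of problems for every suspect file in directory.

    The glob pattern is a placeholder; adjust it to however the workunit
    files are actually named on your host.
    """
    suspects = {}
    for path in sorted(Path(directory).glob(glob)):
        if path.stat().st_size == 0:
            suspects[path.name] = ["zero-length file"]
            continue
        # Only the start of the file is read; the binary data follows the header.
        with open(path, "rb") as fh:
            head = fh.read(head_bytes).decode("ascii", errors="replace")
        bad = empty_tags(head)
        if bad:
            suspects[path.name] = bad
    return suspects


if __name__ == "__main__":
    # Hypothetical usage: scan the current directory and list suspect files.
    for name, problems in scan_directory(".").items():
        print(name, "->", ", ".join(problems))
```

A file that shows up in the result is only a candidate for aborting by hand; the script does not touch the BOINC client or the cache itself.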