Inconclusive Work Units Running AP Ver 6
Author | Message |
---|---|
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Have no Office installed atm. Got it. Will run the first test tomorrow after work. With each crime and every kindness we birth our future. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I have only 1 so far from my Q6600, http://setiathome.berkeley.edu/workunit.php?wuid=968825720. Got ap_24ja12ah_B3_P0_00233_20120410_31495.wu in the cage, though the r555/CPU app is a new one for this exercise. Typo correction from a long way down the page - Horacio's CPU app is r557, of course, not r577. |
X-Files 27 Send message Joined: 17 May 99 Posts: 104 Credit: 111,191,433 RAC: 0 |
Here's my list of Inconclusive: 2402009700, 2401712891, 2401149365, 2400949559, 2400173351, 2400167519, 2399688500, 2399108050, 2399007772, 2398218968, 2398098019, 2395607586, 2395603227, 2395485254, 2395484101, 2395461657, 2394473363, 2394244633, 2393827071, 2393717959, 2393435584, 2393262477, 2393125823, 2391546328, 2388404565, 2384866932, 2383779922, 2382517326, 2382107375, 2381632875, 2381559982, 2381212026, 2381140422, 2380691271, 2378557376, 2377286245, 2377249863, 2376307271, 2375792932, 2375787448 |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"Here's my list of Inconclusive:" Are those all from Pending AstroPulse v6 tasks for computer 2340442? With your computers hidden, it would take longer than I'm willing to spend before bedtime to check them all individually. But with that throughput, we can get a statistical impression of the scale of the problem - 18 valid, 6 invalid for that host as I type implies a 25% failure rate. That's not good, for you or for the project's bandwidth. |
X-Files 27 Send message Joined: 17 May 99 Posts: 104 Credit: 111,191,433 RAC: 0 |
"Here's my list of Inconclusive:" Yes, it's for 2340442, and I don't have this issue with v5. |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
I'll examine whatever results you folks can provide, and might even undertake a stock CPU run of one of the WUs on my Q8200. It isn't very fast; it took about 45 hours to do an offline stock run of WU 3954412 from Beta, which I was checking because it was originally inconclusive on the r555 CPU build I was running. My offline stock run came up strongly similar to the r555 result, as did the third wingman at Beta, so the indication is that there was some glitch in the _1 result from stock. There are many possible causes for inconclusives; it will be interesting to see if these cases turn up some which haven't been seen before. Joe |
tbret Send message Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40 |
"I'll examine whatever results you folks can provide, might even undertake to do a stock CPU run of one of the WUs on my Q8200. It isn't very fast, took about 45 hours to do an offline stock run of WU 3954412 from Beta which I was checking because it was originally inconclusive on the r555 CPU build I was running. My offline stock run came up strongly similar to the r555 result as did the third wingman at Beta, so the indication there is some glitch in the _1 result from stock." Did you see the list I put in the Panic thread? It might give you a place to start. |
tbret Send message Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40 |
I didn't like that list over there. Here it is where it ought to be, I guess: Here's a list of optimized app v6 AP WUs that validated against stock V6 AP apps (Task - WU - Computer): 2400174547 - 972282042 - 6011644; 2399373128 - 971918307 - 5829212; 2399245287 - 971860355 - 5829212; 2399245286 - 954078529 - 5829212; 2399240418 - 971858387 - 5829212; 2399192807 - 971836264 - 5829212; 2398100980 - 971335178 - 6011644; 2392885930 - 968913612 - 6568834; 2389492481 - 967359215 - 5829212; 2381016725 - 963389045 - 5829212. One inconclusive: 2387615947 - 966484070 - 5829212 |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"I didn't like that list over there. Here it is where it ought to be, I guess:" Yes, better here - though we're really trying to concentrate on the (few) inconclusives, which otherwise get rather lost in the greater number of valid results. Anyway, ap_02ja12aa_B5_P1_00071_20120406_22092.wu has been added to the menagerie - that's r555 on your CPU, there. Thanks. |
hbomber Send message Joined: 2 May 01 Posts: 437 Credit: 50,852,854 RAC: 0 |
Same here: lots of invalids, and inconclusives which will turn into invalids, I guess. In the beginning I suspected they might, because I ran CPU units on the NV GPU, but those sent to the GPU directly by the server got trashed in the same way too. http://setiathome.berkeley.edu/results.php?hostid=6567776&offset=0&show_names=0&state=0&appid=12 Those with short runtimes, no matter how they're marked (GPU or CPU), were done by the GTX 470. I suspected the GPU might be faulty, because another GPU in this system suddenly died, so I ran MB tasks for the last few days - no problem so far, everything validates. It seems the problem is AP-related only and not a GPU fault. I got only one validated out of the 15-20 tasks I ran on this GPU. |
LadyL Send message Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0 |
This should probably go in the gripe thread: 'grad student code'. I coded for my PhD and I'd have been ashamed to make such a mess of the debug logging. WT* is the use of an output that only tells you how far the app got? That's debug logging for while you are developing, not for release! So, now that I've vented: we are looking at an inconclusive/invalid rate of optimised/stock CPU on the order of up to 10-25%?! Since the stock output is virtually useless for even comparing the basic numbers of signals, that will have to go through offline running... OK guys, while we need to grab the files at the inconclusive stage, on this scale it's impossible to run them all. Please notify us when an inconclusive goes invalid, at which stage we can proceed to offline runs. Thanks all for your help. I'm not the Pope. I don't speak Ex Cathedra! |
cliff Send message Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0 |
One of my inconclusives has now been validated: http://setiathome.berkeley.edu/workunit.php?wuid=954832873 It's got a bit of history as well... Regards, Cliff, Been there, Done that, Still no damm T shirt! |
LadyL Send message Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0 |
"Same here, lots of invalids and inconclusives which will turn to invalids I guess. In the beginning I suspected they might be, bcs I calculated CPU units with NV GPU, but those send to GPU from server, got trashed in same way too." 'all units are created equal' - there are no 'CPU' or 'GPU' WUs. If the app has a problem on your GPU, it doesn't matter whether you rescheduled or they were direct sends. I'm wondering about the effect of unroll on the GPU - i.e. if too low unroll leads to errors. I'm not the Pope. I don't speak Ex Cathedra! |
cliff Send message Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0 |
The 1st invalid I reported had its unroll set to 2. After advice I increased the unroll to 10; I still got inconclusives, but 1 validated OK. The default unroll was, AFAIR, set to 1 originally. So it might well have something to do with what's going on. Regards, Cliff, Been there, Done that, Still no damm T shirt! |
LadyL Send message Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0 |
IIRC minimum unroll is 2 - thanks Cliff. So that validated WU has unroll 10 on a GTX 560 (1023MB), driver: 301.24. hbomber has unroll 4 on a GTX 470 (1279MB), driver: 301.24, and it looks like all GPU crunched units are going inconclusive. X-Files 27 has unroll 8 on a GeForce GTX 580 (1536MB), driver: 296.35; some are validating, but I don't see any of the valids coming from an inconclusive state. I'm not the Pope. I don't speak Ex Cathedra! |
LadyL Send message Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0 |
ARGH - make that 'too high unroll' Unless it has to be not too high and not too low, in which case we're in trouble. I'm not the Pope. I don't speak Ex Cathedra! |
hbomber Send message Joined: 2 May 01 Posts: 437 Credit: 50,852,854 RAC: 0 |
I know the units are the same. But I had to note it, to prevent any further questions or misunderstanding. I can run the same unit on the GPU with different unroll values, then on the CPU (presuming that will be valid), and compare them eventually. That way I can avoid the long wait until they get validated by the server, producing junk and wasting energy. The problem is I don't know how to compare the results. If there is a relatively easy way, e.g. comparing the output files, I'll find the proper unroll value for my card. |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
"Same here, lots of invalids and inconclusives which will turn to invalids I guess. In the beginning I suspected they might be, bcs I calculated CPU units with NV GPU, but those send to GPU from server, got trashed in same way too." Too low doesn't hurt. Only too-high values result in overflows. Keep in mind lots of crappy work is out in the field atm. I got 5 in a row with 100% blanking, 2 with 98%. With each crime and every kindness we birth our future. |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
"I know they are same, the units. But I had to note it, to prevent any further questions or misunderstanding." You can easily run these values on your card. <cmdline>-instances_per_device 2 -unroll 10 -ffa_block 6144 -ffa_block_fetch 2048</cmdline> unroll 12 might be possible, but I suggest trying 10 first. With each crime and every kindness we birth our future. |
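
For anyone wanting to try hbomber's idea of testing several unroll values offline, here is a rough Python sketch. It only does two things: it builds the `<cmdline>` string from Mike's post for a given unroll value, and it makes a crude numeric comparison of two result files. The flag names and defaults are copied from the post above. Everything else (the helper names, the idea of pulling every number out of the file text, and the `rel_tol` value) is an assumption for illustration. It is not how the project's validator works, and the real result file layout isn't described in this thread.

```python
import math
import re


def build_cmdline(unroll, instances_per_device=2, ffa_block=6144, ffa_block_fetch=2048):
    """Return a <cmdline> element for one candidate unroll value.

    Flag names and defaults come from the post above; this does not check
    whether a combination is actually safe for a particular card.
    """
    args = (
        f"-instances_per_device {instances_per_device} "
        f"-unroll {unroll} "
        f"-ffa_block {ffa_block} "
        f"-ffa_block_fetch {ffa_block_fetch}"
    )
    return f"<cmdline>{args}</cmdline>"


# Hypothetical helper: matches plain decimal numbers, optionally with an exponent.
_NUMBER = re.compile(r"[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?")


def extract_numbers(text):
    """Pull every number out of a result file's text, in order of appearance."""
    return [float(tok) for tok in _NUMBER.findall(text)]


def results_agree(gpu_text, cpu_text, rel_tol=1e-3):
    """Crude check: same count of numbers, each pair within rel_tol.

    rel_tol is an arbitrary assumption, not the validator's real tolerance.
    """
    gpu_vals = extract_numbers(gpu_text)
    cpu_vals = extract_numbers(cpu_text)
    if len(gpu_vals) != len(cpu_vals):
        return False
    return all(
        math.isclose(g, c, rel_tol=rel_tol, abs_tol=1e-9)
        for g, c in zip(gpu_vals, cpu_vals)
    )


if __name__ == "__main__":
    # Print the command lines for a few candidate unroll values.
    for unroll in (2, 4, 8, 10):
        print(build_cmdline(unroll))
```

To use it, run the same WU once per unroll value and once on the CPU, read each output file into a string, and pass the pairs to `results_agree`. A mismatch only says the numbers differ by more than the chosen tolerance; it doesn't say which run is right.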
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.