Inconclusive Work Units Running AP Ver 6

Message boards : Number crunching : Inconclusive Work Units Running AP Ver 6
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1220067 - Posted: 18 Apr 2012, 22:02:17 UTC - in response to Message 1220055.  

Have no Office installed atm.
You could send me the links via mail.

And I'm not sure I have your email atm, after I lost my daily driver with the contact list in a power cut :-(

I can do you a PM here, though - that should do it?

Edit - PM sent.


Got it.

Will run the first test tomorrow after work.



With each crime and every kindness we birth our future.
ID: 1220067 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1220089 - Posted: 18 Apr 2012, 23:10:17 UTC - in response to Message 1220062.  

I have only 1 so far from my Q6600, http://setiathome.berkeley.edu/workunit.php?wuid=968825720.

Cheers.

Got ap_24ja12ah_B3_P0_00233_20120410_31495.wu in the cage, though the r555/CPU app is a new one for this exercise.

Typo correction from a long way down the page - Horacio's CPU app is r557, of course, not r577.
ID: 1220089 · Report as offensive
Profile X-Files 27
Joined: 17 May 99
Posts: 104
Credit: 111,191,433
RAC: 0
Canada
Message 1220097 - Posted: 18 Apr 2012, 23:39:38 UTC

ID: 1220097 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1220099 - Posted: 18 Apr 2012, 23:48:59 UTC - in response to Message 1220097.  

Here's my list of Inconclusive:

Are those all from Pending AstroPulse v6 tasks for computer 2340442? With your computers hidden, it would take longer than I'm willing to spend before bedtime to check them all individually.

But with that throughput, we can get a statistical impression of the scale of the problem - 18 valid, 6 invalid for that host as I type implies a 25% failure rate. That's not good, for you or for the project's bandwidth.
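
As a quick check of that arithmetic, here is a minimal sketch. The function name `failure_rate` is only illustrative, and it counts just the tasks already judged valid or invalid, not anything still pending or inconclusive.

```python
def failure_rate(valid: int, invalid: int) -> float:
    """Fraction of judged tasks that were marked invalid."""
    total = valid + invalid
    if total == 0:
        raise ValueError("no judged tasks to measure")
    return invalid / total


# Figures quoted in the post: 18 valid, 6 invalid for the host.
rate = failure_rate(valid=18, invalid=6)
print(f"{rate:.0%}")  # 25%
```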
ID: 1220099 · Report as offensive
Horacio

Joined: 14 Jan 00
Posts: 536
Credit: 75,967,266
RAC: 0
Argentina
Message 1220100 - Posted: 18 Apr 2012, 23:57:38 UTC

ID: 1220100 · Report as offensive
Profile X-Files 27
Joined: 17 May 99
Posts: 104
Credit: 111,191,433
RAC: 0
Canada
Message 1220116 - Posted: 19 Apr 2012, 0:59:09 UTC - in response to Message 1220099.  

Yes, it's for 2340442, and I don't have this issue with v5.
ID: 1220116 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1220175 - Posted: 19 Apr 2012, 5:36:36 UTC

I'll examine whatever results you folks can provide, and might even undertake a stock CPU run of one of the WUs on my Q8200. It isn't very fast: it took about 45 hours to do an offline stock run of WU 3954412 from Beta, which I was checking because it was originally inconclusive on the r555 CPU build I was running. My offline stock run came out strongly similar to the r555 result, as did the third wingman's at Beta, so the indication there is of some glitch in the _1 result from stock.

There are many possible causes for inconclusives; it will be interesting to see if these cases turn up some which haven't been seen before.
                                                                   Joe
ID: 1220175 · Report as offensive
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1220200 - Posted: 19 Apr 2012, 7:08:26 UTC - in response to Message 1220175.  

Did you see the list I put in the Panic thread?

Might give you a place to start.
ID: 1220200 · Report as offensive
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1220202 - Posted: 19 Apr 2012, 7:34:05 UTC - in response to Message 1220200.  



I didn't like that list over there. Here it is where it ought to be, I guess:

Here's a list of optimized app v6 AP WUs that validated against stock V6 AP apps.


Task       - WU        - Computer
2400174547 - 972282042 - 6011644
2399373128 - 971918307 - 5829212
2399245287 - 971860355 - 5829212
2399245286 - 954078529 - 5829212
2399240418 - 971858387 - 5829212
2399192807 - 971836264 - 5829212
2398100980 - 971335178 - 6011644
2392885930 - 968913612 - 6568834
2389492481 - 967359215 - 5829212
2381016725 - 963389045 - 5829212

One inconclusive:

2387615947 - 966484070 - 5829212
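
For anyone who wants to tally the list above by host, here is a small sketch. It assumes the three-column `task - WU - computer` layout shown; the names `VALID`, `INCONCLUSIVE` and `hosts` are illustrative only.

```python
from collections import Counter

# Rows copied from the list above: task - WU - computer.
VALID = """\
2400174547 - 972282042 - 6011644
2399373128 - 971918307 - 5829212
2399245287 - 971860355 - 5829212
2399245286 - 954078529 - 5829212
2399240418 - 971858387 - 5829212
2399192807 - 971836264 - 5829212
2398100980 - 971335178 - 6011644
2392885930 - 968913612 - 6568834
2389492481 - 967359215 - 5829212
2381016725 - 963389045 - 5829212
"""
INCONCLUSIVE = "2387615947 - 966484070 - 5829212"


def hosts(block: str) -> list:
    """Return the computer ID (third column) of every row in the block."""
    return [line.split(" - ")[2].strip() for line in block.strip().splitlines()]


valid_by_host = Counter(hosts(VALID))
inconclusive_by_host = Counter(hosts(INCONCLUSIVE))

for host in sorted(valid_by_host):
    print(host, "valid:", valid_by_host[host],
          "inconclusive:", inconclusive_by_host[host])
```

With only eleven tasks the sample is small, so the per-host counts are a starting point for picking WUs to re-run rather than a failure-rate estimate.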
ID: 1220202 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1220216 - Posted: 19 Apr 2012, 9:11:28 UTC - in response to Message 1220202.  

I didn't like that list over there. Here it is where it ought to be, I guess:

One inconclusive:

2387615947 - 966484070 - 5829212

Yes, better here - though we're really trying to concentrate on the (few) inconclusives, which otherwise get rather lost in the greater number of valid results.

Anyway, ap_02ja12aa_B5_P1_00071_20120406_22092.wu has been added to the menagerie - that's r555 on your CPU, there. Thanks.
ID: 1220216 · Report as offensive
hbomber
Volunteer tester

Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 0
Bulgaria
Message 1220219 - Posted: 19 Apr 2012, 9:57:37 UTC
Last modified: 19 Apr 2012, 10:02:05 UTC

Same here: lots of invalids, and inconclusives which I guess will turn into invalids. At first I suspected the cause was that I had run CPU units on the NV GPU, but units sent by the server for the GPU got trashed in the same way too.

http://setiathome.berkeley.edu/results.php?hostid=6567776&offset=0&show_names=0&state=0&appid=12
Those with short runtimes, no matter how they are marked (GPU or CPU), were done by the GTX 470.

I suspected the GPU was faulty, because another GPU in this system suddenly died, so I ran MB tasks for the last few days - no problem so far, everything validates. It seems the problem is only AP related and not a GPU fault.
Only one has validated out of the 15-20 tasks I ran on this GPU.
ID: 1220219 · Report as offensive
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1220223 - Posted: 19 Apr 2012, 10:20:41 UTC

This should probably go in the gripe thread: 'grad student code'

I coded for my PhD and I'd have been ashamed to make such a mess of the debug logging.
WT* is the use of an output that only tells you how far the app got? That's debug logging for while you are developing, not for release!

So, now that I've vented: we are looking at inconclusive/invalid rates for optimised/stock CPU in the order of up to 10-25%?!

Since the stock output is virtually useless for even comparing the basic numbers of signals, that will have to go through offline running...

Ok guys, while we need to grab the files at the inconclusive stage, on this scale it's impossible to run them all. Please notify us when an inconclusive goes invalid, at which stage we can proceed to offline runs.

Thanks all for your help.
I'm not the Pope. I don't speak Ex Cathedra!
ID: 1220223 · Report as offensive
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1220225 - Posted: 19 Apr 2012, 10:34:32 UTC

One of my inconclusives has now been validated:
http://setiathome.berkeley.edu/workunit.php?wuid=954832873

It's got a bit of history as well...


Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1220225 · Report as offensive
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1220227 - Posted: 19 Apr 2012, 10:47:22 UTC - in response to Message 1220219.  

'All units are created equal' - there are no 'CPU' or 'GPU' WUs. If the app has a problem on your GPU, it doesn't matter whether you rescheduled them or they were direct sends.

I'm wondering about the effect of unroll on the GPU - i.e. if too low unroll leads to errors.
I'm not the Pope. I don't speak Ex Cathedra!
ID: 1220227 · Report as offensive
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1220229 - Posted: 19 Apr 2012, 10:55:58 UTC - in response to Message 1220227.  


I'm wondering about the effect of unroll on the GPU - i.e. if too low unroll leads to errors.


The first invalid I reported had its unroll set to 2. After advice I increased the unroll to 10; I still got inconclusives, but one validated OK.

The default unroll was, AFAIR, originally set to 1.

So it might well have something to do with what's going on.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1220229 · Report as offensive
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1220238 - Posted: 19 Apr 2012, 11:33:42 UTC - in response to Message 1220229.  


IIRC minimum unroll is 2 - thanks Cliff.

So that validated WU has unroll 10 on a GTX 560 (1023MB) driver: 301.24

hbomber has unroll 4 on a GTX 470 (1279MB) driver: 301.24
and it looks like all GPU crunched units are going inconclusive.

X-Files 27 has unroll 8 on a GeForce GTX 580 (1536MB) driver: 296.35
some are validating but I don't see any of the valids coming from an inconclusive state.
I'm not the Pope. I don't speak Ex Cathedra!
ID: 1220238 · Report as offensive
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1220248 - Posted: 19 Apr 2012, 12:03:56 UTC

ARGH - make that 'too high unroll'

Unless it has to be not too high and not too low, in which case we're in trouble.
I'm not the Pope. I don't speak Ex Cathedra!
ID: 1220248 · Report as offensive
hbomber
Volunteer tester

Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 0
Bulgaria
Message 1220254 - Posted: 19 Apr 2012, 12:23:21 UTC
Last modified: 19 Apr 2012, 12:24:31 UTC

I know the units are the same, but I had to mention it to prevent any further questions or misunderstanding.
I can run the same unit on the GPU with different unroll values, then on the CPU (presuming that result will be valid), and compare them. That way I can avoid the long wait until the server validates them, during which I'm producing junk and wasting energy.

The problem is I don't know how to compare the results.
If there is a relatively easy way, e.g. comparing the output files, I'll find the proper unroll value for my card.
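
One possible way to frame such a comparison, purely as a sketch: treat each run as a list of reported signals, each signal as a tuple of numbers, and call two runs similar when every signal in one has a counterpart in the other within a relative tolerance. Everything here is an assumption made for illustration: the thread does not describe the AstroPulse result file layout or the project validator's rules, so the tuple contents, the 1% tolerance and the names `signals_match` / `results_match` are hypothetical.

```python
import math
from typing import List, Sequence

Signal = Sequence[float]  # hypothetical: the numeric fields of one reported signal


def signals_match(a: Signal, b: Signal, rel_tol: float = 0.01) -> bool:
    """True when both signals have the same fields, each within rel_tol."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


def results_match(run_a: List[Signal], run_b: List[Signal],
                  rel_tol: float = 0.01) -> bool:
    """True when the runs report the same number of signals and each signal
    in run_a pairs with a distinct signal in run_b (order is ignored).
    The pairing is greedy, which is good enough for a sketch."""
    if len(run_a) != len(run_b):
        return False
    unmatched = list(run_b)
    for sig in run_a:
        for i, candidate in enumerate(unmatched):
            if signals_match(sig, candidate, rel_tol):
                del unmatched[i]
                break
        else:
            return False  # no counterpart found for this signal
    return True


# Made-up numbers, only to show the call shape.
cpu_run = [(10.0, 0.5), (20.0, 1.5)]
gpu_run = [(20.1, 1.5), (10.0, 0.5)]
print(results_match(cpu_run, gpu_run))
```

The real values would first have to be extracted from each run's output file, which is the part this sketch leaves out.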
ID: 1220254 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1220257 - Posted: 19 Apr 2012, 12:24:36 UTC - in response to Message 1220227.  
Last modified: 19 Apr 2012, 12:25:58 UTC

Too low doesn't hurt.
Only values that are too high result in overflows.

Keep in mind that lots of crappy work is out in the field atm.
I got 5 in a row with 100% blanking, and 2 with 98%.


With each crime and every kindness we birth our future.
ID: 1220257 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1220258 - Posted: 19 Apr 2012, 12:29:16 UTC - in response to Message 1220254.  
Last modified: 19 Apr 2012, 12:32:53 UTC

You can easily run these values on your card.

<cmdline>-instances_per_device 2 -unroll 10 -ffa_block 6144 -ffa_block_fetch 2048</cmdline>

Unroll 12 might be possible, but I suggest trying 10 first.
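
If you want to step through several unroll values, a small helper can generate the line for each one. This is only a sketch: the four switches and their defaults are copied from the line above, the helper name `build_cmdline` is made up, and where the resulting `<cmdline>` line has to be placed (presumably the app's entry in your app_info.xml) is an assumption to check against your own setup.

```python
def build_cmdline(unroll: int, instances_per_device: int = 2,
                  ffa_block: int = 6144, ffa_block_fetch: int = 2048) -> str:
    """Build a <cmdline> line using the switches quoted in this post."""
    if unroll < 1:
        raise ValueError("unroll must be a positive integer")
    args = (f"-instances_per_device {instances_per_device} -unroll {unroll} "
            f"-ffa_block {ffa_block} -ffa_block_fetch {ffa_block_fetch}")
    return f"<cmdline>{args}</cmdline>"


# The two values suggested in the post: 10 first, then possibly 12.
for unroll in (10, 12):
    print(build_cmdline(unroll))
```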


With each crime and every kindness we birth our future.
ID: 1220258 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.