You might want to check this one Again...

Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1365004 - Posted: 6 May 2013, 1:35:08 UTC - in response to Message 1364849.  

Another one that makes you say Whut?
I had the 6850 in that machine to see how it worked with Windows 8. It completed close to 200 APs in those few days without a problem, until;
Workunit 1228052924
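Task 	Computer 	Sent 	Time reported or deadline 	Status 	Run time (sec) 	CPU time (sec) 	Credit 	Application
2959989611 	6686127 	27 Apr 2013, 2:31:36 UTC 	27 Apr 2013, 11:31:54 UTC 	Completed and validated 	20,933.70 	4,127.53 	780.63 	AstroPulse v6 v6.04 (opencl_nvidia_100)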
2959989612 	6796475 	27 Apr 2013, 2:31:40 UTC 	28 Apr 2013, 17:32:17 UTC 	Completed, marked as invalid 	2,227.31 	778.26 	0.00 	AstroPulse v6 Anonymous platform (ATI GPU)
2963067135 	6887239 	28 Apr 2013, 21:52:18 UTC 	5 May 2013, 8:56:24 UTC 	Completed and validated 	426,448.30 	312,739.90 	780.63 	AstroPulse v6 v6.01

...

Your result apparently fell victim to a quirk of the AP Validator. This portion of the stderr gives most of the needed info for a good guess at the difficulty:
single pulses: 2
repetitive pulses: 1
percent blanked: 0.00
Rep. pulse: peak_power=5160; dm=-4912; fft_num=0; scale=6752; ffa_scale=4; period=1.97e-291
Single pulse: peak_power=654.7; dm=-11557; scale=9
Single pulse: peak_power=650; dm=-13836; scale=9

The other piece of data needed is the threshold set in AP WU headers for scale 9 single pulses, which is 648.259888. The Validator ignores single pulses which are less than 1.01 * the threshold (654.7424869 for scale 9). The second single pulse is clearly less than that, so it can be discounted. The first one, however, is so close that it may actually be either below or above that critical level, with the wingmates' results falling on the other side. And because it's the higher of the two single pulses found, it would also be the "best" pulse at scale 9, so it is effectively counted twice in the calculation of whether at least half of your signals match the canonical result.
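
For anyone who wants to check that arithmetic, here is a minimal sketch in Python. Only the threshold (648.259888) and the 1.01 factor come from the post above. The function names and the `rounding` allowance are hypothetical, and the sketch assumes the power printed in stderr is rounded to about one decimal place. It is not the Validator's actual code.

```python
# Values quoted in the post above; everything else here is illustrative.
SCALE9_THRESHOLD = 648.259888   # AP WU header threshold for scale 9 single pulses
VALIDATOR_MARGIN = 1.01         # pulses below margin * threshold are ignored


def cutoff(threshold, margin=VALIDATOR_MARGIN):
    """Power below which the Validator is described as ignoring a single pulse."""
    return threshold * margin


def classify(printed_power, threshold, rounding=0.05):
    """Classify a pulse power as printed in stderr.

    `rounding` is an assumed half-width for the printed value: 654.7 could
    stand for anything from about 654.65 up to 654.75.
    """
    level = cutoff(threshold)
    if printed_power + rounding < level:
        return "below"
    if printed_power - rounding >= level:
        return "above"
    return "ambiguous"


for power in (654.7, 650.0):
    print(power, classify(power, SCALE9_THRESHOLD))
```

Under those assumptions the 650 pulse comes out "below" and the 654.7 pulse comes out "ambiguous", since the cutoff of about 654.7425 falls inside the range that 654.7 could have been rounded from.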

It is of course quite rare for all the factors to line up like that. But with different hardware and software producing results, it is inevitable that some signals sufficiently close to a critical level won't match.
                                                                  Joe
ID: 1365004
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1365014 - Posted: 6 May 2013, 3:57:08 UTC - in response to Message 1365004.  

I had a feeling it had to be another one of those quirks. Seems those quirks are directly related to the number of tasks completed. It's similar to the first post, where two hosts with the same overflow error eventually teamed up to produce valid results...according to the validator. Considering the different types of hardware and software involved, maybe those critical levels are a bit too critical? It would seem the level of accuracy sought isn't easily obtained unless both hosts are suffering from the same malfunction. I've witnessed how accurate those canonical results can be. As time progresses, the differences in hardware/software will most likely increase even further. I guess I can expect even more quirks...
ID: 1365014
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1367216 - Posted: 12 May 2013, 17:23:06 UTC
Last modified: 12 May 2013, 17:24:43 UTC

Another one, Workunit 1235724601
The validator keeps getting the -12 Error from the Old GPU Apps and fails to call in a CPU Host....So, I get an Invalid? Whut?

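Task 	Computer 	Sent 	Time reported or deadline 	Status 	Run time (sec) 	CPU time (sec) 	Credit 	Application
2976456136 	6797524 	4 May 2013, 9:23:08 UTC 	4 May 2013, 10:53:47 UTC 	Completed, can't validate 	166.16 	35.16 	0.00 	SETI@home Enhanced Anonymous platform (ATI GPU)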
2976456137 	6907433 	4 May 2013, 9:23:08 UTC 	4 May 2013, 10:38:12 UTC 	Error while computing 	2,631.69 	180.31 	--- 	SETI@home Enhanced v6.09 (cuda23)
2976582225 	6946287 	4 May 2013, 10:38:15 UTC 	5 May 2013, 9:14:15 UTC 	Error while computing 	133.16 	50.58 	--- 	SETI@home Enhanced v6.10 (cuda_fermi)
2978973833 	6215837 	5 May 2013, 9:14:27 UTC 	5 May 2013, 14:08:16 UTC 	Error while computing 	148.23 	65.16 	--- 	SETI@home Enhanced v6.10 (cuda_fermi)
2979509813 	5484531 	5 May 2013, 14:08:26 UTC 	9 May 2013, 12:48:39 UTC 	Error while computing 	200.28 	58.83 	--- 	SETI@home Enhanced v6.10 (cuda_fermi)
2987745778 	6269537 	9 May 2013, 15:44:38 UTC 	11 May 2013, 4:04:42 UTC 	Error while computing 	206.58 	58.09 	--- 	SETI@home Enhanced v6.10 (cuda_fermi)
2991325527 	6867572 	11 May 2013, 7:52:42 UTC 	12 May 2013, 8:26:04 UTC 	Error while computing 	18.09 	15.37 	--- 	SETI@home Enhanced v6.10 (cuda_fermi)

errors: Too many errors (may have bug)
ID: 1367216
Wedge009
Volunteer tester

Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1377645 - Posted: 6 Jun 2013, 21:10:08 UTC

Unfortunately, this happens from time to time. Just have to take it, sometimes. I got one just now, with AstroPulse: 1258964237

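Task 	Computer 	Sent 	Time reported or deadline 	Status 	Run time (sec) 	CPU time (sec) 	Credit 	Application
3028141953 	6590324 	4 Jun 2013, 19:31:10 UTC 	4 Jun 2013, 20:12:53 UTC 	Error while computing 	0.00 	0.00 	--- 	AstroPulse v6 v6.04 (ati_opencl_100)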
3028141954 	5338379 	4 Jun 2013, 19:31:15 UTC 	4 Jun 2013, 19:36:29 UTC 	Error while computing 	0.00 	0.00 	--- 	AstroPulse v6 v6.04 (ati_opencl_100)
3028399179 	6882889 	5 Jun 2013, 2:37:22 UTC 	5 Jun 2013, 2:42:33 UTC 	Error while downloading 	0.00 	0.00 	--- 	AstroPulse v6 v6.01
3028435665 	6804917 	5 Jun 2013, 3:37:46 UTC 	5 Jun 2013, 3:42:56 UTC 	Error while computing 	0.00 	0.00 	--- 	AstroPulse v6 v6.04 (ati_opencl_100)
3028852786 	5073546 	5 Jun 2013, 10:53:08 UTC 	5 Jun 2013, 10:58:19 UTC 	Error while computing 	0.00 	0.00 	--- 	AstroPulse v6 v6.04 (ati_opencl_100)
3028889170 	6690685 	5 Jun 2013, 11:42:31 UTC 	6 Jun 2013, 19:44:53 UTC 	Completed, can't validate 	3,344.82 	1,788.19 	0.00 	AstroPulse v6 Anonymous platform (NVIDIA GPU)
3029200437 	4501000 	5 Jun 2013, 18:02:23 UTC 	5 Jun 2013, 18:07:36 UTC 	Error while computing 	9.34 	6.82 	--- 	AstroPulse v6 v6.04 (ati_opencl_100)

Soli Deo Gloria
ID: 1377645
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1386580 - Posted: 1 Jul 2013, 20:03:11 UTC
Last modified: 1 Jul 2013, 20:30:55 UTC

Oh Noes, another Invalid! But wait. One of my CPU tasks is being marked Invalid? Against what?
OpenCL 1.2 AMD-APP (1084.4) alias Standard Cat 13.1. But Cat 13.1 is known to be buggy. Both the 'Valids' were using AMD-APP 1084.4?

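Task 	Computer 	Sent 	Time reported or deadline 	Status 	Run time (sec) 	CPU time (sec) 	Credit 	Application
3048663187 	5416070 	23 Jun 2013, 9:24:57 UTC 	23 Jun 2013, 9:34:56 UTC 	Error while computing 	0.00 	0.00 	--- 	AstroPulse v6 v6.07 (cuda_opencl_100)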
3048663188 	6859445 	23 Jun 2013, 9:24:58 UTC 	29 Jun 2013, 1:47:24 UTC 	Completed and validated 	6,935.72 	932.92 	712.36 	AstroPulse v6 v6.06 (opencl_ati_100)
3048675141 	6797524 	23 Jun 2013, 9:35:08 UTC 	30 Jun 2013, 22:21:10 UTC 	Completed, marked as invalid 	33,266.47 	32,673.26 	0.00 	AstroPulse v6 Anonymous platform (CPU)
3058117956 	6787341 	30 Jun 2013, 22:21:18 UTC 	1 Jul 2013, 7:00:52 UTC 	Completed and validated 	2,801.75 	688.07 	712.36 	AstroPulse v6 v6.06 (opencl_ati_100)

Maybe someone needs to tell the validator about Cat 13.1.
I've been robbed!
ID: 1386580
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1386588 - Posted: 1 Jul 2013, 20:32:15 UTC - in response to Message 1386580.  
Last modified: 1 Jul 2013, 20:39:29 UTC

Oh Noes, another Invalid! But wait. One of my CPU tasks is being marked Invalid? Against what?
OpenCL 1.2 AMD-APP (1084.4) alias Standard Cat 13.1. But Cat 13.1 is known to be buggy. Both the 'Valids' were using AMD-APP 1084.4?

Maybe someone needs to tell the validator about Cat 13.1.
I've been robbed!

There is no known problem with Cat 13.1 and the AMD/ATI OpenCL Astropulse app; the problem with Cat 13.1 is with the AMD/ATI OpenCL MB app.

If you had run the r1797 CPU app rather than the old r557 app, you would have seen what signals didn't match.

Your wingman's results:
single pulses: 3
repetitive pulses: 1
percent blanked: 0.00
Rep. pulse: peak_power=8379; dm=3568; fft_num=0; scale=3520; ffa_scale=4; period=1.322e+137
Single pulse: peak_power=652.9; dm=-6336; scale=9
Single pulse: peak_power=87.07; dm=11731; scale=5
Single pulse: peak_power=133.9; dm=11744; scale=6

Your result:
single pulses: 3
repetitive pulses: 1
percent blanked: 0.00


Claggy
ID: 1386588
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1386591 - Posted: 1 Jul 2013, 20:41:37 UTC - in response to Message 1386588.  
Last modified: 1 Jul 2013, 21:08:04 UTC

Oh Noes, another Invalid! But wait. One of my CPU tasks is being marked Invalid? Against what?
OpenCL 1.2 AMD-APP (1084.4) alias Standard Cat 13.1. But Cat 13.1 is known to be buggy. Both the 'Valids' were using AMD-APP 1084.4?

Maybe someone needs to tell the validator about Cat 13.1.
I've been robbed!

There is no known problem with Cat 13.1 and the AMD/ATI OpenCL Astropulse app; the problem with Cat 13.1 is with the AMD/ATI OpenCL MB app.

If you had run the r1797 CPU app rather than the old r557 app, you would have seen what signals didn't match.

single pulses: 3
repetitive pulses: 1
percent blanked: 0.00
Rep. pulse: peak_power=8379; dm=3568; fft_num=0; scale=3520; ffa_scale=4; period=1.322e+137
Single pulse: peak_power=652.9; dm=-6336; scale=9
Single pulse: peak_power=87.07; dm=11731; scale=5
Single pulse: peak_power=133.9; dm=11744; scale=6


Claggy

13.1 known to cause issues with OpenCL btw...
Raistmer
Volunteer developer

?

Here's another, talking about a 'known' issue;
For overflowed tasks found signal sequence not always match CPU version. That one is a double;
"Cat 13.1 should still be avoided"
Claggy


Run another CPU instance on that WU and see who is invalid...
ID: 1386591
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1386598 - Posted: 1 Jul 2013, 21:11:37 UTC - in response to Message 1386591.  

13.1 known to cause issues with OpenCL btw...
Raistmer
Volunteer developer

?

Yes, on the Multibeam app only. I'm the Alpha Tester who reported problems with it: the compilations produced by Cat 13.1 cause driver restarts on mid-range GPUs on mid-AR WUs
(it's OK on high-end GPUs like Mike's, with 2 or 3 times as many CUs as mine).
Later revisions of the MB app helped reduce the problems but have not eliminated them. Astropulse, on the other hand, has had no problems with Cat 13.1.

Run another CPU instance on that WU and see who is invalid...

Impossible, unless you have the WU file stashed away ready for rerunning. If a result is shown as Invalid, then the WU file will have been purged very shortly afterwards:

http://setiathome.berkeley.edu/sah/download_fanout/f2/ap_31mr08aa_B1_P0_00039_20130623_07674.wu

You've got to get hold of the WU while it's still showing as inconclusive.

Claggy
ID: 1386598
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1386602 - Posted: 1 Jul 2013, 21:21:36 UTC - in response to Message 1386598.  
Last modified: 1 Jul 2013, 21:24:06 UTC

13.1 known to cause issues with OpenCL btw...
Raistmer
Volunteer developer

?

Yes, on the Multibeam app only. I'm the Alpha Tester who reported problems with it: the compilations produced by Cat 13.1 cause driver restarts on mid-range GPUs on mid-AR WUs
(it's OK on high-end GPUs like Mike's, with 2 or 3 times as many CUs as mine).
Later revisions of the MB app helped reduce the problems but have not eliminated them. Astropulse, on the other hand, has had no problems with Cat 13.1.

Run another CPU instance on that WU and see who is invalid...

Impossible, unless you have the WU file stashed away ready for rerunning. If a result is shown as Invalid, then the WU file will have been purged very shortly afterwards:

http://setiathome.berkeley.edu/sah/download_fanout/f2/ap_31mr08aa_B1_P0_00039_20130623_07674.wu

You've got to get hold of the WU while it's still showing as inconclusive.

Claggy

Hey, no problem. It's bad science in the database as far as I'm concerned. Those CPUs on My host have completed many more APs than the others combined. Just about all My other CPU APs are Valid, so, I'd hate to think they are now Invalid....All those Hundreds of AP CPU tasks...Invalid...
ID: 1386602