How to read APBench validation output


Message boards : Number crunching : How to read APBench validation output

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,768
RAC: 216
Bulgaria
Message 1327874 - Posted: 16 Jan 2013, 9:25:50 UTC
Last modified: 16 Jan 2013, 9:26:53 UTC

This is the output:

ref-ap_6.01r557_SSE2_331_AVX.exe-ap_28ap12ac_B1_P1_00116_20130110_10796.wu.res: <ap_signal>29,<pulses>19,<best_pulses>10
result-AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe-ap_28ap12ac_B1_P1_00116_20130110_10796.wu.res: <ap_signal>29,<pulses>19,<best_pulses>10
All Signals: Weakly similar or Different.
Pulses: Checked 19, 19 , Strongly Similar
Best Pulses: Weakly similar or Different.

-(.\testDatas\ref\ref-ap_6.01r557_SSE2_331_AVX.exe-ap_28ap12ac_B1_P1_00116_20130110_10796.wu.res)-
Reportable Single Pulses: 10 [OK], 4 above threshold*THRESHOLD_FUDGE
Reportable Repeating Pulses: 9 [OK]
Single Pulses (Best): 10 [Weak], 4 above threshold*THRESHOLD_FUDGE

-(.\testDatas\result-AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe-ap_28ap12ac_B1_P1_00116_20130110_10796.wu.res)-
Reportable Single Pulses: 10 [OK], 4 above threshold*THRESHOLD_FUDGE
Reportable Repeating Pulses: 9 [OK]
Single Pulses (Best): 10 [Weak], 4 above threshold*THRESHOLD_FUDGE


Is it OK?
Can someone explain the "above threshold" part and "Single Pulses (Best): 10 [Weak]"? In general, what does the output look like for a probably invalid result?
____________

Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4206
Credit: 1,030,844
RAC: 270
United States
Message 1328089 - Posted: 16 Jan 2013, 22:11:57 UTC - in response to Message 1327874.

Yes, it's OK.

It is indeed somewhat confusing when comparing results from a CPU application and an OpenCL application.

The AP Validator only checks single pulses which are "above threshold*THRESHOLD_FUDGE", but rescmpAP checks all of them. The OpenCL applications don't waste time sending single pulses back from GPU to CPU unless they're definitely needed, so the "Single Pulses (Best)" positions which the Validator isn't going to look at don't match the CPU results, and "Single Pulses (Best): 10 [Weak]" is expected.

The threshold is what applications use to decide whether a pulse should be reported. THRESHOLD_FUDGE is 1.01, so the Validator only checks single pulses which are at least 1% above threshold.
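
As a rough illustration of that rule, here is a minimal Python sketch. Only the 1.01 value comes from the post above; the function names, the use of `>=` for "at least 1% above", and the sample powers are assumptions made for the example, not the Validator's actual code.

```python
# Value quoted in the thread; everything else here is a hypothetical sketch.
THRESHOLD_FUDGE = 1.01


def validator_checks(pulse_power, threshold):
    """Return True if a single pulse is strong enough to be compared.

    Assumption: "at least 1% above threshold" is read as
    pulse_power >= threshold * THRESHOLD_FUDGE.
    """
    return pulse_power >= threshold * THRESHOLD_FUDGE


def count_checked(powers, threshold):
    """Count the single pulses that would fall under the check."""
    return sum(1 for p in powers if validator_checks(p, threshold))


if __name__ == "__main__":
    # Made-up powers against a made-up threshold of 100.0.
    powers = [99.0, 100.5, 101.5, 120.0]
    print(count_checked(powers, 100.0), "of", len(powers), "pulses would be checked")
```

A pulse at 100.5 is reportable (it is above the threshold) but falls below the fudged limit, which is the kind of pulse counted under "Reportable Single Pulses" without being counted as "above threshold*THRESHOLD_FUDGE".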

Joe

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,768
RAC: 216
Bulgaria
Message 1328223 - Posted: 17 Jan 2013, 9:54:44 UTC
Last modified: 17 Jan 2013, 9:57:47 UTC

Thank you, Josef.
What will the output look like if there is a possibly invalid result? Apart from an obvious difference in pulse counts, in which part of the output will it show up, and how, if the pulses found are not similar enough to validate?
____________

Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4206
Credit: 1,030,844
RAC: 270
United States
Message 1328321 - Posted: 17 Jan 2013, 17:56:11 UTC - in response to Message 1328223.


Here's one example of a comparison which the Validator would score as inconclusive, so another task would be sent:

ref-ap_6.00r486_SSE.exe-LoThresh_v5.dat.res: <ap_signal>70,<pulses>60,<best_pulses>10
result-ap_6.01r548x_SSE2_331_AVX.exe-LoThresh_v5.dat.res: <ap_signal>70,<pulses>60,<best_pulses>10
All Signals: Weakly similar or Different.
Pulses: pulse at signal 28 has no match (direction -->)
Weakly similar or Different.
Best Pulses: Checked 10, 10 , Strongly Similar

-(.\testDatas\ref\ref-ap_6.00r486_SSE.exe-LoThresh_v5.dat.res)-
Reportable Single Pulses: 30 [OK], 18 above threshold*THRESHOLD_FUDGE
Reportable Repeating Pulses: 30 [Weak]
Single Pulses (Best): 10 [OK], 10 above threshold*THRESHOLD_FUDGE

-(.\testDatas\result-ap_6.01r548x_SSE2_331_AVX.exe-LoThresh_v5.dat.res)-
Reportable Single Pulses: 30 [OK], 18 above threshold*THRESHOLD_FUDGE
Reportable Repeating Pulses: 30 [Weak]
Single Pulses (Best): 10 [OK], 10 above threshold*THRESHOLD_FUDGE

That "Pulses: pulse at signal 28 has no match (direction -->)" is the primary indicator. In this case, the counts which follow ("Reportable Repeating Pulses: 30 [Weak]") indicate the mismatch was on a Repeating Pulse, and the Validator checks all of those.

Another possible cause of an inconclusive would be if one of the results has fewer "above threshold*THRESHOLD_FUDGE" Single Pulses than the other.
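
For unattended testing, the two indicators just described could be picked out of saved output with a short script. This is only a sketch in Python: the function name and hint wording are made up for illustration, and it assumes the output keeps exactly the line formats shown in this thread.

```python
import re

# Line formats copied from the output in this thread; the matching logic
# itself is a hypothetical sketch, not part of APBench or rescmpAP.
NO_MATCH = re.compile(r"has no match")
SINGLE = re.compile(
    r"^Reportable Single Pulses:.*?,\s*(\d+) above threshold\*THRESHOLD_FUDGE"
)

SAMPLE = """\
All Signals: Weakly similar or Different.
Pulses: pulse at signal 28 has no match (direction -->)
Weakly similar or Different.
Best Pulses: Checked 10, 10 , Strongly Similar
Reportable Single Pulses: 30 [OK], 18 above threshold*THRESHOLD_FUDGE
Reportable Repeating Pulses: 30 [Weak]
Single Pulses (Best): 10 [OK], 10 above threshold*THRESHOLD_FUDGE
Reportable Single Pulses: 30 [OK], 18 above threshold*THRESHOLD_FUDGE
Reportable Repeating Pulses: 30 [Weak]
Single Pulses (Best): 10 [OK], 10 above threshold*THRESHOLD_FUDGE
"""


def inconclusive_hints(output):
    """Collect lines and count mismatches that suggest an inconclusive."""
    hints = []
    above = []  # "above threshold*THRESHOLD_FUDGE" counts, one per result file
    for raw in output.splitlines():
        line = raw.strip()
        if NO_MATCH.search(line):
            hints.append(line)
        m = SINGLE.match(line)
        if m:
            above.append(int(m.group(1)))
    if len(above) == 2 and above[0] != above[1]:
        hints.append(
            "above-threshold single pulse counts differ: "
            f"{above[0]} vs {above[1]}"
        )
    return hints


if __name__ == "__main__":
    for hint in inconclusive_hints(SAMPLE):
        print("possible inconclusive:", hint)
```

An empty list only means that neither indicator was found; it does not prove the Validator would accept the pair.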

For inconclusives, it's generally not possible to know in advance whether the final Validator judgement would be "invalid" or whether both results would get credit because they're weakly similar. However, cases where one result has e.g. 40 reportable signals and the other none are fairly obvious.
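
The "fairly obvious" cases can be spotted from the two summary lines at the top of the output alone. The Python sketch below assumes the `<ap_signal>N,<pulses>N,<best_pulses>N` format shown in this thread; the helper names are hypothetical, and the second result line in the demo is invented to show a mismatch.

```python
import re

# Matches the count summary at the end of each ref-/result- line.
HEADER = re.compile(r"<ap_signal>(\d+),<pulses>(\d+),<best_pulses>(\d+)")


def header_counts(line):
    """Return (ap_signal, pulses, best_pulses) as ints, or None if absent."""
    m = HEADER.search(line)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())


def counts_agree(ref_line, result_line):
    """True only if both lines carry counts and all three are equal."""
    ref = header_counts(ref_line)
    res = header_counts(result_line)
    return ref is not None and ref == res


if __name__ == "__main__":
    ref = "ref-ap_6.00r486_SSE.exe-LoThresh_v5.dat.res: <ap_signal>70,<pulses>60,<best_pulses>10"
    # Invented line: a result which reported no pulses at all.
    bad = "result-example.exe-LoThresh_v5.dat.res: <ap_signal>10,<pulses>0,<best_pulses>10"
    print(counts_agree(ref, ref), counts_agree(ref, bad))
```

Equal counts are necessary but not sufficient: the inconclusive example above has identical counts in both files.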

The rescmpAP comparison is more rigorous than the Validator because, if there's some difference, the result files can be manually compared to figure out what happened and judge whether it's something which needs fixing.
Joe

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,768
RAC: 216
Bulgaria
Message 1328335 - Posted: 17 Jan 2013, 18:15:57 UTC

Great information, thank you again!
I'm running overclocked parts, so after each new tuning the tests need to pass yet another filter before units flow back to the server unattended.
That's all I needed to know for now.
____________

Copyright © 2014 University of California