Validation Inconclusive, is it me or is this becoming more common?



Message boards : Number crunching : Validation Inconclusive, is it me or is this becoming more common?

Author Message
Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,898,665
RAC: 2,449
United States
Message 1324928 - Posted: 5 Jan 2013, 14:59:22 UTC
Last modified: 5 Jan 2013, 15:08:29 UTC

Since they've broken that out for filtering on the results page, I've noticed that it's been growing as more and more of my GPU units wait for a second opinion. I agree that outright mismatches in the number of pulses/spikes/triplets/Gaussians detected warrant another look, but when both a CPU and a GPU unit report the same counts, even all zeros, why would the result be classified as inconclusive? It can't just be because of the flop count; there's no reason why two different apps, each optimized for its own platform, should have the same or even a reasonably close flop count.

When they first introduced the breakout, my inconclusive units were around a third of my pending. Now the two are nearly the same, and the inconclusives outnumber my daily valids. Did someone tighten up the validation requirements recently?

Edit: I just noticed that the additional unit isn't being assigned and hasn't been for a while now. That may have to do with the huge pile of MB units generated post-crash being handed out first, which would also explain why the number of inconclusives is growing.
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 8535
Credit: 59,534,564
RAC: 87,931
United Kingdom
Message 1324942 - Posted: 5 Jan 2013, 15:34:16 UTC

I'm not sure about it being more common, but it certainly "plagues" some crunchers far worse than others.
There are certainly crunchers out there who produce inconclusive results far more often than others. My uncalibrated eye suggests that the same crunchers have a higher-than-average error rate and show far more tasks than one would reasonably expect under the currently imposed distribution limits.
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Donald L. Johnson (Project donor)
Joined: 5 Aug 02
Posts: 6264
Credit: 740,403
RAC: 1,168
United States
Message 1324961 - Posted: 5 Jan 2013, 16:24:25 UTC
Last modified: 5 Jan 2013, 16:25:21 UTC

As I recall from previous discussions, the std.err.out report that we see is only a summary, not the whole story. There are significant differences between CPU and GPU apps in the way they process the WU data, so even if the pulse/spike/etc. counts are the same, they may not BE the same pulses, spikes, etc. I have seen this many times when my stock CPU apps are paired with GPU or optimized apps.

There are also differences in precision and thresholds, so a particular signal may be above threshold on one app and just below it (and so not be counted) on another. There is a mechanism to determine how many signals match and how many do not. IIRC, it is 50% or more to validate; with less than 50% you get an "inconclusive" and a third look.
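The threshold effect described above can be illustrated with a minimal sketch. It assumes a hypothetical detection threshold of 24.0 and a made-up peak value, and it uses single- versus double-precision rounding as a stand-in for whatever differences actually exist between the apps. None of the names or numbers come from the real SETI@home code.

```python
import struct

THRESHOLD = 24.0  # hypothetical detection threshold, for illustration only


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def is_reportable(peak: float, threshold: float = THRESHOLD) -> bool:
    """A signal is counted only if its peak is strictly above the threshold."""
    return peak > threshold


peak_double = 24.0000001               # made-up peak, held in double precision
peak_single = to_float32(peak_double)  # the same peak after single-precision rounding

print(is_reportable(peak_double))  # True: just above the threshold
print(is_reportable(peak_single))  # False: rounding landed exactly on the threshold
```

The same underlying signal is counted by one hypothetical app and dropped by the other, so the reported counts differ by one even though neither app did anything wrong.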

Also, IIRC, resends go to the end of the queue, so they wait until after all the current batch of new WUs has been sent out. They built up a big batch of new WUs to help us get through the planned outage this weekend, and the resends are behind them.
____________
Donald
Infernal Optimist / Submariner, retired

Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24558
Credit: 33,901,422
RAC: 23,994
Germany
Message 1324976 - Posted: 5 Jan 2013, 17:05:16 UTC
Last modified: 5 Jan 2013, 17:06:02 UTC

Definitely not your fault, Keith.
I noticed this a while back.

I will run a few of those offline to take a closer look.

Josef W. Segur (Project donor)
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4306
Credit: 1,075,680
RAC: 1,297
United States
Message 1325031 - Posted: 5 Jan 2013, 19:13:39 UTC - in response to Message 1324961.

...
There is a mechanism to determine how many signals match and how many do not. IIRC, it is 50% or more to validate, less than 50% you get an "inconclusive" and a third look.
...

100% is required for an immediate "strongly similar" validation; anything less goes inconclusive. Credit (and the accompanying "valid" status) is eventually granted to all results which match at least 50% (weakly similar).

Classic S@h kept track of the "best so far" signal of each type so there would be something to show on the graphics, and that of course carried over into seti_boinc. In addition those "best" signals are sent in the result uploaded to the BOINC servers unless there's an overflow, to give the Validator something to compare even if there are no reportable signals. So in addition to the counts of reportable signals, validation includes up to four additional signal comparisons.
Joe
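The rule Joe describes can be written down as a short sketch. This is an illustration only, not the project's validator: the function names, the "not similar" label, and the idea of counting matches as a simple ratio of compared signals are all assumptions made for the example.

```python
def similarity(matched: int, compared: int) -> str:
    """Classify agreement between two results from the number of matching signals.

    `compared` includes reportable signals plus any "best" signals that were sent.
    """
    if compared <= 0:
        raise ValueError("at least one signal must be compared")
    if not 0 <= matched <= compared:
        raise ValueError("matched must be between 0 and compared")
    if matched == compared:
        return "strongly similar"   # 100% match
    if 2 * matched >= compared:
        return "weakly similar"     # at least 50% match
    return "not similar"            # hypothetical label for everything else


def initial_status(matched: int, compared: int) -> str:
    """Only a 100% match validates immediately; anything less waits for a third result."""
    return "valid" if similarity(matched, compared) == "strongly similar" else "inconclusive"


def gets_credit_eventually(matched: int, compared: int) -> bool:
    """Results matching at least 50% are eventually granted credit."""
    return similarity(matched, compared) != "not similar"


# Zero reportable signals, four "best" signals compared, three of them match:
print(initial_status(3, 4))          # inconclusive
print(gets_credit_eventually(3, 4))  # True
```

Under these assumptions, two results that both report zero signals can still go inconclusive, because the "best" signals are compared as well.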

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2292
Credit: 8,816,987
RAC: 4,041
United States
Message 1325189 - Posted: 6 Jan 2013, 8:05:02 UTC

I have noticed in the past few weeks of being AP-only that I'm starting to see a lot of inconclusive results on the first go-around. Almost always, it is an open_cl GPU app that I get paired with, sometimes stock, sometimes optimized. At this moment, I have three of them, and they're all against stock.

In these past few weeks, all of them have granted me credit, as well as the other two parties. I haven't seen one yet that left someone with nothing, so that must be evidence of the precision and threshold differences between CPU and GPU. They're all above 50% similar, but obviously weren't 100% for the first wingmate.

It is interesting, though, because other APs will get paired up with stock open_cl and validate just fine. It used to be maybe 1 out of 100 that would end up inconclusive, but now it seems to be closer to 1 in 20.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24558
Credit: 33,901,422
RAC: 23,994
Germany
Message 1325201 - Posted: 6 Jan 2013, 9:35:00 UTC
Last modified: 6 Jan 2013, 9:53:07 UTC

Don't forget the differences in hardware.
Drivers also differ in precision.
All GPU apps are tested to a 99.x% validation rate.

Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 24558
Credit: 33,901,422
RAC: 23,994
Germany
Message 1325208 - Posted: 6 Jan 2013, 10:44:32 UTC

Here is a very good example.

http://setiathome.berkeley.edu/workunit.php?wuid=1145162083

Both are running an AMD CPU with the same app.
But my CPU found one more pulse.


Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,898,665
RAC: 2,449
United States
Message 1325298 - Posted: 6 Jan 2013, 14:39:10 UTC - in response to Message 1325208.
Last modified: 6 Jan 2013, 14:39:35 UTC

And that I can understand. However, I've found another case where my system and my wingman's both reported zero spikes, pulses, triplets, and Gaussians, and it is still considered inconclusive. Did we somehow not find the same degree of nothingness?

Anyway, as I stated before, the buildup of inconclusive units is because the tie-breaker hasn't been assigned for some reason over the last week or so. I imagine that in the normal course of events (especially given the 100-unit queue limits) the tie-breaker will return in a day or three and the work unit will move to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping.

I'm more interested in seeing the tie-breaker being assigned again than in knowing the reason why one detection of nothing isn't the same as another.
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Claggy (Project donor)
Volunteer tester
Joined: 5 Jul 99
Posts: 4141
Credit: 33,641,963
RAC: 27,818
United Kingdom
Message 1325324 - Posted: 6 Jan 2013, 15:59:06 UTC - in response to Message 1325298.
Last modified: 6 Jan 2013, 15:59:25 UTC

And that I can understand. However, I've found another case where my system and my wingman's both reported zero spikes, pulses, triplets, and Gaussians, and it is still considered inconclusive. Did we somehow not find the same degree of nothingness?

It didn't find nothing; it found zero REPORTED signals. There were signals found, they just weren't good enough to be reported:

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 1.402081

Best spike: peak=23.39342, time=10.07, d_freq=1420177277.75, chirp=9.8563, fft_len=64k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+011, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=4.749264, time=21.05, period=0.5931, d_freq=1420183958.7, score=0.9256, chirp=-46.445, fft_len=64
Best triplet: peak=0, time=-2.122e+011, period=0, d_freq=0, chirp=0, fft_len=0


Flopcounter: 8852515626531.082000

Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
Wallclock time elapsed since last restart: 870.4 seconds



Claggy
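To compare the "Best ..." lines of two such reports side by side, a small parser along these lines may help. It is a sketch that assumes the line layout shown in the output above; the function names are made up for the example, and the wingman's peak value used at the end is hypothetical. The real validator compares signals within tolerances rather than comparing text, so this only shows where two summaries differ.

```python
import re

BEST_LINE = re.compile(r"^Best (\w+):\s*(.*)$")


def parse_best_signals(stderr_text: str) -> dict:
    """Collect the key=value fields of every 'Best <type>:' line, keyed by signal type."""
    best = {}
    for line in stderr_text.splitlines():
        match = BEST_LINE.match(line.strip())
        if not match:
            continue  # skip counts, timings, and anything else
        kind, fields = match.groups()
        values = {}
        for item in fields.split(","):
            key, _, value = item.strip().partition("=")
            values[key] = value  # kept as text, since values like "64k" are not numbers
        best[kind] = values
    return best


def differing_fields(a: dict, b: dict) -> dict:
    """List, per signal type, the fields whose text differs between two parsed reports."""
    diffs = {}
    for kind in sorted(set(a) | set(b)):
        fa, fb = a.get(kind, {}), b.get(kind, {})
        changed = sorted(k for k in set(fa) | set(fb) if fa.get(k) != fb.get(k))
        if changed:
            diffs[kind] = changed
    return diffs


sample = """\
Best spike: peak=23.39342, time=10.07, d_freq=1420177277.75, chirp=9.8563, fft_len=64k
Best pulse: peak=4.749264, time=21.05, period=0.5931, d_freq=1420183958.7, score=0.9256, chirp=-46.445, fft_len=64
Spike count: 0
"""

mine = parse_best_signals(sample)
# Hypothetical wingman report: identical except for the last digit of the spike peak.
wingman = parse_best_signals(sample.replace("23.39342", "23.39341"))

print(mine["spike"]["peak"])            # 23.39342
print(differing_fields(mine, wingman))  # {'spike': ['peak']}
```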

Donald L. Johnson (Project donor)
Joined: 5 Aug 02
Posts: 6264
Credit: 740,403
RAC: 1,168
United States
Message 1325328 - Posted: 6 Jan 2013, 16:21:22 UTC - in response to Message 1325298.

Anyway, as I stated before, the buildup of inconclusive units is because the tie-breaker hasn't been assigned for some reason over the last week or so. I imagine that in the normal course of events (especially given the 100-unit queue limits) the tie-breaker will return in a day or three and the work unit will move to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping.

I'm more interested in seeing the tie-breaker being assigned again than in knowing the reason why one detection of nothing isn't the same as another.

As I mentioned earlier, resends go to the end of the queue, behind all current new WUs. They built up a batch of over 1 million new WUs to issue ahead of the expected outage on Friday, and those had to be assigned before our resends could go out. I had one that errored on me on 2 Jan; the resend didn't go out until the 6th (today).

As some of us used to say a lot more often than we do now, around here patience is not just a virtue, it is a requirement (8{)
____________
Donald
Infernal Optimist / Submariner, retired

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 8634
Credit: 51,641,494
RAC: 49,018
United Kingdom
Message 1325330 - Posted: 6 Jan 2013, 16:35:53 UTC - in response to Message 1325328.

Anyway, as I stated before, the buildup of inconclusive units is because the tie-breaker hasn't been assigned for some reason over the last week or so. I imagine that in the normal course of events (especially given the 100-unit queue limits) the tie-breaker will return in a day or three and the work unit will move to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping.

I'm more interested in seeing the tie-breaker being assigned again than in knowing the reason why one detection of nothing isn't the same as another.

As I mentioned earlier, resends go to the end of the queue, behind all current new WUs. They built up a batch of over 1 million new WUs to issue ahead of the expected outage on Friday, and those had to be assigned before our resends could go out. I had one that errored on me on 2 Jan; the resend didn't go out until the 6th (today).

As some of us used to say a lot more often than we do now, around here patience is not just a virtue, it is a requirement (8{)

I suspect it's even more drastic than that, and I don't think it was a deliberate attempt to stock up in advance of the outage.

We had a server status page lockup at 02:50 UTC on 01 January, and at that moment the number of tasks shown 'ready to send' was below the high water mark that inhibits the splitters. I suspect they never stopped splitting until normal server status display was re-established some 36 hours later, by which time nearly five million tasks were available:



We've been chewing them ever since, and I doubt a single new task has been split from tape in all that time.

Some time fairly soon, we're going to reach the tail end, consisting exclusively of replacement tasks for all the timeouts/errors/aborts/abandons/inconclusives from last week, probably half a million of them. That should cut down the pendings a bit.
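Why a tie-breaker can sit unassigned for days follows directly from first-in, first-out ordering. The sketch below is a toy model under that assumption: the function name and the tiny backlog are invented for the example, and the real scheduler is certainly more involved.

```python
from collections import deque


def sends_before_resend(backlog: int) -> int:
    """Count tasks leaving a FIFO queue up to and including one resend
    that was appended behind `backlog` new tasks."""
    queue = deque(("new", i) for i in range(backlog))
    queue.append(("resend", 0))  # the tie-breaker joins at the tail
    sent = 0
    while queue:
        kind, _ = queue.popleft()  # the oldest task always goes out first
        sent += 1
        if kind == "resend":
            break
    return sent


print(sends_before_resend(5))  # 6: five new tasks, then the resend
```

Scale the backlog from five tasks to several million and the resend waits for the whole batch to drain, however quickly each individual task is handed out.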

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 8634
Credit: 51,641,494
RAC: 49,018
United Kingdom
Message 1325365 - Posted: 6 Jan 2013, 20:03:51 UTC

Looks like we dug down into the resend stratum about half an hour ago.

Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,898,665
RAC: 2,449
United States
Message 1325563 - Posted: 7 Jan 2013, 18:55:33 UTC - in response to Message 1325365.
Last modified: 7 Jan 2013, 18:57:44 UTC

And yes, as sure as Galoka didn't think that the Ulus were too ugly to save, my Validation Inconclusive stack has been steadily decreasing since last night now that all the tie-breaker units have been assigned to wingmen. That'll also mean my RAC will bounce back up to where it should historically be.

It still bothers me that the optimized ATI/AMD code has a tendency to under-detect pulses relative to the stock CPU code. It makes me feel like I'm not doing proper science.
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

.clair.
Volunteer moderator
Joined: 4 Nov 04
Posts: 1300
Credit: 23,063,759
RAC: 454
United Kingdom
Message 1325662 - Posted: 8 Jan 2013, 2:05:39 UTC - in response to Message 1325563.

It still bothers me that the optimized ATI/AMD code has a tendency to under-detect pulses relative to the stock CPU code.

That is odd,
I was thinking my ATI card was finding more than stock in my Validation Inconclusive stack.
Ho hum, that's life :¬)


Copyright © 2014 University of California