Validation Inconclusive, is it me or is this becoming more common?

Message boards : Number crunching : Validation Inconclusive, is it me or is this becoming more common?
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1324928 - Posted: 5 Jan 2013, 14:59:22 UTC
Last modified: 5 Jan 2013, 15:08:29 UTC

Since they broke that out for filtering on the results page, I've noticed that it's been growing as more and more of my GPU units wait for a second opinion. I agree that outright mismatches in the number of pulses/spikes/triplets/Gaussians detected warrant another look, but when both a CPU and a GPU unit report the same counts, even all 0s, why would the result be classified as inconclusive? It can't just be because of the flop count; there's no reason why two different apps, each optimized for its own platform, should have the same or even a reasonably close flop count.

When they first introduced the breakout, my inconclusive units were around 1/3 of my pending. Now the two are nearly the same, and the inconclusives outnumber my daily valids. Did someone tighten up the validation requirements recently?

Edit: I just noticed that it appears that the additional unit isn't being assigned and hasn't been for a while now. That may have to do with the huge pile of MB units generated post crash being handed out first and would also explain why the number of inconclusive is growing.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1324928 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1324933 - Posted: 5 Jan 2013, 15:11:28 UTC - in response to Message 1324928.  

Since they broke that out for filtering on the results page, I've noticed that it's been growing as more and more of my GPU units wait for a second opinion. I agree that outright mismatches in the number of pulses/spikes/triplets/Gaussians detected warrant another look, but when both a CPU and a GPU unit report the same counts, even all 0s, why would the result be classified as inconclusive? It can't just be because of the flop count; there's no reason why two different apps, each optimized for its own platform, should have the same or even a reasonably close flop count.

When they first introduced the breakout, my inconclusive units were around 1/3 of my pending. Now the two are nearly the same, and the inconclusives outnumber my daily valids. Did someone tighten up the validation requirements recently?

Edit: I just noticed that it appears that the additional unit isn't being assigned and hasn't been for a while now. That may have to do with the huge pile of MB units generated post crash being handed out first and would also explain why the number of inconclusive is growing.


I think this was always going on in the background, and it's only that they sorted it into a new stat just to keep us worried...LOL.

"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1324933 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22149
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1324942 - Posted: 5 Jan 2013, 15:34:16 UTC

I'm not sure about it being more common, but it certainly "plagues" some crunchers far worse than others.
There are certainly crunchers out there who are far more prone to producing inconclusive results than others. My uncalibrated eye suggests that the same crunchers have a higher-than-average error rate, and show far more tasks than one would reasonably expect with the currently imposed distribution limits.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1324942 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1324944 - Posted: 5 Jan 2013, 15:39:13 UTC - in response to Message 1324942.  
Last modified: 5 Jan 2013, 15:39:35 UTC

I'm not sure about it being more common, but it certainly "plagues" some crunchers far worse than others.
There are certainly crunchers out there who are far more prone to producing inconclusive results than others. My uncalibrated eye suggests that the same crunchers have a higher-than-average error rate, and show far more tasks than one would reasonably expect with the currently imposed distribution limits.

I watch the stats.......keep a window open with them and refresh it now and again.
I have caught a rig going south now and again, but VI is NOT an indication of a problem, unless something goes really wild.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1324944 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1324961 - Posted: 5 Jan 2013, 16:24:25 UTC
Last modified: 5 Jan 2013, 16:25:21 UTC

As I recall from previous discussions, the stderr output report that we see is only a summary, not the whole story. There are significant differences between CPU and GPU apps in the way they process the WU data, so even if the pulse/spike/etc. counts are the same, they may not BE the same pulses, spikes, etc. I have seen this many times when my stock CPU apps are paired with GPU or optimized apps.

There are also differences in precision and thresholds, so a particular signal may be above threshold on one app and just below it (and so not be counted) on another. There is a mechanism to determine how many signals match and how many do not. IIRC, it is 50% or more to validate, less than 50% you get an "inconclusive" and a third look.
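
To make the near-threshold case concrete, here is a minimal Python sketch. It is only an illustration: the threshold value, the helper names, and the idea of modelling one app as single precision and the other as double precision are all assumptions, not how the real apps are written.

```python
import struct

# Hypothetical reporting threshold, chosen only for illustration.
THRESHOLD = 24.0


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def is_reported(peak: float, threshold: float = THRESHOLD) -> bool:
    """A signal is counted only if its peak reaches the threshold."""
    return peak >= threshold


peak_double = 23.9999999               # just below the threshold in double precision
peak_single = to_float32(peak_double)  # rounds to exactly 24.0 in single precision

print(is_reported(peak_double))  # False: not counted
print(is_reported(peak_single))  # True: counted
```

The same underlying signal lands on opposite sides of the cut-off, so the two results end up with different counts even though neither app did anything wrong.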

Also, IIRC, resends go to the end of the queue, so they wait until after the whole current batch of new WUs has been sent out. They built up a big batch of new WUs to help us get through the planned outage this weekend, and the resends are behind them.
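
A rough sketch of that first-in, first-out behaviour, in Python. The `SendQueue` class and the task names are hypothetical; the real scheduler is certainly more involved, and this only shows why a resend added behind a big batch has to wait for the whole batch.

```python
from collections import deque


class SendQueue:
    """Toy FIFO work queue: tasks go out in the order they were added."""

    def __init__(self):
        self._tasks = deque()

    def add(self, task: str) -> None:
        # New work and resends alike join the tail of the queue.
        self._tasks.append(task)

    def send(self, n: int) -> list:
        # Hand out up to n tasks from the head of the queue.
        return [self._tasks.popleft() for _ in range(min(n, len(self._tasks)))]


queue = SendQueue()
for i in range(1, 6):
    queue.add(f"new-{i}")      # the big batch of freshly split work
queue.add("resend-A")          # the tie-breaker for an inconclusive result

first_day = queue.send(3)
second_day = queue.send(3)
print(first_day)   # ['new-1', 'new-2', 'new-3']
print(second_day)  # ['new-4', 'new-5', 'resend-A']
```

Scale the batch up to a million or more tasks and the resend can sit for days before anyone receives it.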
Donald
Infernal Optimist / Submariner, retired
ID: 1324961 · Report as offensive
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34249
Credit: 79,922,639
RAC: 80
Germany
Message 1324976 - Posted: 5 Jan 2013, 17:05:16 UTC
Last modified: 5 Jan 2013, 17:06:02 UTC

Definitely not your fault, Keith.
I noticed this a while back.

Will run a few of those offline to check more closely.


With each crime and every kindness we birth our future.
ID: 1324976 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1325031 - Posted: 5 Jan 2013, 19:13:39 UTC - in response to Message 1324961.  

...
There is a mechanism to determine how many signals match and how many do not. IIRC, it is 50% or more to validate, less than 50% you get an "inconclusive" and a third look.
...

100% is required for an immediate "strongly similar" validation; anything less goes inconclusive. Credit (and the accompanying "valid" status) is eventually granted to all results which match at least 50% (weakly similar).

Classic S@h kept track of the "best so far" signal of each type so there would be something to show on the graphics, and that of course carried over into seti_boinc. In addition those "best" signals are sent in the result uploaded to the BOINC servers unless there's an overflow, to give the Validator something to compare even if there are no reportable signals. So in addition to the counts of reportable signals, validation includes up to four additional signal comparisons.
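
The rule described above can be sketched in Python as follows. This is not the project's validator code: the `Signal` fields, the 1% tolerance, and the way the match fraction is computed are assumptions made purely for illustration. Only the 100% and 50% cut-offs come from the description above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    kind: str    # "spike", "gaussian", "pulse" or "triplet"
    peak: float
    freq: float


def signals_match(a: Signal, b: Signal, rel_tol: float = 0.01) -> bool:
    """Hypothetical comparison: same kind, peak and frequency within 1%."""
    return (a.kind == b.kind
            and math.isclose(a.peak, b.peak, rel_tol=rel_tol)
            and math.isclose(a.freq, b.freq, rel_tol=rel_tol))


def match_fraction(mine: list, theirs: list) -> float:
    """Fraction of signals that can be paired one-to-one between two results."""
    if not mine and not theirs:
        return 1.0
    unmatched = list(theirs)
    matched = 0
    for s in mine:
        for t in unmatched:
            if signals_match(s, t):
                unmatched.remove(t)
                matched += 1
                break
    return matched / max(len(mine), len(theirs))


def classify(fraction: float) -> str:
    if fraction == 1.0:
        return "strongly similar"   # validates immediately
    if fraction >= 0.5:
        return "weakly similar"     # inconclusive now, credit eventually
    return "not similar"


# Both results report zero signals, but each still uploads its "best" signals.
host_a = [Signal("spike", 23.39342, 1420177277.75),
          Signal("pulse", 4.749264, 1420183958.7)]
host_b = [Signal("spike", 23.39342, 1420177277.75),
          Signal("pulse", 4.2, 1420183958.7)]

print(classify(match_fraction(host_a, host_b)))  # weakly similar
```

In this made-up pairing the best pulses disagree, so the two hosts are only 50% similar and the workunit goes inconclusive even though every reported count is zero.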
Joe
ID: 1325031 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1325189 - Posted: 6 Jan 2013, 8:05:02 UTC

In the past few weeks of being AP-only, I've started to see a lot of inconclusive results on the first go-around. Almost always, it is an open_cl GPU app that I get paired with, sometimes stock, sometimes optimized. At this moment, I have three of them, and they're all against stock.

In these past few weeks, all of them have granted credit to me as well as to the other two parties. I haven't seen one yet that left someone with nothing, so that must be evidence of the precision and threshold differences between CPU and GPU. They were all above 50% similar, but obviously weren't 100% with the first wingmate.

It is interesting though, because other APs will get paired up with stock open_cl and validate just fine. It just used to be maybe 1 out of 100 that would end up being inconclusive, but now it seems to be closer to about 1 in 20.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1325189 · Report as offensive
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34249
Credit: 79,922,639
RAC: 80
Germany
Message 1325201 - Posted: 6 Jan 2013, 9:35:00 UTC
Last modified: 6 Jan 2013, 9:53:07 UTC

Don't forget differences in hardware.
Drivers also differ in precision.
All GPU apps are tested to a 99.x% validation rate.


With each crime and every kindness we birth our future.
ID: 1325201 · Report as offensive
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34249
Credit: 79,922,639
RAC: 80
Germany
Message 1325208 - Posted: 6 Jan 2013, 10:44:32 UTC

Here is a very good example.

http://setiathome.berkeley.edu/workunit.php?wuid=1145162083

Both hosts are running an AMD CPU with the same app,
but my CPU found one more pulse.



With each crime and every kindness we birth our future.
ID: 1325208 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1325298 - Posted: 6 Jan 2013, 14:39:10 UTC - in response to Message 1325208.  
Last modified: 6 Jan 2013, 14:39:35 UTC

And that I can understand. However, I've found another case where my system and my wingman's both reported zero spikes, pulses, triplets, and Gaussians found, and it is still considered inconclusive. Did we somehow not find the same degree of nothingness?

Anyway like I stated before, the build up of inconclusive units is because the tie-breaker isn't being assigned for some reason over the last week or so. I imagine in the normal course of events (especially due to the 100 unit queue limits) the tie-breaker will return in a day or three and the work unit moved to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping.

I'm more interested in seeing the tie-breaker being assigned again than in knowing the reason why one detection of nothing isn't the same as another.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1325298 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1325324 - Posted: 6 Jan 2013, 15:59:06 UTC - in response to Message 1325298.  
Last modified: 6 Jan 2013, 15:59:25 UTC

And that I can understand. However, I've found another case where my system and my wingman's both reported zero spikes, pulses, triplets, and Gaussians found, and it is still considered inconclusive. Did we somehow not find the same degree of nothingness?

It didn't find nothing; it found zero REPORTED signals. There were signals found, they just weren't good enough to be reported:

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 1.402081

Best spike: peak=23.39342, time=10.07, d_freq=1420177277.75, chirp=9.8563, fft_len=64k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+011, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=4.749264, time=21.05, period=0.5931, d_freq=1420183958.7, score=0.9256, chirp=-46.445, fft_len=64
Best triplet: peak=0, time=-2.122e+011, period=0, d_freq=0, chirp=0, fft_len=0


Flopcounter: 8852515626531.082000

Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
Wallclock time elapsed since last restart: 870.4 seconds
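
For anyone who wants to compare these summaries between wingmen without eyeballing them, here is a small Python sketch that pulls the counts and the "best" peaks out of text laid out like the listing above. The regular expressions and the function name `parse_summary` are my own assumptions based on that one listing; other app versions may format their output differently.

```python
import re

# Patterns written against the sample listing only (an assumption).
COUNT_RE = re.compile(r"^(Spike|Pulse|Triplet|Gaussian) count:\s*(\d+)\s*$", re.MULTILINE)
BEST_RE = re.compile(r"^Best (spike|gaussian|pulse|triplet): peak=([-+0-9.eE]+)", re.MULTILINE)

SAMPLE = """\
Best spike: peak=23.39342, time=10.07, d_freq=1420177277.75, chirp=9.8563, fft_len=64k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+011, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=4.749264, time=21.05, period=0.5931, d_freq=1420183958.7, score=0.9256, chirp=-46.445, fft_len=64
Best triplet: peak=0, time=-2.122e+011, period=0, d_freq=0, chirp=0, fft_len=0

Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
"""


def parse_summary(text: str):
    """Return (reported counts, best-signal peaks), each keyed by signal type."""
    counts = {kind.lower(): int(n) for kind, n in COUNT_RE.findall(text)}
    best_peaks = {kind: float(peak) for kind, peak in BEST_RE.findall(text)}
    return counts, best_peaks


counts, best_peaks = parse_summary(SAMPLE)
print(counts)      # all four counts are zero
print(best_peaks)  # but the best spike and best pulse have non-zero peaks
```

Running the same parser over both wingmen's output and diffing the two `best_peaks` dictionaries shows where two "all zero" results actually differ.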



Claggy
ID: 1325324 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1325328 - Posted: 6 Jan 2013, 16:21:22 UTC - in response to Message 1325298.  

Anyway like I stated before, the build up of inconclusive units is because the tie-breaker isn't being assigned for some reason over the last week or so. I imagine in the normal course of events (especially due to the 100 unit queue limits) the tie-breaker will return in a day or three and the work unit moved to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping.

I'm more interested in seeing the tie-breaker being assigned again than in knowing the reason why one detection of nothing isn't the same as another.

As I mentioned earlier, resends go to the end of the queue, behind all current new WUs. They built up a batch of over 1 million new WUs to issue ahead of the expected outage on Friday, and those had to be assigned before our resends could go out. I had one that errored on me on 2 Jan, and the resend didn't go out until the 6th (today).

As some of us used to say a lot more often than we do now, around here patience is not just a virtue, it is a requirement (8{)
Donald
Infernal Optimist / Submariner, retired
ID: 1325328 · Report as offensive
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1325330 - Posted: 6 Jan 2013, 16:35:53 UTC - in response to Message 1325328.  

Anyway like I stated before, the build up of inconclusive units is because the tie-breaker isn't being assigned for some reason over the last week or so. I imagine in the normal course of events (especially due to the 100 unit queue limits) the tie-breaker will return in a day or three and the work unit moved to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping.

I'm more interested in seeing the tie-breaker being assigned again than in knowing the reason why one detection of nothing isn't the same as another.

As I mentioned earlier, resends go to the end of the queue, behind all current new WUs. They built up a batch of over 1 million new WUs to issue ahead of the expected outage on Friday, and those had to be assigned before our resends could go out. I had one that errored on me on 2 Jan, and the resend didn't go out until the 6th (today).

As some of us used to say a lot more often than we do now, around here patience is not just a virtue, it is a requirement (8{)

I suspect it's even more drastic than that, and I don't think it was a deliberate attempt to stock up in advance of the outage.

We had a server status page lockup at 02:50 UTC on 01 January, and at that moment the number of tasks shown 'ready to send' was below the high water mark that inhibits the splitters. I suspect they never stopped splitting until normal server status display was re-established some 36 hours later, by which time nearly five million tasks were available.

We've been chewing them ever since, and I doubt a single new task has been split from tape in all that time.
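
A toy Python simulation of the suspected mechanism, for illustration only. It assumes the splitters decide whether to run by looking at a 'ready to send' figure that can go stale; every number and the function name `ready_after` are invented, and nothing here is taken from the actual server code.

```python
def ready_after(hours, freeze_start=0, freeze_end=0, ready=90,
                high_water=100, split_per_hour=50, sent_per_hour=10):
    """Tasks 'ready to send' after a number of hours.

    Between freeze_start and freeze_end the splitter keeps seeing the last
    figure it read before the status display locked up (an assumption).
    """
    seen = ready
    for hour in range(hours):
        if not (freeze_start <= hour < freeze_end):
            seen = ready                 # status is live: refresh the figure
        if seen < high_water:
            ready += split_per_hour      # below the high water mark: keep splitting
        ready = max(0, ready - sent_per_hour)
    return ready


print(ready_after(36))                                  # 130: hovers near the mark
print(ready_after(36, freeze_start=0, freeze_end=36))   # 1530: never stops splitting
```

With a live figure the backlog stays close to the high water mark; with a figure frozen just below it, the pile keeps growing for as long as the freeze lasts.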

Some time fairly soon, we're going to reach the tail end, composed exclusively of replacement tasks for all of last week's timeouts/errors/aborts/abandons/inconclusives, probably half a million of them. That should cut down the pendings a bit.
ID: 1325330 · Report as offensive
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1325365 - Posted: 6 Jan 2013, 20:03:51 UTC

Looks like we dug down into the resend stratum about half an hour ago.
ID: 1325365 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1325563 - Posted: 7 Jan 2013, 18:55:33 UTC - in response to Message 1325365.  
Last modified: 7 Jan 2013, 18:57:44 UTC

And yes, as sure as Galoka didn't think that the Ulus were too ugly to save, my Validation Inconclusive stack has been steadily decreasing since last night now that all the tie-breaker units have been assigned to wingmen. That'll also mean my RAC will bounce back up to where it should historically be.

It still bothers me that the optimized ATI/AMD code has a tendency to under-detect pulses relative to the stock CPU code. It makes me feel like I'm not doing proper science.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1325563 · Report as offensive
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1325662 - Posted: 8 Jan 2013, 2:05:39 UTC - in response to Message 1325563.  

It still bothers me that the optimized ATI/AMD code has a tendency to under-detect pulses relative to the stock CPU code.

That iz odd,
i was thinking my ATI card waz finding more than stock in my Validation Inconclusive stack,
Ho hum, thats life :¬)
ID: 1325662 · Report as offensive

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.