Validation Inconclusive, is it me or is this becoming more common?

Author	Message
Keith White Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22	Message 1324928 - Posted: 5 Jan 2013, 14:59:22 UTC Last modified: 5 Jan 2013, 15:08:29 UTC Since they've broken that out for filtering on the results page, I've noticed that it's been growing as more and more of my GPU units are now waiting for a 2nd opinion. Now I agree that outright mismatches in the number of pulses/spikes/triplets/Gaussians detected warrant another look, but when both a CPU and a GPU unit report the same counts, even all 0s, why would it be classified as inconclusive? It can't just be because of the flop count; there's no reason why two different apps, optimized for their specific platforms, should have the same or a reasonably close flop count. When they first introduced the breakout, my inconclusive units were around 1/3 of my pending. Now the two are nearly the same, and the inconclusives outnumber my daily valids. Did someone tighten up the validation requirements recently? Edit: I just noticed that it appears the additional unit isn't being assigned and hasn't been for a while now. That may have to do with the huge pile of MB units generated post-crash being handed out first, and would also explain why the number of inconclusives is growing. "Life is just nature's way of keeping meat fresh." - The Doctor ID: 1324928 ·

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1324933 - Posted: 5 Jan 2013, 15:11:28 UTC - in response to Message 1324928. Since they've broken that out for filtering on the results page, I've noticed that it's been growing as more and more of my GPU units are now waiting for a 2nd opinion. Now I agree that outright mismatches in the number of pulses/spikes/triplets/Gaussians detected warrant another look, but when both a CPU and a GPU unit report the same counts, even all 0s, why would it be classified as inconclusive? It can't just be because of the flop count; there's no reason why two different apps, optimized for their specific platforms, should have the same or a reasonably close flop count. When they first introduced the breakout, my inconclusive units were around 1/3 of my pending. Now the two are nearly the same, and the inconclusives outnumber my daily valids. Did someone tighten up the validation requirements recently? Edit: I just noticed that it appears the additional unit isn't being assigned and hasn't been for a while now. That may have to do with the huge pile of MB units generated post-crash being handed out first, and would also explain why the number of inconclusives is growing. I think this was always going on in the background, and it's only that they sorted it into a new stat just to keep us worried...LOL. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1324933 ·

rob smith Volunteer moderator Volunteer tester Joined: 7 Mar 03 Posts: 22188 Credit: 416,307,556 RAC: 380	Message 1324942 - Posted: 5 Jan 2013, 15:34:16 UTC I'm not sure about being more common, it certainly "plagues" some crunchers far worse than others. There are certainly crunchers out there that are far worse at producing an inconclusive result than others. My uncalibrated eye suggests that the same crunchers have a higher than average error rate, and show far more tasks than one would reasonably expect with the currently imposed distribution limits. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? ID: 1324942 ·

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1324944 - Posted: 5 Jan 2013, 15:39:13 UTC - in response to Message 1324942. Last modified: 5 Jan 2013, 15:39:35 UTC I'm not sure about being more common, it certainly "plagues" some crunchers far worse than others. There are certainly crunchers out there that are far worse at producing an inconclusive result than others. My uncalibrated eye suggests that the same crunchers have a higher than average error rate, and show far more tasks than one would reasonably expect with the currently imposed distribution limits. I watch the stats.......keep a window open with them and refresh it now and again. I have caught a rig going south now and again, but VI is NOT an indication of a problem, unless something goes really wild. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1324944 ·

Donald L. Johnson Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20	Message 1324961 - Posted: 5 Jan 2013, 16:24:25 UTC Last modified: 5 Jan 2013, 16:25:21 UTC As I recall from previous discussions, the std.err.out report that we see is only a summary, not the whole story. There are significant differences between CPU and GPU apps in the way they process the WU data, so even if the pulse/spike/etc. counts are the same, they may not BE the same pulses, spikes, etc. I have seen this many times when my stock CPU apps are paired with GPU or optimized apps. There are also differences in precision and thresholds, so a particular signal may be above threshold on one app and just below it (and so not be counted) on another. There is a mechanism to determine how many signals match and how many do not. IIRC, it is 50% or more to validate; with less than 50% you get an "inconclusive" and a third look. Also, IIRC, resends go to the end of the queue, so they wait until after all the current batch of new WUs gets sent out. They built up a big batch of new WUs to help us get through the planned outage this weekend; the resends are behind them. Donald Infernal Optimist / Submariner, retired ID: 1324961 ·
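To make the precision point concrete, here is a minimal Python sketch of how the same borderline value can land on either side of a cutoff depending on whether it is held in double or single precision. The threshold value, the power value, and the function names are made up for illustration; none of them come from the SETI@home applications.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

THRESHOLD = 24.0  # hypothetical reporting threshold, not a real app setting

def reported(power, threshold=THRESHOLD):
    """A signal is only counted if it is strictly above the threshold."""
    return power > threshold

# The same hypothetical signal as two apps might hold it internally.
power_double = 24.0000001                # kept in double precision
power_single = to_float32(power_double)  # rounded to single precision

print("double:", power_double, "reported:", reported(power_double))
print("single:", power_single, "reported:", reported(power_single))
```

Near 24 the spacing between adjacent single-precision values is about 1.9e-6, so the tiny excess over the threshold is rounded away and the single-precision copy is no longer counted.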

Mike Volunteer tester Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80	Message 1324976 - Posted: 5 Jan 2013, 17:05:16 UTC Last modified: 5 Jan 2013, 17:06:02 UTC Definitely not your fault Keith. I noticed this a while back. Will run a few of those offline to check it closer. With each crime and every kindness we birth our future. ID: 1324976 ·

Josef W. Segur Volunteer developer Volunteer tester Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 1325031 - Posted: 5 Jan 2013, 19:13:39 UTC - in response to Message 1324961. ... There is a mechanism to determine how many signals match and how many do not. IIRC, it is 50% or more to validate, less than 50% you get an "inconclusive" and a third look. ... 100% is required for an immediate "strongly similar" validation, any less goes inconclusive. Credit (and the accompanying "valid" status) is granted eventually for all which match at least 50% (weakly similar). Classic S@h kept track of the "best so far" signal of each type so there would be something to show on the graphics, and that of course carried over into seti_boinc. In addition those "best" signals are sent in the result uploaded to the BOINC servers unless there's an overflow, to give the Validator something to compare even if there are no reportable signals. So in addition to the counts of reportable signals, validation includes up to four additional signal comparisons. Joe ID: 1325031 ·
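The two-tier rule Joe describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signal representation, the 1% tolerance, and every function name here are hypothetical, and the real Validator compares more fields than a single peak value. The first result reuses the "best spike" and "best pulse" peaks quoted later in this thread; the second result is invented.

```python
import math

def signals_match(a, b, rel_tol=0.01):
    """Hypothetical test: same signal type and peak power within 1%."""
    return a["type"] == b["type"] and math.isclose(a["peak"], b["peak"], rel_tol=rel_tol)

def match_fraction(result_a, result_b):
    """Fraction of signals that pair up between two results."""
    if not result_a and not result_b:
        return 1.0
    unmatched = list(result_b)
    matched = 0
    for sig in result_a:
        for other in unmatched:
            if signals_match(sig, other):
                unmatched.remove(other)
                matched += 1
                break
    return matched / max(len(result_a), len(result_b))

def first_pass(result_a, result_b):
    """Only a 100% match ("strongly similar") validates immediately."""
    return "valid" if match_fraction(result_a, result_b) == 1.0 else "inconclusive"

def earns_credit(result, canonical):
    """A result matching at least 50% ("weakly similar") still gets credit."""
    return match_fraction(result, canonical) >= 0.5

# Zero reportable signals, so only the "best" signals are compared.
cpu_result = [{"type": "spike", "peak": 23.39342}, {"type": "pulse", "peak": 4.749264}]
gpu_result = [{"type": "spike", "peak": 23.39341}, {"type": "pulse", "peak": 4.9}]  # invented

print(first_pass(cpu_result, gpu_result))    # inconclusive
print(earns_credit(gpu_result, cpu_result))  # True
```

In this toy case the spikes agree and the pulses do not, so the pair is 50% similar: not enough to validate on the first pass, but enough for both to be credited once a third result settles which one is canonical.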

Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1325189 - Posted: 6 Jan 2013, 8:05:02 UTC I have noticed in the past few weeks of being AP-only that I'm starting to see a lot of inconclusive results on the first go-around. Almost always, it is an OpenCL GPU app that I get paired with, sometimes stock, sometimes optimized. At this moment, I have three of them, and they're all against stock. In these past few weeks, all of them have granted me credit, as well as the other two parties. I haven't seen one yet that left someone with nothing, so that must be evidence of the precision thresholds between CPU and GPU. They're all above 50% similar, but obviously weren't 100% for the first wingmate. It is interesting though, because other APs will get paired up with stock OpenCL and validate just fine. It used to be maybe 1 out of 100 that would end up being inconclusive, but now it seems to be closer to about 1 in 20. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1325189 ·

Mike Volunteer tester Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80	Message 1325201 - Posted: 6 Jan 2013, 9:35:00 UTC Last modified: 6 Jan 2013, 9:53:07 UTC Don't forget differences in hardware. Drivers also differ in precision. All GPU apps are tested to 99.x% validity. With each crime and every kindness we birth our future. ID: 1325201 ·

Mike Volunteer tester Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80	Message 1325208 - Posted: 6 Jan 2013, 10:44:32 UTC Here is a very good example. http://setiathome.berkeley.edu/workunit.php?wuid=1145162083 Both are running an AMD CPU with the same app, but my CPU found 1 more pulse. With each crime and every kindness we birth our future. ID: 1325208 ·

Keith White Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22	Message 1325298 - Posted: 6 Jan 2013, 14:39:10 UTC - in response to Message 1325208. Last modified: 6 Jan 2013, 14:39:35 UTC And that I can understand. However, I've found another case where my system and my wingman's both reported zero spikes, pulses, triplets, and Gaussians found, and it is still considered inconclusive. Did we somehow not find the same degree of nothingness? Anyway, like I stated before, the build-up of inconclusive units is because the tie-breaker isn't being assigned for some reason over the last week or so. I imagine in the normal course of events (especially due to the 100 unit queue limits) the tie-breaker will return in a day or three and the work unit moved to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping. I'm more interested in seeing the tie-breaker being assigned again than knowing the reason why one detection of nothing isn't the same as another. "Life is just nature's way of keeping meat fresh." - The Doctor ID: 1325298 ·

Claggy Volunteer tester Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1325324 - Posted: 6 Jan 2013, 15:59:06 UTC - in response to Message 1325298. Last modified: 6 Jan 2013, 15:59:25 UTC And that I can understand. However, I've found another case where my system and my wingman's both reported zero spikes, pulses, triplets, and Gaussians found, and it is still considered inconclusive. Did we somehow not find the same degree of nothingness? It didn't find nothing; it found zero REPORTED signals. There were signals found, they just aren't good enough to be reported:
Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 1.402081
Best spike: peak=23.39342, time=10.07, d_freq=1420177277.75, chirp=9.8563, fft_len=64k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+011, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=4.749264, time=21.05, period=0.5931, d_freq=1420183958.7, score=0.9256, chirp=-46.445, fft_len=64
Best triplet: peak=0, time=-2.122e+011, period=0, d_freq=0, chirp=0, fft_len=0
Flopcounter: 8852515626531.082000
Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
Wallclock time elapsed since last restart: 870.4 seconds
Claggy ID: 1325324 ·
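For anyone who wants to check their own tasks the same way, here is a small Python sketch that pulls the counts and the "best" peaks out of a summary like the one Claggy posted. The regular expressions and the function name are my own assumptions based only on the lines quoted above; real stderr output may contain other lines or formats that this does not handle.

```python
import re

# Summary lines copied from the stderr output quoted in this thread.
STDERR_SUMMARY = """\
Best spike: peak=23.39342, time=10.07, d_freq=1420177277.75, chirp=9.8563, fft_len=64k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+011, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=4.749264, time=21.05, period=0.5931, d_freq=1420183958.7, score=0.9256, chirp=-46.445, fft_len=64
Best triplet: peak=0, time=-2.122e+011, period=0, d_freq=0, chirp=0, fft_len=0
Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
"""

def parse_summary(text):
    """Return (reported counts, best peak per signal type) from a summary."""
    counts = {
        m.group(1).lower(): int(m.group(2))
        for m in re.finditer(r"^(\w+) count:\s*(\d+)", text, re.MULTILINE)
    }
    best_peaks = {
        m.group(1): float(m.group(2))
        for m in re.finditer(r"^Best (\w+): peak=([^,]+)", text, re.MULTILINE)
    }
    return counts, best_peaks

counts, best_peaks = parse_summary(STDERR_SUMMARY)
reported_total = sum(counts.values())
nonzero_best = [kind for kind, peak in best_peaks.items() if peak > 0]

print("reported signals:", reported_total)
print("best signals still available to compare:", nonzero_best)
```

Even with every count at zero, the best spike and best pulse are non-empty, so two results can still disagree on them.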

Donald L. Johnson Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20	Message 1325328 - Posted: 6 Jan 2013, 16:21:22 UTC - in response to Message 1325298. Anyway, like I stated before, the build-up of inconclusive units is because the tie-breaker isn't being assigned for some reason over the last week or so. I imagine in the normal course of events (especially due to the 100 unit queue limits) the tie-breaker will return in a day or three and the work unit moved to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping. I'm more interested in seeing the tie-breaker being assigned again than knowing the reason why one detection of nothing isn't the same as another. As I mentioned earlier, resends go to the end of the queue, behind all current new WUs. They built up a batch of over 1 million new WUs to issue ahead of the expected outage Friday; those had to be assigned before our resends could go out. I had one that errored on me on 2 Jan, and the resend didn't go out until the 6th (today). As some of us used to say a lot more often than we do now, around here patience is not just a virtue, it is a requirement (8{) Donald Infernal Optimist / Submariner, retired ID: 1325328 ·

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1325330 - Posted: 6 Jan 2013, 16:35:53 UTC - in response to Message 1325328. Anyway, like I stated before, the build-up of inconclusive units is because the tie-breaker isn't being assigned for some reason over the last week or so. I imagine in the normal course of events (especially due to the 100 unit queue limits) the tie-breaker will return in a day or three and the work unit moved to validated, keeping the inconclusive list from growing longer while also keeping my RAC from slipping. I'm more interested in seeing the tie-breaker being assigned again than knowing the reason why one detection of nothing isn't the same as another. As I mentioned earlier, resends go to the end of the queue, behind all current new WUs. They built up a batch of over 1 million new WUs to issue ahead of the expected outage Friday; those had to be assigned before our resends could go out. I had one that errored on me on 2 Jan, and the resend didn't go out until the 6th (today). As some of us used to say a lot more often than we do now, around here patience is not just a virtue, it is a requirement (8{) I suspect it's even more drastic than that, and I don't think it was a deliberate attempt to stock up in advance of the outage. We had a server status page lockup at 02:50 UTC on 01 January, and at that moment the number of tasks shown 'ready to send' was below the high water mark that inhibits the splitters. I suspect they never stopped splitting until normal server status display was re-established some 36 hours later, by which time nearly five million tasks were available. We've been chewing them ever since, and I doubt a single new task has been split from tape in all that time.
Some time fairly soon, we're going to reach the tail-end, comprised exclusively of replacement tasks for all the timeouts/errors/aborts/abandons/inconclusives for last week - probably half a million of them. That should cut down the pendings a bit. ID: 1325330 ·
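A toy Python model of the behaviour described in this thread: new work is added only while the queue is below a high water mark, and resends join the back of the line. The class, its method names, and the tiny numbers are all invented for illustration; this is not the actual BOINC scheduler or splitter code.

```python
from collections import deque

class ReadyToSendQueue:
    """Toy first-in, first-out model of the 'ready to send' task queue."""

    def __init__(self, high_water_mark):
        self.high_water_mark = high_water_mark
        self.tasks = deque()
        self._next_id = 0

    def split_new_work(self, count):
        """Add up to `count` new tasks; splitting stops at the high water mark."""
        added = 0
        while added < count and len(self.tasks) < self.high_water_mark:
            self.tasks.append(f"new-{self._next_id}")
            self._next_id += 1
            added += 1
        return added

    def queue_resend(self, name):
        """Resends join the back of the queue, behind all waiting new work."""
        self.tasks.append(name)

    def send_next(self):
        """Hand out the oldest waiting task."""
        return self.tasks.popleft()

queue = ReadyToSendQueue(high_water_mark=5)
added = queue.split_new_work(8)    # only 5 fit under the mark
queue.queue_resend("tie-breaker")
sent = [queue.send_next() for _ in range(6)]
print(added, sent)
```

The tie-breaker only goes out after every new task queued ahead of it, which is why a very large backlog of new work delays all the resends at once.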

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1325365 - Posted: 6 Jan 2013, 20:03:51 UTC Looks like we dug down into the resend stratum about half an hour ago. ID: 1325365 ·

Keith White Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22	Message 1325563 - Posted: 7 Jan 2013, 18:55:33 UTC - in response to Message 1325365. Last modified: 7 Jan 2013, 18:57:44 UTC And yes, as sure as Galoka didn't think that the Ulus were too ugly to save, my Validation Inconclusive stack has been steadily decreasing since last night now that all the tie-breaker units have been assigned to wingmen. That'll also mean my RAC will bounce back up to where it should historically be. It still bothers me that the optimized ATI/AMD code has a tendency to under-detect pulses relative to the stock CPU code. It makes me feel like I'm not doing proper science. "Life is just nature's way of keeping meat fresh." - The Doctor ID: 1325563 ·

.clair. Joined: 4 Nov 04 Posts: 1300 Credit: 55,390,408 RAC: 69	Message 1325662 - Posted: 8 Jan 2013, 2:05:39 UTC - in response to Message 1325563. It still bothers me that the optimized ATI/AMD code has a tendency to under-detect pulses relative to the stock CPU code. That iz odd, i was thinking my ATI card waz finding more than stock in my Validation Inconclusive stack, Ho hum, thats life :¬) ID: 1325662 ·
