Error after match?

Message boards : Number crunching : Error after match?

Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 2025700 - Posted: 31 Dec 2019, 14:26:09 UTC
Last modified: 31 Dec 2019, 15:12:55 UTC

This just looks strange to me. Two tasks for this work unit had validated, then the WU was sent to me, where the result was deemed invalid.
The question is: why was the WU sent out again? And could this be some sort of bad validation?
https://setiathome.berkeley.edu/workunit.php?wuid=3808043165

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 2025700
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3777
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2025704 - Posted: 31 Dec 2019, 14:42:33 UTC - in response to Message 2025700.  

Please see this thread. The quorum on -9 overflows is now 3 (possibly even 4) due to all of the AMD RX series cards cross-validating incorrect -9 results.
ID: 2025704
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 2025706 - Posted: 31 Dec 2019, 15:05:49 UTC - in response to Message 2025704.  
Last modified: 31 Dec 2019, 15:14:46 UTC

Yes, but that thread is discussing AMD GPUs. My example is between three NVIDIA GPUs?
And the three WUs did not validate.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 2025706
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14656
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2025707 - Posted: 31 Dec 2019, 15:26:22 UTC

OK, that looks like a problem.

1) A third wingmate for checking purposes after overflow is what Eric introduced because of the AMD problem. But I think it applies to all WUs, not just those allocated to AMD cards.
2) The first two NVidia cards are both running special sauce - I think it's been acknowledged that this has problems on overflows. But they were so far out of tolerance that your CPU (not NVidia - only two of those) wasn't even weakly similar? That's bad. You found a pulse that the others missed.
ID: 2025707
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3777
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2025708 - Posted: 31 Dec 2019, 15:26:42 UTC - in response to Message 2025706.  
Last modified: 31 Dec 2019, 15:33:07 UTC

The change is for all -9s. The validator has no capacity to evaluate which platform completed the task. The thread merely explains why this is being done.
Edit: @Richard Did TBar indicate that his newest 10.2 compile improves pulse-finding to reduce the Inconclusives? I seem to remember that.
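As a rough illustration of that rule (a sketch only, not the project's actual validator code), the required quorum could be picked from the outcome of the results alone. The names `Result`, `is_overflow`, and `required_quorum`, and the default quorum of 2, are assumptions made for this example; the quorum of 3 for -9 overflows is the figure given earlier in the thread.

```python
from dataclasses import dataclass

OVERFLOW_QUORUM = 3  # from the thread: quorum on -9 overflows is now 3
DEFAULT_QUORUM = 2   # assumption for this sketch


@dataclass
class Result:
    host_id: int
    platform: str      # e.g. "AMD", "NVIDIA", "CPU"; deliberately never read below
    is_overflow: bool  # True if the task exited with a -9 overflow


def required_quorum(results):
    """Return how many agreeing results are needed, ignoring the platform."""
    if any(r.is_overflow for r in results):
        return OVERFLOW_QUORUM
    return DEFAULT_QUORUM


# Two NVIDIA hosts that both overflowed still trigger a third wingmate.
nvidia_pair = [Result(1, "NVIDIA", True), Result(2, "NVIDIA", True)]
print(required_quorum(nvidia_pair))
```

Because the function never looks at `platform`, an overflow from any card or CPU gets the larger quorum.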
ID: 2025708
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14656
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2025710 - Posted: 31 Dec 2019, 15:43:48 UTC - in response to Message 2025708.  

Mr. Kevvy wrote: "@Richard Did TBar indicate that his newest 10.2 compile improves pulse-finding to reduce the Inconclusives? I seem to remember that."
I read that too.

But in this case the first is a 101 built by TBar, and the second was a mutex build by Ian. Sounds like it's one that should be checked on the bench, but the data file won't be available now validation has concluded. It probably should have gone to a fourth wingmate, and then we could have caught it for forensic dissection.
ID: 2025710
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 2025713 - Posted: 31 Dec 2019, 15:59:17 UTC - in response to Message 2025710.  

Sorry that I missed the fact that I was doing the WU on the CPU.
I thought it looked strange in any case, and it just might be.
Thanks for looking.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 2025713
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2025718 - Posted: 31 Dec 2019, 16:18:31 UTC

This is a function of how the CUDA apps vs. other apps go about finding signals, combined with the limit of 30 signals before exit.

Looks like you have 2 systems running the special app vs. a CPU app on a noisy WU.

The CUDA apps find triplets first, hitting 30, then exiting.

The CPU app looks for pulses first, finding 29, then moves to triplets, finds another, and exits. So in this case, the two CUDA apps agree and get the credit. If the limit were, say, 100 or 1000, then maybe we would see fewer resends due to noisy WUs. But I understand needing a limit somewhere.
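A toy model of that effect, as a sketch under stated assumptions: the real apps do far more than this, and the function `run_app`, the search-order lists, and the contents of `noisy_wu` (40 triplets, 29 pulses) are made up for illustration. Only the limit of 30 and the "triplets first" vs. "pulses first" ordering come from the description above.

```python
SIGNAL_LIMIT = 30  # from the post: the app exits once 30 signals are found


def run_app(search_order, signals_in_wu):
    """Report signals one type at a time, stopping at SIGNAL_LIMIT."""
    reported = []
    for kind in search_order:
        for signal_id in signals_in_wu.get(kind, []):
            reported.append((kind, signal_id))
            if len(reported) == SIGNAL_LIMIT:
                return reported  # overflow: exit before other types are searched
    return reported


def count_kind(reported, kind):
    """Count how many reported signals are of the given type."""
    return sum(1 for k, _ in reported if k == kind)


# Hypothetical noisy WU: more triplets than the limit, plus 29 pulses.
noisy_wu = {"triplet": list(range(40)), "pulse": list(range(29))}

cuda_a = run_app(["triplet", "pulse"], noisy_wu)  # triplets first
cuda_b = run_app(["triplet", "pulse"], noisy_wu)
cpu = run_app(["pulse", "triplet"], noisy_wu)     # pulses first

print(cuda_a == cuda_b)                # True: the two CUDA results agree
print(count_kind(cpu, "pulse"))        # 29
print(count_kind(cpu, "triplet"))      # 1
print(len(set(cuda_a) & set(cpu)))     # 1: only one signal in common
```

With a larger `SIGNAL_LIMIT` (say 100), both orderings would report all 69 signals in this toy WU and the results would contain the same set.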
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2025718
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2025729 - Posted: 31 Dec 2019, 18:32:59 UTC - in response to Message 2025710.  

Mr. Kevvy wrote: "@Richard Did TBar indicate that his newest 10.2 compile improves pulse-finding to reduce the Inconclusives? I seem to remember that."
Richard Haselgrove wrote: "I read that too. But in this case the first is a 101 built by TBar, and the second was a mutex build by Ian. Sounds like it's one that should be checked on the bench, but the data file won't be available now validation has concluded. It probably should have gone to a fourth wingmate, and then we could have caught it for forensic dissection."

If you check it on the bench you will find that all CUDA Apps, even the Baseline CUDA App built by nVidia in 2007, will give that result. A few years ago, when most results were coming from the Windows CUDA App, that result would have been deemed "normal" and valid.
ID: 2025729



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.