What happened here?

Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 699561 - Posted: 12 Jan 2008, 13:25:32 UTC

Not had time to click through all the result links to look for trends, but http://setiathome.berkeley.edu/workunit.php?wuid=199559134 looks worthy of investigation by someone with time to spare.

I think Joe Segur was interested in WU's like this a while back. I've been a bit busy in the non-SETI world recently, so have not had time to keep up to date with the latest debates here.
Sir Arthur C Clarke 1917-2008
ID: 699561
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 699644 - Posted: 12 Jan 2008, 21:45:22 UTC - in response to Message 699561.  

Not had time to click through all the result links to look for trends, but http://setiathome.berkeley.edu/workunit.php?wuid=199559134 looks worthy of investigation by someone with time to spare.

I think Joe Segur was interested in WU's like this a while back. I've been a bit busy in the non-SETI world recently, so have not had time to keep up to date with the latest debates here.

Yes, the 704923776 task result for that WU is an example of a bug I'm attempting to solve, but there's not enough information in the Task Details pages to be much help. The bug is almost certainly random and not related to the WU characteristics so I consider the compute errors from 2 other hosts unrelated.
Joe
ID: 699644
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 699739 - Posted: 13 Jan 2008, 13:02:45 UTC - in response to Message 699644.  

Josef W. Segur wrote:

Yes, the 704923776 task result for that WU is an example of a bug I'm attempting to solve, but there's not enough information in the Task Details pages to be much help. The bug is almost certainly random and not related to the WU characteristics so I consider the compute errors from 2 other hosts unrelated.
Joe


The 2 hosts on that WU that have not generated Compute Errors have different versions of MS Windows (Vista and XP); one is an Intel Core 2 Quad, the other an AMD 64. When I have seen this before, it was the Vista system that had the -9 overflow; this time it is the XP one.

One of the hosts with compute errors had a group of them close together; the other seems to have had just 1 compute error.

As you say, there is not really enough data to see any trend, just another random WU where the hosts disagreed about the result.
Sir Arthur C Clarke 1917-2008
ID: 699739
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 700019 - Posted: 14 Jan 2008, 15:54:35 UTC

Validation has completed for that WU: http://setiathome.berkeley.edu/workunit.php?wuid=199559134

The decider was:

<core_client_version>5.5.0</core_client_version>
<stderr_txt>
Optimized SETI@Home Enhanced application

Optimizers: Ben Herndon, Josef Segur, Alex Kan, Simon Zadra
Version: Windows SSE 32-bit 'Ni!' based on seti V5.15 'Chicken Good!'
Rev: (R-2.0|QxK|FFT:IPP_SSE|Ben-Joe)
CPUID: 'AMD K7x Athlon XP (Barton)'
cpus: 1 cores: 1 threads: 1 cache: L1=64K L2=512K L3=0K
features: mmx 3Dnow 3Dnow+ sse
speed: 1831 MHz -- read megs/sec: L1=10314, L2=4320, RAM=919

Work Unit Info
True angle range: 0.407370

Spikes Pulses Triplets Gaussians Flops
2 0 1 1 16437514820842
</stderr_txt>

The other valid result was:

<core_client_version>5.10.20</core_client_version>
<![CDATA[
<stderr_txt>
Optimized SETI@Home Enhanced application
Optimizers: Ben Herndon, Josef Segur, Alex Kan, Simon Zadra
Version: Windows SSSE3 32-bit based on S@H V5.15 'Noo? No - Ni!'
Revision: R-2.4V|xT|FFT:IPP_SSSE3|Ben-Joe
CPUID: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
Speed: 4 x 2400 MHz
Cache: L1=64K L2=4096K
Features: MMX SSE SSE2 SSE3 SSSE3

Work Unit Info
WU Credit multi. is: 2.85
WU True angle range: 0.407370

Spikes Pulses Triplets Gaussians Flops
3 0 1 1 16436770508021

</stderr_txt>
]]>
Sir Arthur C Clarke 1917-2008
ID: 700019
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 700025 - Posted: 14 Jan 2008, 16:41:54 UTC
Last modified: 14 Jan 2008, 16:43:24 UTC

Interesting that the spike count doesn't match but the WU still validated.

Although I did note that one of them was running the Coop v2.0.

Alinator
ID: 700025
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 700043 - Posted: 14 Jan 2008, 18:14:48 UTC - in response to Message 700025.  

Interesting that the spike count doesn't match but the WU still validated.

Although I did note that one of them was running the Coop v2.0.

Alinator

The Coop v2.0 result was also chosen as canonical. If this were a generously funded project with staff sitting around looking for something to do, grabbing those results for human inspection before they're deleted would resolve the questions that raises.

Although different counts which give "strongly similar" validation should be rare, both the way the Validator works and the limits of what can be measured make it possible. At FFT lengths of 256 and smaller the time resolution of analysis is 0.41 seconds or less for Spikes, Triplets, and Pulses. Because the time matching criterion in the Validator is 1 second, two signals within 1 second can be seen as identical. And because 1 second of telescope motion is less than 1 beam width for any angle range below 5.3687, those signals are identical for our purposes.
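
The arithmetic behind the 1 second and 5.3687 limits can be sketched in a few lines of Python. This is only an illustration, not the project's Validator code: the work unit duration (about 107.37 seconds) and the 0.05 degree beam width are assumptions, chosen because together they reproduce the 5.3687 figure, and the function names and sample detection times are made up.

```python
# Illustrative sketch only; NOT the SETI@home Validator's actual logic.
WU_DURATION_S = 107.374182  # assumed: 1,048,576 samples at 9765.625 samples/s
BEAM_WIDTH_DEG = 0.05       # assumed beam width, implied by the 5.3687 limit
TIME_TOLERANCE_S = 1.0      # time matching criterion mentioned in the post


def motion_per_second(angle_range_deg):
    """Degrees of telescope motion in one second, treating the angle
    range as the total motion over the whole work unit (assumption)."""
    return angle_range_deg / WU_DURATION_S


def times_match(t_a, t_b, tolerance=TIME_TOLERANCE_S):
    """True when two detection times fall within the matching tolerance."""
    return abs(t_a - t_b) <= tolerance


def every_signal_has_partner(times_a, times_b):
    """True when each detection time in one list has at least one
    partner within tolerance in the other list, in both directions."""
    a_ok = all(any(times_match(a, b) for b in times_b) for a in times_a)
    b_ok = all(any(times_match(a, b) for a in times_a) for b in times_b)
    return a_ok and b_ok


# Hypothetical spike times in seconds: one host reports 2, the other 3.
host_a = [12.58, 60.00]
host_b = [12.58, 12.99, 60.00]
counts_differ = len(host_a) != len(host_b)
still_paired = every_signal_has_partner(host_a, host_b)

# For the angle range of this WU, one second of motion is far below
# the assumed beam width.
inside_one_beam = motion_per_second(0.407370) < BEAM_WIDTH_DEG
```

With these made-up times the counts differ (2 against 3), yet every signal has a partner within 1 second, which is the kind of situation where two results could still compare as similar.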
Joe
ID: 700043
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 700050 - Posted: 14 Jan 2008, 18:59:43 UTC - in response to Message 700043.  

Interesting that the spike count doesn't match but the WU still validated.

Although I did note that one of them was running the Coop v2.0.

Alinator

The Coop v2.0 result was also chosen as canonical. If this were a generously funded project with staff sitting around looking for something to do, grabbing those results for human inspection before they're deleted would resolve the questions that raises.

Although different counts which give "strongly similar" validation should be rare, both the way the Validator works and the limits of what can be measured make it possible. At FFT lengths of 256 and smaller the time resolution of analysis is 0.41 seconds or less for Spikes, Triplets, and Pulses. Because the time matching criterion in the Validator is 1 second, two signals within 1 second can be seen as identical. And because 1 second of telescope motion is less than 1 beam width for any angle range below 5.3687, those signals are identical for our purposes.
Joe


Yep, I remember that now from when you talked about this when the 2.0 version first came out and was having some validation trouble on LF data.

So I guess we can file this under the category 'Close enough for government work'. ;-)

Alinator
ID: 700050
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 701858 - Posted: 20 Jan 2008, 12:23:03 UTC
Last modified: 20 Jan 2008, 12:23:46 UTC

Here is another WU for those curious about "what happened here".

I just spotted this one in the task list for one of my "wingmen".

I'm not sure why the "Redundant result, Cancelled by server" outcome happened on a WU that has not reached quorum yet.
Sir Arthur C Clarke 1917-2008
ID: 701858
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 701994 - Posted: 20 Jan 2008, 17:52:26 UTC
Last modified: 20 Jan 2008, 18:46:47 UTC

Take a look at the deadline for the wingman's task. The 221 was issued 4 days after the deadline expired.

So I'm going to guess that this was not a 221 per se, but rather an unconditional abort from the project to get a zombie WU moving again and it just gets reported to the status pages as a 221 for some reason. Perhaps one of the 'mysterious' backend tasks we hear about from time to time. ;-)

<edit> Looking at the timeline for all the tasks on the WU, it would seem that the project actually gave it the 'boot' around the 11th, but the host didn't contact the project again until the 14th. So it would seem the 'de-zombizer' sets the status for the task to 'Redundant', regardless of whether it really is.

Alinator

<edit2> DUHHH.... Alinator!!!

Even simpler than that. Since the task had expired and had already been reissued to a new wingman, it was redundant at the time of contact and unstarted at that point, so it got 221'ed. The project had no way to know the new wingman was not going to report at all when it issued the 221 on the 14th.
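
That reasoning boils down to a three-part check, sketched here in Python. The `Task` fields and the function name are hypothetical, and this is not BOINC's scheduler code; the dates are only the rough timeline described in this post (deadline around the 10th, host contact on the 14th).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Task:
    """Hypothetical, simplified view of one task on a work unit."""
    deadline: datetime
    started: bool             # has the host begun crunching it?
    replacement_issued: bool  # was a copy already sent to a new wingman?


def should_cancel_as_redundant(task, contact_time):
    """True when the task is past its deadline, has already been
    reissued to another host, and has not been started yet."""
    expired = contact_time > task.deadline
    return expired and task.replacement_issued and not task.started


# Approximate timeline from the post; exact times are not known.
late_task = Task(deadline=datetime(2008, 1, 10),
                 started=False,
                 replacement_issued=True)
cancelled = should_cancel_as_redundant(late_task, datetime(2008, 1, 14))
```

Note that the check only looks at what is known at the moment the host makes contact; whether the replacement wingman later reports or not plays no part in it.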

Alinator
ID: 701994
