Posts by Saicere

1) Message boards : Number crunching : Intel GPU Validation Inconclusive (Message 1788674)
Posted 19 May 2016 by Saicere
there could be something weird going on with the Seti@Home result validator as well.

Consider a simple example.
Let's say the validator considers two results different if they differ by more than 2.
Let result A be 1 and result B be 3.1.
The validator treats them as different.
Then result C comes in, and it's 1.5.

Which result should the validator discard then?

I'm not saying that the validator doesn't have to make choices like this, but when WU results involving the Intel HD 530 produce "Validation Inconclusive" results 50% of the time, there might be something off with the thresholds for flagging it as such. The other possible conclusion is that the HD 530 GPU and/or OpenCL drivers are just bad at math.
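The quoted example can be worked through in a few lines. This is only a sketch of a plain "absolute difference within a threshold" comparison, not the actual SETI@home validator logic, which isn't known here; the function name `agreeing_pairs` is made up for illustration.

```python
from itertools import combinations

def agreeing_pairs(results, threshold):
    """Return the pairs of result names whose values differ by at most `threshold`.

    Hypothetical comparison rule taken from the quoted example; the real
    validator's rule is an unknown.
    """
    return [
        (name_a, name_b)
        for (name_a, value_a), (name_b, value_b) in combinations(results.items(), 2)
        if abs(value_a - value_b) <= threshold
    ]

# Values from the example: A and B are more than 2 apart, C sits in between.
results = {"A": 1.0, "B": 3.1, "C": 1.5}
pairs = agreeing_pairs(results, threshold=2.0)
print(pairs)  # [('A', 'C'), ('B', 'C')]
```

C agrees with both A and B even though A and B disagree with each other, so with a rule like this "agrees with" is not transitive, and a third result can end up validating two results that were inconclusive against each other.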
2) Message boards : Number crunching : Intel GPU Validation Inconclusive (Message 1788452)
Posted 18 May 2016 by Saicere
Ah, nearly forgot. You should also check this thread:

Thanks for the tip. I looked at the thread in question, but I'm not seeing any kind of runtime error or anything else that indicates that the work unit failed in any way. It just seems to be generating results that are slightly different from those of some other computers.

Without knowing the logic behind the result validator/comparator, my guess is that the results are different enough to get a "Validation Inconclusive" state but close enough to still get validated after a third result. The GPU results for these two computers are now sitting at 20 valid, 1 invalid and 25 inconclusive. Quite a few of the valid ones were inconclusive at first, but then all three tasks got validated, which seems weird. If the first two task results had discrepancies, shouldn't at least one of them be invalid?

I guess it's possible that Intel pulled another FDIV bug with the Skylake HD 530 GPU, but there could be something weird going on with the Seti@Home result validator as well.

Also, driver versions:

OpenCL: Intel GPU 0: Intel(R) HD Graphics 530 (driver version, device version OpenCL 2.0, 13041MB, 13041MB available, 221 GFLOPS peak)
OpenCL CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz (OpenCL driver vendor: Intel(R) Corporation, driver version, device version OpenCL 2.0 (Build 10094))
3) Message boards : Number crunching : Intel GPU Validation Inconclusive (Message 1788408)
Posted 18 May 2016 by Saicere
Was looking a bit more at this, specifically the one work unit that was marked as invalid:

The results look similar: the AMD GPU and CPU results found 9 pulses and a triplet, while the Intel GPU result found 10 pulses. Here are the actual results from the AMD and Intel GPUs, sorted by time, with the differing entry at the bottom of each list:


Pulse: peak=0.1869025, time=45.82, period=0.06417, d_freq=1763327896.77, score=1.011, chirp=-20.16, fft_len=128 
Pulse: peak=0.5658, time=45.82, period=0.4011, d_freq=1763332430.22, score=1.053, chirp=48.532, fft_len=256 
Pulse: peak=6.95606, time=45.84, period=15.58, d_freq=1763333460.1, score=1.052, chirp=-97.251, fft_len=512 
Pulse: peak=7.482055, time=45.84, period=15.58, d_freq=1763333451.55, score=1.131, chirp=-97.437, fft_len=512 
Pulse: peak=7.584933, time=45.99, period=22.91, d_freq=1763331222.83, score=1.02, chirp=51.262, fft_len=4k
Pulse: peak=4.384547, time=45.99, period=10.5, d_freq=1763326967.09, score=1.035, chirp=55.625, fft_len=4k
Pulse: peak=5.635067, time=45.9, period=14.58, d_freq=1763325916.18, score=1.032, chirp=-68.506, fft_len=2k
Pulse: peak=4.664864, time=45.9, period=10.77, d_freq=1763329940.72, score=1.073, chirp=-76.392, fft_len=2k
Pulse: peak=7.535296, time=46.17, period=21, d_freq=1763323584.46, score=1.044, chirp=46.654, fft_len=8k

Triplet: peak=10.01984, time=48.65, period=28.95, d_freq=1763323381.63, chirp=43.492, fft_len=512 

Pulse: peak=0.1866636, time=45.82, period=0.06417, d_freq=1763327896.77, score=1.01, chirp=-20.16, fft_len=128 
Pulse: peak=0.5739036, time=45.82, period=0.4011, d_freq=1763332430.22, score=1.068, chirp=48.532, fft_len=256 
Pulse: peak=6.865847, time=45.84, period=15.58, d_freq=1763333460.1, score=1.038, chirp=-97.251, fft_len=512 
Pulse: peak=7.336316, time=45.84, period=15.58, d_freq=1763333451.55, score=1.109, chirp=-97.437, fft_len=512 
Pulse: peak=7.647188, time=45.99, period=22.91, d_freq=1763331222.83, score=1.028, chirp=51.262, fft_len=4k
Pulse: peak=4.426685, time=45.99, period=10.5, d_freq=1763326967.09, score=1.045, chirp=55.625, fft_len=4k
Pulse: peak=5.647213, time=45.9, period=14.58, d_freq=1763325916.18, score=1.034, chirp=-68.506, fft_len=2k
Pulse: peak=4.581229, time=45.9, period=10.77, d_freq=1763329940.72, score=1.054, chirp=-76.392, fft_len=2k
Pulse: peak=7.537786, time=46.17, period=21, d_freq=1763323584.46, score=1.044, chirp=46.654, fft_len=8k

Pulse: peak=1.256464, time=45.84, period=1.668, d_freq=1763332922.82, score=1, chirp=54.879, fft_len=512 

Apart from the last entry in each list (a triplet for AMD, an extra pulse for Intel) and minor discrepancies in the peak and score values, the results match, with identical time/period/d_freq/chirp/fft_len.
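The same comparison can be done mechanically. The sketch below is just one way to line up two result lists, assuming the "Kind: name=value, ..." text layout shown above; the helper names (`parse_signal`, `index_signals`, `compare`) are made up for illustration and have nothing to do with how the project's validator works. Only three lines per list are included to keep it short.

```python
KEY_FIELDS = ("time", "period", "d_freq", "chirp", "fft_len")

def parse_signal(line):
    """Split 'Pulse: peak=1.0, time=2.0, ...' into (kind, {field: text})."""
    kind, _, rest = line.partition(":")
    fields = dict(part.strip().split("=", 1) for part in rest.split(","))
    return kind.strip(), fields

def index_signals(lines):
    """Map (kind, time, period, d_freq, chirp, fft_len) -> peak as a float."""
    indexed = {}
    for line in lines:
        kind, fields = parse_signal(line)
        key = (kind,) + tuple(fields[name] for name in KEY_FIELDS)
        indexed[key] = float(fields["peak"])
    return indexed

def compare(lines_a, lines_b):
    """Return (relative peak difference per shared signal, only in a, only in b)."""
    a, b = index_signals(lines_a), index_signals(lines_b)
    matched = {key: abs(a[key] - b[key]) / a[key] for key in a.keys() & b.keys()}
    return matched, sorted(a.keys() - b.keys()), sorted(b.keys() - a.keys())

# A subset of the lines listed above (AMD first, Intel second).
amd = [
    "Pulse: peak=0.1869025, time=45.82, period=0.06417, d_freq=1763327896.77, score=1.011, chirp=-20.16, fft_len=128",
    "Pulse: peak=7.535296, time=46.17, period=21, d_freq=1763323584.46, score=1.044, chirp=46.654, fft_len=8k",
    "Triplet: peak=10.01984, time=48.65, period=28.95, d_freq=1763323381.63, chirp=43.492, fft_len=512",
]
intel = [
    "Pulse: peak=0.1866636, time=45.82, period=0.06417, d_freq=1763327896.77, score=1.01, chirp=-20.16, fft_len=128",
    "Pulse: peak=7.537786, time=46.17, period=21, d_freq=1763323584.46, score=1.044, chirp=46.654, fft_len=8k",
    "Pulse: peak=1.256464, time=45.84, period=1.668, d_freq=1763332922.82, score=1, chirp=54.879, fft_len=512",
]

matched, only_amd, only_intel = compare(amd, intel)
for key, rel_diff in sorted(matched.items()):
    print(key, f"relative peak difference: {rel_diff:.4%}")
print("only in AMD result:", only_amd)
print("only in Intel result:", only_intel)
```

Signals are matched on the text of the fields that were identical in both lists, so the peak (and score) can differ slightly without breaking the match, while the triplet and the extra pulse show up as unmatched.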
4) Message boards : Number crunching : Intel GPU Validation Inconclusive (Message 1788382)
Posted 18 May 2016 by Saicere
I recently added two new Skylake 6700K-based computers with the Intel GPU enabled and available for Seti@Home to use, but I'm seeing a lot of "Validation Inconclusive" results on both for work units using the Intel OpenCL application (specifically, SETI@home v8 v8.00 (opencl_intel_gpu_sah) windows_intelx86).

CPU-generated units seem fine, and most of the GPU results get updated to "Valid"; there's just one "Invalid" at this point in time.

Any idea what might be causing the discrepancy and large number of inconclusive results?
5) Message boards : Number crunching : Crunching / Gaming Computer Build 2010 (Message 950790)
Posted 29 Nov 2009 by Saicere
120mm 3000rpm fan? I hope you're not planning on sleeping in the same *house*. :)

Also, if those are the speakers you are going to use, you might as well use the integrated sound card.
6) Message boards : Technical News : Sad Trombone (Nov 25 2009) (Message 949895)
Posted 26 Nov 2009 by Saicere
You're talking a lot about MySQL problems on that new monster of yours, and in case you weren't aware, I just thought you should know that MySQL scales horribly on a large number of cores. It can barely scale to 8, so there's virtually no chance that you're getting good results on 24.

Sun has an "official" writeup of a scaling attempt here:

Their suggestion? Run several instances of MySQL on the same server. Which is a bit meh, but if you insist on running MySQL on a high-performance project like this, it might be worth looking into splitting up the most trafficked tables into separate instances. Then you could also enable the safe commit behavior individually on the critical servers.

You've also mentioned running into replica problems on jocelyn after a primary crash, which is very common. Here's a trick to get around it, at least temporarily.

After the crash, feed jocelyn the following through the MySQL console:


Replace [NEXT FILE] with the name of the first binary log file on mork that it started writing after the crash, typically mysql-bin.002342 or something similar. You can get the name by running SHOW MASTER STATUS; on mork. This will skip the corrupted end of the previous binary log, and restart replication from the new file.

Then verify that it's running with a SHOW SLAVE STATUS;
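The exact commands are not preserved in this copy of the post, so the following is only a sketch of what such a sequence commonly looks like, assuming the classic MySQL replication statements (`STOP SLAVE`, `CHANGE MASTER TO`, `START SLAVE`) and assuming replication restarts at position 4, the usual first event offset in a binary log file. The helper `replica_restart_statements` is hypothetical; it only builds the statements as text for pasting into the MySQL console.

```python
import re

def replica_restart_statements(next_file):
    """Build the statements to point a replica at the start of `next_file`.

    Assumption: position 4 is the first event in a fresh binary log file.
    `next_file` is the name reported by SHOW MASTER STATUS on the primary.
    """
    # Only accept plain binary log file names such as mysql-bin.002342.
    if not re.fullmatch(r"[\w.-]+", next_file):
        raise ValueError(f"unexpected binary log file name: {next_file!r}")
    return [
        "STOP SLAVE;",
        f"CHANGE MASTER TO MASTER_LOG_FILE='{next_file}', MASTER_LOG_POS=4;",
        "START SLAVE;",
        "SHOW SLAVE STATUS;",
    ]

# The file name is the example used in the text, not a real one.
statements = replica_restart_statements("mysql-bin.002342")
print("\n".join(statements))
```

Whether position 4 is right for a given server version should be checked against its documentation before running anything like this on jocelyn.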

Note that you *cannot be sure* that the state on the replica and the primary is now consistent unless you are using safe commits; however, it should still be consistent enough for non-science use.
7) Message boards : Number crunching : Welcome to the 10 year club (Message 949878)
Posted 26 Nov 2009 by Saicere
So, I've been on-again off-again for months at a time (can't run this thing during summer), but yea. Hi. :)

As a sidenote, this i7 has been crunching more in two weeks than my previous contestant managed in 8 months.

©2017 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.