Monitoring inconclusive GBT validations and harvesting data for testing

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1813309 - Posted: 29 Aug 2016, 0:59:08 UTC - in response to Message 1813220.  
Last modified: 29 Aug 2016, 1:04:05 UTC

...So far the validator appears to be coping with the limited 'unofficial' Mac and Linux builds in circulation, though if significant co-validations begin to occur I will likely have to raise a discussion with project staff...

I'm still scratching my head over this one. The current stock Mac nVidia App is producing nearly 100% Inconclusive results when running under the current Mac OS X. I decided to see how one host was doing with those co-validations. The fourth valid task in the list seems suspect:
http://setiathome.berkeley.edu/workunit.php?wuid=2249369730
Task   Computer   Sent   Time reported   Status   Run time (sec)   CPU time (sec)   Credit   Application
5123535836   5879059   28 Aug 2016, 15:43:41 UTC   28 Aug 2016, 17:23:45 UTC   Completed and validated   1,548.38    96.09   54.07   SETI@home v8 v8.00 (opencl_nvidia_mac) x86_64-apple-darwin
5123535837   8018045   28 Aug 2016, 15:44:19 UTC   28 Aug 2016, 20:18:50 UTC   Completed and validated   1,077.36   416.23   54.07   SETI@home v8 v8.00 (opencl_nvidia_mac) x86_64-apple-darwin

The first host has: In progress (3) · Validation pending (48) · Validation inconclusive (38) · Valid (40) · Invalid (1) · Error (0)
The second has: In progress (108) · Validation pending (454) · Validation inconclusive (349) · Valid (258) · Invalid (9) · Error (2)
I didn't go any further, I was afraid of what I might find. All the Mac nVidia Hosts have this problem running the current OSX, and have had it for a while.
Seems to me even the rawest Alpha is working better than the Current Stock Mac nVidia App.
There is a much better Mac nVidia App at Beta, but only a couple of people are running it.
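
To put rough numbers on those two hosts, here is a minimal sketch that turns the task counts quoted above into an inconclusive share. The `HostCounts` class and its method are hypothetical names invented for illustration. The share is a snapshot of what is currently listed for each host, not a true first-time validation rate, since older valid tasks drop off the list over time.

```python
from dataclasses import dataclass


@dataclass
class HostCounts:
    """Task counts as shown on a host's task list (hypothetical helper)."""
    pending: int
    inconclusive: int
    valid: int
    invalid: int

    def inconclusive_share(self) -> float:
        # Only tasks the validator has already looked at are counted:
        # inconclusive + valid + invalid. Pending tasks are ignored.
        checked = self.inconclusive + self.valid + self.invalid
        return self.inconclusive / checked if checked else 0.0


# Counts copied from the two hosts quoted in the post above.
first = HostCounts(pending=48, inconclusive=38, valid=40, invalid=1)
second = HostCounts(pending=454, inconclusive=349, valid=258, invalid=9)

for name, host in (("first", first), ("second", second)):
    print(f"{name} host: {host.inconclusive_share():.1%} of checked tasks inconclusive")
```

On these snapshots roughly half of the checked tasks on each host are sitting in the inconclusive state.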
ID: 1813309 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1813319 - Posted: 29 Aug 2016, 2:08:29 UTC - in response to Message 1813309.  

Simple answer. I hold X-branch to higher standards than are applied to the broken stock Mac app that is pushing itself to the top of the list as primary justification for this thread ;)

While the Mac/Linux penetration of unfinished Petri-based apps is low, there is enough information accumulated to know there is more work to be done before they can hit primetime. Petri routinely acknowledges this himself.

My understanding of Richard's work here is identifying the major problems, so they can potentially be addressed. With Petri's alphas already a known quantity, somewhat contained, and being actively worked on, then rationalising the actual brokenness on the main offenders can begin.

An unfinished application causing minimal impact is not, on the bigger scale, a problem but a 'development tool'.

As soon as you step from Mac/Linux onto Windows, then the potential for damage increases probably tenfold or more. Just weight of numbers.

Care is needed to ensure the situation doesn't step from crappy Mac stock builds into abject chaos. Chaos is my realm, and it isn't pretty.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1813319 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1813324 - Posted: 29 Aug 2016, 2:38:37 UTC - in response to Message 1813319.  
Last modified: 29 Aug 2016, 2:39:53 UTC

My point is, just about any Mac nVidia App that produces more than about 20% first-time validations is an improvement over the situation that has existed for quite some time. The current App at Beta was first posted back in January and produces over 95% first-time validations. One of the couple of hosts running it is having one of his best days in terms of activity, http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=73656&offset=20
I remember someone saying that Baseline App at Beta didn't need Beta testing...

The problem with the Intel iGPUs is also well known. I ran across a couple of those in that Mac Valid list. I would suspect any 'Valid' task that included a Mac nVidia GPU & Intel iGPU, and also any two iGPUs that validated with each other. There are quite a few Intel iGPUs out there.

I don't remember anyone saying Petri's App should be posted for anyone, however, it should be recognized that in the case of the current Mac nVidia App just about anything is an improvement. The Intel iGPUs aren't far behind either.
ID: 1813324 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1813333 - Posted: 29 Aug 2016, 2:51:06 UTC - in response to Message 1813324.  

Yes, however you're trying to draw a line between broken, and somewhat less broken, which falls over with the concept of 'inconclusive'. It's better to be all good, or more or less completely broken. Airy fairy results will pollute the situation, to the point that the worst stock apps feel like trolls.
ID: 1813333 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1813374 - Posted: 29 Aug 2016, 7:06:12 UTC - in response to Message 1813324.  

The Intel iGPUs aren't far behind either.

Same question: what limits in plan class should be implemented?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1813374 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1813377 - Posted: 29 Aug 2016, 7:15:21 UTC - in response to Message 1813374.  

The Intel iGPUs aren't far behind either.

Same question: what limits in plan class should be implemented?

Alternatively: what can we do to fix it (them)?
ID: 1813377 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1813378 - Posted: 29 Aug 2016, 7:17:32 UTC - in response to Message 1813377.  
Last modified: 29 Aug 2016, 7:19:36 UTC

The Intel iGPUs aren't far behind either.

Same question: what limits in plan class should be implemented?

Alternatively: what can we do to fix it (them)?

We can limit distribution to known good hosts. Though after more than a month of waiting I am starting to doubt even this: still no response from Eric (though "only" a week since the last one :D).
ID: 1813378 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1813380 - Posted: 29 Aug 2016, 7:34:12 UTC - in response to Message 1813378.  

The Intel iGPUs aren't far behind either.

Same question: what limits in plan class should be implemented?

Alternatively: what can we do to fix it (them)?

We can limit distribution to known good hosts. Though after more than a month of waiting I am starting to doubt even this: still no response from Eric (though "only" a week since the last one :D).

I did see a team message yesterday from Angela: "Last week was [Eric's] first week back to work after a two week vacation, so things were pretty crazy for him."
ID: 1813380 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1813383 - Posted: 29 Aug 2016, 8:03:20 UTC - in response to Message 1813380.  
Last modified: 29 Aug 2016, 8:04:45 UTC


I did see a team message yesterday from Angela: "Last week was [Eric's] first week back to work after a two week vacation, so things were pretty crazy for him."

Yep, a hobby usually leaves more time for action ;D

In the meantime: there is nothing to offer yet regarding an iGPU exclusion list, so concentrate on that.
ID: 1813383 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1813387 - Posted: 29 Aug 2016, 8:28:52 UTC - in response to Message 1813383.  

Considering Apple uses strictly Intel CPUs, and has been known to have 'arrangements' with Intel, I wouldn't be surprised to find the Apple OpenCL and Intel OpenCL problems are related. Apple uses their own version of OpenCL. There are No Apple OpenCL drivers that I know of. There are No Apple Intel iGPU drivers either. That means the Intel iGPUs are using the Apple driver, just as the AMD & nVidia GPUs are using the Apple OpenCL. Solve the problem with the Apple OpenCL and you might solve the problem with the Intel iGPUs. It's just a thought...
ID: 1813387 · Report as offensive
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1813390 - Posted: 29 Aug 2016, 9:07:35 UTC

I haven't read through this thread completely, but I want to post my findings.
Look at these results!
To my mind, it looks like Petri's code is more accurate than the Apple code.

Quick compare:

CPU:
Spike: peak=24.09495, time=33.55, d_freq=1420751144.89, chirp=-0.028652, fft_len=128k
Spike: peak=24.42812, time=33.55, d_freq=1420751144.89, chirp=-0.033273, fft_len=128k
Spike: peak=24.31746, time=33.55, d_freq=1420751144.88, chirp=-0.037895, fft_len=128k
Autocorr: peak=19.48975, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Spike: peak=24.78413, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k

APPLE:
Spike: peak=24.09507, time=33.55, d_freq=1420751144.89, chirp=-0.028652, fft_len=128k
Spike: peak=24.4283, time=33.55, d_freq=1420751144.89, chirp=-0.033273, fft_len=128k
Spike: peak=24.31762, time=33.55, d_freq=1420751144.88, chirp=-0.037895, fft_len=128k
Autocorr: peak=19.48845, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Spike: peak=24.11478, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k

PETRI:
Spike: peak=24.09493, time=33.55, d_freq=1420751144.89, chirp=-0.028652, fft_len=128k
Spike: peak=24.42814, time=33.55, d_freq=1420751144.89, chirp=-0.033273, fft_len=128k
Spike: peak=24.31745, time=33.55, d_freq=1420751144.88, chirp=-0.037895, fft_len=128k
Autocorr: peak=19.48978, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Spike: peak=24.78424, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k

This WU was crunched by the CPU app, by the Apple app, and by Petri's tweaked code. To me it looks as if this precision drift started way back, or is my assumption wrong?

http://setiathome.berkeley.edu/workunit.php?wuid=2248302050
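
The quick compare can also be expressed as relative differences against the CPU result. This is only a sketch: the peak values are copied from the lists above, the `relative_diffs` helper is a made-up name, and the 1% threshold is an illustrative assumption, not a statement of how the project validator actually decides.

```python
# Peak values copied from the quick compare above (same five signals, same order:
# three spikes, one autocorr, one spike).
cpu = [24.09495, 24.42812, 24.31746, 19.48975, 24.78413]
apple = [24.09507, 24.42830, 24.31762, 19.48845, 24.11478]
petri = [24.09493, 24.42814, 24.31745, 19.48978, 24.78424]


def relative_diffs(reference, candidate):
    """Relative difference of each candidate peak against the reference peak."""
    return [abs(c - r) / abs(r) for r, c in zip(reference, candidate)]


# Hypothetical threshold for illustration only; NOT taken from the validator source.
TOLERANCE = 0.01

for name, peaks in (("APPLE", apple), ("PETRI", petri)):
    diffs = relative_diffs(cpu, peaks)
    worst = max(diffs)
    verdict = "within" if worst <= TOLERANCE else "outside"
    print(f"{name}: worst relative difference {worst:.2e} ({verdict} {TOLERANCE:.0%})")
```

With these numbers only the last spike in the Apple result stands out, at a little under 3% from the CPU value; every other difference in both GPU results is far smaller.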

Complete STDERR below

CPU:

<stderr_txt>

Build features: SETI8 Non-graphics FFTW USE_SSE3 x64
CPUID: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz

Cache: L1=64K L2=256K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 FMA3 SSE4.1 SSE4.2 AVX
ar=0.410125 NumCfft=200105 NumGauss=1151864510 NumPulse=226294165221 NumTriplet=452692139687
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768

Windows optimized setiathome_v8 application
Based on Intel, Core 2-optimized v8-nographics V5.13 by Alex Kan
SSE3xj Win64 Build 3330 , Ported by : Raistmer, JDWhale

SETI8 update by Raistmer
Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.410125
Spike: peak=24.09495, time=33.55, d_freq=1420751144.89, chirp=-0.028652, fft_len=128k
Spike: peak=24.42812, time=33.55, d_freq=1420751144.89, chirp=-0.033273, fft_len=128k
Spike: peak=24.31746, time=33.55, d_freq=1420751144.88, chirp=-0.037895, fft_len=128k
Autocorr: peak=19.48975, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Spike: peak=24.78413, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k

Best spike: peak=24.78413, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k
Best autocorr: peak=19.48975, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Best gaussian: peak=2.818341, mean=0.5207298, ChiSq=1.416708, time=12.58, d_freq=1420752531.85,
score=-0.9713898, null_hyp=2.204585, chirp=10.797, fft_len=16k
Best pulse: peak=3.243532, time=6.544, period=0.9724, d_freq=1420754890.84, score=0.9946, chirp=-87.382, fft_len=64
Best triplet: peak=0, time=-2.122e+011, period=0, d_freq=0, chirp=0, fft_len=0


Flopcounter: 38853887058724.039000

Spike count: 4
Autocorr count: 1
Pulse count: 0
Triplet count: 0
Gaussian count: 0
Wallclock time elapsed since last restart: 8021.9 seconds

10:32:47 (3944): called boinc_finish(0)

</stderr_txt>

APPLE:

<stderr_txt>
OpenCL platform detected: Apple
Number of OpenCL devices found : 1
BOINC assigns slot on device #0.
Info: BOINC provided OpenCL device ID used
DOUBLE_FP supported.
cl_khr_fp64 supported.
cl_APPLE_fp64_basic_ops supported.
FERMI : true

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY OCL_CHIRP3 ASYNC_SPIKE FFTW SSE3 64bit
System: Darwin x86_64 Kernel: 15.6.0
CPU : Intel(R) Core(TM) i5-3330S CPU @ 2.70GHz
GenuineIntel x86, Family 6 Model 58 Stepping 9
Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX1.0

OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl
ar=0.410125 NumCfft=200105 NumGauss=1151864510 NumPulse=226294165221 NumTriplet=452692139687
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized setiathome_v8 application
Version info: SSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE3x OS X 64bit Build 3321 , Ported by : Raistmer, JDWhale, Urs Echternacht


OpenCL version by Raistmer, r3321

Number of OpenCL platforms: 1


OpenCL Platform Name: Apple
Number of devices: 1
Max compute units: 2
Max work group size: 1024
Max clock frequency: 745Mhz
Max memory allocation: 134217728
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 536870912
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: No
Name: GeForce GT 640M
Vendor: NVIDIA
Driver version: 10.10.13 310.42.25f01
Version: OpenCL 1.2
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_APPLE_fp64_basic_ops cl_khr_fp64 cl_khr_3d_image_writes cl_khr_depth_images cl_khr_gl_depth_images cl_khr_gl_msaa_sharing cl_khr_image2d_from_buffer cl_APPLE_ycbcr_422 cl_APPLE_rgb_422


Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.410125
Used GPU device parameters are:
Number of compute units: 2
Single buffer allocation size: 128MB
Total device global memory: 512MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: no
period_iterations_num=50
Spike: peak=24.09507, time=33.55, d_freq=1420751144.89, chirp=-0.028652, fft_len=128k
Spike: peak=24.4283, time=33.55, d_freq=1420751144.89, chirp=-0.033273, fft_len=128k
Spike: peak=24.31762, time=33.55, d_freq=1420751144.88, chirp=-0.037895, fft_len=128k
GPU device sync requested... ...GPU device synched
Termination request detected or computations are finished. GPU device synched, exiting...
OpenCL platform detected: Apple
Number of OpenCL devices found : 1
BOINC assigns slot on device #0.
Info: BOINC provided OpenCL device ID used
DOUBLE_FP supported.
cl_khr_fp64 supported.
cl_APPLE_fp64_basic_ops supported.
FERMI : true

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY OCL_CHIRP3 ASYNC_SPIKE FFTW SSE3 64bit
System: Darwin x86_64 Kernel: 15.6.0
CPU : Intel(R) Core(TM) i5-3330S CPU @ 2.70GHz
GenuineIntel x86, Family 6 Model 58 Stepping 9
Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX1.0

OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl
ar=0.410125 NumCfft=200105 NumGauss=1151864510 NumPulse=226294165221 NumTriplet=452692139687
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized setiathome_v8 application
Version info: SSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE3x OS X 64bit Build 3321 , Ported by : Raistmer, JDWhale, Urs Echternacht


OpenCL version by Raistmer, r3321

Number of OpenCL platforms: 1


OpenCL Platform Name: Apple
Number of devices: 1
Max compute units: 2
Max work group size: 1024
Max clock frequency: 745Mhz
Max memory allocation: 134217728
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 536870912
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: No
Name: GeForce GT 640M
Vendor: NVIDIA
Driver version: 10.10.13 310.42.25f01
Version: OpenCL 1.2
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_APPLE_fp64_basic_ops cl_khr_fp64 cl_khr_3d_image_writes cl_khr_depth_images cl_khr_gl_depth_images cl_khr_gl_msaa_sharing cl_khr_image2d_from_buffer cl_APPLE_ycbcr_422 cl_APPLE_rgb_422


Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.410125
Used GPU device parameters are:
Number of compute units: 2
Single buffer allocation size: 128MB
Total device global memory: 512MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: no
period_iterations_num=50
Spike: peak=24.09507, time=33.55, d_freq=1420751144.89, chirp=-0.028652, fft_len=128k
Spike: peak=24.4283, time=33.55, d_freq=1420751144.89, chirp=-0.033273, fft_len=128k
Spike: peak=24.31762, time=33.55, d_freq=1420751144.88, chirp=-0.037895, fft_len=128k
GPU device sync requested... ...GPU device synched
Termination request detected or computations are finished. GPU device synched, exiting...
OpenCL platform detected: Apple
Number of OpenCL devices found : 1
BOINC assigns slot on device #0.
Info: BOINC provided OpenCL device ID used
DOUBLE_FP supported.
cl_khr_fp64 supported.
cl_APPLE_fp64_basic_ops supported.
FERMI : true

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY OCL_CHIRP3 ASYNC_SPIKE FFTW SSE3 64bit
System: Darwin x86_64 Kernel: 15.6.0
CPU : Intel(R) Core(TM) i5-3330S CPU @ 2.70GHz
GenuineIntel x86, Family 6 Model 58 Stepping 9
Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX1.0

OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl
ar=0.410125 NumCfft=200105 NumGauss=1151864510 NumPulse=226294165221 NumTriplet=452692139687
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 2.00 percent.
Used GPU device parameters are:
Number of compute units: 2
Single buffer allocation size: 128MB
Total device global memory: 512MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: no
period_iterations_num=50
GPU device sync requested... ...GPU device synched
Termination request detected or computations are finished. GPU device synched, exiting...
OpenCL platform detected: Apple
Number of OpenCL devices found : 1
BOINC assigns slot on device #0.
Info: BOINC provided OpenCL device ID used
DOUBLE_FP supported.
cl_khr_fp64 supported.
cl_APPLE_fp64_basic_ops supported.
FERMI : true

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY OCL_CHIRP3 ASYNC_SPIKE FFTW SSE3 64bit
System: Darwin x86_64 Kernel: 15.6.0
CPU : Intel(R) Core(TM) i5-3330S CPU @ 2.70GHz
GenuineIntel x86, Family 6 Model 58 Stepping 9
Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX1.0

OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl
ar=0.410125 NumCfft=200105 NumGauss=1151864510 NumPulse=226294165221 NumTriplet=452692139687
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 2.00 percent.
Used GPU device parameters are:
Number of compute units: 2
Single buffer allocation size: 128MB
Total device global memory: 512MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: no
period_iterations_num=50
Autocorr: peak=19.48845, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
GPU device sync requested... ...GPU device synched
Termination request detected or computations are finished. GPU device synched, exiting...
OpenCL platform detected: Apple
Number of OpenCL devices found : 1
BOINC assigns slot on device #0.
Info: BOINC provided OpenCL device ID used
DOUBLE_FP supported.
cl_khr_fp64 supported.
cl_APPLE_fp64_basic_ops supported.
FERMI : true

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY OCL_CHIRP3 ASYNC_SPIKE FFTW SSE3 64bit
System: Darwin x86_64 Kernel: 15.6.0
CPU : Intel(R) Core(TM) i5-3330S CPU @ 2.70GHz
GenuineIntel x86, Family 6 Model 58 Stepping 9
Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX1.0

OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl
ar=0.410125 NumCfft=200105 NumGauss=1151864510 NumPulse=226294165221 NumTriplet=452692139687
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 44.13 percent.
Used GPU device parameters are:
Number of compute units: 2
Single buffer allocation size: 128MB
Total device global memory: 512MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: no
period_iterations_num=50
GPU device sync requested... ...GPU device synched
Termination request detected or computations are finished. GPU device synched, exiting...
OpenCL platform detected: Apple
Number of OpenCL devices found : 1
BOINC assigns slot on device #0.
Info: BOINC provided OpenCL device ID used
DOUBLE_FP supported.
cl_khr_fp64 supported.
cl_APPLE_fp64_basic_ops supported.
FERMI : true

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY OCL_CHIRP3 ASYNC_SPIKE FFTW SSE3 64bit
System: Darwin x86_64 Kernel: 15.6.0
CPU : Intel(R) Core(TM) i5-3330S CPU @ 2.70GHz
GenuineIntel x86, Family 6 Model 58 Stepping 9
Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX1.0

OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl
ar=0.410125 NumCfft=200105 NumGauss=1151864510 NumPulse=226294165221 NumTriplet=452692139687
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 66.08 percent.
Used GPU device parameters are:
Number of compute units: 2
Single buffer allocation size: 128MB
Total device global memory: 512MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: no
period_iterations_num=50
Spike: peak=24.11478, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k

Best spike: peak=24.4283, time=33.56, d_freq=1420751144.89, chirp=-0.033273, fft_len=128k
Best autocorr: peak=19.48845, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Best gaussian: peak=2.805941, mean=0.5229927, ChiSq=1.398399, time=12.58, d_freq=1420752531.85,
score=-1.300901, null_hyp=2.174957, chirp=10.797, fft_len=16k
Best pulse: peak=3.245201, time=6.54, period=0.9724, d_freq=1420754890.84, score=0.9951, chirp=-87.382, fft_len=64
Best triplet: peak=0, time=-2.122e+11, period=0, d_freq=0, chirp=0, fft_len=0


Flopcounter: 19269428530118.300781

Spike count: 4
Autocorr count: 1
Pulse count: 0
Triplet count: 0
Gaussian count: 0
Time cpu in use since last restart: 18.1 seconds
GPU device sync requested... ...GPU device synched
16:01:17 (34168): called boinc_finish(0)

</stderr_txt>

PETRI:

<stderr_txt>
setiathome_CUDA: Found 4 CUDA device(s):
Device 1: GeForce GTX 750 Ti, 1998 MiB, regsPerBlock 65536
computeCap 5.0, multiProcs 5
pciBusID = 1, pciSlotID = 0
Device 2: GeForce GTX 750 Ti, 2000 MiB, regsPerBlock 65536
computeCap 5.0, multiProcs 5
pciBusID = 2, pciSlotID = 0
Device 3: GeForce GTX 750 Ti, 2000 MiB, regsPerBlock 65536
computeCap 5.0, multiProcs 5
pciBusID = 4, pciSlotID = 0
Device 4: GeForce GTX 750 Ti, 2000 MiB, regsPerBlock 65536
computeCap 5.0, multiProcs 5
pciBusID = 5, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 2
setiathome_CUDA: CUDA Device 2 specified, checking...
Device 2: GeForce GTX 750 Ti is okay
SETI@home using CUDA accelerated device GeForce GTX 750 Ti
Using pfb = 8 from command line args
Using pfp = 240 from command line args
Using unroll = 12 from command line args

setiathome v8 enhanced x41p_zi3d, Cuda 7.50 special
Compiled with NVCC 8.0, using 6.5 libraries. Modifications done by petri33.



Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is : 0.410125
Sigma 3
Thread call stack limit is: 1k
Spike: peak=24.09493, time=33.55, d_freq=1420751144.89, chirp=-0.028652, fft_len=128k
Spike: peak=24.42814, time=33.55, d_freq=1420751144.89, chirp=-0.033273, fft_len=128k
Spike: peak=24.31745, time=33.55, d_freq=1420751144.88, chirp=-0.037895, fft_len=128k
Autocorr: peak=19.48978, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Spike: peak=24.78424, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE.
13
Best spike: peak=24.78424, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k
Best autocorr: peak=19.48978, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Best gaussian: peak=2.818346, mean=0.5207286, ChiSq=1.416716, time=12.58, d_freq=1420752531.85,
score=-0.9712276, null_hyp=2.204598, chirp=10.797, fft_len=16k
Best pulse: peak=3.243528, time=6.544, period=0.9724, d_freq=1420754890.84, score=0.9946, chirp=-87.382, fft_len=64
Best triplet: peak=0, time=-2.122e+11, period=0, d_freq=0, chirp=0, fft_len=0

Flopcounter: 40132345887159.960938

Spike count: 4
Autocorr count: 1
Pulse count: 0
Triplet count: 0
Gaussian count: 0
09:44:58 (30909): called boinc_finish(0)

</stderr_txt>

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 1813390 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1813394 - Posted: 29 Aug 2016, 9:31:52 UTC - in response to Message 1813390.  

I haven't read through this thread completely, but I want to post my findings.
Look at these results!
To my mind, it looks like Petri's code is more accurate than the Apple code.
...


Exactly. This is one of those rare cases where more broken is better than slightly less broken, just because of the way validation works.

Let's be clear: I'm in full support of Petri's work, however you're wanting to shove unfinished code in for the sake of performance, when the landscape is turd. The stock Apple situation needs resolution before new technologies are introduced. New technologies are not a solution, but an incremental refinement.
ID: 1813394 · Report as offensive
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1813395 - Posted: 29 Aug 2016, 9:39:43 UTC - in response to Message 1813394.  

Let's be clear: I'm in full support of Petri's work, however you're wanting to shove unfinished code in for the sake of performance, when the landscape is turd. The stock Apple situation needs resolution before new technologies are introduced. New technologies are not a solution, but an incremental refinement.


No, not at this point! I just wanted to post my findings and ask why the Apple code is not being targeted as well.
It just seems odd, and this more or less proves that we must dig further into the original code, because one way or another some parts of the optimisation tree seem to differ from the Berkeley-compiled executable with no optimisations in place.

I just wanted to point out that the precision drift seems to have started way back, before he started to improve the code for Nvidia cards, and someone needs to dig deeper to see why it no longer aligns with the original code.

Does someone here have a list of applications that we could run a "checklist" against, ones that come out strongly similar at least 99.95% of the time or so? The best would of course be 100%.

I didn't know that other applications differed so much more than Petri's yet are still considered valid. Or is that Apple code not distributed by the S@H servers?
I'm just confused here now.

ID: 1813395 · Report as offensive
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1813396 - Posted: 29 Aug 2016, 9:42:06 UTC - in response to Message 1813394.  

however you're wanting to shove unfinished code in for the sake of performance


Not this time. Now I'm more concerned than ever that other applications have precision drift as well, and this needs to be addressed properly to pinpoint where it all began, deep within those buried lines of code.

ID: 1813396 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1813397 - Posted: 29 Aug 2016, 9:43:12 UTC - in response to Message 1813395.  

Not only do we need a checklist, we also need the results.sah file the apps produce, so we can compare it against a known valid result file. Having the stderr doesn't help much if it doesn't supply what was found.
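
Until result files are available, the signal lines that do appear in stderr can at least be pulled out mechanically. The following is a stopgap sketch only: `parse_signals` is a hypothetical helper, it assumes the `Kind: key=value, key=value` layout seen in the logs pasted earlier in this thread, and it says nothing about the contents or format of the result file itself.

```python
import re

# Matches reported signals such as "Spike: peak=..., time=...".
# Summary lines like "Best spike: ..." or "Spike count: 4" do not match,
# because the signal name must be followed directly by ": ".
SIGNAL_RE = re.compile(r"^(Spike|Autocorr|Pulse|Triplet|Gaussian): (.+)$")


def parse_signals(stderr_text):
    """Return a list of (kind, fields) tuples, with field values kept as strings."""
    signals = []
    for line in stderr_text.splitlines():
        match = SIGNAL_RE.match(line.strip())
        if not match:
            continue
        kind, body = match.groups()
        fields = dict(
            part.split("=", 1) for part in body.split(", ") if "=" in part
        )
        signals.append((kind, fields))
    return signals


# Sample lines copied from the CPU stderr earlier in the thread.
sample = """Spike: peak=24.09495, time=33.55, d_freq=1420751144.89, chirp=-0.028652, fft_len=128k
Autocorr: peak=19.48975, time=73.82, delay=5.1587, d_freq=1420751429.61, chirp=-7.0919, fft_len=128k
Best spike: peak=24.78413, time=68.79, d_freq=1420750198.94, chirp=-95.62, fft_len=32k
Spike count: 4"""

for kind, fields in parse_signals(sample):
    print(kind, fields["peak"], fields["fft_len"])
```

Values are left as strings so that nothing is lost to float formatting before a comparison is made.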
ID: 1813397 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1813399 - Posted: 29 Aug 2016, 9:53:28 UTC - in response to Message 1813395.  

Yeah, I agree there is chaos/confusion. IMO, just let Richard carry out his mission, so that we have clear decks to make things better.

The confusion (not only yours) comes from the change in telescopes, which the OpenCL apps cope with better, and from Petri running with the newest tech and hardware without experience in how Seti validation works (i.e. 30% inconclusive isn't good enough).

At present, obviously broken apps (like stock Mac) are valuable, in that they give some context to the changes. Since we have that context, and clear representative targets, then development goals are clear. No amount of piling extra crud on the existing situation will help, only confuse it more.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1813399
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1813400 - Posted: 29 Aug 2016, 9:55:17 UTC - in response to Message 1813396.  

however you're wanting to shove unfinished code in for the sakes of performance


Not this time. Now I'm more concerned than ever that other applications have precision drift as well, and this needs to be addressed properly to pinpoint where it all began, deep within those buried lines of code.


Yep. That's partly down to the telescope change, partly to the coders, and partly to too much change at once.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1813400
Kiska
Volunteer tester
Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1813403 - Posted: 29 Aug 2016, 10:00:21 UTC

One piece of good news is that I am setting up a Mac VM, so I will be able to provide some result files :)
ID: 1813403
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1813404 - Posted: 29 Aug 2016, 10:02:07 UTC

Having been watching this thread and seeing how people have pitched in to help out (and I'm really pleased to see that), I think the main issues are turning out to be:

A) Petri's code is good, but it could still be better - the code itself allows some wasteful inconclusive results to demand an extra replication be sent out before validation. But Petri - with help from others and the results reported here - is working on that. It isn't finished yet.

B) The stock Apple applications, and the intel_gpu situation, are a mess - even more so when they combine in the Mac intel_gpu applications.

(B) should remind us that a full science application isn't made up of pure code alone. There's what we would understand as pure coding, and then there is at least one, and for GPU apps at least two, compilation stages. Compiler settings are at least as important as raw code in producing a mathematically-acceptable application.

One of the GPU compilation steps is under our control - the ordinary C-code compiler. My suspicion is that something went wrong with the stock Mac CPU app at this stage - a compiler option set for speed instead of accuracy, perhaps. If we could contact whoever compiled that app in the first place, it could be fixed.
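To show why a speed-oriented setting can change results at all, here is a small sketch in Python (not in the language of the science apps). Floating-point addition is not associative, so a compiler that is allowed to regroup or reorder arithmetic, as options such as GCC's and Clang's -ffast-math permit, can produce slightly different sums from the same source code. Nothing in this thread confirms which options the stock Mac app was actually built with; the snippet only demonstrates the general effect.

```python
import math

# The same three numbers, grouped two different ways.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left, right, left == right)

def naive_sum(values):
    """Plain left-to-right accumulation, as a simple loop would do it."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [0.1] * 10
naive = naive_sum(values)      # accumulates rounding error at each step
exact = math.fsum(values)      # correctly rounded sum, used as a reference
print(naive, exact, naive == exact)
```

Each difference is tiny, but a science app performs a very large number of such operations, and a signal sitting close to a reporting threshold can land on either side of it. That is one plausible way two builds of the same code could disagree at validation.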

The second GPU compilation stage - for OpenCL, at least - is not under our control to the same extent. It happens on the user's computer, automatically, using tools installed on each user's machine as part of the driver package. It seems as if the intel_gpus' problems lie in this area - the intel_gpu applications are problematic at Einstein too, although their apps run fine here with an older driver. I wish the BOINC scientific community was large and coherent enough to form a critical mass that Intel would need to work with - but I don't think we've ever succeeded in reaching that stage.
ID: 1813404
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1813406 - Posted: 29 Aug 2016, 10:08:52 UTC - in response to Message 1813404.  

Pretty sure Petri would be happy with that assessment, as I am. Will do some thinking, and possibly research, on the Intel OpenCL situation. My fear is that Intel's implementation of OpenCL might possibly have been about the (marketing) Flops, more than the actual accurate results. Time will tell.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1813406


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.