Linux CUDA 'Special' App finally available, featuring Low CPU use

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1855894 - Posted: 17 Mar 2017, 4:42:36 UTC - in response to Message 1855882.  

Shame you can't find a cheaper method. For me, I was able to use my Existing DDR2 ram, CPU cooler and Power supply. You can find a Core2 Quad on eBay for around $20. So, all it took to go from a Dual core CPU was the $22 board and $20 CPU. Much cheaper than a new board, cpu, ram, etc. It runs the GPUs just fine, and with the Special App I don't worry much about CPU tasks.

You can edit or create the xorg.conf manually to add Fan control and Overclocking. Open NVIDIA X Server Settings and select X Server Display Configuration. At the bottom select Save to X Configuration File. In the dialog window select Show preview. Saving from there usually fails, so select all the text and copy it instead. Then open a Terminal and enter gksu nautilus, navigate to /etc/X11/xorg.conf, and open it with gedit. Paste the copied Configuration over whatever is there and add the Option "Coolbits" "12" in the location(s) shown below. You have to add the Option to each Screen section:
...
Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "Coolbits" "12"
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "DFP-0"
    Option         "metamodes" "nvidia-auto-select +0+0"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "Coolbits" "12"
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "CRT-0"
    Option         "metamodes" "nvidia-auto-select +0+0"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen2"
    Device         "Device2"
    Monitor        "Monitor2"
    DefaultDepth    24
    Option         "Coolbits" "12"
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "CRT-0"
    Option         "metamodes" "nvidia-auto-select +0+0"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Save the file.
Then restart the machine. Coolbits 12 will give Fan control and OCing.
Works for me.
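
For anyone wondering where the 12 comes from: Coolbits is a bit mask, and per the NVIDIA driver README 4 enables manual fan control, 8 enables the clock offset controls, and 16 enables overvoltage. A minimal sketch that decodes a value (decode_coolbits is a made-up helper for illustration, not part of any NVIDIA tool):

```shell
#!/bin/bash
# Hypothetical helper: list the Coolbits features enabled by a value.
# Bit meanings follow the NVIDIA driver README; only three bits are covered.
decode_coolbits() {
  local value=$1
  if (( value & 4 )); then echo "4: manual fan control"; fi
  if (( value & 8 )); then echo "8: clock offsets (overclocking)"; fi
  if (( value & 16 )); then echo "16: overvoltage"; fi
}

decode_coolbits 12
```

Passing 28 instead would also list the overvoltage line, since 28 = 4 + 8 + 16.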
ID: 1855894 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1855913 - Posted: 17 Mar 2017, 8:03:29 UTC - in response to Message 1855889.  
Last modified: 17 Mar 2017, 8:10:34 UTC

"sudo nvidia-xconfig --thermal-configuration-check --cool-bits=28 --enable-all-gpus"

Try that.

It has x in the config and cool-bits is now 28.
You get options to set fan, gpu, vram, ...


. . Thanks Petri,

. . I should have checked that twice, I missed the x before config.

. . Now I can cool those 1060s :)

. . Oops. That ran but did not open the app to change the fan speed etc. It reported that it used /etc/X11/xorg.conf and added the option thermal configuration check = true to screens 0 & 1 then backed up the file.
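
One way to see what actually ended up in the file is to count the Screen sections and the Coolbits options and check that the numbers agree. A rough sketch, assuming the usual xorg.conf layout; check_coolbits is a hypothetical helper and the sample file is invented for the demo:

```shell
#!/bin/bash
# Hypothetical helper: compare the number of Screen sections in an X config
# file with the number of Coolbits options it contains.
check_coolbits() {
  local conf=$1 screens coolbits
  screens=$(grep -ciE '^[[:space:]]*Section[[:space:]]+"Screen"' "$conf" || true)
  coolbits=$(grep -ciE '^[[:space:]]*Option[[:space:]]+"Coolbits"' "$conf" || true)
  echo "screens=${screens} coolbits=${coolbits}"
}

# Demo on a throwaway sample; on a real machine pass /etc/X11/xorg.conf.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Section "Screen"
    Identifier     "Screen0"
    Option         "Coolbits" "28"
EndSection
Section "Screen"
    Identifier     "Screen1"
EndSection
EOF
check_coolbits "$sample"
rm -f "$sample"
```

If the two counts differ, at least one screen is missing the option.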

. . Is it another command to bring up that lovely nVidia app that lets you set things for the video card and screen?

Stephen

:)
ID: 1855913 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1855992 - Posted: 17 Mar 2017, 15:25:53 UTC - in response to Message 1855867.  

. . Also one other point. You were correct about the problem updating cc_config.xml settings. With this installation in /home/BOINC and that setting in cc_config.xml set to 1 it is now running as 20 and 0 instead of 30 and 10.
Now that it's running at normal priority have you noticed any difference in GPU Utilization in nvidia-smi? On my machine the Arecibo tasks run around the low-90%, the BLC13s run around the mid-90% range with spikes up to 100%.

The x41p_zi3t1b version seems to be working well on both the Ubuntu machines and the Mac. The single GPU machine still has just a handful of legitimate inconclusives while the other two 3 GPU machines finally dropped into the 40s. The Mac caught a number of instant overflows and went back up for now. I suspect it will take over a month before they drop into the 30s as there are a lot of old inconclusives to wait on.
Hmmm, perhaps a new App release tomorrow, maybe even one for OSX.
ID: 1855992 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1856005 - Posted: 17 Mar 2017, 16:15:35 UTC - in response to Message 1855913.  

"sudo nvidia-xconfig --thermal-configuration-check --cool-bits=28 --enable-all-gpus"

Try that.

It has x in the config and cool-bits is now 28.
You get options to set fan, gpu, vram, ...


. . Thanks Petri,

. . I should have checked that twice, I missed the x before config.

. . Now I can cool those 1060s :)

. . Oops. That ran but did not open the app to change the fan speed etc. It reported that it used /etc/X11/xorg.conf and added the option thermal configuration check = true to screens 0 & 1 then backed up the file.

. . Is it another command to bring up that lovely nVidia app that lets you set things for the video card and screen?

Stephen

:)


nvidia-settings is one command to set things. Without options it opens up a window where you can set them manually, but you have to do that again each time the machine is rebooted.

To automate things you can create a script to do something like this:
root@Linux1:~/Downloads/BOINC# cat schedb.sh
#!/bin/bash

/usr/bin/nvidia-smi -pm 1

#gtx 1080s

#prefer fastest mode
nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=1"

#may not work on some GPUs
nvidia-settings -a "[GPU:0]/GPUOverVoltageOffset=16000"

#do some overclocking
/usr/bin/nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[3]=1100" -a "[gpu:0]/GPUGraphicsClockOffset[3]=190"

#set fan speed percentage
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=96"

#set power limit in Watts
/usr/bin/nvidia-smi -i 0 -pl 215

#set application clocks
/usr/bin/nvidia-smi -i 0 -ac 5005,1911

# set another GPU
nvidia-settings -a "[gpu:1]/GPUPowerMizerMode=1"
nvidia-settings -a "[GPU:1]/GPUOverVoltageOffset=16000"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUMemoryTransferRateOffset[3]=1100" -a "[gpu:1]/GPUGraphicsClockOffset[3]=190"
nvidia-settings -a "[gpu:1]/GPUFanControlState=1" -a "[fan:1]/GPUTargetFanSpeed=96"
/usr/bin/nvidia-smi -i 1 -pl 215
/usr/bin/nvidia-smi -i 1 -ac 5005,1911

nvidia-settings -a "[gpu:2]/GPUPowerMizerMode=1"
nvidia-settings -a "[GPU:2]/GPUOverVoltageOffset=16000"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUMemoryTransferRateOffset[3]=1100" -a "[gpu:2]/GPUGraphicsClockOffset[3]=190"
nvidia-settings -a "[gpu:2]/GPUFanControlState=1" -a "[fan:2]/GPUTargetFanSpeed=100"
/usr/bin/nvidia-smi -i 2 -pl 215
/usr/bin/nvidia-smi -i 2 -ac 5005,1911

nvidia-settings -a "[gpu:3]/GPUPowerMizerMode=1"
nvidia-settings -a "[GPU:3]/GPUOverVoltageOffset=16000"
/usr/bin/nvidia-settings -a "[gpu:3]/GPUMemoryTransferRateOffset[3]=1100" -a "[gpu:3]/GPUGraphicsClockOffset[3]=180"
nvidia-settings -a "[gpu:3]/GPUFanControlState=1" -a "[fan:3]/GPUTargetFanSpeed=96"
/usr/bin/nvidia-smi -i 3 -pl 215
/usr/bin/nvidia-smi -i 3 -ac 5005,1911


# gtx980s

#/usr/bin/nvidia-smi -i 0 -pl 230
#/usr/bin/nvidia-smi -i 2 -pl 230
#/usr/bin/nvidia-settings -a "[gpu:1]/GPUMemoryTransferRateOffset[3]=200" -a "[gpu:1]/GPUGraphicsClockOffset[3]=30"
#/usr/bin/nvidia-settings -a "[gpu:2]/GPUMemoryTransferRateOffset[3]=200" -a "[gpu:2]/GPUGraphicsClockOffset[3]=20"
#/usr/bin/nvidia-smi -i 0 -ac 3605,1321
#/usr/bin/nvidia-smi -i 2 -ac 3605,1324

#/usr/bin/nvidia-settings -a "[GPU:1]/GPUOverVoltageOffset=16000"
#/usr/bin/nvidia-settings -a "[GPU:2]/GPUOverVoltageOffset=16000"
#nvidia-settings -a "[gpu:1]/GPUFanControlState=1" -a "[fan:1]/GPUTargetFanSpeed=90"
#nvidia-settings -a "[gpu:2]/GPUFanControlState=1" -a "[fan:2]/GPUTargetFanSpeed=90"





for (( ; ; ))
do
  schedtool -a 1,2,3,4 `pidof setiathome_x41zc_x86_64-pc-linux-gnu_cuda65`
  schedtool -a 1,2,3,4 `pidof setiathome_x41zc_x86_64-pc-linux-gnu_cuda65_v8`
  schedtool -a 1,2,3,4 `pidof ap_7.01r2793_sse3_clGPU_x86_64`
  schedtool -a 6,7,8,9,10,11 `pidof MBv8_8.05r3345_avx_linux64`
  schedtool -a 6,7,8,9,10,11 `pidof setiathome_8.04_i686-pc-linux-gnu`
  schedtool -a 6,7,8,9,10,11 `pidof ap_7.05r2728_avx_linux32e`
  schedtool -a 5 `pidof compiz`
  sleep 2
  rmdir ~petri/Downloads/BOINC/slots/1?*  2>/dev/null
  rmdir ~petri/Downloads/BOINC/slots/2?*  2>/dev/null
  rmdir ~petri/Downloads/BOINC/slots/3?*  2>/dev/null
  rmdir ~petri/Downloads/BOINC/slots/4?*  2>/dev/null
  rmdir ~petri/Downloads/BOINC/slots/5?*  2>/dev/null
  rmdir ~petri/Downloads/BOINC/slots/6?*  2>/dev/null
  rmdir ~petri/Downloads/BOINC/slots/7?*  2>/dev/null
  rmdir ~petri/Downloads/BOINC/slots/8?*  2>/dev/null
  rmdir ~petri/Downloads/BOINC/slots/9?*  2>/dev/null
done
	
The schedtool calls set the apps to run on a specific range of CPUs.
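
When one of those apps is not running, pidof prints nothing and schedtool gets called without a PID. A small sketch of a guard for that case; pin_cmd is a hypothetical helper that only prints the command (a dry run), and the PIDs in the demo are made up:

```shell
#!/bin/bash
# Hypothetical helper: build the schedtool command for a CPU list and
# zero or more PIDs. Prints nothing when no PID is given, i.e. when the
# app is not running.
pin_cmd() {
  local cpus=$1
  shift
  [ "$#" -gt 0 ] || return 0
  echo "schedtool -a ${cpus} $*"
}

# Dry run with made-up PIDs; in the real loop they would come from pidof.
pin_cmd 1,2,3,4 8196 8226
pin_cmd 5
```

The second call prints nothing, so nothing would be executed for an app that is not running.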
The rmdir is a relic to remove any excess slot directories from BOINC folder. I had 600 of them once.
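
The per-GPU blocks in the script repeat the same commands with different numbers, so they could also be generated by one function. A minimal sketch under a few assumptions: gpu_tune_cmds is a hypothetical helper, it only prints the commands rather than running them, fan N is assumed to belong to GPU N as in the script, and the overvoltage and application clock lines are left out for brevity:

```shell
#!/bin/bash
# Hypothetical helper: print the tuning commands for one GPU instead of
# running them, so the list can be reviewed first.
# Arguments: GPU index, graphics clock offset, memory offset, fan %, power limit (W).
# Assumes fan N belongs to GPU N, as in the script above.
gpu_tune_cmds() {
  local gpu=$1 gfx=$2 mem=$3 fan=$4 watts=$5
  printf 'nvidia-settings -a "[gpu:%s]/GPUPowerMizerMode=1"\n' "$gpu"
  printf 'nvidia-settings -a "[gpu:%s]/GPUMemoryTransferRateOffset[3]=%s" -a "[gpu:%s]/GPUGraphicsClockOffset[3]=%s"\n' "$gpu" "$mem" "$gpu" "$gfx"
  printf 'nvidia-settings -a "[gpu:%s]/GPUFanControlState=1" -a "[fan:%s]/GPUTargetFanSpeed=%s"\n' "$gpu" "$gpu" "$fan"
  printf 'nvidia-smi -i %s -pl %s\n' "$gpu" "$watts"
}

# Same values as the GTX 1080 section of the script: GPU 2 runs its fan
# at 100%, GPU 3 uses a 180 graphics offset, the rest use 190 and 96%.
gpu_tune_cmds 0 190 1100 96 215
gpu_tune_cmds 1 190 1100 96 215
gpu_tune_cmds 2 190 1100 100 215
gpu_tune_cmds 3 180 1100 96 215
```

Once the printed list looks right, piping it to bash would apply it.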


To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1856005 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1856010 - Posted: 17 Mar 2017, 16:27:15 UTC - in response to Message 1855992.  

. . Also one other point. You were correct about the problem updating cc_config.xml settings. With this installation in /home/BOINC and that setting in cc_config.xml set to 1 it is now running as 20 and 0 instead of 30 and 10.
Now that it's running at normal priority have you noticed any difference in GPU Utilization in nvidia-smi? On my machine the Arecibo tasks run around the low-90%, the BLC13s run around the mid-90% range with spikes up to 100%.

The x41p_zi3t1b version seems to be working well on both the Ubuntu machines and the Mac. The single GPU machine still has just a handful of legitimate inconclusives while the other two 3 GPU machines finally dropped into the 40s. The Mac caught a number of instant overflows and went back up for now. I suspect it will take over a month before they drop into the 30s as there are a lot of old inconclusives to wait on.
Hmmm, perhaps a new App release tomorrow, maybe even one for OSX.


. . The top output

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
 8196 stephen   20   0 22.642g 620056 327636 R  43.3 15.3   1:28.22 setiathome+ 
 8226 stephen   20   0 22.589g 564216 327304 R  43.3 13.9   0:27.53 setiathome+ 
 2335 stephen   20   0  608916  43772  26960 R  18.6  1.1 103:15.51 boincmgr    
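
For reference, the PR and NI columns are tied together: for ordinary (non-realtime) processes top shows PR as 20 plus the nice value, which is why NI 0 reads as PR 20 and NI 10 would read as PR 30. As a quick sketch (top_pr is a made-up helper name):

```shell
#!/bin/bash
# Hypothetical helper: the PR value top displays for a normal process
# with the given nice value (PR = 20 + NI).
top_pr() {
  echo $(( 20 + $1 ))
}

top_pr 0
top_pr 10
```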


. . The nvidia-smi output

Fri Mar 17 23:04:55 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 0000:01:00.0      On |                  N/A |
| 40%   69C    P2    55W / 120W |   1893MiB /  6068MiB |     80%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 106...  Off  | 0000:02:00.0     Off |                  N/A |
| 31%   58C    P2    55W / 120W |   1730MiB /  6072MiB |     58%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1009    G   /usr/lib/xorg/Xorg                             118MiB |
|    0      2067    G   compiz                                          43MiB |
|    0      8226    C   ...ome_x41p_zi3k+_x86_64-pc-linux-gnu_cuda80  1727MiB |
|    1      8263    C   ...ome_x41p_zi3k+_x86_64-pc-linux-gnu_cuda80  1727MiB |
+-----------------------------------------------------------------------------+


. . So it seems even with those NI values it doesn't come close to utilising them fully.
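
To put a number on that, nvidia-smi can print just the utilization figures with --query-gpu=index,utilization.gpu --format=csv,noheader,nounits. A minimal sketch that averages such output; avg_gpu_util is a hypothetical helper, and the sample lines are simply the two readings from the table above:

```shell
#!/bin/bash
# Hypothetical helper: average GPU utilization (truncated to a whole
# percent) from lines of the form "<index>, <percent>", as printed by
#   nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits
avg_gpu_util() {
  awk -F', *' '{ sum += $2; n++ } END { if (n) printf "%d\n", sum / n }'
}

# Sample input taken from the readings above (80% and 58%).
printf '0, 80\n1, 58\n' | avg_gpu_util
```

On a live machine the sample printf would be replaced by the nvidia-smi query, run in a loop if a longer average is wanted.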

. . FWIW, runtimes for Arecibo normals are 3.5 to 4.25 mins, Blc13's are 4.75 to 5.5 mins, halflings (VHAR) and overflow tasks are from 1.25 to 2.5 mins.

. . Both GPUs are crunching Arecibo tasks (NARAs) ATM so those are the readings for that.

. . Not that it matters but the times are wrong, the clock is running 4hrs and 7 mins behind and I cannot fix it. Not even by selecting the manual setting for the clock.

Stephen

.
ID: 1856010 · Report as offensive
Profile crainey

Joined: 6 Jun 99
Posts: 1
Credit: 32,436,908
RAC: 0
United States
Message 1856027 - Posted: 17 Mar 2017, 17:26:52 UTC

Can anyone tell me where to find the CUDA8 MB Linux apps modified by Petri? I've looked around and can't seem to find where to download them. Thanks!
ID: 1856027 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1856233 - Posted: 18 Mar 2017, 5:53:26 UTC

There is a Crunchers Anonymous link in the first post of this thread, the App can be found there.

Seems the recent x41p_zi3t1b build is producing unmatched Overflows on both platforms with the Arecibo tasks. So far I haven't seen any problems with the x41p_zi3k+ build, so, I guess we'll be staying with the x41p_zi3k+ build for now.
ID: 1856233 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1856253 - Posted: 18 Mar 2017, 8:49:27 UTC - in response to Message 1856233.  

There is a Crunchers Anonymous link in the first post of this thread, the App can be found there.

Seems the recent x41p_zi3t1b build is producing unmatched Overflows on both platforms with the Arecibo tasks. So far I haven't seen any problems with the x41p_zi3k+ build, so, I guess we'll be staying with the x41p_zi3k+ build for now.


An overflow is an overflow (noisy packet). It may contain 60 spikes, autocorrelations and pulses. They are searched in different work queues on GPU. Results are checked and reported. A different implementation (parallel) can find them in any order. No order can be said best.

The rate of inconclusive results can be found here: http://setiathome.berkeley.edu/results.php?hostid=7475713&offset=0&show_names=0&state=4&appid=29

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1856253 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1856258 - Posted: 18 Mar 2017, 9:36:01 UTC - in response to Message 1856253.  

An overflow is an overflow (noisy packet). It may contain 60 spikes, autocorrelations and pulses. They are searched in different work queues on GPU. Results are checked and reported. A different implementation (parallel) can find them in any order. No order can be said best.

But if they can be reported in an order that matches what is considered to be the reference application, then Inconclusives are significantly reduced and the load on the database is reduced as the WU doesn't need processing once more & people get their Credit sooner.
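
As a toy illustration of the idea (this is not the app's actual code): if each signal carries the position at which a serial search would have reached it, a parallel run can sort on that position before cutting the list off, and it then reports the same set as the serial run. The positions, labels and the first_in_serial_order name are all invented for the example:

```shell
#!/bin/bash
# Hypothetical helper: read lines of the form "<serial_position> <signal_type>"
# in whatever order the parallel workers produced them, and keep the first
# <limit> signals in the order a serial search would have found them.
first_in_serial_order() {
  local limit=$1
  sort -n -k1,1 | head -n "$limit"
}

# Four signals found out of order, with a report limit of three.
printf '%s\n' '7 spike' '2 autocorr' '5 pulse' '1 spike' | first_in_serial_order 3
```

Without the sort, the signal at position 7 would have been reported and the one at position 1 dropped, which is the kind of difference that shows up as an Inconclusive.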
Grant
Darwin NT
ID: 1856258 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1856271 - Posted: 18 Mar 2017, 11:24:31 UTC
Last modified: 18 Mar 2017, 11:25:42 UTC

As always, start with reproducing under bench with precisely the same parameters against stock/win32 CPU reference, then again with unroll set to 1. Assuming you reproduce the mismatch with higher unroll, and it then matches without unroll, then you just confirm what we already know. The Application is fast but broken, as it does not match the serial variants. Anyone that claims it is OK to choose whatever signals they like from the full dataset has no idea what they are talking about, and you should point them out so I can ridicule them with computer science.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1856271 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1856272 - Posted: 18 Mar 2017, 11:35:25 UTC - in response to Message 1856253.  

@petri33. It is still not OK to mismatch serial CPU on the overflows. CPU serial has no choice, but you do, and you are choosing to be lazy and not do a full reduction. Do it properly or don't bother contributing further.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1856272 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1856273 - Posted: 18 Mar 2017, 11:40:40 UTC - in response to Message 1856271.  

As always, start with reproducing under bench with precisely the same parameters against stock/win32 CPU reference, then again with unroll set to 1. Assuming you reproduce the mismatch with higher unroll, and it then matches without unroll, then you just confirm what we already know. The Application is fast but broken, as it does not match the serial variants. Anyone that claims it is OK to choose whatever signals they like from the full dataset has no idea what they are talking about, and you should point them out so I can ridicule them with computer science.


a) If a packet is full of crap and the decision is made solely based on number of signals found, say 30, then it is completely irrelevant what sort of crap the packet contains.
b) If a packet has an acceptable amount of signals they must be reported correctly.
c) If a packet does not have reportable signals, finding best non reportable wastes cycles on cosmetics.

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1856273 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1856274 - Posted: 18 Mar 2017, 11:41:44 UTC - in response to Message 1856273.  

As always, start with reproducing under bench with precisely the same parameters against stock/win32 CPU reference, then again with unroll set to 1. Assuming you reproduce the mismatch with higher unroll, and it then matches without unroll, then you just confirm what we already know. The Application is fast but broken, as it does not match the serial variants. Anyone that claims it is OK to choose whatever signals they like from the full dataset has no idea what they are talking about, and you should point them out so I can ridicule them with computer science.


a) If a packet is full of crap and the decision is made solely based on number of signals found, say 30, then it is completely irrelevant what sort of crap the packet contains.
b) If a packet has an acceptable amount of signals they must be reported correctly.
c) If a packet does not have reportable signals, finding best non reportable wastes cycles on cosmetics.

Petri


You are wrong. Make your code match serial CPU on overflows or I will abandon it entirely. Your Choice.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1856274 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1856275 - Posted: 18 Mar 2017, 11:44:18 UTC - in response to Message 1856274.  


You are wrong. Make your code match serial CPU on overflows or I will abandon it entirely. Your Choice.


:)
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1856275 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1856276 - Posted: 18 Mar 2017, 11:45:05 UTC - in response to Message 1856275.  


You are wrong. Make your code match serial CPU on overflows or I will abandon it entirely. Your Choice.


:)


Good to see we're on the same page ;)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1856276 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1856278 - Posted: 18 Mar 2017, 11:47:52 UTC - in response to Message 1856253.  
Last modified: 18 Mar 2017, 12:01:23 UTC

There is a Crunchers Anonymous link in the first post of this thread, the App can be found there.

Seems the recent x41p_zi3t1b build is producing unmatched Overflows on both platforms with the Arecibo tasks. So far I haven't seen any problems with the x41p_zi3k+ build, so, I guess we'll be staying with the x41p_zi3k+ build for now.


An overflow is an overflow (noisy packet). It may contain 60 spikes, autocorrelations and pulses. They are searched in different work queues on GPU. Results are checked and reported. A different implementation (parallel) can find them in any order. No order can be said best.

The rate of inconclusive results can be found here: http://setiathome.berkeley.edu/results.php?hostid=7475713&offset=0&show_names=0&state=4&appid=29

Petri

What I meant by Unmatched Overflow is that the zi3t1b App overflowed but the WingPeople didn't. Sorta like the problems with the last few versions.
http://setiathome.berkeley.edu/results.php?hostid=6796479&state=5
https://setiathome.berkeley.edu/results.php?hostid=7769537&state=5
http://setiathome.berkeley.edu/workunit.php?wuid=2471201891
Three different machines, same problem. The difference is the previous versions Overflowed on the VLARs whereas these are Arecibos.

I've been testing different versions on the Mac and it appears the current setiathome_x41p_zi3k+_x86_64-apple-darwin_cuda80 build, using the CUDA 7.5 libraries, is the best. I'll see about replacing the old Special_CUDA75-ComputeCode3.2+ at C.A. with the newer version.
ID: 1856278 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1856279 - Posted: 18 Mar 2017, 11:51:09 UTC - in response to Message 1856278.  

There is a Crunchers Anonymous link in the first post of this thread, the App can be found there.

Seems the recent x41p_zi3t1b build is producing unmatched Overflows on both platforms with the Arecibo tasks. So far I haven't seen any problems with the x41p_zi3k+ build, so, I guess we'll be staying with the x41p_zi3k+ build for now.


An overflow is an overflow (noisy packet). It may contain 60 spikes, autocorrelations and pulses. They are searched in different work queues on GPU. Results are checked and reported. A different implementation (parallel) can find them in any order. No order can be said best.

The rate of inconclusive results can be found here: http://setiathome.berkeley.edu/results.php?hostid=7475713&offset=0&show_names=0&state=4&appid=29

Petri

What I meant by Unmatched Overflow is that the zi3t1b App overflowed but the WingPeople didn't. Sorta like the problems with the last few versions.
http://setiathome.berkeley.edu/results.php?hostid=6796479&state=5
https://setiathome.berkeley.edu/results.php?hostid=7769537&state=5
http://setiathome.berkeley.edu/workunit.php?wuid=2471201891
Three different machines, same problem. The difference is the previous versions Overflowed on the VLARs whereas these are Arecibos.


Both Petri and I know exactly what it is, and I believe we have an understanding of how it's going to go.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1856279 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1856284 - Posted: 18 Mar 2017, 12:13:55 UTC - in response to Message 1856278.  


What I meant by Unmatched Overflow is that the zi3t1b App overflowed but the WingPeople didn't. Sorta like the problems with the last few versions.
http://setiathome.berkeley.edu/results.php?hostid=6796479&state=5
https://setiathome.berkeley.edu/results.php?hostid=7769537&state=5
http://setiathome.berkeley.edu/workunit.php?wuid=2471201891
Three different machines, same problem. The difference is the previous versions Overflowed on the VLARs whereas these are Arecibos.


Hi Jason,

Those are good examples of an error in the program. That executable reports finding spikes when it clearly should not.
I'll send you and TBar a link to the latest source code. A number of people are running more recent executables without errors.
The latest source has an option to set -unroll autotune.

usage: exename -unroll autotune -pfb M
where M is 4, 8, 16 ... or something.

To me an unmatched overflow is when my version reports 7 autocorr and 23 spikes and the wingman reports 25 spikes and 5 autocorr or vice versa: both 30 signals, i.e. a bad packet.
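
A small sketch of that distinction. It only compares counts, and it is an assumption made for illustration, not how the project's validator actually works; compare_overflow is a hypothetical helper:

```shell
#!/bin/bash
# Hypothetical helper: compare two results by their spike and autocorr counts.
# Arguments: my spikes, my autocorr, wingman spikes, wingman autocorr.
compare_overflow() {
  local total_a=$(( $1 + $2 )) total_b=$(( $3 + $4 ))
  if [ "$1" -eq "$3" ] && [ "$2" -eq "$4" ]; then
    echo "match"
  elif [ "$total_a" -eq "$total_b" ]; then
    echo "same total, different breakdown"
  else
    echo "different totals"
  fi
}

# The example from the post: 23 spikes + 7 autocorr vs 25 spikes + 5 autocorr.
compare_overflow 23 7 25 5
```

Both sides hit 30 signals here, so both overflowed, but the breakdown differs.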
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1856284 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1856285 - Posted: 18 Mar 2017, 12:17:14 UTC - in response to Message 1856284.  
Last modified: 18 Mar 2017, 12:18:27 UTC


To me an unmatched overflow is when my version reports 7 autocorr and 23 spikes and the wingman reports 25 spikes and 5 autocorr or vice versa: both 30 signals, i.e. a bad packet.


Good, keep that to work with and please email also. No reason not to match CPU reference other than dirty parallelism with no reductions. [or faulty reductions...]
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1856285 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1856287 - Posted: 18 Mar 2017, 12:22:32 UTC - in response to Message 1856285.  


To me an unmatched overflow is when my version reports 7 autocorr and 23 spikes and the wingman reports 25 spikes and 5 autocorr or vice versa: both 30 signals, i.e. a bad packet.


Good, keep that to work with and please email also. No reason not to match CPU reference other than dirty parallelism with no reductions. [or faulty reductions...]


PM sent.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1856287 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.