Setting up Linux to crunch CUDA90 and above for Windows users

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 11776
Credit: 1,160,866,277
Recent average credit: 1,873
United States
Message 1951020 - Posted: 21 Aug 2018, 2:09:27 UTC - in response to Message 1951018.

Interesting to hear about the in-place system upgrade not working. I have been offered the upgrade to 18.04 on both my 16.04 systems within the last day. I've said no to the upgrade until after the WoW contest, and now have doubts about giving it a try. I think I will back up my BOINC folder and move it off the OS drive before I accept the upgrade installation. If it goes wrong, I can always do a clean install from a flash drive, unpack my saved BOINC folder, and be up and running again within an hour. That is my future plan at least.
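
The backup step in this plan can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `backup_boinc_folder` and `restore_boinc_folder` are made up here, the location of the BOINC folder differs between installs, and BOINC should be stopped first so that files are not changing while they are archived.

```python
import tarfile
from pathlib import Path


def backup_boinc_folder(boinc_dir, archive_path):
    """Pack boinc_dir into a gzip-compressed tar archive and return its path."""
    boinc_dir = Path(boinc_dir)
    archive_path = Path(archive_path)
    if not boinc_dir.is_dir():
        raise FileNotFoundError(f"BOINC folder not found: {boinc_dir}")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the folder name, so the archive unpacks
        # cleanly under whatever directory it is restored into.
        tar.add(boinc_dir, arcname=boinc_dir.name)
    return archive_path


def restore_boinc_folder(archive_path, target_dir):
    """Unpack an archive made by backup_boinc_folder into target_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)


# Hypothetical usage (both paths are placeholders, not from the thread):
#   backup_boinc_folder("/home/me/BOINC", "/mnt/usb/boinc-backup.tar.gz")
#   restore_boinc_folder("/mnt/usb/boinc-backup.tar.gz", "/home/me")
```

Writing the archive to a drive other than the OS drive is what makes the clean-install fallback quick.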
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1951020
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
Recent average credit: 2,768
United States
Message 1951018 - Posted: 21 Aug 2018, 2:02:21 UTC - in response to Message 1950992.
Last modified: 21 Aug 2018, 2:07:55 UTC

I haven't had any luck trying an in-place upgrade either. It took forever and then had numerous problems afterwards. The best method I've found is to have a separate Home partition with several System partitions and one Swap partition. That way your Home folder stays the same and the only change is to the System Partition. Say, One 70 GB Home Partition, One 16 GB Swap Partition, and Three 50 GB System Partitions. To install another System, select 'Something Else' in the installer, then choose the Home Partition without formatting, the Swap Partition, and the System Partition with formatting. You can have 3 different Systems all using the same Home and Swap Partitions. If one System has troubles, you can just Format and install a new System while your Home Folder remains unchanged. Since I keep my BOINC folder in my Home folder, it's very fast and easy to swap out to a clean system. It also helps to be able to boot to the Development System, Compile an App in 14.04.1, then boot to 16.04 and test it.
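
As a rough check of how much disk this example layout needs, the sizes can simply be added up. The dictionary below only restates the numbers from the post; the partition labels are made up for illustration.

```python
# Partition sizes in GB, taken from the example layout in the post.
# The labels are illustrative, not real device names.
layout = {
    "home": 70,
    "swap": 16,
    "system_1": 50,
    "system_2": 50,
    "system_3": 50,
}

total_gb = sum(layout.values())
print(f"Total space needed: {total_gb} GB")  # 70 + 16 + 3 * 50 = 236 GB
```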
ID: 1951018
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 11776
Credit: 1,160,866,277
Recent average credit: 1,873
United States
Message 1951016 - Posted: 21 Aug 2018, 1:54:38 UTC - in response to Message 1950992.

[quote][quote]Fortunately, I haven't seen any reports claiming "V0.97 Ate My 'Puter",[/quote]
LMAO, hand up! Sort of ...
I tried the Ubuntu 14 ->> 16 update from the OS, and it wouldn't start back up ... continuous dot dot dot dot dot ... I thought it was a post install so let it sit for 90m and no go, rebooted and the same for 30 minutes ... time to pull the plug and re-image from the backup I made. What a waste of 8 bloody hours of backups/downloads/installing.
Next try will be 18 from a DVD ... later ...[/quote]

Almost positive you could have avoided that with a change to the grub file in the installer to add nomodeset to the kernel command line.
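
For readers who have not done this before, the change amounts to adding one word to the `linux` line of the boot entry. The sketch below shows that edit as a plain string operation. The helper name `add_nomodeset` and the example line are assumptions for illustration, not taken from any particular installer image, and the handling of `---` follows the commonly given advice for Ubuntu installer entries rather than anything stated in this thread.

```python
def add_nomodeset(linux_line):
    """Return linux_line with 'nomodeset' added, unless it is already there."""
    words = linux_line.split()
    if "nomodeset" in words:
        return linux_line
    if "---" in words:
        # Ubuntu installer entries usually end with '---'; the usual advice
        # is to put options for the installer's own boot before it.
        words.insert(words.index("---"), "nomodeset")
    else:
        words.append("nomodeset")
    # Note: split/join collapses runs of whitespace in the line.
    return " ".join(words)


# Illustrative boot line only; the real one depends on the installer image.
example = "linux /casper/vmlinuz boot=casper quiet splash ---"
print(add_nomodeset(example))
```

In practice the same edit is made by hand in the GRUB editor before booting the installer.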
ID: 1951016
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
Recent average credit: 835
Canada
Message 1950992 - Posted: 21 Aug 2018, 0:20:06 UTC - in response to Message 1950989.

[quote]Fortunately, I haven't seen any reports claiming "V0.97 Ate My 'Puter",[/quote]
LMAO, hand up! Sort of ...
I tried the Ubuntu 14 ->> 16 update from the OS, and it wouldn't start back up ... continuous dot dot dot dot dot ... I thought it was a post install so let it sit for 90m and no go, rebooted and the same for 30 minutes ... time to pull the plug and re-image from the backup I made. What a waste of 8 bloody hours of backups/downloads/installing.
Next try will be 18 from a DVD ... later ...
ID: 1950992
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
Recent average credit: 2,768
United States
Message 1950989 - Posted: 21 Aug 2018, 0:07:14 UTC - in response to Message 1950977.
Last modified: 21 Aug 2018, 0:08:10 UTC

So he installed the Same System again? Expecting Different Results this time? Hopefully it will work out....
Which System/Kernel is he having trouble with? I'll mark that down for future reference.

Stephen, you appear to be a little behind. 0.96 had problems from Day One with the Arecibo tasks. Finally it was put to rest after a horde of Shorties. See, V0.96 Ate My Shorties...Just Like TBar Said it Would.
It was replaced by the Bug Fix V0.97 on Saturday. Also on Saturday We had a large number of Volunteer Alpha Testers step up to test the Bug Fix Release...they probably didn't know they were Alpha testers. Fortunately, I haven't seen any reports claiming "V0.97 Ate My 'Puter", so, we are Much further along with testing than normal for Two Days. I will probably post the 0.97 version for Ubuntu 14.04 soon, along with the Pascal version of 0.97 which seems to need 16.04.

So far, the only troubling thing I've seen is 0.97 sometimes reports One less Pulse than the other App. Troubling 'cause I don't see any reason for it unlike the difference in Pulses seen with zi3v. zi3v would sometimes report an additional Pulse, but, it was a Pulse with a Score of Exactly One which the other Apps weren't reporting.
ID: 1950989
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 11776
Credit: 1,160,866,277
Recent average credit: 1,873
United States
Message 1950977 - Posted: 20 Aug 2018, 23:20:30 UTC - in response to Message 1950972.

He was able to install the 396.51 drivers after a complete distribution re-install. So whatever was hanging up the driver installation got sorted out on the new installation of the OS. Didn't help that earlier this morning the ppa server and ppa key server were unavailable for an hour. Back up again thankfully. So he got another host up on the 0.97 special app.

Looks like he determined that the casita router or switch was buckling under the traffic load, and could only support two systems, because the systems that couldn't connect to the internet in the casita work fine from the house. I told him to get a better unmanaged switch for the casita.
ID: 1950977
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5398
Credit: 192,787,363
Recent average credit: 628
Australia
Message 1950974 - Posted: 20 Aug 2018, 22:41:39 UTC - in response to Message 1950935.

[quote][quote]@ TBar

. . I believe you said you have a version of 0.96 that was compiled on a machine with a compatible software environment to work on this rig? If so is there a link to download it?

Stephen

?[/quote]

This machine is running Ubuntu 14.04.1 running the software received around Noon on Saturday, https://setiathome.berkeley.edu/show_host_detail.php?hostid=6906726
This is a recent result on an Arecibo VLAR, https://setiathome.berkeley.edu/result.php?resultid=6906865500
From Noon on Saturday to Noon on Monday is TWO Days, do you think it is Safe to release Software after TWO days of Testing?
I suppose you can argue it's more Safe than releasing it with Zero Days testing, as was done Saturday, but, I don't know of anyone else that releases Software with ZERO Days of testing.
Most of the time the testing is measured in Weeks. So, do you think Two days is enough?[/quote]

. . Hi TBar,

. . I was actually asking about the 0.96 version which I believe you have been running for a while now, sorry about the confusion. As to testing time? No, empirically an app should be tested for more than a couple of days on a single machine before being released to the general public. But if you are asking would I take the gamble and try it? Of course I would :) OK I will have to wait a while longer, but your previous message about getting CUDA92 compatible drivers loaded had me drooling ...

Stephen

:(
ID: 1950974
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5398
Credit: 192,787,363
Recent average credit: 628
Australia
Message 1950972 - Posted: 20 Aug 2018, 22:27:31 UTC - in response to Message 1950921.

[quote]Just tried to help Zalster install the 396.51 drivers via the .run file in a root recovery terminal. It met with disaster. The .run file installation detected errors, was given permission to fix them, did so, and then just threw out a bug error and corrupted the file system. The system won't load anymore. Will have to do a complete reinstall again.

Only reason this was attempted is that the system complained about installing the 396.51 drivers from the Software Updater.[/quote]


. . That's bad news, it explains why his numbers have dropped off a bit though ... :(

Stephen

:(
ID: 1950972
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
Recent average credit: 2,768
United States
Message 1950935 - Posted: 20 Aug 2018, 18:52:23 UTC - in response to Message 1950902.

[quote]@ TBar

. . I believe you said you have a version of 0.96 that was compiled on a machine with a compatible software environment to work on this rig? If so is there a link to download it?

Stephen

?[/quote]

This machine is running Ubuntu 14.04.1 running the software received around Noon on Saturday, https://setiathome.berkeley.edu/show_host_detail.php?hostid=6906726
This is a recent result on an Arecibo VLAR, https://setiathome.berkeley.edu/result.php?resultid=6906865500
From Noon on Saturday to Noon on Monday is TWO Days, do you think it is Safe to release Software after TWO days of Testing?
I suppose you can argue it's more Safe than releasing it with Zero Days testing, as was done Saturday, but, I don't know of anyone else that releases Software with ZERO Days of testing.
Most of the time the testing is measured in Weeks. So, do you think Two days is enough?
ID: 1950935
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
Recent average credit: 2,768
United States
Message 1950925 - Posted: 20 Aug 2018, 17:26:48 UTC - in response to Message 1950921.
Last modified: 20 Aug 2018, 18:16:05 UTC

It sounds as though whatever System was installed simply doesn't like that machine. I do hope you try a Different System this time. There are a large number of choices; you don't have to keep installing the same one expecting different results. There are even point updates that lock the Kernel at a certain level so it never changes beyond that level. All you need to know is which one works best on that hardware. I have the 14.04.1 version installed, and the kernel Never changes from 3.13.xx. You could start with one and work through them until you find one that works. I see four different ones here: start with 16.04.1, then maybe 16.04.2, 16.04.3....16.04.4, http://old-releases.ubuntu.com/releases/xenial/
Or maybe start with 16.04.4, if no problems stay there. If problems, try 16.04.3....
ID: 1950925
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 11776
Credit: 1,160,866,277
Recent average credit: 1,873
United States
Message 1950921 - Posted: 20 Aug 2018, 16:33:22 UTC

Just tried to help Zalster install the 396.51 drivers via the .run file in a root recovery terminal. It met with disaster. The .run file installation detected errors, was given permission to fix them, did so, and then just threw out a bug error and corrupted the file system. The system won't load anymore. Will have to do a complete reinstall again.

Only reason this was attempted is that the system complained about installing the 396.51 drivers from the Software Updater.
ID: 1950921
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5398
Credit: 192,787,363
Recent average credit: 628
Australia
Message 1950908 - Posted: 20 Aug 2018, 14:46:54 UTC - in response to Message 1950889.

[quote]Load balancing on the cpu/gpu.

When I run each of two video cards (gtx 750Ti's) with one core / gpu I get a cpu usage rate of ~80%. When I back it off to say 0.33 of a cpu (I am using app_config.xml for this) per gpu, then it pegs at 100%.

I have read someplace that using a core / card under this application is 3%-10% faster.

Any guidance?

Tom[/quote]

. . Hi Tom,

. . When you set the CPU usage in BOINC Manager it can behave strangely if you use the wrong values in app_info (or app_config). If you set Manager to use, say, 89% of the CPU cores, leaving one to support the GPU, but then tell it in app_info to use 1 CPU core per GPU task, it might crunch on only 6 cores instead of 7. If you set the value in app_config to 0.99 instead, it will happily use all 7 as you intended. If you are running multiple GPUs and set Manager to 75% but set the value over 0.5 in app_info, it will use only 5 cores, not 6, while with 0.49 or less it will again happily use 6 as intended.

. . I take it you are running CUDA 90 with the default of BS on? Then setting Manager to 89% and app_info to 0.49 or less will see crunching on 7 cores, with the remaining core supporting one task on each GPU. You can expect to find the CPU fully utilised or perhaps even a bit overcommitted in this state.

. . If you run with BS off (using the -nobs option in the commandline section of app_info) then you should heed TBar's warnings about severe overcommit on the CPUs supporting the task on each GPU. It would be wise to leave one core free for each GPU, plus one for safety, to avoid this problem. Using -nobs gives a noticeable but not huge increase in performance. It can be worth it if your CPU cores are not fast processors when crunching.
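
The core counts in these examples can be reproduced with a small toy model: take the number of cores BOINC is allowed to use, then subtract the whole cores reserved by the running GPU tasks. This is only a sketch fitted to the numbers in the post, not BOINC's actual scheduling code. The 8-thread CPU is inferred from the "6 instead of 7 at 89%" example, and the function name `cpu_tasks_running` is made up.

```python
import math


def cpu_tasks_running(n_threads, cpu_percent, gpu_cpu_usages):
    """Toy model: CPU tasks left after whole cores are reserved for GPU tasks."""
    # Cores BOINC may use under the "use at most N% of the CPUs" setting.
    allowed = n_threads * cpu_percent // 100
    # Only the whole-number part of the summed per-task CPU usage is reserved.
    reserved = math.floor(sum(gpu_cpu_usages))
    return max(allowed - reserved, 0)


# One GPU task, 8 threads, 89% of CPUs:
print(cpu_tasks_running(8, 89, [1.0]))         # 6
print(cpu_tasks_running(8, 89, [0.99]))        # 7
# Two GPU tasks, 8 threads, 75% of CPUs:
print(cpu_tasks_running(8, 75, [0.51, 0.51]))  # 5
print(cpu_tasks_running(8, 75, [0.49, 0.49]))  # 6
```

Under this model the per-task value only matters once the sum across running GPU tasks reaches a whole number, which is why 0.99 and 0.49 behave differently from 1.0 and 0.51.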

Stephen

.
ID: 1950908
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5398
Credit: 192,787,363
Recent average credit: 628
Australia
Message 1950902 - Posted: 20 Aug 2018, 14:15:58 UTC - in response to Message 1950842.

@ TBar

. . I believe you said you have a version of 0.96 that was compiled on a machine with a compatible software environment to work on this rig? If so is there a link to download it?

Stephen

?
ID: 1950902
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 4973
Credit: 276,046,078
Recent average credit: 462
Message 1950889 - Posted: 20 Aug 2018, 12:47:21 UTC

Load balancing on the cpu/gpu.

When I run each of two video cards (GTX 750 Ti's) with one core per GPU I get a CPU usage rate of ~80%. When I back it off to, say, 0.33 of a CPU per GPU (I am using app_config.xml for this), it pegs at 100%.

I have read someplace that using a core / card under this application is 3%-10% faster.

Any guidance?

Tom
A proud member of the OFA (Old Farts Association).
A candidate for membership in the WWA (Walking Wounded Association).
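
For reference, the kind of app_config.xml setting Tom describes can be generated with a short Python script. This is a hedged sketch: the element names follow BOINC's documented app_config.xml layout, but the application name `setiathome_v8` and the helper `build_app_config` are assumptions for illustration, and the name must match the app actually installed on the host.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage, cpu_usage):
    """Return app_config.xml text reserving cpu_usage CPUs per GPU task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # gpu_usage 1.0 means one task per GPU; cpu_usage is the CPU share
    # budgeted for each of those GPU tasks.
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# 0.33 is the per-GPU CPU share from Tom's post; the app name is assumed.
xml_text = build_app_config("setiathome_v8", 1.0, 0.33)
print(xml_text)
```

The resulting text would go in an app_config.xml file in the project's directory, after which the client has to re-read its config files.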
ID: 1950889
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5398
Credit: 192,787,363
Recent average credit: 628
Australia
Message 1950879 - Posted: 20 Aug 2018, 11:56:21 UTC - in response to Message 1950833.

[quote]I am pleased to announce that my I7-3770 machine (#ID: 8564832 is the Linux version) has successfully undergone dual boot upgrade and is happily munching along at the moment. I will have to boot back into Windows to get the rest of its tasks processed out.

The "au" tasks make it harder to see that the Gtx 750Ti is running a bunch faster but it is clear that the cpu is running its tasks faster than another box I have which doesn't have the AVX instruction set.

I am very pleased, even if I am running a version that is 2-3 releases behind. :)

Tom[/quote]


. . I am running zi3v CUDA80 and pleased with the results. Still I would like to get 0.97 running :)

. . MORE POWER ! ugh! ugh! ugh! (I watched Home Improvements too much)

Stephen

:)
ID: 1950879
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5398
Credit: 192,787,363
Recent average credit: 628
Australia
Message 1950878 - Posted: 20 Aug 2018, 11:53:02 UTC

@ Raistmer

. . Have you succeeded with that portable flashdrive based Linux setup yet?

. .I am very distracted trying to sort out my setup woes ATM, I hope you are up and running the way you want very soon.

Stephen

:)
ID: 1950878
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
Recent average credit: 2,768
United States
Message 1950874 - Posted: 20 Aug 2018, 10:27:45 UTC - in response to Message 1950869.
Last modified: 20 Aug 2018, 10:51:32 UTC

[quote]Looks like I have some computers to update before the fun starts ...[/quote]
Either that or you can wait for me to post the App compiled in Ubuntu 14.04.1 like I've been doing. The only reason those machines are working now is I compiled all the earlier Apps on 14.04.1, otherwise, you would have had the same problem from the get-go. I've tested the two Petri Apps on my 1050 against the CUDA 9.1 App and the difference on the BLC tasks is around 1 or 2%. So, if you have a Pascal card in 16.04 it will be about 1.5% faster than a Maxwell in Ubuntu 14.04.1. I think most people with the older systems can live with only being 48.5% faster rather than 50% faster than zi3v.
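
The "Speed compared to default" figures in the benchmark runs quoted in this post look like a simple ratio of elapsed times, rounded to a whole percent. That reading is an assumption based on the numbers alone, not on the benchmark script itself, and the helper name `speed_percent` is made up.

```python
def speed_percent(default_seconds, candidate_seconds):
    """Speed of a candidate app relative to the default app, in percent."""
    return round(default_seconds / candidate_seconds * 100)


# Elapsed times from the two completed work units in this post.
print(speed_percent(193, 189))  # 102
print(speed_percent(228, 225))  # 101
```

Both results fall in the 1 to 2% range described above.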

Here is the last run after yet another useless tweak. The times are pretty much consistent using different settings;
Current WU: blc01_2bit_guppi_58137_29542_HIP45689_0020.26400.818.21.44.80.vlar.wu
----------------------------------------------------------------
Running default app with command :... setiathome_x41p_V0.97_x86_64-pc-linux-gnu_cuda91 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 58.426357 58 94.732101
Gauss: start 58 stop 6 len -52
Sigma > GaussTOffsetStop: 58 > 6
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bB.....bB............bB............................................................................bB..................................bB..................................................FFtLen : spike gauss autocorr triplet pulse
           1:            0            0            0            0            0
           2:            0            0            0            0            0
           4:            0            0            0            0            0
           8:            9            0            0            9            0
          16:           19            0            0           19            0
          32:           39            0            0           39           39
          64:           77            0            0           77           77
         128:          153            0            0          153          153
         256:          307            0            0          307          307
         512:          613            0            0          613          613
        1024:         1225            0            0         1225         1225
        2048:         2449            0            0         2449         2449
        4096:         4897            0            0         4897         4897
        8192:         9793            0            0         9793         9793
       16384:         2463            0            0         2463            0
       32768:         9849            0            0         9849            0
       65536:        11817            0            0        11817            0
      131072:        47271            0        47272            0            0

Best scores written
Out file closed
Cuda free done
Cuda device reset done
Elapsed Time: ....................... 193 seconds

----------------------------------------------------------------
Running app with command : .......... setiV0.97.linux_x64_10x0 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 58.426357 58 94.732101
Gauss: start 58 stop 6 len -52
Sigma > GaussTOffsetStop: 58 > 6
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bB.....bB............bB............................................................................bB..................................bB..................................................FFtLen : spike gauss autocorr triplet pulse
           1:            0            0            0            0            0
           2:            0            0            0            0            0
           4:            0            0            0            0            0
           8:            9            0            0            9            0
          16:           19            0            0           19            0
          32:           39            0            0           39           39
          64:           77            0            0           77           77
         128:          153            0            0          153          153
         256:          307            0            0          307          307
         512:          613            0            0          613          613
        1024:         1225            0            0         1225         1225
        2048:         2449            0            0         2449         2449
        4096:         4897            0            0         4897         4897
        8192:         9793            0            0         9793         9793
       16384:         2463            0            0         2463            0
       32768:         9849            0            0         9849            0
       65536:        11817            0            0        11817            0
      131072:        47271            0        47272            0            0

Best scores written
Out file closed
Cuda free done
Cuda device reset done
Elapsed Time : ...................... 189 seconds
Speed compared to default : ......... 102 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 100.0%
----------------------------------------------------------------
Running app with command : .......... setiV0.97.multi_sm.linux_X86_64_cuda92 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 58.426357 58 94.732101
Gauss: start 58 stop 6 len -52
Sigma > GaussTOffsetStop: 58 > 6
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bB.....bB............bB............................................................................bB..................................bB..................................................FFtLen : spike gauss autocorr triplet pulse
           1:            0            0            0            0            0
           2:            0            0            0            0            0
           4:            0            0            0            0            0
           8:            9            0            0            9            0
          16:           19            0            0           19            0
          32:           39            0            0           39           39
          64:           77            0            0           77           77
         128:          153            0            0          153          153
         256:          307            0            0          307          307
         512:          613            0            0          613          613
        1024:         1225            0            0         1225         1225
        2048:         2449            0            0         2449         2449
        4096:         4897            0            0         4897         4897
        8192:         9793            0            0         9793         9793
       16384:         2463            0            0         2463            0
       32768:         9849            0            0         9849            0
       65536:        11817            0            0        11817            0
      131072:        47271            0        47272            0            0

Best scores written
Out file closed
Cuda free done
Cuda device reset done
Elapsed Time : ...................... 189 seconds
Speed compared to default : ......... 102 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 100.0%
----------------------------------------------------------------
Done with blc01_2bit_guppi_58137_29542_HIP45689_0020.26400.818.21.44.80.vlar.wu
====================================================================
Current WU: blc16_2bit_guppi_58185_76028_Dw1_off_0033.2471.1636.22.45.95.vlar.wu
----------------------------------------------------------------
Running default app with command :... setiathome_x41p_V0.97_x86_64-pc-linux-gnu_cuda91 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 112.274147 112 182.123337
Gauss: start 112 stop -48 len -160
Sigma > GaussTOffsetStop: 112 > -48
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bBbBbBbB.bB...bB...........bB...bBbB..bB......................................................................................................................................bB...................bBP.................FFtLen : spike gauss autocorr triplet pulse
           1:            0            0            0            0            0
           2:            0            0            0            0            0
           4:            0            0            0            0            0
           8:           13            0            0           13            0
          16:           25            0            0           25            0
          32:           51            0            0           51           51
          64:          101            0            0          101          101
         128:          203            0            0          203          203
         256:          407            0            0          407          407
         512:          813            0            0          813          813
        1024:         1627            0            0         1627         1627
        2048:         3255            0            0         3255         3255
        4096:         6511            0            0         6511         6511
        8192:        13021            0            0        13021        13021
       16384:         2463            0            0         2463            0
       32768:         9849            0            0         9849            0
       65536:        11817            0            0        11817            0
      131072:        47271            0        47272            0            0

Best scores written
Out file closed
Cuda free done
Cuda device reset done
Elapsed Time: ....................... 228 seconds

----------------------------------------------------------------
Running app with command : .......... setiV0.97.linux_x64_10x0 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 112.274147 112 182.123337
Gauss: start 112 stop -48 len -160
Sigma > GaussTOffsetStop: 112 > -48
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bBbBbBbB.bB...bB...........bB...bBbB..bB......................................................................................................................................bB...................bBP.................FFtLen : spike gauss autocorr triplet pulse
           1:            0            0            0            0            0
           2:            0            0            0            0            0
           4:            0            0            0            0            0
           8:           13            0            0           13            0
          16:           25            0            0           25            0
          32:           51            0            0           51           51
          64:          101            0            0          101          101
         128:          203            0            0          203          203
         256:          407            0            0          407          407
         512:          813            0            0          813          813
        1024:         1627            0            0         1627         1627
        2048:         3255            0            0         3255         3255
        4096:         6511            0            0         6511         6511
        8192:        13021            0            0        13021        13021
       16384:         2463            0            0         2463            0
       32768:         9849            0            0         9849            0
       65536:        11817            0            0        11817            0
      131072:        47271            0        47272            0            0

Best scores written
Out file closed
Cuda free done
Cuda device reset done
Elapsed Time : ...................... 225 seconds
Speed compared to default : ......... 101 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 100.0%
----------------------------------------------------------------
Running app with command : .......... setiV0.97.multi_sm.linux_X86_64_cuda92 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 112.274147 112 182.123337
Gauss: start 112 stop -48 len -160
Sigma > GaussTOffsetStop: 112 > -48
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bBbBbBbB.bB...bB...........bB...bBbB..bB......................................................................................................................................bB...................bBP.................FFtLen : spike gauss autocorr triplet pulse
           1:            0            0            0            0            0
           2:            0            0            0            0            0
           4:            0            0            0            0            0
           8:           13            0            0           13            0
          16:           25            0            0           25            0
          32:           51            0            0           51           51
          64:          101            0            0          101          101
         128:          203            0            0          203          203
         256:          407            0            0          407          407
         512:          813            0            0          813          813
        1024:         1627            0            0         1627         1627
        2048:         3255            0            0         3255         3255
        4096:         6511            0            0         6511         6511
        8192:        13021            0            0        13021        13021
       16384:         2463            0            0         2463            0
       32768:         9849            0            0         9849            0
       65536:        11817            0            0        11817            0
      131072:        47271            0        47272            0            0

Best scores written
Out file closed
Cuda free done
Cuda device reset done
Elapsed Time : ...................... 225 seconds
Speed compared to default : ......... 101 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 100.0%
----------------------------------------------------------------
Done with blc16_2bit_guppi_58185_76028_Dw1_off_0033.2471.1636.22.45.95.vlar.wu
====================================================================
Current WU: blc3_2bit_guppi_57432_28897_HIP57494_OFF_0014.14006.416.18.27.18.vlar.wu
----------------------------------------------------------------
Running default app with command :... setiathome_x41p_V0.97_x86_64-pc-linux-gnu_cuda91 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 255.441376 255 413.903625
Gauss: start 255 stop -191 len -446
Sigma > GaussTOffsetStop: 255 > -191
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bBbB....bB...bB.......................bB...bB...................................................................................bBP................................bBP.pP..pP.....pP....pP.pP...........pP..pP............pP.........pP.........pP
Elapsed Time: ....................... 272 seconds

----------------------------------------------------------------
Running app with command : .......... setiV0.97.linux_x64_10x0 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 255.441376 255 413.903625
Gauss: start 255 stop -191 len -446
Sigma > GaussTOffsetStop: 255 > -191
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bBbB....bB...bB.......................bB...bB...................................................................................bBP................................bBP.pP..pP.....pP....pP.pP...........pP..pP............pP.........pP.........pP
Elapsed Time : ...................... 268 seconds
Speed compared to default : ......... 101 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 100.0%
----------------------------------------------------------------
Running app with command : .......... setiV0.97.multi_sm.linux_X86_64_cuda92 -nobs -device 0
gCudaDevProps.multiProcessorCount = 5
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 5
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 255.441376 255 413.903625
Gauss: start 255 stop -191 len -446
Sigma > GaussTOffsetStop: 255 > -191
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bBbB....bB...bB.......................bB...bB...................................................................................bBP................................bBP.pP..pP.....pP....pP.pP...........pP..pP............pP.........pP.........pP
Elapsed Time : ...................... 267 seconds
Speed compared to default : ......... 101 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 100.0%
----------------------------------------------------------------
Done with blc3_2bit_guppi_57432_28897_HIP57494_OFF_0014.14006.416.18.27.18.vlar.wu
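A note on reading the "Speed compared to default" lines above: they look like the default app's elapsed time divided by the test app's elapsed time, as a whole-number percentage. The sketch below is only an assumption about how the benchmark script gets that number (the function name `speed_vs_default` is made up here, not taken from the script). Truncating instead of rounding is the reading that fits both runs of the second work unit, since plain rounding would give 102 for the 267-second run.

```python
def speed_vs_default(default_seconds, test_seconds):
    """Percent speed of a test app relative to the default app.

    Assumption: the percentage is truncated to a whole number,
    which matches the 272 s default run in the log above.
    """
    if test_seconds <= 0:
        raise ValueError("test_seconds must be positive")
    # Integer division drops the fractional part of the percentage.
    return (default_seconds * 100) // test_seconds


if __name__ == "__main__":
    # Elapsed times taken from the second work unit in the log.
    for test_seconds in (268, 267):
        print(test_seconds, "s ->", speed_vs_default(272, test_seconds), "%")
```

Either way, a difference of 4 to 5 seconds on a 272-second run is small, so the apps are effectively tied on this work unit.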
ID: 1950874
Raistmer
Volunteer developer
Volunteer tester

Joined: 16 Jun 01
Posts: 6259
Credit: 106,370,077
Recent average credit: 121
Russia
Message 1950872 - Posted: 20 Aug 2018, 9:46:03 UTC

I hadn't even finished the logon process yet and already got a "Failed to update" message.
It looks like I didn't change OS at all; same nasty things as in Windows here.

Anyway, I don't need to like Ubuntu, just use it for a purpose... And the purpose was to create a portable bootable Linux flash drive to test Linux-based software.
Will check whether it can boot on a PC with a 1050Ti, having been created on a netbook...
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1950872
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
Recent average credit: 835
Canada
Message 1950869 - Posted: 20 Aug 2018, 8:44:53 UTC - in response to Message 1950849.

Yeah, I'm not sure where Petri came up with gcc 5.2.1; I can only get mine up to 4.8.5 ... if that is even the problem.
I'm having no luck with Ubuntu 14 or Mint 17. Uggg, that is most of my computers.
Mint 18 with the multi app seems to be fine so far on my i7-960 with 980/780Ti/750Ti.
Arecibo shorties on the 980 @ 58s, and 780Ti @ 50s ... impressive.

Looks like I have some computers to update before the fun starts ...
ID: 1950869
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
Recent average credit: 2,768
United States
Message 1950849 - Posted: 20 Aug 2018, 3:43:27 UTC - in response to Message 1950847.

I'm not even sure what the problem is. However, it does appear your GCC is from 15.10:
gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010
Again, I have no idea if that's a problem.
Now that I think about it, I wonder what happened to that problem with the AKv8 folder and GCC 5.2+....it just faded away.
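For anyone wanting to compare GCC versions across several machines, here is a minimal sketch that pulls the version numbers out of the first line of `gcc --version` output, such as the line quoted above. The function name `parse_gcc_version` and the 5.2 threshold check are illustrative only; nothing here settles whether the GCC version is actually the cause of the problem.

```python
import re


def parse_gcc_version(first_line):
    """Return (major, minor, patch) from the first line of `gcc --version`."""
    # Prefer the version that follows the "(distro ...)" parenthetical,
    # since the parenthetical itself can contain a package version.
    match = re.search(r"\)\s+(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        # Fall back to the first x.y.z found anywhere in the line.
        match = re.search(r"(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        raise ValueError("no GCC version found in: " + first_line)
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    line = "gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010"
    version = parse_gcc_version(line)
    print(version, "is 5.2 or newer:", version >= (5, 2, 0))
```

On a live system the first line could be captured with `subprocess.run(["gcc", "--version"], capture_output=True, text=True)` and passed to the function.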
ID: 1950849


 
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.