Posts by David Anderson (not *that* DA)

1) Message boards : Number crunching : SETI/BOINC Milestones [ v2.0 ] - XXIX (Message 1831922)
Posted 11 days ago by Profile David Anderson (not *that* DA)Project Donor
Passed 35 million for Seti and 120 million for boinc recently.

Happily, the electricity for this comes from the thirty
solar panels on our roof, which is feasible for us.
Happy computing, everyone!
2) Message boards : Number crunching : Too many errors Ubuntu 16.04 nvidia (Message 1831817)
Posted 12 days ago by Profile David Anderson (not *that* DA)Project Donor
Two out of three xubuntu 16.04
Seti machines seemingly had messed up nvidia as shown
by
dpkg -l |grep nvidia

Did TBar's suggestion (on all three):

sudo apt-get purge 'nvidia*'
sudo apt-get autoremove
Now dpkg -l |grep nvidia shows no output.
reboot
selected most recent available nvidia in additional drivers
reboot
dpkg -l |grep nvidia
now looks sensible and short.

Next time the kernel updates I'll check for
this sort of problem. I suspect it was an update with
both newer kernel and newer nvidia that did not clean up properly.

Thanks for the help.
Now I'm hopeful things are ok again.
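
For checking after the next kernel update, here is a minimal sketch of a snapshot helper. It trims dpkg -l output to status, package name and version so that a copy saved before the update can be diffed against one taken after. The function name nvidia_pkg_summary and the snapshot file names are made up for illustration.

```shell
# Print "status name version" for every package whose name contains "nvidia".
# Reads `dpkg -l` output on stdin so it can also be run on saved listings.
nvidia_pkg_summary() {
    awk '$2 ~ /nvidia/ { print $1, $2, $3 }'
}

# Example (hypothetical file names):
#   dpkg -l | nvidia_pkg_summary > nvidia-before.txt
#   ... install the kernel update, reboot ...
#   dpkg -l | nvidia_pkg_summary > nvidia-after.txt
#   diff nvidia-before.txt nvidia-after.txt
```

An empty diff would suggest the update left the nvidia packages alone; any changed or added line is where to start looking.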
3) Message boards : Number crunching : Too many errors Ubuntu 16.04 nvidia (Message 1830119)
Posted 21 days ago by Profile David Anderson (not *that* DA)Project Donor
Cleaned up formatting a bit.
ii nvidia-367 367.57-0ubuntu0.16.04.1 NVIDIA binary driver - version 367.57
ii nvidia-libopencl1-367 367.57-0ubuntu0.16.04.1 NVIDIA OpenCL Driver and ICD Loader library
ii nvidia-modprobe 361.28-1 utility to load NVIDIA kernel modules and create device nodes
ii nvidia-opencl-icd-361 367.57-0ubuntu0.16.04.1 Transitional package for nvidia-opencl-icd-367
ii nvidia-opencl-icd-361-updates 361.42-0ubuntu2 Transitional package for nvidia-opencl-icd-361
ii nvidia-opencl-icd-367 367.57-0ubuntu0.16.04.1 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 Tools to enable NVIDIA's Prime
ii nvidia-settings 361.42-0ubuntu1 Tool for configuring the NVIDIA graphics driver
4) Message boards : Number crunching : Too many errors Ubuntu 16.04 nvidia (Message 1830118)
Posted 21 days ago by Profile David Anderson (not *that* DA)Project Donor
apt-get purge on the obsolete leaves:
dpkg -l |grep nvidia
ii nvidia-367 367.57-0ubuntu0.16.04.1 amd64 NVIDIA binary driver - version 367.57
ii nvidia-libopencl1-367 367.57-0ubuntu0.16.04.1 amd64 NVIDIA OpenCL Driver and ICD Loader library
ii nvidia-modprobe 361.28-1 amd64 utility to load NVIDIA kernel modules and create device nodes
ii nvidia-opencl-icd-361 367.57-0ubuntu0.16.04.1 amd64 Transitional package for nvidia-opencl-icd-367
ii nvidia-opencl-icd-361-updates 361.42-0ubuntu2 amd64 Transitional package for nvidia-opencl-icd-361
ii nvidia-opencl-icd-367 367.57-0ubuntu0.16.04.1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 361.42-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
5) Message boards : Number crunching : Too many errors Ubuntu 16.04 nvidia (Message 1830116)
Posted 21 days ago by Profile David Anderson (not *that* DA)Project Donor
q3 500: nvidia-smi
Sat Nov 12 10:56:17 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57 Driver Version: 367.57 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 750 Off | 0000:01:00.0 On | N/A |
| N/A 82C P0 22W / 38W | 369MiB / 1998MiB | 100% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1004 G /usr/lib/xorg/Xorg 134MiB |
| 0 2402 C ...10_x86_64-pc-linux-gnu__opencl_nvidia_SoG 233MiB |
+-----------------------------------------------------------------------------+
q3 501: dpkg -l |grep nvidia
rc nvidia-340 340.98-0ubuntu0.16.04.1 amd64 NVIDIA binary driver - version 340.98
rc nvidia-352-updates 361.42-0ubuntu2 amd64 Transitional package for nvidia-361
rc nvidia-361 367.57-0ubuntu0.16.04.1 amd64 Transitional package for nvidia-367
ii nvidia-367 367.57-0ubuntu0.16.04.1 amd64 NVIDIA binary driver - version 367.57
ii nvidia-libopencl1-367 367.57-0ubuntu0.16.04.1 amd64 NVIDIA OpenCL Driver and ICD Loader library
ii nvidia-modprobe 361.28-1 amd64 utility to load NVIDIA kernel modules and create device nodes
rc nvidia-opencl-icd-340 340.98-0ubuntu0.16.04.1 amd64 NVIDIA OpenCL ICD
rc nvidia-opencl-icd-352-updates 361.42-0ubuntu2 amd64 Transitional package for nvidia-opencl-icd-361
ii nvidia-opencl-icd-361 367.57-0ubuntu0.16.04.1 amd64 Transitional package for nvidia-opencl-icd-367
ii nvidia-opencl-icd-361-updates 361.42-0ubuntu2 amd64 Transitional package for nvidia-opencl-icd-361
ii nvidia-opencl-icd-367 367.57-0ubuntu0.16.04.1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 361.42-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
q3 502:
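
The rc lines in the listing above are packages that were removed but still have configuration files on disk. A small sketch for picking those out of dpkg -l output follows; the function name residual_nvidia_packages is made up for illustration.

```shell
# List nvidia packages in dpkg state "rc" (removed, config files remain).
# Reads `dpkg -l` output on stdin and prints one package name per line.
residual_nvidia_packages() {
    awk '$1 == "rc" && $2 ~ /nvidia/ { print $2 }'
}

# Example: review the list first, then purge the leftovers.
#   dpkg -l | residual_nvidia_packages
#   sudo apt-get purge $(dpkg -l | residual_nvidia_packages)
```

Run on the listing above, this would name the five rc packages and leave the installed (ii) ones alone.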
6) Message boards : Number crunching : Too many errors Ubuntu 16.04 nvidia (Message 1830114)
Posted 21 days ago by Profile David Anderson (not *that* DA)Project Donor
I discovered that Ubuntu on 7748035 was using
generic nvidia opencl instead of the 367 version.

Switching to 340.98 led to chaos in GPU tasks: all of them
failed very quickly, within seconds.

Now using nvidia 367.57 with the 367 opencl.
Preliminary indication: gpu work proceeding ok.

Now I know that the additional-drivers page does not necessarily
result in the best opencl choice automatically, so I'll have to keep an eye
on that with ubuntu updates.
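
To keep an eye on that, something like the following sketch could check whether an installed nvidia-opencl-icd-NNN package matches the running driver version. The function name opencl_icd_matches_driver is made up for illustration, and it assumes the package naming seen in the dpkg listings in this thread (nvidia-opencl-icd-367 for driver 367.57).

```shell
# Succeed (exit 0) if dpkg -l output on stdin shows an installed ("ii")
# nvidia-opencl-icd-<series> package whose version starts with the given
# driver version; fail (exit 1) otherwise.
opencl_icd_matches_driver() {
    driver_version=$1              # e.g. 367.57
    series=${driver_version%%.*}   # e.g. 367
    awk -v pkg="nvidia-opencl-icd-$series" -v ver="$driver_version" '
        $1 == "ii" && $2 == pkg && index($3, ver) == 1 { found = 1 }
        END { exit !found }
    '
}

# Example: take the driver version from nvidia-smi.
#   drv=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   dpkg -l | opencl_icd_matches_driver "$drv" || echo "opencl ICD does not match $drv"
```

This only compares package versions; it does not prove which OpenCL library a running task actually loads.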
7) Message boards : Number crunching : Too many errors Ubuntu 16.04 nvidia (Message 1829120)
Posted 25 days ago by Profile David Anderson (not *that* DA)Project Donor
Now another host, 7748035, seems to have gone crazy with
GPU task errors (due to the latest updates from Ubuntu, I
presume). I've applied the same
sequence of operations and I hope that will help.
8) Message boards : Number crunching : Too many errors Ubuntu 16.04 nvidia (Message 1828691)
Posted 28 days ago by Profile David Anderson (not *that* DA)Project Donor
Had 271 tasks error off today. Latest Ubuntu 16.04 updates
(there have been many in the last few weeks) resulted in errors
and finally in being unable to find the two GPUs.
I only just noticed (when boinc would not restart properly).
It all went wrong with this morning's minor update, it seems.

https://setiathome.berkeley.edu/show_host_detail.php?hostid=5766757

I switched to the Nouveau driver, rebooted, switched back to Nvidia 367.57,
rebooted, and now it seems the GPUs are ok.
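
To notice sooner when the GPUs go missing, a quick check along these lines might help. It counts the devices that nvidia-smi -L reports (one "GPU N: ..." line per card). The function name count_gpus and the expected count of 2 are assumptions for this host.

```shell
# Count GPUs in `nvidia-smi -L` output read from stdin.
# Each detected card appears as a line starting with "GPU <index>:".
count_gpus() {
    awk '/^GPU [0-9]+:/ { n++ } END { print n + 0 }'
}

# Example: warn if fewer than the two expected cards are visible.
#   found=$(nvidia-smi -L 2>/dev/null | count_gpus)
#   [ "$found" -ge 2 ] || echo "only $found GPU(s) visible to the driver"
```

This only tells whether the driver sees the cards; boinc may still need a restart of boinc-client to pick them up.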

I sure hope this does not recur.
9) Message boards : Number crunching : Nvidia-Ubuntu 14.04 fail with latest kernel (Message 1802796)
Posted 15 Jul 2016 by Profile David Anderson (not *that* DA)Project Donor
New Linux kernel today. Install went fine, GPUs fine,
Seti working fine.
10) Message boards : Number crunching : Nvidia-Ubuntu 14.04 fail with latest kernel (Message 1799786)
Posted 1 Jul 2016 by Profile David Anderson (not *that* DA)Project Donor
I prefer to keep up with bug fixes. Only one machine
has the nVidia driver problem, and that is only with some kernel releases
in 14.04 LTS. So I occasionally struggle :-)

Two nvidia GPU machines here on 14.04 are updated too
and just work without surprises.
11) Message boards : Number crunching : Nvidia-Ubuntu 14.04 fail with latest kernel (Message 1799726)
Posted 30 Jun 2016 by Profile David Anderson (not *that* DA)Project Donor
The failed task I mentioned yesterday also failed for 4 others (3 have GPU-related messages suggesting a reboot, same as I got; 1 has a complaint about a too-old GPU). So I guess that task failure is nothing I should worry about.
12) Message boards : Number crunching : Nvidia-Ubuntu 14.04 fail with latest kernel (Message 1799493)
Posted 29 Jun 2016 by Profile David Anderson (not *that* DA)Project Donor
Thanks for the links, The_Matrix.
13) Message boards : Number crunching : Nvidia-Ubuntu 14.04 fail with latest kernel (Message 1799360)
Posted 29 Jun 2016 by Profile David Anderson (not *that* DA)Project Donor
Updated kernel installed June 28 and installed and ran nvidia-modprobe (same one as before) and X/lightdm stay operational.
So GPU operations have been resumed.

A fair number of GPU tasks validated overnight.

On one task 5008182283 I got:
ERROR: Possible wrong computation state on GPU, host needs reboot or maintenance

which is alarming. I'll be watching all this closely.
14) Message boards : Number crunching : Nvidia-Ubuntu 14.04 fail with latest kernel (Message 1796480)
Posted 16 Jun 2016 by Profile David Anderson (not *that* DA)Project Donor
My concern is that it is (apparently) the old nvidia-modprobe
that (in 14.04) kills X/lightdm. Why that has not been
updated in sync with the drivers (in ubuntu) is a puzzle.
Someone thinks it's not needed?
Do you have nvidia-modprobe petri33?

Without nvidia-modprobe
the desktop (xfce) comes up fine with either nvidia
driver that 'additional drivers' shows. But boinc cannot see
the GPUs.
15) Message boards : Number crunching : Nvidia-Ubuntu 14.04 fail with latest kernel (Message 1795440)
Posted 11 Jun 2016 by Profile David Anderson (not *that* DA)Project Donor
I meant Xubuntu, not plain Ubuntu.
16.04 will be officially
marked LTS in the repositories in a month
or so and when it is I'll upgrade.
In hopes it will help with
GPU on boinc. Sensible to do anyway.
16) Message boards : Number crunching : Nvidia-Ubuntu 14.04 fail with latest kernel (Message 1795412)
Posted 11 Jun 2016 by Profile David Anderson (not *that* DA)Project Donor
Installed the latest 64-bit kernel (they arrive weekly lately) on host 5766757,
Ubuntu 14.04. Did the sequence that has worked for several months
after getting a kernel update:

sudo apt-get purge 'nvidia*'
sudo shutdown -h now
reboot....
Use settings additional drivers to get nvidia
sudo shutdown -h now
reboot...
sudo apt-get install nvidia-modprobe


It's a very old nvidia-modprobe, but oh well.
X won't start. No graphics.

If I don't install nvidia-modprobe graphics work
but boinc won't find the cards... And yes, I do restart
boinc-client, but that does not help boinc find the cards.

So I aborted the Seti GPU tasks in hand. A lovely bunch of
GPU tasks that I have no way to process right now...
Doing cpu for now on 11 cores.
17) Message boards : Number crunching : SETI/BOINC Milestones [ v2.0 ] - XXIX (Message 1781312)
Posted 22 Apr 2016 by Profile David Anderson (not *that* DA)Project Donor
Finally hit 30 million Seti cobblestones! Essentially all CPU as Seti is mostly not able to use my nVidia GPUs (old or new).

Congratulations to all of you reaching your milestones.
18) Message boards : Number crunching : No Usable GPU in Linux (Message 1741473)
Posted 12 Nov 2015 by Profile David Anderson (not *that* DA)Project Donor
Been focusing on the host with one 750.
Event log shows Seti does not see the GPU
or request GPU tasks
though Einstein and boinc do see the GPU.
19) Message boards : Number crunching : No Usable GPU in Linux (Message 1741370)
Posted 11 Nov 2015 by Profile David Anderson (not *that* DA)Project Donor
Was running with nvidia driver installed by sgfxi.
A 14.04 kernel update resulted in absurd
window/text sizes (very very large)
Apparently the nouveau driver?
Uh Oh.

Did:
Ctrl-Alt-F2
(to switch to a text console)
(log in at the text console)
sudo su -
service lightdm stop
sudo apt-get remove $(dpkg-query --show --showformat='${Package}\n' '*nvidia*')
apt-get remove dkms
sgfxi
(and now back to 355 nvidia driver,
and on a reboot to test that this survives reboot,
all is well with boinc on restarting boinc-client).
Still not really getting Seti GPU tasks much if at all,
but oh well.
20) Message boards : Number crunching : No Usable GPU in Linux (Message 1739992)
Posted 5 Nov 2015 by Profile David Anderson (not *that* DA)Project Donor
I've modified app_info.xml on the dual-760's machine
to read along the lines you suggested, but naming
what apps and shared libraries I have right now.

I have no idea if the result is correct or useful.

<app_info>
    <app>
        <name>setiathome_v7</name>
    </app>
    <file_info>
        <name>MBv7_7.05r2549_sse42_linux64</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_v7</app_name>
        <version_num>705</version_num>
        <cmdline></cmdline>
        <file_ref>
            <file_name>MBv7_7.05r2549_sse42_linux64</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app>
        <name>astropulse_v7</name>
    </app>
    <file_info>
        <name>ap_7.05r2728_sse3_linux64</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>astropulse_v7</app_name>
        <version_num>705</version_num>
        <platform>x86_64-pc-linux-gnu</platform>
        <plan_class></plan_class>
        <cmdline></cmdline>
        <file_ref>
            <file_name>ap_7.05r2728_sse3_linux64</file_name>
            <main_program/>
        </file_ref>
    </app_version>

    <file_info>
        <name>setiathome_x41g_x86_64-pc-linux-gnu_cuda32</name>
        <executable/>
    </file_info>

    <file_info>
        <name>libcudart.so.3</name>
        <executable/>
    </file_info>

    <file_info>
        <name>libcufft.so.3</name>
        <executable/>
    </file_info>

    <app_version>
        <app_name>setiathome_v7</app_name>
        <version_num>705</version_num>
        <plan_class>cuda32</plan_class>
        <avg_ncpus>0.1</avg_ncpus>
        <max_ncpus>0.1</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>setiathome_x41g_x86_64-pc-linux-gnu_cuda32</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>libcudart.so.3</file_name>
        </file_ref>
        <file_ref>
            <file_name>libcufft.so.3</file_name>
        </file_ref>
    </app_version>
</app_info>
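
One thing that can be checked mechanically is whether every file named in a file_ref also has a file_info entry. Here is a rough sketch of such a check. The function name missing_file_infos is made up for illustration, and it assumes each name and file_name element sits on a single line, as in the file above. It says nothing about whether the apps and plan classes themselves are right.

```shell
# Read an app_info.xml on stdin and print every <file_name> value that has
# no matching <name> entry. Line-based: assumes one element per line.
missing_file_infos() {
    awk '
        # Collect <name>...</name> values (file_info and app names alike).
        match($0, /<name>[^<]*<\/name>/) {
            declared[substr($0, RSTART + 6, RLENGTH - 13)] = 1
        }
        # Collect <file_name>...</file_name> values from file_ref blocks.
        match($0, /<file_name>[^<]*<\/file_name>/) {
            referenced[substr($0, RSTART + 11, RLENGTH - 23)] = 1
        }
        END {
            for (f in referenced) if (!(f in declared)) print f
        }
    ' | sort
}

# Example:
#   missing_file_infos < app_info.xml
```

No output means every referenced file is declared somewhere in the file; any name printed is a file_ref without a file_info.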


©2016 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.