Ubuntu and Nvidia tasks

Bernd Noessler

Joined: 15 Nov 09
Posts: 99
Credit: 52,635,434
RAC: 0
Germany
Message 1391938 - Posted: 18 Jul 2013, 10:45:49 UTC - in response to Message 1391929.  

Please don't mix AP and MB on the GPU.

The Linux AP problem is located in the Nvidia OpenCL .so
(which is not used when crunching MB).

The way Nvidia has programmed this library always results in 100% CPU usage.

ID: 1391938
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1391941 - Posted: 18 Jul 2013, 10:52:00 UTC - in response to Message 1391938.  
Last modified: 18 Jul 2013, 10:53:17 UTC

The way Nvidia has programmed this library always results in 100% CPU usage.
Correct, OpenCL 1.1 is non-blocking by design.

Pleased to see x41zc is building/working for you, as I'm still working on that ;) Please forward any suggestions for improvement to me at: contact at jgopt dot org
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1391941
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1391946 - Posted: 18 Jul 2013, 11:09:08 UTC - in response to Message 1391938.  

Please don't mix AP and MB on the GPU.

The Linux AP problem is located in the Nvidia OpenCL .so
(which is not used when crunching MB).

The way Nvidia has programmed this library always results in 100% CPU usage.


Whut?
There were not any MBs in there. Look again, all APs.

Strange Windows driver 266.58 doesn't have that problem. Have you ever tried Linux Driver 270.xx? That's ALL I'M trying to do here, install 270 in Ubuntu. I seem to be getting a lot of people trying to tell me to do something else.
270...in Ubuntu.
ID: 1391946
Bernd Noessler
Joined: 15 Nov 09
Posts: 99
Credit: 52,635,434
RAC: 0
Germany
Message 1391973 - Posted: 18 Jul 2013, 12:53:08 UTC - in response to Message 1391946.  


Strange Windows driver 266.58 doesn't have that problem. Have you ever tried Linux Driver 270.xx? That's ALL I'M trying to do here, install 270 in Ubuntu. I seem to be getting a lot of people trying to tell me to do something else.
270...in Ubuntu.


There is no need to try the 270.xx driver for the AP problem. The libOpenCL.so of
the 304.88 driver (the one you are using) and that of the 270.xx driver are identical.
The 100% CPU load with AP would be the same.
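
A claim like "the two libraries are identical" can be checked by comparing file digests. The following is only a minimal sketch: the helper names are made up for illustration, and where each driver's copy of the library ends up on disk is an assumption you would have to fill in yourself.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)


# Hypothetical usage, with copies of the library extracted from each driver:
# same_contents("driver-270/libOpenCL.so", "driver-304/libOpenCL.so")
```

Identical digests would only show that the loader library is byte-for-byte the same; they say nothing about the other driver components that differ between releases.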
ID: 1391973
bill
Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1392143 - Posted: 18 Jul 2013, 19:10:09 UTC - in response to Message 1391946.  

Which Nvidia 266.58 driver are you looking for?
A search turns up at least 3~4 of them.

http://news.softpedia.com/news/Download-NVIDIA-GeForce-ION-Driver-Release-266-58-WHQL-179061.shtml
ID: 1392143
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1392243 - Posted: 19 Jul 2013, 2:14:39 UTC - in response to Message 1391973.  
Last modified: 19 Jul 2013, 2:21:40 UTC


Strange Windows driver 266.58 doesn't have that problem. Have you ever tried Linux Driver 270.xx? That's ALL I'M trying to do here, install 270 in Ubuntu. I seem to be getting a lot of people trying to tell me to do something else.
270...in Ubuntu.


There is no need to try the 270.xx driver for the AP problem. The libOpenCL.so of
the 304.88 driver (the one you are using) and that of the 270.xx driver are identical.
The 100% CPU load with AP would be the same.

I see. So, you've never installed Linux driver 270.xx. Well, it seems the requirements are a little lower than first thought. The AP App calls for Driver 258.19. The x41g CUDA App calls for 260.19.26. This leaves 4 versions of Driver 260.xx as candidates. Unfortunately, those drivers give the same Errors as version 270.xx. I don't see any mention of these Errors on Nvidia's site either. Has anyone ever installed the Linux 260.xx series driver in Ubuntu 12.04 LTS?

Linux 64bit OpenCL AstroPulse for NV GPUs (r1844), May 2013
Readme for Linux 64bit Cuda Multibeam (x41g), Dec 2011
ftp://download.nvidia.com/XFree86/Linux-x86_64/

nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Thu Jul 18 21:19:01 2013
installer version: 260.19.44

PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

option status:
  license pre-accepted               : false
  update                             : false
  force update                       : false
  expert                             : false
  uninstall                          : false
  driver info                        : false
  precompiled interfaces             : true
  no ncurses color                   : false
  query latest version               : false
  no questions                       : false
  silent                             : false
  no recursion                       : false
  no backup                          : false
  kernel module only                 : false
  sanity                             : false
  add this kernel                    : false
  no runlevel check                  : false
  no network                         : false
  no ABI note                        : false
  no RPMs                            : false
  no kernel module                   : false
  force SELinux                      : default
  no X server check                  : false
  no cc version check                : false
  run distro scripts                 : true
  no nouveau check                   : false
  run nvidia-xconfig                 : false
  sigwinch work around               : true
  force tls                          : (not specified)
  force compat32 tls                 : (not specified)
  X install prefix                   : (not specified)
  X library install path             : (not specified)
  X module install path              : (not specified)
  OpenGL install prefix              : (not specified)
  OpenGL install libdir              : (not specified)
  compat32 install chroot            : (not specified)
  compat32 install prefix            : (not specified)
  compat32 install libdir            : (not specified)
  utility install prefix             : (not specified)
  utility install libdir             : (not specified)
  installer prefix                   : (not specified)
  doc install prefix                 : (not specified)
  kernel name                        : (not specified)
  kernel include path                : (not specified)
  kernel source path                 : (not specified)
  kernel output path                 : (not specified)
  kernel install path                : (not specified)
  precompiled kernel interfaces path : (not specified)
  precompiled kernel interfaces url  : (not specified)
  proc mount point                   : /proc
  ui                                 : (not specified)
  tmpdir                             : /tmp
  ftp mirror                         : ftp://download.nvidia.com
  RPM file list                      : (not specified)
  selinux chcon type                 : (not specified)

Using: nvidia-installer ncurses user interface
-> License accepted.
-> Installing NVIDIA driver version 260.19.44.
-> Running distribution scripts
   executing: '/usr/lib/nvidia/pre-install'...
-> done.
-> The distribution-provided pre-install script failed!  Continue installation 
   anyway? (Answer: Yes)
-> Performing CC sanity check with CC="cc".
-> Performing CC version check with CC="cc".
-> Kernel source path: '/lib/modules/3.2.0-49-generic/build'
-> Kernel output path: '/lib/modules/3.2.0-49-generic/build'
ERROR: If you are using a Linux 2.4 kernel, please make sure
       you either have configured kernel sources matching your
       kernel or the correct set of kernel headers installed
       on your system.
       
       If you are using a Linux 2.6 kernel, please make sure
       you have configured kernel sources matching your kernel
       installed on your system. If you specified a separate
       output directory using either the "KBUILD_OUTPUT" or
       the "O" KBUILD parameter, make sure to specify this
       directory with the SYSOUT environment variable or with
       the equivalent nvidia-installer command line option.
       
       Depending on where and how the kernel sources (or the
       kernel headers) were installed, you may need to specify
       their location with the SYSSRC environment variable or
       the equivalent nvidia-installer command line option.
ERROR: Installation has failed.  Please see the file
       '/var/log/nvidia-installer.log' for details.  You may find suggestions
       on fixing installation problems in the README available on the Linux
       driver download page at www.nvidia.com.
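
The installer log above points at the kernel build directory under /lib/modules. As a rough first check, and only as a sketch, the snippet below tests whether that directory exists and contains a top-level Makefile. The function names are invented for illustration, and the Makefile test is a heuristic assumption rather than anything the installer itself is known to do.

```python
import os
from pathlib import Path


def kernel_build_dir(release=None, modules_root="/lib/modules"):
    """Path the installer log reports as the kernel source/output path."""
    release = release or os.uname().release
    return Path(modules_root) / release / "build"


def headers_look_installed(release=None, modules_root="/lib/modules"):
    """Heuristic: the build directory exists and has a top-level Makefile."""
    build = kernel_build_dir(release, modules_root)
    return build.is_dir() and (build / "Makefile").is_file()


# Hypothetical usage on the machine from the log:
# headers_look_installed("3.2.0-49-generic")
```

A True result does not guarantee that an old installer will accept the kernel tree; it only rules out the simplest cause, missing headers.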
ID: 1392243
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1393362 - Posted: 21 Jul 2013, 19:16:29 UTC - in response to Message 1392243.  
Last modified: 21 Jul 2013, 19:31:14 UTC

I finally succeeded in installing the older driver. From what I've seen, there is a problem with the older nVidia drivers with Ubuntu 11.10 and above. Seems you have to expand the installer, run a patch, then compact the installer again to get it to work. Just Google 'ERROR: If you are using a Linux 2.4 kernel' and you will see what I mean. So, the easy way to install the older driver is to use an older Ubuntu. Then you have the next issue, BOINC v7 will NOT WORK with any Ubuntu older than 12.04. So, you have to use an older version of BOINC, I found 6.12.33 here, “Debian BOINC Maintainers” team. After testing numerous versions of Ubuntu, I settled for 10.04.4 with BOINC 6.12.33, and driver 260.19.44. The best page for installing the driver is here, How do I install latest NVIDIA drivers by hand? The remaining problem was getting BOINC to work in my Home folder. After playing around with that for a few hours I was able to get it to work.

In the process of getting all this to work, BOINC decided to Abandon all my current tasks on that Host, no biggie. The biggie was AstroPulse 1844 Errors out immediately with this setup. It's almost as if OpenCL isn't even installed. BOINC doesn't list OpenCL at startup, it does list CUDA. I haven't had a chance to investigate that...yet.
Sat 20 Jul 2013 09:27:01 PM EDT |  | Starting BOINC client version 6.12.33 for x86_64-pc-linux-gnu
Sat 20 Jul 2013 09:27:01 PM EDT |  | Libraries: libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15
Sat 20 Jul 2013 09:27:01 PM EDT |  | Data directory: /home/tbar/BOINC
Sat 20 Jul 2013 09:27:01 PM EDT |  | Processor: 2 GenuineIntel Intel(R) Xeon(R) CPU            3060  @ 2.40GHz [Family 6 Model 15 Stepping 6]
Sat 20 Jul 2013 09:27:01 PM EDT |  | Processor: 4.00 MB cache
Sat 20 Jul 2013 09:27:01 PM EDT |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx
Sat 20 Jul 2013 09:27:01 PM EDT |  | OS: Linux: 2.6.32-49-generic
Sat 20 Jul 2013 09:27:01 PM EDT |  | NVIDIA GPU 0: GeForce GTS 250 (driver version unknown, CUDA version 3020, compute capability 1.1, 1023MB, 470 GFLOPS peak)
Sat 20 Jul 2013 09:27:01 PM EDT | SETI@home | Found app_info.xml; using anonymous platform
Sat 20 Jul 2013 09:27:01 PM EDT | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6864181; resource share 100...

I switched to CUDA x41g, and that works. Just as back in March, performance is bad without a free core. However, this time, using the renice command did produce much better results. I still haven't found any way to view the GPU load, so I go by temperature. Using stock settings, the temperature was around 61C. Using a renice of zero, the temperature goes to 73C, which is what this card runs at under full load. Using renice -5 doesn't seem to help much over renice 0. I'm not sure why there is a difference this time; it could be the difference between Version 6 tasks and Version 7 tasks. That is about the only difference, besides an older Ubuntu, Driver, and BOINC... It seems running with one free core is about the same as running with the higher priority (lower nice value). So, HOW do you set a permanent nice setting with x41g? As far as I know, you can't.
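
One possible workaround for the "permanent nice" question, sketched here under several assumptions, is to re-apply the nice value periodically from a small script, since every new task starts a new process. The match string "x41g", the function names, and the idea of running this from cron are illustrative assumptions and not a documented BOINC feature. Lowering a nice value normally needs root privileges.

```python
import os


def matching_pids(names_by_pid, needle):
    """Return the sorted PIDs whose process name contains `needle`."""
    return sorted(pid for pid, name in names_by_pid.items() if needle in name)


def read_process_names(proc_root="/proc"):
    """Map PID -> argv[0] by reading /proc/<pid>/cmdline (Linux only)."""
    names = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                argv0 = handle.read().split(b"\0", 1)[0]
        except OSError:
            continue  # the process exited or is not readable
        if argv0:
            names[int(entry)] = argv0.decode(errors="replace")
    return names


def renice_matching(needle, nice_value=0):
    """Set the nice value of every process whose argv[0] contains `needle`."""
    changed = []
    for pid in matching_pids(read_process_names(), needle):
        try:
            os.setpriority(os.PRIO_PROCESS, pid, nice_value)
            changed.append(pid)
        except (PermissionError, ProcessLookupError):
            pass  # not allowed, or the process is already gone
    return changed


# Hypothetical usage, run as root every few minutes:
# renice_matching("x41g", 0)
```

This does the same thing as running renice by hand; it just repeats it so that newly started tasks are picked up.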
ID: 1393362
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1396144 - Posted: 29 Jul 2013, 11:37:24 UTC - in response to Message 1393362.  

Well, I'm back running SETI in my 'new' Ubuntu. I've come to realize why I like 10.04 so much better than 12.04. It's the scroll bars. I really dislike the scroll bars in 12.04...it's the little things that matter.

I looked into the AstroPulse error. Even tracked down another version of the driver from here, Debian GNU/Linux. I'm still getting the same Error(s). OpenCL is still not listed in BOINC;
Mon 29 Jul 2013 07:18:44 AM EDT |  | Starting BOINC client version 6.12.33 for x86_64-pc-linux-gnu
Mon 29 Jul 2013 07:18:44 AM EDT |  | Data directory: /home/tbar/BOINC
Mon 29 Jul 2013 07:18:44 AM EDT |  | Processor: 2 GenuineIntel Intel(R) Xeon(R) CPU            3060  @ 2.40GHz [Family 6 Model 15 Stepping 6]
Mon 29 Jul 2013 07:18:44 AM EDT |  | OS: Linux: 2.6.32-49-generic
Mon 29 Jul 2013 07:18:44 AM EDT |  | NVIDIA GPU 0: GeForce GTS 250 (driver version unknown, CUDA version 3020, compute capability 1.1, 1023MB, 470 GFLOPS peak)
Mon 29 Jul 2013 07:18:44 AM EDT | SETI@home | Found app_info.xml; using anonymous platform
Mon 29 Jul 2013 07:18:44 AM EDT | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6864181; resource share 100

The AstroPulse task Errors in 2 seconds flat;

state.fold_buf_size_short=65536; state.fold_buf_size_long=262144
GPU device # not found in init_data.xml
WARNING: BOINC was unable to find GPU device, using own enumeration
OpenCL platform detected: NVIDIA Corporation
WARNING: BOINC supplied wrong platform!
Number of OpenCL devices found : 1 
BOINC assigns slot on device #0.
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
Used GPU device parameters are:
	Number of compute units: 16
	Single buffer allocation size: 255MB
	max WG size: 512
	FERMI path used: no
INFO: can't open binary kernel file: /home/tbar/BOINC/projects/setiathome.berkeley.edu/AstroPulse_Kernels_r1844.cl_GeForceGTS250.bin_V6, continue with recompile...
SIGSEGV: segmentation violation
Stack trace (33 frames):
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64(boinc_catch_signal+0x4d)[0x4aa1fd]
/lib/libpthread.so.0(+0xf8f0)[0x7fe3382068f0]
/lib/libc.so.6(cfree+0x1d)[0x7fe337a6842d]
/usr/lib/libnvidia-compiler.so.260.19.44(+0xcfdd5e)[0x7fe334891d5e]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x489ebd)[0x7fe33401debd]
/usr/lib/libnvidia-compiler.so.260.19.44(+0xb2e40f)[0x7fe3346c240f]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x98fd76)[0x7fe334523d76]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x993922)[0x7fe334527922]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x991422)[0x7fe334525422]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x993922)[0x7fe334527922]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x991422)[0x7fe334525422]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x993922)[0x7fe334527922]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x991422)[0x7fe334525422]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x993b20)[0x7fe334527b20]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x9b8c10)[0x7fe33454cc10]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x9c3cfb)[0x7fe334557cfb]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x9c4335)[0x7fe334558335]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85aafc)[0x7fe3343eeafc]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85b029)[0x7fe3343ef029]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85a1a9)[0x7fe3343ee1a9]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85a746)[0x7fe3343ee746]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85a77d)[0x7fe3343ee77d]
/usr/lib/libnvidia-compiler.so.260.19.44(+0xe4a6f)[0x7fe333c78a6f]
/usr/lib/libnvidia-compiler.so.260.19.44(+0xe6ab1)[0x7fe333c7aab1]
/usr/lib/libnvidia-compiler.so.260.19.44(NvCliCompileProgram+0xdf4)[0x7fe333c75ac4]
/usr/lib/libcuda.so(+0x1486de)[0x7fe3370b56de]
/usr/lib/libcuda.so(+0x13f297)[0x7fe3370ac297]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x47cfd3]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x467612]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x460360]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x46561d]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7fe337a07c4d]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x40a809]

Exiting...

SIGSEGV: segmentation violation???
Any ideas?
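
When reading a trace like the one above, it can help to count how many frames fall in each module. The sketch below assumes the frame format seen in this stderr output, `module(symbol+offset)[address]` or `module[address]`; the regular expression and function name are written for this illustration only.

```python
import re
from collections import Counter

# module path, optional "(symbol+offset)", then "[0xADDRESS]"
FRAME = re.compile(
    r"^(?P<module>[^\s(\[]+)(?:\((?P<symbol>[^)]*)\))?\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def frames_per_module(trace_lines):
    """Count stack frames per module, keyed by the module's file name."""
    counts = Counter()
    for line in trace_lines:
        match = FRAME.match(line.strip())
        if match:
            counts[match.group("module").rsplit("/", 1)[-1]] += 1
    return counts


# Hypothetical usage with the stderr text saved to a file:
# with open("stderr.txt") as handle:
#     print(frames_per_module(handle).most_common())
```

For the trace in this post, such a count makes it obvious that most frames sit inside libnvidia-compiler.so, entered through NvCliCompileProgram, and not in the AstroPulse binary itself.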
ID: 1396144
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1396185 - Posted: 29 Jul 2013, 14:42:42 UTC - in response to Message 1396144.  

Well, I'm back running SETI in my 'new' Ubuntu. I've come to realize why I like 10.04 so much better than 12.04. It's the scroll bars. I really dislike the scroll bars in 12.04...it's the little things that matter.

I looked into the AstroPulse error. Even tracked down another version of the driver from here, Debian GNU/Linux. I'm still getting the same Error(s). OpenCL is still not listed in BOINC;
Mon 29 Jul 2013 07:18:44 AM EDT | | Starting BOINC client version 6.12.33 for x86_64-pc-linux-gnu


That version of BOINC does not report OpenCL at all, you will need a 7.x.x version for that.

ID: 1396185
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1396305 - Posted: 29 Jul 2013, 19:22:00 UTC - in response to Message 1396185.  
Last modified: 29 Jul 2013, 19:38:31 UTC

Well, I'm back running SETI in my 'new' Ubuntu. I've come to realize why I like 10.04 so much better than 12.04. It's the scroll bars. I really dislike the scroll bars in 12.04...it's the little things that matter.

I looked into the AstroPulse error. Even tracked down another version of the driver from here, Debian GNU/Linux. I'm still getting the same Error(s). OpenCL is still not listed in BOINC;
Mon 29 Jul 2013 07:18:44 AM EDT | | Starting BOINC client version 6.12.33 for x86_64-pc-linux-gnu


That version of BOINC does not report OpenCL at all, you will need a 7.x.x version for that.

Thanks. I guess I'll stop looking for the OpenCL line in the event log. The AP task should still work though, AFAIK. This driver 260 test is turning out to be another bust. The x41g task still runs slowly without a free core, and I can't even get the AP task to start to see if it still uses a full CPU core. Not very encouraging...

I haven't forgotten about you. It's just I still haven't gotten a MO. Seems MOs are scarce out where I live. I liked your card so much I got another one just like it from eBay, only a version that works in XP. I haven't had a chance to test the new card in OSX yet, I do get a screen in OSX though...
ID: 1396305
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1396353 - Posted: 29 Jul 2013, 21:56:54 UTC - in response to Message 1396144.  

If you search for SIGSEGV, you find some old interesting posts. The Memory issue is noteworthy considering the ongoing AP Errors. I also made note of an apparent Memory leak in the not yet released Mac NV AP App over in Beta. So, I tried the 'leave applications in Memory while suspended' option. Now I receive a different Error;

<message>
process exited with code 4 (0x4, -252)
</message>
<stderr_txt>
Not using ap_cmdline.txt-file, using commandline options.
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:4096
FFA thread fetchblock override value:2048
AstroPulse v6
Non-graphics	FFTW	USE_CONVERSION_OPT	
AstroPulse v6.02
Linux 64 bit, Rev 1844, OpenCL version by Raistmer, GPU mode
 V6, by Raistmer ported to Linux by Lunatics.kwsn.net team.	
 by Urs Echternacht
ffa threshold mods by Joe Segur
SSE3 dechirping by JDWhale using SSE3 emulation

Build features: Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION SSE2 64bit 
 System: Linux  x86_64  Kernel: 2.6.32-49-generic
 CPU   : Intel(R) Xeon(R) CPU            3060  @ 2.40GHz
 2 core(s), Speed :  2394.000 MHz
 L1 : 64 KB, Cache : 4096 KB

ERROR: cl::Platform::get() (1)

</stderr_txt>

?
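
The thread does not establish why `cl::Platform::get()` failed here. One thing that can be checked, assuming the driver registers itself through the Khronos ICD loader (which on Linux conventionally reads `.icd` files from /etc/OpenCL/vendors), is whether any vendor library is registered at all. The function name below is made up for this sketch.

```python
from pathlib import Path


def registered_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the vendor library name it contains."""
    found = {}
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return found
    for icd in sorted(directory.glob("*.icd")):
        found[icd.name] = icd.read_text().strip()
    return found


# Hypothetical usage:
# print(registered_icds())
```

An empty result would suggest the loader has no platform to report; a non-empty one means the cause lies elsewhere.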
ID: 1396353
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1396828 - Posted: 31 Jul 2013, 3:06:05 UTC - in response to Message 1396353.  
Last modified: 31 Jul 2013, 3:10:30 UTC

Alright...
I finally convinced the compiler to use the correct version of sqlite3, and now have a working BOINC 7.0.65 in Lucid Lynx. That is no small feat BTW. I still receive the same ERROR;

Tue 30 Jul 2013 10:53:29 PM EDT |  | Starting BOINC client version 7.0.65 for x86_64-pc-linux-gnu
Tue 30 Jul 2013 10:53:29 PM EDT |  | Libraries: libcurl/7.19.7 GnuTLS/2.8.5 zlib/1.2.3.3 libidn/1.15
Tue 30 Jul 2013 10:53:29 PM EDT |  | Data directory: /home/tbar/BOINC
Tue 30 Jul 2013 10:53:29 PM EDT |  | Processor: 2 GenuineIntel Intel(R) Xeon(R) CPU            3060  @ 2.40GHz [Family 6 Model 15 Stepping 6]
Tue 30 Jul 2013 10:53:29 PM EDT |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_shadow
Tue 30 Jul 2013 10:53:29 PM EDT |  | OS: Linux: 2.6.32-50-generic
Tue 30 Jul 2013 10:53:29 PM EDT |  | CUDA: NVIDIA GPU 0: GeForce GTS 250 (driver version unknown, CUDA version 3.20, compute capability 1.1, 1023MB, 930MB available, 705 GFLOPS peak)
Tue 30 Jul 2013 10:53:29 PM EDT |  | OpenCL: NVIDIA GPU 0: GeForce GTS 250 (driver version 260.19.44, device version OpenCL 1.0 CUDA, 1023MB, 930MB available, 705 GFLOPS peak)
Tue 30 Jul 2013 10:53:29 PM EDT | SETI@home | Found app_info.xml; using anonymous platform...


<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
state.fold_buf_size_short=65536; state.fold_buf_size_long=262144
OpenCL platform detected: NVIDIA Corporation
Number of OpenCL devices found : 1 
BOINC assigns slot on device #0.
Info: BOINC provided device ID used
Used GPU device parameters are:
	Number of compute units: 16
	Single buffer allocation size: 255MB
	max WG size: 512
	FERMI path used: no
INFO: can't open binary kernel file: /home/tbar/BOINC/projects/setiathome.berkeley.edu/AstroPulse_Kernels_r1844.cl_GeForceGTS250.bin_V6, continue with recompile...
SIGSEGV: segmentation violation
Stack trace (33 frames):
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64(boinc_catch_signal+0x4d)[0x4aa1fd]
/lib/libpthread.so.0(+0xf8f0)[0x7f99c2e6f8f0]
/lib/libc.so.6(cfree+0x1d)[0x7f99c26d142d]
/usr/lib/libnvidia-compiler.so.260.19.44(+0xcfdd5e)[0x7f99bf4f8d5e]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x489ebd)[0x7f99bec84ebd]
/usr/lib/libnvidia-compiler.so.260.19.44(+0xb2e40f)[0x7f99bf32940f]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x98fd76)[0x7f99bf18ad76]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x993922)[0x7f99bf18e922]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x991422)[0x7f99bf18c422]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x993922)[0x7f99bf18e922]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x991422)[0x7f99bf18c422]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x993922)[0x7f99bf18e922]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x991422)[0x7f99bf18c422]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x993b20)[0x7f99bf18eb20]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x9b8c10)[0x7f99bf1b3c10]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x9c3cfb)[0x7f99bf1becfb]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x9c4335)[0x7f99bf1bf335]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85aafc)[0x7f99bf055afc]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85b029)[0x7f99bf056029]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85a1a9)[0x7f99bf0551a9]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85a746)[0x7f99bf055746]
/usr/lib/libnvidia-compiler.so.260.19.44(+0x85a77d)[0x7f99bf05577d]
/usr/lib/libnvidia-compiler.so.260.19.44(+0xe4a6f)[0x7f99be8dfa6f]
/usr/lib/libnvidia-compiler.so.260.19.44(+0xe6ab1)[0x7f99be8e1ab1]
/usr/lib/libnvidia-compiler.so.260.19.44(NvCliCompileProgram+0xdf4)[0x7f99be8dcac4]
/usr/lib/libcuda.so(+0x1486de)[0x7f99c1d1e6de]
/usr/lib/libcuda.so(+0x13f297)[0x7f99c1d15297]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x47cfd3]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x467612]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x460360]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x46561d]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7f99c2670c4d]
../../projects/setiathome.berkeley.edu/ap_6.07r1844_sse2_clNV_linux64[0x40a809]

Exiting...
</stderr_txt>

??SIGSEGV: segmentation violation??
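
As a side note on reading these results: the three numbers in lines such as `process exited with code 193 (0xc1, -63)` and `process exited with code 4 (0x4, -252)` appear to be one exit status shown as decimal, as hexadecimal, and as the decimal value minus 256. That reading is inferred from the two lines quoted in this thread, not from BOINC documentation, and the function name is hypothetical.

```python
def describe_exit_code(code):
    """Format an exit status as 'decimal (hex, decimal minus 256)'."""
    return f"{code} ({hex(code)}, {code - 256})"


# Reproduces the two result lines quoted in this thread:
# describe_exit_code(193) gives "193 (0xc1, -63)"
# describe_exit_code(4) gives "4 (0x4, -252)"
```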
ID: 1396828
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1396829 - Posted: 31 Jul 2013, 3:09:13 UTC - in response to Message 1396828.  

What are the permissions on each of the files?

ID: 1396829
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1396835 - Posted: 31 Jul 2013, 3:20:07 UTC - in response to Message 1396829.  
Last modified: 31 Jul 2013, 3:48:33 UTC

What are the permissions on each of the files?

They're in my Home folder. They are Mine...Mine...Mine. I used this setup successfully with ap_6.07r1844_sse2_clNV in 12.04 mere days ago. Everything is basically the same as it was then. The only difference is 10.04 and Driver 260.19.44. The other Apps work normally, x41g_x86_64 and ap_6.01r546 haven't any problems...
ID: 1396835
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1396934 - Posted: 31 Jul 2013, 10:42:14 UTC
Last modified: 31 Jul 2013, 10:47:55 UTC

I'm not too big on coincidences...
So, I don't think it's a coincidence that the only Nvidia Linux driver that should run NV AstroPulse without using a full CPU core seems to have a problem. At least on my machine. I've tried 2 different versions of the 260 driver, with 2 different versions of BOINC, on two different kernels. I receive the same Error. If I just install the 270 driver, it works. Something doesn't smell right. Of course, the 270 driver uses a full CPU core...of course.

Wed 31 Jul 2013 04:43:05 AM EDT |  | Starting BOINC client version 7.0.65 for x86_64-pc-linux-gnu
Wed 31 Jul 2013 04:43:05 AM EDT |  | Libraries: libcurl/7.19.7 GnuTLS/2.8.5 zlib/1.2.3.3 libidn/1.15
Wed 31 Jul 2013 04:43:05 AM EDT |  | Data directory: /home/tbar/BOINC
Wed 31 Jul 2013 04:43:05 AM EDT |  | Processor: 2 GenuineIntel Intel(R) Xeon(R) CPU            3060  @ 2.40GHz [Family 6 Model 15 Stepping 6]
Wed 31 Jul 2013 04:43:05 AM EDT |  | OS: Linux: 2.6.32-50-generic
Wed 31 Jul 2013 04:43:05 AM EDT |  | CUDA: NVIDIA GPU 0: GeForce GTS 250 (driver version unknown, CUDA version 4.0, compute capability 1.1, 1023MB, 922MB available, 705 GFLOPS peak)
Wed 31 Jul 2013 04:43:05 AM EDT |  | OpenCL: NVIDIA GPU 0: GeForce GTS 250 (driver version 270.41.19, device version OpenCL 1.0 CUDA, 1023MB, 922MB available, 705 GFLOPS peak)
Wed 31 Jul 2013 04:43:05 AM EDT | SETI@home | Found app_info.xml; using anonymous platform


INFO: can't open binary kernel file: /home/tbar/BOINC/projects/setiathome.berkeley.edu/AstroPulse_Kernels_r1844.cl_GeForceGTS250.bin_V6, continue with recompile...
INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan: /home/tbar/BOINC/projects/setiathome.berkeley.edu/AP_clFFTplan_GeForceGTS250_32768_r1844.bin, continue with recompile...

It builds the Binaries, and works as it should...
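
To compare runs across driver versions, the `OpenCL:` startup line of the BOINC 7 event log can be parsed for the reported driver and device versions. The sketch below assumes the line format shown in the logs in this thread; the regular expression and function name are written for illustration. A 6.x client prints no such line at all, so an empty result from an old client says nothing about whether OpenCL is installed.

```python
import re

OPENCL_LINE = re.compile(
    r"OpenCL: NVIDIA GPU (?P<index>\d+): (?P<model>.+?) "
    r"\(driver version (?P<driver>[\d.]+), device version (?P<device>[^,]+),"
)


def opencl_devices(log_lines):
    """Return (index, model, driver version, device version) per OpenCL line."""
    devices = []
    for line in log_lines:
        match = OPENCL_LINE.search(line)
        if match:
            devices.append(
                (
                    int(match.group("index")),
                    match.group("model"),
                    match.group("driver"),
                    match.group("device"),
                )
            )
    return devices


# Hypothetical usage with a saved event log:
# with open("event_log.txt") as handle:
#     print(opencl_devices(handle))
```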
ID: 1396934
Urs Echternacht
Volunteer tester
Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 1397013 - Posted: 31 Jul 2013, 16:01:13 UTC - in response to Message 1396934.  

Isn't there any Linux driver between 260.19.44 and 270.41.19 that could be tested as well? The "complete" list looks like there is none.

The errors with 260.19.44 look like a bug in NV's OpenCL compiler that was probably fixed in a newer version.

While searching for an answer, check this post (in some other forum).
Maybe that explains the errors you have seen with the 260.19.44 driver.
And NV OpenCL on Windows should show similar problems before 263.06 WHQL. (-> Claggy might have tested this.)

_\|/_
U r s
ID: 1397013
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1397295 - Posted: 31 Jul 2013, 22:47:02 UTC - in response to Message 1397013.  

Isn't there any Linux driver between 260.19.44 and 270.41.19 that could be tested as well? The "complete" list looks like there is none.

The errors with 260.19.44 look like a bug in NV's OpenCL compiler that was probably fixed in a newer version.

While searching for an answer, check this post (in some other forum).
Maybe that explains the errors you have seen with the 260.19.44 driver.
And NV OpenCL on Windows should show similar problems before 263.06 WHQL. (-> Claggy might have tested this.)

Yes, that is exactly what I've experienced: "This driver fixes clBuildProgram as opposed to 263.00 and 263.09..." It seems the OpenCL build program is broken in Linux driver 260.xx. There aren't any in-between drivers, just earlier versions of 260. I'd expect the earlier versions to be even buggier. I'll try 260.19.36 later, from here: ftp://download.nvidia.com/XFree86/Linux-x86_64/ The other site, Releases in Debian, doesn't have anything. I'm getting very low on APs on that Host, and the Server decided to 'Expire' one instead of sending it back as a NV task.
ID: 1397295
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1397333 - Posted: 1 Aug 2013, 1:39:05 UTC - in response to Message 1397295.  

Another Driver, another Error. This appears hopeless. Who knows, the problems nVidia had with the 260.xx drivers may be why they changed things after they fixed it with 266.xx. I'm going to end this chase and install driver 290 for a while. I do wish we would see some AstroPulses again...so I could play with the Mac again.

Wed 31 Jul 2013 08:50:08 PM EDT |  | Starting BOINC client version 7.0.65 for x86_64-pc-linux-gnu
Wed 31 Jul 2013 08:50:08 PM EDT |  | Libraries: libcurl/7.19.7 GnuTLS/2.8.5 zlib/1.2.3.3 libidn/1.15
Wed 31 Jul 2013 08:50:08 PM EDT |  | Data directory: /home/tbar/BOINC
Wed 31 Jul 2013 08:50:08 PM EDT |  | Processor: 2 GenuineIntel Intel(R) Xeon(R) CPU            3060  @ 2.40GHz [Family 6 Model 15 Stepping 6]
Wed 31 Jul 2013 08:50:08 PM EDT |  | OS: Linux: 2.6.32-50-generic
Wed 31 Jul 2013 08:50:08 PM EDT |  | CUDA: NVIDIA GPU 0: GeForce GTS 250 (driver version unknown, CUDA version 3.20, compute capability 1.1, 1023MB, 929MB available, 705 GFLOPS peak)
Wed 31 Jul 2013 08:50:08 PM EDT |  | OpenCL: NVIDIA GPU 0: GeForce GTS 250 (driver version 260.19.36, device version OpenCL 1.0 CUDA, 1023MB, 929MB available, 705 GFLOPS peak)
Wed 31 Jul 2013 08:50:08 PM EDT | SETI@home | Found app_info.xml; using anonymous platform


<stderr_out>
<![CDATA[
<message>
couldn't start app: Can't get shared memory segment name: can't get shared mem segment name
</message>
]]>
</stderr_out>

ID: 1397333
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.