Too many errors Ubuntu 16.04 nvidia

Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1832014 - Posted: 23 Nov 2016, 14:21:38 UTC - in response to Message 1832013.  

> > I suspect something isn't happy about 2 tasks/card;
>
> -instances_per_device 2 is missing in your Linux version.
>
> Some params are not working on Linux, as I found in my testing last year.

I've never been sure that this parameter mattered; it felt like it was merely for logging. I was passing it to the stock app and it wasn't being logged. I was also passing it to the build TBar made, and I just checked: it isn't showing up in the output there either.

I suppose I could go trawling through the code to see where it's omitted on Linux, but since I haven't heard anything back about my last submitted patch, my enthusiasm is low.
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1832061 - Posted: 23 Nov 2016, 19:29:00 UTC
Last modified: 23 Nov 2016, 19:54:00 UTC

I checked 16.10; its release date was 13-Oct-2016: http://releases.ubuntu.com/
I've never had much confidence in brand new Ubuntu releases.
NVIDIA has released one non-beta driver since then: http://www.nvidia.com/object/linux-amd64-display-archive.html
I'd give that OS version a few months before I tried running SETI on it.

I built a couple of apps from r3568, and both are slower on my machine than the two r3566 apps. The r3567 build a CA is the best app on my machine at present. I haven't had any stalls with any of the apps, and NVIDIA X Server Settings shows the GPU load bouncing between 96% and 100% when running one instance.
r3568 SoG build, http://setiathome.berkeley.edu/result.php?resultid=5308398478
r3567 aka r3566 using Intel path, http://setiathome.berkeley.edu/result.php?resultid=5308398484

There seems to be a problem with the r3568 apps and reference_work_unit_r3215.wu. For some reason that WU produces an error in standalone mode, while none of the other WUs have that problem:
...
period_iterations_num=10
ERROR: OpenCL kernel/call 'clGetEventProfilingInfo' call failed (-7) in file ../../src/GPU_lock.cpp near line 546.
Waiting 30 sec before restart...
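
For anyone decoding that status: in the Khronos OpenCL headers, -7 is CL_PROFILING_INFO_NOT_AVAILABLE. Per the OpenCL specification, clGetEventProfilingInfo returns it when the command queue was created without CL_QUEUE_PROFILING_ENABLE, when the event has not reached CL_COMPLETE, or when the event is a user event. The log alone does not say which of those applies here. Below is a minimal Python sketch that pulls the call name and code out of a line like the one above and maps the code to its symbolic name. The names decode_opencl_error and OPENCL_ERRORS, and the regex, are illustrative assumptions based only on the log format shown in this post; they are not part of the SETI apps.

```python
import re

# Subset of status codes as defined in the Khronos OpenCL header (cl.h).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -7: "CL_PROFILING_INFO_NOT_AVAILABLE",
    -8: "CL_MEM_COPY_OVERLAP",
    -9: "CL_IMAGE_FORMAT_MISMATCH",
    -10: "CL_IMAGE_FORMAT_NOT_SUPPORTED",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -12: "CL_MAP_FAILURE",
}

# Hypothetical pattern: assumes the log format quoted in the post,
# i.e. "... '<call>' call failed (<code>) in file ...".
_FAILED_CALL = re.compile(r"'(\w+)' call failed \((-?\d+)\)")


def decode_opencl_error(log_line):
    """Return (api_call, code, symbolic_name), or None if the line doesn't match."""
    match = _FAILED_CALL.search(log_line)
    if match is None:
        return None
    code = int(match.group(2))
    return match.group(1), code, OPENCL_ERRORS.get(code, "UNKNOWN")


if __name__ == "__main__":
    line = ("ERROR: OpenCL kernel/call 'clGetEventProfilingInfo' call failed (-7) "
            "in file ../../src/GPU_lock.cpp near line 546.")
    print(decode_opencl_error(line))
```

Codes outside the table come back as "UNKNOWN" rather than raising, so the helper can be run over a whole stderr dump without stopping on unfamiliar lines.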
Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1832083 - Posted: 23 Nov 2016, 22:56:53 UTC

FWIW, I've run that build for nearly 24 hours with a single task per card, and I'm not seeing any tasks stall out or any driver crashes.


 