FAO Lunatics Beta testers - installer testing

Profile Miep
Volunteer moderator
Joined: 23 Jul 99
Posts: 2412
Credit: 351,996
RAC: 0
Message 1117355 - Posted: 15 Jun 2011, 15:49:15 UTC - in response to Message 1117350.  

try a reinstall of the lunatics. Perhaps you didn't install the Fermi apps


There is only ONE app, and stderr from his WUs shows it's running in CPU fallback, which leads to -177 errors because CPU runtimes are more than 10x longer than the estimated GPU runtimes (small wonder on a Fermi).

What is puzzling me is why it has trouble accessing the card.
Carola
-------
I'm multilingual - I can misunderstand people in several languages!
ID: 1117355
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34559
Credit: 79,922,639
RAC: 80
Germany
Message 1117359 - Posted: 15 Jun 2011, 15:57:01 UTC
Last modified: 15 Jun 2011, 15:58:45 UTC

CUDA driver version is insufficient for CUDA runtime version.


Upgrade to at least 26658 (driver 266.58), since you are running 25896 (driver 258.96) on this host.
With each crime and every kindness we birth our future.
ID: 1117359
Profile Miep
Volunteer moderator
Joined: 23 Jul 99
Posts: 2412
Credit: 351,996
RAC: 0
Message 1117366 - Posted: 15 Jun 2011, 16:10:00 UTC - in response to Message 1117359.  

CUDA driver version is insufficient for CUDA runtime version.


upgrade to at least 26658 since you are running 25896 on this host.



D'oh. I should read my own installer notes - thanks, Mike. (I'm so used to that message being an error that has nothing to do with the real error that I disregarded it...)

Fred, as per readme and beta post:

Requires minimum CUDA 3.2 capable NVidia driver: 263.06 (260.99 on notebooks)

Upgrade and rerun the installer.
[and crossposting to lunatics...]
Carola
-------
I'm multilingual - I can misunderstand people in several languages!
ID: 1117366
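The driver requirement quoted above can be checked mechanically. The following is only a minimal sketch, not part of the Lunatics installer: the function names are hypothetical, and it assumes driver versions are always three digits plus two digits, written either in dotted form (263.06) or in the 5-digit form seen in this thread (26658).

```python
from typing import Tuple


def parse_driver_version(text: str) -> Tuple[int, int]:
    """Parse '266.58' or '26658' into (266, 58).

    Assumption: versions are always 3 digits + 2 digits, as in this thread.
    """
    digits = text.strip().replace(".", "")
    if len(digits) != 5 or not digits.isdigit():
        raise ValueError(f"unrecognised driver version: {text!r}")
    return int(digits[:3]), int(digits[3:])


# Minimums quoted from the readme in the post above.
MIN_DESKTOP = parse_driver_version("263.06")
MIN_NOTEBOOK = parse_driver_version("260.99")


def driver_is_sufficient(installed: str, notebook: bool = False) -> bool:
    """Return True if the installed driver meets the quoted minimum."""
    minimum = MIN_NOTEBOOK if notebook else MIN_DESKTOP
    return parse_driver_version(installed) >= minimum


if __name__ == "__main__":
    # The host discussed in this thread reported 25896.
    print(driver_is_sufficient("25896"))
    print(driver_is_sufficient("26658"))
```

Comparing (major, minor) tuples rather than raw strings keeps the check independent of whether the version was written with or without the dot.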
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1117387 - Posted: 15 Jun 2011, 17:15:29 UTC - in response to Message 1117261.  
Last modified: 15 Jun 2011, 17:21:08 UTC

(...)
We've updated the CUDA MB with a SSE compliant version (prior builds were all SSE2) ...
(...)


Is there no speed difference between SSE and SSE2 usage on an SSE2+ CPU?
Or is this used only for the CUDA WU preparation (startup) time on the CPU?

I guess there are more SSE2+ than SSE CPUs out there with added GPUs.

Why not two CUDA apps (SSE & SSE2), or maybe also CUDA apps which use higher extensions?


- Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. -
ID: 1117387
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1117394 - Posted: 15 Jun 2011, 17:27:03 UTC - in response to Message 1117387.  
Last modified: 15 Jun 2011, 17:27:52 UTC

Sutaru, just like the old installer, you can choose the instruction set you need just by checking the one you want. The SSE was added just for those running an older machine that doesn't have the newer features. Checking the instruction set you need is for the optimized CPU side of things. The GPU side is another set of choices.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1117394
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1117413 - Posted: 15 Jun 2011, 17:55:25 UTC - in response to Message 1117394.  
Last modified: 15 Jun 2011, 17:59:33 UTC

No. ;-)

The old (currently freely available, x32f) CUDA app (V0.37 installer) needs a CPU with at least SSE2.

It looks like the new CUDA app (V0.38 installer) now needs only an SSE CPU.

I'm curious whether there is wasted potential on SSE2+ CPUs.

I guess if you use an SSE2+ CPU and the CUDA app uses only the SSE extension for the CUDA WU start, this would be wasted potential, even if it were only 1 or 2 seconds. For the enthusiastic cruncher this is a lot.


- Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. -
ID: 1117413
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1117414 - Posted: 15 Jun 2011, 17:56:05 UTC - in response to Message 1117387.  

...
Why not two CUDA apps (SSE & SSE2), or maybe also CUDA apps which use higher extensions?
...

Because even with the best CPU code, trying to do the task in CPU fallback would still take more than 10x as long on most systems and end in a -177 error. In the future we'll probably be moving toward using boinc_temporary_exit() instead of trying to do CPU fallback.

The tiny amount of improvement which might be achieved in initialization time, etc. by using SIMD is the only thing which might lead someone with nothing more important to do to make several different versions of the CUDA application. Then there's additional time required to get multiple versions properly tested, and so on. If we had a huge team of programmers maybe that might happen.
                                                                  Joe
ID: 1117414
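The reasoning about CPU fallback and the -177 error can be illustrated with a little arithmetic. This is a rough sketch only: the function name is hypothetical, and the 10x limit is taken from the figure quoted in this thread, not from the BOINC source. The real bound is set per task by the project.

```python
def exceeds_runtime_limit(estimated_gpu_secs: float,
                          cpu_slowdown: float,
                          limit_factor: float = 10.0) -> bool:
    """Return True if a task run in CPU fallback would outlast its time limit.

    estimated_gpu_secs: runtime the task was estimated to need on the GPU.
    cpu_slowdown: how many times slower the CPU fallback path is (assumed).
    limit_factor: allowed multiple of the estimate; 10x is the figure
                  quoted in the thread, used here as an assumption.
    """
    allowed_secs = estimated_gpu_secs * limit_factor
    fallback_secs = estimated_gpu_secs * cpu_slowdown
    return fallback_secs > allowed_secs


if __name__ == "__main__":
    # A task estimated at 1000 s on the GPU, with a CPU path 15x slower.
    print(exceeds_runtime_limit(1000, 15))
```

Under these assumptions, no amount of SIMD tuning on the CPU side helps unless it brings the fallback path under the limit, which is the point made in the post above.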
Profile Miep
Volunteer moderator
Joined: 23 Jul 99
Posts: 2412
Credit: 351,996
RAC: 0
Message 1117438 - Posted: 15 Jun 2011, 19:09:50 UTC - in response to Message 1117413.  

this would be wasted potential, even if it were only 1 or 2 seconds. For the enthusiastic cruncher this is a lot.


OK, let me see: a state-of-the-art Fermi needs some 1000 sec for a mid-AR task.
Congratulations, you found a 0.1% speed improvement. :)
Carola
-------
I'm multilingual - I can misunderstand people in several languages!
ID: 1117438
Profile Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1117459 - Posted: 15 Jun 2011, 20:22:20 UTC - in response to Message 1117438.  

I upgraded my drivers (I forget which version) and temporarily switched back
to x32f.
I now run 4 MBs together; memory use is 1280 MByte, and the card has ~1500 MByte.
I'd forgotten about the drivers, too. I'll reinstall x38f(?) tomorrow.


ID: 1117459
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1117463 - Posted: 15 Jun 2011, 20:28:58 UTC
Last modified: 15 Jun 2011, 20:38:51 UTC

...
Why not two CUDA apps (SSE & SSE2), or maybe also CUDA apps which use higher extensions?
...

Because even with the best CPU code, trying to do the task in CPU fallback would still take more than 10x as long on most systems and end in a -177 error. In the future we'll probably be moving toward using boinc_temporary_exit() instead of trying to do CPU fallback.

The tiny amount of improvement which might be achieved in initialization time, etc. by using SIMD is the only thing which might lead someone with nothing more important to do to make several different versions of the CUDA application. Then there's additional time required to get multiple versions properly tested, and so on. If we had a huge team of programmers maybe that might happen.
                                                                  Joe

Maybe a CUDA app without the CPU fallback function? For advanced users.
Then the SSE2 (or higher) extension could be used again.
Or one app where I could enable/disable SSE2 usage and the CPU fallback function via <cmdline>.

It looks like I'm currently the only one writing his fingers bloody.. ;-)
If a 'normal' CUDA app were in the installer and an 'advanced' CUDA app were also available separately, you could count the downloads to see how many people are interested in it and use it.


this would be wasted potential, even if it were only 1 or 2 seconds. For the enthusiastic cruncher this is a lot.

OK, let me see: a state-of-the-art Fermi needs some 1000 sec for a mid-AR task.
Congratulations, you found a 0.1% speed improvement. :)

My GTX260 OC does a 0.44x AR WU in ~500 secs.

500 secs - 172.8 WUs/day

498 secs - 173.5 WUs/day

This means + ~0.7 WU/day/GPU.

[EDIT: + 0.4% ;-D]

E.g. my 4x GPU machine = + ~280 RAC (if 100 Cr./WU).

The CPU's RAC increases too, because less CPU time is spent on the CUDA calculation.


Enthusiastic cruncher.. ;-)


- Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. -
ID: 1117463
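The throughput figures in the last two posts can be reproduced with a few lines of arithmetic. This is a sketch under the posters' own assumptions (a 2-second saving per task, 100 credits per WU, 4 GPUs); the function names are hypothetical, and treating extra credit per day as extra RAC is a steady-state simplification.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(secs_per_task: float) -> float:
    """Tasks one GPU completes per day, running one task at a time."""
    return SECONDS_PER_DAY / secs_per_task


def daily_gain(before_secs: float, after_secs: float) -> float:
    """Extra tasks per day per GPU after shortening the task runtime."""
    return tasks_per_day(after_secs) - tasks_per_day(before_secs)


def percent_gain(before_secs: float, after_secs: float) -> float:
    """Throughput improvement in percent."""
    return (before_secs / after_secs - 1.0) * 100.0


if __name__ == "__main__":
    gain = daily_gain(500, 498)          # GTX260 example from the post
    print(round(tasks_per_day(500), 1))  # baseline WUs/day
    print(round(tasks_per_day(498), 1))  # WUs/day with 2 s saved
    print(round(gain, 1))                # extra WUs/day per GPU
    print(round(percent_gain(500, 498), 1))
    # Assumed: 4 GPUs and 100 credits per WU, as stated in the post.
    print(round(gain * 4 * 100))
```

The same percent_gain function applied to a 1000-second task with 1 second saved gives roughly the 0.1% mentioned earlier, so the two posts differ mainly in the task runtime and saving they assume.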