Linux CUDA 'Special' App finally available, featuring Low CPU use

Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7485
Credit: 91,088,852
RAC: 2,208
Australia
Message 1835937 - Posted: 14 Dec 2016, 21:00:01 UTC - in response to Message 1835910.  

. . It is a shame I am running Windows, I would love to see how well the Special Cuda 60 app performs on my GT730 (CC=3.5), even though I am planning on retiring it...

You could always build a Linux machine for it, the parts are cheap on eBay.


. . But then I would have to learn Linux.

Stephen

.


As soon as I (or anyone) have managed to isolate and fix the unroll bug, the special alpha will become available in Windows form. It isn't yet, only because of the mass devastation of project results that would occur if it were released. Fortunately I have had more time for digging since a couple of days ago, though I'm not promising quick fixes. The pulsefinding is unfortunately the most complex part of these applications.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1835937 · Report as offensive     Reply Quote
rob smithSpecial Project $250 donor
Volunteer tester

Joined: 7 Mar 03
Posts: 15385
Credit: 258,725,798
RAC: 277,217
United Kingdom
Message 1835938 - Posted: 14 Dec 2016, 21:00:43 UTC

The world just got stranger....
Reverted to the state before doing "a", rebooted (twice: I made a typo the first time round and got some "really interesting" error messages), and I'm now seeing all three GPUs chugging away on Beta. (I haven't had the courage yet to try your special app; I wanted to get the beast behaving itself on a "stock" application first.)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1835938 · Report as offensive     Reply Quote
rob smithSpecial Project $250 donor
Volunteer tester

Joined: 7 Mar 03
Posts: 15385
Credit: 258,725,798
RAC: 277,217
United Kingdom
Message 1835940 - Posted: 14 Dec 2016, 21:11:36 UTC

Downloaded, and ready to install, but let's allow the beast to settle down before I give it a serious "prod in the ribs" - probably tomorrow night (UK time)....

For those who get impatient waiting for me to report back on this machine, it is this one: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8161267
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1835940 · Report as offensive     Reply Quote
TBar
Volunteer tester

Joined: 22 May 99
Posts: 3835
Credit: 191,750,513
RAC: 209,151
United States
Message 1835941 - Posted: 14 Dec 2016, 21:16:27 UTC - in response to Message 1835938.  

The world just got stranger....
Reverted to the state before doing "a", rebooted (twice: I made a typo the first time round and got some "really interesting" error messages), and I'm now seeing all three GPUs chugging away on Beta. (I haven't had the courage yet to try your special app; I wanted to get the beast behaving itself on a "stock" application first.)

Hmmm, OK.
I was just about to explain again why I don't use the repository version of BOINC. I use the Berkeley version with the single BOINC folder in my Home folder. Everything just works using the Berkeley version in the Home folder, and there are a few Hosts around using even the ancient 6.10 version of BOINC with the 10 series GPUs without any problems. As long as you're using an Ubuntu 14.04 based version of Linux, the older Berkeley version of BOINC still works. Just don't try the Berkeley BOINC with 16.04. Yet another reason to stay with 14.04 ;-)

Glad to hear it's working.
ID: 1835941 · Report as offensive     Reply Quote
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7485
Credit: 91,088,852
RAC: 2,208
Australia
Message 1835949 - Posted: 14 Dec 2016, 21:38:41 UTC - in response to Message 1835937.  

As soon as I (or anyone) have managed to isolate and fix the unroll bug, the special alpha will become available in Windows form. It isn't yet, only because of the mass devastation of project results that would occur if it were released. Fortunately I have had more time for digging since a couple of days ago, though I'm not promising quick fixes. The pulsefinding is unfortunately the most complex part of these applications.


[Update:] Have located a race condition across unroll for best pulse. @TBar: have you any examples of non-best reportables out of whack (with unroll active) ?
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1835949 · Report as offensive     Reply Quote
TBar
Volunteer tester

Joined: 22 May 99
Posts: 3835
Credit: 191,750,513
RAC: 209,151
United States
Message 1835954 - Posted: 14 Dec 2016, 22:00:45 UTC - in response to Message 1835949.  

Well, I'd say about half the time a Bad Pulse is reported the Best Pulse is also incorrect. You can see a couple examples above http://setiathome.berkeley.edu/forum_thread.php?id=80636&postid=1834778. It seems to be worse in OSX....if that's what you're asking ;-)
ID: 1835954 · Report as offensive     Reply Quote
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Joined: 24 Nov 06
Posts: 7485
Credit: 91,088,852
RAC: 2,208
Australia
Message 1835956 - Posted: 14 Dec 2016, 22:04:28 UTC - in response to Message 1835954.  
Last modified: 14 Dec 2016, 22:15:10 UTC

Well, I'd say about half the time a Bad Pulse is reported the Best Pulse is also incorrect. You can see a couple examples above http://setiathome.berkeley.edu/forum_thread.php?id=80636&postid=1834778. It seems to be worse in OSX....if that's what you're asking ;-)


Rephrasing... It's always either only the 'best pulse' that's bad, or that same bad best and an equivalently bad reportable right ? (actually scratch that, frequency will depend on unroll, will have to do some figuring)

[Edit:] Looking through those again, yeah the overflow one has no bests (fair enough) . Gives me a bit more ammo, thanks. Looks like a missing reduction step for unroll bests
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1835956 · Report as offensive     Reply Quote
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 5868
Credit: 77,156,382
RAC: 35,221
Russia
Message 1835977 - Posted: 14 Dec 2016, 23:45:05 UTC - in response to Message 1835956.  
Last modified: 14 Dec 2016, 23:46:06 UTC

If I recall correctly, one time we saw just the reverse:
a BAD reported signal but a correct best (in the non-bugged case, reported and best were just the same).
(So the main issue is incorrect reporting of reportable signals; incorrect bests are of less importance.)
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1835977 · Report as offensive     Reply Quote
Stephen "Heretic"Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 2702
Credit: 51,273,828
RAC: 117,541
Australia
Message 1835985 - Posted: 15 Dec 2016, 0:50:16 UTC - in response to Message 1835921.  

Learning Linux isn't too bad, especially if you use one of the modern distros - I've just reconfigured one of my PCs into a Linux box using "Mint" and am now getting the beast to run as I want it. The actual install took less than half an hour, sits alongside the Windows installation which is still there as a fall-back if I need it (or any of the data therein).
The one thing I am having to think about is that only one of the GPUs is recognised by BOINC/S@H, and I do recall having a similar problem under Windows...


. . Hi Bob,

. . It is the prospect of needing to learn a whole new system from the ground up that I find daunting.

. . I am still trying to understand the ins and outs of Windows without jumping in at the deep end :) with Linux.

Stephen

.
ID: 1835985 · Report as offensive     Reply Quote
Stephen "Heretic"Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 2702
Credit: 51,273,828
RAC: 117,541
Australia
Message 1835986 - Posted: 15 Dec 2016, 0:53:34 UTC - in response to Message 1835934.  

Great, but it didn't work :-(
Mint (Cinnamon, 18) is happy in telling me I have three GPUs: two GTX970s and one 1080.
Prodding around I found, in one of the boinc-client directories, a file called "coproc_info.xml"; this file lists all three GPUs. I've set "use_all_gpus" to 1, that was the first thing I did. After trying (a) from the two options, BOINC isn't detecting any of the GPUs - one step back :-(
So, I've tried re-installing the drivers (Nvidia v.367.57), and I've now lost BOINC Manager from the GUI - oh joy, and I had everything suspended while I tried the above - now to find the command lines I need to get back in control.
Ah, getting lots of "authorization failure -102" responses, hmm, I don't like that as it means the boinc user has lost his marbles somewhere along the road :-(


. . OK, now I definitely don't want to get involved with Linux :)

Stephen

.
ID: 1835986 · Report as offensive     Reply Quote
Stephen "Heretic"Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 2702
Credit: 51,273,828
RAC: 117,541
Australia
Message 1835988 - Posted: 15 Dec 2016, 0:57:44 UTC - in response to Message 1835937.  


As soon as I (or anyone) have managed to isolate and fix the unroll bug, the special alpha will become available in Windows form. It isn't yet, only because of the mass devastation of project results that would occur if it were released. Fortunately I have had more time for digging since a couple of days ago, though I'm not promising quick fixes. The pulsefinding is unfortunately the most complex part of these applications.


. . Not that I have the foggiest about the unroll bug, but I like the idea of a test version being available for Windows. However long it takes, it won't be any worse than waiting for this low profile 1050 Ti that MSI touted. Nearly a month and not a peep since they posted its existence.

Stephen

.
ID: 1835988 · Report as offensive     Reply Quote
Profile ML1
Volunteer tester

Joined: 25 Nov 01
Posts: 9378
Credit: 7,128,501
RAC: 3,946
United Kingdom
Message 1835992 - Posted: 15 Dec 2016, 1:45:47 UTC

Is it not an old boinc 'feature' that only the first GPU is reported even though all the GPUs can be used?

(Easiest is if all the GPUs are the same...)


Aside: I've had multiple GPUs working on a system. Care was needed with the drivers and ensuring boinc was added to the "video" group.
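
For anyone wanting to check the group membership mentioned above, here is a minimal sketch using only the Python standard library. It assumes the client runs under an account named "boinc" and that the distro uses a group called "video" for GPU device access; both names come from the note above and may differ on your system. The helper name `user_in_group` is made up for this example.

```python
import grp
import pwd


def user_in_group(user: str, group: str) -> bool:
    """Return True if `user` belongs to `group`, either as a listed
    supplementary member or because it is the user's primary group."""
    try:
        group_entry = grp.getgrnam(group)
        user_entry = pwd.getpwnam(user)
    except KeyError:
        # Unknown user or unknown group: treat as "not a member".
        return False
    return user in group_entry.gr_mem or user_entry.pw_gid == group_entry.gr_gid


if __name__ == "__main__":
    # "boinc" and "video" are assumptions taken from the post above.
    if user_in_group("boinc", "video"):
        print("boinc is in the video group")
    else:
        print("boinc is NOT in the video group (or one of them does not exist)")
```

If it reports that the account is missing from the group, something like `sudo usermod -aG video boinc` followed by a restart of the client is the usual fix on most distros, but check your own distro's documentation first.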


Happy fast crunchin,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1835992 · Report as offensive     Reply Quote
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7485
Credit: 91,088,852
RAC: 2,208
Australia
Message 1835997 - Posted: 15 Dec 2016, 2:44:59 UTC - in response to Message 1835977.  
Last modified: 15 Dec 2016, 3:03:38 UTC

If I recall correctly, one time we saw just the reverse:
a BAD reported signal but a correct best (in the non-bugged case, reported and best were just the same).
(So the main issue is incorrect reporting of reportable signals; incorrect bests are of less importance.)


At present, the race condition showing itself in code would appear to have a number of possible manifestations like that. That doesn't rule out more symptoms from the same, or another bug/race further along. Am probably going to have to reverse engineer what was done first though. I'll need to confirm that results without unroll are normal/correct, versus unroll inducing the weirdness.

With unroll, I'm seeing periods distributed across cores via the y grid dimension, which should be fine, so my feeling is there'll be another race deeper in for the result table, as there is with the best table.

[There should be unroll x bests + unroll x result tables to reduce as an extra step, but not seeing it... though eyes do need a rest]
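
To illustrate the kind of extra reduction step meant here, a minimal sketch in plain Python. This is not the app's code: the names (`Pulse`, `best_per_slice`, `reduce_bests`) are made up, the scoring is a placeholder, and a sequential loop stands in for the GPU blocks spread along the y grid dimension. The idea is only that each unroll slice writes to its own slot instead of a shared one (so there is nothing to race on), and a separate pass then collapses the slots.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pulse:
    score: float   # placeholder figure of merit, higher is better
    period: int    # which period this candidate came from


def best_per_slice(candidates: List[Pulse], unroll: int) -> List[Optional[Pulse]]:
    """Each unroll slice scans only its own periods and writes only its own slot."""
    table: List[Optional[Pulse]] = [None] * unroll
    for slot in range(unroll):
        # Slice `slot` handles every `unroll`-th candidate, starting at `slot`.
        for pulse in candidates[slot::unroll]:
            current = table[slot]
            if current is None or pulse.score > current.score:
                table[slot] = pulse
    return table


def reduce_bests(table: List[Optional[Pulse]]) -> Optional[Pulse]:
    """The extra step: collapse the per-slice slots into one overall best."""
    found = [p for p in table if p is not None]
    return max(found, key=lambda p: p.score) if found else None


if __name__ == "__main__":
    pulses = [Pulse(0.3, 1), Pulse(0.9, 2), Pulse(0.5, 3), Pulse(0.7, 4), Pulse(0.1, 5)]
    print(reduce_bests(best_per_slice(pulses, unroll=2)))
```

The same pattern would apply to the result tables: one table per unroll slice, merged afterwards. If the slices instead all compare-and-write a single shared best, the outcome depends on which slice gets there last.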
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1835997 · Report as offensive     Reply Quote
Rockhount
Joined: 29 May 00
Posts: 34
Credit: 19,210,551
RAC: 19,376
Germany
Message 1836023 - Posted: 15 Dec 2016, 9:31:44 UTC

After one day of successful crunching I can't take my eyes off the machine stats.
https://setiathome.berkeley.edu/results.php?hostid=1931980
Very impressive what this small 750Ti is doing with all the units, no matter if guppi or something else.
The inconclusives seem to be coming from the wingmen. Even if not, the error rate is nevertheless very low.

The RAC could be 3-4 times higher with the 750Ti.
Due to the higher power consumption I reduced the runtime down to 18h per day.
Regards from northern Germany
Roman

SETI@home classic workunits 207,059
SETI@home classic CPU time 1,251,095 hours

ID: 1836023 · Report as offensive     Reply Quote
Rockhount
Joined: 29 May 00
Posts: 34
Credit: 19,210,551
RAC: 19,376
Germany
Message 1836041 - Posted: 15 Dec 2016, 12:18:22 UTC

It seems that the special app (Cuda 6) performs very well.
This one against the 1060 6G
https://setiathome.berkeley.edu/workunit.php?wuid=2361128750

Against the GTX 780Ti
https://setiathome.berkeley.edu/workunit.php?wuid=2361013895

Do they crunch more tasks in parallel, or does the 750Ti perform that well?
Regards from northern Germany
Roman

SETI@home classic workunits 207,059
SETI@home classic CPU time 1,251,095 hours

ID: 1836041 · Report as offensive     Reply Quote
TBar
Volunteer tester

Joined: 22 May 99
Posts: 3835
Credit: 191,750,513
RAC: 209,151
United States
Message 1836045 - Posted: 15 Dec 2016, 12:48:36 UTC - in response to Message 1836041.  
Last modified: 15 Dec 2016, 12:50:08 UTC

The Special App was designed to run the Non-VLAR Arecibo tasks very fast. The current tasks are Non-VLARs, so you're just seeing the App do what it was designed to do. The VLARs were almost ignored in the early builds that didn't use an unroll function. The unroll was added to run the VLARs better. When the VLARs return there won't be that much difference in the Apps. Except the older CUDA 50 App, which doesn't have the unroll function, will be Dog slow on the VLARs. The biggest speed difference is when running the Arecibo Shorties; a lot of time was spent designing the App to run the Shorties. Shame we don't see many Shorties anymore.
Speaking of Shorties, the Quick Overflow Arecibo tasks don't react very well in the Special App, so, if we get a large number of them again you will see a large list of Inconclusives. Fortunately most of them validate eventually, just a heads up.
ID: 1836045 · Report as offensive     Reply Quote
Rockhount
Joined: 29 May 00
Posts: 34
Credit: 19,210,551
RAC: 19,376
Germany
Message 1836046 - Posted: 15 Dec 2016, 13:12:19 UTC

Are these shorties?
https://setiathome.berkeley.edu/workunit.php?wuid=2361124696
https://setiathome.berkeley.edu/workunit.php?wuid=2361023315
https://setiathome.berkeley.edu/workunit.php?wuid=2360919373

They seem to be valid.
They also finish quite fast. The CPU version and the OpenCL app need much more time to finish.
Regards from northern Germany
Roman

SETI@home classic workunits 207,059
SETI@home classic CPU time 1,251,095 hours

ID: 1836046 · Report as offensive     Reply Quote
TBar
Volunteer tester

Joined: 22 May 99
Posts: 3835
Credit: 191,750,513
RAC: 209,151
United States
Message 1836050 - Posted: 15 Dec 2016, 13:29:26 UTC - in response to Message 1836046.  

No, those are merely Overflows. Look at the Angle ranges;
WU true angle range is : 0.415008
WU true angle range is : 0.448246
...message -9 result_overflow

Shorties have an Angle range above 2.xxxxx
They usually finish in under 3 minutes on the 750Ti, except when they Overflow.
The Files from Jan 2016 contained a large number of Shorty Overflows, and there are a couple of those files loaded into the Splitters.
They will be back.
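
If you want to sort a pile of task outputs this way without eyeballing each one, here is a rough Python sketch. It assumes the stderr text contains lines shaped like the two kinds quoted above ("WU true angle range is : ..." and a line containing "result_overflow"). The 2.0 cutoff is an assumption read off "above 2.xxxxx", not an official constant, and `summarize` / `TaskSummary` are made-up names.

```python
import re
from typing import NamedTuple, Optional

# Matches e.g. "WU true angle range is : 0.415008"
ANGLE_RE = re.compile(r"WU true angle range is\s*:\s*([0-9]*\.?[0-9]+)")

# Assumption: "Shorties have an Angle range above 2.xxxxx" read as "> 2.0".
SHORTY_MIN_AR = 2.0


class TaskSummary(NamedTuple):
    angle_range: Optional[float]  # None if no angle range line was found
    is_shorty: bool
    overflowed: bool


def summarize(stderr_text: str) -> TaskSummary:
    """Pull the angle range and overflow flag out of a task's stderr text."""
    match = ANGLE_RE.search(stderr_text)
    angle = float(match.group(1)) if match else None
    return TaskSummary(
        angle_range=angle,
        is_shorty=angle is not None and angle > SHORTY_MIN_AR,
        overflowed="result_overflow" in stderr_text,
    )


if __name__ == "__main__":
    sample = "WU true angle range is : 0.415008\n...message -9 result_overflow\n"
    print(summarize(sample))
```

On the first sample above this would flag an Overflow that is not a Shorty, which matches the reading given for those workunits.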
ID: 1836050 · Report as offensive     Reply Quote
rob smithSpecial Project $250 donor
Volunteer tester

Joined: 7 Mar 03
Posts: 15385
Credit: 258,725,798
RAC: 277,217
United Kingdom
Message 1836059 - Posted: 15 Dec 2016, 15:20:23 UTC
Last modified: 15 Dec 2016, 15:26:53 UTC

I hope that my 1080 and 970s manage to avoid them, but I bet they won't.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1836059 · Report as offensive     Reply Quote
Stephen "Heretic"Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 2702
Credit: 51,273,828
RAC: 117,541
Australia
Message 1836078 - Posted: 15 Dec 2016, 17:25:56 UTC - in response to Message 1836023.  

After one day of successful crunching I can't take my eyes off the machine stats.
https://setiathome.berkeley.edu/results.php?hostid=1931980
Very impressive what this small 750Ti is doing with all the units, no matter if guppi or something else.
The inconclusives seem to be coming from the wingmen. Even if not, the error rate is nevertheless very low.

The RAC could be 3-4 times higher with the 750Ti.
Due to the higher power consumption I reduced the runtime down to 18h per day.



. . They are very impressive times. They are a level of performance ahead of my 950s. Almost at the level of a 1060 ...

Stephen

.
ID: 1836078 · Report as offensive     Reply Quote


 