AK V8 + CUDA MB team work mod

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 862271 - Posted: 5 Feb 2009, 11:02:03 UTC - in response to Message 862265.  

And another update: this one is a hack into the BOINC API.
This build should ignore BOINC's request to suspend execution when BOINC switches to another task.
This should reduce idle GPU time and increase total system performance.
The update is appropriate for all SSE levels.

Special thanks go to Jason, who pointed me where to dig :)

Warning: I don't know if it will work as intended, so consider this update experimental. If you have no time to watch your BOINC installation, or you don't feel able to deal with the possible consequences, please don't use it.

I would like to receive some feedback on whether it helps avoid the GPU idle state, and on how it works on your host in general.
ID: 862271 · Report as offensive
Profile Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 862280 - Posted: 5 Feb 2009, 11:42:08 UTC - in response to Message 862271.  
Last modified: 5 Feb 2009, 12:27:15 UTC

And another update: this one is a hack into the BOINC API.
This build should ignore BOINC's request to suspend execution when BOINC switches to another task.
This should reduce idle GPU time and increase total system performance.
The update is appropriate for all SSE levels.

Special thanks go to Jason, who pointed me where to dig :)

Warning: I don't know if it will work as intended, so consider this update experimental. If you have no time to watch your BOINC installation, or you don't feel able to deal with the possible consequences, please don't use it.

I would like to receive some feedback on whether it helps avoid the GPU idle state, and on how it works on your host in general.


After installing this new app, the AK task that was running started being done by the CUDA app, and the CUDA app switched to the task AK was running.

When the CUDA task finished, it then started on the task that AK had been running, and AK started a new task.

AMD64 3800+
8500GT
181.20

Funny thing is, the stderr of task 1148690298 doesn't show AK being used in it at all, but the CPU time is much higher than for my typical CUDA task.

Edit: Now several errors:
1148690301
1148690272
1148658473
1148640072
1148640072

I believe these are all VLAR autokills, so that's working.
After that, I downloaded some more tasks and it went right into high priority for the CUDA task, though this time I did notice CUDA started its own task and AK kept doing its thing.

Also task 1148549829 does show "VLAR WU (AR: 0.012856 )detected, but task partially done already, continuing computations" as well as the transition between AK to CUDA
ID: 862280 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 862291 - Posted: 5 Feb 2009, 13:27:52 UTC - in response to Message 862280.  


Also task 1148549829 does show "VLAR WU (AR: 0.012856 )detected, but task partially done already, continuing computations" as well as the transition between AK to CUDA


Restarted at 74.48 percent.
So it saves that task from being trashed, fine! It seems it really works :)
Now it would be interesting to see what happens if BOINC commands the CUDA task to suspend.
I anticipate the following situation:
A new CPU-based task will be started -> 5 CPU-based tasks will run at once, and the CUDA-based one should continue computing, giving 6 active tasks in total (on a quad with ncpus==5). I consider such a config more optimal than the 5+0 you would get with previous builds.

Thanks for the report!
ID: 862291 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 862334 - Posted: 5 Feb 2009, 15:53:20 UTC - in response to Message 862291.  
Last modified: 5 Feb 2009, 16:01:08 UTC

Those who still experience the CUDA app freezing with no CPU consumption and low GPU temperature could try the build attached to this post:
http://lunatics.kwsn.net/gpu-crunching/ak-v8-cuda-mb-team-work-mod.msg13810.html#msg13810

For info about the conditions under which it will help, and some additional considerations, read the few posts before that one (on the Lunatics site, I mean).

ADDON:
And a suggestion: if you experience delays (the PC behaves sluggishly) when running the CUDA app while browsing the Internet, playing a game, or watching video on a multicore system, try excluding the first CPU (by setting affinity for the process in Task Manager) for the non-BOINC app that experiences the delays (i.e. browser, game, media player). You could get a better experience that way. Of course, don't forget to upgrade to the attached build in this case.
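
For those who prefer to think of affinity as a bitmask rather than a row of checkboxes, here is a minimal Python sketch of what "exclude the first CPU" amounts to. It only computes the mask; it does not apply it to any process. The function name `affinity_mask_excluding` and its parameters are made up for illustration, and it assumes the usual convention that bit 0 stands for the first logical CPU.

```python
def affinity_mask_excluding(cpu_count, excluded=(0,)):
    """Return a bitmask with one bit per logical CPU and the excluded CPUs cleared.

    Hypothetical helper for illustration: bit 0 is assumed to be the first CPU.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    mask = (1 << cpu_count) - 1          # start with every CPU allowed
    for cpu in excluded:
        if not 0 <= cpu < cpu_count:
            raise ValueError(f"CPU index {cpu} is out of range")
        mask &= ~(1 << cpu)              # clear the bit of the excluded CPU
    if mask == 0:
        raise ValueError("at least one CPU must stay allowed")
    return mask


# Quad core, first CPU excluded for the browser, game or media player.
print(bin(affinity_mask_excluding(4)))   # 0b1110
```

On a quad core this corresponds to leaving CPU 1 to CPU 3 ticked and CPU 0 unticked in the affinity dialog.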
ID: 862334 · Report as offensive
Profile mr.kjellen
Volunteer tester
Joined: 4 Jan 01
Posts: 195
Credit: 71,324,196
RAC: 0
Sweden
Message 862434 - Posted: 5 Feb 2009, 20:53:11 UTC

Raistmer,

I just installed CUDA on my server running Windows Server 2003 and got a strange error with V7 on this host and result:
http://setiathome.berkeley.edu/result.php?resultid=1151902080

"There are no child processes to wait for. (0x80) - exit code 128 (0x80)"

What's up with that?

Running stock 6.08 now...


/Anton
ID: 862434 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 862636 - Posted: 6 Feb 2009, 5:43:04 UTC - in response to Message 862434.  

Raistmer,

I just installed CUDA on my server running Windows Server 2003 and got a strange error with V7 on this host and result:
http://setiathome.berkeley.edu/result.php?resultid=1151902080

"There are no child processes to wait for. (0x80) - exit code 128 (0x80)"

What's up with that?

Running stock 6.08 now...


/Anton

Hm... If it were the V8 "team" mod, I would think it couldn't start the CUDA app, hence no child process, but for V7...
And is stock 6.08 running OK? I see only compute errors and tasks in progress for that host.
ID: 862636 · Report as offensive
Profile nutcase
Volunteer tester
Joined: 13 Jun 05
Posts: 19
Credit: 6,589,801
RAC: 0
United States
Message 862647 - Posted: 6 Feb 2009, 6:54:32 UTC

So, will we ever see a 64-bit Windows version of the CUDA app?


ID: 862647 · Report as offensive
Profile Paul D Harris
Volunteer tester

Joined: 1 Dec 99
Posts: 1122
Credit: 33,600,005
RAC: 0
United States
Message 862711 - Posted: 6 Feb 2009, 14:06:05 UTC - in response to Message 862250.  
Last modified: 6 Feb 2009, 14:07:11 UTC

After following this thread, I've now got 5 WUs crunching at the same time. They all look the same on the task list. Is there any way to tell which one is running on the GPU? It's tough to tell when one is running at high priority and the other 4 are not. The one running on 6.08, rather than 5.28, perhaps?

Q6700 processor, 8800 GTS video card.

Even if the GPU goes the same speed as the Q6700, this is going to be a nice RAC bump for the machine.

Look at the progress bar and CPU time in the "Tasks" section of BOINC. You'll see that for 4 tasks the CPU time increases each second, but for the fifth one it grows very slowly despite good progress, and the final CPU time is several times smaller than for the other WUs. This means that the fifth WU is being crunched by the GPU.

For example, my PC needs about 50 minutes of CPU time to crunch a standard 60-credit MB WU on the CPU and only 4 minutes of CPU time to crunch it using the GPU (though GPU crunching actually takes about 15 minutes).

Also, when GPU crunching starts you'll see that the CPU time increases but the progress bar is idle. This happens because GPU initialization takes about 30 seconds, during which the CPU is active but WU crunching hasn't started yet (so the progress bar doesn't move).
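
As a rough worked example of the figures above, the following Python sketch compares CPU time with elapsed time. The numbers are the ones quoted in this post, reading the 15 minutes as elapsed time. The function names and the 0.5 threshold are assumptions for illustration only, and the speed-up figure assumes that a CPU-crunched WU takes roughly as much elapsed time as CPU time.

```python
def cpu_share(cpu_minutes, elapsed_minutes):
    """Fraction of the elapsed (wall-clock) time during which the task used the CPU."""
    if elapsed_minutes <= 0:
        raise ValueError("elapsed_minutes must be positive")
    return cpu_minutes / elapsed_minutes


def looks_gpu_crunched(cpu_minutes, elapsed_minutes, threshold=0.5):
    """Guess that a task runs on the GPU when its CPU time lags far behind elapsed time.

    The threshold is a hypothetical value picked for this example, not a BOINC setting.
    """
    return cpu_share(cpu_minutes, elapsed_minutes) < threshold


cpu_wu_cpu_minutes = 50.0      # CPU time for a WU crunched on the CPU
gpu_wu_cpu_minutes = 4.0       # CPU time for the same kind of WU crunched on the GPU
gpu_wu_elapsed_minutes = 15.0  # elapsed time for the GPU-crunched WU

print(f"{cpu_share(gpu_wu_cpu_minutes, gpu_wu_elapsed_minutes):.2f}")   # 0.27
print(looks_gpu_crunched(gpu_wu_cpu_minutes, gpu_wu_elapsed_minutes))   # True
# Assumption: the CPU-crunched WU needs about 50 minutes of elapsed time as well.
print(f"{cpu_wu_cpu_minutes / gpu_wu_elapsed_minutes:.1f}x")            # 3.3x
```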


Thanks
I found out that to enable hyper-threading on W2kAS I have to install the OS while the i7 chip is installed. I installed the OS when I had a Celeron installed, so I have to reinstall Windows 2000 Advanced Server with the i7 chip in place for it to enable hyper-threading.


I finally got my OS enabled for hyper-threading and it is crunching along!
Except the CPU is running a little hot. The BIOS says 45, and Core Temp says 80 at idle and 90 with SETI. I tried a paste I got at Best Buy that is supposed to be silver but is a fluid and not a grease. I will get some Arctic Silver from my local computer shop, ComputerXchange. The i7 should not be this hot, should it?
ID: 862711 · Report as offensive
Profile SoNic

Joined: 24 Dec 00
Posts: 140
Credit: 2,963,627
RAC: 0
Romania
Message 862714 - Posted: 6 Feb 2009, 14:15:13 UTC
Last modified: 6 Feb 2009, 14:16:47 UTC

The times posted for CUDA-processed units are just CPU times, not the real GPU time, so they don't give a real measure of the performance gain. So the "proof" of the speed increase is in the average RAC increase for that machine.
Maybe one of the next things to do will be measuring the real GPU time, but I think that is not easy to do because of the nature of current GPUs (they have no preemptive multitasking).
ID: 862714 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 862758 - Posted: 6 Feb 2009, 16:52:39 UTC - in response to Message 862714.  

BTW, the apps from this thread write their elapsed wall-clock time to stderr.
ID: 862758 · Report as offensive
Profile SoNic

Joined: 24 Dec 00
Posts: 140
Credit: 2,963,627
RAC: 0
Romania
Message 862769 - Posted: 6 Feb 2009, 17:35:06 UTC - in response to Message 862758.  

OK, thank you! So it is just a matter of the BOINC site not reading those numbers and displaying them along with the CPU time?

ID: 862769 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 862801 - Posted: 6 Feb 2009, 19:02:10 UTC - in response to Message 862769.  

OK, thank you! So it is just a matter of the BOINC site not reading those numbers and displaying them along with the CPU time?


Actually, it displays it along with the other stderr output.

For example:


Flopcounter: 21080962173356.973000

Spike count: 3
Pulse count: 1
Triplet count: 2
Gaussian count: 0

Wall-clock time elapsed since last restart: 1198.7 seconds
called boinc_finish

</stderr_txt>
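
If you want to pull that figure out of a saved stderr text automatically, a small Python sketch like the one below would do. The regular expression is modelled only on the line shown above, so treat the exact wording of the message as an assumption for other builds; `wall_clock_seconds` is a name invented for this example.

```python
import re

# Pattern modelled on the single stderr line quoted in this post.
WALL_CLOCK_RE = re.compile(
    r"Wall-clock time elapsed since last restart:\s*([0-9]+(?:\.[0-9]+)?)\s*seconds"
)


def wall_clock_seconds(stderr_text):
    """Return the reported wall-clock seconds, or None if the line is missing."""
    match = WALL_CLOCK_RE.search(stderr_text)
    return float(match.group(1)) if match else None


sample = """Spike count: 3
Pulse count: 1
Triplet count: 2
Gaussian count: 0

Wall-clock time elapsed since last restart: 1198.7 seconds
called boinc_finish
"""

print(wall_clock_seconds(sample))   # 1198.7
```

Note that the line reports time "since last restart", so for a task that was interrupted and resumed it covers only the final run.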
ID: 862801 · Report as offensive
john deneer
Volunteer tester
Joined: 16 Nov 06
Posts: 331
Credit: 20,996,606
RAC: 0
Netherlands
Message 862830 - Posted: 6 Feb 2009, 19:54:02 UTC - in response to Message 857579.  

Hello Raistmer,

Quoting from the 1st message in this thread:

5) This AK V8 build was not PGOed so it will show worse performance than current CPU-only AK V8 SSSE3x app (will be fixed if this approach will be useful)


I'm now using one of the later versions of this package on a Q6600, and I have the impression that crunching on the CPU is still slower than I would expect from regular AK V8. So I guess that the 'will be fixed ...' part hasn't been implemented yet?

Could you please explain what PGO stands for?

Regards,
John.
ID: 862830 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 862831 - Posted: 6 Feb 2009, 20:00:30 UTC - in response to Message 862830.  

Hello Raistmer,

Quoting from the 1st message in this thread:

5) This AK V8 build was not PGOed so it will show worse performance than current CPU-only AK V8 SSSE3x app (will be fixed if this approach will be useful)


I'm now using one of the later versions of this package on a Q6600, and I have the impression that crunching on the CPU is still slower than I would expect from regular AK V8. So I guess that the 'will be fixed ...' part hasn't been implemented yet?

Could you please explain what PGO stands for?

Regards,
John.


This is outdated info. I edited the post on Lunatics but can't edit the first post here (very inconvenient indeed).

Current performance of all the provided "team" CPU apps is equal to that of the corresponding AK_v8 "standalone" versions, and the SSE3_AMD build is even 1-2% faster.
ID: 862831 · Report as offensive
john deneer
Volunteer tester
Joined: 16 Nov 06
Posts: 331
Credit: 20,996,606
RAC: 0
Netherlands
Message 862835 - Posted: 6 Feb 2009, 20:16:58 UTC - in response to Message 862831.  


This is outdated info. I edited the post on Lunatics but can't edit the first post here (very inconvenient indeed).

Current performance of all the provided "team" CPU apps is equal to that of the corresponding AK_v8 "standalone" versions, and the SSE3_AMD build is even 1-2% faster.

Okay, good to know. Still: what does PGO mean?

Regards,
John.
ID: 862835 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 862843 - Posted: 6 Feb 2009, 20:28:48 UTC - in response to Message 862835.  


This is outdated info. I edited the post on Lunatics but can't edit the first post here (very inconvenient indeed).

Current performance of all the provided "team" CPU apps is equal to that of the corresponding AK_v8 "standalone" versions, and the SSE3_AMD build is even 1-2% faster.

Okay, good to know. Still: what does PGO mean?

Regards,
John.

Profile-guided optimisation?

(Google is your friend - Google "PGO compiler")

F.
ID: 862843 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 862844 - Posted: 6 Feb 2009, 20:29:39 UTC - in response to Message 862835.  

Okay, good to know. Still: what does PGO mean?

Regards,
John.


Profile-Guided Optimization: the compiler uses additional information about the app's workload, collected in special profiling runs of the app, to further optimize its performance.
ID: 862844 · Report as offensive
Profile SoNic

Joined: 24 Dec 00
Posts: 140
Credit: 2,963,627
RAC: 0
Romania
Message 862850 - Posted: 6 Feb 2009, 20:39:20 UTC - in response to Message 862801.  

OK, thank you! So it is just a matter of the BOINC site not reading those numbers and displaying them along with the CPU time?

Actually, it displays it along with the other stderr output.


I meant that the GPU time is not shown in the task status on the SETI website; it shows only the CPU time. There should be another field that shows the difference between wall time and CPU time, only for units processed with CUDA.

ID: 862850 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 862853 - Posted: 6 Feb 2009, 20:49:19 UTC - in response to Message 862850.  


I meant that the GPU time is not shown in the task status on the SETI website; it shows only the CPU time. There should be another field that shows the difference between wall time and CPU time, only for units processed with CUDA.

IMHO that wouldn't help for so many reasons, e.g.

CPU time is always less than wall time (even for a CPU-crunched task) because the CPU is also servicing all the background tasks you can see in your Windows Task Manager.

It has been pointed out elsewhere that if you suspend BOINC for any period of time (with tasks left in memory), that time is added to the wall time (it doesn't matter whether you are looking at the CPU or the GPU).

Etc...

F.
ID: 862853 · Report as offensive
Profile SoNic

Joined: 24 Dec 00
Posts: 140
Credit: 2,963,627
RAC: 0
Romania
Message 862857 - Posted: 6 Feb 2009, 21:11:33 UTC

That is true for CPU-only units. But I was talking about the CPU+GPU units, where the difference gives an indication of the GPU time.
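
To make the idea concrete, here is a hedged Python sketch of that difference. `estimate_gpu_seconds` is a hypothetical helper, not anything BOINC or the SETI@home site provides. The optional `suspended_seconds` argument is there because of the objection above that suspended time is counted in wall time. The result is only a rough upper bound on the GPU time, since wall time also includes periods when neither the CPU nor the GPU is working on the task.

```python
def estimate_gpu_seconds(wall_seconds, cpu_seconds, suspended_seconds=0.0):
    """Rough upper bound on GPU time: wall time minus suspended time minus CPU time.

    Hypothetical helper for illustration; all arguments are in seconds.
    """
    if min(wall_seconds, cpu_seconds, suspended_seconds) < 0:
        raise ValueError("times must be non-negative")
    # Clamp at zero so a CPU-only task never yields a negative estimate.
    return max(wall_seconds - suspended_seconds - cpu_seconds, 0.0)


# Figures quoted earlier in this thread: about 15 minutes elapsed, 4 minutes of CPU time.
print(estimate_gpu_seconds(15 * 60, 4 * 60))   # 660.0
```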
ID: 862857 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.