V10 of modified SETI MB CUDA + opt AP package for full multi-GPU+CPU use
Author | Message |
---|---|
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
... As BOINC itself runs at a higher priority than the worker threads - yes, IMHO it's a possible reason. I checked the current CPU time of boinc.exe on my host - it has already taken 11 min of CPU time and continues with some peaks of 9% CPU usage (BOINC 6.6.20). With 11 min of CPU time, it looks like some GPU tasks could be affected by these delays. ADDON: this can be tested by using the locked-affinity CUDA MB mod and restricting BOINC's affinity to all cores but the first one. That way CUDA MB should be free from boinc.exe interference. But in this case another problem arises for multi-GPU hosts. When a new task starts, the freshly launched CUDA MB instance will interfere with the other running app instances (the newly started one tries to take a whole core for itself, but all CUDA MB instances are locked to the same single core by the affinity lock mod). This situation was discovered by Bob Mahoney and was the reason I removed the affinity lock mod from recent CUDA MB builds. Considering possible boinc.exe interference with CUDA MB running times, maybe it's worth restoring that mod for single-GPU, multicore-CPU hosts and restricting boinc.exe from using the first core. |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
The complete pack, with Number_of_GPUs and the other needed modifications such as cc_config, works just fine with BOINC 6.6.20. I still use it this way because I'm doing only MB on that host right now. But a partially modified setup (with only app_info replaced, for example) can of course lead to errors. New app_info, "teamed" to ordinary AK_v8 replacement, and removal of ncpus from cc_config are needed for "upgrading" to BOINC 6.6.20 CPU/GPU scheduling for SETI MB. |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
"As BOINC itself runs at a higher priority than the worker threads - yes, IMHO it's a possible reason. I checked the current CPU time of boinc.exe on my host - it has already taken 11 min of CPU time and continues with some peaks of 9% CPU usage (BOINC 6.6.20). With 11 min of CPU time, it looks like some GPU tasks could be affected by these delays." I'm only the 'amateur'.. ;-) As I understood it, the operating system [Windows XP Home] (and BOINC) would be intelligent enough to schedule the lower-priority app tasks in order of their priorities, so that only the CPU tasks would be 'disturbed' by boinc.exe, because the CUDA tasks have higher priority. Is it possible to make a mod (or how could I tell my OS?) so that the CUDA tasks can't be disturbed and have 100 % CPU support all the time during GPU crunching? :-) |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"As BOINC itself runs at a higher priority than the worker threads - yes, IMHO it's a possible reason. I checked the current CPU time of boinc.exe on my host - it has already taken 11 min of CPU time and continues with some peaks of 9% CPU usage (BOINC 6.6.20). With 11 min of CPU time, it looks like some GPU tasks could be affected by these delays." With an RTOS, yes, a correct priority setting should ensure that a task will not be interrupted by less important activity... but Windows is in no way an RTOS. Maybe a port to QNX is required to fully utilize the GPU's potential... :) |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
"As BOINC itself runs at a higher priority than the worker threads - yes, IMHO it's a possible reason. I checked the current CPU time of boinc.exe on my host - it has already taken 11 min of CPU time and continues with some peaks of 9% CPU usage (BOINC 6.6.20). With 11 min of CPU time, it looks like some GPU tasks could be affected by these delays." ..oops.. I forgot.. Thinking about the results I posted above.. If boinc.exe disturbs the CUDA tasks too, then as I understand it all the CUDA tasks would be affected. So maybe all results would have ~ 30 sec.* longer crunching time [vs. GPU-only crunching], and not just some with a very much longer crunching time.. Or not? [* EDIT: from min. to sec. .. ;-)] |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
"With an RTOS, yes, a correct priority setting should ensure that a task will not be interrupted by less important activity... but Windows is in no way an RTOS." What are you talking about..? ;-) I'm the 'amateur'.. :-D QNX? I don't know it.. ;-) If you think it's possible to do.. please do it.. :-) If you're thinking about it and/or you've done this function.. please give a hint! :-D |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Maybe not. Actually, moving an app from one core to another can be a very costly operation if the two cores don't share the same L2 cache. I hope Windows is clever enough to realize this and not migrate processes too often. And that means boinc.exe will be executed mostly on one core and would interfere with only one GPU app on a multi-GPU host, or from time to time with the single GPU app (only when the following condition becomes true: boinc.exe uses excessive CPU _and_ boinc.exe and the CUDA MB app are executing on the same core at that moment). |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
It's not just a function, it's a complete real-time OS. Free for non-commercial use, BTW (see the Wikipedia link about it in my earlier post). So I'm afraid it's not an option for Windows-bound users anyway. Maybe some playing with priorities _and_ affinity could increase overall performance for hosts with big queues and high boinc.exe CPU time usage. IMHO BOINC scheduling and bookkeeping have become too complex to be considered negligible overhead... |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
My 5th: http://setiathome.berkeley.edu/result.php?resultid=1202338717 icfft=98384, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error --------------------------------------------------------------- If you would like to help Raistmer.. please post the '-12 Unknown error' also.. *thumb up* It could look like this: Exception detected inside cudaAcc_find_triplets, dumping client state icfft=98384, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel File: ..\analyzePoT.cpp Line: 348 And only the [bolded] line is needed. |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
"Maybe not. Actually, moving an app from one core to another can be a very costly operation if the two cores don't share the same L2 cache. I hope Windows is clever enough to realize this and not migrate processes too often. And that means boinc.exe will be executed mostly on one core and would interfere with only one GPU app on a multi-GPU host, or from time to time with the single GPU app (only when the following condition becomes true:" Ohh.. I see - it's not so easy to be an 'optimizer/code cutter'.. ;-) It's good that I'm not you.. ;-D ..now I'm an AMD Phenom II user.. AFAIK, this CPU doesn't have a shared L2 cache; every CPU core has its own L2 cache. [4 x L2 cache] So this performance loss would be higher.. than for Core2 Quads, which have 2 x L2 cache. [2 CPU cores share one L2 cache] Maybe it would be possible to fix boinc.exe on one CPU core? [Or make boinc.exe spread its 25 % CPU evenly over all CPU cores? [4 x 6.25 % CPU usage]*] But this would mean making a BOINC mod, wouldn't it? Would this also eliminate the L2 cache loss? (Which version would be better? Only one CPU core for boinc.exe, or equally shared across all 4?) [For example a quad-core CPU with 4 GPUs.. ;-)] For rigs with many GPUs it would help to raise performance.. [* If I remember my GPU-only crunching time correctly.. In Task Manager - when boinc.exe took 25 % CPU, all the CPU cores had (different) higher usage peaks.. So boinc.exe splits its 25 % CPU across all 4 CPU cores. But not 6.25 % CPU for each..] |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Maybe it would be possible to fix boinc.exe on one CPU-Core?" Yes, by using Task Manager's affinity settings or by using a program like ProcessLasso. |
piper69 Joined: 25 Sep 08 Posts: 49 Credit: 3,042,244 RAC: 0 |
Question to you guys. Does it impact the overall performance of the CUDA app if I lower these two values, <avg_ncpus>0.127970</avg_ncpus> and <max_ncpus>0.127970</max_ncpus>, to something like 0.03-0.04? Any answer/opinion is appreciated. Thx |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Question to you guys." It's just a hint to the BOINC scheduler, not any real CPU usage or CPU usage limit. |
piper69 Joined: 25 Sep 08 Posts: 49 Credit: 3,042,244 RAC: 0 |
The problem is, when I look in Task Manager the CUDA app is constantly at around 12% CPU usage. Is that normal? |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"The problem is, when I look in Task Manager the CUDA app is constantly at around 12% CPU usage." It depends on CPU speed. I looked at the results from your Athlon 64 X2 host - it seems the CUDA part is working OK. You can see high CPU usage for the CUDA app when it can't initialize properly (low GPU memory or another error) and falls back to CPU processing. This mode should be avoided because CPU processing by CUDA MB is much slower than by AK_v8. But your GPU works OK. |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
... Didn't you see my questions? ;-) If you don't have time for a new thread.. could you please post the URLs of the recommended CUDA Vx apps and .dll's? Then someone from the community could open a new thread for this.. :-) |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Could you please post the URLs of the recommended CUDA Vx apps and .dll's?" http://lunatics.kwsn.net/12-gpu-crunching/v10-of-modified-seti-mb-cuda-opt-ap-package-for-full-multi-gpucpu-use.msg16715.html#msg16715 |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
"Could you please post the URLs of the recommended CUDA Vx apps and .dll's?" Thanks a lot! Hmm.. but.. what about rigs with > 1 GPU in them..? Will there soon be a new app release for these rigs too? |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
"Could you please post the URLs of the recommended CUDA Vx apps and .dll's?" ..hmm.. Did I misread your post at lunatics.kwsn.net? I understood it to mean that this app is ONLY for rigs with 1 GPU. No? |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Could you please post the URLs of the recommended CUDA Vx apps and .dll's?" You read it correctly. I updated that post and included builds w/o affinity lock based on the current codebase. There should be no actual performance difference from the last V10/11 update buried in the corresponding Lunatics thread. I did a fresh build just for convenience. |
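
The affinity idea discussed in this thread (keep boinc.exe off the first core so that a CUDA MB instance pinned there is not disturbed) comes down to choosing a per-process core bitmask. Below is a minimal sketch of that bookkeeping only. The function name `affinity_mask` and the 4-core host are illustrative assumptions; this is not code from BOINC or from the CUDA MB builds, and it does not set anything on a running process.

```python
def affinity_mask(num_cores, excluded=()):
    """Return a bitmask with one bit per allowed core (bit 0 = first core).

    Hypothetical helper for illustration; not part of BOINC or CUDA MB.
    """
    if num_cores <= 0:
        raise ValueError("num_cores must be positive")
    mask = (1 << num_cores) - 1          # start with every core allowed
    for core in excluded:
        if not 0 <= core < num_cores:
            raise ValueError(f"core {core} is out of range")
        mask &= ~(1 << core)             # clear the bit of each excluded core
    if mask == 0:
        raise ValueError("at least one core must remain allowed")
    return mask


# Assumed 4-core host: boinc.exe gets every core except the first,
# while the single CUDA MB instance is kept on the first core only.
boinc_mask = affinity_mask(4, excluded=[0])
cuda_mask = affinity_mask(4, excluded=[1, 2, 3])

print(hex(boinc_mask))  # 0xe
print(hex(cuda_mask))   # 0x1
```

Each set bit corresponds to one ticked core in Task Manager's affinity dialog. As noted in the thread, pinning every CUDA MB instance to the same single core is a problem on multi-GPU hosts, so a layout like this would only suit a single-GPU, multicore host.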