SETI@home v8.19 Windows GPU applications support thread
Author | Message |
---|---|
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
OpenCL builds have only 3 priority levels (the third is the real-time one). Hi Zalster, yes they are the same apps I run. ProcessLasso is doing what I want for the moment. Raistmer's reply indicates the process priority limitation is built-in to the OpenCL platform. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
if I understand correctly, then you are saying that there is no built-in method of controlling priority level with BOINC's or SETI's configuration files? The limitation is built-in to the OpenCL platform? I am saying that the SETI MultiBeam OpenCL GPU app has 3 levels of priority: default, -hp and -rtp. If anything else is needed it can be done via other tools. And CPU process priority has no connection to the OpenCL platform at all. SETI apps news We're not gonna fight them. We're gonna transcend them. |
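For anyone wondering where those switches actually go: on an anonymous-platform setup they would normally live in the <cmdline> element of the app's <app_version> entry in app_info.xml (stock installs read them from the mb_cmdline*.txt file shipped with the app instead, if I recall correctly). A minimal, illustrative fragment; the version number, plan class and executable name below are only placeholders for whatever your own app_info.xml already contains:

    <app_version>
        <app_name>setiathome_v8</app_name>
        <version_num>819</version_num>
        <plan_class>opencl_nvidia_SoG</plan_class>
        <!-- default priority needs no switch; -hp raises it, -rtp is the real-time level -->
        <cmdline>-sbs 512 -hp</cmdline>
        <file_ref>
            <file_name>MB8_win_x86_SSE3_OpenCL_NV_SoG.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>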
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
I run Einstein's intel_gpu OpenCL tasks with Process priority class (not thread priority) right up at 'Real time'. In that particular special case (nothing directly to do with Raistmer's SETI applications at all), I can overcome something like a 6x speed degradation when running those apps with all four CPU cores saturated - though it probably depends what they're saturated with. There are a couple of recent and little-discussed cc_config.xml options: <process_priority>N</process_priority> and <process_priority_special>N</process_priority_special>. Although that's pretty coarse control, they might be worth exploring. |
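For reference, a minimal cc_config.xml using those two options might look like this; if I remember the documentation right, the values run from 0 (lowest, the default) to 5 (real-time), and <process_priority_special> is the one that applies to GPU and other non-CPU-intensive tasks:

    <cc_config>
        <options>
            <!-- priority for ordinary CPU tasks: 0 = lowest (default) ... 4 = high, 5 = real-time -->
            <process_priority>1</process_priority>
            <!-- priority for GPU / non-CPU-intensive tasks -->
            <process_priority_special>4</process_priority_special>
        </options>
    </cc_config>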
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Well, I am back with a NEW mystery: despite having 2 x GTX 750 Tis running 2 WUs each, I now have a situation where Task Manager shows 4 instances of 8.19 SoG running, but 2 have ~90 MB of RAM and 2 have ~4 MB and apparently are not doing anything (GPUShark shows only 2 instances of SoG running). In the past I had 4 of them in TM and GPUShark, and all were ~90 MB. Any ideas about what is going on?

I use an older version of EVGA Precision, which showed one of the cards at a GPU Utilization limit of 1 and the other at 0 (this has to do with whether overvoltage is allowed or not). I Googled, and the suggestion was to uninstall Precision, reboot and then reinstall. When I did, both cards were at 1, but after starting BOINC both were set to 0 by Precision (I think; at least the setting shown went to 0). But still only 2 SoGs running. After restarting BOINC again, I had 2 good and 2 bad SoGs in TM, and 2 in GPUShark, as before. And both cards have GPU utilization = 0 (they were both 1 when Precision started, and were set to 0 shortly after startup). This is the only machine I have seen this kind of behavior on (and only with 750 Tis).

Just to reiterate, my c/l is -sbs 512 -period_iterations_num 50 -hp -cpu_lock

And there was nothing unusual in the BOINC startup log:

15-Oct-2016 21:57:52 [---] cc_config.xml not found - using defaults
15-Oct-2016 21:57:52 [---] Starting BOINC client version 7.6.22 for windows_x86_64
15-Oct-2016 21:57:52 [---] log flags: file_xfer, sched_ops, task
15-Oct-2016 21:57:52 [---] Libraries: libcurl/7.45.0 OpenSSL/1.0.2d zlib/1.2.8
15-Oct-2016 21:57:52 [---] Data directory: C:\ProgramData\BOINC
15-Oct-2016 21:57:52 [---] Running under account woof
15-Oct-2016 21:57:53 [---] CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 373.06, CUDA version 8.0, compute capability 5.0, 2048MB, 1968MB available, 1455 GFLOPS peak)
15-Oct-2016 21:57:53 [---] CUDA: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 373.06, CUDA version 8.0, compute capability 5.0, 2048MB, 1884MB available, 1472 GFLOPS peak)
15-Oct-2016 21:57:53 [---] OpenCL: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 373.06, device version OpenCL 1.2 CUDA, 2048MB, 1968MB available, 1455 GFLOPS peak)
15-Oct-2016 21:57:53 [---] OpenCL: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 373.06, device version OpenCL 1.2 CUDA, 2048MB, 1884MB available, 1472 GFLOPS peak)
15-Oct-2016 21:57:53 [---] OpenCL CPU: AMD FX(tm)-8350 Eight-Core Processor (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 2.0 (sse2,avx,fma4), device version OpenCL 1.2 AMD-APP (938.2))
15-Oct-2016 21:57:53 [---] Host name: woof-PC
15-Oct-2016 21:57:53 [---] Processor: 8 AuthenticAMD AMD FX(tm)-8350 Eight-Core Processor [Family 21 Model 2 Stepping 0]
15-Oct-2016 21:57:53 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 popcnt aes f16c syscall nx lm avx svm sse4a osvw ibs xop skinit wdt lwp fma4 tce tbm topx page1gb rdtscp bmi1
15-Oct-2016 21:57:53 [---] OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
15-Oct-2016 21:57:53 [---] Memory: 15.97 GB physical, 31.93 GB virtual
15-Oct-2016 21:57:53 [---] Disk: 232.79 GB total, 187.26 GB free
15-Oct-2016 21:57:53 [---] Local time is UTC -4 hours
15-Oct-2016 21:57:53 [SETI@home] Found app_config.xml
15-Oct-2016 21:57:53 [SETI@home] Your app_config.xml file refers to an unknown application 'astropulse_v7'. Known applications: 'setiathome_v8'
15-Oct-2016 21:57:53 [SETI@home] URL http://setiathome.berkeley.edu/; Computer ID 8114988; resource share 120
15-Oct-2016 21:57:53 [SETI@home] General prefs: from SETI@home (last modified 21-May-2015 20:31:48)
15-Oct-2016 21:57:53 [SETI@home] Computer location: work
15-Oct-2016 21:57:53 [SETI@home] General prefs: no separate prefs for work; using your defaults
15-Oct-2016 21:57:53 [---] Reading preferences override file
15-Oct-2016 21:57:53 [---] Preferences:
15-Oct-2016 21:57:53 [---] max memory usage when active: 8174.31MB
15-Oct-2016 21:57:53 [---] max memory usage when idle: 14713.75MB
15-Oct-2016 21:57:53 [---] max disk usage: 100.00GB
15-Oct-2016 21:57:53 [---] max CPUs used: 7
15-Oct-2016 21:57:53 [---] suspend work if non-BOINC CPU load exceeds 25%
15-Oct-2016 21:57:53 [---] (to change preferences, visit a project web site or select Preferences in the Manager)
15-Oct-2016 21:57:53 Initialization completed
15-Oct-2016 21:57:53 [SETI@home] Sending scheduler request: To fetch work.
15-Oct-2016 21:57:53 [SETI@home] Requesting new tasks for NVIDIA GPU
15-Oct-2016 21:57:55 [SETI@home] Scheduler request completed: got 0 new tasks
15-Oct-2016 22:01:47 [SETI@home] Message from task: 0
15-Oct-2016 22:01:47 [SETI@home] Computation for task 19ja09ad.31343.1708.15.42.208_0 finished
15-Oct-2016 22:01:47 [SETI@home] Starting task 21au09aa.14085.476.13.40.190_0
15-Oct-2016 22:01:49 [SETI@home] Started upload of 19ja09ad.31343.1708.15.42.208_0_0
15-Oct-2016 22:01:52 [SETI@home] Finished upload of 19ja09ad.31343.1708.15.42.208_0_0
15-Oct-2016 22:03:00 [SETI@home] Sending scheduler request: To fetch work.
15-Oct-2016 22:03:00 [SETI@home] Reporting 1 completed tasks
15-Oct-2016 22:03:00 [SETI@home] Requesting new tasks for NVIDIA GPU
15-Oct-2016 22:03:02 [SETI@home] Scheduler request completed: got 2 new tasks
15-Oct-2016 22:03:04 [SETI@home] Started download of 12jl16aa.21961.12341.5.32.55
15-Oct-2016 22:03:04 [SETI@home] Started download of 12jl16aa.21961.12341.5.32.59
15-Oct-2016 22:03:06 [SETI@home] Finished download of 12jl16aa.21961.12341.5.32.55
15-Oct-2016 22:03:06 [SETI@home] Finished download of 12jl16aa.21961.12341.5.32.59

???? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I run Einstein's intel_gpu OpenCL tasks with Process priority class (not thread priority) right up at 'Real time'. In that particular special case (nothing directly to do with Raistmer's SETI applications at all), I can overcome something like a 6x speed degradation when running those apps with all four CPU cores saturated - though it probably depends what they're saturated with. I tried that setting in cc_config.xml, Richard, as I stated in my original post. It had no effect on Raistmer's app. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
???? There is little benefit in running more than 1 WU per GPU with the SoG application. For best performance, 1 CPU core per GPU WU is required. If you want to run multiple WUs on multiple GPUs, it would be worth reserving 1 core for each WU. Using app_config.xml is generally the best method; if you make a mistake you won't trash your cache, which is what tends to happen if you make a mess of app_info.xml. Mike mentioned earlier in the thread that while you have 8 CPU cores, there are only 4 FPUs; 2 cores share 1 FPU. If 2 GPU WUs are trying to share the 1 FPU, the work will stall. Grant Darwin NT |
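For anyone following along, a minimal app_config.xml along those lines (2 WUs per GPU, with a full core reserved for each) would look something like this; setiathome_v8 is the application name BOINC reported in the log above, and gpu_usage goes back to 1 if you drop to 1 WU per GPU:

    <app_config>
        <app>
            <name>setiathome_v8</name>
            <gpu_versions>
                <!-- 0.5 of a GPU per task = 2 tasks per card -->
                <gpu_usage>0.5</gpu_usage>
                <!-- reserve one full CPU core for each GPU task -->
                <cpu_usage>1.0</cpu_usage>
            </gpu_versions>
        </app>
    </app_config>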
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Well, I am back with a NEW mystery: despite having 2 x GTX 750 Tis running 2 WUs each, I now have a situation where Task Manager shows 4 instances of 8.19 SoG running, but 2 have ~90 MB of RAM and 2 have ~4 MB and apparently are not doing anything (GPUShark shows only 2 instances of SoG running). It's been a while since I used that command, but I think when you use -cpu_lock you need to include how many instances per GPU you are running, and the total number of work units, in the command line. I can't remember exactly what the commands looked like; it was something along the lines of -instance_per_gpu X or -instance_per_device X and -total_instance X. I'd wait for someone to post the correct command line before trying it. |
Kiska Send message Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0 |
Well, I am back with a NEW mystery: despite having 2 x GTX 750 Tis running 2 WUs each, I now have a situation where Task Manager shows 4 instances of 8.19 SoG running, but 2 have ~90 MB of RAM and 2 have ~4 MB and apparently are not doing anything (GPUShark shows only 2 instances of SoG running). Please enable this option in your cc_config.xml file: <use_all_gpus>0|1</use_all_gpus> |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
Please enable this option in your cc_config.xml file: <use_all_gpus>0|1</use_all_gpus>

Not needed. In this case he doesn't have a cc_config.xml file at all, and there's no point in creating one just for this:

15-Oct-2016 21:57:52 [---] cc_config.xml not found - using defaults

Note that the Wiki says "most capable ones" - plural - and because he has two cards of the same specification, both are being used already:

15-Oct-2016 21:57:53 [---] CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 373.06, CUDA version 8.0, compute capability 5.0, 2048MB, 1968MB available, 1455 GFLOPS peak)

Whatever it is that's causing memory usage to spike up to 4GB (we did have that problem with an Astropulse application once, IIRC, with a certain type of task), I very much doubt that it's the BOINC configuration. It might even be a glitch in the reporting tool, or, as others have said, the interaction of all the command line switches causing the SoG application itself to over-commit single instances of resources rather than allowing the multi-tasking tools in the OS to distribute them normally. |
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
I am running both the v8 8.00 and v8 8.19 stock apps on my Windows 10 PC with a GTX 750 OC. They both run fine, but 8.00 uses only 7% CPU while 8.19 uses 23%. I am also running two vLHC@home or Atlas@home tasks, using VirtualBox, on the AMD A10-6700 CPU, which should have 4 cores but Windows Task Manager says it has 2 cores and 4 logical processors. 24 GB RAM. Tullio |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Per the ReadMe file:

-cpu_lock : Enables CPUlock feature. Results in CPUs number limitation for particular app instance. Also attempt to bind different instances to different CPU cores will be made. Can be used to increase performance under some specific conditions. Can decrease performance in other cases though. Experimentation required. Now this option allows GPU app to use only single logical CPU. Different instances will use different CPUs as long as there is enough of CPU in the system. Use -instances_per_device N option if multiple instances on GPU device are used.

-instances_per_device N : Sets allowed number of simultaneously executed GPU app instances per GPU device (shared with MultiBeam app instances). N - integer number of allowed instances. Should not exceed 64.

SETI apps news We're not gonna fight them. We're gonna transcend them. |
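Putting that together with the command line quoted earlier, a host running 2 instances per card would presumably want something like this (illustrative only; keep whatever other switches you already use):

    -sbs 512 -period_iterations_num 50 -hp -cpu_lock -instances_per_device 2

In other words, -cpu_lock on its own assumes one instance per device; adding -instances_per_device 2 tells the affinity logic that two app instances share each GPU, so they get bound to different logical CPUs.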
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Thanks Raistmer, I thought I remembered needing that. I think there is also one other command needed, specifying the total number of instances. Keith, do you remember what the command is?? |
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
???? In my app_config.xml I already allocate 1 CPU per GPU WU; since I have 2 GPUs running 2 each, that's 4 CPUs, and I am not running CPU WUs. Would it be better to try -use_sleep, to force the app to give up its CPU, to take care of the problem of 4 FPUs and 8 CPUs? Or: how can I bind each WU to a given FPU so as not to have conflicts? |
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
It was 4MB, not GB. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
It was 4MB, not GB. Sorry, I misremembered. I was quoting Kiska in my reply, and it's hard to refer back to other posts at the same time with the current generation of forum software. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
If you choose to use -use_sleep, I would remove the CPU lock. Too many changes at once make it hard to tell what is causing the problems. |
Mike Send message Joined: 17 Feb 01 Posts: 34348 Credit: 79,922,639 RAC: 80 |
If you choose to use -use_sleep, I would remove the CPU lock. Too many changes at once make it hard to tell what is causing the problems. This will happen if he removes it. Like I said, it's a necessity for FX CPUs. With each crime and every kindness we birth our future. |
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
I found the problem. It WAS -cpu_lock. With it in the c/l, and watching in TM: 4 instances of SoG started, all with 2-4 MB, then 2 of them went to 50 MB and then 90+ MB; the other 2 stayed in the 4 MB range. After removing -cpu_lock, all 4 instances of SoG started at 2 MB, then went to 48 MB and finally to 90+ MB. In retrospect, I should have suspected it earlier, as I don't know why I put it there to begin with.

Now, as to -use_sleep: if I use that, CPU use will go way down, but elapsed times will go up maybe 20% IIRC(?). But can I then use the CPUs for computation by knocking the amount of CPU in app_config.xml down to, say, 0.2 and using the now-available cores to do WUs? I'm really thinking of my 16c/32t machine "Big32" here, which is now running with HT off, so 16 cores. It is set to use 15 cores, but the 2 980s on it are running 6 WUs, so reserving 6 cores, and running 9 threads of CPU work. In the -use_sleep scenario, I would have another 4 threads for CPU work, yes? |
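If it helps, a sketch of what that might look like in app_config.xml for Big32 (3 WUs per 980, 2 cards = 6 GPU tasks), on the assumption that BOINC simply sums the cpu_usage values when deciding how many CPU tasks to start: 6 x 0.2 = 1.2 cores budgeted for GPU support, so of the 15 cores BOINC is allowed to use, about 13 would be left for CPU WUs instead of the current 9, i.e. roughly the extra 4 threads you are expecting. The -use_sleep switch itself still goes on the app's command line, not in app_config.xml.

    <app_config>
        <app>
            <name>setiathome_v8</name>
            <gpu_versions>
                <!-- one third of a GPU per task = 3 tasks per card -->
                <gpu_usage>0.33</gpu_usage>
                <!-- with -use_sleep the app mostly sleeps instead of spin-waiting,
                     so only a fraction of a core is budgeted per GPU task -->
                <cpu_usage>0.2</cpu_usage>
            </gpu_versions>
        </app>
    </app_config>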
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
You can tell it you only want it to use 0.2 CPU...but it is going to do whatever it wants despite what anyone here tells you. -use_sleep will knock it down to some level, but it will never be the 0.2 you want. You must think of the values you list as GUIDELINES, not absolutes. So figure out how much the work units actually use, subtract that from the total, subtract an extra 1 for GPU feeding, and whatever is left is what you have to play with. Keep an eye on the machine to make sure you don't lock up your system. Good luck Zalster |
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
You can tell it you only want it to use 0.2 CPU...but it is going to do whatever it wants despite what anyone here tells you. 0.2 was just a number I pulled out of my *ss; I just wanted the 6 GPU WUs to pull 2 CPUs out of the available number. In the past, when I tried -use_sleep, I found (on one of my other machines, an Intel E5-2670) that CPU use per GPU WU went from 90%+ to well under 20%, even as elapsed time increased roughly 20-30%. EDIT: IIRC
Yeah - I will need it. |