OpenCL NV MultiBeam v8 SoG edition for Windows
Author | Message |
---|---|
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Tut, did you run an AP alongside the OpenCL SoG to see how they reacted? Just curious. Zalster |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Running 4 CPU work units (VLARs) combined with 4 GPU shows 99-100% CPU usage. When the kernel activity reaches these high levels I see a drop-off in GPU usage, momentary drops from 98% down to 92%. I've noticed that the times to complete for the GPU have increased by about 3-5 minutes. As kernel activity decreases, the GPU usage goes back up to around 97-98%. I'll give this a few more hours but will eventually drop CPU MB to 2 and see if that provides better throughput with fewer momentary drops in the GPUs and how the times adjust. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Running 4 CPU work units (VLARS) combined with 4 GPU shows 99-100% CPU usage." And what about elapsed times for CPU apps? Did they drop after you increased their priority? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Any idea when there will be a Linux/Nvidia SoG version on beta to test?" No idea for now. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I have not increased their priority yet. I wanted to get a baseline with the new OpenCL SoG before putting in the command line to increase the priority. When the current 4 finish, I will put it in place. They have about 30 minutes left. |
Mike Send message Joined: 17 Feb 01 Posts: 34365 Credit: 79,922,639 RAC: 80 |
Increase -oclfft_tune_bn and -oclfft_tune_cw to 64 on your Titan and change -no_cpu_lock to -cpu_lock. Should be faster on this host. With each crime and every kindness we birth our future. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
ok, will do that now |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Mike, CPU per work unit went down to 0.02, and time to complete has gone up on some. I'm seeing a weird effect where they slowly progress to anywhere from 6 minutes to 20 minutes, then the progress goes back to 0 and they start again. Not all of them do this but at least a third do. I've also noted that 2 of the 4 GPUs dropped out of P2 state and went to P8, and utilization went to 0, then came back up a few minutes later. Ignore the 4 aborted work units in my results. Edit: I'll leave it for now, will see how they do over a longer period of time. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
So I removed the -cpu_lock as 2/5 of all work units were restarting back at 0. It's hard to explain: they started with lower CPU usage and went down to a low single-digit percentage. They would continue to crunch to different points. 3/5 would continue onward to completion, but the other 2/5 would restart the percentage at 0 and the CPU usage would then move up to 30-40% of a core. I would say it almost looks like the CPU lock "slipped". The 2/5 that did this would eventually reach conclusion, but their times to complete were higher than they should have been. I've tried running 4 CPU work units along with the GPUs. There are periods where the demand from the CPU forces the GPU to pause; you can see utilization drop to as low as 50% for a few moments, then it resumes. I had to stop CPU work right now as the screen was flickering and it was starting to stall, spinning icon. I have 2 particular work units that are taking a lot longer to complete than all the rest. Currently at 40 minutes for 58% completion. Once it's done I'll look at its angle to see what it was. I think these 2 work units were stressing the system. Edit... http://setiathome.berkeley.edu/workunit.php?wuid=2064540563 WU true angle range is : 0.082070 |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I believe you. I remember seeing it on Beta. What was disconcerting was that normal angles were taking just as abnormally long. 0.4 should take about 20 minutes to crunch; they were taking 42 minutes. Since the removal of the CPU lock, times have returned to normal. I still see the 0.01% process once in a while but it's only on the high angle work units. The only ones now taking an abnormal amount of time to crunch are the low angle ones. Yes, there is abnormally high CPU usage now, but I counter that by not doing any CPU work. The GPUs more than make up for any I would have done to begin with. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
@Zalster 1. And most important: since you are starting to use CPUlock, don't deceive the app (and it will not deceive you in return, hehe :)). You tell it that there are only 2 instances per GPU but actually run 4, so some of them just wait for a free slot. Be consistent: if you run N instances, tell the app there are N instances, no more, no less. Also take into account that with CPUlock only 64 instances per host can work in the usual regime. What happens with more than 64 instances is unknown (64 is the size of the mutex array that can be waited on in Windows). |
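The consistency rule in the post above can be checked mechanically. The following is a minimal sketch, assuming the instance count is passed to the app with `-instances_per_device` (the flag that appears in command lines posted in this thread) and that you already know how many tasks BOINC really runs per GPU and how many GPUs the host has. The function name and the warning texts are hypothetical and not part of the app.

```python
import shlex

# Per-host limit quoted in the post above (size of the waitable mutex array).
MAX_LOCKED_INSTANCES = 64


def check_cpu_lock_config(cmdline, tasks_per_gpu, gpu_count):
    """Return a list of warnings for a -cpu_lock setup (hypothetical helper)."""
    tokens = shlex.split(cmdline)
    warnings = []
    if "-cpu_lock" not in tokens:
        return warnings  # nothing to verify when CPU lock is not requested

    declared = None
    if "-instances_per_device" in tokens:
        pos = tokens.index("-instances_per_device")
        if pos + 1 < len(tokens):
            declared = int(tokens[pos + 1])

    if declared is None:
        warnings.append("instance count is not declared on the command line")
    elif declared != tasks_per_gpu:
        warnings.append(
            f"app is told {declared} per GPU but {tasks_per_gpu} are running"
        )

    total = tasks_per_gpu * gpu_count
    if total > MAX_LOCKED_INSTANCES:
        warnings.append(f"{total} instances exceed the limit of 64 per host")
    return warnings


# The misconfiguration described above: 2 declared, 4 actually running.
print(check_cpu_lock_config("-cpu_lock -instances_per_device 2", 4, 4))
```

With the declared and real counts both set to 4, the same call returns an empty list.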
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"I'm sure Raistmer can confirm this, but explain the reasons for it, much better, and why that behaviour is more visible in the SoG app, than in other MB apps." It was explained elsewhere (on beta?) already, but let's reiterate a little: 1) MultiBeam has an inhomogeneous work distribution inside the task. The consequence is that the progress scale is non-linear. It's non-linear for ALL MB tasks: CPU, GPU, ATI, CUDA and so on. But the observed degree of that non-linearity will differ. It depends on how much the algorithms for the different types of work an MB task contains differ in performance. If the different types of work (different searches) are processed with almost the same performance, one will see almost linear progress, and vice versa. 2) The SoG build is a more unusual case because it is the first to implement real parallel execution of different searches completely on the GPU device. Progress marks are embedded in the CPU code, and at high AR that CPU code only does non-blocking kernel enqueuing. This is done very fast, and then the GPU processes data without interaction with the CPU code. So no more progress marks go to BOINC: highly non-linear behavior on VHARs. 3) BOINC implements its own estimate of progress in case the app doesn't report progress in a timely fashion. So progress will "tick" without ANY connection to real progress. No wonder it will sometimes jump back or forth if the app's reported progress differs from what BOINC thought it should be. As for Zalster's last reports, that was just an example of a misconfigured host (as I wrote already): the app was supplied with wrong data. When BOINC launches 4 tasks, only 2 execution slots are reserved inside the CPUlock inner mechanism, so 2 instances wait for a free slot in a suspended state while BOINC continues its "ticks" of progress. But then, when an instance acquires the lock on a free slot and starts real (!) execution, it reports real progress to BOINC... oops, we have zero progress instead of all those false ticks BOINC already did.
Progress drops to zero and then starts counting again. |
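The reset to zero described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the fixed `step` per tick and the rule "show the app's value whenever it reports one" are made up for illustration and are not BOINC's actual estimator. `None` stands for a tick in which the app reported nothing, for example while an instance waits for a free CPUlock slot.

```python
def displayed_progress(reports, step=0.05):
    """Toy model of the progress shown to the user.

    reports: one entry per tick, either the fraction the app reported
             (0.0 to 1.0) or None when the app stayed silent.
    step:    hypothetical amount the client adds on a silent tick.
    """
    shown = []
    estimate = 0.0
    for reported in reports:
        if reported is None:
            # No report from the app: the client keeps "ticking" on its own.
            estimate = min(estimate + step, 1.0)
        else:
            # A real report replaces whatever the client had guessed.
            estimate = reported
        shown.append(round(estimate, 2))
    return shown


# Four silent ticks while the instance waits, then real execution begins.
timeline = displayed_progress([None, None, None, None, 0.0, 0.1, 0.2])
print(timeline)
```

In this model the displayed value climbs during the silent ticks and falls back to 0.0 on the first real report, which is the pattern reported earlier in the thread.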
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Ok, finally got some APs. So now can run them in parallel to see how they interact with the SoGs. Will be back later with results. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Ok, finally got some APs." And please don't forget to take the instance number corrections into account. It's important for the OpenCL AP app as well as for the MB OpenCL one. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
The maximum of the AP and MB instance counts should be used for both apps if a mix is anticipated. Alternatively, each app could get its own number of instances in its own command line (3 for AP and 4 for MB in this particular case); that looks neater but is effectively the same as providing the maximum to both. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
The number of instances of MB and APs on each card is the same on my machines (though for lower series GPUs that is not always the case, e.g. a 750 Ti can do 2 MB or 1 AP depending on the rest of the system). They worked together perfectly. I didn't see any slowing down or prolongation of the MB, which is the case when CUDA MB are mixed with OpenCL APs. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Looking at the time to complete, I've noticed that the time for an OpenCL AP to complete is comparable to running nothing but APs on that GPU. In other words, if I ran an AP on a card with MB at the same time, the AP usually completed in 14-18 minutes, while the MBs had their time to complete elongated to as much as an hour or more. If I ran all APs on the GPU with no MB, the time to complete should be around 50-54 minutes. Running an AP with SoG MB, the time to complete is now 50 minutes. There is no elongation of the time to complete that I can see with the SoG MBs. So that is good news. I have not seen any other issues with the SoG MB when run alongside the APs. Will continue to check. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Ok, I have a question. I put the -cpu_lock back in to see how it does. One of the things I'm seeing is that when new work units are started, they start with really low CPU usage. That is fine. It looks like they all decide to work on a few cores rather than spread out over many. My question is this: while these work units are crunching, several of the GPUs will stop crunching. They go from a P2 state to a P8 state and the speed drops down. This may last several minutes. Eventually the GPU will start up again as newer work units start. I've seen this happen on all 4 GPUs at different times. Can someone explain how this is happening? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
List the complete config (cmd line and app_info/config) you currently use and provide links to the results of tasks that were running at the time the GPU downclocked. |
Joe Januzzi Send message Joined: 13 Apr 03 Posts: 54 Credit: 307,134,110 RAC: 492 |
This is my SoG setup. I will show my complete list of current config files (cmd line and app_info/config). When I make any changes, I will post the changes to that file only. I will show my SoG app_info.xml file first, because I started with just SoG. Once I got things to work, I took my Cuda/AP/CPU app_info.xml and swapped all my cuda50 lines for the ones from the SoG file to make my current app_info.xml work for SoG/APs and CPU. If you see something that could be improved, or a better way of tuning/trying, please comment :) Note: We don't make mistakes, we make changes. When I used -sbs 384 I got this: WARNING: can't open binary kernel file for oclFFT plan: C:\SETI\Data/projects/setiathome.berkeley.edu\MB_clFFTplan_GeForceGTX780_1024_gr256_lr16_wg256_tw0_ls512_bn16_cw16_r3366.bin_35330, continue with recompile... Workunit 2069914747, http://setiathome.berkeley.edu/result.php?resultid=4744742557 I have the Stderr output file saved. If you need it, just ask. All my slots had this warning in the stderr.txt files. Changing -sbs to either 256 or 192 stopped the warnings. I'm using -sbs 192 for now. My goal is to use as little CPU as possible without starving the GPUs. Sometimes I'm at 100% on all cores, but mostly in the range of 39-90%. 3 at a time seems the best so far. When I tried 4, the CPU usage was between 1-5% with very slow times (like I broke it), so I stopped the test. I also started working on my MultiBeam_NV_config.xml file to give me some different tuning help for my GTX 960 and 780. I will post the file before using it. Joe app_info.xml (SoG only): I recommend this to start, because I believe in KISS.
<app_info> <app> <name>setiathome_v8</name> </app> <file_info> <name>MB8_win_x86_SSE3_OpenCL_NV_r3366_SoG.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x86.dll</name> <executable/> </file_info> <file_info> <name>mb_cmdline_win_x86_SSE3_OpenCL_NV.txt</name> </file_info> <app_version> <app_name>setiathome_v8</app_name> <version_num>800</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_SoG</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>MB8_win_x86_SSE3_OpenCL_NV_r3366_SoG.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>mb_cmdline_win_x86_SSE3_OpenCL_NV.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> </app_info> app_info.xml (SoG/AP/CPU) “current” <app_info> <app> <name>setiathome_v7</name> </app> <file_info> <name>AKv8c_r2549_winx86-64_AVXxjfs.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x64.dll</name> <executable/> </file_info> <file_info> <name>cmdline_AKv8c_r2549_winx86-64_AVXxjfs.txt</name> </file_info> <app_version> <app_name>setiathome_v7</app_name> <version_num>700</version_num> <platform>windows_intelx86</platform> <file_ref> <file_name>AKv8c_r2549_winx86-64_AVXxjfs.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>cmdline_AKv8c_r2549_winx86-64_AVXxjfs.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>setiathome_v7</app_name> <version_num>700</version_num> <platform>windows_x86_64</platform> <file_ref> <file_name>AKv8c_r2549_winx86-64_AVXxjfs.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> 
<file_name>cmdline_AKv8c_r2549_winx86-64_AVXxjfs.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> <app> <name>astropulse_v7</name> </app> <file_info> <name>AP7_win_x64_AVX_CPU_r2692.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x64.dll</name> <executable/> </file_info> <file_info> <name>ap_cmdline_win_x64_AVX_CPU.txt</name> </file_info> <app_version> <app_name>astropulse_v7</app_name> <version_num>703</version_num> <platform>windows_x86_64</platform> <plan_class>sse2</plan_class> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>703</version_num> <platform>windows_intelx86</platform> <plan_class>sse2</plan_class> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>703</version_num> <platform>windows_x86_64</platform> <plan_class>sse</plan_class> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>703</version_num> <platform>windows_intelx86</platform> <plan_class>sse</plan_class> <cmdline></cmdline> <file_ref> 
<file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>701</version_num> <platform>windows_x86_64</platform> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>701</version_num> <platform>windows_intelx86</platform> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>700</version_num> <platform>windows_x86_64</platform> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>700</version_num> <platform>windows_intelx86</platform> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> 
<app> <name>astropulse_v7</name> </app> <file_info> <name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x86.dll</name> <executable/> </file_info> <file_info> <name>AstroPulse_Kernels_r2887.cl</name> </file_info> <file_info> <name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</name> </file_info> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> 
<file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> 
<file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> 
<file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> 
</coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> 
<count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app> <name>setiathome_v8</name> </app> <file_info> <name>MB8_win_x64_AVX_VS2010_r3330.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x64.dll</name> <executable/> </file_info> <file_info> <name>mb_cmdline_win_x64_AVX_VS2010.txt</name> </file_info> <app_version> <app_name>setiathome_v8</app_name> <version_num>800</version_num> <platform>windows_x86_64</platform> <api_version>7.5.0</api_version> <file_ref> <file_name>MB8_win_x64_AVX_VS2010_r3330.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>mb_cmdline_win_x64_AVX_VS2010.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>setiathome_v8</app_name> <version_num>800</version_num> <platform>windows_intelx86</platform> <api_version>7.5.0</api_version> <file_ref> 
<file_name>MB8_win_x64_AVX_VS2010_r3330.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>mb_cmdline_win_x64_AVX_VS2010.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> <app> <name>setiathome_v8</name> </app> <file_info> <name>MB8_win_x86_SSE3_OpenCL_NV_r3366_SoG.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x86.dll</name> <executable/> </file_info> <file_info> <name>mb_cmdline_win_x86_SSE3_OpenCL_NV.txt</name> </file_info> <app_version> <app_name>setiathome_v8</app_name> <version_num>800</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_SoG</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>MB8_win_x86_SSE3_OpenCL_NV_r3366_SoG.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>mb_cmdline_win_x86_SSE3_OpenCL_NV.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> </app_info> mb_cmdline_win_x86_SSE3_OpenCL_NV.txt “current” -no_cpu_lock -sbs 192 -instances_per_device 3 -period_iterations_num 40 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 16 -oclfft_tune_cw 16 app_config.xml “current” (For now I will stay with 1 AP; hopefully I can make it the same number as for SoG, so they're both equal with no slowdown.) 
<app_config> <app> <name>astropulse_v7</name> <max_concurrent>1</max_concurrent> <gpu_versions> <gpu_usage>0.33</gpu_usage> <cpu_usage>0.33</cpu_usage> </gpu_versions> </app> <app> <name>setiathome_v8</name> <max_concurrent>12</max_concurrent> <gpu_versions> <gpu_usage>0.33</gpu_usage> <cpu_usage>0.33</cpu_usage> </gpu_versions> </app> </app_config> Real Join Date: Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC Try to learn something new everyday. |
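To see whether the files in the post above agree with each other, the per-GPU task count implied by app_config.xml can be derived and compared with the `-instances_per_device 3` value in the posted command line. This is a rough sketch: it assumes the number of tasks sharing one GPU is the whole number of `gpu_usage` shares that fit into 1.0, and the function name is hypothetical. The XML is the app_config.xml from the post, only re-indented.

```python
import xml.etree.ElementTree as ET

# app_config.xml as posted above, re-indented for readability.
APP_CONFIG = """
<app_config>
  <app>
    <name>astropulse_v7</name>
    <max_concurrent>1</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>0.33</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>setiathome_v8</name>
    <max_concurrent>12</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>0.33</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def tasks_per_gpu(app_config_xml):
    """Map app name to the number of gpu_usage shares that fit on one GPU."""
    root = ET.fromstring(app_config_xml)
    result = {}
    for app in root.findall("app"):
        usage = app.findtext("gpu_versions/gpu_usage")
        if usage is not None:
            # Small epsilon guards against float noise, e.g. 1 / 0.25.
            result[app.findtext("name")] = int(1.0 / float(usage) + 1e-9)
    return result


per_gpu = tasks_per_gpu(APP_CONFIG)
print(per_gpu)
```

Both apps come out at 3 shares per GPU, which matches `-instances_per_device 3`. Note that `max_concurrent` is a separate cap: with a value of 1, only one astropulse_v7 task runs at a time regardless of the share size.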
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.