Lunatics Windows Installer v0.43 Release Notes
Author | Message |
---|---|
Mike Joined: 17 Feb 01 Posts: 34256 Credit: 79,922,639 RAC: 80 |
@ Raistmer Try running oclfft_plan without ffa_fetch ffa_fetch_block and see how this works, Juan. I guess the timings are too high on multi-GPU hosts. With each crime and every kindness we birth our future. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
@ Raistmer No signs that that option was acknowledged by app... Running on device number: 1 |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
I just set the app to run without the switch and watched the memory usage in taskmgr; it stayed below 100 MB. But I was wrong: I have now seen 1 WU that uses >1 GB without the switch enabled, so I will try Mike's suggestion. One more thing I noticed: without the switch the core usage drops almost to zero. @Mike Trying now, will report back ASAP |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
@ Mike I did what you suggested and left the hosts working. I have not seen any >1 GB memory usage so far, but I do see some in the range of 200 MB to 400 MB (high; normal memory usage is <100 MB). I will leave them running for a while to get more data. I left one of the hosts with only -use_sleep -unroll 16, taking out all the other switches. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
I left the hosts running all night; the problem apparently only appears when both switches are enabled. So I removed -ffa_block 16384 -ffa_block_fetch 8192 and set the command line on the 780 hosts to: -use_sleep -unroll 16 -oclfft_plan 256 16 256 -tune 1 64 4 1 -tune 2 64 4 1 and the same with -unroll 12 on the 670/690 hosts. I have not seen any WU with 1 GB memory usage since; I will keep an eye on that. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Well, all this supports the guess that the increased memory consumption is caused by the storage space for found signals inside FFA. There was the same issue before, when the size of the signal array was unrestricted. Now the issue should be smaller because each thread can keep only 30 signals. But with big -ffa_block numbers, which define the number of threads (and hence the number of separate signal storages), it seems this issue shows itself again. If so, one could use a smaller number in the -ffa_block param. Also, this issue should correlate with the number of found repetitive signals. Please check that all tasks with increased memory consumption have 30 rep pulses in the log. Unfortunately this issue arose from the way the validator handles overflows for CPU and GPU results. Currently a CPU result (even an overflow) must match the GPU result. But the GPU processes data in a different order. That doesn't matter when ALL data is processed, but when we have an early exit, the data processing order does matter (and this was an issue in APv6 builds: an increased number of inconclusives between CPU and GPU overflows). Hence I need to keep all found pulses while the GPU finds them and later reorder them to match the same serial sequence as CPU processing. With many pulses per thread and many threads, that storage space grows. I'll check whether there is some additional memory leak, but I'm afraid this isn't fixable for now w/o changes in how the validator handles overflows. |
Urs Echternacht Joined: 15 May 99 Posts: 692 Credit: 135,197,781 RAC: 211 |
IMHO there is a small bug in the installer - APv7_r2559_SSE3_OSX64.zip Thanks for noting. Will get corrected ... _\|/_ U r s |
Mike Joined: 17 Feb 01 Posts: 34256 Credit: 79,922,639 RAC: 80 |
Well, all this supports guess that increased memory consumption caused by found signals storage space inside FFA. I did run both units Juan posted. ap_26jn14aa_B6_P0_00184_20141017_17161.wu is a 30/30 and memory consumption didn't go over 104K for one second. Of course memory consumption on the GPU went up to 406K because of the high ffa_fetch values. But that's no big deal with a 3GB GPU. With each crime and every kindness we birth our future. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
I did run both units Juan posted. I have no idea how the program itself works, but what is increasing is the computer's main memory (up to about 1.7 GB for the last WU I saw), which normally appears in the range of 50-100 MB in the taskmgr list. |
Mike Joined: 17 Feb 01 Posts: 34256 Credit: 79,922,639 RAC: 80 |
I did run both units Juan posted. In task manager you only see system memory not GPU memory consumption. With each crime and every kindness we birth our future. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
In task manager you only see system memory not GPU memory consumption. That's exactly why this issue is weird: why does the main memory usage rise to that kind of level on a particular WU while all the others remain low? When the system memory usage rises to that large a number, the system runs out of memory very fast. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
BTW, I completed the oclFFT_plan tests on the nVidia 8800 GT. All the tests using the oclFFT_plan settings produced No Found Pulses even though the App Found Pulses without using the oclFFT_plan setting. I have all the Completed files. Anyone want the Files? Or the results posted? |
Mike Joined: 17 Feb 01 Posts: 34256 Credit: 79,922,639 RAC: 80 |
BTW, I completed the oclFFT_plan tests on the nVidia 8800 GT. All the tests using the oclFFT_plan settings produced No Found Pulses even though the App Found Pulses without using the oclFFT_plan setting. I have all the Completed files. What I have seen already told me that oclfft_plan doesn't work with the 8800GT. With each crime and every kindness we birth our future. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
BTW, I completed the oclFFT_plan tests on the nVidia 8800 GT. All the tests using the oclFFT_plan settings produced No Found Pulses even though the App Found Pulses without using the oclFFT_plan setting. I have all the Completed files. Yep. You just have to wonder what other cards/drivers don't work... |
Mike Joined: 17 Feb 01 Posts: 34256 Credit: 79,922,639 RAC: 80 |
BTW, I completed the oclFFT_plan tests on the nVidia 8800 GT. All the tests using the oclFFT_plan settings produced No Found Pulses even though the App Found Pulses without using the oclFFT_plan setting. I have all the Completed files. I have tested 11 different NV GPUs. Everything Fermi and up should work. Not sure about 2xx cards. With each crime and every kindness we birth our future. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
In task manager you only see system memory not GPU memory consumption. Yep, exactly: a system (OS/CPU, not GPU) memory increase is what I would expect with the issue I described earlier. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
BTW, I completed the oclFFT_plan tests on the nVidia 8800 GT. All the tests using the oclFFT_plan settings produced No Found Pulses even though the App Found Pulses without using the oclFFT_plan setting. I have all the Completed files. The -oclFFT_plan settings are placed in the advanced options category not w/o reason. Only a few possible combinations work correctly, and we are still finding out why and which. So either do thorough offline tests before using it live, or stay with Mike's recommended combo, which is known to be faster and safe enough. If you would collect your results in a table clearly showing which GPU and which combo worked correctly, which produced invalid results, and which failed, it would be appreciated. So far I have collected such a table for the Loveland APU (C-60) and will post it on Lunatics. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
In task manager you only see system memory not GPU memory consumption. BTW, what tool can be used to log an app's system memory consumption during its run (not the final value but its kinetics)? I don't have time right now to sit in front of the screen and watch where and how it rises, so some logging tools would be appreciated. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Maybe this is what you look for: http://setiathome.berkeley.edu/forum_thread.php?id=75928&postid=1590689 |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Maybe this is what you look for: process explorer Great tool indeed, but I did not find logging there (I meant logging to a file, to look at after the process has already finished). EDIT: perhaps this: http://stackoverflow.com/questions/69332/tracking-cpu-and-memory-usage-per-process |
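
For the logging question above (memory over the whole run, not just the final value), here is a minimal sketch of one possible approach. It is not a tool anyone in the thread used. It assumes a Windows host, polls the stock `tasklist` command for one PID, and appends timestamped samples to a CSV file. The function names, the output file name and the PID in the usage line are made up for illustration. The value recorded is tasklist's "Mem Usage" column, so it covers system memory only, not GPU memory.

```python
import csv
import subprocess
import time
from datetime import datetime


def parse_tasklist_mem_kb(csv_line):
    """Return the 'Mem Usage' field of one `tasklist /FO CSV /NH` row, in KB."""
    row = next(csv.reader([csv_line]))
    # The last column looks like "12,345 K"; keep only the digits so that
    # locale-specific thousands separators do not matter.
    digits = "".join(ch for ch in row[-1] if ch.isdigit())
    return int(digits)


def read_mem_kb(pid):
    """Ask tasklist for the current memory usage of `pid`; None if it is gone."""
    out = subprocess.run(
        ["tasklist", "/FI", f"PID eq {pid}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=False,
    ).stdout.strip()
    # When no process matches, tasklist prints an INFO message, not a CSV row.
    if not out.startswith('"'):
        return None
    return parse_tasklist_mem_kb(out.splitlines()[0])


def log_memory(pid, path, interval_s=5.0, reader=read_mem_kb, max_samples=None):
    """Append 'timestamp,mem_kb' rows to `path` until the process exits.

    Returns the number of samples written.
    """
    count = 0
    with open(path, "a", newline="") as fh:
        writer = csv.writer(fh)
        while max_samples is None or count < max_samples:
            mem_kb = reader(pid)
            if mem_kb is None:
                break
            writer.writerow([datetime.now().isoformat(timespec="seconds"), mem_kb])
            fh.flush()  # keep the file readable while the app is still running
            count += 1
            time.sleep(interval_s)
    return count
```

Usage would be something like `log_memory(4321, "ap_mem.csv", interval_s=2)`, where 4321 stands for the PID of the running app as shown in taskmgr. The resulting file can be opened in a spreadsheet afterwards to see where the memory rises.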
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.