Message boards :
Number crunching :
Lunatics Windows Installer v0.43 Release Notes
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Just curiosity, Mike: does your son's computer use an AMD or an Intel CPU? In some way it makes sense. AMD uses a different memory controller. With each crime and every kindness we birth our future. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
or getting very very drunk :-) I'm in. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Well, now that I know what I'm looking for, I saw what Juan is talking about. I just had one with 1 GB of memory usage that gave 30/30 with 0 blanking and exited at 14 minutes. Task 3798736005 Name ap_23jn11aa_B2_P1_00279_20141023_00693.wu_0 I use the following: -use_sleep -unroll 18 -oclFFT_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 4 1 -tune 2 64 4 1 -hp but it hasn't really affected my cruncher. I think there are several reasons for that. The first thing pointed out by a coworker is that I'm running a GTX 780 with 3 GB of memory. The second is that I run 16 GB of physical memory in my cruncher. I can't be sure about my second cruncher. It crashed 3 hours ago and I just noticed it. Checking the history, I don't see any memory-hogging work units. This one is different, as it has 2 GTX 780s with 3 GB and 2 GTX 750s with 2 GB. That machine has 32 GB of physical memory. Guess I'll have to keep an eye on it, watch for a memory hog, and see which GPU it goes to. This one has a different command line than the first, not as aggressive. Zalster |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Well, now that I know what I'm looking for I saw what Juan is talking about. I just had one with 1 GB of memory usage that gave 30/30 with 0 blanking and exited at 14 minutes. Since you use an AMD CPU on your 780 host, that indicates the problem is not CPU related, as I suspected in my last posts, so you actually answered that question. Try lowering to -ffa_block 8192 -ffa_block_fetch 4096 and check if the max memory usage changes to about 1/2 GB. If that happens, you'll see what I'm really talking about. A few WUs using 1 GB or more end in an out-of-memory error very fast on an 8 GB host like the ones I use. Lower values give us less memory hogging, but there is a performance penalty. Back to the beer-drinking task. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I just saw 4 more errored work units with too many exits. I checked the memory; none of them were close to 1 GB, so I'm not sure why. I went ahead and modified the command line like you suggested. I'll keep an eye on it. Zalster |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
I just saw a WU with 1 GB of memory usage even with -ffa_block 8192 -ffa_block_fetch 4096. So now I'm totally lost; the lower (default) block size slows the host but fixes the problem. I'll stay with this configuration while waiting for a clue tomorrow. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Raistmer posted a few interesting tests on the Lunatics site about the memory-hogging WU problem. http://lunatics.kwsn.net/12-gpu-crunching/opencl-ap-v7-memory-consumption.msg57231.html;topicseen#msg57231 We can be sure he is working hard to find out why it's happening and a possible fix. I thank him for that; let's give him time to do his usual code magic. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Thanks for the trust :D Some improvement has already been reached, indeed: http://lunatics.kwsn.net/12-gpu-crunching/opencl-ap-v7-memory-consumption.msg57241.html#msg57241 Ideas for a more radical solution may require non-trivial code changes, so perhaps by next weekend; we'll see... |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Thanks for the trust :D Good news, I know we are in good hands. Take your time; if we can help with anything, just ask. |
Michel Makhlouta Send message Joined: 21 Dec 03 Posts: 169 Credit: 41,799,743 RAC: 0 |
-use_sleep -unroll 16 -oclfft_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 4 1 -tune 2 64 4 1 I just moved from stock to Lunatics, which I've been avoiding for a while, but I guess my obsession with optimizing everything got the better of me. I've got an i7 4770K and 2x 780. For the GPU, I went for cuda50, which is what stock was running. As for the CPU, I went for AVX on both AP and MB, although stock was running SSEx from time to time. Was that the right choice? About the quoted text: I've seen this on the forums a couple of times now. From what I understand, it has some advantages when running AP? Can someone clarify the need for the command line and what's best for my setup? Also, where do I add the above line? EDIT: adding a question. I'm running 3 WUs per GPU, allocating 1 core to AP and 0.06 for MB. Any thoughts on my current values? |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
-use_sleep -unroll 16 -oclfft_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 4 1 -tune 2 64 4 1 Check for ap_cmdline_win_x86_SSE2_OpenCL_NV.txt. In case of the memory consumption on overflow tasks, use this: -use_sleep -unroll 16 -oclFFT_plan 256 16 256 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -tune 2 64 4 1 With each crime and every kindness we birth our future. |
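For anyone wondering where such a line goes: it sits on a single line inside a plain-text file alongside the AstroPulse app files in the BOINC data directory (the exact folder varies by installation; the file name here is the one Mike gives). A sketch using his reduced-memory values:

```
ap_cmdline_win_x86_SSE2_OpenCL_NV.txt:
-use_sleep -unroll 16 -oclFFT_plan 256 16 256 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -tune 2 64 4 1
```

The file itself contains only the switches; the first line above is just a label showing the file name.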
Michel Makhlouta Send message Joined: 21 Dec 03 Posts: 169 Credit: 41,799,743 RAC: 0 |
Thanks Mike. I've used the one in the readme file for now: For NV x80/x70: -use_sleep -unroll 18 -oclFFT_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 8 1 -tune 2 64 8 1 I've also freed 2 cores; utilization is still 100%, so I guess running all cores was creating a bottleneck somewhere? I had a crash 2 minutes ago: "Display driver nvlddmkm stopped responding and has successfully recovered." I will wait and see if this occurs again. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
-oclFFT_plan is case sensitive. Well, that option is from the advanced area, not because it's hard to type, of course :), but because not all combos will work and there is no fool-proofing when exercising it. But indeed, there is an inconsistency in the option naming. FFT is a shortcut, and FFA is a similar shortcut, hence it should be -FFA_block and -FFA_block_fetch. In the next builds the app will understand both the "correct" option naming (case-sensitive, with upper case where applicable) and the lowercase "unix-style" one (-oclfft_plan and -oclFFT_plan will both work, along with -FFA_block). |
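The dual naming scheme Raistmer describes can be illustrated with a small sketch (hypothetical Python, not the app's actual parser): each incoming option is folded to lower case and mapped back to one canonical spelling, so both variants behave identically.

```python
# Hypothetical sketch of case-insensitive option normalization.
# The option names come from the thread; the lookup logic is illustrative only.

CANONICAL = {
    "oclfft_plan": "-oclFFT_plan",      # FFT shortcut keeps its upper case
    "ffa_block": "-FFA_block",          # FFA shortcut treated the same way
    "ffa_block_fetch": "-FFA_block_fetch",
}

def normalize(option: str) -> str:
    """Return the canonical spelling for a known option, else pass it through."""
    key = option.lstrip("-").lower()
    return CANONICAL.get(key, option)

print(normalize("-oclfft_plan"))   # -> -oclFFT_plan
print(normalize("-oclFFT_plan"))   # -> -oclFFT_plan
print(normalize("-ffa_block"))     # -> -FFA_block
```

With this approach a user can type either spelling and the app sees a single internal name, which matches the behaviour Raistmer promises for the next builds.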
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Thanks Mike. I've used the one in the readme file for now: Yes, it will slow processing down without freeing cores. Please reduce the -ffa_block values as I posted, to prevent the memory leak on overflown tasks. I'm not sure how much system RAM you have installed; just to be on the safe side. With each crime and every kindness we birth our future. |
Michel Makhlouta Send message Joined: 21 Dec 03 Posts: 169 Credit: 41,799,743 RAC: 0 |
Alright, I've changed all the values to what you provided earlier. I've got 16 GB RAM. I've had another display driver crash, 15 minutes after the first one. I used to have this once or twice a week after I added a second card in SLI, but it has become more frequent since installing Lunatics. Any ideas? |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
The oclFFT_plan will more than compensate it. Correcting the -oclfft_plan to -oclFFT_plan seems to have had somewhat mixed results. I don't feel I have enough data points on my xw9400 yet using the above cmdline to draw firm conclusions, though the fact that it has completely locked up 5 times in 4 days is rather discouraging. ;^( However, on my T7400 (using a similar cmdline but with -unroll 12) I've found that while the corrected -oclFFT_plan has helped the GTX 660 and GTX 670, it appears to have had no material effect on the GTX 780. Here are my observations, using only 0% blanked tasks which did not reach 30 pulses of either type. Scenario #1 is the baseline, using no ap_cmdline. Scenario #2 uses ap_cmdline with the incorrect -oclfft_plan (which seems to be ignored). Scenario #3 uses ap_cmdline with the corrected -oclFFT_plan. Average Run Times (5 tasks, 0% blanked, less than 30 pulses of each type): GTX 660: #1) 54 min 29.4 sec; #2) 1 hour 16 min 5.5 sec (+39.6%); #3) 1 hour 0 min 58 sec (+11.9%) GTX 670: #1) 40 min 24.8 sec; #2) 48 min 2.6 sec (+18.9%); #3) 41 min 48.8 sec (+3.5%) GTX 780: #1) 25 min 52.2 sec; #2) 35 min 10.6 sec (+36.0%); #3) 37 min 37.8 sec (+45.5%) For all 3 GPUs, of course, the CPU time has fallen dramatically thanks to the -use_sleep parameter. However, the run-time results are really a net loss for the machine. The GTX 780 should be the most productive GPU, and the significant run-time slowdown it's exhibiting considerably overshadows the overall CPU-time gains. For the time being, I've removed everything but -use_sleep from the ap_cmdline, to see if I can get something close to the baseline run times along with the reduced CPU times. (I've done the same with the xw9400 to see if the lockups go away. Perhaps there's something else going on there, but I've never had that happen even once before, and suddenly got 5 in 4 days.)
I would guess that it might be very difficult to come up with a set of ap_cmdline parameters that would be "best" for a machine with mixed GPUs like those two of mine (and, I suspect, like many others out there). Since the AP app is now attempting to internally assign default values for some of the parameters based on a GPU's CU capability, I was wondering if it might make more sense for the ap_cmdline to take a "multiplier" for those values, rather than absolutes. That way, the multiplier would apply to the default values for each GPU, rather than having a single absolute "middle ground" value apply to all. Just a thought. |
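Jeff's percentages check out. As a quick sanity check (a Python sketch; the times are copied from his post), converting each run time to seconds and recomputing the slowdown relative to the baseline reproduces his figures:

```python
# Recompute the percentage slowdowns quoted in the post above.
# Columns per GPU: baseline (#1), wrong -oclfft_plan (#2), corrected -oclFFT_plan (#3).

def secs(h=0, m=0, s=0.0):
    """Convert hours/minutes/seconds to total seconds."""
    return h * 3600 + m * 60 + s

runs = {
    "GTX 660": (secs(m=54, s=29.4), secs(h=1, m=16, s=5.5), secs(h=1, m=0, s=58)),
    "GTX 670": (secs(m=40, s=24.8), secs(m=48, s=2.6),      secs(m=41, s=48.8)),
    "GTX 780": (secs(m=25, s=52.2), secs(m=35, s=10.6),     secs(m=37, s=37.8)),
}

for gpu, (base, wrong_plan, fixed_plan) in runs.items():
    pct2 = (wrong_plan - base) / base * 100
    pct3 = (fixed_plan - base) / base * 100
    print(f"{gpu}: #2 {pct2:+.1f}%, #3 {pct3:+.1f}%")
```

The output matches the post (e.g. GTX 780: #2 +36.0%, #3 +45.5%), confirming the 780 is the odd one out: the corrected plan made it slower still, not faster.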
Stick Send message Joined: 26 Feb 00 Posts: 100 Credit: 5,283,449 RAC: 5 |
Task 3804666490 is repeatedly hanging up on my Toshiba laptop's GPU at the 11.865% mark. When this happens, my screen flickers momentarily and I get a message that says the display driver stopped responding and has recovered. But the task remains hung up until I either suspend/resume the task or restart BOINC. After a restart, the task reverts to the previous checkpoint, and it starts counting up again until the 11.865% mark is reached. I have had this issue before (with differing hang-up points), with previous Lunatics releases and with stock applications, as well as with various releases of BOINC and the Catalyst drivers. I believe the problem is related to certain WUs. That is, it happens very rarely, maybe with 1% of the WUs I get. The other 99% complete and validate without any problems. I am guessing that with certain WUs, the program bumps up against the hardware limits of my GPU and crashes. But that is only a guess. I am reporting it here in case anyone is interested in investigating the issue further. And I will hold the task suspended for a few days in case there are questions. As I said earlier, the problem occurs very rarely, so I understand if it is deemed not worth pursuing. |
Wedge009 Send message Joined: 3 Apr 99 Posts: 451 Credit: 431,396,357 RAC: 553 |
Stick, I have similar APUs to your machine and have experienced the same on Multi-Beam tasks with the GPU application. It doesn't happen all the time but often enough. In those cases I switched the task to the CPU (requires fiddling with the client_state.xml), but I've since stopped running MB on the GPU for those low-powered hosts - only AstroPulse runs on the GPU now. Basically I came to the same conclusion - the hardware probably can't handle the complexity of the MB GPU application as well as their more powerful siblings. Soli Deo Gloria |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Stick & Wedge, You might just need to tweak some of the Windows OS video watchdog settings. See more details here: http://setiathome.berkeley.edu/forum_thread.php?id=75324&postid=1553652#1553652 If bumping up the TdrDelay doesn't work, you might want to just disable it with the TdrLevel setting. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
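For reference, the watchdog values HAL9000 mentions live under the GraphicsDrivers registry key; the key and value names below are from Microsoft's TDR documentation, but edit the registry at your own risk, and reboot for changes to take effect. A sketch of a .reg file that raises the timeout from the 2-second default to 10 seconds:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
"TdrDelay"=dword:0000000a

; To disable detection entirely instead, as suggested above:
; "TdrLevel"=dword:00000000
```

Note that disabling TDR (TdrLevel=0) means a genuinely hung GPU will freeze the display until a reboot.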
Wedge009 Send message Joined: 3 Apr 99 Posts: 451 Credit: 431,396,357 RAC: 553 |
Hmm. Might be interesting to try next time I need to try the MB GPU application. Thanks. Soli Deo Gloria |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.