Open Beta test: SoG for NVidia, Lunatics v0.45

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1794731 - Posted: 9 Jun 2016, 15:15:30 UTC - in response to Message 1794655. . . . Hello Richard, . . . You haven't mentioned it but is there any chance that SSE4.1 support has been added in 0.45 Beta? No, it hasn't - no developer has supplied me with any updated CPU applications since the v0.44 launch to support SaH v8. I need to deploy it on my system with the GT730 which is Core2 Duo based. It might be helpful if SSE4.1 is available to help things along. Stephen You don't strictly 'need' it. SIMD hardware support is cumulative - extra capabilities are added to newer CPU designs, but the old ones are never removed. There are one or two gaps where Intel and AMD followed different pathways for a while, but during that phase of development, the incremental steps were relatively small. Sure, SSSE3 and SSE4.1 would be 'nice to have', but your Core2 Duo will get along pretty well with SSE3 until the developers can catch their breath and regroup. . . It is doing so as we speak. But I was just wondering. :) ID: 1794731 ·
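
The "cumulative" point in the reply above can be sketched in a few lines of Python. This is only an illustration: the strictly linear ordering of instruction sets is a simplification (the post itself notes one or two gaps where Intel and AMD diverged), and `SIMD_LEVELS` and `best_build` are hypothetical names, not part of any installer.

```python
# Oldest to newest. Assumption: each level implies support for all earlier ones.
SIMD_LEVELS = ["sse", "sse2", "sse3", "ssse3", "sse4.1"]


def best_build(cpu_level, available_builds):
    """Return the newest available build the CPU can run, or None."""
    # Everything up to and including the CPU's own level is usable.
    supported = SIMD_LEVELS[: SIMD_LEVELS.index(cpu_level) + 1]
    for level in reversed(supported):
        if level in available_builds:
            return level
    return None
```

Under these assumptions, a CPU that supports SSE4.1 offered only SSE2 and SSE3 builds would simply fall back to the SSE3 one, which matches the advice given in the post.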

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1794733 - Posted: 9 Jun 2016, 15:19:26 UTC - in response to Message 1794666. Hey Rasputin42, Are those bc5 tasks? I have noticed that those run more than 3 times slower, so it appears they are stalled, but they are still running. And GPU temps are really low when running them. . . SoG is running nicely on this machine but blc5 certainly take longer than blc6, they fooled me into thinking something was wrong at one point. I am wondering how blc7 will behave as I have a lot of them coming up. ID: 1794733 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1794735 - Posted: 9 Jun 2016, 15:21:34 UTC - in response to Message 1794690. With any "new" GPU installation it is worth running with only one task for a few days just to see what the thing will do in the base situation, then step it up to two for a few more days, finally if that is OK, push up to three. As Mike says I very much doubt that a GTX720M is up to running more than one task at a time. . . My GT730 is only barely capable of running doubles, and comes to grief when a Guppie swims by. ID: 1794735 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1794739 - Posted: 9 Jun 2016, 15:28:00 UTC - in response to Message 1794723. How many times must this be said - the critical thing with GPUs is not MEMORY, but the number of GPU "cores" and their management. There is probably enough memory to support half a dozen tasks, but trying to run more than a couple of tasks (particularly SoG) the GPU's internal task manager will be seriously struggling long before you reach that number. Another thing to consider is that the current data from the servers is dominated by guppi (from the GBT) for which CUDA is not best suited - my GTX960 rig would quite happily run three "normal" Arecibo MB tasks, but try running three guppi at once and it started to sweat, it is much happier running two of them - that is quite a hit! . . FWIW, I received the suggestion to run -sbs 512 on my GTX950 to persuade Guppies to play nice. But it wanted more than the 2048MB on the card and it was doing that, dropping out of WUs and leaving them in "waiting to run" state. . . The max it will support is -sbs 384 so it may be a lack of sufficient memory, worth checking it out. ID: 1794739 ·

Zalster Volunteer tester Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1794741 - Posted: 9 Jun 2016, 15:33:05 UTC - in response to Message 1794665. Wait for something that works better, I get the same thing, only time to completion becomes days, instead of minutes, I tried 3, then 6, then 33, same result, I can't figure this out, I just went back to cuda42, I give up. Zoom, I would suggest posting your entire commandline rather than just snips of it. It's hard to tell what the computer sees when you only post a portion of it. ID: 1794741 ·

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49	Message 1794747 - Posted: 9 Jun 2016, 16:09:03 UTC - in response to Message 1794730. My main problem, even though I have the sleep command running, is that I have a gpu wu 'running HP' and a gpu wu in 'waiting to run', it's probably nothing, but I thought I'd mention it. I have 3 cpu and 3 gpu wu's running, plus some SoG has been downloaded. I'm also getting this, note the days figure: SETI@home 8.00 setiathome_v8 (cuda42) 25se10ad.23501.12750.7.34.2_1 00:49:30 (00:00:10) 0.35 0.001 3372d,22:39:51 71.0 °C 0.04C + 0.33NV Running Pegasus This one is going to go Waiting to run, how can I stop this from happening? Help... The days figure is growing... . . Can you monitor the memory usage on your GPU card? If there is insufficient memory it can exit a task leaving it in the "Waiting to run" state. I have 1.5GB on hand in the gtx580, of that 935MB is being used(while crunching), I also have 512 cuda cores(in case anyone needs to know). I'll try 2, instead of 3. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's ID: 1794747 ·

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49	Message 1794752 - Posted: 9 Jun 2016, 16:27:51 UTC 564MB of memory used while under 0.45, vs 935MB. I'm running 2 wu's on the PNY LC 580 card now, though I'd rather run 3... The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's ID: 1794752 ·

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49	Message 1794753 - Posted: 9 Jun 2016, 16:28:59 UTC - in response to Message 1794741. Wait for something that works better, I get the same thing, only time to completion becomes days, instead of minutes, I tried 3, then 6, then 33, same result, I can't figure this out, I just went back to cuda42, I give up. Zoom, I would suggest posting your entire commandline rather than just snips of it. It's hard to tell what the computer sees when you only post a portion of it. This is what I have(By your command): -use_sleep -instances_per_device N: 2 The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's ID: 1794753 ·

Rasputin42 Volunteer tester Send message Joined: 25 Jul 08 Posts: 412 Credit: 5,834,661 RAC: 0	Message 1794755 - Posted: 9 Jun 2016, 16:36:36 UTC -use_sleep -instances_per_device N: 2 I think, that should be: -use_sleep -instances_per_device 2 ID: 1794755 ·

Zalster Volunteer tester Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1794756 - Posted: 9 Jun 2016, 16:42:05 UTC - in response to Message 1794755. Last modified: 9 Jun 2016, 16:43:23 UTC -use_sleep -instances_per_device N: 2 I think, that should be: -use_sleep -instances_per_device 2 Rasputin is correct. Zoom try this -use_sleep -sbs 512 -total_GPU_instances_num 2 -instance_per_device 2 Edit.. Only 1 GPU in your machine correct? ID: 1794756 ·
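
The correction above (a bare integer after the flag, not "N: 2") can be checked mechanically. The following is a minimal Python sketch, not the application's real parser: the flag names are the ones quoted in this thread, which of them take a value is an assumption, and `check_cmdline` is a hypothetical helper. Note that the thread spells one flag both as -instances_per_device and as -instance_per_device; the sketch uses the first spelling, and which one the application accepts is not confirmed here.

```python
# Flags quoted in this thread. Whether each takes a value is an assumption.
TAKES_VALUE = {"-sbs", "-instances_per_device", "-total_GPU_instances_num"}
NO_VALUE = {"-use_sleep"}


def check_cmdline(cmdline):
    """Return a list of problems found in a command-line string."""
    problems = []
    tokens = cmdline.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in NO_VALUE:
            i += 1
        elif tok in TAKES_VALUE:
            # The value must be a plain integer directly after the flag.
            if i + 1 < len(tokens) and tokens[i + 1].isdigit():
                i += 2
            else:
                problems.append(f"{tok}: expected an integer right after the flag")
                i += 1
        else:
            problems.append(f"unexpected token: {tok}")
            i += 1
    return problems


for line in ("-use_sleep -instances_per_device N: 2",
             "-use_sleep -instances_per_device 2"):
    print(line, "->", check_cmdline(line) or "looks OK")
```

With these assumptions the "N: 2" form is reported as malformed, while the corrected form passes.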

Zalster Volunteer tester Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1794757 - Posted: 9 Jun 2016, 16:44:44 UTC - in response to Message 1794756. You can always decrease the -sbs to 256 if you want and see if that works better ID: 1794757 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1794797 - Posted: 9 Jun 2016, 20:32:55 UTC - in response to Message 1794660. I have noticed, that when running 2 instances of sog, it runs both tasks for a while and then one makes no progress any more. It finishes the other and starts a new one which it continues to process. The first one is still making no progress, but the elapsed time keeps going. If I suspend all other tasks, it finally makes progress again and eventually finishes. I have used -instances_per_per_device 2. Any suggestions? Link to such task result? ID: 1794797 ·

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49	Message 1794798 - Posted: 9 Jun 2016, 20:34:39 UTC - in response to Message 1794756. -use_sleep -instances_per_device N: 2 I think, that should be: -use_sleep -instances_per_device 2 Rasputin is correct. Zoom try this -use_sleep -sbs 512 -total_GPU_instances_num 2 -instance_per_device 2 Edit.. Only 1 GPU in your machine correct? So far, but yes. I'll try the one you provided Zalster. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's ID: 1794798 ·

Rasputin42 Volunteer tester Send message Joined: 25 Jul 08 Posts: 412 Credit: 5,834,661 RAC: 0	Message 1794799 - Posted: 9 Jun 2016, 20:37:18 UTC - in response to Message 1794797. I have noticed, that when running 2 instances of sog, it runs both tasks for a while and then one makes no progress any more. It finishes the other and starts a new one which it continues to process. The first one is still making no progress, but the elapsed time keeps going. If I suspend all other tasks, it finally makes progress again and eventually finishes. I have used -instances_per_per_device 2. Any suggestions? Link to such task result? I will have to set it up again. Currently doing cuda50. ID: 1794799 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1794800 - Posted: 9 Jun 2016, 20:37:23 UTC - in response to Message 1794797. I have noticed, that when running 2 instances of sog, it runs both tasks for a while and then one makes no progress any more. It finishes the other and starts a new one which it continues to process. The first one is still making no progress, but the elapsed time keeps going. If I suspend all other tasks, it finally makes progress again and eventually finishes. I have used -instances_per_per_device 2. Any suggestions? Link to such task result? Stephen's message 1794739 earlier, about VRAM overcommit, may be relevant. Link to result would be helpful - we can check if it's a BOINC client which supports temporary exit. ID: 1794800 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1794801 - Posted: 9 Jun 2016, 20:51:07 UTC - in response to Message 1794800. I have noticed, that when running 2 instances of sog, it runs both tasks for a while and then one makes no progress any more. It finishes the other and starts a new one which it continues to process. The first one is still making no progress, but the elapsed time keeps going. If I suspend all other tasks, it finally makes progress again and eventually finishes. I have used -instances_per_per_device 2. Any suggestions? Link to such task result? Stephen's message 1794739 earlier, about VRAM overcommit, may be relevant. Link to result would be helpful - we can check if it's a BOINC client which supports temporary exit. No, it's not. Please get into the habit of posting at least a link to the result in question. The app prints how much memory it uses for a particular task in its stderr. Regarding SoG task progress: if the task is VHAR and processing enters the SoG-only phase, the app enqueues all the work for that task to the GPU and then waits for the task to complete. What BOINC shows is irrelevant (as we know, it shows its own guesses). Because there is no lack of work for the GPU even from a single task, there is nothing wrong if the runtime decides to finish the kernel sequence from one task before switching to another (even if that is a real effect, which I strongly doubt). After all, AFAIK there is no such thing as pre-emptive context switching on GPUs so far. ID: 1794801 ·

Rasputin42 Volunteer tester Send message Joined: 25 Jul 08 Posts: 412 Credit: 5,834,661 RAC: 0	Message 1794803 - Posted: 9 Jun 2016, 20:59:05 UTC Last modified: 9 Jun 2016, 21:01:15 UTC http://setiathome.berkeley.edu/result.php?resultid=4973141178 This might be one of them, but no guarantee. I had to use the version you issued to not get stuck. https://setiathome.berkeley.edu/forum_thread.php?id=79629&postid=1793393 I will do a fresh one. ID: 1794803 ·

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49	Message 1794804 - Posted: 9 Jun 2016, 21:08:04 UTC - in response to Message 1794798. -use_sleep -instances_per_device N: 2 I think, that should be: -use_sleep -instances_per_device 2 Rasputin is correct. Zoom try this -use_sleep -sbs 512 -total_GPU_instances_num 2 -instance_per_device 2 Edit.. Only 1 GPU in your machine correct? So far, but yes. I'll try the one you provided Zalster. I modified the sbs setting first to 256(from 512), then to 192, 2 wu's at 192 are using 864MB of the gtx580 ram, 256 used about 996MB, 512 used 1249MB. After I had installed the line with sbs in it and shut down and restarted Boinc, two cuda42 gpu wu's went into 'waiting to run' mode, and a pair of Guppi's took over. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's ID: 1794804 ·
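
The memory figures in the post above lend themselves to a quick bit of arithmetic. The Python sketch below uses only the numbers reported in this thread (two tasks on a GTX 580); it assumes that "1.5GB" means 1536 MB and that the reported totals include whatever else was using the card at the time. `headroom` and `marginal_cost` are hypothetical helper names.

```python
# Reported above for two tasks: -sbs value -> MB of GPU memory in use.
reported = {192: 864, 256: 996, 512: 1249}
CARD_MB = 1536  # assumption: "1.5GB" on the GTX 580 taken as 1536 MB


def headroom(usage, card_mb):
    """Free memory left on the card for each measured -sbs value."""
    return {sbs: card_mb - used for sbs, used in sorted(usage.items())}


def marginal_cost(usage):
    """Extra MB per unit of -sbs between consecutive measurements."""
    points = sorted(usage.items())
    return [((a, b), (used_b - used_a) / (b - a))
            for (a, used_a), (b, used_b) in zip(points, points[1:])]


print(headroom(reported, CARD_MB))
print(marginal_cost(reported))
```

On these three data points the cost per unit of -sbs is not constant, so extrapolating to other settings or to a third task would be guesswork; measuring on the card itself is the safer route.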

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1794816 - Posted: 9 Jun 2016, 21:44:32 UTC - in response to Message 1794803. Last modified: 9 Jun 2016, 21:48:19 UTC http://setiathome.berkeley.edu/result.php?resultid=4973141178 This might be one of them, but no guarantee. I had to use the version you issued to not get stuck. https://setiathome.berkeley.edu/forum_thread.php?id=79629&postid=1793393 I will do a fresh one. No errors or restarts for the posted result. It's just OK, with ~52 min of elapsed time and ~3 min of CPU time (not bad either). EDIT: and with -sbs 256 it uses: Currently allocated 357 MB for GPU buffers ID: 1794816 ·
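
Since the app reports its buffer allocation in stderr, the figure can be pulled out of a saved result page with a short script. This Python sketch assumes the line always has the exact wording quoted in the post above ("Currently allocated 357 MB for GPU buffers"); the real output may vary between builds, and `allocated_mb` is a hypothetical helper.

```python
import re

# Pattern based on the single stderr line quoted in this thread.
ALLOC_RE = re.compile(r"Currently allocated (\d+) MB for GPU buffers")


def allocated_mb(stderr_text):
    """Return every reported GPU buffer allocation (in MB) found in the text."""
    return [int(value) for value in ALLOC_RE.findall(stderr_text)]


sample = "some other output\nCurrently allocated 357 MB for GPU buffers\n"
print(allocated_mb(sample))
```

Multiplying the reported figure by the number of instances gives a rough lower bound on the VRAM the tasks need, which can then be compared with what the card has free.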

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49	Message 1794817 - Posted: 9 Jun 2016, 21:46:53 UTC No matter what I do, it's waiting to run, then one runs really slow, no cpu% and becomes waiting, Zalster tried, but nothing works here, I'm going back to cuda 42, that works... Let Me know when this works and doesn't require a degree in computer programming to crunch with... Bye. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's ID: 1794817 ·



Open Beta test: SoG for NVidia, Lunatics v0.45 - Beta6 (RC again)