Open Beta test: SoG for NVidia, Lunatics v0.45 - Beta6 (RC again)

Message boards : Number crunching : Open Beta test: SoG for NVidia, Lunatics v0.45 - Beta6 (RC again)

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1794731 - Posted: 9 Jun 2016, 15:15:30 UTC - in response to Message 1794655.  

. . . Hello Richard,

. . . You haven't mentioned it but is there any chance that SSE4.1 support has been added in 0.45 Beta?

No, it hasn't - no developer has supplied me with any updated CPU applications since the v0.44 launch to support SaH v8.

I need to deploy it on my system with the GT730 which is Core2 Duo based. It might be helpful if SSE4.1 is available to help things along.

Stephen

You don't strictly 'need' it. SIMD hardware support is cumulative - extra capabilities are added to newer CPU designs, but the old ones are never removed. There are one or two gaps where Intel and AMD followed different pathways for a while, but during that phase of development, the incremental steps were relatively small.

Sure, SSSE3 and SSE4.1 would be 'nice to have', but your Core2 Duo will get along pretty well with SSE3 until the developers can catch their breath and regroup.


. . It is doing so as we speak. But I was just wondering.

:)
ID: 1794731 · Report as offensive
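Richard's point that SIMD support is cumulative can be sketched in a few lines of Python. This is only an illustration: the level names, the `best_build` helper and both sets below are assumptions made up for the example, not the names of any real Lunatics or stock application builds.

```python
# Hypothetical SIMD levels, newest first. Illustrative labels only.
SIMD_LEVELS = ["sse4.1", "ssse3", "sse3", "sse2"]


def best_build(cpu_flags, shipped_builds):
    """Return the newest SIMD level that the CPU supports AND that a build exists for."""
    for level in SIMD_LEVELS:
        if level in cpu_flags and level in shipped_builds:
            return level
    return None  # no usable build at all


# Assumed for illustration: a CPU that reports everything up to SSE4.1 ...
cpu = {"sse2", "sse3", "ssse3", "sse4.1"}
# ... and an installer that only ships builds up to SSE3.
shipped = {"sse2", "sse3"}

print(best_build(cpu, shipped))
```

Because older instruction sets are never removed, a newer CPU simply falls back to the best build that is actually available, which is the situation described above.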
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1794733 - Posted: 9 Jun 2016, 15:19:26 UTC - in response to Message 1794666.  

Hey Rasputin42,

Are those blc5 tasks? I have noticed that those run more than 3 times slower, so it appears they are stalled, but they are still running.

And GPU temps are really low when running them.



. . SoG is running nicely on this machine, but blc5 tasks certainly take longer than blc6; they fooled me into thinking something was wrong at one point. I am wondering how blc7 will behave, as I have a lot of them coming up.
ID: 1794733 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1794735 - Posted: 9 Jun 2016, 15:21:34 UTC - in response to Message 1794690.  

With any "new" GPU installation it is worth running with only one task for a few days just to see what the thing will do in the base situation, then step it up to two for a few more days, finally if that is OK, push up to three. As Mike says I very much doubt that a GTX720M is up to running more than one task at a time.



. . My GT730 is only barely capable of running doubles, and comes to grief when a Guppie swims by.
ID: 1794735 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1794739 - Posted: 9 Jun 2016, 15:28:00 UTC - in response to Message 1794723.  

How many times must this be said - the critical thing with GPUs is not MEMORY, but the number of GPU "cores" and their management.
There is probably enough memory to support half a dozen tasks, but if you try to run more than a couple of tasks (particularly SoG), the GPU's internal task manager will be seriously struggling long before you reach that number.

Another thing to consider is that the current data from the servers is dominated by guppi (from the GBT) for which CUDA is not best suited - my GTX960 rig would quite happily run three "normal" Arecibo MB tasks, but try running three guppi at once and it started to sweat, it is much happier running two of them - that is quite a hit!




. . FWIW, I received the suggestion to run -sbs 512 on my GTX950 to persuade Guppies to play nice. But it wanted more than the 2048MB on the card, and the result was that it kept dropping out of WUs and leaving them in the "waiting to run" state.

. . The max it will support is -sbs 384, so it may be a lack of sufficient memory; worth checking out.
ID: 1794739 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1794741 - Posted: 9 Jun 2016, 15:33:05 UTC - in response to Message 1794665.  

Wait for something that works better. I get the same thing, only time to completion becomes days instead of minutes. I tried 3, then 6, then 33: same result. I can't figure this out, so I just went back to cuda42. I give up.



Zoom, I would suggest posting your entire commandline rather than just snips of it.

It's hard to tell what the computer sees when you only post a portion of it.
ID: 1794741 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64941
Credit: 55,293,173
RAC: 49
United States
Message 1794747 - Posted: 9 Jun 2016, 16:09:03 UTC - in response to Message 1794730.  

My main problem, even though I have the sleep command running, is that I have a gpu wu 'running HP' and a gpu wu in 'waiting to run', it's probably nothing, but I thought I'd mention it.

I have 3 cpu and 3 gpu wu's running, plus some SoG has been downloaded.

I'm also getting this, note the days figure:

SETI@home 8.00 setiathome_v8 (cuda42) 25se10ad.23501.12750.7.34.2_1 00:49:30 (00:00:10) 0.35 0.001 3372d,22:39:51 71.0 °C 0.04C + 0.33NV Running Pegasus


This one is going to go Waiting to run, how can I stop this from happening?

Help...

The days figure is growing...


. . Can you monitor the memory usage on your GPU card? If there is insufficient memory it can exit a task leaving it in the "Waiting to run" state.

I have 1.5GB on hand in the GTX 580; of that, 935MB is being used (while crunching). I also have 512 CUDA cores (in case anyone needs to know).

I'll try 2, instead of 3.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794747 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64941
Credit: 55,293,173
RAC: 49
United States
Message 1794752 - Posted: 9 Jun 2016, 16:27:51 UTC

564MB of memory used while under 0.45, vs 935MB.

I'm running 2 wu's on the PNY LC 580 card now, though I'd rather run 3...
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794752 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64941
Credit: 55,293,173
RAC: 49
United States
Message 1794753 - Posted: 9 Jun 2016, 16:28:59 UTC - in response to Message 1794741.  

Wait for something that works better. I get the same thing, only time to completion becomes days instead of minutes. I tried 3, then 6, then 33: same result. I can't figure this out, so I just went back to cuda42. I give up.



Zoom, I would suggest posting your entire commandline rather than just snips of it.

It's hard to tell what the computer sees when you only post a portion of it.


This is what I have (by your command):

-use_sleep -instances_per_device N: 2
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794753 · Report as offensive
Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1794755 - Posted: 9 Jun 2016, 16:36:36 UTC

-use_sleep -instances_per_device N: 2



I think, that should be:

-use_sleep -instances_per_device 2
ID: 1794755 · Report as offensive
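A stray token such as `N:` is easy to miss by eye. The sketch below is a hypothetical checker, not part of the app: `KNOWN_FLAGS` contains only switches that appear in this thread (the real application accepts more), and how the real app treats tokens it does not recognise is not stated here.

```python
# Switches seen in this thread, mapped to whether they take a numeric value.
# Assumption: this is NOT the application's complete option list.
KNOWN_FLAGS = {
    "-use_sleep": False,
    "-sbs": True,
    "-instances_per_device": True,
    "-total_GPU_instances_num": True,
}


def check_cmdline(cmdline):
    """Return a list of human-readable problems found in a command line string."""
    problems = []
    tokens = cmdline.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok not in KNOWN_FLAGS:
            problems.append(f"unexpected token: {tok!r}")
            i += 1
        elif KNOWN_FLAGS[tok]:
            value = tokens[i + 1] if i + 1 < len(tokens) else None
            if value is not None and value.isdigit():
                i += 2  # consume the switch and its number
            else:
                problems.append(f"{tok} needs a number, got {value!r}")
                i += 1  # leave the bad value to be reported on its own
        else:
            i += 1
    return problems


for line in ("-use_sleep -instances_per_device N: 2",
             "-use_sleep -instances_per_device 2"):
    print(line, "->", check_cmdline(line) or "looks OK")
```

The first line is reported as problematic because `N:` sits where the number should be; the second passes cleanly.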
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1794756 - Posted: 9 Jun 2016, 16:42:05 UTC - in response to Message 1794755.  
Last modified: 9 Jun 2016, 16:43:23 UTC

-use_sleep -instances_per_device N: 2



I think, that should be:

-use_sleep -instances_per_device 2


Rasputin is correct.

Zoom try this

-use_sleep -sbs 512 -total_GPU_instances_num 2 -instances_per_device 2

Edit..

Only 1 GPU in your machine correct?
ID: 1794756 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1794757 - Posted: 9 Jun 2016, 16:44:44 UTC - in response to Message 1794756.  

You can always decrease the -sbs to 256 if you want and see if that works better
ID: 1794757 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6324
Credit: 106,370,077
RAC: 121
Russia
Message 1794797 - Posted: 9 Jun 2016, 20:32:55 UTC - in response to Message 1794660.  

I have noticed that when running 2 instances of SoG, it runs both tasks for a while and then one makes no progress any more. It finishes the other and starts a new one, which it continues to process. The first one is still making no progress, but the elapsed time keeps going. If I suspend all other tasks, it finally makes progress again and eventually finishes.
I have used -instances_per_per_device 2.
Any suggestions?

Link to such task result?
ID: 1794797 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64941
Credit: 55,293,173
RAC: 49
United States
Message 1794798 - Posted: 9 Jun 2016, 20:34:39 UTC - in response to Message 1794756.  

-use_sleep -instances_per_device N: 2



I think, that should be:

-use_sleep -instances_per_device 2


Rasputin is correct.

Zoom try this

-use_sleep -sbs 512 -total_GPU_instances_num 2 -instances_per_device 2

Edit..

Only 1 GPU in your machine correct?

So far, but yes. I'll try the one you provided Zalster.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794798 · Report as offensive
Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1794799 - Posted: 9 Jun 2016, 20:37:18 UTC - in response to Message 1794797.  

I have noticed that when running 2 instances of SoG, it runs both tasks for a while and then one makes no progress any more. It finishes the other and starts a new one, which it continues to process. The first one is still making no progress, but the elapsed time keeps going. If I suspend all other tasks, it finally makes progress again and eventually finishes.
I have used -instances_per_per_device 2.
Any suggestions?


Link to such task result?


I will have to set it up again. Currently doing cuda50.
ID: 1794799 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14505
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1794800 - Posted: 9 Jun 2016, 20:37:23 UTC - in response to Message 1794797.  

I have noticed that when running 2 instances of SoG, it runs both tasks for a while and then one makes no progress any more. It finishes the other and starts a new one, which it continues to process. The first one is still making no progress, but the elapsed time keeps going. If I suspend all other tasks, it finally makes progress again and eventually finishes.
I have used -instances_per_per_device 2.
Any suggestions?

Link to such task result?

Stephen's message 1794739 earlier, about VRAM overcommit, may be relevant.

Link to result would be helpful - we can check if it's a BOINC client which supports temporary exit.
ID: 1794800 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6324
Credit: 106,370,077
RAC: 121
Russia
Message 1794801 - Posted: 9 Jun 2016, 20:51:07 UTC - in response to Message 1794800.  

I have noticed that when running 2 instances of SoG, it runs both tasks for a while and then one makes no progress any more. It finishes the other and starts a new one, which it continues to process. The first one is still making no progress, but the elapsed time keeps going. If I suspend all other tasks, it finally makes progress again and eventually finishes.
I have used -instances_per_per_device 2.
Any suggestions?

Link to such task result?

Stephen's message 1794739 earlier, about VRAM overcommit, may be relevant.

Link to result would be helpful - we can check if it's a BOINC client which supports temporary exit.

No, it's not.
Please get into the habit of posting at least a link to the result in question.
The app prints how much memory it uses for a particular task in its stderr.

Regarding SoG task progress: if the task is VHAR, processing enters a SoG-only phase in which the app enqueues all the work for that task to the GPU and then waits for the task to complete. What BOINC shows is irrelevant (as we know, it shows its own guesses). Because there is no lack of work for the GPU even from a single task, there is nothing wrong if the runtime decides to finish the kernel sequence from one task before switching to another (even if this is a real effect, which I strongly doubt). After all, there is no such thing as pre-emptive context switching on GPUs so far, AFAIK.
ID: 1794801 · Report as offensive
Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1794803 - Posted: 9 Jun 2016, 20:59:05 UTC
Last modified: 9 Jun 2016, 21:01:15 UTC

http://setiathome.berkeley.edu/result.php?resultid=4973141178

This might be one of them, but no guarantee. I had to use the version you issued to not get stuck. https://setiathome.berkeley.edu/forum_thread.php?id=79629&postid=1793393

I will do a fresh one.
ID: 1794803 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64941
Credit: 55,293,173
RAC: 49
United States
Message 1794804 - Posted: 9 Jun 2016, 21:08:04 UTC - in response to Message 1794798.  

-use_sleep -instances_per_device N: 2



I think, that should be:

-use_sleep -instances_per_device 2


Rasputin is correct.

Zoom try this

-use_sleep -sbs 512 -total_GPU_instances_num 2 -instances_per_device 2

Edit..

Only 1 GPU in your machine correct?

So far, but yes. I'll try the one you provided Zalster.

I modified the -sbs setting first to 256 (from 512), then to 192. Two WUs at 192 are using 864MB of the GTX 580's RAM; 256 used about 996MB, and 512 used 1249MB.

After I had installed the line with sbs in it and shut down and restarted Boinc, two cuda42 gpu wu's went into 'waiting to run' mode, and a pair of Guppi's took over.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794804 · Report as offensive
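As a rough way of reasoning about headroom from the figures just reported, here is a back-of-the-envelope sketch. It rests on loud assumptions: that "1.5GB" means 1536 MB, that all of the memory in use belongs to the two running tasks, and that usage scales linearly with the number of tasks. None of that is confirmed in the thread, so treat the output as an upper-bound guess rather than a measurement.

```python
CARD_MB = 1536  # assumption: "1.5GB" on the GTX 580 read as 1536 MB

# Total GPU memory in use with TWO tasks running, keyed by -sbs value
# (figures taken from the post above).
used_with_two = {192: 864, 256: 996, 512: 1249}


def estimate_mb(sbs, tasks):
    """Crude linear estimate: per-task usage = measured usage / 2."""
    return used_with_two[sbs] / 2 * tasks


for sbs in sorted(used_with_two):
    need = estimate_mb(sbs, 3)
    verdict = "fits" if need <= CARD_MB else "does not fit"
    print(f"-sbs {sbs}: ~{need:.0f} MB for three tasks, {verdict} in {CARD_MB} MB")
```

Under these assumptions, only the smaller -sbs values would leave room for a third task, and memory is not necessarily the limiting factor in any case.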
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6324
Credit: 106,370,077
RAC: 121
Russia
Message 1794816 - Posted: 9 Jun 2016, 21:44:32 UTC - in response to Message 1794803.  
Last modified: 9 Jun 2016, 21:48:19 UTC

http://setiathome.berkeley.edu/result.php?resultid=4973141178

This might be one of them, but no guarantee. I had to use the version you issued to not get stuck. https://setiathome.berkeley.edu/forum_thread.php?id=79629&postid=1793393

I will do a fresh one.

No errors or restarts for the posted result. It's just OK, with ~52 min of elapsed time and ~3 min of CPU time (not bad either).

EDIT: and with -sbs 256 it uses:
Currently allocated 357 MB for GPU buffers
ID: 1794816 · Report as offensive
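Since the app reports its buffer allocation in stderr, the figure can be pulled out of a saved result page or stderr dump with a small script. This is a minimal sketch: the pattern matches only the exact wording quoted above, `gpu_buffer_mb` is a hypothetical helper name, and the surrounding lines in `sample` are placeholders rather than real app output.

```python
import re

# Matches the line quoted above, e.g. "Currently allocated 357 MB for GPU buffers".
ALLOC_RE = re.compile(r"Currently allocated (\d+) MB for GPU buffers")


def gpu_buffer_mb(stderr_text):
    """Return every reported GPU buffer allocation (in MB) found in the text."""
    return [int(mb) for mb in ALLOC_RE.findall(stderr_text)]


# Placeholder text around the one line that actually appears in this thread.
sample = """(other stderr lines)
Currently allocated 357 MB for GPU buffers
(other stderr lines)"""

print(gpu_buffer_mb(sample))
```

Comparing this number across different -sbs settings gives the per-task allocation directly, without having to infer it from total card usage.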
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64941
Credit: 55,293,173
RAC: 49
United States
Message 1794817 - Posted: 9 Jun 2016, 21:46:53 UTC

No matter what I do, it's waiting to run; then one runs really slowly with no CPU% and becomes waiting. Zalster tried, but nothing works here. I'm going back to cuda42, which works... Let me know when this works and doesn't require a degree in computer programming to crunch with...

Bye.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794817 · Report as offensive


 
©2022 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.