Panic Mode On (84) Server Problems?


Message boards : Number crunching : Panic Mode On (84) Server Problems?

juan BFB (Project donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 5154
Credit: 280,605,385
RAC: 441,079
Brazil
Message 1373278 - Posted: 30 May 2013, 4:57:01 UTC
Last modified: 30 May 2013, 5:01:57 UTC

Jason

I could understand an increase in the time to process a WU, but a change from 10-15 min to about 1 1/2 - 2 hours? On a 670 that was running 3 WUs at a time before and now runs 2 at a time, that makes little sense.

So the question remains: why not keep the VLARs away from the Nvidias? It worked that way for years and everyone was happy with that, so why change a winning team? VLARs on the CPUs run fine with little degradation.

Not to mention that, with VLARs on the GPUs, the video response starts to show problems (lag, etc.), which in some situations makes it impossible to keep the GPU crunching while running other tasks, as we could in the past.

So the question remains: is there anything in the client configuration that lets us keep VLARs from being crunched by the Nvidias?
____________

jason_gee (Project donor)
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 4955
Credit: 72,940,433
RAC: 12,534
Australia
Message 1373281 - Posted: 30 May 2013, 5:05:21 UTC - in response to Message 1373259.

I agree the Nvidias do not like the VLARs...

Is there any configuration we could use to keep the GPU from receiving VLARs?


Looks like it'll be system specific, so I'll probably present the 3 options I mentioned in response to Fred E, to the project. Try the reduced instances / settings described. See if anything changes if you free a CPU core.
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

juan BFB (Project donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 5154
Credit: 280,605,385
RAC: 441,079
Brazil
Message 1373282 - Posted: 30 May 2013, 5:10:28 UTC

It's 2 AM here, so I will run the test tomorrow with a little less beer in my head.

Have a good night/day.
____________

jason_gee (Project donor)
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 4955
Credit: 72,940,433
RAC: 12,534
Australia
Message 1373283 - Posted: 30 May 2013, 5:11:02 UTC - in response to Message 1373278.
Last modified: 30 May 2013, 5:38:01 UTC

So the question remains: is there anything in the client configuration that lets us keep VLARs from being crunched by the Nvidias?


Not yet client side. If there needs to be a setting for stock, it'll likely be server side, or removed from sending. It's only been out for a few hours & these limitations need to be found. Options here have been discussed, but the best one has not been determined yet:
- Removing VLARs from being sent to these GPUs, OR
- a change in default settings, OR
- an Opt-in/Opt-out feature. [e.g. My own aging Core2Duo with GTX 680 happily crunches them while watching the Starship Troopers Trilogy, I'd like to crunch them because they are longer & should hopefully get more credit]



Try the suggested settings & report please. These issues were not detected here & need to be characterised. This is what Beta project participation was intended for, but never seems to cover the full range of systems & configurations come release day.

Since you asked why try the change: at this point, automatically blocking VLARs for every single nVidia card stalls my development somewhat, and not trying new settings for a new app will not help determine the best possible course (what options to put forward, things to change, etc.).

It's 2 AM here, so I will run the test tomorrow with a little less beer in my head.

Have a good night/day.


Sleep well,

Jason
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1198
Credit: 44,366,817
RAC: 114,388
United States
Message 1373290 - Posted: 30 May 2013, 5:37:17 UTC

So, I'm just about out of tasks for x41g on my 8800. I have quite a few CPU & ATI APs left. Is there some way to add a new section to my app_info to receive tasks for my 8800? I tried the new cuda32 app a while ago; x41g was slightly faster. Seems I don't have that option anymore. The current section is:

<app>
    <name>setiathome_enhanced</name>
</app>
<file_info>
    <name>Lunatics_x41g_win32_cuda32.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>cudart32_32_16.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>cufft32_32_16.dll</name>
    <executable/>
</file_info>
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <platform>windows_intelx86</platform>
    <version_num>609</version_num>
    <plan_class>cuda23</plan_class>
    <avg_ncpus>0.04</avg_ncpus>
    <max_ncpus>0.08</max_ncpus>
    <flops>60000000000</flops>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_32_16.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_32_16.dll</file_name>
    </file_ref>
</app_version>
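
Before editing a section like the one above by hand, it can help to check that it is internally consistent. The following is only a minimal Python sketch, not an official BOINC tool: it assumes the complete file wraps these elements in an `<app_info>` root (the post shows only one section), and the function name `summarize` is made up for illustration. It lists each `<app_version>` with its plan class and flags any `<file_ref>` that has no matching `<file_info>`. The timing and coprocessor elements are left out of the sample for brevity.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the posted section. The <app_info> wrapper is an
# assumption added here so the fragment parses as one document.
FRAGMENT = """
<app_info>
  <app><name>setiathome_enhanced</name></app>
  <file_info><name>Lunatics_x41g_win32_cuda32.exe</name><executable/></file_info>
  <file_info><name>cudart32_32_16.dll</name><executable/></file_info>
  <file_info><name>cufft32_32_16.dll</name><executable/></file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <platform>windows_intelx86</platform>
    <version_num>609</version_num>
    <plan_class>cuda23</plan_class>
    <file_ref><file_name>Lunatics_x41g_win32_cuda32.exe</file_name><main_program/></file_ref>
    <file_ref><file_name>cudart32_32_16.dll</file_name></file_ref>
    <file_ref><file_name>cufft32_32_16.dll</file_name></file_ref>
  </app_version>
</app_info>
"""


def summarize(xml_text):
    """Return one dict per <app_version>, with basic consistency flags."""
    root = ET.fromstring(xml_text)
    declared_apps = {a.findtext("name") for a in root.findall("app")}
    declared_files = {f.findtext("name") for f in root.findall("file_info")}
    versions = []
    for version in root.findall("app_version"):
        refs = [r.findtext("file_name") for r in version.findall("file_ref")]
        versions.append({
            "app_name": version.findtext("app_name"),
            "version_num": version.findtext("version_num"),
            "plan_class": version.findtext("plan_class"),
            # True if the version names an app with no <app> entry.
            "undeclared_app": version.findtext("app_name") not in declared_apps,
            # Files referenced by the version but never declared.
            "missing_files": [r for r in refs if r not in declared_files],
        })
    return versions


if __name__ == "__main__":
    for entry in summarize(FRAGMENT):
        print(entry)
```

A new section for another application would go through the same check; the sketch deliberately does not guess the application or file names such a section would need.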

jason_gee (Project donor)
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 4955
Credit: 72,940,433
RAC: 12,534
Australia
Message 1373295 - Posted: 30 May 2013, 5:49:18 UTC - in response to Message 1373290.

That's right, V7 includes autocorrelation processing, so is a replacement.

I'll have to leave the new custom appinfo mods to someone who's looked at those details.

... I tried the new cuda32 app a while ago, x41g was slightly faster. Seems I don't have that option anymore.


Assuming you performed the comparisons on V6, can I have your data please? (Perhaps in a dedicated thread.) With equal angle ranges that shouldn't be the case on V6, so I'm happy to examine specific data. Depending on a number of factors, zc Cuda 2.3 should be the fastest for that 8800 by some margin. That's under V6, while under V7 zc would wipe the floor with g.

____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1198
Credit: 44,366,817
RAC: 114,388
United States
Message 1373308 - Posted: 30 May 2013, 6:00:22 UTC - in response to Message 1373295.

That's right, V7 includes autocorrelation processing, so is a replacement.

I'll have to leave the new custom appinfo mods to someone who's looked at those details.

... I tried the new cuda32 app a while ago, x41g was slightly faster. Seems I don't have that option anymore.


Assuming you performed the comparisons on V6, can I have your data please? (Perhaps in a dedicated thread.) With equal angle ranges that shouldn't be the case on V6, so I'm happy to examine specific data. Depending on a number of factors, zc Cuda 2.3 should be the fastest for that 8800 by some margin. That's under V6, while under V7 zc would wipe the floor with g.

It was weeks ago, maybe months. I think it was about 15 seconds' difference on a 4-minute shorty. The longer ones were pretty close to equal. I think it was with a different driver too; I'm using 266.58 now. If I can't receive any more cuda tasks, I'll end up conscripting a few CPU APs to run on the 8800 to hold me over. I'd rather be running cuda tasks on the card.

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 652
Credit: 5,541,860
RAC: 7,391
New Zealand
Message 1373335 - Posted: 30 May 2013, 6:51:23 UTC

Is the reason no work is being split, or only very low amounts (under 10 per second), to do with the rollout of the new application (version 7), or is there something more server-related going on?
____________

Live in NZ y not join Smile City?

Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 23685
Credit: 32,396,525
RAC: 24,153
Germany
Message 1373343 - Posted: 30 May 2013, 7:08:36 UTC

Don't forget all the new applications need to be downloaded a dozen times.
That's additional stress on the servers.

____________

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 652
Credit: 5,541,860
RAC: 7,391
New Zealand
Message 1373347 - Posted: 30 May 2013, 7:19:24 UTC - in response to Message 1373343.

Don't forget all the new applications need to be downloaded a dozen times.
That's additional stress on the servers.

True. As I write this, server bits in are about 82.83 MB, so the server isn't under great load, but I can understand why the splitters have been turned off or are working at a low rate.
____________

Live in NZ y not join Smile City?

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5774
Credit: 57,565,633
RAC: 48,284
Australia
Message 1373349 - Posted: 30 May 2013, 7:22:54 UTC - in response to Message 1373335.

Is the reason no work is being split, or only very low amounts (under 10 per second), to do with the rollout of the new application (version 7), or is there something more server-related going on?

I suspect something's borked.
____________
Grant
Darwin NT.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5774
Credit: 57,565,633
RAC: 48,284
Australia
Message 1373373 - Posted: 30 May 2013, 8:57:19 UTC - in response to Message 1373349.


If anyone has a copy of the following files, I'd appreciate it.
setiathome_6.10_windows_intelx86__cuda_fermi.exe
libfftw3f-3-1-1a_upx.dll
cudart32_30_14.dll
cufft32_30_14.dll

They are all queued up to download, and immediately time out every time I hit retry. No more crunching until I can get them.
:-(
____________
Grant
Darwin NT.

Lionel
Joined: 25 Mar 00
Posts: 544
Credit: 222,196,376
RAC: 212,436
Australia
Message 1373374 - Posted: 30 May 2013, 8:58:52 UTC - in response to Message 1373373.

I have just moved all my boxes to v7. Didn't take that long.

Curious thing is:
the first box is dual GTX580s and it received v7 cuda42 WUs, later a few 32s and then some 50s;
the second box is dual GTX295s and it received v7 cuda50 WUs, and later on 5 cuda22s;
the third box is dual GTX580s and it received v7 cuda32 WUs.

All are running 1 WU per GPU at present.

However, is there a way to stop cuda50 WUs landing on the GTX295s? They appear to be taking an age to process.

ps. I also dropped this in another thread
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5774
Credit: 57,565,633
RAC: 48,284
Australia
Message 1373377 - Posted: 30 May 2013, 9:04:44 UTC - in response to Message 1373374.

However, is there a way to stop cuda50 WUs landing on the GTX295s? They appear to be taking an age to process.

From what I can gather, the multiple applications for multiple work units is part of the automatic application optimisation.
Let it do its thing & it'll end up using the one that performs the best.
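
As a rough illustration of the idea of trying several application versions and settling on the best performer, here is a minimal Python sketch. It is not the project's actual selection logic: the function name `fastest_plan_class`, the `min_samples` threshold, and the runtimes are all hypothetical. It simply averages observed runtimes per plan class and picks the lowest once each class has enough samples.

```python
from collections import defaultdict
from statistics import mean


def fastest_plan_class(results, min_samples=2):
    """Pick the plan class with the lowest mean runtime.

    results: iterable of (plan_class, elapsed_seconds) pairs.
    Classes with fewer than min_samples results are ignored,
    since one task is too little to judge by.
    """
    by_class = defaultdict(list)
    for plan_class, seconds in results:
        by_class[plan_class].append(seconds)
    averages = {
        plan_class: mean(times)
        for plan_class, times in by_class.items()
        if len(times) >= min_samples
    }
    if not averages:
        return None
    return min(averages, key=averages.get)


# Hypothetical runtimes in seconds, invented for this example only.
sample = [
    ("cuda50", 5100.0),
    ("cuda50", 4900.0),
    ("cuda22", 1500.0),
    ("cuda22", 1700.0),
    ("cuda32", 900.0),  # only one result so far, not yet trusted
]

if __name__ == "__main__":
    print(fastest_plan_class(sample))
```

A fair comparison would also need tasks of similar angle range, since runtimes vary a lot between task types; the sketch ignores that.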

____________
Grant
Darwin NT.

Lionel
Joined: 25 Mar 00
Posts: 544
Credit: 222,196,376
RAC: 212,436
Australia
Message 1373378 - Posted: 30 May 2013, 9:07:48 UTC - in response to Message 1373377.

Hope so as I have a GPU WU that has been going for 37 minutes and has another 49 minutes to run. The irony is that as each minute goes by, time to complete increases.
____________

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 8442
Credit: 48,102,381
RAC: 66,193
United Kingdom
Message 1373391 - Posted: 30 May 2013, 9:27:49 UTC - in response to Message 1373259.

I agree the Nvidias do not like the VLARs...

Is there any configuration we could use to keep the GPU from receiving VLARs?

It seems that VLARs are only being sent to Kepler-class GPUs, which is what we tested at Beta.

v6 VLAR resends are also being sent to Keplers, which will both extend the time before I have to change the configuration on mine, and help with the v6 cleanout. Win-win.

popandbob
Volunteer tester
Joined: 19 Mar 05
Posts: 535
Credit: 1,896,421
RAC: 0
Canada
Message 1373405 - Posted: 30 May 2013, 9:56:59 UTC

Got 52 Tasks first try... All downloaded just fine.
Saw a funny message in BOINC a couple min ago though...

30/05/2013 3:52:34 AM | SETI@home | Sending scheduler request: To fetch work.
30/05/2013 3:52:34 AM | SETI@home | Not requesting tasks
30/05/2013 3:52:36 AM | SETI@home | Scheduler request completed
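
For anyone who wants to spot this pattern in a saved event log, here is a minimal Python sketch. It assumes the `date | project | message` layout and the day/month/year 12-hour timestamps seen in the three lines above; other BOINC versions or locales may format lines differently, and the helper names `parse_line` and `find_contradictions` are made up for illustration. It flags a "To fetch work" request that is immediately followed by "Not requesting tasks" for the same project.

```python
from datetime import datetime

# The three lines quoted in the post above.
LOG = """\
30/05/2013 3:52:34 AM | SETI@home | Sending scheduler request: To fetch work.
30/05/2013 3:52:34 AM | SETI@home | Not requesting tasks
30/05/2013 3:52:36 AM | SETI@home | Scheduler request completed
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    when = datetime.strptime(stamp, "%d/%m/%Y %I:%M:%S %p")
    return when, project, message


def find_contradictions(log_text):
    """Return (timestamp, project) for each fetch request that asks for nothing."""
    entries = [parse_line(line) for line in log_text.splitlines() if line.strip()]
    hits = []
    # Compare each entry with the one directly after it.
    for (t1, p1, m1), (_, p2, m2) in zip(entries, entries[1:]):
        if p1 == p2 and "To fetch work" in m1 and m2.startswith("Not requesting tasks"):
            hits.append((t1, p1))
    return hits


if __name__ == "__main__":
    for when, project in find_contradictions(LOG):
        print(f"{when:%Y-%m-%d %H:%M:%S} {project}: fetch request with no tasks requested")
```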


____________


Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 8442
Credit: 48,102,381
RAC: 66,193
United Kingdom
Message 1373418 - Posted: 30 May 2013, 10:13:46 UTC - in response to Message 1373405.

Got 52 Tasks first try... All downloaded just fine.
Saw a funny message in BOINC a couple min ago though...

30/05/2013 3:52:34 AM | SETI@home | Sending scheduler request: To fetch work.
30/05/2013 3:52:34 AM | SETI@home | Not requesting tasks
30/05/2013 3:52:36 AM | SETI@home | Scheduler request completed

Thanks. It's been reported as a bug, but the devs don't seem to be very interested. Usually, it clears itself at the next work request: if not, one user reported that it cleared itself when he re-read the config file.

Or was that the other bug, where it shouldn't be requesting work, but does - again and again? Sorry, I'm losing the plot here.

The_bestest
Joined: 7 Oct 06
Posts: 27
Credit: 39,549,314
RAC: 24,326
United States
Message 1373430 - Posted: 30 May 2013, 10:30:25 UTC
Last modified: 30 May 2013, 10:31:33 UTC

I happened to look on the Home page yesterday morning. No information about v7 being pushed out. Yes, I saw the notification that 7 was available, but NOWHERE did it state that the upgrade was required. Now I find that v7 was deployed to my machine without my consent. NOT COOL. No one should EVER allow software to be installed/upgraded without an explicit approval. This may be a deal breaker for me, that the admins running this project would once again make a huge change and then beg for forgiveness after the fact.

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 8442
Credit: 48,102,381
RAC: 66,193
United Kingdom
Message 1373431 - Posted: 30 May 2013, 10:40:46 UTC - in response to Message 1373430.

I happened to look on the Home page yesterday morning. No information about v7 being pushed out. Yes, I saw the notification that 7 was available, but NOWHERE did it state that the upgrade was required. Now I find that v7 was deployed to my machine without my consent. NOT COOL. No one should EVER allow software to be installed/upgraded without an explicit approval. This may be a deal breaker for me, that the admins running this project would once again make a huge change and then beg for forgiveness after the fact.

The whole design of BOINC is that science applications (and by implication, updated science applications) are distributed silently but securely. They come from secured servers, are digitally signed, and are run in a sandboxed work area.

Some people have asked, in the past, that BOINC itself should auto-update, as so much commercial (and even open source) software does these days: but the developers have always refused - both for security, and for the reasons you give. You have to download and install BOINC manually, and by doing so, you buy into the automatic science app distribution.


Copyright © 2014 University of California