Panic Mode On (84) Server Problems?

juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 5847
Credit: 330,568,582
RAC: 7,823
Panama
Message 1373278 - Posted: 30 May 2013, 4:57:01 UTC
Last modified: 30 May 2013, 5:01:57 UTC

Jason

I could understand an increase in the time to process a WU, but a change from 10-15 min to about 1 1/2 - 2 hours? On a 670 that used to run 3 WUs at a time and now runs 2 at a time, that makes little sense.

So the question remains: why not keep the VLARs away from the Nvidias? It worked that way for years and everyone was happy with it, so why change a winning team? VLARs on the CPUs run fine with little degradation.

Not to mention that with the VLARs on the GPUs, the video response starts to show problems (lag, etc.), which in some situations makes it impossible to keep the GPU crunching while running other tasks, as we could in the past.

So the question remains: is there anything in the client configuration that lets us keep the VLARs from being crunched by the Nvidias?
ID: 1373278 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7243
Credit: 87,254,287
RAC: 5,272
Australia
Message 1373281 - Posted: 30 May 2013, 5:05:21 UTC - in response to Message 1373259.  

I agree the Nvidias do not like the VLARs...

Is there any configuration we could use to keep the GPU from receiving the VLARs?


Looks like it'll be system specific, so I'll probably present the 3 options I mentioned in response to Fred E, to the project. Try the reduced instances / settings described. See if anything changes if you free a CPU core.
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin
ID: 1373281 · Report as offensive
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 5847
Credit: 330,568,582
RAC: 7,823
Panama
Message 1373282 - Posted: 30 May 2013, 5:10:28 UTC

It is 2 AM here, so I will run the test tomorrow with a little less beer in my head.

Have a good night/day.
ID: 1373282 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7243
Credit: 87,254,287
RAC: 5,272
Australia
Message 1373283 - Posted: 30 May 2013, 5:11:02 UTC - in response to Message 1373278.  
Last modified: 30 May 2013, 5:38:01 UTC

So the question remains: is there anything in the client configuration that lets us keep the VLARs from being crunched by the Nvidias?


Not yet on the client side. If there needs to be a setting for stock, it'll likely be server side, or VLARs will be removed from sending. It's only been out for a few hours & these limitations need to be found. Options have been discussed, but the best one hasn't been determined yet:
- Removing VLARs from being sent to these GPUs, OR
- a change in default settings, OR
- an Opt-in/Opt-out feature. [e.g. My own aging Core2Duo with GTX 680 happily crunches them while watching the Starship Troopers Trilogy, I'd like to crunch them because they are longer & should hopefully get more credit]



Try the suggested settings & report please. These issues were not detected here & need to be characterised. This is what Beta project participation was intended for, but never seems to cover the full range of systems & configurations come release day.

Since you asked why we should try the change: at this point, automatically blocking VLARs for every single nVidia card stalls my development somewhat, and not trying new settings for a new app will not help determine the best possible course (what options to put forward, things to change, etc.).

It is 2 AM here, so I will run the test tomorrow with a little less beer in my head.

Have a good night/day.


Sleep well,

Jason
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin
ID: 1373283 · Report as offensive
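
The three options Jason lists differ mainly in who decides whether a VLAR reaches an NVIDIA GPU. A minimal sketch of that decision follows. Every name in it (`VlarPolicy`, `send_vlar_to_gpu`, `user_opted_in`) is invented for illustration, and it makes no claim about how the project's scheduler is actually written.

```python
from enum import Enum


class VlarPolicy(Enum):
    """Hypothetical labels for the three options discussed in the thread."""
    BLOCK = "block"                # stop sending VLARs to these GPUs
    NEW_DEFAULTS = "new_defaults"  # keep sending, but change the default app settings
    OPT_IN = "opt_in"              # send only to users who asked for them


def send_vlar_to_gpu(policy, user_opted_in=False):
    """Return True if a VLAR task would be sent to the GPU under the given policy."""
    if policy is VlarPolicy.BLOCK:
        return False
    if policy is VlarPolicy.OPT_IN:
        return user_opted_in
    # NEW_DEFAULTS: routing is unchanged, only the app's default settings differ.
    return True


# Someone who wants the longer tasks on a GTX 680 would opt in.
WANTS_VLARS = send_vlar_to_gpu(VlarPolicy.OPT_IN, user_opted_in=True)
print(WANTS_VLARS)
```

Under the opt-in variant the default is "not sent", which matches the behaviour people were used to before the change while still letting willing hosts take the work.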
TBar
Volunteer tester
Joined: 22 May 99
Posts: 3070
Credit: 122,766,024
RAC: 91,774
United States
Message 1373290 - Posted: 30 May 2013, 5:37:17 UTC

So, I'm just about out of tasks for x41g on my 8800. I have quite a few CPU & ATI APs left. Is there some way to add a new section to my app_info to receive tasks for my 8800? I tried the new cuda32 app a while ago; x41g was slightly faster. Seems I don't have that option anymore. The current section is:
<app>
    <name>setiathome_enhanced</name>
</app>
<file_info>
    <name>Lunatics_x41g_win32_cuda32.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>cudart32_32_16.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>cufft32_32_16.dll</name>
    <executable/>
</file_info>
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <platform>windows_intelx86</platform>
    <version_num>609</version_num>
    <plan_class>cuda23</plan_class>
    <avg_ncpus>0.04</avg_ncpus>
    <max_ncpus>0.08</max_ncpus>
    <flops>60000000000</flops>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_32_16.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_32_16.dll</file_name>
    </file_ref>
</app_version>
ID: 1373290 · Report as offensive
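
When hand-editing a section like the one TBar posted, an easy mistake is a `<file_ref>` that names a file with no matching `<file_info>` entry. The sketch below is one way to check for that before restarting BOINC. The helper name `missing_file_refs` is hypothetical, the fragment is a trimmed copy of the section above, and wrapping it in an `<app_info>` root is done here only so the parser sees a single document.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted section: file declarations plus the app_version.
FRAGMENT = """
<file_info>
    <name>Lunatics_x41g_win32_cuda32.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>cudart32_32_16.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>cufft32_32_16.dll</name>
    <executable/>
</file_info>
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <file_ref>
        <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_32_16.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_32_16.dll</file_name>
    </file_ref>
</app_version>
"""


def missing_file_refs(fragment):
    """Return file names used in <file_ref> that have no <file_info> entry."""
    # Wrap the fragment so it parses as one well-formed XML document.
    root = ET.fromstring("<app_info>" + fragment + "</app_info>")
    declared = {fi.findtext("name", "").strip() for fi in root.iter("file_info")}
    referenced = [fr.findtext("file_name", "").strip() for fr in root.iter("file_ref")]
    return [name for name in referenced if name not in declared]


MISSING = missing_file_refs(FRAGMENT)
print(MISSING)  # an empty list means every referenced file is declared
```

The check only covers internal consistency of the XML. It says nothing about whether the server will send work for a given `<version_num>` or `<plan_class>`, which is the part TBar is actually asking about.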
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7243
Credit: 87,254,287
RAC: 5,272
Australia
Message 1373295 - Posted: 30 May 2013, 5:49:18 UTC - in response to Message 1373290.  

That's right, V7 includes autocorrelation processing, so is a replacement.

I'll have to leave the new custom appinfo mods to someone who's looked at those details.

... I tried the new cuda32 app a while ago, x41g was slightly faster. Seems I don't have that option anymore.


Assuming you performed the comparisons on V6, can I have your data please? (Perhaps in a dedicated thread.) With equal angle ranges that shouldn't be the case on V6, so I'm happy to examine specific data. Depending on a number of factors, zc Cuda 2.3 should be the fastest for that 8800 by some margin. That's on V6, while under V7 zc would wipe the floor with g.

"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin
ID: 1373295 · Report as offensive
TBar
Volunteer tester
Joined: 22 May 99
Posts: 3070
Credit: 122,766,024
RAC: 91,774
United States
Message 1373308 - Posted: 30 May 2013, 6:00:22 UTC - in response to Message 1373295.  

That's right, V7 includes autocorrelation processing, so is a replacement.

I'll have to leave the new custom appinfo mods to someone who's looked at those details.

... I tried the new cuda32 app a while ago, x41g was slightly faster. Seems I don't have that option anymore.


Assuming you performed the comparisons on V6, can I have your data please? (Perhaps in a dedicated thread.) With equal angle ranges that shouldn't be the case on V6, so I'm happy to examine specific data. Depending on a number of factors, zc Cuda 2.3 should be the fastest for that 8800 by some margin. That's on V6, while under V7 zc would wipe the floor with g.

It was weeks ago, maybe months. I think it was about a 15 second difference on a 4 minute shorty. The longer ones were pretty close to equal. I think it was with a different driver too; I'm using 266.58 now. If I can't receive any more cuda tasks, I'll end up conscripting a few CPU APs to run on the 8800 to hold me over. I'd rather be running cuda tasks on the card.
ID: 1373308 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 902
Credit: 7,867,806
RAC: 1,976
New Zealand
Message 1373335 - Posted: 30 May 2013, 6:51:23 UTC

Is the reason that no work is being split, or only very low amounts (under 10 per second), to do with the rollout of the new application (version 7), or is there something more server-related going on?
ID: 1373335 · Report as offensive
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 29590
Credit: 49,138,468
RAC: 17,229
Germany
Message 1373343 - Posted: 30 May 2013, 7:08:36 UTC

Don't forget all the new applications need to be downloaded dozens of times.
That's additional stress on the servers.

With each crime and every kindness we birth our future.
ID: 1373343 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 902
Credit: 7,867,806
RAC: 1,976
New Zealand
Message 1373347 - Posted: 30 May 2013, 7:19:24 UTC - in response to Message 1373343.  

Don't forget all the new applications need to be downloaded dozens of times.
That's additional stress on the servers.

True. As I write this, server bits in are about 82.83 MB, so the server isn't under a great load, but I can understand why the splitters have been turned off or are working at a low rate.
ID: 1373347 · Report as offensive
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 7495
Credit: 91,197,662
RAC: 46,121
Australia
Message 1373349 - Posted: 30 May 2013, 7:22:54 UTC - in response to Message 1373335.  

Is the reason that no work is being split, or only very low amounts (under 10 per second), to do with the rollout of the new application (version 7), or is there something more server-related going on?

I suspect something's borked.
Grant
Darwin NT
ID: 1373349 · Report as offensive
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 7495
Credit: 91,197,662
RAC: 46,121
Australia
Message 1373373 - Posted: 30 May 2013, 8:57:19 UTC - in response to Message 1373349.  


If anyone has a copy of the following files, I'd appreciate it:
setiathome_6.10_windows_intelx86__cuda_fermi.exe
libfftw3f-3-1-1a_upx.dll
cudart32_30_14.dll
cufft32_30_14.dll

They are all queued up to download, and immediately time out every time I hit retry. No more crunching until I can get them.
:-(
Grant
Darwin NT
ID: 1373373 · Report as offensive
Lionel
Joined: 25 Mar 00
Posts: 665
Credit: 351,361,239
RAC: 140,265
Australia
Message 1373374 - Posted: 30 May 2013, 8:58:52 UTC - in response to Message 1373373.  

I have just moved all my boxes to v7. Didn't take that long.

Curious thing is:
the first box is dual GTX580s and it received v7 cuda42 WUs, later a few 32s and then some 50s;
the second box is dual GTX295s and it received v7 cuda50 WUs, and later on 5 cuda22s;
the third box is dual GTX580s and it received v7 cuda32 WUs.

All are running 1 WU per GPU at present.

However, is there a way to stop cuda50 WUs landing on the GTX295s? They appear to be taking an age to process.

PS: I also dropped this in another thread.
ID: 1373374 · Report as offensive
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 7495
Credit: 91,197,662
RAC: 46,121
Australia
Message 1373377 - Posted: 30 May 2013, 9:04:44 UTC - in response to Message 1373374.  

However, is there a way to stop cuda50 WUs landing on the GTX295s? They appear to be taking an age to process.

From what I can gather, the multiple applications for multiple work units are part of the automatic application optimisation.
Let it do its thing & it'll end up using the one that performs the best.

Grant
Darwin NT
ID: 1373377 · Report as offensive
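
Grant's description can be pictured as a "try each version, then keep the fastest" loop. The sketch below only illustrates that idea and is not the BOINC scheduler's actual code: the function name `pick_fastest_version`, the `min_samples` threshold and the runtimes are all made up for the example.

```python
from statistics import mean


def pick_fastest_version(runtimes_by_version, min_samples=3):
    """Return the app version with the lowest mean runtime.

    Returns None while any version still has fewer than min_samples
    completed tasks, i.e. while the trial phase is still running.
    """
    if not runtimes_by_version:
        return None
    if any(len(times) < min_samples for times in runtimes_by_version.values()):
        return None
    return min(runtimes_by_version, key=lambda v: mean(runtimes_by_version[v]))


# Made-up runtimes in minutes, for illustration only.
TRIAL = {
    "cuda22": [20.0, 22.0, 21.0],
    "cuda50": [80.0, 85.0, 90.0],
}
BEST = pick_fastest_version(TRIAL)
print(BEST)
```

The practical consequence is the one Grant gives: a host has to finish a few tasks with each version, including the slow ones, before a comparison like this can settle.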
Lionel

Send message
Joined: 25 Mar 00
Posts: 665
Credit: 351,361,239
RAC: 140,265
Australia
Message 1373378 - Posted: 30 May 2013, 9:07:48 UTC - in response to Message 1373377.  

Hope so, as I have a GPU WU that has been going for 37 minutes and has another 49 minutes to run. The irony is that as each minute goes by, the time to complete increases.
ID: 1373378 · Report as offensive
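
As a rough check on those numbers: if the remaining-time figure were a plain linear extrapolation from progress (an assumption; BOINC's real estimate may be computed differently), 37 minutes elapsed with 49 to go would correspond to about 43% done. The hypothetical helper `remaining_minutes` below shows why such an estimate climbs whenever progress advances more slowly than the clock.

```python
def remaining_minutes(elapsed_min, fraction_done):
    """Linear extrapolation: time left if the rest runs at the average pace so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_min * (1 - fraction_done) / fraction_done


# Progress implied by "37 minutes elapsed, 49 minutes remaining".
IMPLIED_FRACTION = 37 / (37 + 49)

AT_37 = remaining_minutes(37, IMPLIED_FRACTION)
# One minute later with no visible progress, the estimate goes up, not down.
AT_38_STALLED = remaining_minutes(38, IMPLIED_FRACTION)

print(round(IMPLIED_FRACTION, 3), round(AT_37, 1), round(AT_38_STALLED, 1))
```

A rising estimate therefore points to a task that is getting slower as it goes, or one whose early progress was not representative of the rest.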
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11143
Credit: 83,873,548
RAC: 47,226
United Kingdom
Message 1373391 - Posted: 30 May 2013, 9:27:49 UTC - in response to Message 1373259.  

I agree the Nvidias do not like the VLARs...

Is there any configuration we could use to keep the GPU from receiving the VLARs?

It seems that VLARs are only being sent to Kepler-class GPUs, which is what we tested at Beta.

v6 VLAR resends are also being sent to Keplers, which will both extend the time before I have to change the configuration on mine, and help with the v6 cleanout. Win-win.
ID: 1373391 · Report as offensive
popandbob
Volunteer tester
Joined: 19 Mar 05
Posts: 536
Credit: 2,380,011
RAC: 780
Canada
Message 1373405 - Posted: 30 May 2013, 9:56:59 UTC

Got 52 Tasks first try... All downloaded just fine.
Saw a funny message in BOINC a couple min ago though...

30/05/2013 3:52:34 AM | SETI@home | Sending scheduler request: To fetch work.
30/05/2013 3:52:34 AM | SETI@home | Not requesting tasks
30/05/2013 3:52:36 AM | SETI@home | Scheduler request completed




Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957
ID: 1373405 · Report as offensive
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11143
Credit: 83,873,548
RAC: 47,226
United Kingdom
Message 1373418 - Posted: 30 May 2013, 10:13:46 UTC - in response to Message 1373405.  

Got 52 Tasks first try... All downloaded just fine.
Saw a funny message in BOINC a couple min ago though...

30/05/2013 3:52:34 AM | SETI@home | Sending scheduler request: To fetch work.
30/05/2013 3:52:34 AM | SETI@home | Not requesting tasks
30/05/2013 3:52:36 AM | SETI@home | Scheduler request completed

Thanks. It's been reported as a bug, but the devs don't seem to be very interested. Usually, it clears itself at the next work request: if not, one user reported that it cleared itself when he re-read the config file.

Or was that the other bug, where it shouldn't be requesting work, but does - again and again? Sorry, I'm losing the plot here.
ID: 1373418 · Report as offensive
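
For anyone wanting to see how often this happens on their own host, the pattern in popandbob's log is easy to search for: a "Sending scheduler request: To fetch work." line followed directly by "Not requesting tasks" for the same project. A minimal sketch, assuming the event log has been saved as text in the `timestamp | project | message` layout shown above; the function names are hypothetical.

```python
LOG = """\
30/05/2013 3:52:34 AM | SETI@home | Sending scheduler request: To fetch work.
30/05/2013 3:52:34 AM | SETI@home | Not requesting tasks
30/05/2013 3:52:36 AM | SETI@home | Scheduler request completed
"""


def parse_line(line):
    """Split one event-log line into (timestamp, project, message)."""
    timestamp, project, message = (part.strip() for part in line.split("|", 2))
    return timestamp, project, message


def contradictory_requests(log_text):
    """Find 'to fetch work' requests immediately followed by 'Not requesting tasks'."""
    entries = [parse_line(l) for l in log_text.splitlines() if l.count("|") >= 2]
    hits = []
    for (ts, project, msg), (_, next_project, next_msg) in zip(entries, entries[1:]):
        if (
            project == next_project
            and msg.startswith("Sending scheduler request: To fetch work")
            and next_msg.startswith("Not requesting tasks")
        ):
            hits.append((ts, project))
    return hits


HITS = contradictory_requests(LOG)
print(HITS)
```

A list of timestamps like this would be useful evidence to attach to the bug report Richard mentions.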
The_bestest
Project Donor
Joined: 7 Oct 06
Posts: 36
Credit: 65,097,042
RAC: 8,689
United States
Message 1373430 - Posted: 30 May 2013, 10:30:25 UTC
Last modified: 30 May 2013, 10:31:33 UTC

I happened to look on the Home page yesterday morning. No information about v7 being pushed out. Yes, I saw the notification that 7 was available, but NOWHERE did it state that the upgrade was required. Now I find that v7 was deployed to my machine without my consent. NOT COOL. No one should EVER allow software to be installed/upgraded without an explicit approval. This may be a deal breaker for me, that the admins running this project would once again make a huge change and then beg for forgiveness after the fact.
ID: 1373430 · Report as offensive
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11143
Credit: 83,873,548
RAC: 47,226
United Kingdom
Message 1373431 - Posted: 30 May 2013, 10:40:46 UTC - in response to Message 1373430.  

I happened to look on the Home page yesterday morning. No information about v7 being pushed out. Yes, I saw the notification that 7 was available, but NOWHERE did it state that the upgrade was required. Now I find that v7 was deployed to my machine without my consent. NOT COOL. No one should EVER allow software to be installed/upgraded without an explicit approval. This may be a deal breaker for me, that the admins running this project would once again make a huge change and then beg for forgiveness after the fact.

The whole design of BOINC is that science applications (and by implication, updated science applications) are distributed silently but securely. They come from secured servers, are digitally signed, and are run in a sandboxed work area.

Some people have asked, in the past, that BOINC itself should auto-update, as so much commercial (and even open source) software does these days: but the developers have always refused - both for security, and for the reasons you give. You have to download and install BOINC manually, and by doing so, you buy into the automatic science app distribution.
ID: 1373431 · Report as offensive

 
©2016 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.