Panic Mode On (98) Server Problems?

Message boards : Number crunching : Panic Mode On (98) Server Problems?

Phil Burden

Joined: 26 Oct 00
Posts: 264
Credit: 22,303,899
RAC: 0
United Kingdom
Message 1703280 - Posted: 20 Jul 2015, 11:29:17 UTC - in response to Message 1703269.  
Last modified: 20 Jul 2015, 11:30:27 UTC

I Think that VLAR's are gone now, My Nvidia GPU's are full of MB now...

Nope.
Almost all bar a couple of my CPU WUs are VLARs, all of those in the last 6 hours are.


I briefly got some non-VLARs for my GPU earlier today; now I'm down to the last 2, then the GPU will go idle again <sigh>. Meanwhile, after seeing my RAC reach an all-time high, I'm watching it plunge into the pit of despair ;-(

c'est la vie

P.
ID: 1703280 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1703338 - Posted: 20 Jul 2015, 14:48:50 UTC - in response to Message 1703246.  

I Think that VLAR's are gone now, My Nvidia GPU's are full of MB now...


Still getting GPU tasks for every GPU... Every GPU is full to its limit.
ID: 1703338 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1703341 - Posted: 20 Jul 2015, 14:56:03 UTC - in response to Message 1703338.  

My top 2 computers haven't had any significant GPU work in 12 hours. There has been maybe 20-30 at 1 time but they are long gone....

Not sure what is going on.
ID: 1703341 · Report as offensive
Phil Burden

Joined: 26 Oct 00
Posts: 264
Credit: 22,303,899
RAC: 0
United Kingdom
Message 1703358 - Posted: 20 Jul 2015, 15:32:56 UTC - in response to Message 1703341.  
Last modified: 20 Jul 2015, 15:33:28 UTC

My top 2 computers haven't had any significant GPU work in 12 hours. There has been maybe 20-30 at 1 time but they are long gone....

Not sure what is going on.


same here, have 100 cpu tasks, but Seti reports I've reached a limit of tasks in progress, meanwhile ATI GPU is fast asleep doing nothing.

P.
ID: 1703358 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703383 - Posted: 20 Jul 2015, 16:21:30 UTC
Last modified: 20 Jul 2015, 16:22:33 UTC

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703383 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1703384 - Posted: 20 Jul 2015, 16:21:45 UTC - in response to Message 1703358.  
Last modified: 20 Jul 2015, 16:23:12 UTC

My top 2 computers haven't had any significant GPU work in 12 hours. There has been maybe 20-30 at 1 time but they are long gone....

Not sure what is going on.


same here, have 100 cpu tasks, but Seti reports I've reached a limit of tasks in progress, meanwhile ATI GPU is fast asleep doing nothing.

P.


Strange.

My 750ti*2 host has a full 2*100 quota of MBs (and ATI and CPU are still working on APs)

My 660 host is almost full, 97 tasks. (and ATI and CPU are still working on APs)

My 590ti host has an almost full quota of 100: 97 GPU tasks. (and CPU is still working on APs)



What computing preferences are you using? My 750ti and 660 hosts have "Store at least 3 days of work" (no additional) and the 590ti has "Store at least 1 days of work" (no additional)

And all three have "Run only the selected applications" - SETI@home v7: yes / AstroPulse v7: no

So maybe a difference in settings?
ID: 1703384 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1703387 - Posted: 20 Jul 2015, 16:28:13 UTC - in response to Message 1703383.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?
ID: 1703387 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1703389 - Posted: 20 Jul 2015, 16:29:14 UTC - in response to Message 1703383.  
Last modified: 20 Jul 2015, 16:29:58 UTC

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)


I wish. I would think we'd need the app from beta to do it: opencl_nvidia_sah and all of its components. Since we don't have permission to try it, I haven't done so. I think it only works with higher-end GPUs, so I would think they would need to add something to our preferences to allow users to choose to use it. Otherwise it locks up the GPUs and forces hard reboots.

Just my 2 cents, though I think it would help to clear the VLAR storms faster

Zalster
ID: 1703389 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703392 - Posted: 20 Jul 2015, 16:38:57 UTC - in response to Message 1703389.  
Last modified: 20 Jul 2015, 16:40:44 UTC

I have a CUDA MB app built from the source (modified to use CUDA streams to achieve 95% GPU occupancy when running just one MB at a time). This is not the beta OpenCL app.

I'd need the right information for the app_info.xml to make the server think this is an OpenCL app even though it is not.

A user selectable option "compute VLAR" y/n in the preferences would be good too.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703392 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703398 - Posted: 20 Jul 2015, 16:45:00 UTC - in response to Message 1703387.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?


I could reschedule one task by hand. How? Editing some file?
I will not try to find a rescheduler for my linux machine.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703398 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1703405 - Posted: 20 Jul 2015, 16:57:11 UTC

Sorry, didn't see you're on linux. I think there's just a rescheduler for windows and even that seems to be hard to find.
ID: 1703405 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1703407 - Posted: 20 Jul 2015, 17:01:05 UTC - in response to Message 1703405.  

Sorry, didn't see you're on linux. I think there's just a rescheduler for windows and even that seems to be hard to find.


Do those reschedulers even work now that the server-side "Resend lost tasks" feature is disabled?
ID: 1703407 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703408 - Posted: 20 Jul 2015, 17:02:41 UTC - in response to Message 1703407.  

Sorry, didn't see you're on linux. I think there's just a rescheduler for windows and even that seems to be hard to find.


Do those reschedulers even work now that the server-side "Resend lost tasks" feature is disabled?


Should do. It's pretty much an internal client-state thing. fglops may or may not end up a problem since creditnew.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703408 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703428 - Posted: 20 Jul 2015, 17:44:11 UTC
Last modified: 20 Jul 2015, 18:11:38 UTC

I managed to do the edit manually...

It started to crunch ...

root@Linux1:~/Downloads/BOINC# cat slots/6/stderr.txt 
setiathome_CUDA: Found 4 CUDA device(s):
  Device 1: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 2, pciSlotID = 0
  Device 2: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 1, pciSlotID = 0
  Device 3: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 3, pciSlotID = 0
  Device 4: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 4, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 3
setiathome_CUDA: CUDA Device 3 specified, checking...
   Device 3: GeForce GTX 780 is okay
SETI@home using CUDA accelerated device GeForce GTX 780
Using pfb = 4 from command line args
Using pfp = 192 from command line args

setiathome enhanced x41zc, Cuda 6.50 special

Detected setiathome_enhanced_v7 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.012579
Sigma 127
Thread call stack limit is: 1k



Now I'm waiting for it to finish. The (bad) estimate was 7 minutes.


EDIT:
minutes done
4:00 10%
5:00 12.57%
7:00 17.67%
8:00 20.32%
9:00 23.16%
10:00 26.16%
14:00 40.39%
15:00 43.61%
16:00 46.59%
17:00 49.81%
18:00 52.98%
19:00 55.00%
20:00 59.07%
21:00 62.30%
22:00 65.00%
23:00 68.10%
24:00 71.18%
25:00 73.90%
26:00 76.75%
27:00 79.60%
28:00 82.25%
28:46 100.00% (sudden jump, maybe 30/30)

End of test.
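
The progress readings above can be turned into a rough rate and a projected finish time. This is only a minimal sketch: the `samples` list copies the minute/percent pairs from the post, while the function names are made up for illustration and a straight-line fit between the first and last reading is an assumption, not anything BOINC itself does.

```python
# Progress readings from the post: (elapsed minutes, percent done).
samples = [
    (4, 10.00), (5, 12.57), (7, 17.67), (8, 20.32), (9, 23.16),
    (10, 26.16), (14, 40.39), (15, 43.61), (16, 46.59), (17, 49.81),
    (18, 52.98), (19, 55.00), (20, 59.07), (21, 62.30), (22, 65.00),
    (23, 68.10), (24, 71.18), (25, 73.90), (26, 76.75), (27, 79.60),
    (28, 82.25),
]


def linear_rate(points):
    """Average percent per minute between the first and last reading."""
    (t0, p0), (t1, p1) = points[0], points[-1]
    return (p1 - p0) / (t1 - t0)


def projected_total_minutes(points):
    """Extrapolate the last reading to 100% at the average rate."""
    rate = linear_rate(points)
    t_last, p_last = points[-1]
    return t_last + (100.0 - p_last) / rate


if __name__ == "__main__":
    print(f"rate: {linear_rate(samples):.2f} %/min")
    print(f"projected total: {projected_total_minutes(samples):.1f} min")
```

At roughly 3% per minute the straight-line projection lands near 34 minutes, so the finish at 28:46 came earlier than the readings alone would suggest, which fits the "sudden jump" noted in the last line of the log.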
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703428 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1703433 - Posted: 20 Jul 2015, 17:57:21 UTC - in response to Message 1703398.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?


I could reschedule one task by hand. How? Editing some file?
I will not try to find a rescheduler for my linux machine.

Just edit the client state file by changing the <results> entry to the version number and plan class of your CUDA App in your app_info. Works for ATIs.
Just watch the estimated time, it may be too short after the change.
Best way is to suspend the VLAR, stop BOINC, then edit the entry containing the suspended line. In my case I would change;
<version_num>700</version_num>
to
<version_num>708</version_num>
<plan_class>opencl_ati5_sah</plan_class>

Don't make a mistake...or you Lose All your cache...
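
For anyone who would rather script that edit than do it by hand, here is a minimal sketch in Python. It assumes each task sits in its own `<result>...</result>` block with a `<name>` and a `<version_num>` line, as in the toy `sample` string below; the task names and the `retarget_result` helper are hypothetical, and only the values 700, 708 and `opencl_ati5_sah` come from the post. It works on a string, so nothing touches a real client state file until you have stopped BOINC, made a backup, and decided to write the result out yourself.

```python
import re


def retarget_result(state_text, task_name, version_num, plan_class):
    """Return state_text with one <result> block pointed at another app version."""
    block_re = re.compile(r"<result>.*?</result>", re.DOTALL)

    def patch(match):
        block = match.group(0)
        if f"<name>{task_name}</name>" not in block:
            return block  # not the task we want; leave it untouched
        # Drop any existing plan_class line so we do not end up with two.
        block = re.sub(r"[ \t]*<plan_class>.*?</plan_class>\n", "", block)
        # Replace version_num and add plan_class right after it,
        # reusing the indentation of the original line.
        return re.sub(
            r"([ \t]*)<version_num>\d+</version_num>",
            lambda m: (
                f"{m.group(1)}<version_num>{version_num}</version_num>\n"
                f"{m.group(1)}<plan_class>{plan_class}</plan_class>"
            ),
            block,
            count=1,
        )

    return block_re.sub(patch, state_text)


# Toy input: two hypothetical tasks, both on the CPU app version.
sample = """<result>
    <name>task_a</name>
    <version_num>700</version_num>
</result>
<result>
    <name>task_b</name>
    <version_num>700</version_num>
</result>
"""

patched = retarget_result(sample, "task_a", 708, "opencl_ati5_sah")

if __name__ == "__main__":
    print(patched)
```

Only the block whose `<name>` matches is changed; every other task is passed through as-is, which keeps the blast radius small if the name is mistyped.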
ID: 1703433 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703434 - Posted: 20 Jul 2015, 17:59:29 UTC - in response to Message 1703428.  

Now I'm waiting for it to finish. The (bad) estimate was 7 minutes.


I have a PID controlled adaptive estimate patch somewhere...
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703434 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1703438 - Posted: 20 Jul 2015, 18:12:09 UTC - in response to Message 1703405.  

Sorry, didn't see you're on linux. I think there's just a rescheduler for windows and even that seems to be hard to find.

Rescheduling work was found to screw up the credit issued once we switched to CreditNew & as far as I know no version that supports MB v7 was released.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu)
ID: 1703438 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703442 - Posted: 20 Jul 2015, 18:14:20 UTC - in response to Message 1703433.  
Last modified: 20 Jul 2015, 18:19:39 UTC

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?


I could reschedule one task by hand. How? Editing some file?
I will not try to find a rescheduler for my linux machine.

Just edit the client state file by changing the <results> entry to the version number and plan class of your CUDA App in your app_info. Works for ATIs.
Just watch the estimated time, it may be too short after the change.
Best way is to suspend the VLAR, stop BOINC, then edit the entry containing the suspended line. In my case I would change;
<version_num>700</version_num>
to
<version_num>708</version_num>
<plan_class>opencl_ati5_sah</plan_class>

Don't make a mistake...or you Lose All your cache...



Thanks TBar, that is what I did. I had to do it in both the client_state.xml and client_state_prev.xml files.

Now I'm going to check how it did (WU). It may require a third run by someone else. My version still has some accuracy problems.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703442 · Report as offensive
Phil Burden

Joined: 26 Oct 00
Posts: 264
Credit: 22,303,899
RAC: 0
United Kingdom
Message 1703450 - Posted: 20 Jul 2015, 18:27:59 UTC - in response to Message 1703384.  
Last modified: 20 Jul 2015, 18:28:47 UTC



Strange.

My 750ti*2 host has a full 2*100 quota of MBs (and ATI and CPU are still working on APs)

My 660 host is almost full, 97 tasks. (and ATI and CPU are still working on APs)

My 590ti host has an almost full quota of 100: 97 GPU tasks. (and CPU is still working on APs)



What computing preferences are you using? My 750ti and 660 hosts have "Store at least 3 days of work" (no additional) and the 590ti has "Store at least 1 days of work" (no additional)

And all three have "Run only the selected applications" - SETI@home v7: yes / AstroPulse v7: no

So maybe a difference in settings?


Nope, same settings all day, I did get 15 wu's for the gpu this morning, but nothing since, cpu has 100 wu's (25 AP's & 75 vlar's)..

P.
ID: 1703450 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1703466 - Posted: 20 Jul 2015, 19:19:22 UTC - in response to Message 1703442.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?


I could reschedule one task by hand. How? Editing some file?
I will not try to find a rescheduler for my linux machine.

Just edit the client state file by changing the <results> entry to the version number and plan class of your CUDA App in your app_info. Works for ATIs.
Just watch the estimated time, it may be too short after the change.
Best way is to suspend the VLAR, stop BOINC, then edit the entry containing the suspended line. In my case I would change;
<version_num>700</version_num>
to
<version_num>708</version_num>
<plan_class>opencl_ati5_sah</plan_class>

Don't make a mistake...or you Lose All your cache...



Thanks TBar, that is what I did. I had to do it in both the client_state.xml and client_state_prev.xml files.

Now I'm going to check how it did (WU). It may require a third run by someone else. My version still has some accuracy problems.

Hmmm, not much difference. Except mine didn't overflow;
Yours, Run time: 28 min 46 sec
Mine, Run time: 28 min 37 sec
I think the Grump found the nVidia OpenCL App was a little faster on MBs, but it ate a whole CPU core.
Or, maybe it was someone else...
ID: 1703466 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.