Panic Mode On (98) Server Problems?

Message boards : Number crunching : Panic Mode On (98) Server Problems?


Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703383 - Posted: 20 Jul 2015, 16:21:30 UTC
Last modified: 20 Jul 2015, 16:22:33 UTC

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703383
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1703384 - Posted: 20 Jul 2015, 16:21:45 UTC - in response to Message 1703358.  
Last modified: 20 Jul 2015, 16:23:12 UTC

My top 2 computers haven't had any significant GPU work in 12 hours. There have been maybe 20-30 at a time, but they are long gone...

Not sure what is going on.


Same here: I have 100 CPU tasks, but SETI reports I've reached the limit of tasks in progress, while the ATI GPU is fast asleep doing nothing.

P.


Strange.

My 750ti*2 host has its full 2*100 quota of MBs (and the ATI and CPU are still working APs).

My 660 host is almost full, with 97 tasks (and the ATI and CPU are still working APs).

My 590ti host has an almost full quota of 100, with 97 GPU tasks (and the CPU is still working APs).



What computing preferences are you using? My 750ti and 660 hosts have "Store at least 3 days of work" (no additional) and the 590ti has "Store at least 1 day of work" (no additional).

And all three have "Run only the selected applications" - SETI@home v7: yes / AstroPulse v7: no.

So maybe a difference in settings?
ID: 1703384
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1703387 - Posted: 20 Jul 2015, 16:28:13 UTC - in response to Message 1703383.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?
ID: 1703387
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1703389 - Posted: 20 Jul 2015, 16:29:14 UTC - in response to Message 1703383.  
Last modified: 20 Jul 2015, 16:29:58 UTC

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)


I wish. I would think we need the app from beta to do it, the opencl_nvidia_sah and all of its components. Since we don't have permission to try it, I haven't done so. I think it only works with higher-end GPUs, so I would think they would need to add something to our preferences to allow users to choose to use it. Otherwise it locks up the GPUs and forces hard reboots.

Just my 2 cents, though I think it would help to clear the VLAR storms faster.

Zalster
ID: 1703389
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703392 - Posted: 20 Jul 2015, 16:38:57 UTC - in response to Message 1703389.  
Last modified: 20 Jul 2015, 16:40:44 UTC

I have a CUDA MB app built from the source (modified to use CUDA streams to achieve 95% GPU occupancy when running just one MB at a time). This is not the beta OpenCL app.

I'd need the right information for the app_info.xml to make the server think this is an OpenCL app even though it is not.

A user selectable option "compute VLAR" y/n in the preferences would be good too.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703392
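For context, an app_version entry in app_info.xml is what ties a plan class to an application under BOINC's anonymous platform. A hypothetical entry of the kind petri33 is asking about might look like the following; the file name and values are illustrative only, not from any real package, and (as the thread goes on to show) the server still decides which plan classes receive VLARs:

```xml
<app_version>
    <app_name>setiathome_v7</app_name>
    <version_num>708</version_num>
    <plan_class>opencl_ati5_sah</plan_class>
    <avg_ncpus>0.2</avg_ncpus>
    <coproc>
        <type>ATI</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>setiathome_x41zc_cuda650</file_name>
        <main_program/>
    </file_ref>
</app_version>
```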
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703398 - Posted: 20 Jul 2015, 16:45:00 UTC - in response to Message 1703387.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?


I could reschedule one task by hand. How? Editing some file?
I will not try to find a rescheduler for my linux machine.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703398
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1703405 - Posted: 20 Jul 2015, 16:57:11 UTC

Sorry, didn't see you're on linux. I think there's just a rescheduler for windows and even that seems to be hard to find.
ID: 1703405
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1703407 - Posted: 20 Jul 2015, 17:01:05 UTC - in response to Message 1703405.  

Sorry, didn't see you're on linux. I think there's just a rescheduler for windows and even that seems to be hard to find.


Do those reschedulers even work now that the "Resend lost tasks" feature on the server side is disabled?
ID: 1703407
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703408 - Posted: 20 Jul 2015, 17:02:41 UTC - in response to Message 1703407.  

Sorry, didn't see you're on linux. I think there's just a rescheduler for windows and even that seems to be hard to find.


Do those reschedulers even work now that the "Resend lost tasks" feature on the server side is disabled?


Should do. It's pretty much an internal client-state thing. The <flops> values may or may not end up a problem since CreditNew.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703408
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703428 - Posted: 20 Jul 2015, 17:44:11 UTC
Last modified: 20 Jul 2015, 18:11:38 UTC

I managed to do the edit manually...

It started to crunch ...

root@Linux1:~/Downloads/BOINC# cat slots/6/stderr.txt 
setiathome_CUDA: Found 4 CUDA device(s):
  Device 1: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 2, pciSlotID = 0
  Device 2: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 1, pciSlotID = 0
  Device 3: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 3, pciSlotID = 0
  Device 4: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 4, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 3
setiathome_CUDA: CUDA Device 3 specified, checking...
   Device 3: GeForce GTX 780 is okay
SETI@home using CUDA accelerated device GeForce GTX 780
Using pfb = 4 from command line args
Using pfp = 192 from command line args

setiathome enhanced x41zc, Cuda 6.50 special

Detected setiathome_enhanced_v7 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.012579
Sigma 127
Thread call stack limit is: 1k



Now I'm waiting for it to finish. The (bad) estimate was 7 minutes.


EDIT:
minutes done
4:00 10%
5:00 12.57%
7:00 17.67%
8:00 20.32%
9:00 23.16%
10:00 26.16%
14:00 40.39%
15:00 43.61%
16:00 46.59%
17:00 49.81%
18:00 52.98%
19:00 55.00%
20:00 59.07%
21:00 62.30%
22:00 65.00%
23:00 68.10%
24:00 71.18%
25:00 73.90%
26:00 76.75%
27:00 79.60%
28:00 82.25%
28:46 100.00% (sudden jump, maybe 30/30)

End of test.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703428
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1703433 - Posted: 20 Jul 2015, 17:57:21 UTC - in response to Message 1703398.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?


I could reschedule one task by hand. How? Editing some file?
I will not try to find a rescheduler for my linux machine.

Just edit the client state file by changing the task's <result> entry to the version number and plan class of your CUDA App in your app_info. Works for ATIs.
Just watch the estimated time, it may be too short after the change.
Best way is to suspend the VLAR, stop BOINC, then edit the entry containing the suspended line. In my case I would change;
<version_num>700</version_num>
to
<version_num>708</version_num>
<plan_class>opencl_ati5_sah</plan_class>

Don't make a mistake...or you Lose All your cache...
ID: 1703433
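TBar's recipe can be mechanized. The sketch below is a hypothetical helper, not a tool from the thread: it rewrites the <result> entry for one named task using Python's re module. As TBar warns, stop BOINC first and work on a copy of client_state.xml, since one typo can cost the whole cache.

```python
import re

def retarget_result(state_xml: str, wu_name: str,
                    new_version: int, plan_class: str) -> str:
    """Point the <result> entry matching wu_name at a different app version.

    Mirrors TBar's manual edit: bump <version_num> and add a <plan_class>.
    """
    def rewrite(match: re.Match) -> str:
        block = match.group(0)
        # Replace the version line and append the plan class after it.
        return re.sub(
            r"<version_num>\d+</version_num>",
            f"<version_num>{new_version}</version_num>\n"
            f"<plan_class>{plan_class}</plan_class>",
            block)

    # Match only the <result> element whose <name> contains wu_name,
    # without crossing into a neighboring </result>.
    pattern = re.compile(
        r"<result>(?:(?!</result>).)*?<name>" + re.escape(wu_name) +
        r"(?:(?!</result>).)*?</result>", re.S)
    return pattern.sub(rewrite, state_xml)
```

Usage would be: read client_state.xml, call `retarget_result(text, "some_wu.vlar", 708, "opencl_ati5_sah")`, and write the result back (to client_state_prev.xml as well, as petri33 found).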
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703434 - Posted: 20 Jul 2015, 17:59:29 UTC - in response to Message 1703428.  

now I'm waiting it to finish. Estimate (bad) was 7 minutes.


I have a PID controlled adaptive estimate patch somewhere...
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703434
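jason_gee's actual patch isn't shown anywhere in the thread, but the idea is standard control theory: treat the runtime estimate as the controlled variable and each completed task's actual runtime as feedback. A toy sketch with invented gains, purely to illustrate the PID structure:

```python
class PidEstimator:
    """Toy PID-style adaptive runtime estimator (gains are made up)."""

    def __init__(self, kp=0.5, ki=0.1, kd=0.2, initial=420.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.estimate = initial      # current runtime estimate, seconds
        self.integral = 0.0          # accumulated error (I term)
        self.prev_error = 0.0        # last error (for the D term)

    def update(self, actual_runtime: float) -> float:
        """Fold one completed task's actual runtime into the estimate."""
        error = actual_runtime - self.estimate
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        self.estimate += (self.kp * error + self.ki * self.integral
                          + self.kd * derivative)
        return self.estimate
```

Fed petri33's 28:46 (about 1726 s) result against the bad 7-minute (420 s) estimate, the first update already pulls the estimate sharply upward, which is the point of making the estimate adaptive.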
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6533
Credit: 196,805,888
RAC: 57
United States
Message 1703438 - Posted: 20 Jul 2015, 18:12:09 UTC - in response to Message 1703405.  

Sorry, didn't see you're on linux. I think there's just a rescheduler for windows and even that seems to be hard to find.

Rescheduling work was found to screw up the credit issued once we switched to CreditNew, and as far as I know no version that supports MB v7 was released.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1703438
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703442 - Posted: 20 Jul 2015, 18:14:20 UTC - in response to Message 1703433.  
Last modified: 20 Jul 2015, 18:19:39 UTC

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?


I could reschedule one task by hand. How? Editing some file?
I will not try to find a rescheduler for my linux machine.

Just edit the client state file by changing the task's <result> entry to the version number and plan class of your CUDA App in your app_info. Works for ATIs.
Just watch the estimated time, it may be too short after the change.
Best way is to suspend the VLAR, stop BOINC, then edit the entry containing the suspended line. In my case I would change;
<version_num>700</version_num>
to
<version_num>708</version_num>
<plan_class>opencl_ati5_sah</plan_class>

Don't make a mistake...or you Lose All your cache...



Thanks TBar, that is what I did. I had to do it in both the client_state and client_state_prev .xml files.

Now I'm going to check how it did (WU). It may require a third run by someone else. My version still has some accuracy problems.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703442
Phil Burden

Joined: 26 Oct 00
Posts: 264
Credit: 22,303,899
RAC: 0
United Kingdom
Message 1703450 - Posted: 20 Jul 2015, 18:27:59 UTC - in response to Message 1703384.  
Last modified: 20 Jul 2015, 18:28:47 UTC



Strange.

My 750ti*2 host has its full 2*100 quota of MBs (and the ATI and CPU are still working APs).

My 660 host is almost full, with 97 tasks (and the ATI and CPU are still working APs).

My 590ti host has an almost full quota of 100, with 97 GPU tasks (and the CPU is still working APs).



What computing preferences are you using? My 750ti and 660 hosts have "Store at least 3 days of work" (no additional) and the 590ti has "Store at least 1 day of work" (no additional).

And all three have "Run only the selected applications" - SETI@home v7: yes / AstroPulse v7: no.

So maybe a difference in settings?


Nope, same settings all day. I did get 15 WUs for the GPU this morning, but nothing since; the CPU has 100 WUs (25 APs & 75 VLARs).

P.
ID: 1703450
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1703466 - Posted: 20 Jul 2015, 19:19:22 UTC - in response to Message 1703442.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)

No idea. But maybe you can use the Rescheduler to move them from CPU to GPU?


I could reschedule one task by hand. How? Editing some file?
I will not try to find a rescheduler for my linux machine.

Just edit the client state file by changing the task's <result> entry to the version number and plan class of your CUDA App in your app_info. Works for ATIs.
Just watch the estimated time, it may be too short after the change.
Best way is to suspend the VLAR, stop BOINC, then edit the entry containing the suspended line. In my case I would change;
<version_num>700</version_num>
to
<version_num>708</version_num>
<plan_class>opencl_ati5_sah</plan_class>

Don't make a mistake...or you Lose All your cache...



Thanks TBar, that is what I did. I had to do it to both clien_state and client_state_prev .xml files.

Now I'm going to check how it did (WU). It may require a third run by someone else. My version has still some accuracy problems.

Hmmm, not much difference, except mine didn't overflow:
Yours, Run time: 28 min 46 sec
Mine, Run time: 28 min 37 sec
I think the Grump found the nVidia OpenCL App was a little faster on MBs, but it ate a whole CPU core.
Or, maybe it was someone else...
ID: 1703466
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1703470 - Posted: 20 Jul 2015, 19:37:58 UTC
Last modified: 20 Jul 2015, 19:39:34 UTC

Why not send VLARs to CUDA (...)?

That is, to NV cards which support compute capability 3.0+.

My NV GT730, which has compute capability 3.5, works well with .vlar WUs at SETI Beta (one .vlar and one 'normal' SETI WU simultaneously on the card, cuda50 app).

The whole PC is not sluggish like before, with an old GTX 2** series card and one .vlar alone on the card.

Better to calculate .vlar WUs on NV cards (CC 3.0+) instead of letting them sit idle or run secondary projects...?
ATI/AMD cards already get .vlar WUs.
ID: 1703470
Profile jason_gee
Volunteer developer
Volunteer tester
Avatar

Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703471 - Posted: 20 Jul 2015, 19:40:39 UTC - in response to Message 1703470.  

Yeah, there seems to be more in play than just the GPU, as when wider distribution was tried some reported extreme problems (with anger). Maybe the next generation of OpenCL and Cuda applications will handle these more to everyone's liking.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703471
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1703472 - Posted: 20 Jul 2015, 19:46:25 UTC - in response to Message 1703466.  
Last modified: 20 Jul 2015, 19:47:18 UTC


I think the Grump found the nVidia OpenCL App was a little faster on MBs, but it ate a whole CPU core.
Or, maybe it was someone else...


Actually it was me and Grump...

I'm still willing to do the VLARs if I knew how to do it. Time to complete 1 VLAR: 18-20 minutes.
ID: 1703472
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1703473 - Posted: 20 Jul 2015, 19:48:56 UTC - in response to Message 1703471.  
Last modified: 20 Jul 2015, 19:50:32 UTC

Yeah, there seems to be more in play than just the GPU, as when wider distribution was tried some reported extreme problems (with anger). Maybe the next generation of OpenCL and Cuda applications will handle these more to everyone's liking.

Yes, as in VLARs take more GPU resources, so all those cards set to run more than 1 or 2 at a time would start choking... and then the complaints would roll in.

ATI cards receive VLARs at Beta, as do nVidias. Neither receive VLARs on Main.
ID: 1703473



©2022 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.