Panic Mode On (76) Server Problems?

Author	Message
Speedy Volunteer tester Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89	Message 1274286 - Posted: 23 Aug 2012, 8:36:18 UTC Last modified: 23 Aug 2012, 8:55:32 UTC
As I type, parts of the SSP are 206 hours behind. [As of 23 Aug 2012 | 8:30:04 UTC] The SSP was behind before this week's outage. How are the "As of*" times set?

juan BFP Volunteer tester Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1274332 - Posted: 23 Aug 2012, 10:36:35 UTC - in response to Message 1274259.
"A .vlar WU would mean x ~ 10.8 longer than a normal AR WU. Wasted performance."
"bill - Can't agree. The vlar will take even longer to work on a cpu than a gpu. To me crunching a work unit faster is better. If your gpu is not busy doing any other work units, why not let it do vlars if they cause no problems."
"Sure, a GPU doing a VLAR might be better than an idle GPU (depending on your point of view), but if for a CPU it pretty much doesn't matter whether it's crunching a VLAR or a normal-AR task, while the GPU is x times slower on VLARs, you are wasting performance if you send VLARs to a GPU. It's better for the project that a GPU does a few 0.44 WUs instead of 1 VLAR. And it's up to the user to set his cache high enough that his card is never idle (OK, not easy with the current load on the servers, but that's another thing)."
In theory that's OK, but when crunching a VLAR on an Nvidia GPU your entire system becomes difficult to use: the cursor flickers and a lot of weird problems start to appear with the video interface, so you almost lose the host for any other use until the WU is processed. Processing VLARs on an NVidia GPU in a non-dedicated crunching host really is a waste of resources. Let the VLARs run on the CPUs and all the others on the GPUs.

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1274337 - Posted: 23 Aug 2012, 10:48:19 UTC - in response to Message 1274332.
"In theory that's OK, but when crunching a VLAR on an Nvidia GPU your entire system becomes difficult to use: the cursor flickers and a lot of weird problems start to appear with the video interface, so you almost lose the host for any other use until the WU is processed. Processing VLARs on an NVidia GPU in a non-dedicated crunching host really is a waste of resources. Let the VLARs run on the CPUs and all the others on the GPUs."
Exactly. From the project's point of view, they couldn't care less if the tasks run fast or slow. We're probably (collectively) supplying more processing power than they want or need at the moment - provided the work comes back, it's been processed accurately, and it doesn't hang around for too long (i.e. weeks), that'll be fine for them. The screen lag, and not being able to use the machine for anything else, is the big no-no for a volunteer project. If that happens, 99% of volunteers just uninstall BOINC and walk away, cursing. Not only is their resource lost to SETI, it's lost to all the other BOINC projects too - and the person behind the computer is lost to science and scientific research. That's what SETI can't be seen to do.

Link Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0	Message 1274368 - Posted: 23 Aug 2012, 12:32:50 UTC - in response to Message 1274337.
"From the project's point of view, they couldn't care less if the tasks run fast or slow."
Well, a project like that would never get me to crunch for them. Such thinking from the project staff was a reason why I didn't join Milkyway earlier (in those days they had highly inefficient applications themselves, didn't accept anonymous platform, and all the other things they did back then). I want my resources to be used as efficiently as possible by the project (hence I use opt apps); if I have more than my main project can use, there are other projects that are happy to get what's left over.

tullio Volunteer tester Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1	Message 1274388 - Posted: 23 Aug 2012, 13:51:08 UTC
I am crunching both a SETI vlar and a BOINC_VM Virtual Machine from CERN on this CPU, an AMD APU E-450 at 1.67 GHz, which is not a speed champion but uses only 18 W, so it can take the heat wave we have in Italy (33 C now, no AC) while I had to shut down the SUN WS with its fans going full speed.
Tullio

juan BFP Volunteer tester Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1274390 - Posted: 23 Aug 2012, 13:56:14 UTC - in response to Message 1274368. Last modified: 23 Aug 2012, 13:58:09 UTC
"From the project's point of view, they couldn't care less if the tasks run fast or slow."
"Well, a project like that would never get me to crunch for them. Such thinking from the project staff was a reason why I didn't join Milkyway earlier (in those days they had highly inefficient applications themselves, didn't accept anonymous platform, and all the other things they did back then). I want my resources to be used as efficiently as possible by the project (hence I use opt apps); if I have more than my main project can use, there are other projects that are happy to get what's left over."
Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any expected success could be achieved (unless of course our little green men give us a hand), so "fast or slow" is relative to that; the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) makes little difference to a project of 50 years or maybe more. Any help is wanted; you just don't need to be worried on that particular point. We are just warning because the video lag could be a serious problem if you need to use a host that is not crunching-only.

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1274396 - Posted: 23 Aug 2012, 14:12:10 UTC
I think the thing to keep in mind here is that the batch of VLARs sent out to Nvidia GPU hosts was an unintended error, not a change in policy. Several of my best rigs have been crippled in output by the VLARs that have risen to the top of their caches and are now being tackled by the GPUs, although very slowly. Watching my best rig struggle through 2 per GPU on very capable cards is painful to see...LOL. But, the kitties have resigned themselves to letting the rigs work through it, and this too shall pass. I am not seeing any problems in the way of errors, just painfully slow processing.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

juan BFP Volunteer tester Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1274403 - Posted: 23 Aug 2012, 14:23:12 UTC - in response to Message 1274396.
I agree with you Mark, this was an unintended error and is fixed now. For now... "Vlars to NVidia GPUs - Never Again"... unless someone finds a way to bypass the problem with maybe some black magic... A task for our Master Guru Jason and his team...

Link Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0	Message 1274431 - Posted: 23 Aug 2012, 15:38:48 UTC - in response to Message 1274390.
"Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any expected success could be achieved (unless of course our little green men give us a hand), so 'fast or slow' is relative to that; the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) makes little difference to a project of 50 years or maybe more."
I wasn't talking about the progress of SETI@Home; as Richard pointed out, we donate more resources to them than they can use ATM. I was talking about efficient usage of our resources by all projects, and SETI is one that can easily use nVidia GPUs more efficiently by not assigning VLAR tasks to them. The more a project cares about efficient usage of our resources, the more science we get done, not necessarily for SETI (since they just can't send out more WUs than they are already doing now) but for other projects out there.
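The efficiency argument above can be put into rough numbers. The sketch below is only an illustration: it combines the ~10.8x VLAR slowdown quoted earlier with juan's 12 min (GPU) and 1 1/2 hour (CPU) figures, which come from different posters and different hosts, and it assumes (as Link suggests) that a CPU takes about the same time for a VLAR as for a normal-AR task. The function name `device_minutes` is made up for this example.

```python
# Rough comparison of two ways to assign one VLAR and one normal-AR task
# to a host with one CPU core and one Nvidia GPU.
# All figures are quoted in the thread and mixed here purely for illustration.

GPU_NORMAL_MIN = 12.0     # normal-AR WU on a GPU (juan's figure)
CPU_WU_MIN = 90.0         # "1 1/2 hour" WU on a CPU (juan's figure)
VLAR_GPU_SLOWDOWN = 10.8  # VLAR vs normal-AR on a GPU (figure quoted in the thread)


def device_minutes(gpu_normal_min, cpu_wu_min, vlar_gpu_slowdown):
    """Return total device-minutes spent under each assignment.

    Assumption: the CPU needs cpu_wu_min for either kind of task.
    """
    vlar_on_gpu = gpu_normal_min * vlar_gpu_slowdown
    return {
        # GPU crunches the VLAR, CPU crunches the normal-AR task
        "vlar_on_gpu": vlar_on_gpu + cpu_wu_min,
        # CPU crunches the VLAR, GPU crunches the normal-AR task
        "vlar_on_cpu": cpu_wu_min + gpu_normal_min,
    }


result = device_minutes(GPU_NORMAL_MIN, CPU_WU_MIN, VLAR_GPU_SLOWDOWN)
for assignment, minutes in result.items():
    print(f"{assignment}: {minutes:.1f} device-minutes")
```

Under these assumed figures the same two tasks cost roughly twice as many device-minutes when the VLAR lands on the GPU. This says nothing about the idle-GPU case bill raises, where there is no normal-AR task available to run instead.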

bill Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0	Message 1274434 - Posted: 23 Aug 2012, 15:53:15 UTC - in response to Message 1274259.
"A .vlar WU would mean x ~ 10.8 longer than a normal AR WU. Wasted performance."
"bill - Can't agree. The vlar will take even longer to work on a cpu than a gpu. To me crunching a work unit faster is better. If your gpu is not busy doing any other work units, why not let it do vlars if they cause no problems."
"Sure, a GPU doing a VLAR might be better than an idle GPU (depending on your point of view),"
bill - You would have to prove that an idle gpu is better than a working gpu.
"but if for a CPU it pretty much doesn't matter whether it's crunching a VLAR or a normal-AR task, while the GPU is x times slower on VLARs, you are wasting performance if you send VLARs to a GPU."
bill - Not if the gpu is sitting idle.
"It's better for the project that a GPU does a few 0.44 WUs instead of 1 VLAR."
bill - So you missed the part about idle gpu.
"And it's up to the user to set his cache high enough that his card is never idle (OK, not easy with the current load on the servers, but that's another thing)."
bill - There's that pesky idle gpu again.

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1274452 - Posted: 23 Aug 2012, 16:38:20 UTC - in response to Message 1274431.
"Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any expected success could be achieved (unless of course our little green men give us a hand), so 'fast or slow' is relative to that; the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) makes little difference to a project of 50 years or maybe more."
"I wasn't talking about the progress of SETI@Home; as Richard pointed out, we donate more resources to them than they can use ATM. I was talking about efficient usage of our resources by all projects, and SETI is one that can easily use nVidia GPUs more efficiently by not assigning VLAR tasks to them. The more a project cares about efficient usage of our resources, the more science we get done, not necessarily for SETI (since they just can't send out more WUs than they are already doing now) but for other projects out there."
I don't think we're disagreeing here. Fortunately, not sending VLARs to NVidia is a win-win problem. The solution satisfies both the efficiency and the volunteer satisfaction criteria. I was merely saying that, from the project's point of view, not alienating volunteers is the stronger argument. I don't think we'd have got the "don't send" solution coded in the first place, if we'd had to argue on the efficiency criterion alone.

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1274453 - Posted: 23 Aug 2012, 16:41:16 UTC - in response to Message 1274452.
"Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any expected success could be achieved (unless of course our little green men give us a hand), so 'fast or slow' is relative to that; the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) makes little difference to a project of 50 years or maybe more."
"I wasn't talking about the progress of SETI@Home; as Richard pointed out, we donate more resources to them than they can use ATM. I was talking about efficient usage of our resources by all projects, and SETI is one that can easily use nVidia GPUs more efficiently by not assigning VLAR tasks to them. The more a project cares about efficient usage of our resources, the more science we get done, not necessarily for SETI (since they just can't send out more WUs than they are already doing now) but for other projects out there."
"I don't think we're disagreeing here. Fortunately, not sending VLARs to NVidia is a win-win problem. The solution satisfies both the efficiency and the volunteer satisfaction criteria. I was merely saying that, from the project's point of view, not alienating volunteers is the stronger argument. I don't think we'd have got the 'don't send' solution coded in the first place, if we'd had to argue on the efficiency criterion alone."
Well, if you recall back then, there were also 'VLAR killer' opti apps that would just toss them back to the servers. Thus increasing the server load with no additional work being done. The kitties never agreed with this and looked on those apps as 'cherry picking' at the time. I think the present solution of not sending VLARs to hosts that cannot as effectively process them ended up being win-win for both the users and the project.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

Link Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0	Message 1274464 - Posted: 23 Aug 2012, 17:11:40 UTC - in response to Message 1274434. Last modified: 23 Aug 2012, 17:18:43 UTC
"Sure, a GPU doing a VLAR might be better than an idle GPU (depending on your point of view),"
"bill - You would have to prove that an idle gpu is better than a working gpu."
The GPU does not need to be idle; it's just a matter of BOINC configuration: a large enough cache and, if that does not help, a backup project (probably necessary anyway with the current load on S@H's internet connection). Your idle GPU issue is something that the user can fix. Also, such a GPU might do more for the project if it's idle for a while and then again gets suitable WUs to work on. Blocking it for hours or even days with a bunch of VLARs might indeed lead to less work done at the end of the day/month/year/whatever. So yes, an idle GPU for a while might be better, unless the servers would send just 1 VLAR in case they have nothing else and the GPU is idle.

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1274497 - Posted: 23 Aug 2012, 18:09:50 UTC - in response to Message 1274464.
I've been getting a lot of "No tasks sent" messages lately, and haven't been able to get any work for about an hour and a half.
Grant Darwin NT

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1274498 - Posted: 23 Aug 2012, 18:13:03 UTC - in response to Message 1274497.
"I've been getting a lot of 'No tasks sent' messages lately, and haven't been able to get any work for about an hour and a half."
I believe a shorty storm may be keeping the feeder rather dry. Lots of other comments about not getting work lately.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

Claggy Volunteer tester Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1274501 - Posted: 23 Aug 2012, 18:17:11 UTC - in response to Message 1274497. Last modified: 23 Aug 2012, 18:18:00 UTC
"I've been getting a lot of 'No tasks sent' messages lately, and haven't been able to get any work for about an hour and a half."
Lots of VLARs out there (as in there are tasks that can't be sent to my Nvidia GPU, but can be sent to my CPU if I choose to accept them):
23/08/2012 18:36:45 | SETI@home | [sched_op] Starting scheduler request
23/08/2012 18:36:45 | SETI@home | Sending scheduler request: To fetch work.
23/08/2012 18:36:45 | SETI@home | Reporting 1 completed tasks
23/08/2012 18:36:45 | SETI@home | Requesting new tasks for NVIDIA
23/08/2012 18:36:45 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
23/08/2012 18:36:45 | SETI@home | [sched_op] NVIDIA work request: 254142.76 seconds; 0.00 devices
23/08/2012 18:36:55 | SETI@home | Scheduler request completed: got 0 new tasks
23/08/2012 18:36:55 | SETI@home | [sched_op] Server version 701
23/08/2012 18:36:55 | SETI@home | No tasks sent
23/08/2012 18:36:55 | SETI@home | No tasks are available for AstroPulse v6
23/08/2012 18:36:55 | SETI@home | No tasks are available for the applications you have selected.
23/08/2012 18:36:55 | SETI@home | Tasks for CPU are available, but your preferences are set to not accept them
23/08/2012 18:36:55 | SETI@home | Project requested delay of 303 seconds
23/08/2012 18:36:55 | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 08au12ab.8651.6772.6.10.83_2
23/08/2012 18:36:55 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
23/08/2012 18:36:55 | SETI@home | [sched_op] Reason: requested by project
After about another 5 requests, I did get 58 Cuda tasks.
Claggy
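For anyone who wants to count how often their host comes back empty-handed, here is a minimal sketch that summarises event-log lines of the shape Claggy posted. It assumes the "timestamp | project | message" layout and the day/month/year date format seen in that log; other BOINC versions or locales may print things differently. The sample lines are copied from the post, and the helper names `parse_line` and `summarize` are made up for this example.

```python
import re
from datetime import datetime

# Three lines copied from the log in the post above.
SAMPLE_LOG = """\
23/08/2012 18:36:45 | SETI@home | Requesting new tasks for NVIDIA
23/08/2012 18:36:55 | SETI@home | Scheduler request completed: got 0 new tasks
23/08/2012 18:36:55 | SETI@home | Project requested delay of 303 seconds
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message), or None."""
    parts = [part.strip() for part in line.split("|", 2)]
    if len(parts) != 3:
        return None
    stamp, project, message = parts
    # Assumed date format: day/month/year, as in the posted log.
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


def summarize(log_text):
    """Count completed scheduler requests, tasks received, and the last delay."""
    summary = {"requests": 0, "tasks_received": 0, "last_delay_s": None}
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        message = parsed[2]
        got = re.search(r"got (\d+) new tasks", message)
        if got:
            summary["requests"] += 1
            summary["tasks_received"] += int(got.group(1))
        delay = re.search(r"Project requested delay of (\d+) seconds", message)
        if delay:
            summary["last_delay_s"] = int(delay.group(1))
    return summary


summary = summarize(SAMPLE_LOG)
print(summary)
```

A host that keeps showing many requests with zero tasks received while asking only for NVIDIA work is consistent with Claggy's reading that the feeder is mostly holding VLARs, though the log alone cannot prove that.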

Link Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0	Message 1274506 - Posted: 23 Aug 2012, 18:36:57 UTC - in response to Message 1274452.
"I don't think we're disagreeing here. Fortunately, not sending VLARs to NVidia is a win-win problem. The solution satisfies both the efficiency and the volunteer satisfaction criteria. I was merely saying that, from the project's point of view, not alienating volunteers is the stronger argument. I don't think we'd have got the 'don't send' solution coded in the first place, if we'd had to argue on the efficiency criterion alone."
Sure, fixing issues like unusable systems or even driver crashes is more important than efficiency; however, I think efficiency is one of the volunteer satisfaction criteria. Just think about all the complaints about falling RAC caused by pendings, which have already led to several discussions about shorter deadlines. And there, nothing is lost; the credit is just awarded a little later. So complaints about GPUs doing just a small fraction of what they are capable of doing would probably happen more often, and I'm pretty sure many would leave the project because of that.

bill Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0	Message 1274509 - Posted: 23 Aug 2012, 18:49:59 UTC - in response to Message 1274464.
"Sure, a GPU doing a VLAR might be better than an idle GPU (depending on your point of view),"
"bill - You would have to prove that an idle gpu is better than a working gpu."
"The GPU does not need to be idle; it's just a matter of BOINC configuration: a large enough cache and, if that does not help, a backup project (probably necessary anyway with the current load on S@H's internet connection). Your idle GPU issue is something that the user can fix. Also, such a GPU might do more for the project if it's idle for a while and then again gets suitable WUs to work on."
bill - I run eight other projects, but doing work for them accomplishes nothing for SETI, does it?
"Blocking it for hours or even days with a bunch of VLARs might indeed lead to less work done at the end of the day/month/year/whatever."
bill - I'm not "blocking anything". Idle gpu, remember? My pc does vlars in 2 to 3 hours, three at a time. I don't consider that a problem. I've seen SETI be down for over a month.
"So yes, an idle GPU for a while might be better, unless the servers would send just 1 VLAR in case they have nothing else and the GPU is idle."
You do realize you just contradicted yourself? Reread your last line. I don't care how others run their pc, so long as they are not aborting work that does not cause errors. Just because it takes longer is not an excuse to offload work units onto other people. That's cherry picking.

Josef W. Segur Volunteer developer Volunteer tester Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 1274515 - Posted: 23 Aug 2012, 18:59:14 UTC
Every time VLARs are discussed I remember The Attack of the Killer 58.7s. In that late 2006 time frame with only CPU crunching and coarser chirp resolution, that was the granted credit rate for VLARs.
Conceptually, assigning work in a manner such that each host is most productive is obviously desirable. The .vlar exclusion for CUDA does align with that and is definitely needed as long as the 6.08 and 6.09 stock applications are in use on older CUDA cards.
My impression is that when SETI@home v7 is released here, both CUDA and NV OpenCL implementations should be capable of doing VLARs without excessive screen lags, etc., and with less time penalty. An adjustment of the basic splitter estimate could be used to better balance credit grants.
The fact remains that VLAR tasks are harder to divide into small enough parts to take full advantage of the parallel nature of GPU crunching. Newer GPUs do have the capability to be subdivided so that only part of the GPU would be working on the least divisible parts of the task, but that's another layer of software complexity and there would be issues with validation on result_overflow tasks. That brings this discussion back toward the "Server problems" subject area.
Joe

shizaru Volunteer tester Joined: 14 Jun 04 Posts: 1130 Credit: 1,967,904 RAC: 0	Message 1274518 - Posted: 23 Aug 2012, 19:12:41 UTC - in response to Message 1274390.
"Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any expected success could be achieved (unless of course our little green men give us a hand), so 'fast or slow' is relative to that; the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) makes little difference to a project of 50 years or maybe more. Any help is wanted; you just don't need to be worried on that particular point. We are just warning because the video lag could be a serious problem if you need to use a host that is not crunching-only."
And when we (or our children, or our children's children) DO find the elusive little fraks, I'm sure the project will have all the funding and support it needs to last another 500 years. You know, so we can find the next race, and then the next one, and then the next one... :)
