Panic Mode On (76) Server Problems?

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1639
Credit: 12,921,799
RAC: 89
New Zealand
Message 1274286 - Posted: 23 Aug 2012, 8:36:18 UTC
Last modified: 23 Aug 2012, 8:55:32 UTC

As I type, parts of the SSP are 206 hours behind. [As of 23 Aug 2012 | 8:30:04 UTC] The SSP was already behind before this week's outage. How are the "As of" times set?
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1274332 - Posted: 23 Aug 2012, 10:36:35 UTC - in response to Message 1274259.  

A .vlar WU would take ~10.8x longer than a normal-AR WU. Wasted performance.

bill - Can't agree. The vlar will take even longer to work on a cpu than a gpu.
To me crunching a work unit faster is better. If your gpu is not busy doing any other work units, why not let it do vlars if they cause no problems.

Sure, a GPU doing a VLAR might be better than an idle GPU (depending on your point of view), but if for a CPU it pretty much doesn't matter whether it's crunching a VLAR or a normal-AR task, while the GPU is x times slower on VLARs, you are wasting performance if you send VLARs to a GPU. It's better for the project that a GPU does a few 0.44 WUs instead of 1 VLAR. And it's up to the user to set his cache high enough that his card never idles (OK, not easy with the current load on servers, but that's another thing).


In theory that's OK, but when crunching a VLAR on an Nvidia GPU your entire system becomes difficult to use: the cursor flickers and a lot of weird problems start to appear with the video interface, so you almost lose the host for any other use until the WU is processed. So processing VLARs on an NVidia GPU in a non-dedicated crunching host really is a waste of resources. Let the VLARs run on the CPUs and all the others on the GPUs.

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1274337 - Posted: 23 Aug 2012, 10:48:19 UTC - in response to Message 1274332.  

In theory that's OK, but when crunching a VLAR on an Nvidia GPU your entire system becomes difficult to use: the cursor flickers and a lot of weird problems start to appear with the video interface, so you almost lose the host for any other use until the WU is processed. So processing VLARs on an NVidia GPU in a non-dedicated crunching host really is a waste of resources. Let the VLARs run on the CPUs and all the others on the GPUs.

Exactly.

From the project's point of view, they couldn't care less if the tasks run fast or slow. We're probably (collectively) supplying more processing power than they want or need at the moment - provided the work comes back, it's been processed accurately, and it doesn't hang around for too long (i.e. weeks), that'll be fine for them.

The screen lag, and not being able to use the machine for anything else, is the big no-no for a volunteer project. If that happens, 99% of volunteers just uninstall BOINC and walk away, cursing. Not only is their resource lost to SETI, it's lost to all the other BOINC projects too - and the person behind the computer is lost to science and scientific research. That's what SETI can't be seen to do.
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1274368 - Posted: 23 Aug 2012, 12:32:50 UTC - in response to Message 1274337.  

From the project's point of view, they couldn't care less if the tasks run fast or slow.

Well, a project like that would never get me to crunch for them. Such thinking from the project staff was one reason why I didn't join Milkyway earlier (in the days when they had highly inefficient applications themselves, didn't accept anonymous platform, and all the other things they did back then). I want my resources to be used as efficiently as possible by the project (hence I use opt apps); if I have more than my main project can use, there are other projects that are happy to get what's left over.
tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1274388 - Posted: 23 Aug 2012, 13:51:08 UTC

I am crunching both a SETI vlar and a BOINC_VM Virtual Machine from CERN on this CPU, an AMD APU E-450 at 1.67 GHz, which is not a speed champion but uses only 18 W, so it can take the heat wave we have in Italy (33 C now, no AC) while I had to shut down the SUN WS with its fans going full speed.
Tullio
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1274390 - Posted: 23 Aug 2012, 13:56:14 UTC - in response to Message 1274368.  
Last modified: 23 Aug 2012, 13:58:09 UTC

From the project's point of view, they couldn't care less if the tasks run fast or slow.

Well, a project like that would never get me to crunch for them. Such thinking from the project staff was one reason why I didn't join Milkyway earlier (in the days when they had highly inefficient applications themselves, didn't accept anonymous platform, and all the other things they did back then). I want my resources to be used as efficiently as possible by the project (hence I use opt apps); if I have more than my main project can use, there are other projects that are happy to get what's left over.


Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any success can be expected (unless of course our little green men give us a hand), so "fast or slow" is relative to that: the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) matters little to a project of 50 years or maybe more.

Any help is wanted; you just don't need to worry about that particular point. We are only warning that the video lag can be a serious problem if you need to use a host that is not dedicated to crunching.
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1274396 - Posted: 23 Aug 2012, 14:12:10 UTC

I think the thing to keep in mind here is that the batch of VLARs sent out to Nvidia GPU hosts was an unintended error, not a change in policy.

Several of my best rigs have been crippled in output by the VLARs that have risen to the top of their caches and are now being tackled by the GPUs, although very slowly. Watching my best rig struggle through 2 per GPU on very capable cards is painful to see...LOL.

But, the kitties have resigned themselves to letting the rigs work through it, and this too shall pass. I am not seeing any problems in the way of errors, just painfully slow processing.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1274403 - Posted: 23 Aug 2012, 14:23:12 UTC - in response to Message 1274396.  

I agree with you Mark, this was an unintended error and is fixed now.

For now... "Vlars to NVidia GPUs - Never Again"... unless someone finds a way to bypass the problem, maybe with some black magic... A task for our Master Guru Jason and his team...
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1274431 - Posted: 23 Aug 2012, 15:38:48 UTC - in response to Message 1274390.  

Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any success can be expected (unless of course our little green men give us a hand), so "fast or slow" is relative to that: the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) matters little to a project of 50 years or maybe more.

I wasn't talking about the progress of SETI@Home; as Richard pointed out, we donate more resources to them than they can use ATM. I was talking about efficient usage of our resources by all projects, and SETI is one that can easily use nVidia GPUs more efficiently by not assigning VLAR tasks to them. The more the projects care about efficient usage of our resources, the more science we get done, not necessarily for SETI (since they just can't send out more WUs than they are already doing now) but for other projects out there.
bill
Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1274434 - Posted: 23 Aug 2012, 15:53:15 UTC - in response to Message 1274259.  

A .vlar WU would take ~10.8x longer than a normal-AR WU. Wasted performance.

bill - Can't agree. The vlar will take even longer to work on a cpu than a gpu.
To me crunching a work unit faster is better. If your gpu is not busy doing any other work units, why not let it do vlars if they cause no problems.


Sure, a GPU doing a VLAR might be better than an idle GPU (depending on your point of view),

bill - You would have to prove that an idle gpu is better than a working gpu.

but if for a CPU it pretty much doesn't matter whether it's crunching a VLAR or a normal-AR task, while the GPU is x times slower on VLARs, you are wasting performance if you send VLARs to a GPU.

bill - Not if the gpu is sitting idle.

It's better for the project that a GPU does a few 0.44 WUs instead of 1 VLAR.

bill - So you missed the part about idle gpu.

And it's up to the user to set his cache high enough that his card never idles (OK, not easy with the current load on servers, but that's another thing).

bill - There's that pesky idle gpu again.


Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1274452 - Posted: 23 Aug 2012, 16:38:20 UTC - in response to Message 1274431.  

Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any success can be expected (unless of course our little green men give us a hand), so "fast or slow" is relative to that: the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) matters little to a project of 50 years or maybe more.

I wasn't talking about the progress of SETI@Home; as Richard pointed out, we donate more resources to them than they can use ATM. I was talking about efficient usage of our resources by all projects, and SETI is one that can easily use nVidia GPUs more efficiently by not assigning VLAR tasks to them. The more the projects care about efficient usage of our resources, the more science we get done, not necessarily for SETI (since they just can't send out more WUs than they are already doing now) but for other projects out there.

I don't think we're disagreeing here. Fortunately, not sending VLARs to NVidia is a win-win problem. The solution satisfies both the efficiency and the volunteer satisfaction criteria.

I was merely saying that, from the project's point of view, not alienating volunteers is the stronger argument. I don't think we'd have got the "don't send" solution coded in the first place, if we'd had to argue on the efficiency criterion alone.
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1274453 - Posted: 23 Aug 2012, 16:41:16 UTC - in response to Message 1274452.  

Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any success can be expected (unless of course our little green men give us a hand), so "fast or slow" is relative to that: the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) matters little to a project of 50 years or maybe more.

I wasn't talking about the progress of SETI@Home; as Richard pointed out, we donate more resources to them than they can use ATM. I was talking about efficient usage of our resources by all projects, and SETI is one that can easily use nVidia GPUs more efficiently by not assigning VLAR tasks to them. The more the projects care about efficient usage of our resources, the more science we get done, not necessarily for SETI (since they just can't send out more WUs than they are already doing now) but for other projects out there.

I don't think we're disagreeing here. Fortunately, not sending VLARs to NVidia is a win-win problem. The solution satisfies both the efficiency and the volunteer satisfaction criteria.

I was merely saying that, from the project's point of view, not alienating volunteers is the stronger argument. I don't think we'd have got the "don't send" solution coded in the first place, if we'd had to argue on the efficiency criterion alone.

Well, if you recall back then, there were also 'VLAR killer' opti apps that would just toss them back to the servers. Thus increasing the server load with no additional work being done. The kitties never agreed with this and looked on those apps as 'cherry picking' at the time.

I think the present solution of not sending VLARs to hosts that cannot as effectively process them ended up being win-win for both the users and the project.

"Freedom is just Chaos, with better lighting." Alan Dean Foster

Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1274464 - Posted: 23 Aug 2012, 17:11:40 UTC - in response to Message 1274434.  
Last modified: 23 Aug 2012, 17:18:43 UTC

Sure, a GPU doing a VLAR might be better than an idle GPU (depending on your point of view),

bill - You would have to prove that an idle gpu is better than a working gpu.

The GPU does not need to be idle; it's just a matter of BOINC configuration: a large enough cache and, if that does not help, a backup project (probably necessary anyway with the current load on S@H's internet connection). Your idle GPU issue is something that the user can fix. Also, such a GPU might do more for the project if it's idle for a while and then gets suitable WUs to work on again. Blocking it for hours or even days with a bunch of VLARs might indeed lead to less work done at the end of the day/month/year/whatever. So yes, an idle GPU for a while might be better, unless the servers would send just 1 VLAR in case they have nothing else and the GPU is idle.
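
To put rough numbers on this, here is a minimal Python sketch that uses only figures quoted in this thread: about 12 minutes for a normal-AR WU on a GPU, and a VLAR taking about 10.8 times as long. The function name, the 24-hour window, the one-task-at-a-time model, and the two example scenarios are illustrative assumptions, not measurements from any host.

```python
# Rough figures quoted in this thread; real values vary by host and card.
GPU_NORMAL_MIN = 12.0   # minutes per normal-AR WU on a GPU
VLAR_SLOWDOWN = 10.8    # a VLAR takes roughly this many times longer on the GPU


def gpu_wus_in_window(window_min, vlars=0, idle_min=0.0):
    """Count the WUs a GPU finishes within window_min minutes.

    Hypothetical model: the GPU first sits idle for idle_min minutes,
    then works through `vlars` queued VLARs one at a time, and only then
    spends the remaining time on normal-AR WUs. Every WU counts as one unit.
    """
    vlar_min = GPU_NORMAL_MIN * VLAR_SLOWDOWN
    busy_window = max(window_min - idle_min, 0.0)
    vlars_done = min(vlars, int(busy_window // vlar_min))
    if vlars_done < vlars:
        # Still stuck on a VLAR when the window ends: no normal WUs done.
        return vlars_done
    remaining = busy_window - vlars_done * vlar_min
    return vlars_done + int(remaining // GPU_NORMAL_MIN)


if __name__ == "__main__":
    day = 24 * 60
    print("4 VLARs first: ", gpu_wus_in_window(day, vlars=4))
    print("idle 2 h first:", gpu_wus_in_window(day, idle_min=120))
```

Under these assumptions, a GPU that starts the day with 4 VLARs finishes 80 WUs, while one that idles for 2 hours and then receives normal work finishes 110. The model says nothing about the case where no other work arrives at all, in which an idle GPU simply does nothing.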
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1274497 - Posted: 23 Aug 2012, 18:09:50 UTC - in response to Message 1274464.  


I've been getting a lot of "No tasks sent" messages lately, and haven't been able to get any work for about an hour and a half.
Grant
Darwin NT
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1274498 - Posted: 23 Aug 2012, 18:13:03 UTC - in response to Message 1274497.  


I've been getting a lot of "No tasks sent" messages lately, and haven't been able to get any work for about an hour and a half.

I believe a shorty storm may be keeping the feeder rather dry.
Lots of other comments about not getting work lately.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1274501 - Posted: 23 Aug 2012, 18:17:11 UTC - in response to Message 1274497.  
Last modified: 23 Aug 2012, 18:18:00 UTC


I've been getting a lot of "No tasks sent" messages lately, and haven't been able to get any work for about an hour and a half.

Lots of VLARs out there (as in there are tasks that can't be sent to my Nvidia GPU, but can be sent to my CPU if I choose to accept them):

23/08/2012 18:36:45 | SETI@home | [sched_op] Starting scheduler request
23/08/2012 18:36:45 | SETI@home | Sending scheduler request: To fetch work.
23/08/2012 18:36:45 | SETI@home | Reporting 1 completed tasks
23/08/2012 18:36:45 | SETI@home | Requesting new tasks for NVIDIA
23/08/2012 18:36:45 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
23/08/2012 18:36:45 | SETI@home | [sched_op] NVIDIA work request: 254142.76 seconds; 0.00 devices
23/08/2012 18:36:55 | SETI@home | Scheduler request completed: got 0 new tasks
23/08/2012 18:36:55 | SETI@home | [sched_op] Server version 701
23/08/2012 18:36:55 | SETI@home | No tasks sent
23/08/2012 18:36:55 | SETI@home | No tasks are available for AstroPulse v6
23/08/2012 18:36:55 | SETI@home | No tasks are available for the applications you have selected.
23/08/2012 18:36:55 | SETI@home | Tasks for CPU are available, but your preferences are set to not accept them
23/08/2012 18:36:55 | SETI@home | Project requested delay of 303 seconds
23/08/2012 18:36:55 | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 08au12ab.8651.6772.6.10.83_2
23/08/2012 18:36:55 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
23/08/2012 18:36:55 | SETI@home | [sched_op] Reason: requested by project


After about another 5 requests, I did get 58 Cuda tasks.

Claggy
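
For anyone who wants to check their own event log for the same pattern, here is a minimal Python sketch that pulls the key facts out of lines formatted like the ones in the log above. The regular expressions, the `summarize` function, and its result keys are assumptions based only on this excerpt, not on any documented BOINC log format, so treat it as a starting point.

```python
import re

# Assumed line layout, taken from the excerpt: "date time | project | message"
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| ([^|]+?) \| (.*)$"
)

SAMPLE = """\
23/08/2012 18:36:45 | SETI@home | [sched_op] NVIDIA work request: 254142.76 seconds; 0.00 devices
23/08/2012 18:36:55 | SETI@home | Scheduler request completed: got 0 new tasks
23/08/2012 18:36:55 | SETI@home | Tasks for CPU are available, but your preferences are set to not accept them
23/08/2012 18:36:55 | SETI@home | Project requested delay of 303 seconds
"""


def summarize(log_text):
    """Extract a few facts from one scheduler request/reply exchange."""
    summary = {
        "new_tasks": None,            # tasks received in the reply
        "nvidia_request_s": None,     # seconds of NVIDIA work requested
        "delay_s": None,              # back-off requested by the project
        "cpu_tasks_declined": False,  # CPU work existed but prefs refuse it
    }
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip blank lines and anything in another layout
        message = match.group(3)
        found = re.search(r"got (\d+) new tasks", message)
        if found:
            summary["new_tasks"] = int(found.group(1))
        found = re.search(r"NVIDIA work request: ([\d.]+) seconds", message)
        if found:
            summary["nvidia_request_s"] = float(found.group(1))
        found = re.search(r"Project requested delay of (\d+) seconds", message)
        if found:
            summary["delay_s"] = int(found.group(1))
        if message.startswith("Tasks for CPU are available"):
            summary["cpu_tasks_declined"] = True
    return summary


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

A reply with zero new tasks together with the "Tasks for CPU are available" message is the combination described above: work exists, but none of it is eligible for the GPU.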
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1274506 - Posted: 23 Aug 2012, 18:36:57 UTC - in response to Message 1274452.  

I don't think we're disagreeing here. Fortunately, not sending VLARs to NVidia is a win-win problem. The solution satisfies both the efficiency and the volunteer satisfaction criteria.

I was merely saying that, from the project's point of view, not alienating volunteers is the stronger argument. I don't think we'd have got the "don't send" solution coded in the first place, if we'd had to argue on the efficiency criterion alone.

Sure, fixing issues like unusable systems or even driver crashes is more important than efficiency; however, I think efficiency is one of the volunteer satisfaction criteria. Just think about all the complaints about falling RAC caused by pendings, which have already led to several discussions about shorter deadlines. And there nothing is lost; the credit is just awarded a little later. So complaints about GPUs doing just a small fraction of what they are capable of would probably happen more often, and I'm pretty sure many would leave the project because of that.
bill
Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1274509 - Posted: 23 Aug 2012, 18:49:59 UTC - in response to Message 1274464.  

Sure, a GPU doing a VLAR might be better than an idle GPU (depending on your point of view),

bill - You would have to prove that an idle gpu is better than a working gpu.


The GPU does not need to be idle; it's just a matter of BOINC configuration: a large enough cache and, if that does not help, a backup project (probably necessary anyway with the current load on S@H's internet connection).



Your idle GPU issue is something that the user can fix. Also, such a GPU might do more for the project if it's idle for a while and then gets suitable WUs to work on again.

bill - I run eight other projects, but doing work for them accomplishes nothing for SETI, does it?


Blocking it for hours or even days with a bunch of VLARs might indeed lead to less work done at the end of the day/month/year/whatever.

bill - I'm not "blocking anything". Idle gpu, remember? My pc does vlars in 2 to 3 hours, three at a time. I don't consider that a problem.
I've seen SETI be down for over a month.

So yes, an idle GPU for a while might be better unless the servers would send just 1 VLAR in case they have nothing else and the GPU is idle.


You do realize you just contradicted yourself? Reread your last line.

I don't care how others run their pc, so long as they are not aborting work that does not cause errors. Just because it takes longer is not an excuse to offload work units onto other people. That's cherry picking.
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1274515 - Posted: 23 Aug 2012, 18:59:14 UTC

Every time VLARs are discussed I remember The Attack of the Killer 58.7s. In that late 2006 time frame with only CPU crunching and coarser chirp resolution, that was the granted credit rate for VLARs.

Conceptually, assigning work in a manner such that each host is most productive is obviously desirable. The .vlar exclusion for CUDA does align with that and is definitely needed as long as the 6.08 and 6.09 stock applications are in use on older CUDA cards. My impression is that when SETI@home v7 is released here, both CUDA and NV OpenCL implementations should be capable of doing VLARS without excessive screen lags, etc., and with less time penalty. An adjustment of the basic splitter estimate could be used to better balance credit grants.

The fact remains that VLAR tasks are harder to divide into small enough parts to take full advantage of the parallel nature of GPU crunching. Newer GPUs do have the capability to be subdivided so that only part of the GPU would be working on the least divisible parts of the task, but that's another layer of software complexity and there would be issues with validation on result_overflow tasks. That brings this discussion back toward the "Server problems" subject area.
Joe
shizaru
Volunteer tester
Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1274518 - Posted: 23 Aug 2012, 19:12:41 UTC - in response to Message 1274390.  

Please don't misunderstand what Richard and I are saying. SETI is a project that is expected to run for DECADES before any success can be expected (unless of course our little green men give us a hand), so "fast or slow" is relative to that: the difference between a 12 min WU (GPU) and a 1 1/2 hour one (CPU) matters little to a project of 50 years or maybe more.

Any help is wanted; you just don't need to worry about that particular point. We are only warning that the video lag can be a serious problem if you need to use a host that is not dedicated to crunching.


And when we (or our children, or our children's children) DO find the elusive little fraks, I'm sure the project will have all the funding and support it needs to last another 500 years. You know, so we can find the next race, and then the next one, and then the next one... :)
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.