Panic Mode On (54) Server problems?

Author	Message
W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19043 Credit: 40,757,560 RAC: 67	Message 1153597 - Posted: 18 Sep 2011, 14:28:55 UTC Last modified: 18 Sep 2011, 14:29:18 UTC If you are only getting 1 or 2 tasks on your requests when you have a large shortfall, then take note of Richard's post 1153104 in request issues. I'm pretty sure that request for 1 second, when there's a near-30,000 second shortfall, is a DCF safety. Edit - confirmed: I edited DCF by a factor of ten - took out a zero, so 0.013... became 0.13... ID: 1153597 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1153602 - Posted: 18 Sep 2011, 14:52:23 UTC - in response to Message 1153597. And conveniently, just as I was reading that, a host I've been monitoring confirmed it:
18/09/2011 15:35:36 | | [work_fetch] NVIDIA GPU: shortfall 132825.49 nidle 0.00 saturated 44294.51 busy 0.00 RS fetchable 100.00 runnable 100.00
18/09/2011 15:35:36 | SETI@home | [work_fetch] NVIDIA GPU: fetch share 1.00 LTD 0.00 backoff dt 0.00 int 0.00
18/09/2011 15:35:36 | | [work_fetch] No project chosen for work fetch
18/09/2011 15:35:50 | SETI@home | Computation for task 12jl11ad.18873.476.9.10.189_0 finished
18/09/2011 15:35:50 | SETI@home | [dcf] DCF: 0.016567->0.021427, raw_ratio 0.021427, adj_ratio 1.293360
18/09/2011 15:36:02 | | [work_fetch] NVIDIA GPU: shortfall 119854.60 nidle 0.00 saturated 57265.40 busy 0.00 RS fetchable 100.00 runnable 100.00
18/09/2011 15:36:02 | SETI@home | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (119854.60 sec, 0.00 inst)
18/09/2011 15:36:02 | SETI@home | Reporting 10 completed tasks, requesting new tasks for NVIDIA GPU
18/09/2011 15:36:02 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 CPUs
18/09/2011 15:36:02 | SETI@home | [sched_op] NVIDIA GPU work request: 119854.60 seconds; 0.00 GPUs
I reckon this weekend goes down as 'revenge of the little guys'. That host is a 9800GT - as you see, it's teetering above and below the 0.02 DCF 'work fetch' cutoff value - I'm still clearing out some work assigned with stock estimates, VHAR drives DCF below 0.02, mid-AR takes it back above. In about 20 minutes, the optimised app APR kicks in, with tasks given twice the stock speed estimate. That'll do nicely, and I've got a good big run of shorties lined up (the best part of 200) - they'll be reporting, one or two every six minutes, all evening I reckon. That's why the "results returned per hour" is so high.
Stock crunchers, and the people with lesser CUDA cards, are having a field-day with a download pipe mercifully clear of AP, and plenty of shorties between the VLARs. ID: 1153602 ·
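
For anyone who wants to watch their own host do the same thing, here is a minimal Python sketch that pulls the old and new DCF out of a [dcf] log line like the one above and compares each against the 0.02 cutoff mentioned in the post. The cutoff value and the "at or above means fetch" comparison are assumptions taken from this thread, not from the BOINC source, and the function names are made up for illustration.

```python
import re

# Cutoff quoted in the post above; an assumption, not read from BOINC source.
DCF_CUTOFF = 0.02

# Matches the "[dcf] DCF: old->new" part of a client log line.
DCF_LINE = re.compile(r"\[dcf\] DCF: (?P<old>\d+\.\d+)->(?P<new>\d+\.\d+)")


def dcf_transition(line):
    """Return (old_dcf, new_dcf) from a [dcf] log line, or None if absent."""
    match = DCF_LINE.search(line)
    if match is None:
        return None
    return float(match.group("old")), float(match.group("new"))


def fetch_allowed(dcf, cutoff=DCF_CUTOFF):
    """Hypothetical check: treat a DCF at or above the cutoff as OK to fetch."""
    return dcf >= cutoff


line = ("18/09/2011 15:35:50 | SETI@home | [dcf] DCF: 0.016567->0.021427, "
        "raw_ratio 0.021427, adj_ratio 1.293360")
old, new = dcf_transition(line)
print(old, fetch_allowed(old))  # 0.016567 False
print(new, fetch_allowed(new))  # 0.021427 True
```

That lines up with the log: no project chosen while DCF was 0.016567, then a full request once it moved to 0.021427.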

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13727 Credit: 208,696,464 RAC: 304	Message 1153661 - Posted: 18 Sep 2011, 18:46:27 UTC - in response to Message 1153589. It's not just the client problem. Even if you request work and it's available, it's not being assigned. It keeps just sending 0 or 1 even if you are drained. My problem isn't getting 1 or 2 for the GPU, it's getting any at all. Sometimes i get a couple, sometimes a dozen, sometimes a couple of dozen. But invariably they're all crunched before i can download anymore. "No tasks sent" is the usual message, but there are plenty of "Project has no tasks available" there as well. Grant Darwin NT ID: 1153661 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1153666 - Posted: 18 Sep 2011, 19:13:46 UTC - in response to Message 1153450. When BOINC asks for new work, the return message has been "no work available" for AP since last Tuesday. Either the estimated completion times for new AP work are so huge you can only get a couple at a time, or something got borked with the Scheduler when they put in the patch. The AP Ready to Send buffer just continues to grow, as the Work in Progress gets less & less. I'm thinking this is probably the case. Does anyone with a nearly-empty 10+10 cache of AP-only get any new APs? I know the ETA would be astronomical, but if you can normally run through one in ~15 hours, surely you should be able to pick up at least one with a 480-hour work request, unless the ETA is up by 30x. If it's not that, then feeder is borked. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1153666 ·
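
The arithmetic in that post can be checked with a rough Python sketch. It assumes the only thing that matters is whether the inflated estimate fits inside the requested hours; the real scheduler weighs more than that, and the function name is hypothetical.

```python
def tasks_that_fit(request_hours, task_hours, inflation=1.0):
    """How many tasks fit in a work request if each estimate is inflated.

    Simplified model for illustration only: the estimated runtime is
    task_hours * inflation, and tasks are counted until the request is full.
    """
    return int(request_hours // (task_hours * inflation))


# 10+10 days of cache is 20 days, i.e. the 480-hour request from the post.
print(tasks_that_fit(480, 15))       # 32 tasks at the normal ~15 h estimate
print(tasks_that_fit(480, 15, 30))   # 1 task if the ETA is up by 30x
print(tasks_that_fit(480, 15, 33))   # 0 tasks once the ETA passes 32x
```

So under this simple model a 30x estimate would still leave room for one AP; it takes more than 32x before a 480-hour request fits none.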

soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0	Message 1153680 - Posted: 18 Sep 2011, 20:11:46 UTC - in response to Message 1153553. It is on automatic, I have not messed with it. Machine is not having "too full" issues at all. Just getting measly amounts of work irregularly. I keep asking, server keeps saying 0. or 1. Occasional 30-40.. after it has gone completely dry again. I wasn't suggesting you had fiddled with it, I was just asking what it is. It could be that stopping BOINC, editing DCF to a realistic value, then re-starting BOINC might fix it. I'm not having a problem d/loading, not as many as normal agreed, but getting more than you. But that is due to the, now, publicised problem I am having with the AP APR. Because as soon as an AP task completes it punches my DCF up to 1.5 min. I see no flops entry in the app_info. Janice ID: 1153680 ·

Claggy Volunteer tester Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1153721 - Posted: 18 Sep 2011, 23:37:54 UTC Uploads have dropped to Zero. Claggy ID: 1153721 ·

Kevin Olley Send message Joined: 3 Aug 99 Posts: 906 Credit: 261,085,289 RAC: 572	Message 1153722 - Posted: 18 Sep 2011, 23:39:57 UTC Uploads have stalled and server status page is not updating. Kevin ID: 1153722 ·

Akio Send message Joined: 18 May 11 Posts: 375 Credit: 32,129,242 RAC: 0	Message 1153723 - Posted: 18 Sep 2011, 23:40:30 UTC - in response to Message 1153721. Aye. Uploads are nil. Cricket has plummeted. ID: 1153723 ·

Robert Pick Send message Joined: 21 May 05 Posts: 11 Credit: 6,592,540 RAC: 18	Message 1153724 - Posted: 18 Sep 2011, 23:40:55 UTC Same here!!!!! ID: 1153724 ·

Wembley Volunteer tester Send message Joined: 16 Sep 09 Posts: 429 Credit: 1,844,293 RAC: 0	Message 1153726 - Posted: 18 Sep 2011, 23:48:29 UTC Yay! The upload server has died again! Which means my BOINC will soon stop requesting work because of the 2*numprocessors limit! ID: 1153726 ·

arkayn Volunteer tester Send message Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0	Message 1153727 - Posted: 18 Sep 2011, 23:55:38 UTC - in response to Message 1153680. It is on automatic, I have not messed with it. Machine is not having "too full" issues at all. Just getting measly amounts of work irregularly. I keep asking, server keeps saying 0. or 1. Occasional 30-40.. after it has gone completely dry again. I wasn't suggesting you had fiddled with it, I was just asking what it is. It could be that stopping BOINC, editing DCF to a realistic value, then re-starting BOINC might fix it. I'm not having a problem d/loading, not as many as normal agreed, but getting more than you. But that is due to the, now, publicised problem I am having with the AP APR. Because as soon as an AP task completes it punches my DCF up to 1.5 min. I see no flops entry in the app_info. You will have to manually add the info. http://setiathome.berkeley.edu/forum_thread.php?id=62293#1055179 Seems to be fairly close after I changed my DCF back to 1.000000 again. ID: 1153727 ·

Iona Send message Joined: 12 Jul 07 Posts: 790 Credit: 22,438,118 RAC: 0	Message 1153728 - Posted: 18 Sep 2011, 23:59:45 UTC Is it time for that dreaded water-fowl to present itself? Don't take life too seriously, as you'll never come out of it alive! ID: 1153728 ·

W5DMG - Dave Send message Joined: 19 May 99 Posts: 155 Credit: 33,162,251 RAC: 0	Message 1153740 - Posted: 19 Sep 2011, 0:59:42 UTC - in response to Message 1153728. Uploads not working.. :( ID: 1153740 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1153750 - Posted: 19 Sep 2011, 1:39:42 UTC Maybe uploads died because APs aren't being handed out and the storage got full? That's happened numerous times. Or did they go and put uploads and WU storage on separate volumes? I thought I remembered reading something about that. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1153750 ·

.clair. Send message Joined: 4 Nov 04 Posts: 1300 Credit: 55,390,408 RAC: 69	Message 1153751 - Posted: 19 Sep 2011, 1:48:09 UTC - in response to Message 1153728. Is it time for that dreaded water-fowl to present itself? If it's the one i think you mean, our fowl watery fiend only comes out to play when the grass is green :¬) But i think our crickets have turned into locusts and there will be nothing green left before long, the uploads line seems to have crashed :¬( ID: 1153751 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1153765 - Posted: 19 Sep 2011, 3:32:36 UTC I am actually surprised it held as long as it did. Luckily tomorrow is Monday, and somebody should be in the lab to set things upright again.... Then we wait for the Boinc server code to be straightened out. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1153765 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80	Message 1153776 - Posted: 19 Sep 2011, 4:06:47 UTC Astropulses are turned off and the servers don't survive the weekend. I'm wondering....... With each crime and every kindness we birth our future. ID: 1153776 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19043 Credit: 40,757,560 RAC: 67	Message 1153802 - Posted: 19 Sep 2011, 6:13:25 UTC - in response to Message 1153776. Astropulses are turned off and the servers don't survive the weekend. I'm wondering....... Too many tasks stored in the database, maybe. Lots more MB tasks have been downloaded and returned, and a high number of APs not even sent out. ID: 1153802 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13727 Credit: 208,696,464 RAC: 304	Message 1153824 - Posted: 19 Sep 2011, 10:12:01 UTC - in response to Message 1153802. The MB Assimilators didn't appear to be working- the backlog was growing minute by minute. Grant Darwin NT ID: 1153824 ·

geoff Send message Joined: 25 Apr 00 Posts: 123 Credit: 34,100,351 RAC: 18	Message 1153829 - Posted: 19 Sep 2011, 10:36:27 UTC I am in the process of downloading 6 AP WUs with completion times of 10x. I thought AP downloads were turned off. ID: 1153829 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.