Panic Mode On (54) Server problems?


Firehawk
Volunteer tester
Joined: 21 May 99
Posts: 1731
Credit: 258,887,428
RAC: 5,314
Brazil
Message 1153589 - Posted: 18 Sep 2011, 14:06:44 UTC

It's not just a client problem. Even if you request work and it's available, it's not being assigned. It just keeps sending 0 or 1 tasks even if your cache is drained.
____________

Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1153592 - Posted: 18 Sep 2011, 14:21:48 UTC
Last modified: 18 Sep 2011, 14:22:58 UTC

I wouldn't be surprised if a lot of feeder slots are permanently occupied by APs no one wants to have (because no doubt AP task duration is badly F-ed up, too).
____________

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8681
Credit: 24,882,186
RAC: 28,450
United Kingdom
Message 1153597 - Posted: 18 Sep 2011, 14:28:55 UTC
Last modified: 18 Sep 2011, 14:29:18 UTC

If you are only getting 1 or 2 tasks per request when you have a large shortfall, then take note of Richard's post 1153104 in request issues.

I'm pretty sure that request for 1 second, when there's a near-30,000 second shortfall, is a DCF safety.

Edit - confirmed: I edited DCF by a factor of ten - took out a zero, so 0.013... became 0.13...

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3508
Credit: 20,635,907
RAC: 21,900
Sweden
Message 1153600 - Posted: 18 Sep 2011, 14:47:16 UTC - in response to Message 1152837.
Last modified: 18 Sep 2011, 14:50:17 UTC

Asks for work for days, get nothing, still happy, refuse to whine :-)

END

Edit, added: Come to think of it, if I don't get any work by Sunday evening (local time), I might start some mini whining.



So there, it's Sunday evening and still not one single AP received.

As promised: Whine, whine, whine.


LOL
____________

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8629
Credit: 51,392,089
RAC: 50,490
United Kingdom
Message 1153602 - Posted: 18 Sep 2011, 14:52:23 UTC - in response to Message 1153597.

And conveniently, just as I was reading that, a host I've been monitoring confirmed it:

18/09/2011 15:35:36 | | [work_fetch] NVIDIA GPU: shortfall 132825.49 nidle 0.00 saturated 44294.51 busy 0.00 RS fetchable 100.00 runnable 100.00
18/09/2011 15:35:36 | SETI@home | [work_fetch] NVIDIA GPU: fetch share 1.00 LTD 0.00 backoff dt 0.00 int 0.00
18/09/2011 15:35:36 | | [work_fetch] No project chosen for work fetch
18/09/2011 15:35:50 | SETI@home | Computation for task 12jl11ad.18873.476.9.10.189_0 finished
18/09/2011 15:35:50 | SETI@home | [dcf] DCF: 0.016567->0.021427, raw_ratio 0.021427, adj_ratio 1.293360
18/09/2011 15:36:02 | | [work_fetch] NVIDIA GPU: shortfall 119854.60 nidle 0.00 saturated 57265.40 busy 0.00 RS fetchable 100.00 runnable 100.00
18/09/2011 15:36:02 | SETI@home | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (119854.60 sec, 0.00 inst)
18/09/2011 15:36:02 | SETI@home | Reporting 10 completed tasks, requesting new tasks for NVIDIA GPU
18/09/2011 15:36:02 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 CPUs
18/09/2011 15:36:02 | SETI@home | [sched_op] NVIDIA GPU work request: 119854.60 seconds; 0.00 GPUs

I reckon this weekend goes down as 'revenge of the little guys'.

That host is a 9800GT - as you see, it's teetering above and below the 0.02 DCF 'work fetch' cutoff value - I'm still clearing out some work assigned with stock estimates, VHAR drives DCF below 0.02, mid-AR takes it back above.

In about 20 minutes, the optimised app APR kicks in, with tasks given twice the stock speed estimate. That'll do nicely, and I've got a good big run of shorties lined up (the best part of 200) - they'll be reporting, one or two every six minutes, all evening I reckon.

That's why the "results returned per hour" is so high. Stock crunchers, and the people with lesser CUDA cards, are having a field-day with a download pipe mercifully clear of AP, and plenty of shorties between the VLARs.
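
For anyone who wants to watch their own host for this, here is a minimal Python sketch that pulls the old and new DCF out of a `[dcf]` log line like the one quoted above and compares each against the 0.02 cutoff. The 0.02 figure is taken from this post and has not been checked against the BOINC source, and the function names `parse_dcf` and `below_cutoff` are made up for illustration.

```python
import re

# Pattern for the "[dcf]" client log line quoted in the post above.
DCF_LINE = re.compile(
    r"\[dcf\] DCF: (?P<old>\d+\.\d+)->(?P<new>\d+\.\d+), "
    r"raw_ratio (?P<raw>\d+\.\d+), adj_ratio (?P<adj>\d+\.\d+)"
)

# Assumption: cutoff value as stated in the post, not verified in BOINC code.
DCF_CUTOFF = 0.02


def parse_dcf(line):
    """Return (old_dcf, new_dcf) from a [dcf] log line, or None if no match."""
    m = DCF_LINE.search(line)
    if m is None:
        return None
    return float(m.group("old")), float(m.group("new"))


def below_cutoff(dcf, cutoff=DCF_CUTOFF):
    """True when the DCF is under the cutoff discussed in the post."""
    return dcf < cutoff


line = ("18/09/2011 15:35:50 | SETI@home | [dcf] DCF: 0.016567->0.021427, "
        "raw_ratio 0.021427, adj_ratio 1.293360")
old, new = parse_dcf(line)
print(old, below_cutoff(old))  # 0.016567 True
print(new, below_cutoff(new))  # 0.021427 False
```

That matches the log: "No project chosen for work fetch" while DCF was 0.016567, then a real GPU request once it moved to 0.021427.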

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5861
Credit: 60,394,107
RAC: 49,046
Australia
Message 1153661 - Posted: 18 Sep 2011, 18:46:27 UTC - in response to Message 1153589.

It's not just a client problem. Even if you request work and it's available, it's not being assigned. It just keeps sending 0 or 1 tasks even if your cache is drained.

My problem isn't getting 1 or 2 for the GPU, it's getting any at all. Sometimes I get a couple, sometimes a dozen, sometimes a couple of dozen. But invariably they're all crunched before I can download any more.
"No tasks sent" is the usual message, but there are plenty of "Project has no tasks available" there as well.
____________
Grant
Darwin NT.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2287
Credit: 8,794,229
RAC: 3,887
United States
Message 1153666 - Posted: 18 Sep 2011, 19:13:46 UTC - in response to Message 1153450.

When BOINC asks for new work, the return message has been "no work available" for AP since last Tuesday.

Either the estimated completion times for new AP work are so huge you can only get a couple at a time, or something got borked with the Scheduler when they put in the patch. The AP Ready to Send buffer just continues to grow, as the Work in Progress gets less & less.

I'm thinking this is probably the case. Does anyone with a nearly-empty 10+10 cache of AP-only get any new APs? I know the ETA would be astronomical, but if you can normally run through one in ~15 hours, surely you should be able to pick up at least one with a 480-hour work request, unless the ETA is up by 30x.

If it's not that, then feeder is borked.
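
The arithmetic behind that guess, as a rough Python sketch. It uses the figures from this post (a 10+10 day cache, about 15 hours per AP) and assumes, as a simplification that has not been checked against the scheduler, that a task is only sent if its inflated estimate fits inside the requested time. The function names are made up for illustration.

```python
# Figures from the post: a 10+10 day cache and roughly 15 hours per AP task.
CACHE_DAYS = 10 + 10
HOURS_PER_TASK = 15.0


def request_hours(days):
    """Size of the work request in hours for a cache of the given days."""
    return days * 24.0


def tasks_that_fit(request_h, task_h, inflation=1.0):
    """Whole tasks whose (inflated) estimate fits inside the request.

    Assumption: the server only sends tasks that fit in the request.
    """
    return int(request_h // (task_h * inflation))


def inflation_to_block(request_h, task_h):
    """Inflation factor above which not even one task fits."""
    return request_h / task_h


req = request_hours(CACHE_DAYS)
print(req)                                          # 480.0
print(tasks_that_fit(req, HOURS_PER_TASK))          # 32
print(tasks_that_fit(req, HOURS_PER_TASK, 30.0))    # 1
print(tasks_that_fit(req, HOURS_PER_TASK, 33.0))    # 0
print(inflation_to_block(req, HOURS_PER_TASK))      # 32.0
```

So under that assumption the estimates would have to be off by more than about 32x before a 480-hour request came back with nothing at all.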
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,631,148
RAC: 3
United States
Message 1153680 - Posted: 18 Sep 2011, 20:11:46 UTC - in response to Message 1153553.

It is on automatic, I have not messed with it. Machine is not having "too full" issues at all. Just getting measly amounts of work irregularly. I keep asking, server keeps saying 0. or 1. Occasional 30-40.. after it has gone completely dry again.

I wasn't suggesting you had fiddled with it, I was just asking what it is.

It could be that stopping BOINC, editing DCF to a realistic value, then restarting BOINC might fix it.

I'm not having a problem d/loading, not as many as normal agreed, but getting more than you. But that is due to the, now, publicised problem I am having with the AP APR. Because as soon as an AP task completes it punches my DCF up to 1.5 min.


I see no flops entry in the app_info.
____________

Janice

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4139
Credit: 33,426,181
RAC: 19,573
United Kingdom
Message 1153721 - Posted: 18 Sep 2011, 23:37:54 UTC

Uploads have dropped to Zero.

Claggy

Kevin Olley
Joined: 3 Aug 99
Posts: 368
Credit: 35,323,000
RAC: 1,919
United Kingdom
Message 1153722 - Posted: 18 Sep 2011, 23:39:57 UTC

Uploads have stalled and server status page is not updating.


____________
Kevin


Sliver
Project donor
Joined: 18 May 11
Posts: 281
Credit: 7,191,152
RAC: 738
United States
Message 1153723 - Posted: 18 Sep 2011, 23:40:30 UTC - in response to Message 1153721.

Aye. Uploads are nil. Cricket has plummeted.
____________

Robert Pick
Joined: 21 May 05
Posts: 11
Credit: 2,340,386
RAC: 2,927
United States
Message 1153724 - Posted: 18 Sep 2011, 23:40:55 UTC

Same here!!!!!
____________

Wembley
Volunteer tester
Joined: 16 Sep 09
Posts: 415
Credit: 888,257
RAC: 0
United States
Message 1153726 - Posted: 18 Sep 2011, 23:48:29 UTC

Yay! The upload server has died again! Which means my BOINC will soon stop requesting work because of the 2*numprocessors limit!

____________


Donate with your searches and online buys:
http://www.goodsearch.com/toolbar/university-of-california-setihome

arkayn
Project donor
Volunteer tester
Joined: 14 May 99
Posts: 3686
Credit: 48,711,560
RAC: 7,208
United States
Message 1153727 - Posted: 18 Sep 2011, 23:55:38 UTC - in response to Message 1153680.

It is on automatic, I have not messed with it. Machine is not having "too full" issues at all. Just getting measly amounts of work irregularly. I keep asking, server keeps saying 0. or 1. Occasional 30-40.. after it has gone completely dry again.

I wasn't suggesting you had fiddled with it, I was just asking what it is.

It could be that stopping BOINC, editing DCF to a realistic value, then restarting BOINC might fix it.

I'm not having a problem d/loading, not as many as normal agreed, but getting more than you. But that is due to the, now, publicised problem I am having with the AP APR. Because as soon as an AP task completes it punches my DCF up to 1.5 min.


I see no flops entry in the app_info.


You will have to manually add the info.
http://setiathome.berkeley.edu/forum_thread.php?id=62293#1055179

Seems to be fairly close after I changed my DCF back to 1.000000 again.
____________

Iona
Joined: 12 Jul 07
Posts: 567
Credit: 2,908,207
RAC: 2,549
United Kingdom
Message 1153728 - Posted: 18 Sep 2011, 23:59:45 UTC

Is it time for that dreaded water-fowl to present itself?



____________
Don't take life too seriously, as you'll never come out of it alive!

W5DMG - Dave
Joined: 19 May 99
Posts: 155
Credit: 33,047,922
RAC: 11,109
United States
Message 1153740 - Posted: 19 Sep 2011, 0:59:42 UTC - in response to Message 1153728.

Uploads not working.. :(

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2287
Credit: 8,794,229
RAC: 3,887
United States
Message 1153750 - Posted: 19 Sep 2011, 1:39:42 UTC

Maybe uploads died because APs aren't being handed out and the storage got full? That's happened numerous times. Or did they go and put uploads and WU storage on separate volumes? I thought I remembered reading something about that.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

.clair.
Volunteer moderator
Joined: 4 Nov 04
Posts: 1300
Credit: 23,063,000
RAC: 679
United Kingdom
Message 1153751 - Posted: 19 Sep 2011, 1:48:09 UTC - in response to Message 1153728.

Is it time for that dreaded water-fowl to present itself?


If it's the one I think you mean, our fowl watery fiend only comes out to play when the grass is green :¬)

But I think our crickets have turned into locusts and there will be nothing green left before long; the uploads line seems to have crashed :¬(

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 24471
Credit: 33,771,339
RAC: 23,988
Germany
Message 1153776 - Posted: 19 Sep 2011, 4:06:47 UTC

Astropulses are turned off and the servers don't survive the weekend.
I'm wondering...

____________


Copyright © 2014 University of California