Panic Mode On (77) Server Problems?



Message boards : Number crunching : Panic Mode On (77) Server Problems?

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8535
Credit: 59,447,140
RAC: 86,028
United Kingdom
Message 1301074 - Posted: 1 Nov 2012, 20:19:41 UTC

Certainly something amiss in the server closet - the query rate has climbed to about 3000/s instead of the usual few hundred....
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4443
Credit: 119,139,792
RAC: 140,477
United States
Message 1301077 - Posted: 1 Nov 2012, 20:23:47 UTC - in response to Message 1301074.

Certainly something amiss in the server closet - the query rate has climbed to about 3000/s instead of the usual few hundred....

Probably just the daily stat dump being generated.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

MikeN
Joined: 24 Jan 11
Posts: 301
Credit: 32,389,183
RAC: 18,814
United Kingdom
Message 1301092 - Posted: 1 Nov 2012, 21:46:23 UTC

Very difficult to report here, and the more tasks to report the more difficult it is. When I did manage to report, the only new WUs I got were shorties. Combination of APs and shorties makes for a perfect storm and everything grinds to a halt.
____________

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2290
Credit: 8,813,816
RAC: 4,080
United States
Message 1301102 - Posted: 1 Nov 2012, 23:14:45 UTC
Last modified: 1 Nov 2012, 23:24:20 UTC

Starting to get a bunch of APs allocated to me. Download speeds increased by a factor of 10 for me sometime today, to the mid-20 KB/sec range. Getting a lot of "resent lost task" to go with all of the "timeout was reached" messages.



Unrelated, but it involves my crunchers: both of my UPSes (APC 1300, Tripp-Lite 1400) on my desk died simultaneously this morning at 0730 UTC. I woke up because all of the fan hum stopped (two rigs and a rack-mount 24-port gigabit switch), and there was a 60 dB 2.5 kHz tone coming from one of them. That was fun to deal with. Now I need 172 Ah worth of 12 V batteries to get all five of my units (7800 VA in total) back up and running. Looks like it's going to cost about US$550. Or I can get an APC 3000 VA 3U rackmount unit with new batteries on eBay for $550 + shipping. Decisions, decisions. I think I might go for getting the APC 2200 up and running to replace the two that were on the desk.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

[seti.international] Dirk Sadowski
Project donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7101
Credit: 60,933,483
RAC: 17,242
Germany
Message 1301112 - Posted: 2 Nov 2012, 0:53:12 UTC

For maybe the last 20 hours or so, no proper scheduler contact. Always:

Scheduler request failed: Timeout was reached


*new tasks* was enabled.

I set *no new tasks* - and then 178 uploaded tasks were accepted by the scheduler server in one batch (successful report).


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
____________
BR

SETI@home Needs your Help ... $10 & U get a Star!

Team seti.international

Das Deutsche Cafe. The German Cafe.

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1715
Credit: 205,828,023
RAC: 27,762
Australia
Message 1301118 - Posted: 2 Nov 2012, 1:34:28 UTC

By installing a 7.xx client, setting a max report of 10 tasks in my cc_config file and setting No New Tasks, I find I can reliably report.

A report of 10 tasks was the most I could go; 15 or 20 timed out.

With nearly a thousand tasks to report, it's gonna be a Looooong day :-)

T.A.
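
For anyone wanting to try the same thing, here is a minimal sketch of producing that setting. It assumes the option meant here is the BOINC client's max_tasks_reported (the post only says "a max report of 10 tasks"). The helper name build_cc_config is made up for illustration, and the script only builds the XML text; it does not touch any files.

```python
import xml.etree.ElementTree as ET


def build_cc_config(max_tasks_reported: int) -> str:
    """Return cc_config XML text that caps tasks reported per scheduler request.

    Assumption: the relevant option is <max_tasks_reported> inside <options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "max_tasks_reported").text = str(max_tasks_reported)
    return ET.tostring(root, encoding="unicode")


# 10 is the cap that worked in the post above; 15 or 20 timed out.
config_text = build_cc_config(10)
print(config_text)
```

The output would be saved as cc_config.xml in the BOINC data directory, after which the client has to restart or re-read its config files. If a cc_config.xml already exists, merge the option in by hand, since this sketch does not preserve other settings.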

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2290
Credit: 8,813,816
RAC: 4,080
United States
Message 1301119 - Posted: 2 Nov 2012, 1:41:17 UTC - in response to Message 1301118.

By installing a 7.xx client, setting a max report of 10 tasks in my cc_config file and setting No New Tasks, I find I can reliably report.

A report of 10 tasks was the most I could go; 15 or 20 timed out.

With nearly a thousand tasks to report, it's gonna be a Looooong day :-)

T.A.

Do you generate 10 new ones to report within that 303-second scheduler interval? If so, it's a negative-feedback loop and you won't catch up.
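
A rough way to put numbers on that catch-up question, assuming every scheduler contact succeeds (which it clearly does not right now) and that each contact shrinks the backlog by the report cap minus whatever finished in between. The function name and the 4-tasks-per-interval figure are made up for illustration; the 1000-task backlog, the cap of 10 and the 303-second interval come from the posts above.

```python
import math


def hours_to_clear(backlog, batch_size, interval_s, new_per_interval):
    """Hours needed to report a backlog, or None if it never shrinks.

    Simplified model: one successful contact every interval_s seconds,
    batch_size tasks reported per contact, new_per_interval tasks
    completed between contacts.
    """
    net_per_contact = batch_size - new_per_interval
    if net_per_contact <= 0:
        return None  # producing as fast as (or faster than) reporting
    contacts = math.ceil(backlog / net_per_contact)
    return contacts * interval_s / 3600


# No new results arriving: 100 contacts at 303 s each, about 8.4 hours.
print(hours_to_clear(1000, 10, 303, 0))

# Hypothetical: 4 new results finish per interval, so only 6 net per contact.
print(hours_to_clear(1000, 10, 303, 4))

# 10 new results per interval: the backlog never shrinks.
print(hours_to_clear(1000, 10, 303, 10))
```

With timeouts in the mix the real figure will be worse, so treat these as lower bounds.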

My scheduler contacts are hit-and-miss, and I'm not even reporting any completed tasks. Some of them get a reply in 10 seconds, others reach the time-out period.

I am, however, trying to hoard APs. Om nom nom nom..
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1715
Credit: 205,828,023
RAC: 27,762
Australia
Message 1301122 - Posted: 2 Nov 2012, 1:49:58 UTC - in response to Message 1301119.
Last modified: 2 Nov 2012, 1:51:39 UTC

Luckily the tasks I'm crunching atm are taking around 10 minutes to crunch, so the reporting is slowly pulling ahead.

I'm also helping with a bit of "button abuse". If you have set NNT you don't have to wait the 5 minutes for another "hit".

T.A.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2290
Credit: 8,813,816
RAC: 4,080
United States
Message 1301125 - Posted: 2 Nov 2012, 2:08:05 UTC - in response to Message 1301122.

If you have set NNT you don't have to wait the 5 minutes for another "hit".

Interesting to know. I thought it was a universal 303 seconds between accepted contacts. Not that I ever have a huge pile of tasks to report since I'm CPU-only and AP.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

cov_route
Joined: 13 Sep 12
Posts: 295
Credit: 7,423,890
RAC: 12,746
Canada
Message 1301128 - Posted: 2 Nov 2012, 2:15:28 UTC

I don't know what happened during the day today, but I come home and the task list on my account page insists I have 2270 tasks in progress.

There are actually 431 in my cache. Hmm.

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1715
Credit: 205,828,023
RAC: 27,762
Australia
Message 1301132 - Posted: 2 Nov 2012, 2:40:41 UTC - in response to Message 1301125.
Last modified: 2 Nov 2012, 2:48:23 UTC

If you have set NNT you don't have to wait the 5 minutes for another "hit".

Interesting to know. I thought it was a universal 303 seconds between accepted contacts. Not that I ever have a huge pile of tasks to report since I'm CPU-only and AP.

My guess is that the limit is applied in the Scheduler and that a straight report without a request for new work doesn't go through the Scheduler, so therefore the limit is not applied.

T.A.

Brother Frank
Joined: 10 Dec 11
Posts: 26
Credit: 15,142,410
RAC: 0
United States
Message 1301141 - Posted: 2 Nov 2012, 3:28:27 UTC - in response to Message 1301112.

Dear Sutaru Tsureku,

I did what you suggested and set my download to no new tasks and then waited five minutes or so. Then I "abused my Update button" by pressing it, but only once. I used to really hit the darned thing so often, but I'm getting more subtle and not "using that hammer" all the time. A few minutes later, all of my 50 or so built-up completed tasks had been swallowed up. I moved from this desktop with one Nvidia Ti 550 graphics card to my new notebook and did the same thing. Again, in just a few minutes my 40 or so completed tasks were swallowed. I went downstairs to my i7 2600k desktop with two Nvidia Ti 550 graphics cards and it had swallowed all its completed tasks -- probably 70-80 of them.

I am feeling a great sense of relief, like you have helped me find a wonderful brand of computer laxative. Wow, thanks for reminding me of this wonderful inexpensive procedure. I am breaking out in song: "Oh What A Relief IT IS -- la la la la la.... la la. Hurray." I hope this is not too racy. It's in the same fine tradition found in the writings of the esteemed 16th-century author Francois Rabelais, a scholar and humorist who penned the tales of Gargantua and Pantagruel. French literature, including Rabelais and Michel de Montaigne, was one of my very favorite college courses.

Brother Frank

Sirius B
Volunteer tester
Joined: 26 Dec 00
Posts: 11547
Credit: 1,728,110
RAC: 1,647
Israel
Message 1301144 - Posted: 2 Nov 2012, 3:58:19 UTC

This is a first for me. After working on my Win 8 host, allowed it to download tasks.

After several timeouts, found that I had downloaded 16 wu's. However, on looking at My computers page, found that I had 119 wu's, but 103 marked "abandoned"?
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,599,319
RAC: 47,507
Australia
Message 1301148 - Posted: 2 Nov 2012, 4:17:39 UTC - in response to Message 1301144.


While at work today not once were either of my systems able to report any work.
Scheduler request failed: Timeout was reached is the only response they get.

And naturally their caches continue to shrink.
____________
Grant
Darwin NT.

Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1301151 - Posted: 2 Nov 2012, 4:41:23 UTC - in response to Message 1301148.
Last modified: 2 Nov 2012, 4:41:57 UTC

After several timeouts, found that I had downloaded 16 wu's. However, on looking at My computers page, found that I had 119 wu's, but 103 marked "abandoned"?

You're lucky. My computer got its entire 5-day cache of 692 tasks (including some of the new APs) abandoned today for absolutely no reason. The worst of it is, none of the workunits were deleted locally, so if I hadn't aborted them, I'd be doing 5 days of pointless work.
____________

Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,896,743
RAC: 2,450
United States
Message 1301188 - Posted: 2 Nov 2012, 6:23:16 UTC

As someone pointed out, and as I see myself, it appears that the workunits we've uploaded are being processed and credited. What's failing is just the notification of that, which would clear all the workunits listed as "ready to report" off the task tab, as well as the fetching of any new workunits.

Of course you know what this means. When the blockage is finally fixed we will be getting a metric ton of new workunits all with the same timestamp and likely paired with the same wingman or two. And if history is any indication there is a high likelihood that the wingman will be someone who a) has a CUDA application that always overflows, leading to an inconclusive validation; b) has a 29.9-day turnaround time; or c) forgot they signed up for Seti@Home, so all the tasks time out after six or so weeks.

No, I'm not cynical. Not at all. :p
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Uli
Project donor
Volunteer tester
Joined: 6 Feb 00
Posts: 9842
Credit: 5,466,250
RAC: 190
Germany
Message 1301202 - Posted: 2 Nov 2012, 7:09:27 UTC

I have been monitoring my rig and have no issues.
Except Mark is partnered with me on one of my inconclusives. I'm sure it is stuck in the upload problem.
____________
Pluto will always be a planet to me.
Order your 15th Seti Anniversary Shirt today. Just PM me for details.
Cash Donation Specialist

Seti Ambassador

Uli
Project donor
Volunteer tester
Joined: 6 Feb 00
Posts: 9842
Credit: 5,466,250
RAC: 190
Germany
Message 1301213 - Posted: 2 Nov 2012, 8:17:12 UTC
Last modified: 2 Nov 2012, 8:19:15 UTC

MP, all our science counts. Your fascination with Gaussians astounds me. When in time we find the signal, everyone will win.
____________
Pluto will always be a planet to me.
Order your 15th Seti Anniversary Shirt today. Just PM me for details.
Cash Donation Specialist

Seti Ambassador


Copyright © 2014 University of California