Panic Mode On (79) Server Problems?
Author | Message |
---|---|
.clair. Send message Joined: 4 Nov 04 Posts: 1300 Credit: 55,390,408 RAC: 69 |
While the network was not at max, comms were good. Now that more AP splitters are running, we seem to have hit the same problem as a month ago. That is what I can see of it; increasing the AP splitters slowly seems to me to point to something. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
On a test sample of one host, I got the same outcome here as I got at Albert. 1) Try to do a normal update (reporting completed work, and requesting new work): I saw a server timeout, but the server registered the completed work and created some ghosts. 2) Set NNT before update: I got acknowledgements of the (same) completed work. 3) Unset NNT and update again: I got the (same) ghosts, as "resent lost results". I don't think it's just network congestion, no matter how severe. |
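For anyone wanting to repeat the three-step test above from a command line, here is a minimal sketch. It assumes the `boinccmd` tool that ships with the BOINC client is on the PATH, and that the project URL below matches the one your client has attached (both are assumptions, not something stated in the post). The helper names are hypothetical. By default it only prints the commands rather than running them.

```python
import subprocess

# Assumption: the project URL as attached in your client; check your own setup.
PROJECT_URL = "http://setiathome.berkeley.edu/"


def boinccmd_args(operation, project_url=PROJECT_URL):
    """Build one 'boinccmd --project URL <operation>' command line."""
    return ["boinccmd", "--project", project_url, operation]


def nnt_test_sequence(project_url=PROJECT_URL):
    """Command lines mirroring the three steps described in the post."""
    return [
        boinccmd_args("update", project_url),         # 1) normal update
        boinccmd_args("nomorework", project_url),     # 2) set NNT ...
        boinccmd_args("update", project_url),         #    ... then update
        boinccmd_args("allowmorework", project_url),  # 3) unset NNT ...
        boinccmd_args("update", project_url),         #    ... and update again
    ]


def run_sequence(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_sequence(nnt_test_sequence())
```

In practice you would wait for each scheduler reply (or timeout) in the event log before issuing the next command, since the point of the test is to compare what the server acknowledges at each step.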
zoom3+1=4 Send message Joined: 30 Nov 03 Posts: 65759 Credit: 55,293,173 RAC: 49 |
On a test sample of one host, I got the same outcome here as I got at Albert. Yeah, something is screwed up, but what? The Joker is in the details... The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Yeah, getting scheduler requests through again is so ugly most of my better rigs are out of GPU work due to timed out or otherwise unable to be completed requests and the %#@$#@&##! 100 WU limit. This limit situation is starting to piss even the good natured kitties off. Can't ride out the Tuesday outage or some network/server congestion without running out of work for the GPUs. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Wiggo Send message Joined: 24 Jan 00 Posts: 34841 Credit: 261,360,520 RAC: 489 |
If the guys have changed something in the server closet in the last several hours then they better change it back again. Cheers. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I'm having problems with the uploads hanging. I don't have that many with the APs going on one card and long MBs on the other, but, they are all hanging. The Long MBs are about gone, and now the recently downloaded shorties will be running....more hangs. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
And when, after a number of retries, the scheduler responds with some 'resends', most of the downloads are dead in the water. EDIT... I would estimate this all went to heck in a handbasket about 4-5 hours ago. When I left for work about 9 hours ago, all rigs had their pitiful 100 WU allotment filled. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Seeing the same behavior as when things were really bad a while back....before the limits 'fixed' things. Host makes scheduler request. My account shows that contact with the scheduler was made. Scheduler does not answer.... Host tries again, still no answer. Eventually after enough retries, the scheduler responds by resending 'lost' tasks. MY rigs did not lose them. Uploads are rather dicey too. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
zoom3+1=4 Send message Joined: 30 Nov 03 Posts: 65759 Credit: 55,293,173 RAC: 49 |
Seeing the same behavior as when things were really bad a while back....before the limits 'fixed' things. Yeah and that's crazy, something's borked... The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Wiggo Send message Joined: 24 Jan 00 Posts: 34841 Credit: 261,360,520 RAC: 489 |
I've just set NNT until this hiccup is over as I'm not going to baby sit down/up loads (my backup projects may get to fight for my resources again). Cheers. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
Uploads backing up. After hitting retry several dozen times I was able to get a couple to upload, eventually. Upload speed was that of an old & crippled snail (< 2kB/s). Upload error message: connect() failed. Have got a few Scheduler errors, mostly "Server returned no data" etc. I'd probably have more, but the backed-up uploads have been blocking the work requests. When the request does go through it's taking 1-2 min to get a response. BTW, weren't the WUs with the really long identifier meant to have been fixed? I'm still getting lots of those. Grant Darwin NT |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Your observations about comms mirror what I am seeing. I don't think the long IDs were considered a problem per se, but I thought they were a temporary thing as well. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
Just to add to the fun, when i do get work it's almost all shorties. Grant Darwin NT |
W-K 666 Send message Joined: 18 May 99 Posts: 19072 Credit: 40,757,560 RAC: 67 |
Uploads backing up. That looks like the problem here also. By abusing a few buttons, got enough uploads to happen so that requests could be made. It took a few requests but eventually got a few GPU tasks and they all came in at >50kbs. Just to add to the fun, when i do get work it's almost all shorties. Same here, so that still leaves me less than an hour's GPU crunching time on hand. So as my four-legged friend (the bed) calls, I must either enable Einstein crunching or switch off and try domani. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
It took a few requests but eventually got a few GPU tasks and they all came in at >50kbs. 10-20kB/s here at the moment. Now with uploads 1kB/s is doing well (when it does eventually go through). Grant Darwin NT |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Ye-Haw a Shorty Storm with a Borked Upload server. How many do you think it will hang?
12/28/2012 6:13:56 PM | SETI@home | Started upload of 03se12aa.10570.17249.140733193388039.10.238_2_0
12/28/2012 6:14:19 PM | | Project communication failed: attempting access to reference site
12/28/2012 6:14:19 PM | SETI@home | Temporarily failed upload of 03se12aa.10570.17249.140733193388039.10.238_2_0: connect() failed
12/28/2012 6:14:19 PM | SETI@home | Backing off 3 min 50 sec on upload of 03se12aa.10570.17249.140733193388039.10.238_2_0
12/28/2012 6:14:21 PM | | Internet access OK - project servers may be temporarily down.
12/28/2012 6:17:59 PM | SETI@home | Computation for task 07oc12af.16388.6202.140733193388038.10.1_1 finished
12/28/2012 6:17:59 PM | SETI@home | Starting task 07oc12ag.10905.12746.9.10.247_1 using setiathome_enhanced version 609 (cuda23) in slot 3
12/28/2012 6:18:01 PM | SETI@home | Started upload of 07oc12af.16388.6202.140733193388038.10.1_1_0
12/28/2012 6:18:10 PM | SETI@home | Started upload of 03se12aa.10570.17249.140733193388039.10.238_2_0
12/28/2012 6:18:23 PM | | Project communication failed: attempting access to reference site
12/28/2012 6:18:23 PM | SETI@home | Temporarily failed upload of 07oc12af.16388.6202.140733193388038.10.1_1_0: connect() failed
12/28/2012 6:18:23 PM | SETI@home | Backing off 3 min 34 sec on upload of 07oc12af.16388.6202.140733193388038.10.1_1_0
12/28/2012 6:18:25 PM | | Internet access OK - project servers may be temporarily down.
12/28/2012 6:21:45 PM | | Project communication failed: attempting access to reference site
12/28/2012 6:21:45 PM | SETI@home | Temporarily failed upload of 03se12aa.10570.17249.140733193388039.10.238_2_0: connect() failed
12/28/2012 6:21:45 PM | SETI@home | Backing off 4 min 49 sec on upload of 03se12aa.10570.17249.140733193388039.10.238_2_0
12/28/2012 6:21:47 PM | | Internet access OK - project servers may be temporarily down.
12/28/2012 6:22:09 PM | SETI@home | Computation for task 07oc12ag.10905.12746.9.10.247_1 finished
12/28/2012 6:22:09 PM | SETI@home | Starting task 07oc12ah.10878.22562.6.10.76_1 using setiathome_enhanced version 609 (cuda23) in slot 3
12/28/2012 6:22:11 PM | SETI@home | Started upload of 07oc12ag.10905.12746.9.10.247_1_0
12/28/2012 6:22:49 PM | | Project communication failed: attempting access to reference site
12/28/2012 6:22:49 PM | SETI@home | Temporarily failed upload of 07oc12ag.10905.12746.9.10.247_1_0: connect() failed
12/28/2012 6:22:49 PM | SETI@home | Backing off 3 min 11 sec on upload of 07oc12ag.10905.12746.9.10.247_1_0
12/28/2012 6:22:50 PM | | Internet access OK - project servers may be temporarily down.
12/28/2012 6:26:11 PM | SETI@home | Computation for task 07oc12ah.10878.22562.6.10.76_1 finished
12/28/2012 6:26:11 PM | SETI@home | Starting task 01au12ab.23909.24895.10.10.9_2 using setiathome_enhanced version 609 (cuda23) in slot 3
12/28/2012 6:26:13 PM | SETI@home | Started upload of 07oc12ah.10878.22562.6.10.76_1_0
12/28/2012 6:26:26 PM | SETI@home | Computation for task 01au12ab.23909.24895.10.10.9_2 finished
12/28/2012 6:26:26 PM | SETI@home | Starting task 07oc12af.5577.9065.140733193388039.10.69_0 using setiathome_enhanced version 609 (cuda23) in slot 3
12/28/2012 6:26:28 PM | SETI@home | Started upload of 01au12ab.23909.24895.10.10.9_2_0
12/28/2012 6:26:35 PM | | Project communication failed: attempting access to reference site
12/28/2012 6:26:35 PM | SETI@home | Temporarily failed upload of 07oc12ah.10878.22562.6.10.76_1_0: connect() failed
12/28/2012 6:26:35 PM | SETI@home | Backing off 3 min 46 sec on upload of 07oc12ah.10878.22562.6.10.76_1_0
12/28/2012 6:26:36 PM | | Internet access OK - project servers may be temporarily down.
12/28/2012 6:26:50 PM | | Project communication failed: attempting access to reference site
12/28/2012 6:26:50 PM | SETI@home | Temporarily failed upload of 01au12ab.23909.24895.10.10.9_2_0: connect() failed
12/28/2012 6:26:50 PM | SETI@home | Backing off 3 min 19 sec on upload of 01au12ab.23909.24895.10.10.9_2_0
12/28/2012 6:26:52 PM | | Internet access OK - project servers may be temporarily down.
:-( |
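To put a number on "how many will it hang", the event log quoted above can be tallied with a small script. This is only a sketch: it assumes the log has been saved one entry per line in the "timestamp | project | message" layout shown in the post. The function name `summarize` and the `SAMPLE` list are hypothetical; the sample lines themselves are copied from the log above.

```python
import re
from collections import Counter

# Message formats as they appear in the quoted event log.
FAILED = re.compile(r"Temporarily failed upload of (?P<name>\S+): (?P<reason>.+)$")
BACKOFF = re.compile(
    r"Backing off (?:(?P<min>\d+) min )?(?P<sec>\d+) sec on upload of (?P<name>\S+)$"
)

# Four entries copied from the log above (one file, two failed attempts).
SAMPLE = [
    "12/28/2012 6:14:19 PM | SETI@home | Temporarily failed upload of 03se12aa.10570.17249.140733193388039.10.238_2_0: connect() failed",
    "12/28/2012 6:14:19 PM | SETI@home | Backing off 3 min 50 sec on upload of 03se12aa.10570.17249.140733193388039.10.238_2_0",
    "12/28/2012 6:21:45 PM | SETI@home | Temporarily failed upload of 03se12aa.10570.17249.140733193388039.10.238_2_0: connect() failed",
    "12/28/2012 6:21:45 PM | SETI@home | Backing off 4 min 49 sec on upload of 03se12aa.10570.17249.140733193388039.10.238_2_0",
]


def summarize(lines):
    """Return (failed attempts per file, total backoff seconds per file)."""
    failures = Counter()
    backoff_seconds = Counter()
    for line in lines:
        # Split into timestamp, project, message; skip anything else.
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue
        message = parts[2]
        m = FAILED.search(message)
        if m:
            failures[m.group("name")] += 1
            continue
        m = BACKOFF.search(message)
        if m:
            minutes = int(m.group("min") or 0)
            backoff_seconds[m.group("name")] += minutes * 60 + int(m.group("sec"))
    return failures, backoff_seconds


if __name__ == "__main__":
    failures, backoff = summarize(SAMPLE)
    for name, count in failures.items():
        print(f"{name}: {count} failed attempts, {backoff[name]} s of backoff")
```

Run over the full log rather than the four-line sample, the same tally would show which result files keep retrying and how long the client is spending in backoff for each.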
W-K 666 Send message Joined: 18 May 99 Posts: 19072 Credit: 40,757,560 RAC: 67 |
It took a few requests but eventually got a few GPU tasks and they all came in at >50kbs. That's 'cause you live in the middle of nowhere, or at least can see nowhere from there. My cousin keeps me informed of the actions of your ISPs and the telcos over there; he is not very impressed, having lived in the UK, Boston and southern California. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
My cousin keeps me informed of the actions of your ISP's and the telco's over there he is not very impressed having lived in the UK, Boston and southern California. We're not overly thrilled with them either, but they do have the population density & total numbers argument on their side. A landmass the area of mainland USA, with a total population that's not even 3 times that of London (22.6 million v 8.2 million). Grant Darwin NT |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I have not seen the Seti servers so totally screwed in a long while. I mean, we have a trifecta going here. Uploads, downloads, AND scheduler requests all nearly impossible. Makes me wonder if we have a dang DoS attack on the servers going again. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
OTS Send message Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0 |
I have not seen the Seti servers so totally screwed in a long while. Perhaps just busy. I had an AP upload and two AP downloads finished within the last 10 minutes. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.