Panic Mode On (80) Server Problems? |
Message boards : Number crunching : Panic Mode On (80) Server Problems?
| Author | Message |
|---|---|
|
Almost nothing is working currently, most of us are on backup projects. | |
| ID: 1321335 · | |
|
Aye the plumbing is all plugged up, ach someone stuffed too many Klingons in there... | |
| ID: 1321343 · | |
|
I'm seeing the same thing you guys are - Scheduler time-outs, then come back 5 minutes later and get through, and pick up the ghosts from the earlier contact. | |
| ID: 1321391 · | |
|
A couple days ago, I set the host with the new GT630 to NNT for Einstein because I wanted it to actually get some Seti done so I could see how long it was taking 2 at a time. So I check on it today and find it doing nothing because all the Seti downloads are stuck. Grrrrr. | |
| ID: 1321430 · | |
|
My upload timeouts began about Wednesday and got worse day by day. I noticed that my RAC rose steadily from Dec 5th until Dec 27th. I am getting "work unit limit reached" even on my little i3 notebook without dedicated graphics. Nothing is working here anymore. I am switching to the alternate projects GPU Grid and Rosetta @ Home today. I tried proxy servers and the no-new-work setting during that very bad stretch of scheduler/server problems from November to early December. These last two-plus days feel as bad as the last meltdown. | |
| ID: 1321491 · | |
"I'm seeing the same thing you guys are - Scheduler time-outs, then come back 5 minutes later and get through, and pick up the ghosts from the earlier contact."
Yes, I synchronised my home machine with NTP (I presume the Labs are too; my lab servers are) and observed that after my PC sent a request for more work, both the time-stamp on my "Last contact" and the "Sent" time for (a) recent file(s) updated to 3-5 seconds after my request time. Sometimes the scheduler request would time out; other times I got notification in 5-20 seconds that there was more work, often in the form of ghosts if the file entry existed before my request. It seems the database is updating quickly, but somewhere along the chain the return of information about the new file(s) is disappearing. | |
| ID: 1321498 · | |
|
It's still sometimes taking me around 5-10 minutes to complete a simple file upload. Once that is accomplished, the rest seems to work fine. If you are working on 20-40 minute long tasks, as I am, it's not that much of a problem. First step is to Fix the Upload Server. | |
| ID: 1321552 · | |
|
All the problems started when the 4th and 5th AP splitters went online. That was the only visible change after the maintenance outage. | |
| ID: 1321613 · | |
|
The splitters are working better than they have for months, the assimilators are finally keeping up. | |
| ID: 1321681 · | |
|
Panic mode? Again? | |
| ID: 1321717 · | |
"All the problems started when the 4th and 5th AP splitters went online. That was the only visible change after the maintenance outage."
This is more or less the same thing I wanted to say in my post: http://setiathome.berkeley.edu/forum_thread.php?id=70070&postid=1321332 It's easy to test and isn't too complicated! I think it's worth testing! Edit: And it will not increase anybody's power bill. | |
| ID: 1321733 · | |
"All the problems started when the 4th and 5th AP splitters went online. That was the only visible change after the maintenance outage."
So that's what torpedoed the system... sigh. | |
| ID: 1321749 · | |
|
Ya know.........the kitties really don't care anymore. | |
| ID: 1321803 · | |
|
FWIW Uploads are uploading with an occasional Retry | |
| ID: 1321833 · | |
|
Best to wean the kitties off the weed in the new year..... Ya know.........the kitties really don't care anymore. | |
| ID: 1321855 · | |
|
As I recall, during the other 'Psychotic Server Episodes', the Uploads continued without any problems. It never took longer than a second or two to complete the upload. This is what has been going on since yesterday, around the time the Multibeam Shortie Storm began:
12/29/2012 6:58:31 PM | SETI@home | Computation for task ap_25no12ad_B5_P1_00295_20121227_30240.wu_1 finished
12/29/2012 6:58:31 PM | SETI@home | Starting task ap_25no12ad_B6_P0_00240_20121227_31377.wu_1 using astropulse_v6 version 604 (ati_opencl_100) in slot 2
12/29/2012 6:58:33 PM | SETI@home | Started upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 6:58:56 PM | | Project communication failed: attempting access to reference site
12/29/2012 6:58:56 PM | SETI@home | Temporarily failed upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0: connect() failed
12/29/2012 6:58:56 PM | SETI@home | Backing off 3 min 47 sec on upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 6:58:57 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 6:59:28 PM | SETI@home | Started upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:00:13 PM | | Project communication failed: attempting access to reference site
12/29/2012 7:00:13 PM | SETI@home | Temporarily failed upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0: connect() failed
12/29/2012 7:00:13 PM | SETI@home | Backing off 5 min 12 sec on upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:00:15 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 7:00:54 PM | SETI@home | Started upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:04:04 PM | | Project communication failed: attempting access to reference site
12/29/2012 7:04:04 PM | SETI@home | Temporarily failed upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0: transient HTTP error
12/29/2012 7:04:04 PM | SETI@home | Backing off 10 min 45 sec on upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:04:05 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 7:04:06 PM | SETI@home | Started upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:05:02 PM | | Project communication failed: attempting access to reference site
12/29/2012 7:05:02 PM | SETI@home | Temporarily failed upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0: connect() failed
12/29/2012 7:05:02 PM | SETI@home | Backing off 20 min 59 sec on upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:05:03 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 7:05:04 PM | SETI@home | Started upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:05:37 PM | SETI@home | Finished upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:05:40 PM | SETI@home | Sending scheduler request: To fetch work.
12/29/2012 7:05:40 PM | SETI@home | Reporting 1 completed tasks, requesting new tasks for NVIDIA and ATI
12/29/2012 7:05:44 PM | SETI@home | Scheduler request completed: got 1 new tasks
12/29/2012 7:05:47 PM | SETI@home | Started download of 08oc12aa.31216.126844.12.10.140
12/29/2012 7:06:01 PM | SETI@home | Finished download of 08oc12aa.31216.126844.12.10.140
....
That's 7 minutes to upload one file. If that would have been last night during the height of the SS, I would have had another completed task before that one uploaded. In fact, I had around 8 stalled uploads at one point last night. If I got them all to remain active long enough, the report went through and I received downloaded files. If one was stalled, nothing worked. Things calmed down when the SS passed and the 4 minute MBs were replaced with 20+ minute MBs. Right now, I'm completing the Upload before the next task is finished.
The way I see it, the latest problem began when someone decided to release a Shortie Storm on a malfunctioning Upload Server. The Upload Server is still Borked. | |
| ID: 1321856 · | |
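The "7 minutes to upload one file" figure in the post above can be checked straight from the event log. The following is a minimal sketch, assuming the three-column `timestamp | project | message` layout shown in that log; `upload_durations` is a hypothetical helper written for illustration and is not part of the BOINC client. The `LOG` sample is trimmed to a few of the entries from the post.

```python
from datetime import datetime

# A few entries copied from the event log in the post above.
LOG = """\
12/29/2012 6:58:33 PM | SETI@home | Started upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 6:58:56 PM | | Project communication failed: attempting access to reference site
12/29/2012 6:58:56 PM | SETI@home | Temporarily failed upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0: connect() failed
12/29/2012 7:05:04 PM | SETI@home | Started upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
12/29/2012 7:05:37 PM | SETI@home | Finished upload of ap_25no12ad_B5_P1_00295_20121227_30240.wu_1_0
"""

STARTED = "Started upload of "
FINISHED = "Finished upload of "


def upload_durations(log_text):
    """Map each file name to the time between its FIRST 'Started upload'
    entry and its 'Finished upload' entry (retries in between are ignored)."""
    first_start = {}
    durations = {}
    for line in log_text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3:
            continue  # not a "timestamp | project | message" line
        stamp, _project, message = parts
        when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
        if message.startswith(STARTED):
            # Remember only the earliest start for each file.
            first_start.setdefault(message[len(STARTED):], when)
        elif message.startswith(FINISHED):
            name = message[len(FINISHED):]
            if name in first_start:
                durations[name] = when - first_start[name]
    return durations


for name, elapsed in upload_durations(LOG).items():
    print(name, elapsed)
```

For the sample above the elapsed time comes out as 0:07:04, which agrees with the poster's "7 minutes".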
"The way I see it, the latest problem began when someone decided to release a Shortie Storm on a malfunctioning Upload Server. The Upload Server is still Borked."
Even before the shorties, uploads & the Scheduler were both stuffed. At least since the shorties have stopped arriving things aren't as bad as they were. I wouldn't say they've improved, as things are still seriously screwed, but they're not as bad as they were (uploads still take ages & I'm still getting them backing up, & Scheduler requests still time out, although the number of errors has dropped).
____________ Grant Darwin NT. | |
| ID: 1321887 · | |
"Ya know.........the kitties really don't care anymore."
Yes, it can be frustrating at times. But... This is the only real attempt to find Intelligent Life, not just life, somewhere beyond this planet. Despite the problems: Let's just 'Keep On Going'. | |
| ID: 1321908 · | |
|
Best to wean the kitties off the weed in the new year..... | |
| ID: 1321910 · | |
"The way I see it, the latest problem began when someone decided to release a Shortie Storm on a malfunctioning Upload Server. The Upload Server is still Borked."
I really haven't had any trouble with downloads. Other than the scheduler continually trying to give me all ATI MBs and nothing for the nVidia card. I just made the transition from all MBs to half APs & half MBs. Things are working well except for the Upload Server problem. Here's the latest one, 11 minutes and note how the scheduler didn't make a peep during that time. As soon as the upload completed, the scheduler kicked in and topped me off at 200 again.
12/29/2012 9:32:33 PM | SETI@home | Computation for task 08oc12ab.6744.8247.6.10.239_0 finished
12/29/2012 9:32:33 PM | SETI@home | Starting task 08oc12ab.6744.8247.6.10.208_1 using setiathome_enhanced version 609 (cuda23) in slot 3
12/29/2012 9:32:35 PM | SETI@home | Started upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:33:38 PM | SETI@home | Started download of 08oc12aa.13907.128071.13.10.99
12/29/2012 9:37:22 PM | | Project communication failed: attempting access to reference site
12/29/2012 9:37:22 PM | SETI@home | Temporarily failed upload of 08oc12ab.6744.8247.6.10.239_0_0: transient HTTP error
12/29/2012 9:37:22 PM | SETI@home | Backing off 3 min 56 sec on upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:37:24 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 9:37:27 PM | SETI@home | Started upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:38:38 PM | | Project communication failed: attempting access to reference site
12/29/2012 9:38:38 PM | SETI@home | Temporarily failed upload of 08oc12ab.6744.8247.6.10.239_0_0: transient HTTP error
12/29/2012 9:38:38 PM | SETI@home | Backing off 4 min 15 sec on upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:38:39 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 9:38:43 PM | SETI@home | Started upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:39:06 PM | | Project communication failed: attempting access to reference site
12/29/2012 9:39:06 PM | SETI@home | Temporarily failed upload of 08oc12ab.6744.8247.6.10.239_0_0: connect() failed
12/29/2012 9:39:06 PM | SETI@home | Backing off 9 min 57 sec on upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:39:07 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 9:39:11 PM | SETI@home | Started upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:39:13 PM | | Project communication failed: attempting access to reference site
12/29/2012 9:39:13 PM | SETI@home | Temporarily failed download of 08oc12aa.13907.128071.13.10.99: transient HTTP error
12/29/2012 9:39:13 PM | SETI@home | Backing off 6 min 0 sec on download of 08oc12aa.13907.128071.13.10.99
12/29/2012 9:39:14 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 9:39:15 PM | SETI@home | Started download of 08oc12aa.13907.128071.13.10.99
12/29/2012 9:39:51 PM | SETI@home | Finished download of 08oc12aa.13907.128071.13.10.99
12/29/2012 9:41:37 PM | | Project communication failed: attempting access to reference site
12/29/2012 9:41:37 PM | SETI@home | Temporarily failed upload of 08oc12ab.6744.8247.6.10.239_0_0: transient HTTP error
12/29/2012 9:41:37 PM | SETI@home | Backing off 22 min 6 sec on upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:41:38 PM | | Internet access OK - project servers may be temporarily down.
12/29/2012 9:41:40 PM | SETI@home | Started upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:43:47 PM | SETI@home | Finished upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:43:48 PM | SETI@home | Sending scheduler request: To fetch work.
12/29/2012 9:43:48 PM | SETI@home | Reporting 1 completed tasks, requesting new tasks for NVIDIA and ATI
12/29/2012 9:43:54 PM | SETI@home | Scheduler request completed: got 1 new tasks
12/29/2012 9:43:56 PM | SETI@home | Started download of 08oc12ab.13879.15200.11.10.53
12/29/2012 9:44:10 PM | SETI@home | Finished download of 08oc12ab.13879.15200.11.10.53
.... | |
| ID: 1321918 · | |
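To see how the announced retry delays grew over the course of that upload, the "Backing off" entries in the log above can be pulled out and converted to seconds. This is only a sketch: `backoff_seconds` is a hypothetical helper, and the pattern assumes the "N min M sec" wording seen in this particular log, so any other wording the client might use is not handled. It says nothing about how the client actually chooses the delays.

```python
import re

# "Backing off" entries copied from the event log in the post above.
LOG = """\
12/29/2012 9:37:22 PM | SETI@home | Backing off 3 min 56 sec on upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:38:38 PM | SETI@home | Backing off 4 min 15 sec on upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:39:06 PM | SETI@home | Backing off 9 min 57 sec on upload of 08oc12ab.6744.8247.6.10.239_0_0
12/29/2012 9:39:13 PM | SETI@home | Backing off 6 min 0 sec on download of 08oc12aa.13907.128071.13.10.99
12/29/2012 9:41:37 PM | SETI@home | Backing off 22 min 6 sec on upload of 08oc12ab.6744.8247.6.10.239_0_0
"""

# Assumption: only the "N min M sec" form that appears in this log is matched.
BACKOFF = re.compile(
    r"Backing off (\d+) min (\d+) sec on (upload|download) of (\S+)"
)


def backoff_seconds(log_text, kind="upload"):
    """Return the announced backoff intervals, in seconds, for one transfer
    kind ('upload' or 'download'), in the order they appear in the log."""
    intervals = []
    for match in BACKOFF.finditer(log_text):
        minutes, seconds, found_kind, _name = match.groups()
        if found_kind == kind:
            intervals.append(int(minutes) * 60 + int(seconds))
    return intervals


print(backoff_seconds(LOG))              # upload backoffs
print(backoff_seconds(LOG, "download"))  # download backoffs
```

For this log the upload backoffs are 236, 255, 597 and 1326 seconds, and the single download backoff is 360 seconds.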
| Copyright © 2013 University of California |