Panic Mode On (116) Server Problems?
Author | Message |
---|---|
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
My issue occurred 15 hours after the outage recovery and before this morning's short outage. The RTS buffer was fully stocked by then. Also, I got every download I asked for without any issues, except for the six or so stuck tasks on each host. The common factor was that every host had the same elapsed time on the stuck tasks, or within a couple of minutes of each other, since the hosts normally sync up on scheduler request timers. So they all hit the servers at approximately the same time and ended up with stuck tasks. So I don't think it was because the servers were being hit with any different amount of traffic at that time. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
"could the download problems be caused by too many people trying to get WUs all at the same time? Too many connections at once? Is it usually after an outage like we had today and yesterday?" I had a few connection problems after the second outage today, but they downloaded OK after a few retries. I also had problems accessing this website, with - apparently - the Cloudflare service failing to get me a secure connection and giving me a 'forbidden' plain http connection instead. So I went out to the pub, and it was fine when I got back. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"I had a few connection problems after the second outage today, but they downloaded OK after a few retries." . . That seems to fall in with Wiggo's cure. :) Stephen :) |
Wiggo Joined: 24 Jan 00 Posts: 34984 Credit: 261,360,520 RAC: 489 |
It works most of the time for me, but it's a bummer during Coffee O'clock like today. :-D Cheers. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I had a brief occurrence of backoffs and http errors after this morning's brief outage. But they cleared rather fast once I started hitting the retry button in BoincTasks. The running elapsed timer on stalled active downloads with no backoffs and no progress is a completely different problem from all the various download issues we have experienced in the last few months. It has only shown up in the past couple of weeks and always occurs in the wee hours of the morning. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"I had a brief occurrence of backoffs and http errors after this morning's brief outage. But they cleared rather fast once I started hitting the retry button in BoincTasks." . . The problem I noticed was on 2 of the slower machines, but as it self-corrected I didn't think to check the faster Linux rig. I should have! It had been completely out of work for about 2 hours with 90 stalled downloads. One click on retry got the d/l's running immediately, but I didn't stop to see what the errors were. It must have started about 4 hours prior to that. . . Oh well, they are back to working again now. Stephen :) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
This screenshot shows what I mean by "stuck" downloads: stuck_downloads.png Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
"This screenshot shows what I mean by 'stuck' downloads" Interesting, and no, I've never seen that before either. There should be a timeout in your TCP/IP stack somewhere. BOINC has <http_transfer_timeout>seconds</http_transfer_timeout> in cc_config.xml: I've got that one turned down to 60 seconds, on the basis that if it ain't happened, it ain't gonna happen. ('Abort' doesn't mean throw the file away: just stop trying this time, back off, and try again later.) The thing I've tried in years past is to 'Suspend network activity' from the 'Activity' menu in BOINC Manager, count to ... ooh, some random number or other ... and turn it back on again. At least you can do that without interrupting the running tasks and wasting time while they start again from checkpoint. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I had turned that down to 90 seconds in the past. But when we started getting the download issues, I removed the value and went back to the standard 300 seconds. [Edit] Coulda, shoulda thought of that on my own. Suspending and resuming network activity does eventually wake up the download process on the stuck tasks. I just tried it on another machine with the same half dozen stuck downloads. It saves time, as you say, and doesn't inflict a restart on tasks. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Mr. Kevvy Joined: 15 May 99 Posts: 3776 Credit: 1,114,826,392 RAC: 3,319 |
|
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
Shortest Tuesday out(r)age yet ever today I think. I don't know whether this forebodes anything but I will try to stay positive. :^) Very nice short outage. Good Job, seti team. |
Boiler Paul Joined: 4 May 00 Posts: 232 Credit: 4,965,771 RAC: 64 |
shockingly short outage |
Gary Charpentier Joined: 25 Dec 00 Posts: 30702 Credit: 53,134,872 RAC: 32 |
What did they forget to do? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"What did they forget to do?" I wonder too. The one last week was short as well, but it needed another outage later in the day. Will the pattern repeat? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Is it only me, or are the site & the DLs very slow? |
W-K 666 Joined: 18 May 99 Posts: 19114 Credit: 40,757,560 RAC: 67 |
"Is it only me, or are the site & the DLs very slow?" Looking from here, it's working fine: 4 to 5 secs/download. |
Wiggo Joined: 24 Jan 00 Posts: 34984 Credit: 261,360,520 RAC: 489 |
No problems here either. Cheers. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I've been noticing that the website server has been slow to serve pages since the outage. No issues with downloads. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Web pages are very slow to show, 5-10 secs to load a single page. DLs are happening with no issues but very slowly (around 15 KBps instead of the normal > 256 KBps). ISP speed test is normal (170 MBps). |
W-K 666 Joined: 18 May 99 Posts: 19114 Credit: 40,757,560 RAC: 67 |
"I've been noticing that the website server has been slow to serve pages since the outage. No issues with downloads." Neither of those issues has been seen here at any time since the outage. |
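
For anyone wanting to try the http_transfer_timeout setting Richard Haselgrove describes earlier in the thread, here is a minimal sketch of how it might look in cc_config.xml. It assumes the element belongs inside the <options> section of the file; the value 60 is simply the one Richard reports using, and leaving the element out should fall back to the 300-second default that Keith Myers mentions. Check your own BOINC version's documentation before relying on it.

```xml
<cc_config>
  <options>
    <!-- Assumed placement: inside <options>.
         Give up on an HTTP transfer that has stalled for this many
         seconds; per the thread, the client then backs off and
         retries later rather than discarding the file. -->
    <http_transfer_timeout>60</http_transfer_timeout>
  </options>
</cc_config>
```

The client needs to re-read its config file (or be restarted) before a changed value takes effect. A shorter timeout only makes a stalled transfer give up sooner; it does not address whatever caused the stall on the server side.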
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.