Panic Mode On (110) Server Problems?
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
This makes for a long day now :( And then they cleared. Hopefully they'll stay good now. Edit- now it'd be nice if the splitters could finally get going, and keep going. But with all those deletions backing up... Grant Darwin NT |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Watched a documentary and came back and gave the stalled downloads a try. Cleared them all up but now no work is available. If past experience holds, I will wake up tomorrow morning to full caches on all machines. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stargate (SA) Send message Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
Now it's lag time right on queue |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"Now it's lag time right on queue" Yes, but much shorter tonight. Only about ten minutes for the notch in the Haveland graphs. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stargate (SA) Send message Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
Not sure what that is, but it happens around 5pm through to 6pm every day, Adelaide time. Right after 6pm everything is running fine, lol. Could be the transition of time zones (fast then slow, then vice versa). All I know is that Seti is the only one affected; all other web sites work like normal. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
My experience also. No other websites exhibit the phenomenon, only SETI. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Maybe something runs at the servers at this time, scheduled since it's always at the same time, and slows down everything. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Yes, that is what I suspect. It seems to run on all the exposed servers in the Haveland graphs. So all the splitters, validators, purgers etc. for both AP and MB. Same for all the schedulers, up/down servers and the replica database. Since it's network related, I wonder if the routers or backup power supplies have a 15 minute update period or do some sort of internal housekeeping. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
The Haveland graphs are drawn from exactly the same data as we see on the server status page ("SETI@home server status information is also available in XML") - I debugged that when it wouldn't show SaH v8 for some time after we started using that. So, there are three possibilities for that gap in the line. 1) Every single server pauses at exactly the same time. 2) One server - the one which collects the data - pauses. 3) The XML data is inaccessible over the internet, either because of server connection failures, or because of router and line congestion. I think the second two are both more likely than the first. Richard Haselgrove <redacted> 26/11/16 at 10:21 AM |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks for the comment Richard about where the Haveland graphs get their data. I think that #2 is the likely cause as the break in the graphs seems to occur regularly every night at almost exactly the same time. I would think that #3 would be more variable as the data going over the connection is likely a lot more variable in its traffic load. So do we know the name of the server that pulls the XML data from the SSP to publish to the Haveland graphs? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
"Thanks for the comment Richard about where the Haveland graphs get their data. I think that #2 is the likely cause as the break in the graphs seems to occur regularly every night at almost exactly the same time. I would think that #3 would be more variable as the data going over the connection is likely a lot more variable in its traffic load." Wrong question. The same server renders the data, whether it's requested in HTML form or XML form - it's all done in the single sah_status.php file I linked. So I guess that would be muarae1 - the web (and god knows what else) server that you complain about being unresponsive each morning. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
"So I guess that would be muarae1 - the web (and god knows what else) server that you complain about being unresponsive each morning." Web site & forums become slow/unresponsive/timeout. Scheduler is out of reach. No server status updates (Haveland graphs). It's generally a 30-45 min period. Lately 45 min has been more common. And it's now occurring about 1 hour later than it used to. Edit- I can't remember when this started occurring, but I'm pretty sure it was very late last year (Nov, Dec?) Grant Darwin NT |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
For me, granted I have not been sitting in front of the computer at exactly the same time every night, the unresponsiveness occurs around 07:15 UTC and usually lasts for 30 - 45 minutes and the site becomes available around 07:45 UTC. Anyone follow up my post and look into the Haveland graphs and verify what I see with regard to the UTC time under each graph? I see the graph legend off by 1 hour UTC at all times. But the graph dropout is exactly in sync with when I have the site go unresponsive. For example my computer indicates the time is 01:38 UTC 15 Feb 2018 and the Haveland graphs are all showing 02:30 UTC 15 Feb 2018. That accounts for the SSP ten minute update cycle. They are showing a DST offset from last November still. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
The number of results awaiting purge has now exceeded 7 million. That has made it impossible to view any of my tasks on my fastest crunchers because the database times out. They need to get those results purged and back down to reasonable levels. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stargate (SA) Send message Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
It might get done at the "Lag o'Clock" period :/ |
Stargate (SA) Send message Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
5:20pm and so far no lag looks promising |
Ghia Send message Joined: 7 Feb 17 Posts: 238 Credit: 28,911,438 RAC: 50 |
Started here at 7:13 UTC. Humans may rule the world...but bacteria run it... |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
"5:20pm and so far no lag looks promising" Didn't notice any web site issues (not that I was doing much here at the time), but as per usual from 16:45 till 17:25 (CST (Australia)) no Scheduler contact was possible. Edit- And the Haveland graphs show the usual small gap, then drop & surge in Received-last-hour numbers. Grant Darwin NT |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Missed it. Was watching the telly. Came in to check on the computers and saw they were down on work. Looked back through the logs and see the first no connection event at 07:09 UTC. The big jump in returned tasks is a good telltale that many others were unable to contact the servers to report and get new work. Keith-Windows7 3196 SETI@home 2/14/2018 23:09:06 Sending scheduler request: To fetch work. 3197 SETI@home 2/14/2018 23:09:06 Reporting 4 completed tasks 3198 SETI@home 2/14/2018 23:09:06 Requesting new tasks for CPU and NVIDIA GPU 3199 SETI@home 2/14/2018 23:09:28 Scheduler request failed: Couldn't connect to server 3200 2/14/2018 23:09:29 Project communication failed: attempting access to reference site 3201 2/14/2018 23:09:31 Internet access OK - project servers may be temporarily down. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
"Anyone follow up my post and look into the Haveland graphs and verify what I see with regard to the UTC time under each graph? I see the graph legend off by 1 hour UTC at all times" My Haveland times have always been out 1h for me - for as long as I can remember. I have always thought it was because of the strange time zone I'm in that switches between Central and Mountain. |
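
The outage times reported in this thread are given on three different clocks: UTC, Keith's local event-log time, and Grant's Australian CST. As a rough cross-check, the Python sketch below converts them all to UTC. The fixed offsets are assumptions: UTC-8 for Keith's log is inferred from his own statement that the 23:09 log entry corresponds to 07:09 UTC, and UTC+9:30 is Australian Central Standard Time without daylight saving, as used in Darwin. The calendar date attached to Grant's window is also assumed, and `to_utc` is only a helper name for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets (illustrative only, not read from BOINC or the server).
# UTC-8 is inferred from Keith's post: 23:09 local matches his stated 07:09 UTC.
KEITH_LOCAL = timezone(timedelta(hours=-8))
# Australian Central Standard Time (Darwin, no daylight saving) is UTC+9:30.
ACST = timezone(timedelta(hours=9, minutes=30))


def to_utc(stamp: str, fmt: str, tz: timezone) -> datetime:
    """Parse a naive local timestamp and return it as an aware UTC datetime."""
    local = datetime.strptime(stamp, fmt).replace(tzinfo=tz)
    return local.astimezone(timezone.utc)


# First failed scheduler request from the event log quoted in the thread.
keith_fail = to_utc("2/14/2018 23:09:28", "%m/%d/%Y %H:%M:%S", KEITH_LOCAL)

# Grant's reported scheduler outage window.
# The date (15 Feb 2018, Darwin local time) is an assumption.
outage_start = to_utc("2018-02-15 16:45", "%Y-%m-%d %H:%M", ACST)
outage_end = to_utc("2018-02-15 17:25", "%Y-%m-%d %H:%M", ACST)

print("Keith first failure (UTC):", keith_fail.isoformat())
print("Grant outage window (UTC):", outage_start.isoformat(), "to", outage_end.isoformat())
print("Gap between the two reports:", outage_start - keith_fail)
```

Under these assumptions the first failed request in Keith's log lands at 07:09 UTC and Grant's window runs from 07:15 to 07:55 UTC, which is consistent with Ghia's report that the lag started at 7:13 UTC.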