Panic Mode On (109) Server Problems?
Author | Message |
---|---|
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Seems to be particularly Bad at the moment. All three Linux machines are being refused work, something about 'Project has no tasks available' even though RTS is hovering around 600k. I suspect that is a Red Herring and only appears due to asking for AP work. If you ask for just MB work, as with my Mac, the Server can't come up with a reason for Not sending work and merely says, No Tasks Sent. Fortunately, the Mac is still being sent work suggesting prejudice, but alas, even Windows machines are denied work on occasion. Strange how it always seems to start with the Linux machines though. One machine is right at 100 tasks in the red, will it recover, or will the Server run it out of work? |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
The Server has decided to continue to Strangle 2 of my Linux machines. The third machine recovered to a full cache, for a while, and then continued to be sent fewer tasks than were being reported. The other two are getting very low on tasks and it appears the server would be content to run them out of GPU work. Both machines have 3 GPUs and should have around 330 tasks in the 1 day cache: State: All (2474) · In progress (97) · Validation pending (1092); State: All (1898) · In progress (98) · Validation pending (785). Strange how the Server sends them about the same number of tasks instead of the number the Host is Requesting. What are the chances that after hours one is at 97 and the other 98? I suppose I'm going to have to intervene or else watch the Server Starve my machines to death. Now down to 94 & 95. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"The Server has decided to continue to Strangle 2 of my Linux machines." It's not just Linux systems. I've had to triple update my Win10 i7 several times this morning to keep the work coming. Edit- even my Vista C2D has required a bump or 2. Edit- and it's not surprising. Very little Arecibo work about, and no AP. When there is Arecibo & AP work available, then you can get MB without extra effort being needed. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"Edit- even my Vista C2D has required a bump or 2." Make that several bumps. Struggling to get work on either system. Grant Darwin NT |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Only the Linux machine is massively down. It is down about 200 tasks. It is getting about 11 tasks per request on average for the past couple of hours. Nowhere near enough to replenish tasks after task retirement per request. Triple Update isn't doing much for it. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
My host gets a lot of WUs, but all are Arecibo VLARs; maybe that is why your host did not get any new GPU work. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
The two computers that got any tasks of any amount (36) got Arecibo shorties and BLC. Need a couple of hundred for each machine. Could an Arecibo VLAR storm be the cause again? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Nah, it's not a VLAR Storm. I have 3 computers that are down on CPU tasks too. Nothing is coming out the pipe ... or at least, not much. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Yes, I have to agree. The pattern is nothing coming out of the servers. Down on CPU tasks on all machines too. So, no handicap there on VLARs. The fastest CPU machines, the Ryzens, can't get enough CPU work either. Lots of "no tasks to send" messages. [Edit] The Haveland graphs support that assessment. Number of tasks in progress is dropping over the last couple of hours. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
In Progress seems to be dropping. I suppose that means the Server is sending fewer tasks. I thought a VLAR Storm was alleged to be impossible with BLC tasks running on GPUs. A couple machines just got topped off, another is still down by over 200 tasks. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"I thought a VLAR Storm was alleged to be impossible with BLC tasks running on GPUs." We had one for several hours last week (or was it the week before?). But this isn't that. It's just the problem with the Scheduler: little or no AP or Arecibo WUs means not even GBT work will be allocated when it's requested. When the Arecibo and/or AP work becomes available again, then we'll be able to get GBT WUs again. It's been this way for 12 months. Grant Darwin NT |
Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
My read of the way the GBT splitters act is they run fine until the server cache reaches 600k, then they shut off and don't restart (or just dribble a little) and the Arecibo splitters take over. When the server does its Expired/TimedOut task check, the GBT splitters restart. I always see a blast of resends on at least 1 computer just before the GBT tasks reappear. They run for approx 1/2 an hour and shut down again. So until the server does its check, it is possible to get stuck in a VLAR storm with no GBT being split. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I'm down to my last 2 gpu tasks on the Linux cruncher. Guess it's going to do Einstein for the rest of the evening. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"I'm down to my last 2 gpu tasks on the Linux cruncher. Guess it's going to do Einstein for the rest of the evening." Even my C2D is struggling to get work today. Usually it doesn't have a problem, even when the i7 does. Grant Darwin NT |
Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
This seems eerily like the "Database slowness" (as Eric called it) back on 10 Nov: "We're having some as yet unexplained slowness with our BOINC database. There don't seem to be any hardware issues. Temperatures are running normal and all the drives seem good. Yet for some reason the query that fills the 'ready to send' queue is running about 10 times slower than it normally does." That was a holiday weekend, too. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Can't connect to server for the last 10 minutes. Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | Sending scheduler request: To fetch work. Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | Reporting 2 completed tasks Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | Requesting new tasks for CPU and NVIDIA GPU Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] HTTP_OP::init_post(): http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Info: Trying 208.68.240.126... Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Info: Connected to setiboinc.ssl.berkeley.edu (208.68.240.126) port 80 (#16) Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: POST /sah_cgi/cgi HTTP/1.1 Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: Host: setiboinc.ssl.berkeley.edu Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: User-Agent: BOINC client (x86_64-pc-linux-gnu 7.8.3) Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: Accept: */* Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: Accept-Encoding: deflate, gzip Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: Content-Type: application/x-www-form-urlencoded Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: Accept-Language: en_US Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: Content-Length: 36896 Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: Expect: 100-continue Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Sent header to server: Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Received header from server: HTTP/1.1 100 Continue Sat 30 Dec 2017 09:05:59 PM PST | SETI@home | [http] [ID#1] Info: We are completely uploaded 
and fine Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: HTTP/1.1 500 Internal Server Error Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: Date: Sun, 31 Dec 2017 05:05:59 GMT Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: Server: Apache/2.2.15 (Scientific Linux) Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: Content-Length: 647 Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: Connection: close Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: Content-Type: text/html; charset=iso-8859-1 Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <html><head> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <title>500 Internal Server Error</title> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: </head><body> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <h1>Internal Server Error</h1> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <p>The server encountered an internal error or Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: misconfiguration and was unable to complete Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: your request.</p> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <p>Please contact the server administrator, Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received 
header from server: boincadm@ssl.berkeley.edu and inform them of the time the error occurred, Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: and anything you might have done that may have Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: caused the error.</p> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <p>More information about this error may be available Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: in the server error log.</p> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <hr> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: <address>Apache/2.2.15 (Scientific Linux) Server at setiboinc.ssl.berkeley.edu Port 80</address> Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Received header from server: </body></html> Sat 30 Dec 2017 09:06:40 PM PST | | [http_xfer] [ID#1] HTTP: wrote 647 bytes Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | [http] [ID#1] Info: Closing connection 16 Sat 30 Dec 2017 09:06:40 PM PST | SETI@home | Scheduler request failed: HTTP internal server error Seti@Home classic workunits:20,676 CPU time:74,226 hours ![]() ![]() A proud member of the OFA (Old Farts Association) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"This seems eerily like the 'Database slowness' (as Eric called it) back on 10 Nov." . . I guess the servers have a calendar with the holidays marked on it? A shame they are union servers :) Stephen :) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"Can't connect to server for the last 10 minutes." There are almost always issues with the web site & scheduler around this time of day for anything from just 10min to 45min. Grant Darwin NT |
Ghia Joined: 7 Feb 17 Posts: 238 Credit: 28,911,438 RAC: 50 |
Hi, everyone...wish you a Happy New S@H Year ! Just a small question : I have the usual 5 minutes back-off time after a manual update request. Is there supposed to be an automatic scheduler request when that time runs out ? I seem to remember that used to be the case, but now the time just runs out and nothing happens. It may take over 30 minutes before a new scheduler request is sent to the server. This doesn't have any negative consequences, of course...just finding it weird. I haven't done any changes to Boinc or S@H settings for many months...the only thing out of the ordinary is that my system was down for a week before Christmas when my monitor died. ...Ghia... Humans may rule the world...but bacteria run it... |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"Hi, everyone...wish you a Happy New S@H Year !" It depends. If your cache is full, then it won't make another request till the next WU has been completed and uploaded. But if your cache isn't full, even if another WU hasn't been completed, then after the 5min 3 sec delay it will ask for work again (although that also depends on your "Store up to an additional x days" setting). The larger that value, the longer it will wait (i.e. the more WUs that will have to be returned) before requesting more work. Grant Darwin NT |
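
For anyone who wants the last answer spelled out, here is a rough Python sketch of the behaviour Grant describes. It is only a simplified model built on assumptions, not the BOINC client's actual code: the names `CacheState`, `should_request_work` and `days_to_request` are made up for illustration, and the low-water/high-water reading of the two "store ... days" preferences is an assumption about how they interact.

```python
from dataclasses import dataclass

# The "5min 3 sec" delay mentioned in the post above.
BACKOFF_SECONDS = 5 * 60 + 3


@dataclass
class CacheState:
    """Hypothetical snapshot of a host's work cache (all names are illustrative)."""
    buffered_days: float          # estimated days of work currently on the host
    min_days: float               # assumed "store at least X days" preference
    extra_days: float             # "store up to an additional X days" preference
    seconds_since_request: float  # time since the last scheduler request


def should_request_work(state: CacheState) -> bool:
    """Ask for work only once the back-off has expired and the cache is below its low-water mark."""
    if state.seconds_since_request < BACKOFF_SECONDS:
        return False
    return state.buffered_days < state.min_days


def days_to_request(state: CacheState) -> float:
    """Amount of work needed to refill the cache to min_days + extra_days."""
    target = state.min_days + state.extra_days
    return max(0.0, target - state.buffered_days)
```

In this model a larger `extra_days` means the cache is filled further above the low-water mark, so more WUs have to be returned before `should_request_work` becomes true again, which would match the "longer wait" Ghia is seeing.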
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.