Panic Mode On (107) Server Problems?

Message boards : Number crunching : Panic Mode On (107) Server Problems?

Keith Myers · Volunteer tester · United States
Message 1897580 - Posted: 26 Oct 2017, 16:24:38 UTC (last modified: 26 Oct 2017, 16:36:39 UTC)

Great, getting "internal server error failure" when requesting tasks after waking the server up with a kick.
[Edit] Got the same message on another machine after kicking the servers. The last machine got ONE task.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

Jeff Buck · Volunteer tester · United States
Message 1897612 - Posted: 26 Oct 2017, 18:37:05 UTC - in response to Message 1897580 (last modified: 26 Oct 2017, 18:39:55 UTC)

Dang! I guess I'm going to have to keep an eye on my boxes today. I just discovered that my #1 cruncher has been out of GPU work for about an hour and a half. Moved everything left on the CPUs, except for Arecibo VLARs, over to the GPUs (66 tasks). That'll tide the GPUs over for a little while, at least.

EDIT: My #2 and #3 boxes are down just a bit, but nothing out of the ordinary if there's an Arecibo VLAR storm. Only the #1 box has taken the big hit.

Cruncher-American · United States
Message 1897614 - Posted: 26 Oct 2017, 18:46:26 UTC

I am getting "Project has no tasks available" on every work request on both of my crunchers, and they are each now down to about half of their max number of tasks. But the server page shows about 700K available, and a reasonable creation rate*.
Is anybody at the datacenter to kick these machines?

*Whoops, rate is ~1/sec, WAY too low.

rob smith · Volunteer moderator, Volunteer tester · United Kingdom
Message 1897620 - Posted: 26 Oct 2017, 20:08:49 UTC

The ready-to-send buffer should be sitting at around 600k, so if that's hit 700k then it's OK not to be producing any more tasks.
What is more concerning is that the delivery rate appears to be pathetically low; it's taking quite a few requests to get anything.
Call in the tyre kicker to do his thing.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Richard Haselgrove · Volunteer tester · United Kingdom
Message 1897644 - Posted: 26 Oct 2017, 21:18:37 UTC - in response to Message 1897620

Another way of looking at that: there is work that needs doing, but the people helping the project aren't asking for that kind of work. Until people come forward to do the work that needs doing (my guess is Arecibo VLAR), it blocks the pipelines and gets in the way.

Jeff Buck · Volunteer tester · United States
Message 1897647 - Posted: 26 Oct 2017, 21:27:55 UTC - in response to Message 1897644

[quote]Another way of looking at that: there is work that needs doing, but the people helping the project aren't asking for that kind of work. Until people come forward to do the work that needs doing (my guess is Arecibo VLAR), it blocks the pipelines and gets in the way.[/quote]
Don't think it's Arecibo VLARs this time. I ended up moving everything (except running tasks, of course) from the CPU queue to the GPU queue on my #1 box, Arecibo VLARs included. That left 96 places available for anything the scheduler wanted to send to the CPU, so any Arecibo VLARs would have been welcome. Sadly, nothing arrived, and once the GPUs ran out for the final time, I just shut it down (a couple hours earlier than would normally happen on a weekday afternoon). Something at the server end definitely seems to be constipated today.

Stephen "Heretic" · Volunteer tester · Australia
Message 1897651 - Posted: 26 Oct 2017, 21:52:06 UTC - in response to Message 1897647

[quote]... Arecibo VLARs included. That left 96 places available for anything the scheduler wanted to send to the CPU, so any Arecibo VLARs would have been welcome. Sadly, nothing arrived, and once the GPUs ran out for the final time, I just shut it down (a couple hours earlier than would normally happen on a weekday afternoon). Something at the server end definitely seems to be constipated today.[/quote]


. . Yep, I awoke to find empty machines twiddling their thumbs ...

Stephen

:(

Richard Haselgrove · Volunteer tester · United Kingdom
Message 1897652 - Posted: 26 Oct 2017, 21:57:28 UTC - in response to Message 1897647

[quote]Don't think it's Arecibo VLARs this time.[/quote]
Really? I've just kicked a CPU-only cruncher, and got 53 new tasks at the first time of asking. Now I have to re-balance the cache with other projects...

Sorry, I've had a particularly bad intrusion of real life this week (up to and including police interaction), and I'm feeling in a particularly forensic state of mind this evening.

Jeff Buck · Volunteer tester · United States
Message 1897660 - Posted: 26 Oct 2017, 22:20:20 UTC - in response to Message 1897652

[quote]Don't think it's Arecibo VLARs this time.
Really? I've just kicked a CPU-only cruncher, and got 53 new tasks at the first time of asking. Now I have to re-balance the cache with other projects...[/quote]
Heh, heh. Mamma Berkeley likes you best! ;^)

The queues on my other two Linux boxes are steadily draining but should survive until their normal shutdown in about 45 minutes. Still, I just went ahead and moved over 50 CPU tasks to the GPUs on one of them, to see if it might goose the scheduler in some way. No joy, though. My Win Vista box got 8 new tasks about 25 minutes ago, but that's been the only recent burp directed this way.

Stephen "Heretic" · Volunteer tester · Australia
Message 1897661 - Posted: 26 Oct 2017, 22:24:16 UTC - in response to Message 1897652

[quote]Don't think it's Arecibo VLARs this time.
Really? I've just kicked a CPU-only cruncher, and got 53 new tasks at the first time of asking. Now I have to re-balance the cache with other projects...

Sorry, I've had a particularly bad intrusion of real life this week (up to and including police interaction), and I'm feeling in a particularly forensic state of mind this evening.[/quote]


. . Hi Richard

. . Even if the problem is a glut of VLAR work, the fact that you had to kick the servers for a CPU-only rig does not bode well. It should be overflowing with the servers trying to give it work.

. . On the second issue, I am sorry to hear that; RL has a way of intruding ... :(

Stephen

:(

Richard Haselgrove · Volunteer tester · United Kingdom
Message 1897664 - Posted: 26 Oct 2017, 22:44:22 UTC - in response to Message 1897661

By 'kick', I meant preventing other projects from requesting work, and increasing the cache request fourfold - just to see what happened. I got what I asked for.

About to try the same for my GPU crunchers, now I've worked through the glut of VHAR shorties.
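[Editor's note] Richard's 'kick' can be reproduced with stock BOINC tooling: tell the client not to fetch from other projects, then raise the work-buffer request via the preferences override file. A minimal sketch follows; the data-directory path and the Einstein project URL are illustrative assumptions, not part of his post, and the `boinccmd` calls are skipped when the tool is absent.

```shell
# Sketch of a cache "kick", assuming a stock boinccmd install.
# BOINC_DIR and the project URL are illustrative; adjust for your setup.

BOINC_DIR="${BOINC_DIR:-/tmp/boinc-demo}"   # real installs: e.g. /var/lib/boinc-client
mkdir -p "$BOINC_DIR"

# 1. Stop another project from fetching work (no-op here if boinccmd is absent).
command -v boinccmd >/dev/null && \
    boinccmd --project http://einstein.phys.uwm.edu/ nomorework

# 2. Raise the cache request via the global preferences override file.
cat > "$BOINC_DIR/global_prefs_override.xml" <<'EOF'
<global_preferences>
   <work_buf_min_days>1.0</work_buf_min_days>
   <work_buf_additional_days>0.25</work_buf_additional_days>
</global_preferences>
EOF

# 3. Tell a running client to pick up the override without a restart.
command -v boinccmd >/dev/null && boinccmd --read_global_prefs_override

echo "override written to $BOINC_DIR/global_prefs_override.xml"
```

The next scheduler RPC after the override is read will request the larger buffer, which is the "I got what I asked for" effect described above.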

Keith Myers · Volunteer tester · United States
Message 1897674 - Posted: 26 Oct 2017, 23:38:14 UTC

It's possible the server's constipation might have abated. I've received new work across all machines and am closing back in on full caches.

Stephen "Heretic" · Volunteer tester · Australia
Message 1897676 - Posted: 26 Oct 2017, 23:54:38 UTC - in response to Message 1897664

[quote]By 'kick', I meant preventing other projects requesting work, and increasing the cache request fourfold - just to see what happened. I got what I asked for.

About to try the same for my GPU crunchers, now I've worked through the glut of VHAR shorties.[/quote]


. . OK, fair enough. I had no work for a couple of hours, but I'm getting work again now.

Stephen

..

Keith Myers · Volunteer tester · United States
Message 1897685 - Posted: 27 Oct 2017, 1:05:34 UTC

I wish I had a snapshot of the SSP page from Monday, before the outage. Does the SSP look like the project has started more validators and assimilators to you? And for the first time in memory, all my machines have more validated tasks than pending tasks. I know there is always a bump after the outage, but I don't remember it hanging on for two days past the outage.

Keith Myers · Volunteer tester · United States
Message 1897822 - Posted: 28 Oct 2017, 4:43:58 UTC - in response to Message 1897685

Down to 10 GPU tasks on the Linux machine, and nothing but "no work available" messages for the past couple of hours. Time to kick the servers upside the head. This is getting tiresome.

Richard Haselgrove · Volunteer tester · United Kingdom
Message 1897844 - Posted: 28 Oct 2017, 9:27:29 UTC - in response to Message 1897822

[quote]Down to 10 gpu tasks on the linux machine. No work available messages for the past couple of hours. Time to kick the servers upside the head. This is getting tiresome.[/quote]

28/10/2017 09:58:27 | SETI@home | Scheduler request completed: got 139 new tasks
... about 30 minutes ago, my time zone.

My current practice is to batch-fetch SETI work every few hours, because BOINC doesn't provide the tools to set a long cache on Tuesdays (to cover maintenance) and a short cache at all other times (so that urgent GPUGrid tasks can fit their 15 hours of computing into a 24-hour window, as that project would like). Keeping Einstein's (steady and easily refilled) queue down below 50 or so also helps me to see what's going on.

So I switch my cache between 0.25 days (steady running) and maybe 1.25 days (fully cache SETI to my 200-task GPU limit). It normally works first time, as it did today. I don't normally need 139 tasks, but GPUGrid had a problem overnight and that card switched to help out SETI. If you fully understand BOINC's work-fetch strategy (nobody does, but the basics are fairly easy) and work within it, you can usually meet your chosen projects' needs.
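[Editor's note] Richard's 0.25-day / 1.25-day switch amounts to rewriting the work-buffer override and having the client re-read it. A minimal sketch of that toggle, assuming a stock `boinccmd` install; the directory default, function name, and `work_buf_additional_days` value are illustrative assumptions, not his exact setup.

```shell
# Sketch: flip the BOINC work buffer between "steady" (0.25 days) and
# "batch fetch" (1.25 days), per the strategy described above.

BOINC_DIR="${BOINC_DIR:-/tmp/boinc-demo}"   # assumption: demo path
mkdir -p "$BOINC_DIR"

set_work_buffer() {   # usage: set_work_buffer <days>
    cat > "$BOINC_DIR/global_prefs_override.xml" <<EOF
<global_preferences>
   <work_buf_min_days>$1</work_buf_min_days>
   <work_buf_additional_days>0.01</work_buf_additional_days>
</global_preferences>
EOF
    # Nudge a running client to re-read prefs (skipped if boinccmd is absent).
    command -v boinccmd >/dev/null && boinccmd --read_global_prefs_override
    return 0
}

set_work_buffer 1.25   # batch fetch: one big scheduler request
# ... later, once the cache is full again:
set_work_buffer 0.25   # back to steady running
```

The point of the toggle is that the large request happens in one scheduler RPC, instead of leaving a permanently large cache that starves deadline-sensitive projects like GPUGrid.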

Keith Myers · Volunteer tester · United States
Message 1897888 - Posted: 28 Oct 2017, 17:13:21 UTC - in response to Message 1897844

Not sure what your reply has to do with the project refusing to send my machine work when it asks for it and has tons available. It only crunches SETI, so no other project is interfering. SETI repeatedly fails to keep my system's limited allotment of tasks topped up, and the machine runs out of work. I have to use the ghost task recovery protocol to wake the servers up to my machine's need for work. Needless work, if the servers would just feed it work when it asks every 305 seconds.

Zalster · Volunteer tester · United States
Message 1897909 - Posted: 28 Oct 2017, 18:03:30 UTC - in response to Message 1897888

Keith? Did you badmouth the servers?? You know how sensitive they are.... ;)

Keith Myers · Volunteer tester · United States
Message 1897915 - Posted: 28 Oct 2017, 18:30:43 UTC - in response to Message 1897909

LOL, no I didn't ..... at least not recently. I always held the servers in high respect, up until last December when they decided to see my machines as "second-class citizens" for some reason.

Stephen "Heretic" · Volunteer tester · Australia
Message 1897970 - Posted: 28 Oct 2017, 22:29:19 UTC - in response to Message 1897915

[quote]LOL, no I didn't ..... at least not recently. I always held the servers in high respect, up until last December when they decided to see my machines as "second-class citizens" for some reason.[/quote]


. . That was the "upgrade" at Berkeley. They've been flaky ever since.

Stephen

:(