The Server Issues / Outages Thread - Panic Mode On! (118)

Author	Message
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 2027791 - Posted: 15 Jan 2020, 23:18:08 UTC - in response to Message 2027749. Scheduler request happened and it took about 30 seconds to respond, but it responded and acknowledged all of them. Didn't get any new tasks though. Update:
2020-01-15 16:58:59 SETI@home [sched_op_debug] Starting scheduler request
2020-01-15 16:58:59 SETI@home Sending scheduler request: To fetch work.
2020-01-15 16:58:59 SETI@home Reporting 2 completed tasks, requesting new tasks
2020-01-15 16:58:59 SETI@home [sched_op_debug] CPU work request: 1685721.26 seconds; 0.00 idle CPUs
2020-01-15 16:59:04 SETI@home Scheduler request completed: got 21 new tasks
2020-01-15 16:59:04 SETI@home [sched_op_debug] Server version 709
2020-01-15 16:59:04 SETI@home Project requested delay of 303 seconds
2020-01-15 16:59:04 SETI@home [sched_op_debug] estimated total CPU job duration: 69015 seconds
2020-01-15 16:59:04 SETI@home [sched_op_debug] Deferring communication for 5 min 3 sec
2020-01-15 16:59:04 SETI@home [sched_op_debug] Reason: requested by project
I received some work. Once. So it's just going to be slow-going, but it'll all work out eventually. Be patient. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 2027791 ·

Jimbocous Volunteer tester Send message Joined: 1 Apr 13 Posts: 1859 Credit: 268,616,081 RAC: 1,349	Message 2027792 - Posted: 15 Jan 2020, 23:18:16 UTC - in response to Message 2027787. Last modified: 15 Jan 2020, 23:22:09 UTC The 200/call (if that's what is being talked about) is a hard limit. Unless I misunderstand, it's more about having increased the total CPU limit from 100 to 200, and the total per GPU limit from 100 to 400 (now 300). This caught a lot of people by surprise when it happened (again, lack of communication) and required rethinking cache size settings for folks. [edit] Would also be nice if those cache limits could be set per-project instead of (or in addition to) per-client across all projects, but it is what it is. ID: 2027792 ·

Cherokee150 Send message Joined: 11 Nov 99 Posts: 192 Credit: 58,513,758 RAC: 74	Message 2027793 - Posted: 15 Jan 2020, 23:22:30 UTC During my 20+ years with SETI we have had several outages that have lasted this long or longer. The good thing is that they are truly very few, and very far apart. If you look at the major corporations with large staff who have had many outages over the years, you have to wonder how Berkeley's very small staff has done so well! (Actually, I know why, because I have visited them. They are some of the most dedicated, enthusiastic, and extremely intelligent people I have ever met!!!) As to the questions about the backoff times getting longer and longer, it is a very astute bit of programming by the Berkeley staff. You see, each time the software has another backoff without a successful download, it lengthens the backoff time. This is necessary because, as time goes on, more and more computers begin begging for tasks. Without this code, when SETI's servers do come back online, the massive request for tasks creates a DOS attack that overloads their servers. This causes a new problem, which then greatly lengthens the time it takes to get back to normal. (Think of how effective deliberate DOS attacks have been at bringing down major websites for long periods of time.) Don't worry, though, they even thought to code an automatic reset back to a shorter wait time before the backoff gets too long. This ensures that none of us will be left waiting too long before our computers ask for tasks. Like I said, the staff at Berkeley is very good at what they do!!! :-) ID: 2027793 ·
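For anyone wondering what the growing backoff described above looks like in practice, here is a minimal sketch. It assumes a simple doubling rule and a reset to the base delay once the wait would pass a ceiling; the function names and all the numbers (`base`, `factor`, `ceiling`) are made up for illustration and are not the values the BOINC client actually uses.

```python
from typing import List


def next_backoff(current: float, base: float = 60.0,
                 factor: float = 2.0, ceiling: float = 4 * 3600.0) -> float:
    """Return the wait in seconds to use after one more failed request.

    All defaults are hypothetical illustration values.
    """
    if current <= 0:
        # First failure: start from the base delay.
        return base
    grown = current * factor
    # Once the wait would exceed the ceiling, drop back to the base delay
    # so no client is left waiting indefinitely.
    return base if grown > ceiling else grown


def backoff_schedule(failures: int, **kwargs: float) -> List[float]:
    """List the waits produced by a run of consecutive failed requests."""
    waits: List[float] = []
    wait = 0.0
    for _ in range(failures):
        wait = next_backoff(wait, **kwargs)
        waits.append(wait)
    return waits


if __name__ == "__main__":
    # With a 5-minute ceiling the wait doubles three times, then resets.
    print(backoff_schedule(5, base=60.0, factor=2.0, ceiling=300.0))
```

Real clients usually add some random jitter to each wait as well, so that thousands of machines do not all retry at the same instant; that is left out here to keep the sketch short.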

Thomas Womack Volunteer tester Send message Joined: 7 Sep 03 Posts: 4 Credit: 4,912,298 RAC: 64	Message 2027794 - Posted: 15 Jan 2020, 23:26:38 UTC Hi all, my first posting here even though I have been processing work units for years and NEVER had a problem getting timely downloads or uploads accomplished until today...it seems that something is very much amiss with the SETI servers that isn't reflected on the status page, or maybe I just don't know how to read it. At any rate, can't seem to get new work units at all today and wondering if the powers that be know about the problem or if this is new? Would love an update on what's happening so I know what to do with the many devices I have trying to run this program. If I should just shut them off and wait for a large star to appear in the western sky? Could we have an update on what's happening???? thanks, Tom ID: 2027794 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 2027796 - Posted: 15 Jan 2020, 23:32:08 UTC I believe I mentioned that some time ago. To lower the Total number of outstanding tasks on the Server simply lower the Cache limit to One Day. The machines that can complete 5000 tasks a day will get 5000 tasks a day no matter what kind of task limits you set. On the other hand, the machines that only complete 100 tasks a day will have 100 tasks on their machine instead of 1000. Why should you have all those tasks assigned to machines that don't need them? All you are doing is causing problems for the Server by having 1000 tasks on machines that complete 100 a day. Again, if a machine completes 10000 tasks a day, then it will get 10000 tasks a day no matter if it's in the In Progress column or the Pending column. If you lower the In Progress then the Pending will rise to reach 10000. It doesn't matter which column the tasks are in, it still totals 10000 to the Server at the end of the day. ID: 2027796 ·
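The arithmetic behind the one-day-cache argument above can be put in a few lines. This is only a back-of-the-envelope sketch: `tasks_on_hand` is a made-up helper, and it assumes a machine simply holds its daily throughput multiplied by the cache length in days, ignoring any server-side per-device limits.

```python
def tasks_on_hand(tasks_per_day: int, cache_days: float) -> int:
    """Rough count of tasks a host keeps queued for a given cache length.

    Hypothetical model: queue size = daily throughput * cache length.
    """
    return round(tasks_per_day * cache_days)


if __name__ == "__main__":
    # A slow host (100 tasks/day) with a 10-day cache vs. a 1-day cache.
    print(tasks_on_hand(100, 10))   # ten days of work held on the host
    print(tasks_on_hand(100, 1))    # one day of work held on the host
    # A fast host still gets a full day of work with a 1-day cache.
    print(tasks_on_hand(5000, 1))
```

Under that assumption the slow host drops from 1000 queued tasks to 100 when the cache goes from ten days to one, while the fast host's daily throughput is unchanged.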

Jord Volunteer tester Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3	Message 2027798 - Posted: 15 Jan 2020, 23:34:41 UTC 8.24 is now available on main for AMD GPUs: https://setiathome.berkeley.edu/apps.php Just as Eric promised. ID: 2027798 ·

SalemWill Send message Joined: 30 Jul 05 Posts: 1 Credit: 228,091 RAC: 5	Message 2027799 - Posted: 15 Jan 2020, 23:35:44 UTC - in response to Message 2027794. I am having the same issue. States there is no new work to be downloaded. ID: 2027799 ·

Jimbocous Volunteer tester Send message Joined: 1 Apr 13 Posts: 1859 Credit: 268,616,081 RAC: 1,349	Message 2027800 - Posted: 15 Jan 2020, 23:37:44 UTC - in response to Message 2027796. ... To lower the Total number of outstanding tasks on the Server simply lower the Cache limit to One Day. .. Agreed. Would be more effective, but again the issue is that since it's a BOINC limit rather than a SETI limit, I'm not sure how you do that? Perhaps it's a reflection of how users and the internet have changed over the past 20 years since this was all originally designed. The model no longer fits, in that one regard. ID: 2027800 ·

Pierre A Renaud Send message Joined: 3 Apr 99 Posts: 998 Credit: 9,101,544 RAC: 65	Message 2027801 - Posted: 15 Jan 2020, 23:39:37 UTC - in response to Message 2027794. A normal Tuesday maintenance outage turned into a (lengthy) 24+ hrs outage (for reasons to be specified by the staff when they can), which is rather rare. Now the servers are catching up with demand. Come to this thread to get the latest news on server Issues / Outages. Hi all, my first posting here even though I have been processing work units for years and NEVER had a problem getting timely downloads or uploads accomplished until today...it seems that something is very much amiss with the SETI servers that isn't reflected on the status page, or maybe I just don't know how to read it. At any rate, can't seem to get new work units at all today and wondering if the powers that be know about the problem or if this is new? Would love an update on what's happening so I know what to do with the many devices I have trying to run this program. If I should just shut them off and wait for a large star to appear in the western sky? Could we have an update on what's happening???? thanks, Tom Apr 3, 1999 - May 3, 2020 ID: 2027801 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 2027802 - Posted: 15 Jan 2020, 23:39:55 UTC - in response to Message 2027773. Good News - Beta is giving out work Bad News - Main is somewhat constipated with the RTS at >1,000,000 and nothing getting out (for me) As others have said - something has been done to the servers. . . You are not alone ... Stephen :( ID: 2027802 ·

Tom M Volunteer tester Send message Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462	Message 2027803 - Posted: 15 Jan 2020, 23:40:01 UTC - in response to Message 2027798. 8.24 is now available on main for AMD GPUs: https://setiathome.berkeley.edu/apps.php Just as Eric promised. Will that help with the late-model AMD GPU headaches or is that strictly a driver fix? Tom A proud member of the OFA (Old Farts Association). ID: 2027803 ·

Darrell Wilcox Volunteer tester Send message Joined: 11 Nov 99 Posts: 303 Credit: 180,954,940 RAC: 118	Message 2027804 - Posted: 15 Jan 2020, 23:41:12 UTC - in response to Message 2027790. @ rob smith Hey! I remember those days! Took 96 hours (4 days) on my 32MB 133 MHz PC for a single WU. ID: 2027804 ·

Tom M Volunteer tester Send message Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462	Message 2027806 - Posted: 15 Jan 2020, 23:43:12 UTC Last modified: 15 Jan 2020, 23:49:27 UTC Wed 15 Jan 2020 05:41:54 PM CST | SETI@home | Scheduler request completed: got 49 new tasks The next update was nada tasks.... Tom A proud member of the OFA (Old Farts Association). ID: 2027806 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 2027807 - Posted: 15 Jan 2020, 23:45:07 UTC - in response to Message 2027784. The server status page seems to update only once every few hours but the last time it updated it said there were over a million results ready to send. However, even when that information was fresh, both my computers got just 'Project has no tasks available' over and over. Are the anonymous systems being discriminated against again like during Christmas? Edit: right after I typed that my bigger box received over a hundred tasks! . . The Phantom of the Fora strikes again ... Stephen :) ID: 2027807 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 2027809 - Posted: 15 Jan 2020, 23:46:54 UTC - in response to Message 2027785. For the vast majority of users (maybe more than 50 K), who run on "set & forget" and produce maybe 20 WU per day or fewer, a large WU cache is not needed and unnecessarily increases the size of the DB. That change is what we are talking about. And the proper way to deal with that, for users of any production volume, is to set realistic cache size limits so that the process can self-regulate, rather than flailing about trying to find a sweet spot in externally imposed limits. If everyone set their caches for a max of 1-2 days, and stuck to that, actual device limits would be unneeded. Assuming, of course, that the client calculated the requirement accurately. . . +1 . . Personally I have always argued that one day's worth of work is plenty for a cache size. Stephen <shrug> ID: 2027809 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 2027810 - Posted: 15 Jan 2020, 23:48:34 UTC - in response to Message 2027787. The 200/call (if that's what is being talked about) is a hard limit. The actual "ready for dispatch" queue is 200 work units long, so anyone needing more than that is going to need more than 1 call to fill their cache. . . Nope, it's the 200/CPU and 300/GPU limit that is causing the angst ... Stephen <shrug> ID: 2027810 ·

Stephen "Heretic" Volunteer tester Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 2027812 - Posted: 15 Jan 2020, 23:55:05 UTC - in response to Message 2027798. 8.24 is now available on main for AMD GPUs: https://setiathome.berkeley.edu/apps.php Just as Eric promised. . . Now to get that message to the card owners ... Stephen :) ID: 2027812 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 2027814 - Posted: 16 Jan 2020, 0:08:15 UTC - in response to Message 2027800. ... To lower the Total number of outstanding tasks on the Server simply lower the Cache limit to One Day. .. Agreed. Would be more effective, but again the issue is that since it's a BOINC limit rather than a SETI limit, I'm not sure how you do that? The number of tasks you receive per period of time is controlled by the App's APR number. That's why you receive few tasks until the first 11 tasks are completed and the APR number is set. The higher the APR, the more tasks are sent. This is why the Bug in the APR number on Server version 715 is so troubling: Server version 715 will hang the APR number after about a day or so instead of updating the number after tasks are completed. Hopefully the Cern people are fixing that bug as well as the Anonymous platform Bug. ID: 2027814 ·

Ville Saari Send message Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530	Message 2027815 - Posted: 16 Jan 2020, 0:15:05 UTC What is blocking the tasks in the 'Results ready to send' queue from going to the clients requesting work? The rrts is so high the splitters have stopped. And it stays high despite that. So it looks like no one is getting much work. I got one big bunch of tasks and only 'project has no jobs' before and after that. ID: 2027815 ·

Freewill Send message Joined: 19 May 99 Posts: 766 Credit: 354,398,348 RAC: 11,693	Message 2027817 - Posted: 16 Jan 2020, 0:20:46 UTC - in response to Message 2027815. I think it's the same for all of us, Ville Saari. The servers are probably busy with other tasks for incoming completed work and are prioritizing that over sending out new tasks. Just my guess. ID: 2027817 ·

©2025 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.