The Server Issues / Outages Thread - Panic Mode On! (118)
Author | Message |
---|---|
Grant (SSSF) Joined: 19 Aug 99 Posts: 13886 Credit: 208,696,464 RAC: 304 |
"just got some WUs to download... is it fixed?" Yep (at least for now). The Scheduler is no longer MIA, although downloads are an issue on my Linux system (no download server setting in the hosts file): instant timeouts. Luckily it only took a couple of dozen retries to get them to download. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13886 Credit: 208,696,464 RAC: 304 |
Ready-to-send buffer is empty, and the splitters aren't splitting. Looks like all the shorties & noise bombs have built the backlogs back up to the cutoff level again. Grant Darwin NT |
W-K 666 Joined: 18 May 99 Posts: 19494 Credit: 40,757,560 RAC: 67 |
"Ready-to-send buffer is empty, and the splitters aren't splitting." Something must be happening, as I just got 23 tasks at 10:54. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Lots of DL/UL errors and retries. Do we have a problem? |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
I don't think the problem is in shorties or noise bombs but in the assimilator not assimilating. Assimilation queue has skyrocketed after the Tuesday outage and the results it is holding hostage have pushed the database size out of safe zone. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14687 Credit: 200,643,578 RAC: 874 |
"I don't think the problem is in shorties or noise bombs but in the assimilator not assimilating. Assimilation queue has skyrocketed after the Tuesday outage and the results it is holding hostage have pushed the database size out of safe zone." The two are related. A lot of shorties and noise bombs reporting quickly after the end of the outage and in the hours that followed will have spiked both the validation queue and the subsequent assimilation queue to much higher levels than normal. To use an illustration all too relevant in this country at the moment: the shorties are like a big thunderstorm in the hills. That creates a surge of water, which moves down the streams and rivers over the following days. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
``for i in `boinccmd --get_file_transfers | sed -n -e 's/^.*name: //p'`;do boinccmd --file_transfer http://setiathome.berkeley.edu $i retry;done`` I was having a heck of a time getting this to work in conjunction with the watch command; it always seemed to give a syntax error or just didn't work properly. It works as-is if you run it standalone, just not with watch. But it works if you drop it into an executable bash script named update_transfers: ``#!/bin/bash`` ``for i in `./boinccmd --get_file_transfers | sed -n -e 's/^.*name: //p'`;do ./boinccmd --file_transfer http://setiathome.berkeley.edu $i retry;done`` Then run it with ``watch -n 60 ./update_transfers`` Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Looks like we are bumping up against the memory limit again. Nothing but "no work available" for all hosts over the past 20 minutes. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Luc Joined: 7 Jan 17 Posts: 4 Credit: 1,840,181 RAC: 5 |
more of a whine / observation / OCD thing, but why are there 5 'chunks' of data that have been waiting to be processed since late 2019? would flushing everything out of the repositories have a potential cleansing effect? i'm no DB nor systems design guy, just wondering..... it has come close a few times, only to have another coupla days of data pushed in front - like today. |
Joined: 19 Jun 00 Posts: 173 Credit: 54,916,209 RAC: 833 |
Thanks, this bash file helps a lot |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Just can't get any or much replacement work. Splitters are throttled because the database is too big. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13886 Credit: 208,696,464 RAC: 304 |
"Just can't get any or much replacement work. Splitters are throttled because the database is too big." Its backlog is also heading for a new record high. The last time it got this bad we were pretty much out of work for a couple of days. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13886 Credit: 208,696,464 RAC: 304 |
Forums laggy and Scheduler is MIA/taking ages to respond again. And the response is "Project has no tasks available" if it does respond, even though there are 60k ready-to-send. Grant Darwin NT |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
Yep. Borked again. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13886 Credit: 208,696,464 RAC: 304 |
Linux system out of GPU work. Grant Darwin NT |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
Looks like we are back up again. Getting decent downloads. Progressing forward. All in all, I'm not disappointed with what they're doing behind the scenes. I know they must be frustrated with the outages. Still, the changes they're making are making a huge difference in my machines' overall performance. RAC has gone up about 5-10K/day, with each machine tracking higher. Machines are not taking huge dives. Downtimes have been fairly minimal. Edit: Even though RTS is empty :( |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13886 Credit: 208,696,464 RAC: 304 |
"Looks like we are back up again. Getting decent downloads." Not here. Just the odd WU. "Project has no tasks available" is still the current response. Edit: until I posted this, then I got 18. Another 400 or so and I'll hit the limits again. Edit: and then got more work. The bad news: it's 98% shorties, and all new WUs. So the database backlog is going to get even worse than it is now. Grant Darwin NT |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
My RAC is way up from what it was before the problems started in December, but not because I'm crunching more. Quite the opposite. My machines have idled a lot in the long outages, but the credit I get per crunched CPU or GPU hour has gone up a lot. I guess Setiathome staff turned the "Credit screw" several turns looser to compensate for the problems. This change happened suddenly after the very long Tuesday outage that started on Jan 14. Before that outage my RAC was climbing slowly back towards its normal value from the depression the Christmas problems had caused. After the outage it shot through the roof, so that just a few days later my RAC had broken my all-time high, despite that same outage having dropped it below the lowest I had had with the same hardware before. |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
"Not here." Yeah. I got a few decent downloads on the 10Sep file. Lots of shorties. Still, they're keeping the system kicking. I'd almost be at the point that they should think about shutting it down for a few days to clear that backlog problem. Yeah, it'd suck, but if they planned it out, at least we would know about it, and those of us who work on other projects would be forewarned. This limping along from one outage to the next, even if they are fairly short in duration, just can't keep happening and get to any real resolution. |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"I'd almost be at the point that they should think about shutting it down for a few days to clear that backlog problem." The problem would then immediately reappear when all the starved hosts report the millions of tasks they crunched during the long outage and ask for millions of new tasks to fill their caches. |
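
For anyone reusing the update_transfers script posted by Ian&Steve C. above, here is a slightly more defensive sketch of the same idea. It is not an official BOINC tool. `BOINCCMD` and `PROJECT_URL` are variable names made up for this example, and the parsing assumes, as the original sed expression does, that each transfer in the `boinccmd --get_file_transfers` output has a line containing `name: <file>`. Reading names line by line and quoting them avoids the word splitting that the backtick loop relies on.

```shell
#!/bin/bash
# Sketch only: a more defensive take on the update_transfers script above.
# BOINCCMD and PROJECT_URL are illustrative variables, not BOINC settings.
BOINCCMD="${BOINCCMD:-boinccmd}"
PROJECT_URL="${PROJECT_URL:-http://setiathome.berkeley.edu}"

# Print one file name per line from `boinccmd --get_file_transfers` output
# read on stdin. Assumes each transfer has a "name: <file>" line.
list_transfer_names() {
    sed -n -e 's/^.*name: //p'
}

# Ask the client to retry every pending transfer for PROJECT_URL.
retry_transfers() {
    "$BOINCCMD" --get_file_transfers | list_transfer_names |
    while IFS= read -r name; do
        "$BOINCCMD" --file_transfer "$PROJECT_URL" "$name" retry
    done
}
```

To use it, save it as update_transfers, add a final line calling `retry_transfers`, make it executable, and run `watch -n 60 ./update_transfers` as in the original post. A plausible reason the bare one-liner fails under watch is that the interactive shell parses the `;do` and the backticks before watch ever sees them, so putting the loop in a script file sidesteps the quoting problem.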