The Server Issues / Outages Thread - Panic Mode On! (118)
Author | Message |
---|---|
JohnDK Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127 |
Yup, stuck uploads on and off for days/weeks. The problem with fast multi-GPU hosts is that sometimes they don't clear fast enough. In my case, with 20+ stuck uploads, BOINC won't request new work. |
Joined: 26 Jan 15 Posts: 88 Credit: 280,183 RAC: 1 |
@Richard Haselgrove thanks for that. I re-installed everything with newer versions, and that took care of the issue with the messages, so I can download new tasks again. I appreciate getting new tasks now, but the tasks seem to show a countdown for the deadline: each task I got shows 52d,14:37:xx, and the xx changes like a countdown. I'm guessing that's normal; I've just never seen it before. I'll just run it as usual and see where it goes. Thanks.
|
Joined: 26 Jan 15 Posts: 88 Credit: 280,183 RAC: 1 |
"each task i did get shows 52d,14:37:xx the xx changes like a countdown." Turns out that's a setting on boinc tasks that I usually don't use, so I guess all is solved for now.
|
Speedy Joined: 26 Jun 04 Posts: 1646 Credit: 12,921,799 RAC: 89 |
I am amazed: as I write, the return rate is 7,000 more results per hour than there are in total in the RTS queue. I am aware there are a few shorties out there, plus some noise bombs. I must say there has been a vast purging improvement, because the result number is in the 700,000 range; I cannot remember the last time I saw it that low. RTS is now more than the return rate. |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"I am amazed as I write the return rate is 7000 more results per hour than there is total in the RTS queue." They are intentionally keeping the RTS queue short to keep the result table in the database from growing too big to fit in RAM. If/when the huge validation queue shrinks, the RTS queue should return to its normal size. In my opinion, during this coming downtime they should shut down only the scheduler and splitters but let the rest of the system run until the validators, assimilators and purgers have no more work to do, and only then shut the rest of the system down and do their normal downtime stuff. |
Speedy Joined: 26 Jun 04 Posts: 1646 Credit: 12,921,799 RAC: 89 |
"I am amazed as I write the return rate is 7000 more results per hour than there is total in the RTS queue." "They are intentionally keeping the RTS queue short to keep the result table in the database from growing too big to fit in RAM." Good point, I had forgotten about that. I thought it was only before last week's outage. I read in another thread that this week's outage may be 24 hours. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
"I have had stuck upload problems with both 7.14.2 & 7.16.3 for some weeks I guess." I've never had a stuck upload, just plenty of uploads instantly timing out on their first attempt. Over the last couple of days there have been periods when they would time out (not always instantly, but within a few seconds) on their second and even third attempt. Downloads, on the other hand: I've been having plenty of sticky ones there. Came home to many of them today, mostly on my Linux system (no hosts file setting). Sitting there, timer counting away but no data moving. Disabling & re-enabling network access generally gets them going again. Grant Darwin NT |
rob smith Joined: 7 Mar 03 Posts: 22815 Credit: 416,307,556 RAC: 380 |
Shutting everything down apart from the splitters & scheduler wouldn't help as they are both in continuous dialogue with the main database, which wouldn't be running. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
Both the AP and MB deleters appear to have given up, they're both developing significant backlogs. Grant Darwin NT |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"Shutting everything down apart from the splitters & scheduler wouldn't help as they are both in continuous dialogue with the main database, which wouldn't be running." I said shutting down just the scheduler and splitters. The database would obviously still be running, but without the load from people asking for or reporting work, so that the backlogged stuff would have a good opportunity to get done. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Still up? Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Joined: 23 May 99 Posts: 7381 Credit: 44,181,323 RAC: 238 |
"Still up?" Hi Ian, could the maintenance be postponed in order to allow the DB and servers to settle in after last week's DB redo? Have a great day! :) Siran CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Don't think so, they usually put out a notice if they plan to skip a maintenance day. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"Don't think so, they usually put out a notice if they plan to skip a maintenance day." Having said that, we normally have an automatic 'advance warning' notice on the front page from midnight PST on the day in question - and it ain't there. |
Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
and we are back!! |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"and we are back!!" Back, yes, but how long until we can actually get new work? My GPU still has plenty of stuff to do, but the CPU will be dry in 30 minutes. It also looks like I won't be able to get any CPU work for a long time even if some were available. My client wants to report all the GPU results first, so my CPU queue stays 'full' of completed work and the server won't give me more. Also, suspending completed tasks apparently doesn't prevent the client from reporting them, so I can't force it to report the CPU stuff first. |
Joined: 6 Jun 99 Posts: 233 Credit: 200,655,462 RAC: 212 |
I am getting the "project servers may be temporarily down" message, apparently since around 17.15Z on 28/1/20, but the SSP does not show an associated lag - anyone else? Member of the 20 Year Club |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"I am getting 'project servers may be temporarily down' message, apparently since around 17.15Z on 28/1/20 but the SSP does not show an associated lag - anyone else?" The servers are congested from all the empty hosts bombarding them with requests for work, so most requests fail. In this situation you can make the requests work a bit more reliably if you set 'no new work', so that your client is only reporting work and not asking for more. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
"I am getting 'project servers may be temporarily down' message" Never seen that message before. Just the usual "Project has no tasks available". Scheduler response time is better than it has been recently after outages, and I have been able to report work with no issues. But the forums are in extreme slow motion at present, and I expect it'll be hours before we can get any work. These days it takes a long time after the system comes back up from an outage for the splitters to start up again. Grant Darwin NT |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"Never seen that message before. Just the usual 'Project has no tasks available'. Scheduler response time is better than it has been recently after outages, and have been able to report work with no issues." I have had some issues. Just after the downtime ended I was able to report tasks for a while without problems, but when the big masses woke up from their long backoffs, I started getting just timeouts and internal server errors. 10 or 15 minutes ago the requests started working again. And right now I got my first new task after the downtime! Just a single one. |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.