The Server Issues / Outages Thread - Panic Mode On! (119)
Author | Message |
---|---|
Joined: 19 May 99 Posts: 209 Credit: 10,924,287 RAC: 29 |
State: All (122) · In progress (0) · Validation pending (0) · Validation inconclusive (0) · Valid (122) · Invalid (0) · Error (0) A bunch of WU's got sent Apr 22 and were completed Apr 23, so Valids are still at 123. This is a bit like watching ice melt: there is always that teeny bit that persists. Sometimes I wonder, what happened to all the people I gave directions to? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SETI@home classic workunits 4,321 SETI@home classic CPU time 22,169 hours |
Joined: 26 May 99 Posts: 9958 Credit: 103,452,613 RAC: 328 |
Well, there it is: State: All (700) · In progress (0) · Validation pending (0) · Validation inconclusive (0) · Valid (700) · Invalid (0) · Error (0) All tasks returned and validated. End of an era, and I had such big plans for the WOW event. Luckily I had put off spending any money just long enough. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
State: All (0) · In progress (0) · Validation pending (0) · Validation inconclusive (0) · Valid (0) · Invalid (0) · Error (0) . . That rather says it all :( Stephen :( |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
9/5/20 11:00am AEST {9/5/20 1:00 am UTC} State: All (2172) · In progress (0) · Validation pending (0) · Validation inconclusive (0) · Valid (2155) · Invalid (10) · Error (7) N.B. the 10 Invalids are 'cannot validate' or 'validation error' tasks with hordes of wingmen. The 7 Errors are all validated on other hosts. Stephen :( |
Joined: 17 Nov 00 Posts: 90 Credit: 76,455,865 RAC: 735 |
Just returned my last unit. All zeros in my task list now. |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
9/5/20 11:00am AEST {9/5/20 1:00 am UTC} It's all out of your hands, You've done your part. |
Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
drop of about 45 tasks this night ^^ |
Joined: 23 May 99 Posts: 7379 Credit: 44,181,323 RAC: 238 |
Greetings, My stats are stagnant. They haven't moved for at least 2 weeks now: State: All (331) · In progress (0) · Validation pending (0) · Validation inconclusive (0) · Valid (329) · Invalid (2) · Error (0) I've been watching the server stats gradually going down day by day, but mine remain the same. :| Have a great day! :) Siran CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
The "Result turnaround time (last hour average)" has been greater than 1000 hours a few times that I have looked recently. That's about 6 weeks! It's currently ~827h. I don't think that there is a Munin graph for that parameter, or for "Results waiting for db purging". I'm guessing that there are probably <1% of the "Results out in the field" that are still needed. Eric posted a couple of weeks ago on the SETI Facebook group that there were about 17k results needed. At that time it represented about 1.5% of the results in the field. [Edit] This is the most recent comment that I have seen from Eric: "There are currently 17234 workunits that don't have a validated result. Once they are finished nothing more should go out. During normal operations, that would have been about 10 minutes worth. But a lot of people quit in March without returning their results. We've sent everything out again in an attempt to speed things up. I'd guess a few days more before most of those results are back and validated." |
WezH Joined: 19 Aug 99 Posts: 576 Credit: 67,033,957 RAC: 95 |
[Edit] This is the most recent comment that I have seen from Eric. Just a minute ago Eric wrote on FB: As far as the "waiting for validation" I think there's a bug in that code. When I do the query manually, I get a much smaller number. |
Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
795 Workunits waiting for a valid Result out of 653,838 Results out in the field. That's approximately 0.12% of the outstanding Tasks that are needed! |
rob smith Joined: 7 Mar 03 Posts: 22622 Credit: 416,307,556 RAC: 380 |
It's about time one or two folks with big caches of "work in progress" took a look at their tasks and abandoned the thousands for which the canonical result has already been declared. Sitting on tasks for which the canonical result has already been declared does nothing for the science. I know there are a fair number of folks who have just stopped processing, or, like me, have data stuck on failed machines. There is little that can be done about them, but at least get rid of the excess where possible. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Sirius B Joined: 26 Dec 00 Posts: 24921 Credit: 3,081,182 RAC: 7 |
OR, how about a server side abort? |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
OR, how about a server side abort? Abort all from the server side and resend the remaining 795 (or a little less now) in small batches (like 5 WU per host max) with a very short deadline (like 3 days), and all will be done very fast. |
Sirius B Joined: 26 Dec 00 Posts: 24921 Credit: 3,081,182 RAC: 7 |
Yep & all those that left would not receive any. Being a slower CPU only cruncher, I set NNT with the last few I had. Crunched them all by 10th April. Let all the fast hosts crunch, much quicker return. |
Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
Aborting tasks now could cause a "Too Many errors" state. If Eric wants the last few results quickly, he could probably make sure that all the Ghosts are re-sent if needed, and then Cancel Unstarted Tasks. That would probably be more effective. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Aborting tasks now could cause a "Too Many errors" state. But that requires a lot more babysitting. I don't believe Eric has time to cherry-pick each WU. |
Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
I think it would be 2 or 3 scripts, one of which has been done before. I'm not an expert on the BOINC server code, but I think it is basically switching an option on or off, then updating. The load on the servers now is probably low enough to turn on the options that were previously off due to excessive load. |
rob smith Joined: 7 Mar 03 Posts: 22622 Credit: 416,307,556 RAC: 380 |
Aborting tasks that have already been validated by others, and have a canonical result will not cause problems of work units dying due to too many errors. Once the canonical result has been declared any subsequent result returned is nothing but fluff. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Joined: 17 Nov 00 Posts: 90 Credit: 76,455,865 RAC: 735 |
I have recovered many tasks from failed machines. 1) Create or select a healthy machine. 2) Make sure it is signed up to the project. 3) Edit client_state.xml and change its hostid to that of the failed machine. 4) Change its rpc count to that of the failed machine. 5) Start the client. 6) Carry out the ghost recovery procedure. You do have to make sure the base architectures match. You can't move ARM units to an Intel this way, but you can move Windows on Intel to Linux on AMD. FWIW. |
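
For anyone wanting to script steps 3 and 4 of the recovery procedure above, here is a minimal sketch in Python. It assumes the host ID and RPC count live in `<hostid>` and `<rpc_seqno>` tags inside the matching `<project>` block of client_state.xml; check your own file before relying on that. The function name `retarget_project`, the sample XML, and the ID values are all made up for illustration. Stop the BOINC client and back up client_state.xml first. Starting the client and the ghost recovery itself (steps 5 and 6) are not covered.

```python
import re

# Hypothetical, trimmed-down stand-in for a real client_state.xml.
SAMPLE = """<client_state>
<project>
    <master_url>http://setiathome.berkeley.edu/</master_url>
    <hostid>111</hostid>
    <rpc_seqno>5</rpc_seqno>
</project>
</client_state>
"""


def retarget_project(state_xml, master_url, hostid, rpc_seqno):
    """Return state_xml with hostid and rpc_seqno replaced in one project block.

    Only the <project> block whose <master_url> equals master_url is changed.
    Tag names are assumptions about the file layout, not a documented API.
    """
    found = False

    def patch_block(match):
        nonlocal found
        block = match.group(0)
        if f"<master_url>{master_url}</master_url>" not in block:
            return block  # some other project: leave it untouched
        found = True
        block, n_host = re.subn(
            r"<hostid>\d+</hostid>", f"<hostid>{hostid}</hostid>", block, count=1
        )
        block, n_rpc = re.subn(
            r"<rpc_seqno>\d+</rpc_seqno>",
            f"<rpc_seqno>{rpc_seqno}</rpc_seqno>",
            block,
            count=1,
        )
        if n_host != 1 or n_rpc != 1:
            raise ValueError("project block lacks <hostid> or <rpc_seqno>")
        return block

    # Text substitution is used instead of an XML parser so the rest of the
    # file is preserved byte for byte.
    patched = re.sub(
        r"<project>.*?</project>", patch_block, state_xml, flags=re.DOTALL
    )
    if not found:
        raise ValueError(f"no <project> block with master_url {master_url}")
    return patched


if __name__ == "__main__":
    # 222 and 90 stand in for the failed machine's host ID and RPC count.
    print(retarget_project(SAMPLE, "http://setiathome.berkeley.edu/", 222, 90))
```

To use it on a real file, read client_state.xml into a string, pass it through the function with the failed machine's values, and write the result back while the client is stopped.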
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.