The Server Issues / Outages Thread - Panic Mode On! (118)
Author | Message |
---|---|
Unixchick Send message Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
What WUs are coming through here are all crap short ones. Nothing to heat the house. I would prefer some resends instead of these crap new ones. |
Mr. Kevvy Send message Joined: 15 May 99 Posts: 3776 Credit: 1,114,826,392 RAC: 3,319 |
|
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13750 Credit: 208,696,464 RAC: 304 |
Fri 31 Jan 2020 07:44:42 PM CST | SETI@home | Project has no tasks available. Splitters have been down for a few hours now. Getting the odd resend here & there (many of which are just noise bombs). Grant Darwin NT |
Cliff Harding Send message Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67 |
There have been a lot of ups and downs lately, but it seems that the downs are winning. So, is this going to be the new normal for the weekends now? I don't buy computers, I build them!! |
Ville Saari Send message Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"So, is this going to be the new normal for the weekends now?" Doesn't look like that. Last week the weekend was the only time of the week my computers could consistently keep their caches full. |
Ville Saari Send message Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
We are drifting further and further from a situation where the splitters could start running. There are now almost 21 million results in the db, and the number is still rising despite the splitters having been stopped for 10 hours. So many resends... About 9 million of those 21 million results are probably being held hostage by the ever-growing assimilation queue. |
Dave Stegner Send message Joined: 20 Oct 04 Posts: 540 Credit: 65,583,328 RAC: 27 |
Interesting: "Results in the field" is DROPPING, "Results received in last hour" is SLOWING, AND "Results returned and awaiting validation" is RISING. Doesn't seem like "awaiting validation" should be going up if "results received" is slowing. Dave |
Dave Stegner Send message Joined: 20 Oct 04 Posts: 540 Credit: 65,583,328 RAC: 27 |
It sure did... Dave |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13750 Credit: 208,696,464 RAC: 304 |
Scheduler was MIA for a while there, but now it's back to "Project has no tasks available" Grant Darwin NT |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
OK, we're back - it all went very black there for a few minutes, didn't it? I'm trying to think through what could possibly have caused all this database bloat. My suggestion of re-running the transitioner to pick up tasks which should have validated, but didn't, seems to have flushed out a few - but not as many as I expected. So what else has gone wrong? First oddity - WU 3781186004. The middle guy - who seems a perfectly respectable cruncher, with a decent rig and a team member, got his task on 9 December (soon after the limits were raised) - and has done nothing with it. Why? He's returning good work, quickly, now, and has lots of credit at other projects. Only finger of suspicion I can see right now is 'Driver version 432.00' on Windows 10. And he's returned about 80 good tasks - all of a similar age - in the last day. Did he realise that everything was stuck and downgrade the driver? Could all of this be down to Microsoft (auto update), NVidia (bad driver), and our own long deadlines? Preserving: Task 8317964641: computer 8873167; sent 9 Dec 2019, 10:04:20 UTC; reported 10 Dec 2019, 4:31:07 UTC; Completed and validated; run time 253.39; CPU time 128.74; credit 126.77; SETI@home v8 Anonymous platform (NVIDIA GPU). Task 8317964642: computer 8272778; sent 9 Dec 2019, 10:04:17 UTC; reported 31 Jan 2020, 15:04:19 UTC; Not started by deadline - canceled; run time 0.00; CPU time 0.00; credit ---; SETI@home v8 v8.22 (opencl_nvidia_SoG) windows_intelx86. Task 8497103820: computer 8313133; sent 31 Jan 2020, 15:04:03 UTC; reported 31 Jan 2020, 22:52:18 UTC; Completed and validated; run time 765.91; CPU time 709.91; credit 126.77; SETI@home v8 Anonymous platform (NVIDIA GPU). |
BetelgeuseFive Send message Joined: 6 Jul 99 Posts: 158 Credit: 17,117,787 RAC: 19 |
Also got a bunch, but not all of them resends. Even got an Astropulse (_0). Looks like things are moving again ... Edit: downloads are extremely slow Tom |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
And I got a BLC35_1 about an hour ago. The replica has fallen behind again, so I can't (yet) see when it was split - but I hope it wasn't recently. Edit - it's gone now. 36 seconds on an old, tired, slow CPU. We really should stop splitting these noisy tapes while we're still in deep doo-doo. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13750 Credit: 208,696,464 RAC: 304 |
"Just got a bunch of WU's now, but all are resends _2 or higher." I managed to score 50 resends on one of my systems. When they finally downloaded, all done in under 4 minutes. Only 3 of them weren't noise bombs. Grant Darwin NT |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13750 Credit: 208,696,464 RAC: 304 |
And to add to the issues, it appears that "Result files waiting for deletion" has developed issues for both MB & AP. Both have gone from effectively 0 to over 510k & 13.7k respectively. Grant Darwin NT |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
My Inconclusive results are going up too, even though I've only had a handful of Tasks since last night. Last night I had a large number of Inconclusive results that said 'minimum quorum 1' and only listed a single Inconclusive host. I didn't see how a single Inconclusive host task could ever validate. Now it's very difficult to bring up my Inconclusive task lists, but it seems those tasks are now listed as: https://setiathome.berkeley.edu/workunit.php?wuid=3862758806 (minimum quorum 1, initial replication 3). Task 8495599283: computer 1473578; sent 31 Jan 2020, 5:02:48 UTC; reported 31 Jan 2020, 21:47:15 UTC; Completed and validated; run time 15.36; CPU time 12.61; credit 3.59; SETI@home v8 v8.20 (opencl_ati5_mac) x86_64-apple-darwin. Task 8498611906: computer 6796479; sent 1 Feb 2020, 3:00:50 UTC; reported 1 Feb 2020, 4:00:03 UTC; Completed and validated; run time 4.10; CPU time 1.93; credit 3.59; SETI@home v8 v8.11 (cuda42_mac) x86_64-apple-darwin. Task 8498669733: computer 8673543; sent 1 Feb 2020, 4:01:52 UTC; reported 1 Feb 2020, 5:29:49 UTC; Completed and validated; run time 15.11; CPU time 13.09; credit 3.59; SETI@home v8 v8.22 (opencl_nvidia_SoG). So the single-host tasks are now triple-host tasks, but they are still just sitting there, a number of them showing one or two "Completed, waiting for validation" hosts and some showing one or two Inconclusive hosts. |
Freewill Send message Joined: 19 May 99 Posts: 766 Credit: 354,398,348 RAC: 11,693 |
"Just got a bunch of WU's now, but all are resends _2 or higher." "I managed to score 50 resends on one of my systems. When they finally downloaded, all done in under 4 minutes. Only 3 of them weren't noise bombs." How does one tell if the jobs are resends? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
"How does one tell if the jobs are resends?" In this case, we're talking about new - extra - replications of existing tasks, not about resending lost tasks. You tell from the task name, as shown in BOINC Manager. You need to be able to see the very end of the name - so use advanced view, and make the column as wide as you need. The last two characters are as follows: _0 - always the first time a workunit has been sent to a cruncher. Every WU has a task _0. _1 - in normal times, usually created at the same time as _0 and sent out straight away. At the moment, some are being created and distributed later. _2 onwards - probably a new replication, because the first two failed to validate (either because they returned different answers, or one of them never returned at all). But again, just at the moment, some results are untrustworthy, so _2 may be created as a safety check. |
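The suffix rule described in the post above can be sketched in a few lines of Python. This is only an illustration, assuming the task name ends in an underscore followed by a decimal replication number; the function names `replication_index` and `classify`, and the sample task names, are hypothetical and not part of BOINC or SETI@home.

```python
import re

# Assumption: a task name ends in "_N", where N is the replication number.
_SUFFIX = re.compile(r"_([0-9]+)\Z")


def replication_index(task_name: str) -> int:
    """Return the trailing _N replication number of a task name."""
    match = _SUFFIX.search(task_name)
    if match is None:
        raise ValueError(f"no numeric _N suffix in {task_name!r}")
    return int(match.group(1))


def classify(task_name: str) -> str:
    """Describe a task according to the _0 / _1 / _2-onwards rule."""
    n = replication_index(task_name)
    if n == 0:
        return "first copy of the workunit"
    if n == 1:
        return "second copy, normally created alongside _0"
    return "extra replication, probably after a failed validation or as a safety check"


if __name__ == "__main__":
    # Hypothetical task names, made up for illustration only.
    for name in ("example_workunit_0", "example_workunit_1", "example_workunit_3"):
        print(name, "->", classify(name))
```

As the post notes, the suffix alone is only a hint at the moment: a _1 may have been created late, and a _2 may exist purely as a safety check.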
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
"And I got a BLC35_1 about an hour ago. The replica has fallen behind again, so I can't (yet) see when it was split - but I hope it wasn't recently." OK, it's visible now - WU 3863162100. It turns out to be a replication for a task created as a singleton yesterday morning. And because it overflowed, it's been sent to a third host for checking. And it's gone to a host running Windows 7 - no driver problems. Phew. |
Freewill Send message Joined: 19 May 99 Posts: 766 Credit: 354,398,348 RAC: 11,693 |
|
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Sat 01 Feb 2020 07:08:16 AM CST | SETI@home | Scheduler request completed: got 0 new tasks Sat 01 Feb 2020 07:08:16 AM CST | SETI@home | [sched_op] Server version 709 Sat 01 Feb 2020 07:08:16 AM CST | SETI@home | Project has no tasks available. The good news is my system(s) are happily crunching along (E@H and WCG). The bad news is "the dry spell" is really dry..... ;) Tom A proud member of the OFA (Old Farts Association). |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.