The Server Issues / Outages Thread - Panic Mode On! (118)
Author | Message |
---|---|
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
". . But if all cards agree the sample is an overflow (noisy) then it does not matter as there is no result going into the database right or wrong. If you are worried about the 1 or 2 credits then what can I say ...." It's not about credit, it's about science, so the type of overflow is important: the number of pulses, triplets or whatever. It is important to get it right. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
I see the Scheduler is taking another of its random breaks, and the splitters also took a time out from work production for almost an hour. MB In progress is falling off, and my requests for work are mostly "Project has no tasks available" and occasionally 1 or 2 WUs when reporting 10-15. I am surprised the system held up as long as it did under the present load. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
"First time this year I have seen the returns past 7 million." This year, yes. Last year they reached 10.8 million for a while. Presently there is a validation/assimilation/deletion backlog, which along with the huge return rate and Scheduler issues could explain the splitters taking a break for a while earlier. Grant Darwin NT |
Speedy Joined: 26 Jun 04 Posts: 1647 Credit: 12,921,799 RAC: 89 |
"First time this year I have seen the returns past 7 million." "This year, yes. Last year they reached 10.8 million for a while." The splitters are currently in overdrive, I think. A while ago I saw they were producing over 300/second; it is currently over 150/second. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
". . But if all cards agree the sample is an overflow (noisy) then it does not matter as there is no result going into the database right or wrong. If you are worried about the 1 or 2 credits then what can I say ...." "It's not about Credit, it's about science, so the type of overflow is important- the number of pulses triples or whatever." . . I think someone needs to make it clear, are noise bombs added to the science database or NOT? . . My understanding is that they do not contribute to the database at all. Stephen ? ? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
". . I think someone needs to make it clear, are noise bombs added to the science database or NOT?" They are added to the database. 'Noise bomb' is an awkward term. The technical term is 'overflow' (reached 30 signals, no space for any more). That can happen at any point in a task's run - from <1% to 99%. The 'late onset' overflows (my own term for them) can certainly contain useful data - immediate overflows, less likely. But they should be treated with respect, and validated properly, whenever the axe comes down. |
Joined: 24 Jan 00 Posts: 38193 Credit: 261,360,520 RAC: 489 |
One thing that I am wondering about is when MB v7 is finally going to be put to bed? Those 71 tasks still listed were finished years ago, and removing them could free up a few cycles and processes on the servers. ;-) Cheers. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
". . I think someone needs to make it clear, are noise bombs added to the science database or NOT?" "They are added to the database." . . Well, I think the term noise bomb is fairly explicit: they are noisy signal samples that go off like a bomb, 'pop', and are gone before any serious processing is done, which seems to be the kind of spurious result the faulty drivers are returning. Certainly late overflows have a fair proportion of the signal analysed and would be expected to be treated with more gravity. I am surprised that the 'unmentionables' are added to the database since almost none of the sample has even been looked at. Stephen ? ? |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"1 thing that I am wondering about is when MB v7 is finally going to be put to bed?" . . And maybe close off a few dead ends ... Stephen ?? |
Joined: 1 Apr 13 Posts: 1859 Credit: 268,616,081 RAC: 1,349 |
"1 thing that I am wondering about is when MB v7 is finally going to be put to bed?" Probably when they need the column for MB v9 :) But I doubt there's any appreciable impact from MB v7's presence . . . |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
Web site & forums very slow, are we about to crash? Edit- and Scheduler is back to mostly "Project has no tasks available" responses with the occasional release of 1 or 2 no matter how many are being reported. Grant Darwin NT |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
It's very variable. Two different machines: `11/01/2020 11:22:01 \| SETI@home \| [sched_op] NVIDIA GPU work request: 11753.94 seconds; 0.00 devices` `11/01/2020 11:22:05 \| SETI@home \| Scheduler request completed: got 24 new tasks` `11/01/2020 11:23:35 \| SETI@home \| [sched_op] NVIDIA GPU work request: 18778.74 seconds; 0.00 devices` `11/01/2020 11:23:37 \| SETI@home \| Scheduler request completed: got 0 new tasks` The more you ask for, the less you get! |
Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
`Sat 11 Jan 2020 07:28:31 AM CST \| \| Project communication failed: attempting access to reference site` `Sat 11 Jan 2020 07:28:31 AM CST \| SETI@home \| Temporarily failed upload of 11oc10aa.23426.13973.6.33.30_2_r320875423_0: transient HTTP error` Got up this morning with a bunch of download retries hanging on one box. Hit the retry button and down they came. Then I got this. Tom A proud member of the OFA (Old Farts Association). |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Could be just a coincidence, but each time the sum of the WUs reaches 22-23 MM, weird things happen. Who knows? Sometimes it is wise to take a step back and rethink the change to the WU limits. Maybe, just maybe, 200 CPU/300 GPU is too much for the servers to handle. My 0.02 cents. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
Well, the Scheduler's back up after its random break. Now I'm getting more uploads instantly timing out on their first attempt than usual, and even getting some timing out on their second. Grant Darwin NT |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Been having that issue for the past day or so. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
I see new files have been loaded, and they're lower in number than the BLC35s, so as the present BLC35 files finish, then the BLC35s clear out of caches, then the resends work their way through, that should reduce the load on the servers by a huge amount. So the upload issue should return to its normal level and the next time the Scheduler takes its random break, it won't last as long (hopefully). Grant Darwin NT |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"immediate overflows, less likely. But they should be treated with respect, and validated properly, whenever the axe comes down." The position where the axe blade makes contact can be inconsistent. CPU apps that process the data serially hit the 30th signal at the same point if they see the same signals, but GPU apps that do parallel processing discover the signals in unpredictable order. So they may report a completely different subset of 30 signals out of potentially thousands that exist in the file. At least this is my theory of why my GPU scores invalids from overflows quite regularly but non-overflow results are always valid. |
Sleepy Joined: 21 May 99 Posts: 219 Credit: 98,947,784 RAC: 28,360 |
"At least this is my theory of why my GPU scores invalids from overflows quite regularly but non-overflow results are always valid." I have very few invalids (not to say none; never say never), CPU or GPU. I fear there is something going on at your end, Ville. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13959 Credit: 208,696,464 RAC: 304 |
Something's broken- Received-last-hour for both MB & AP have plummeted, as has the splitter output. All at the same time, around 21:20 UTC. Although it looks like things might just be coming back to life again; splitter output is no longer 0, Returned-last-hour is climbing again. Very odd; usually I have to make the post before the problem clears. This time I just had to start typing about it. Grant Darwin NT |
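
Ville Saari's theory in the thread (serial CPU apps hit the 30th signal at the same point, while parallel GPU apps find signals in an unpredictable order) can be illustrated with a toy model. This is only a sketch under stated assumptions, not SETI@home application code: the signal positions are random stand-ins, the function names are hypothetical, and a seeded shuffle merely models "unpredictable discovery order". Only the 30-signal limit comes from the thread.

```python
import random

OVERFLOW_LIMIT = 30  # per the thread: a task overflows after 30 signals


def serial_overflow(signals, limit=OVERFLOW_LIMIT):
    """Hypothetical serial app: scan in data order, keep the first `limit` signals."""
    return sorted(signals)[:limit]


def parallel_overflow(signals, seed, limit=OVERFLOW_LIMIT):
    """Hypothetical parallel app: discovery order is modelled as a seeded shuffle."""
    order = list(signals)
    random.Random(seed).shuffle(order)
    return sorted(order[:limit])


# Assumption: a very noisy task with 2000 candidate signals at random positions.
signals = random.Random(0).sample(range(1_000_000), 2000)

cpu_a = serial_overflow(signals)
cpu_b = serial_overflow(signals)
gpu_a = parallel_overflow(signals, seed=1)
gpu_b = parallel_overflow(signals, seed=2)

print("serial runs agree:", cpu_a == cpu_b)
print("signals shared by the two parallel runs:", len(set(gpu_a) & set(gpu_b)))
```

In this model two serial runs always report the same 30 signals, while two parallel runs each report 30 genuine signals that mostly do not coincide, which is the kind of mismatch that could fail validation on an overflow task even when neither result is wrong.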
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.