Panic Mode On (108) Server Problems?
Author | Message |
---|---|
betreger Joined: 29 Jun 99 Posts: 11408 Credit: 29,581,041 RAC: 66 |
When reporting I got this: 11/15/2017 8:32:51 AM | SETI@home | Project is temporarily shut down for maintenance |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
One server tuneup, coming right up. Meow. "Time is simply the mechanism that keeps everything from happening all at once." |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
"One server tuneup, coming right up." And already (provisionally) complete. |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
We had a drive in the array that holds the workunits that was generating errors. I've kicked it out of the array. We'll see if the rebuild fixes the problem. "I don't like the looks of WU 2739307008. One host successfully downloaded and ran it. Everybody else is getting D/L errors on it, 5 hosts so far. My machine's Event Log shows:" "I just downloaded it as well. I got a manual MD5 of 450a32005c6700d7ab95284edc959572, the same as BOINC calculated for yours: that would suggest that the MD5 stored in the database when the file was created (so that the comparison can be done) might be corrupted." @SETIEric@qoto.org (Mastodon) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
I see we've got some blc_25 WUs now. Looks like they require more work than the blc_24s did. Grant Darwin NT |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
"We had a drive in the array that holds the workunits that was generating errors. I've kicked it out of the array. We'll see if the rebuild fixes the problem." More disk drives was a line item in the fall fundraising drive, wasn't it? |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"We had a drive in the array that holds the workunits that was generating errors. I've kicked it out of the array. We'll see if the rebuild fixes the problem." . . Thanks for the update, Eric. Stephen :) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
The WU File deleters have been having issues for a few days now, however things appear to be getting worse with the backlog reaching new highs. Hopefully we won't get to the point of running out of disk space till well after everyone's back at work at Berkeley. Grant Darwin NT |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"We had a drive in the array that holds the workunits that was generating errors. I've kicked it out of the array. We'll see if the rebuild fixes the problem." . . For what it's worth, since the issue with the derelict RAID drive, and hopefully no jinxes involved here, I have not had to play kick the servers at all. Work requests are being met and my caches are full. It would seem, on the surface, that the malfunctioning drive may have been at the heart of the issue for some time. Stephen :) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"We had a drive in the array that holds the workunits that was generating errors. I've kicked it out of the array. We'll see if the rebuild fixes the problem." I too was wondering the same, or more likely it was the re-apportioning of the memory Eric mentioned. I haven't had to kick the servers either. Smooth sailing for once and I really like it. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Hi Keith, . . Nice to know the effect is not just with my rigs. I wonder if things have improved for Grant too, he seemed to be suffering from the issue a lot. Stephen ?? |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
". . Nice to know the effect is not just with my rigs. I wonder if things have improved for Grant too, he seemed to be suffering from the issue a lot." About the same. OK for days or weeks at a time, then the cache runs down for a bit and Tbar's triple update gets it going again. The issue generally occurs when there is no AP work, or mostly just GBT or Arecibo. The fact that there has been a steady flow of AP work for a while now is most likely why we're not having issues getting work. Grant Darwin NT |
Jimbocous Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
Back up over 90 minutes, still can't get any tasks for any of my three. May be time to head to Einstein? |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Once I noticed that the servers were back up, I went over to my crunchers and forced a request-for-work. I started getting work right away (first 1, then 100+ a couple of times). Check to see if your machines are on a delay because they requested work when none was available. |
Jimbocous Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
"Once I noticed that the servers were back up, I went over to my crunchers and forced a request-for-work. I started getting work right away (first 1, then 100+ a couple of times). Check to see if your machines are on a delay because they requested work when none was available." Thanks, the first thing I do when it's back up is force an update on all boxes via BOINCTasks. Every 305 seconds I'm getting "No work available". === Disregard, flowing now ... |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"Back up over 90 minutes, still can't get any tasks for any of my three." . . I was surprised that I got some new work on the first try after the outage, but since then nada. Looking at the server page the RTS tasks are below 90K but the creation rate is only 8/sec. Oh well! . . Have fun at Einstein :) [edit] .. OK, so while I was reading and typing, a work request got an "unable to communicate with project, server may be down" message, followed on the next attempt by a large download. Maybe you should try again now :) Stephen :) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
AP Validators don't appear to be working too well: the numbers of AP WUs awaiting Validation and Assimilation are heading for orbit. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
AP Awaiting Validation and WU Awaiting Assimilation continue to climb. Grant Darwin NT |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I guess no one from the project has noticed. Hopeful that it gets fixed with maintenance tomorrow. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
"I guess no one from the project has noticed. Hopeful that it gets fixed with maintenance tomorrow." Yep. Hopefully the restart after the system shut down will kick things along. Grant Darwin NT |
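
For anyone wanting to repeat the manual MD5 check mentioned earlier in the thread (on WU 2739307008), here is a minimal sketch in Python. It is not how BOINC does its own verification; the function names and the file path in the usage note are made up for illustration, and the only value taken from the thread is the digest quoted there.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so large workunit files do not need
    to fit in memory. `chunk_size` is an arbitrary choice (1 MiB).
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's MD5 with an expected hex string (case-insensitive)."""
    return file_md5(path) == expected_hex.strip().lower()
```

Usage would be something like `matches_expected("path/to/downloaded_wu_file", "450a32005c6700d7ab95284edc959572")`, where the path is a placeholder for wherever the file landed on your machine. If two independent downloads give the same digest but the download still fails its check, that points at the stored reference value rather than the file, which is the reasoning given in the thread.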