Storage machine crash....
Author | Message |
---|---|
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
A machine that was holding 15% of our outgoing workunits has crashed and refuses to start back up. In the short term, this means that attempts to access those workunits will cause an error until the workunit is marked as bad. Sorry for the inconvenience. @SETIEric@qoto.org (Mastodon) |
Gary Charpentier Joined: 25 Dec 00 Posts: 31045 Credit: 53,134,872 RAC: 32 |
eeeeekk |
FurryGuy Joined: 1 Jun 04 Posts: 6 Credit: 9,294,513 RAC: 1 |
"A machine that was holding 15% of our outgoing workunits has crashed and refuses to start back up. Short term it means that attempts to access those workunits will cause an error until the workunit is marked as bad." So... should we wait for the server to catch up on its own, or should we abort any stalled download WUs? |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13864 Credit: 208,696,464 RAC: 304 |
"should we wait for the server to catch up on its own, or should we abort any stalled download WUs?" I wouldn't abort. Stalled downloads are occurring even with the new work units being allocated for download, and they will download eventually. But it's taking a while for them to start, often with extended pauses and restarts before they finish. Oh, and even though the splitters show as running, they're not actually producing much work at the moment. So work is going to remain extremely scarce for some time yet. Grant Darwin NT |
ronssito Joined: 8 Feb 00 Posts: 19 Credit: 43,465,609 RAC: 63 |
Thanks and great job guys! |
J3P-0 Joined: 1 Dec 11 Posts: 45 Credit: 25,258,781 RAC: 180 |
Thanks for the update. As of 11:25 AM CST I have a bunch of tasks waiting to report, with nothing downloaded. Should I continue to wait or abort, or will it pick back up when storage is back online? EDIT: It seems a reboot fixed my issue. |
rob smith Joined: 7 Mar 03 Posts: 22572 Credit: 416,307,556 RAC: 380 |
Don't abort them, as only part of the storage system has failed and there is no way for us to identify whether a task was distributed from the failed part or the part that is working correctly (only about 15% of the tasks were being held by the failed computer). Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
The system eventually came back up and we're getting the missing workunits back online as quickly as we can. There will still be some download errors as things will be out of synchronization for a while. Some workunits that exist in the database may not have been flushed to disk before the system went down (although in theory our disk controllers shouldn't allow that to happen). @SETIEric@qoto.org (Mastodon) |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
|
Ray Cameron Joined: 2 Sep 04 Posts: 5 Credit: 107,850 RAC: 0 |
Any idea when there will be data available to process? Ray |
rob smith Joined: 7 Mar 03 Posts: 22572 Credit: 416,307,556 RAC: 380 |
As the servers are coming back to life after a ~24-hour break, one can expect it to be a S-L-O-W process. (Given the time they started to come back, I would guess that it's an automated job, triggered by some task or other coming back to life.) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Ray Cameron Joined: 2 Sep 04 Posts: 5 Credit: 107,850 RAC: 0 |
It's 2:55 Eastern time; my computer just downloaded work and I'm now processing! |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
Sorry for the late notice. The problems we had bloated the result table to about double its normal size. Hopefully it will be back down to normal next week. @SETIEric@qoto.org (Mastodon) |
Sirius B Joined: 26 Dec 00 Posts: 24920 Credit: 3,081,182 RAC: 7 |
Wow, still on campus at 17:20? Whatever will CRL say? Thanks for what you guys do. :-) |
ronssito Joined: 8 Feb 00 Posts: 19 Credit: 43,465,609 RAC: 63 |
3rd consecutive day of no boinc stats update |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
"3rd consecutive day of no boinc stats update" Not sure why that would be. Our stats files are in place and have current timestamps. https://setiathome.berkeley.edu/stats/ @SETIEric@qoto.org (Mastodon) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874 |
BOINCstats has processed one now. Probably all three days' worth in one go, judging by the figures on my account. |
ronssito Joined: 8 Feb 00 Posts: 19 Credit: 43,465,609 RAC: 63 |
My BOINC stats update each morning, so tomorrow we shall see. |
Roland Joined: 1 Feb 19 Posts: 1 Credit: 155,648 RAC: 0 |
Hello Eric, would it be possible to update the "Status of the UC-Berkeley SETI Efforts" (Korpela, et al. 2011)? Some years have passed between 2011 and today, or has nothing changed? Thank you |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
We're working on a couple of large papers right now. We'll certainly post them when we're done. @SETIEric@qoto.org (Mastodon) |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.