Error Uploadserver
Author | Message |
---|---|
Skywalker66 @ Berlin Joined: 31 Jan 01 Posts: 78 Credit: 27,692,349 RAC: 0 |
I have a new error that I have never seen before: 20.07.2010 15:49:53 SETI@home [error] Error reported by file upload server: can't open file /home/boincadm/projects/sah/upload/2b9/04jn10aa.3527.18886.15.10.105_2_0: Read-only file system 20.07.2010 15:49:53 SETI@home Temporarily failed upload of 04jn10aa.3527.18886.15.10.105_2_0: transient upload error |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Yes, new one for me too. I saw a brief total network outage (uploads, message boards, status page) starting around 13:20 UTC (I was doing shorties at the time): 20-Jul-2010 14:18:31 [SETI@home] Finished upload 20-Jul-2010 14:21:01 [SETI@home] Started upload 20-Jul-2010 14:21:24 [SETI@home] Temporarily failed upload The network came back quite quickly, but all subsequent attempts show that error: 20-Jul-2010 14:26:40 [SETI@home] [error] Error reported by file upload server: can't open file /home/boincadm/projects/sah/upload/31e/19my10aa.29386.9479.7.10.208_0_0: Read-only file system Unfortunately, it accepts the upload of the whole file before discovering the error, so the cricket graphs show high inbound activity. I'll PM Jeff, and ask if he can fix it before total shut-down for the outage. |
Skywalker66 @ Berlin Joined: 31 Jan 01 Posts: 78 Credit: 27,692,349 RAC: 0 |
thanks Richard !!!! |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Extra information (I pointed Jeff to this thread, so any more useful information should catch his eye here): Beta uploads are still running fine, it's only the main project which is affected. |
perryjay Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
I'm getting the same error. I rebooted just before noticing the error. I aborted the first one in the transfers tab and then aborted it in the tasks tab. The next WU to finish has also hung with this message. Let us know what happens. Ok, just a little added something, it's happening on both CPU and GPU work. The GPU WU was dated 04jn10aa and the CPU failure was dated 06jn10aa if that helps you any. PROUD MEMBER OF Team Starfire World BOINC |
Lint trap Joined: 30 May 03 Posts: 871 Credit: 28,092,319 RAC: 0 |
Yep, suspending network activity... Martin |
perryjay Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
I guess somebody is working on it; I'm getting the "projects may be temporarily down" message now. Wish they had waited just a few minutes more though. I just finished a CPU task from 05no09ad and would have liked to see if it might just be the newer WUs going bad. Edit: Just noticed I'm getting validate errors now too. Probably related. PROUD MEMBER OF Team Starfire World BOINC |
Lint trap Joined: 30 May 03 Posts: 871 Credit: 28,092,319 RAC: 0 |
My first error occurred: 7/20/2010 10:33:51 AM SETI@home Started upload of 05jn10aa.18791.65145.15.10.65_0_0 7/20/2010 10:33:54 AM SETI@home [error] Error reported by file upload server: can't open file /home/boincadm/projects/sah/upload/6b/05jn10aa.18791.65145.15.10.65_0_0: Read-only file system So, I'll just leave the network suspended now. It was going to "auto"-suspend at noon EST anyways... No big loss. I'll just have 1 extra wu to report on Friday/Saturday, whenever I can... Martin |
Donald L. Johnson Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
Yeah, Server Status page shows all red and orange except for the Databases and webpages. Probably figured the upload server would not be back online before 0900 PDT (1600 UTC) so they started the shutdown early. I had 2 short VHARs trying to upload, guess they'll have to wait until Friday. |
Nemesis Joined: 14 Mar 07 Posts: 129 Credit: 31,295,655 RAC: 0 |
I sure hope it's being worked on and that the problem with validation errors can be corrected. I'm well over 100 "invalid" WUs now from all 3 of my crunchers... all stuff that was, supposedly, uploaded today, July 20th. If it was only one box then I would assume it was my problem, but with it happening on all 3 of my crunchers, it's something wrong on the SETI side of things. Now that the weekly outage has started, at least I won't be uploading any more trouble. |
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
I've noticed some WUs which have this message: <core_client_version>6.10.56</core_client_version> <![CDATA[ <message> - exit code -529697949 (0xe06d7363) </message> <stderr_txt> setiathome_CUDA: Found 1 CUDA device(s): Device 1 : GeForce GTS 250 19no09ac.11796.885.9.10.220 application SETI@home Enhanced created 13 Jul 2010 10:36:54 UTC Error code: -529697949 (0xffffffffe06d7363) And a lot of "client detached" messages, on some 40 tasks. I wonder what caused this, as I've no clue. |
Terror Australis Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44 |
My errors for the same period: 20/07/2010 23:17:31 SETI@home Started upload of 03jn10aa.8800.8247.11.10.84_0_0 20/07/2010 23:17:31 SETI@home Started upload of 03jn10aa.8800.8247.11.10.83_0_0 20/07/2010 23:17:37 SETI@home [error] Error reported by file upload server: can't open file /home/boincadm/projects/sah/upload/c/03jn10aa.8800.8247.11.10.84_0_0: Read-only file system 20/07/2010 23:17:37 SETI@home [error] Error reported by file upload server: can't open file /home/boincadm/projects/sah/upload/30e/03jn10aa.8800.8247.11.10.83_0_0: Read-only file system 20/07/2010 23:17:37 SETI@home Temporarily failed upload of 03jn10aa.8800.8247.11.10.84_0_0: transient upload error 20/07/2010 23:17:37 SETI@home Backing off 1 min 0 sec on upload of 03jn10aa.8800.8247.11.10.84_0_0 20/07/2010 23:17:37 SETI@home Temporarily failed upload of 03jn10aa.8800.8247.11.10.83_0_0: transient upload error 20/07/2010 23:17:37 SETI@home Backing off 1 min 0 sec on upload of 03jn10aa.8800.8247.11.10.83_0_0 Similar to others, but no mix-up of the file names. Times are in Australian Central Standard Time (UTC + 9.5 hours), about 1 minute after the problem started. All machines were showing this error, and the Server Status page showed all SAH Validators in red. Everything else was green. T.A. |
Donald L. Johnson Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
In Papa's sticky thread Extended Outage July 20 2010 - Problems he mentions that there was a BOINC Database Crash this morning shortly before the shutdown. That could explain many of the error messages cited in this thread. I suspect getting the Database back online was one of the first tasks after the shutdown. Hope Jeff or Papa will give us an update on Wednesday, but if not, I can wait until Thursday or Friday. Until then, crunch 'em if you got 'em. |
Hellsheep Joined: 12 Sep 08 Posts: 428 Credit: 784,780 RAC: 0 |
Just to clarify: as far as I am aware, the way web servers and file systems work is that after a crash or a serious error, the system reboots in read-only mode. Also, usually an FSCK (file system check) is done on the server automatically. It would seem the server encountered an error, and was either rebooted or rebooted itself into read-only mode to prevent any further issues. :) (Good thing web servers and servers are my specialty.) ;) - Jarryd |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"Just to clarify: as far as I am aware, the way web servers and file systems work is that after a crash or a serious error, the system reboots in read-only mode. Also, usually an FSCK (file system check) is done on the server automatically." As I said in Pappa's thread yesterday, that makes a lot more sense than his off-the-cuff remark about a BOINC database crash. I don't know much about web or general *nix servers, but I do know a bit about databases, and if the early outage was invoked by staff because of database problems, then they would have been the result of the spontaneous reboot, not the original cause. Different symptoms entirely. |
Donald L. Johnson Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
"Just to clarify: as far as I am aware, the way web servers and file systems work is that after a crash or a serious error, the system reboots in read-only mode. Also, usually an FSCK (file system check) is done on the server automatically." Richard, if I understand this right, you saw indications of a short power / Internet access interruption, which may have caused the upload/download servers (and maybe others) to reboot in read-only mode, which then caused the master BOINC database to crash. With ALL that chaos, they shut everything down and did a full restart. That DOES make a whole lot more sense than just a database crash. |
Hellsheep Joined: 12 Sep 08 Posts: 428 Credit: 784,780 RAC: 0 |
"Just to clarify: as far as I am aware, the way web servers and file systems work is that after a crash or a serious error, the system reboots in read-only mode. Also, usually an FSCK (file system check) is done on the server automatically." 100% correct. A power outage or surge would cause the servers to reboot in read-only mode because the system treats it as a possible hardware failure. :) The database probably did crash, but only as a result of the server being in read-only mode and unable to write anything. :) - Jarryd |
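
The upload errors quoted in this thread all share one message shape, so anyone wanting to count them in a saved client log could pick them out mechanically. Below is a minimal Python sketch, assuming the format seen in the posts above. The pattern and the function names `find_upload_errors` and `count_read_only` are hypothetical helpers for illustration, not part of BOINC.

```python
import re

# Assumed format, inferred from the client messages quoted in this thread;
# it is not taken from the BOINC source code.
UPLOAD_ERROR = re.compile(
    r"Error reported by file upload server: "
    r"can't open file (?P<path>\S+): (?P<reason>.+)$"
)


def find_upload_errors(lines):
    """Return (server_path, reason) for every upload-server error line."""
    errors = []
    for line in lines:
        match = UPLOAD_ERROR.search(line.strip())
        if match:
            errors.append((match.group("path"), match.group("reason")))
    return errors


def count_read_only(lines):
    """Count upload errors whose reason is a read-only file system."""
    return sum(
        1
        for _, reason in find_upload_errors(lines)
        if reason == "Read-only file system"
    )


if __name__ == "__main__":
    # Sample lines copied from a post in this thread.
    sample = [
        "20/07/2010 23:17:31 SETI@home Started upload of "
        "03jn10aa.8800.8247.11.10.84_0_0",
        "20/07/2010 23:17:37 SETI@home [error] Error reported by file upload "
        "server: can't open file /home/boincadm/projects/sah/upload/c/"
        "03jn10aa.8800.8247.11.10.84_0_0: Read-only file system",
        "20/07/2010 23:17:37 SETI@home Temporarily failed upload of "
        "03jn10aa.8800.8247.11.10.84_0_0: transient upload error",
    ]
    for path, reason in find_upload_errors(sample):
        print(f"{reason}: {path}")
    print("read-only errors:", count_read_only(sample))
```

Because the pattern keys on the message text rather than the timestamp, it should cope with the different date formats posted here, though other client versions may word the message differently.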