Error on file upload: Socket Read incomplete:

David M. Feifer
Volunteer tester

Joined: 7 Sep 05
Posts: 15
Credit: 11,323
RAC: 0
United States
Message 13802 - Posted: 27 Jan 2007, 3:39:21 UTC

1/26/2007 1:14:59 PM|SETI@home Beta Test|Error on file upload: socket read incomplete: asked for 15980, got 12912: Socket operation on non-socket
1/26/2007 1:14:59 PM|SETI@home Beta Test|[file_xfer] Temporarily failed upload of 06no06aa.19371.89263.10.4.28_0_0: transient upload error
1/26/2007 1:14:59 PM|SETI@home Beta Test|Backing off 2 hr 45 min 53 sec on upload of file 06no06aa.19371.89263.10.4.28_0_0
1/26/2007 1:32:18 PM|SETI@home Beta Test|Sending scheduler request: To fetch work
1/26/2007 1:32:18 PM|SETI@home Beta Test|Requesting 78 seconds of new work, and reporting 1 completed tasks
1/26/2007 1:32:23 PM|SETI@home Beta Test|Scheduler RPC succeeded [server version 505]
1/26/2007 1:32:23 PM|SETI@home Beta Test|Deferring communication for 7 sec
1/26/2007 1:32:23 PM|SETI@home Beta Test|Reason: requested by project
1/26/2007 1:32:25 PM|SETI@home Beta Test|[file_xfer] Started download of file 06no06aa.19371.92535.10.4.204
1/26/2007 1:32:27 PM|SETI@home Beta Test|[file_xfer] Finished download of file 06no06aa.19371.92535.10.4.204
1/26/2007 1:32:27 PM|SETI@home Beta Test|[file_xfer] Throughput 286885 bytes/sec
1/26/2007 2:51:58 PM|SETI@home Beta Test|Computation for task 06no06aa.19371.90899.10.4.106_1 finished
1/26/2007 2:51:58 PM||Starting 06no06aa.19371.90899.10.4.202_1
1/26/2007 2:51:58 PM|SETI@home Beta Test|Starting task 06no06aa.19371.90899.10.4.202_1 using setiathome_enhanced version 517
1/26/2007 2:52:00 PM|SETI@home Beta Test|[file_xfer] Started upload of file 06no06aa.19371.90899.10.4.106_1_0
1/26/2007 2:52:04 PM|SETI@home Beta Test|[file_xfer] Finished upload of file 06no06aa.19371.90899.10.4.106_1_0
1/26/2007 2:52:04 PM|SETI@home Beta Test|[file_xfer] Throughput 26365 bytes/sec
1/26/2007 4:00:53 PM|SETI@home Beta Test|[file_xfer] Started upload of file 06no06aa.19371.89263.10.4.28_0_0
1/26/2007 4:00:58 PM|SETI@home Beta Test|Error on file upload: socket read incomplete: asked for 15980, got 12912: Socket operation on non-socket
1/26/2007 4:00:58 PM|SETI@home Beta Test|[file_xfer] Temporarily failed upload of 06no06aa.19371.89263.10.4.28_0_0: transient upload error
1/26/2007 4:00:58 PM|SETI@home Beta Test|Backing off 2 hr 58 min 18 sec on upload of file 06no06aa.19371.89263.10.4.28_0_0
1/26/2007 4:50:05 PM|SETI@home Beta Test|Sending scheduler request: To fetch work
1/26/2007 4:50:05 PM|SETI@home Beta Test|Requesting 102 seconds of new work, and reporting 1 completed tasks
1/26/2007 4:50:10 PM|SETI@home Beta Test|Scheduler RPC succeeded [server version 505]
1/26/2007 4:50:10 PM|SETI@home Beta Test|Deferring communication for 7 sec
1/26/2007 4:50:10 PM|SETI@home Beta Test|Reason: requested by project
1/26/2007 4:50:12 PM|SETI@home Beta Test|[file_xfer] Started download of file 06no06aa.19371.92944.10.4.62
1/26/2007 4:50:14 PM|SETI@home Beta Test|[file_xfer] Finished download of file 06no06aa.19371.92944.10.4.62
1/26/2007 4:50:14 PM|SETI@home Beta Test|[file_xfer] Throughput 287001 bytes/sec
1/26/2007 6:09:40 PM|SETI@home Beta Test|Computation for task 06no06aa.19371.90899.10.4.202_1 finished
1/26/2007 6:09:40 PM||Starting 06no06aa.19371.91308.10.4.78_0
1/26/2007 6:09:40 PM|SETI@home Beta Test|Starting task 06no06aa.19371.91308.10.4.78_0 using setiathome_enhanced version 517
1/26/2007 6:09:43 PM|SETI@home Beta Test|[file_xfer] Started upload of file 06no06aa.19371.90899.10.4.202_1_0
1/26/2007 6:09:47 PM|SETI@home Beta Test|[file_xfer] Finished upload of file 06no06aa.19371.90899.10.4.202_1_0
1/26/2007 6:09:47 PM|SETI@home Beta Test|[file_xfer] Throughput 30720 bytes/sec
1/26/2007 6:59:17 PM|SETI@home Beta Test|[file_xfer] Started upload of file 06no06aa.19371.89263.10.4.28_0_0
1/26/2007 6:59:19 PM|SETI@home Beta Test|Error on file upload: socket read incomplete: asked for 15980, got 12912: Socket operation on non-socket
1/26/2007 6:59:19 PM|SETI@home Beta Test|[file_xfer] Temporarily failed upload of 06no06aa.19371.89263.10.4.28_0_0: transient upload error
1/26/2007 6:59:19 PM|SETI@home Beta Test|Backing off 1 hr 25 min 59 sec on upload of file 06no06aa.19371.89263.10.4.28_0_0
1/26/2007 8:25:19 PM|SETI@home Beta Test|[file_xfer] Started upload of file 06no06aa.19371.89263.10.4.28_0_0
1/26/2007 8:25:23 PM|SETI@home Beta Test|Error on file upload: socket read incomplete: asked for 15980, got 12912: Socket operation on non-socket
1/26/2007 8:25:23 PM|SETI@home Beta Test|[file_xfer] Temporarily failed upload of 06no06aa.19371.89263.10.4.28_0_0: transient upload error
1/26/2007 8:25:23 PM|SETI@home Beta Test|Backing off 3 hr 46 min 0 sec on upload of file 06no06aa.19371.89263.10.4.28_0_0
1/26/2007 10:26:49 PM|SETI@home Beta Test|Sending scheduler request: Requested by user
1/26/2007 10:26:49 PM|SETI@home Beta Test|Reporting 1 tasks
1/26/2007 10:26:54 PM|SETI@home Beta Test|Scheduler RPC succeeded [server version 505]
1/26/2007 10:26:54 PM|SETI@home Beta Test|Deferring communication for 7 sec
1/26/2007 10:26:54 PM|SETI@home Beta Test|Reason: requested by project

This one has actually been done for a few days now, but it just can't complete the upload. It's the only one I have seen like this; I have done other units since then without a problem. Any ideas?
ID: 13802
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 13815 - Posted: 27 Jan 2007, 11:35:13 UTC

This is a rare problem, but it seems to be becoming more common: maybe someone should feed it into the BOINC bug-reporting system.

I've seen it twice, and written it up here (this Beta board, but describing an Einstein event) and here (BOINC dev board, describing a SETI main event).

Byron has reported it in SETI Beta here.

There are a couple of recent reports in SETI Main, here and here.

In my own case, I'm convinced that the problem started when two WUs from the same project finished within 1 second of each other on a multi-core machine: BOINC seems to have got confused about which file to upload (the numbers in the error message relate to the file sizes of the upload files).

In some cases, it seems to correlate with an error in the state file found the next time BOINC restarts - David, if you haven't restarted BOINC since you saw the error, perhaps you could test this out?

Whatever it is, it seems to be an error on the local computer, and the only way I could find to solve it was to abort the failing transfer. But unfortunately the reports have been rather submerged by the spate of server-related upload problems at both SETI and Einstein (and mis-diagnosed as a server problem by at least one moderator on the SETI main board).
ID: 13815
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 13817 - Posted: 27 Jan 2007, 14:04:41 UTC

And another one on SETI main.
ID: 13817
David M. Feifer
Volunteer tester

Joined: 7 Sep 05
Posts: 15
Credit: 11,323
RAC: 0
United States
Message 13818 - Posted: 27 Jan 2007, 14:05:51 UTC - in response to Message 13815.  
Last modified: 27 Jan 2007, 14:07:52 UTC

Restarted and tried to reload, and I still have the same error. I have only experienced the error with that specific work unit so far: result ID 1267041, work unit ID 302792. The system states (asked for 15980, got 12912); the transfer area lists SETI@home Beta Test 06no06aa.19371.89263.10.4.28_0_0 86.63% 13.52/15.61KB

This is a single-core AMD64 3200+ with 2 GB of system RAM and 2 GB of ReadyBoost, running 64-bit Vista.

Really wish these boards were keyword searchable.
ID: 13818
Josef W. Segur
Volunteer tester

Joined: 14 Oct 05
Posts: 1137
Credit: 1,848,733
RAC: 0
United States
Message 13820 - Posted: 27 Jan 2007, 17:09:25 UTC - in response to Message 13818.  

> Restarted and tried to reload, and I still have the same error. I have only experienced the error with that specific work unit so far: result ID 1267041, work unit ID 302792. The system states (asked for 15980, got 12912); the transfer area lists SETI@home Beta Test 06no06aa.19371.89263.10.4.28_0_0 86.63% 13.52/15.61KB
>
> This is a single-core AMD64 3200+ with 2 GB of system RAM and 2 GB of ReadyBoost, running 64-bit Vista.
>
> Really wish these boards were keyword searchable.

If you haven't already aborted the upload, open the 06no06aa.19371.89263.10.4.28_0_0 file in a text editor and look at the end. I suspect you'll see one <spike> report but no <best_spike>, <best_gaussian>, or <best_pulse>. Without those "best of" signals, the result wouldn't validate even if the upload were forced.
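
The check described above can also be done without a text editor. Below is a minimal Python sketch, assuming the result file is plain text in which each signal shows up as an opening tag such as `<best_spike>`. The function name `missing_best_of` and the sample text are made up for illustration; only the tag names come from the post.

```python
import re

# Tag names taken from the post; everything else here is illustrative.
BEST_OF_TAGS = ("best_spike", "best_gaussian", "best_pulse")


def missing_best_of(result_text):
    """Return the 'best of' tags that never appear as an opening tag."""
    return [
        tag
        for tag in BEST_OF_TAGS
        if not re.search(rf"<{tag}\b", result_text)
    ]


# Hypothetical truncated result: a header and one spike, no 'best of' signals.
sample = "<workunit_header>...</workunit_header>\n<spike>...</spike>\n"
print(missing_best_of(sample))
```

An empty list only means that all three tags are present; it does not by itself say the result would validate.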

There's been an active discussion of this problem on the boinc_alpha mailing list for a few days; that seems to be the route to report problems with the Alpha versions of BOINC. The BOINCzilla bug report system had fewer than 600 total reports the last time I looked; it's very inactive.

The only link to the keyword search for these boards is on the Questions and answers page. Dunno why, but it's the same in the main project.
Joe


ID: 13820
David M. Feifer
Volunteer tester

Joined: 7 Sep 05
Posts: 15
Credit: 11,323
RAC: 0
United States
Message 13855 - Posted: 28 Jan 2007, 15:09:54 UTC - in response to Message 13820.  

Something else I just noticed while watching the system:

1/28/2007 9:59:25 AM|SETI@home Beta Test|Computation for task 06no06aa.19371.92535.10.4.204_0 finished
1/28/2007 9:59:27 AM|SETI@home Beta Test|[file_xfer] Started upload of file 06no06aa.19371.92535.10.4.204_0_0
1/28/2007 9:59:29 AM|SETI@home Beta Test|[file_xfer] Finished upload of file 06no06aa.19371.92535.10.4.204_0_0
1/28/2007 9:59:29 AM|SETI@home Beta Test|[file_xfer] Throughput 30914 bytes/sec
1/28/2007 10:00:11 AM||Starting 06no06aa.19371.92535.10.4.204_0
1/28/2007 10:00:11 AM|SETI@home Beta Test|Starting task 06no06aa.19371.92535.10.4.204_0 using setiathome_enhanced version 517

Currently my system appears to be rehashing the same WU that it just completed. There is no time to finish listed, and at the end it says ready to report. It is currently at 8:06 CPU time with a progress of 3.9%.
Possibly this is what is making the other units larger than what the server is expecting?
ID: 13855
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 13861 - Posted: 28 Jan 2007, 15:55:32 UTC

Note that this is the same WU that you downloaded during the upload errors in your first post:

1/26/2007 1:32:25 PM|SETI@home Beta Test|[file_xfer] Started download of file 06no06aa.19371.92535.10.4.204

I saw something similar in the Einstein error I linked to earlier: not only did an upload go wrong, but the next WU to crunch [in the same slot directory ???] is also damaged.

If you can hang on until the next subscriber to the dev mailing list can chip in, they might ask for logs or other system info to help the debug effort: otherwise, I suspect you're going to have to abort it.
ID: 13861
Daniel Schaalma
Volunteer tester

Joined: 16 Feb 06
Posts: 30
Credit: 3,136
RAC: 0
United States
Message 13900 - Posted: 29 Jan 2007, 13:44:26 UTC

I have been seeing this same error on the main Seti project. I have been getting at least one or two of these errors per week for a couple of months now. The error is most commonly seen on my multicore or HT machines. In each occurrence on my machines, the workunit is listed in the Task List as being less than 100% complete, but its status is listed as "uploading". On 32-bit systems, this error usually terminates all BOINC-related processes, in some instances locking up the O/S. On 64-bit Windows, only the Seti app and BOINC client are terminated, but the BOINC Manager is left running, and the O/S does not lock up. On my Linux systems, when this occurs, the whole system locks up. I have also had this error occur on a single core, single CPU system running SuSE Linux 10.0 x86_64. In all cases when this error occurs, upon restarting BOINC, the Message Tab refers to an error in the state file. The only way for me to recover from this has been to abort the file transfer.

Regards, Daniel.
ID: 13900
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 14948 - Posted: 19 Feb 2007, 15:44:14 UTC - in response to Message 13820.  
Last modified: 19 Feb 2007, 15:44:35 UTC

> There's been an active discussion of this problem on the boinc_alpha mailing list for a few days; that seems to be the route to report problems with the Alpha versions of BOINC.
> Joe

Can anyone pass on any conclusions from the alpha discussion about this problem? I've had a couple more instances recently, under BOINC 5.8.9, and I've posted full logs in this thread.
ID: 14948
Byron Leigh Hatch @ team Carl ...
Volunteer tester

Joined: 15 Jun 05
Posts: 970
Credit: 1,495,169
RAC: 0
Canada
Message 14952 - Posted: 19 Feb 2007, 16:41:06 UTC - in response to Message 14948.  
Last modified: 19 Feb 2007, 16:58:45 UTC




Richard: more info:


>> There's been an active discussion of this problem on the boinc_alpha mailing list for a few days; that seems to be the route to report problems with the Alpha versions of BOINC.
>> Joe
>
> Can anyone pass on any conclusions from the alpha discussion about this problem? I've had a couple more instances recently, under BOINC 5.8.9, and I've posted full logs in this thread.

Message: 5
Date: Sat, 27 Jan 2007 11:48:34 -0500
From: "Josef W. Segur"
Subject: Re: [boinc_alpha] Probably a SETI issue, but....
To: "BOINC Alpha list"
Message-ID:
Content-Type: text/plain; charset=US-ASCII

On 26 Jan 2007 at 21:52, Alex wrote:

> Yes, same failures after upgrading to 5.8.5 [stdoutdae.txt attached].
> After looking at this and failures on other hosts, maybe these uploads were
> fubared by 5.8.4.
> In any case, on to 5.86a.

In the Seti Beta case[1] I looked at for a host using BOINC 5.8.0, the
upload was certainly fubared. The setiathome_enhanced app copies the
WU header to the result file at startup, adds any "interesting" signals
as they occur, and finally writes the "best of" signals. That WU didn't
have any "interesting" signals, the size BOINC measured (16324 bytes)
when the app exited was about right for the WU header plus the "best of"
signals, but the actual file on disk (13063 bytes) contained only the
WU header. The "asked for 16324, got 13063" in the error report
reflected the actual situation, and since the Validator needs signals
to check, uploading the short file would be no use.

The question is how the file could be complete when BOINC checks it at
app exit, but be the older version without the final writes when upload
is attempted.
                                                            Joe


[1] - http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=731&nowrap=true#13067
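
The reading given in the email above, that "asked for" is the size BOINC recorded when the app exited and "got" is what is actually on disk, can be pulled out of a message log mechanically. Below is a minimal Python sketch; the function name `upload_shortfall` is hypothetical, and the regular expression only assumes the wording of the client messages quoted in this thread.

```python
import re

# Matches the error wording seen in the client messages quoted in this thread.
INCOMPLETE = re.compile(r"socket read incomplete: asked for (\d+), got (\d+)")


def upload_shortfall(log_line):
    """Return (asked, got, missing_bytes), or None if the line is not this error."""
    match = INCOMPLETE.search(log_line)
    if match is None:
        return None
    asked, got = int(match.group(1)), int(match.group(2))
    return asked, got, asked - got


line = (
    "1/26/2007 1:14:59 PM|SETI@home Beta Test|Error on file upload: "
    "socket read incomplete: asked for 15980, got 12912: "
    "Socket operation on non-socket"
)
print(upload_shortfall(line))  # (15980, 12912, 3068)
```

If that reading is right, the third number is roughly the size of whatever was written after the WU header and then lost.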




ID: 14952
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 14954 - Posted: 19 Feb 2007, 16:56:17 UTC - in response to Message 14952.  
Last modified: 19 Feb 2007, 17:10:44 UTC

> The question is how the file could be complete when BOINC checks it at
> app exit, but be the older version without the final writes when upload
> is attempted.

I think the logs I've just posted in main Q & A (link in my previous post) might explain 'how': both WUs finished normally, but then started processing again before the upload happened. That would explain why the concluding 'best of' lines were missing.

And the question becomes "why" BOINC chose the recently-finished WU as the next one to start processing.

Edit: I preserved the output files from the two recent examples before aborting the uploads. One ends after the </workunit_header> line: the other goes on to list two triplets and a spike. For that WU (114524447), successful crunchers report 4 triplets and 1 spike, so the 'partial re-crunch' theory is reinforced.
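
The 'finished, then started again' pattern is easy to look for in a saved message log. Below is a minimal Python sketch, assuming log lines worded like the ones posted in this thread; the function name `restarted_after_finish` is made up for illustration.

```python
import re

# Wording taken from the client log lines quoted in this thread.
FINISHED = re.compile(r"Computation for task (\S+) finished")
STARTED = re.compile(r"Starting task (\S+) using")


def restarted_after_finish(lines):
    """Return task names that were started again after being reported finished."""
    finished = set()
    suspects = []
    for line in lines:
        match = FINISHED.search(line)
        if match:
            finished.add(match.group(1))
            continue
        match = STARTED.search(line)
        if match and match.group(1) in finished and match.group(1) not in suspects:
            suspects.append(match.group(1))
    return suspects


# Lines from the 28 Jan log earlier in this thread.
log = [
    "1/28/2007 9:59:25 AM|SETI@home Beta Test|Computation for task 06no06aa.19371.92535.10.4.204_0 finished",
    "1/28/2007 9:59:27 AM|SETI@home Beta Test|[file_xfer] Started upload of file 06no06aa.19371.92535.10.4.204_0_0",
    "1/28/2007 10:00:11 AM|SETI@home Beta Test|Starting task 06no06aa.19371.92535.10.4.204_0 using setiathome_enhanced version 517",
]
print(restarted_after_finish(log))
```

A hit only shows that the restart happened; it does not say why the client picked the finished task again.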
ID: 14954
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 15107 - Posted: 22 Feb 2007, 11:52:05 UTC

Yet another lifecycle log: again, the WU finished one second after another WU from the same project, and then started again before uploading.

2007-02-21 20:12:11 [SETI@home] Sending scheduler request: To fetch work
2007-02-21 20:12:11 [SETI@home] Requesting 13201 seconds of new work, and reporting 1 completed tasks
2007-02-21 20:12:17 [SETI@home] Scheduler RPC succeeded [server version 507]
2007-02-21 20:12:17 [SETI@home] Deferring communication for 11 sec
2007-02-21 20:12:17 [SETI@home] Reason: requested by project
2007-02-21 20:12:19 [SETI@home] [file_xfer] Started download of file 11no03aa.4732.12274.92334.3.38
2007-02-21 20:12:19 [SETI@home] [file_xfer] Started download of file 11no03aa.4732.12274.92334.3.43
2007-02-21 20:12:26 [SETI@home] [file_xfer] Finished download of file 11no03aa.4732.12274.92334.3.38
2007-02-21 20:12:26 [SETI@home] [file_xfer] Throughput 68354 bytes/sec
2007-02-21 20:12:26 [SETI@home] [file_xfer] Finished download of file 11no03aa.4732.12274.92334.3.43
2007-02-21 20:12:26 [SETI@home] [file_xfer] Throughput 68756 bytes/sec
2007-02-21 20:12:26 [SETI@home] [file_xfer] Started download of file 11no03aa.4732.12274.92334.3.44
2007-02-21 20:12:26 [SETI@home] [file_xfer] Started download of file 11no03aa.4732.12274.92334.3.45
2007-02-21 20:12:27 [SETI@home] Starting 11no03aa.4732.12274.92334.3.38_3
2007-02-21 20:12:27 [SETI@home] Starting task 11no03aa.4732.12274.92334.3.38_3 using setiathome_enhanced version 517
2007-02-21 20:12:27 [SETI@home] Starting 11no03aa.4732.12274.92334.3.43_3
2007-02-21 20:12:27 [SETI@home] Starting task 11no03aa.4732.12274.92334.3.43_3 using setiathome_enhanced version 517
2007-02-21 20:12:27 [SETI@home] Starting 01no03aa.5768.1328.765888.3.102_0
2007-02-21 20:12:27 [SETI@home] Starting task 01no03aa.5768.1328.765888.3.102_0 using setiathome_enhanced version 517
2007-02-21 20:12:27 [SETI@home] Starting 01no03aa.5768.1233.373588.3.77_0
2007-02-21 20:12:27 [SETI@home] Starting task 01no03aa.5768.1233.373588.3.77_0 using setiathome_enhanced version 517
2007-02-21 20:12:36 [SETI@home] [file_xfer] Finished download of file 11no03aa.4732.12274.92334.3.44
2007-02-21 20:12:36 [SETI@home] [file_xfer] Throughput 44306 bytes/sec
2007-02-21 20:12:36 [SETI@home] [file_xfer] Finished download of file 11no03aa.4732.12274.92334.3.45
2007-02-21 20:12:36 [SETI@home] [file_xfer] Throughput 44219 bytes/sec
2007-02-21 20:12:36 [SETI@home] [file_xfer] Started download of file 15au03aa.20522.7056.765914.3.4
2007-02-21 20:12:38 [SETI@home] Starting 11no03aa.4732.12274.92334.3.44_3
2007-02-21 20:12:38 [SETI@home] Starting task 11no03aa.4732.12274.92334.3.44_3 using setiathome_enhanced version 517
2007-02-21 20:12:38 [SETI@home] Starting 11no03aa.4732.12274.92334.3.45_0
2007-02-21 20:12:38 [SETI@home] Starting task 11no03aa.4732.12274.92334.3.45_0 using setiathome_enhanced version 517
2007-02-21 20:12:46 [SETI@home] [file_xfer] Finished download of file 15au03aa.20522.7056.765914.3.4
2007-02-21 20:12:46 [SETI@home] [file_xfer] Throughput 47095 bytes/sec
2007-02-21 20:12:53 [SETI@home] Computation for task 11no03aa.4732.12274.92334.3.38_3 finished
2007-02-21 20:12:54 [SETI@home] Computation for task 11no03aa.4732.12274.92334.3.43_3 finished
2007-02-21 20:12:55 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.12274.92334.3.38_3_0
2007-02-21 20:12:55 [SETI@home] Starting 11no03aa.4732.12274.92334.3.43_3
2007-02-21 20:12:55 [SETI@home] Starting task 11no03aa.4732.12274.92334.3.43_3 using setiathome_enhanced version 517
2007-02-21 20:12:56 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.12274.92334.3.43_3_0
2007-02-21 20:12:58 [SETI@home] Computation for task 11no03aa.4732.12274.92334.3.44_3 finished
2007-02-21 20:12:58 [SETI@home] Resuming task 19au03aa.16709.19889.648582.3.172_0 using setiathome_enhanced version 517
2007-02-21 20:12:58 [SETI@home] Resuming task 19au03aa.16709.20530.392322.3.171_3 using setiathome_enhanced version 517
2007-02-21 20:12:59 [SETI@home] Computation for task 11no03aa.4732.12274.92334.3.45_0 finished
2007-02-21 20:13:03 [SETI@home] [file_xfer] Finished upload of file 11no03aa.4732.12274.92334.3.38_3_0
2007-02-21 20:13:03 [SETI@home] [file_xfer] Throughput 4505 bytes/sec
2007-02-21 20:13:03 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.12274.92334.3.44_3_0
2007-02-21 20:13:09 [SETI@home] [error] Error on file upload: socket read incomplete: asked for 16382, got 7421: No such file or directory
2007-02-21 20:13:09 [SETI@home] [file_xfer] Temporarily failed upload of 11no03aa.4732.12274.92334.3.43_3_0: transient upload error
ID: 15107
Pepo
Volunteer tester

Joined: 16 Jun 05
Posts: 172
Credit: 251,583
RAC: 0
Slovakia
Message 15122 - Posted: 22 Feb 2007, 17:42:30 UTC

Richard, I've just read your post on the Boinc boards, this was my reply:

> Is anyone making any progress on this (increasingly common) error?
>
> 2 WUs from the same project finish 1 second apart. The second WU then re-starts before uploading, when one would expect a different WU to start.


There is the following entry in the checkin notes:
David 15 Feb 2007
- core client: fix bug where if a task is aborted
(e.g. because it exceeds CPU limit)
it's restarted on the next enforce_schedule().
The problem: we're deleting the ACTIVE_TASK,
but the result is still in the ordered_scheduled_results list.
The solution: call request_schedule_cpus() in
handle_finished_apps() when an ACTIVE_TASK is deleted.
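
To make the mechanism in that checkin note concrete, here is a toy Python model of a stale run list. It is not BOINC source code: only the names `ordered_scheduled_results`, `enforce_schedule`, `handle_finished_apps` and `request_schedule_cpus` come from the note, while the class `ToyClient`, the `with_fix` flag and all method bodies are invented for illustration.

```python
class ToyClient:
    """Toy stand-in for the core client; all method bodies are invented."""

    def __init__(self, results):
        self.pending = list(results)  # results that still need crunching
        self.ordered_scheduled_results = list(results)  # cached run list
        self.active_task = None

    def request_schedule_cpus(self):
        # Rebuild the run list from what is still pending.
        self.ordered_scheduled_results = list(self.pending)

    def enforce_schedule(self):
        # Run whatever is at the head of the (possibly stale) run list.
        head = self.ordered_scheduled_results[:1]
        self.active_task = head[0] if head else None
        return self.active_task

    def handle_finished_apps(self, with_fix):
        # The task has ended: forget it and delete the active task.
        self.pending.remove(self.active_task)
        self.active_task = None
        if with_fix:
            self.request_schedule_cpus()


def next_task_after_finish(with_fix):
    client = ToyClient(["wu_a", "wu_b"])
    client.enforce_schedule()  # starts wu_a
    client.handle_finished_apps(with_fix)
    return client.enforce_schedule()


print(next_task_after_finish(with_fix=False))  # wu_a: the ended task runs again
print(next_task_after_finish(with_fix=True))   # wu_b
```

In this toy model, the ended task is restarted only because the run list is never rebuilt. Whether the real client fails in the same way for normally finished tasks is exactly what the thread is trying to establish.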


It could address the same symptom (restarting a finished task), but possibly not exactly this one.

The read problem could also be that Boinc could not read the same data that was previously written: either the slot changed or the output file was overwritten by the new run.


I think there is a chance it was corrected by the fix.

Peter
ID: 15122
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 15123 - Posted: 22 Feb 2007, 18:39:30 UTC - in response to Message 15122.  

> The read problem could also be that Boinc could not read the same data that was previously written: either the slot changed or the output file was overwritten by the new run.
>
> I think there is a chance it was corrected by the fix.
>
> Peter

Someone else on the BOINC boards has asked for a re-check with the latest build: I've upgraded to 5.8.15, but it's difficult to arrange these events to order. Now if Eric can just keep the Beta splitters off a little longer, and send through a nice batch of VHAR on main, we might just hit one....

I think that the current best bet is that the output file is overwritten by the re-run: see my comments to Joe in my earlier posts. This recent event, again, had a clean copy of the WU header but no data or 'Best of' following the header.
ID: 15123
Pepo
Volunteer tester

Joined: 16 Jun 05
Posts: 172
Credit: 251,583
RAC: 0
Slovakia
Message 15128 - Posted: 22 Feb 2007, 20:03:42 UTC - in response to Message 15123.  
Last modified: 22 Feb 2007, 20:03:55 UTC

> Someone else on the BOINC boards

Nicolas a.k.a. PovAddict, RenderFarm@Home's admin.

> has asked for a re-check with the latest build

Actually I meant the same, but forgot to say so explicitly ;-)

> ... it's difficult to arrange these events to order. Now if Eric can just keep the Beta splitters off a little longer, and send through a nice batch of VHAR on main, we might just hit one....

That's the "beauty" of debugging :-(

Peter
ID: 15128
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 15135 - Posted: 23 Feb 2007, 0:15:33 UTC

Nicolas has come back with a claim attributed to DA that this was fixed in v5.8.11, released 09 Feb.

I refuse to run that version, because of the "harmless", "bogus", 'exit with zero status' message which has caused at least one cruncher on CPDN to lose over 600 hours of work.

While we're waiting for mine to hit the spot with 5.8.15, anybody got any other evidence, pro or anti?
ID: 15135
The Eternal
Volunteer tester

Joined: 21 Aug 05
Posts: 56
Credit: 183,653
RAC: 0
United States
Message 15170 - Posted: 23 Feb 2007, 23:44:48 UTC

5.8.11 is running here... I had one unit with this problem... so, no... not fixed.

WU upload aborted
ID: 15170
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 15193 - Posted: 24 Feb 2007, 10:05:53 UTC - in response to Message 15170.  

> 5.8.11 is running here... I had one unit with this problem... so, no... not fixed.
>
> WU upload aborted

Do you still have the message log (or could you recover it from 'stdoutdae.txt', in the BOINC directory)?

I would be interested in the lines around the time the WU finished crunching: a few lines before, and the lines after 'finished' up to the first attempt at uploading.
ID: 15193
Pepo
Volunteer tester

Joined: 16 Jun 05
Posts: 172
Credit: 251,583
RAC: 0
Slovakia
Message 15219 - Posted: 25 Feb 2007, 1:30:48 UTC

OK, let's repeat it here too: if anyone notices an "Error on file upload" problem, it would be helpful to check the log to see whether that particular result was restarted after finishing but before being uploaded. This is especially useful if it happened with Boinc 5.8.12 or later (because WinterKnight already observed it on 5.8.11).

Peter
ID: 15219
The Eternal
Volunteer tester

Joined: 21 Aug 05
Posts: 56
Credit: 183,653
RAC: 0
United States
Message 15232 - Posted: 25 Feb 2007, 10:12:58 UTC - in response to Message 15193.  
Last modified: 25 Feb 2007, 10:13:55 UTC

>> 5.8.11 is running here... I had one unit with this problem... so, no... not fixed.
>>
>> WU upload aborted
>
> Do you still have the message log (or could you recover it from 'stdoutdae.txt', in the BOINC directory)?
>
> I would be interested in the lines around the time the WU finished crunching: a few lines before, and the lines after 'finished' up to the first attempt at uploading.

Well, here you go... (relevant info is tabbed)

2007-02-21 06:03:59 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.8626.304816.3.18_0_0
2007-02-21 06:03:59 [SETI@home] Computation for task 11no03aa.4732.8626.304816.3.22_0 finished
. 2007-02-21 06:03:59 [SETI@home] Starting 11no03aa.4732.8626.304816.3.23_0
. 2007-02-21 06:03:59 [SETI@home] Starting task 11no03aa.4732.8626.304816.3.23_0 using setiathome_enhanced version 519
2007-02-21 06:04:01 [SETI@home] [file_xfer] Finished upload of file 11no03aa.4732.8626.304816.3.18_0_0
2007-02-21 06:04:01 [SETI@home] [file_xfer] Throughput 67350 bytes/sec
2007-02-21 06:04:01 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.8626.304816.3.22_0_0
2007-02-21 06:04:04 [SETI@home] [file_xfer] Finished upload of file 11no03aa.4732.8626.304816.3.22_0_0
2007-02-21 06:04:04 [SETI@home] [file_xfer] Throughput 214138 bytes/sec
2007-02-21 06:13:28 [SETI@home Beta Test] Sending scheduler request: To fetch work
2007-02-21 06:13:28 [SETI@home Beta Test] Requesting 107916 seconds of new work
2007-02-21 06:13:33 [SETI@home Beta Test] Scheduler RPC succeeded [server version 509]
2007-02-21 06:13:33 [SETI@home Beta Test] Deferring communication for 7 sec
2007-02-21 06:13:33 [SETI@home Beta Test] Reason: requested by project
2007-02-21 06:13:33 [SETI@home Beta Test] Deferring communication for 42 min 44 sec
2007-02-21 06:13:33 [SETI@home Beta Test] Reason: no work from project
2007-02-21 06:23:23 [SETI@home] Computation for task 11no03aa.4732.8626.304816.3.29_0 finished
2007-02-21 06:23:23 [SETI@home] Resuming task 19au03aa.16709.20786.704834.3.98_1 using setiathome_enhanced version 519
. 2007-02-21 06:23:24 [SETI@home] Computation for task 11no03aa.4732.8626.304816.3.23_0 finished
2007-02-21 06:23:25 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.8626.304816.3.29_0_0
. 2007-02-21 06:23:25 [SETI@home] Starting 11no03aa.4732.8626.304816.3.23_0
. 2007-02-21 06:23:25 [SETI@home] Starting task 11no03aa.4732.8626.304816.3.23_0 using setiathome_enhanced version 519
. 2007-02-21 06:23:26 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.8626.304816.3.23_0_0
2007-02-21 06:23:27 [SETI@home] [file_xfer] Finished upload of file 11no03aa.4732.8626.304816.3.29_0_0
2007-02-21 06:23:27 [SETI@home] [file_xfer] Throughput 51216 bytes/sec
2007-02-21 06:23:28 [SETI@home] [error] Error on file upload: socket read incomplete: asked for 12226, got 7426: No such file or directory
. 2007-02-21 06:23:28 [SETI@home] [file_xfer] Temporarily failed upload of 11no03aa.4732.8626.304816.3.23_0_0: transient upload error
. 2007-02-21 06:23:28 [SETI@home] Backing off 1 min 0 sec on upload of file 11no03aa.4732.8626.304816.3.23_0_0
. 2007-02-21 06:24:28 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.8626.304816.3.23_0_0
2007-02-21 06:24:30 [SETI@home] [error] Error on file upload: socket read incomplete: asked for 12226, got 8963: No such file or directory
. 2007-02-21 06:24:30 [SETI@home] [file_xfer] Temporarily failed upload of 11no03aa.4732.8626.304816.3.23_0_0: transient upload error
. 2007-02-21 06:24:30 [SETI@home] Backing off 1 min 0 sec on upload of file 11no03aa.4732.8626.304816.3.23_0_0
2007-02-21 06:25:24 [SETI@home] Sending scheduler request: To fetch work
2007-02-21 06:25:24 [SETI@home] Requesting 13653 seconds of new work, and reporting 7 completed tasks
2007-02-21 06:25:29 [SETI@home] Scheduler RPC succeeded [server version 507]
2007-02-21 06:25:29 [SETI@home] Deferring communication for 11 sec
2007-02-21 06:25:29 [SETI@home] Reason: requested by project
2007-02-21 06:25:30 [SETI@home] [error] garbage_collect(); still have active task for acked result 11no03aa.4732.8626.304816.3.19_3; state 9
. 2007-02-21 06:25:30 [SETI@home] [file_xfer] Started upload of file 11no03aa.4732.8626.304816.3.23_0_0
2007-02-21 06:25:30 [SETI@home] Computation for task 11no03aa.4732.8626.304816.3.19_3 finished
2007-02-21 06:25:36 [SETI@home] [file_xfer] Started download of file 15au03aa.20522.4642.779818.3.216
2007-02-21 06:25:38 [SETI@home] [error] Error on file upload: socket read incomplete: asked for 12226, got 8963: No such file or directory
. 2007-02-21 06:25:38 [SETI@home] [file_xfer] Temporarily failed upload of 11no03aa.4732.8626.304816.3.23_0_0: transient upload error
. 2007-02-21 06:25:38 [SETI@home] Backing off 1 min 0 sec on upload of file 11no03aa.4732.8626.304816.3.23_0_0
ID: 15232