Errors Uploading data
Author | Message |
---|---|
v100 Joined: 4 Sep 04 Posts: 2 Credit: 16,182,615 RAC: 0 |
I have been getting these error messages constantly for 2 days and am not sure what is going wrong. There are 11 tasks all showing 100% progress and status "uploading". However, in the Transfers section they are all at 0% and retrying. I have checked that the 3 programs all have full access in the firewall, but still nothing. I am also not getting any more work. 15/07/2009 11:29:49 SETI@home Started upload of 05dc08ad.15018.6207.14.8.232_1_0 15/07/2009 11:29:51 Project communication failed: attempting access to reference site 15/07/2009 11:29:51 SETI@home Temporarily failed upload of 05dc08ad.15018.6207.14.8.232_1_0: connect() failed 15/07/2009 11:29:51 SETI@home Backing off 35 min 54 sec on upload of 05dc08ad.15018.6207.14.8.232_1_0 15/07/2009 11:29:52 Internet access OK - project servers may be temporarily down. |
Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
|
Joined: 18 Jul 01 Posts: 23 Credit: 368,078 RAC: 0 |
same thing here.. 3rd day... :-( |
v100 Joined: 4 Sep 04 Posts: 2 Credit: 16,182,615 RAC: 0 |
Thank you both, I had checked the server for downtime but could not see a problem, hence why I assumed it was something on my end. |
Joined: 15 May 99 Posts: 3777 Credit: 1,114,826,392 RAC: 3,319 |
If you check the server status page you'll see that the upload server "bruno" (not named after the recent film :^D ) is offline. There are often upload problems even when it isn't, but in this case it's completely down and has been since yesterday. |
Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
Is there any prognosis on poor Bruno's condition? Any estimate on when he will recover? 181 tasks in "Uploading - Retry in xx:xx:xx" at this time. |
Joined: 15 May 99 Posts: 3777 Credit: 1,114,826,392 RAC: 3,319 |
Bruno the upload server is back up, but all my uploads are still failing (it must be at least partly up, as they no longer fail instantly but time out). It will probably be quite a long time before the queue is cleared, given that several times a week it gets swamped and drops connections even without having been down for a day! Edit: Yup, on this machine (about sixty queued uploads) only one partially went through and the rest timed out at zero progress. |
Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
I can't understand why SETI@home have in general more problems on upload than on download server if the size of uploaded result (MultiBeam) is ~ 30/300 KB = 10% of the downloaded WorkUnit. What is the bandwidth of upload & download servers? |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
100Megabit shared for both uploads and downloads. |
Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
If other crunchers have a similar backlog of files to upload as I do (~250) then Bruno must be getting positively pounded into the ground. This would explain why all of the uploads that my two comps attempt to do are failing. I wonder if anything is actually getting uploaded. |
Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
"100Megabit shared for both uploads and downloads." This is only about [30 uploads (20-40 KB) + 30 downloads (350 KB)]/s. With the increasing number of CUDA-capable computers, each of which computes one MB task in ~1000 seconds, this bandwidth can serve only 30*1000 = 30 000 CUDA hosts! SETI@home has at least 300 000 active hosts (~3 000 000 registered), so nowadays they need 3-10 times the bandwidth, i.e. 1000 Mbit. What will this cost? ($$$) If this is not resolved many will leave the project. |
Joined: 15 May 99 Posts: 3777 Credit: 1,114,826,392 RAC: 3,319 |
"I can't understand why SETI@home have in general more problems on upload than on download server if the size of uploaded result (MultiBeam) is ~ 30/300 KB = 10% of the downloaded WorkUnit." I know it may be disagreed with, but I can guaran-frickin-tee that there is something wrong with the upload server, perhaps with its coding. I've watched it happen dozens of times: workunits pile up in my upload queue, many of them have uploaded to 100%, but the confirmation never gets back to the client, so the upload fails. This happens to many clients, which then keep retrying their uploads to 100% completion; bandwidth gets saturated by the retried uploads and then the downloads don't work either. Again, this has been happening once or twice a week. Only this time has bruno's status changed, so perhaps it was finally brought offline and fixed. "I wonder if anything is actually getting uploaded." On the status page, under Data Distribution State, is "Results received in last hour". It was around 3K last time I checked and now it's around 9K, or about 3 results/sec. |
Aurora Borealis Joined: 14 Jan 01 Posts: 3075 Credit: 5,631,463 RAC: 0 |
$50K-100K. UCB has to study and approve any change to the Internet link (fiber or others). Seti has to find the $$$ to pay for it.
The project has always said they will not have 100% up time. So far they are up more than 90% of the time. They also have always recommended to attach to other projects to cover down times. The number of active users has jumped from 130k to 180k in the last few months causing much of the current communication problems. It will take time for the current load situation to be resolved. Volunteers are by definition free to leave the project if they wish. Boinc V7.2.42 Win7 i5 3.33G 4GB, GTX470 |
Aurora Borealis Joined: 14 Jan 01 Posts: 3075 Credit: 5,631,463 RAC: 0 |
Boinc will show 100% uploaded even if only the connection request fails. It only indicates how much of the file was buffered for upload. It doesn't mean that any part of the file has been uploaded. If the connection succeeds the file will upload. |
Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
"I wonder if anything is actually getting uploaded." 3 per second ... I guess some are getting through, but certainly none of mine. Still, with 180K active users and perhaps an average of 3.5 computers per user, that is 630K computers. (Boincstats reports 4,239,823 hosts for SAH, at least 300,000 of them having RAC > 100, and 991,569 users, at least 87,000 of them having RAC > 100. So that is about 3.5 hosts per user, of those with RAC > 100.) If 630K hosts all have an upload backlog of say 100 files (my two comps are at 130+ and 30+ files), that is a total backlog of 63 MILLION files. At 11K files per hour . . . [edit]. . . it will be about 8 months before Bruno can catch up, assuming that no more results are completed in that time and Bruno doesn't croak. :-| |
Joined: 15 May 99 Posts: 3777 Credit: 1,114,826,392 RAC: 3,319 |
"Boinc will show 100% uploaded even if only the connection request fails. It only indicates how much of the file was buffered for upload. It doesn't mean that any part of the file has been uploaded. If the connection succeeds the file will upload." Sorry, but I have to disagree, as I see some of the files go to fractional percentages, i.e. 62.77% or equivalent. And if it were true that the 100% is showing pre-buffering before it even contacts the server, why don't they all do that instead of mostly sitting at 0%? And why do the ones that go to 100% get there at the same rate as the upload takes place, i.e. you can watch it increment? |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"...that there is something wrong with the upload server, perhaps with its coding." The upload server is just a storage machine with some hard drives in a RAID5 configuration. When you upload to Seti, you contact the scheduler server and tell it you want to upload; it will point you to the upload server, and all that's done is a directory is made and a file is written to the disk on the upload server. No coding, no nothing. Just the same way as you would write a file to disk. Now, when uploading on a 100 Mbit line that's completely clogged up, you may reach the scheduler, but by the time you actually start uploading the result file, the whole pipe is clogged. Although you seem to be uploading, the data isn't getting through; it'll even push backwards. Compare it to draining water from the sink, where once there's not enough air in the pipe, the water will splosh back up, suck in some air and go down again. 100 Mbit divided by 8 bits per byte equals 12.5 Mbytes. That's the maximum that can be uploaded per second. 12.5 Mbytes times 1024 equals 12,800 kilobytes. Now divide 12,800 by an average of 25 kilobytes per result that's uploading, and that's 512 results. Not necessarily 512 hosts... But OK, so that's 512 results that can be uploaded per second, if all result files are 25 KB, which they aren't. Some are bigger, some are smaller. Now, with 180,000 hosts out there, with about 4,911,372 results to be uploaded (at a guess; that number is the total amount of Seti Enhanced tasks out in the field, not necessarily all crunched and waiting to upload), you can see that they won't all fit through the 100 Mbit pipe at the same time. In a perfect world, where everyone had the same speed internet connection, where everyone could upload their results in one second, it would take 4,911,372 divided by 512 equals 9592 seconds for all that work to come back. But since we don't live in a perfect world and not everyone can upload their work in one second, add a whole lot more! |
Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
"In a perfect world, where everyone had the same speed internet connection, where everyone could upload their results in one second, it would take 4,911,372 divided by 512 equals 9592 seconds for all that work to come back." ". . . it will be about 8 months before Bruno can catch up, assuming that no more results are completed in that time and Bruno doesn't croak. :-|" So, the answer is probably somewhere between the perfect world (about 3 hours) and the imperfect world (about 8 months). |
Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
". . . it will be about 8 months before Bruno can catch up, assuming that no more results are completed in that time and Bruno doesn't croak. :-|" Update: In the last hour we have uploaded 67,388 results. So the admittedly worst-case estimate of 8 months should be updated to about 6 weeks. However, my CUDA cruncher had a backlog of about 130 files this morning, and now it has about 190, so things are not getting better yet on this end of the pipeline. |
Joined: 15 May 99 Posts: 3777 Credit: 1,114,826,392 RAC: 3,319 |
If you're O.C. about S@H like I am, click Advanced -> Do network communication. That will change all the queued WUs waiting to upload from the "Retry in ##:##:##" state to the "Upload pending" state. I do this whenever they've all gone back to "Retry..." I cleared off about two hundred queued uploads on my primary machine this way. The problem is that the client doesn't seem to want to download any work if there are too many queued transfers, so if transfers are continuously queued the client will run out of work, especially as there's been a shortage of work recently to add to the problem. |
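
The back-of-the-envelope estimates in the posts above can be reproduced with a short script. This is only a sketch built on the figures quoted in the thread (a 100 Mbit shared link, roughly 25 KB per result, 4,911,372 outstanding results, and a guessed backlog of 630,000 hosts with 100 files each). The function and constant names are made up for illustration and are not part of BOINC or SETI@home.

```python
# Figures below are the ones quoted in the thread; all names are illustrative only.
LINK_MBIT_PER_S = 100            # link shared by uploads and downloads
AVG_RESULT_KB = 25               # assumed average size of an uploaded result
OUTSTANDING_RESULTS = 4_911_372  # tasks out in the field, per the post above
BACKLOG_FILES = 630_000 * 100    # guessed hosts x guessed files per host


def link_kbytes_per_second(link_mbit: float) -> float:
    """Convert link speed from Mbit/s to KB/s (8 bits per byte, 1024 KB per MB)."""
    return link_mbit / 8 * 1024


def max_results_per_second(link_mbit: float, result_kb: float) -> float:
    """Upper bound on uploads per second if the whole link carried only results."""
    return link_kbytes_per_second(link_mbit) / result_kb


def drain_days(backlog_files: float, results_per_hour: float) -> float:
    """Days needed to clear a fixed backlog at a constant hourly rate."""
    return backlog_files / results_per_hour / 24


if __name__ == "__main__":
    peak = max_results_per_second(LINK_MBIT_PER_S, AVG_RESULT_KB)
    print(f"ideal rate: {peak:.0f} results/s")
    print(f"ideal drain: {OUTSTANDING_RESULTS / peak / 3600:.1f} hours")
    for hourly in (11_000, 67_388):  # hourly rates quoted from the status page
        print(f"{hourly} results/hour: {drain_days(BACKLOG_FILES, hourly):.0f} days")
```

Under these assumptions the ideal case comes out at under three hours, while the observed hourly rates give somewhere between several weeks and several months, which is the same range the posters arrived at. None of this models retries, protocol overhead, or new results arriving in the meantime.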
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.