Current download problem prohibits also other projects downloads
Author | Message |
---|---|
Harri Liljeroos Joined: 29 May 99 Posts: 4103 Credit: 85,281,665 RAC: 126 |
Hi, for some curious reason, my hosts did not download new WUs from other projects although the CPUs were already running idle. BoincView shows that there is a work buffer for sah even when the download is not successful. I had to suspend sah to download new work from Einstein. This happened with BOINC 5.4.11 and 5.3.12 (truXoft). Update did not help. Harri |
Calculator Joined: 30 Sep 06 Posts: 62 Credit: 69,529 RAC: 0 |
Same with me. I guess it is because the state is "downloading" and BOINC thinks the work is going to come soon, or something like that. |
mikey Joined: 17 Dec 99 Posts: 4215 Credit: 3,474,603 RAC: 0 |
Hi, You are probably running into the old deficit issue, where you owe Seti time and, until that is satisfied, no other project can download work. It is set up to satisfy your multi-project setting, the one where you say I want my computer to crunch x amount of time for this project and y for that project. If the project that you have set up for X, Seti in this case, is behind, then your other projects go down because Seti is lacking the time you owe it. |
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
It's my guess that his client has requested X seconds of work from seti and the order was filled, but just hasn't reached him yet, so the puter thinks it has X seconds on hand when in fact it doesn't. The scheduler has already taken that into account and isn't requesting other work. At least, I think this is correct. So when the download is finally completed, he would have X seconds on hand, and if the code was changed to see the nonexistent download and actually get work from elsewhere, then if the outage was short, the host would be overcommitted and risk missing deadlines. tony |
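Astro's guess can be sketched as a toy model. This is only an illustration under stated assumptions, not the BOINC client's actual work-fetch code: the `Task` class, the `work_shortfall` function, the 6-hour task estimate, and the 12-hour buffer are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    project: str
    est_seconds: float  # estimated runtime (hypothetical figure)
    downloaded: bool    # False while the input file is still "downloading"


def work_shortfall(tasks, buffer_seconds, count_pending=True):
    """Seconds of work the client would still request to fill its buffer.

    With count_pending=True, tasks that are assigned but not yet downloaded
    count as work on hand, which is the behaviour guessed at above.
    """
    on_hand = sum(t.est_seconds for t in tasks if count_pending or t.downloaded)
    return max(0.0, buffer_seconds - on_hand)


# Three SETI tasks stuck in "downloading", against a 12-hour buffer.
tasks = [Task("SETI@home", 6 * 3600.0, downloaded=False) for _ in range(3)]
buffer_seconds = 12 * 3600.0

stuck = work_shortfall(tasks, buffer_seconds, count_pending=True)
ignoring_pending = work_shortfall(tasks, buffer_seconds, count_pending=False)
print(stuck, ignoring_pending)
```

In this model the pending tasks fill the buffer on paper, so nothing is requested from other projects. Ignoring them would fetch a full buffer from elsewhere, which is the overcommitment risk tony describes if the stuck downloads then arrive.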
CoolBlue87GT Joined: 27 Dec 03 Posts: 59 Credit: 53,580 RAC: 0 |
Same problem here. Now have three work units, status downloading. Here's part of the messages. Any help would be nice.
11/14/2006 5:39:11 AM||Starting BOINC client version 5.4.11 for windows_intelx86
11/14/2006 5:39:11 AM||libcurl/7.15.3 OpenSSL/0.9.8a zlib/1.2.3
11/14/2006 5:39:11 AM||Data directory: F:\program files\BOINC
11/14/2006 5:39:11 AM||Processor: 1 AuthenticAMD mobile AMD Athlon(tm) XP2200+
11/14/2006 5:39:11 AM||Memory: 446.48 MB physical, 885.71 MB virtual
11/14/2006 5:39:11 AM||Disk: 55.89 GB total, 45.16 GB free
11/14/2006 5:39:11 AM|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 2847434; location: home; project prefs: default
11/14/2006 5:39:11 AM||No general preferences found - using BOINC defaults
11/14/2006 5:39:11 AM||Local control only allowed
11/14/2006 5:39:11 AM||Listening on port 31416
11/14/2006 5:40:16 AM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
11/14/2006 5:49:04 AM|SETI@home|Scheduler request succeeded
11/14/2006 5:49:06 AM|SETI@home|Started download of file 10jn03aa.8062.16754.436050.3.20
11/14/2006 5:49:07 AM|SETI@home|Incomplete read of less than 5KB for 10jn03aa.8062.16754.436050.3.20 - truncating
11/14/2006 5:49:07 AM|SETI@home|Temporarily failed download of 10jn03aa.8062.16754.436050.3.20: Error 403
11/14/2006 5:49:07 AM|SETI@home|Backing off 1 minutes and 0 seconds on download of file 10jn03aa.8062.16754.436050.3.20
11/14/2006 5:50:08 AM|SETI@home|Started download of file 10jn03aa.8062.16754.436050.3.20
11/14/2006 5:50:09 AM|SETI@home|Incomplete read of less than 5KB for 10jn03aa.8062.16754.436050.3.20 - truncating
11/14/2006 5:50:09 AM|SETI@home|Temporarily failed download of 10jn03aa.8062.16754.436050.3.20: Error 403 |
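For anyone wanting to summarise a message log like the one above, here is a minimal sketch. It assumes only what the pasted lines show, namely a `timestamp|project|message` layout; the function name `failed_downloads` is made up for illustration and is not part of BOINC.

```python
from collections import Counter

PREFIX = "Temporarily failed download of "


def failed_downloads(log_lines):
    """Count failed-download messages per (file name, reason) pair."""
    failures = Counter()
    for line in log_lines:
        parts = line.split("|", 2)  # timestamp, project, message
        if len(parts) != 3:
            continue  # not in the expected three-field layout
        message = parts[2].strip()
        if message.startswith(PREFIX):
            filename, _, reason = message[len(PREFIX):].partition(": ")
            failures[(filename, reason)] += 1
    return failures


sample = [
    "11/14/2006 5:49:07 AM|SETI@home|Temporarily failed download of 10jn03aa.8062.16754.436050.3.20: Error 403",
    "11/14/2006 5:49:07 AM|SETI@home|Backing off 1 minutes and 0 seconds on download of file 10jn03aa.8062.16754.436050.3.20",
    "11/14/2006 5:50:09 AM|SETI@home|Temporarily failed download of 10jn03aa.8062.16754.436050.3.20: Error 403",
]
summary = failed_downloads(sample)
print(summary)
```

Repeated `Error 403` entries for the same file point at the server refusing the request rather than at the local connection.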
ML1 Joined: 25 Nov 01 Posts: 20291 Credit: 7,508,002 RAC: 20 |
Formerly Phew! Good. So the forums aren't completely broken! :-) Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
Harri Liljeroos Joined: 29 May 99 Posts: 4103 Credit: 85,281,665 RAC: 126 |
> It's my guess that his client has requested X seconds of work from seti and the order was filled, but just hasn't reached him yet, so the puter thinks it has X seconds on hand when in fact it doesn't. The scheduler has already taken that into account and isn't requesting other work. At least, I think this is correct.

I think that's what is happening. This is one rare situation when micromanaging may be needed. As I mentioned, suspending seti for a minute allowed other projects to download some work and keep my hosts busy, at least for a while. Harri |
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
> This is one rare situation when micro managing may be needed.

Indeed, you did exactly the right thing. tony |
Peter Baker Joined: 5 Nov 06 Posts: 2 Credit: 10,585 RAC: 0 |
I am having the same issue as coolblue. At first I thought it was my systems playing up, which I have found not to be the case, as all my machines can stream a radio station at 128kbit without buffering. This made me wonder if it's a problem with the download server. All of my machines just keep failing to download, and on occasion they can't even connect to the server, even though the machines are still receiving the radio stream. Here is part of my log >>
14/11/2006 13:51:14|SETI@home|Started download of file 14jn03ab.20420.23936.778410.3.111
14/11/2006 13:51:17|SETI@home|Temporarily failed download of 14jn03ab.20420.23936.778410.3.111: http error
14/11/2006 13:51:17|SETI@home|Backing off 1 minutes and 21 seconds on download of file 14jn03ab.20420.23936.778410.3.111
14/11/2006 13:52:39|SETI@home|Started download of file 14jn03ab.20420.23936.778410.3.111
14/11/2006 13:52:41|SETI@home|Incomplete read of less than 5KB for 14jn03ab.20420.23936.778410.3.111 - truncating
14/11/2006 13:52:41|SETI@home|Temporarily failed download of 14jn03ab.20420.23936.778410.3.111: Error 403
14/11/2006 13:52:41|SETI@home|Backing off 1 minutes and 45 seconds on download of file 14jn03ab.20420.23936.778410.3.111
That's not all I get. I also have this pop-up on occasion >>
14/11/2006 13:39:08|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
14/11/2006 13:39:08|SETI@home|Reason: Requested by user
14/11/2006 13:39:08|SETI@home|Reporting 1 tasks
14/11/2006 13:39:13|SETI@home|Scheduler request failed: couldn't resolve host name
14/11/2006 13:39:13|SETI@home|Deferring scheduler requests for 1 minutes and 0 seconds
All 4 of my machines are having the exact same issue, and it's not my PC or my net connection that's causing the problem. Plus I still have some tasks that still need reporting. And just to clarify, I am only running SETI on all of my machines. P.S. Still pretty new to all this stuff, but I am learning fast. .::EDIT::.
I know this isn't the topic to use, but it seems it may be related .::END-EDIT::. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
> It's my guess that his client has requested X seconds of work from seti and the order was filled, but just hasn't reached him yet, so the puter thinks it has X seconds on hand when in fact it doesn't. The scheduler has already taken that into account and isn't requesting other work. At least, I think this is correct.

These outages make fascinating testbeds to observe BOINC behaviour under pressure. I have exactly the opposite situation [NB I'm not regarding it as a problem - just an observation]. Rig does mostly SETI work, with Einstein as a low-share secondary to cover outages. I also run a 3 day cache, again to cover situations just like this one. At some point during the outage, the rig got scheduled a VHAR with a 4-day deadline, so immediately (and by design) went into EDF. However, it hasn't yet downloaded the data for the VHAR, so the EDF has resulted in several hours crunching for Einstein, and no work for SETI at all! Ah well, such is life - LTD will sort it all out eventually. |
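A toy sketch of earliest-deadline-first (EDF) selection, only to illustrate why a task whose data has not arrived cannot be the one that runs. It does not model the real BOINC scheduler or explain everything in the observation above. The `Task` class, the `pick_edf` function, and the 14-day Einstein deadline are hypothetical; only the 4-day VHAR deadline comes from the post.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    project: str
    deadline_days: float
    downloaded: bool  # input data present on disk?


def pick_edf(tasks):
    """Pick the runnable task with the earliest deadline, or None."""
    runnable = [t for t in tasks if t.downloaded]
    return min(runnable, key=lambda t: t.deadline_days, default=None)


tasks = [
    # The short-deadline VHAR is assigned, but its data is still missing.
    Task("vhar", "SETI@home", 4, downloaded=False),
    # Hypothetical Einstein task with a longer deadline, ready to run.
    Task("einstein_1", "Einstein@Home", 14, downloaded=True),
]
chosen = pick_edf(tasks)
print(chosen.project)
```

Here the tightest deadline belongs to a task that cannot start, so the CPU goes to whatever is runnable.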
KWSN Ekky Ekky Ekky Joined: 25 May 99 Posts: 944 Credit: 52,956,491 RAC: 67 |
Now will someone at SETI please let us know what is going on? As usual there is no mention of anything in the technical newsletters or announcements. Just that "sah_assimilator_nonenh kryten Not Running" is on the server status page. Is that the problem? Can no one get it sorted? |
Alinator Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0 |
Hi, That's not necessarily true. In this case if SAH is owed time due to LTD, then BOINC won't DL work from the other projects until all the current work onboard is done or is not runnable, and then only DL one result at a time from the others. Of course, manual intervention with a project suspend will work around the problem, but may cause other debt problems down the road. My recommendation: if you still have work onboard, or run multiple projects, it's probably better to just ride this one out. BOINC will automatically make up the lost resource share time sooner or later. Alinator |
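The idea of a project being "owed time" can be sketched with simple bookkeeping. This is a simplified illustration of debt accounting in general, not BOINC's actual LTD formula: the `accrue_debt` function and the 90/10 resource shares are hypothetical.

```python
def accrue_debt(debts, shares, cpu_seconds, wall_seconds):
    """Return new per-project debts after one accounting period.

    Each project is owed wall_seconds * (its share / total share) and is
    charged the CPU time it actually received in that period.
    """
    total_share = sum(shares.values())
    new_debts = {}
    for project, debt in debts.items():
        owed = wall_seconds * shares[project] / total_share
        new_debts[project] = debt + owed - cpu_seconds.get(project, 0.0)
    return new_debts


shares = {"SETI@home": 90, "Einstein@Home": 10}  # hypothetical shares
debts = {"SETI@home": 0.0, "Einstein@Home": 0.0}

# Ten hours in which only Einstein had runnable work.
ten_hours = 10 * 3600.0
debts = accrue_debt(debts, shares, {"Einstein@Home": ten_hours}, ten_hours)
print(debts)
```

In this model the starved project builds up positive debt and the other goes equally negative, so the scheduler later favours the starved project until the balance evens out. That is the sense in which the lost resource share is made up sooner or later.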
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
> Now will someone at SETI please let us know what is going on? As usual there is no mention of anything in the technical newsletters or announcements. Just that "sah_assimilator_nonenh kryten Not Running" is on the server status page. Is that the problem? Can no one get it sorted?

If you check the other trouble threads currently running in this forum, you will find that most of us know what is going on. Seti has had a randomly occurring mount failure of some sort that prevents downloads. They will have to reboot the servers to correct it. I would expect it to happen fairly soon. The assimilator that is not running on the status page has nothing to do with it. I don't think it ever runs anymore, because it was for the old non-enhanced WUs, which we have not been crunching for some time. Sit tight, they should have it sorted out soon...... "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Bob Neville Joined: 3 Jan 00 Posts: 35 Credit: 7,451,208 RAC: 0 |
someone wake up the hamster!! :) words are the symbols of mental experience |
KB7RZF Joined: 15 Aug 99 Posts: 9549 Credit: 3,308,926 RAC: 2 |
> Now will someone at SETI please let us know what is going on? As usual there is no mention of anything in the technical newsletters or announcements. Just that "sah_assimilator_nonenh kryten Not Running" is on the server status page. Is that the problem? Can no one get it sorted?

I'm sure they are well aware of the problem, and working on it now. Why should they take time away from fixing something just to post an update? When it's all said and done, they will post something in the tech news, just as they always do. Relax, take a breath. It's nothing you, I, or anyone else here has any control over. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
November 14, 2006 A configuration problem on our servers have caused workunit downloads to fail since yesterday afternoon. This has been fixed. However, we are bringing the whole project down for our regular Tuesday outage to back up our database. We should be back up in a few hours (22:00 UTC). "Freedom is just Chaos, with better lighting." Alan Dean Foster |
mikey Joined: 17 Dec 99 Posts: 4215 Credit: 3,474,603 RAC: 0 |
> Now will someone at SETI please let us know what is going on? As usual there is no mention of anything in the technical newsletters or announcements. Just that "sah_assimilator_nonenh kryten Not Running" is on the server status page. Is that the problem? Can no one get it sorted?

I believe it happened in the middle of the night, when they were snoozing. Probably after being out all night with those coeds, you know how Master's Degree people are. |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
> It's my guess that his client has requested X seconds of work from seti and the order was filled, but just hasn't reached him yet, so the puter thinks it has X seconds on hand when in fact it doesn't. The scheduler has already taken that into account and isn't requesting other work. At least, I think this is correct.

This is a known issue in the current release, where it does CPU scheduling system-wide. The current beta does scheduling on a per-core basis. |
Alinator Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0 |
> Now will someone at SETI please let us know what is going on? As usual there is no mention of anything in the technical newsletters or announcements. Just that "sah_assimilator_nonenh kryten Not Running" is on the server status page. Is that the problem? Can no one get it sorted?

Hmmm, I don't know. The INR-688 interface to Cogent "flatlined" yesterday around 3 PM Berkeley time, so it would seem they knew they were in trouble before they left yesterday (also, an afternoon time frame was mentioned in the news item). Still, the extracurricular activity factor may have played a part in how long it was out. ;-) Alinator |
keyboards Joined: 14 Jul 00 Posts: 66 Credit: 492,766 RAC: 0 |
From the front page: November 14, 2006 Be patient, they are working on it. !!Stupidity should be PAINFUL!! |