Ghost WU issue (and some talk about deadlines)
| Author | Message |
|---|---|
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
I keep downloading WU's from March 4, 2005 and all of the ones I am returning, have "already been reported as success." Half right: you won't get new ghosts if you have 'No New Work' set. Only when you need a work top-up would you need to do the app_info thing and allow new work; when you do this, fill a cache so you don't have to do it often. If you are running BOINC as a service on Windows and are comfortable with batch files (.cmd text files), take a look at my profile for two versions of a SEMI-automated method. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Jason Safoutin Send message Joined: 8 Sep 05 Posts: 1386 Credit: 200,389 RAC: 0
|
I keep downloading WU's from March 4, 2005 and all of the ones I am returning, have "already been reported as success." Fill a cache?? What? And the WU above was received AFTER I renamed the app_info. "By faith we understand that the universe was formed at God's command, so that what is seen was not made out of what was visible". Hebrews 11.3 |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
I read the error message as meaning that your BOINC already reported it but didn't get an acknowledgment back, so it tried to report it again... maybe I'm missing something, but it should be all good. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Jason Safoutin Send message Joined: 8 Sep 05 Posts: 1386 Credit: 200,389 RAC: 0
|
It actually says refused the result...because it was already reported...maybe I read it wrong? "By faith we understand that the universe was formed at God's command, so that what is seen was not made out of what was visible". Hebrews 11.3 |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
04mr05ab.17213.23921.909646.3.57_2 is reported as "Success", "Done" in your results. It is an ambiguous error message, I reckon :D [reported by an AMD Sempron 'Manila', ComputerID: 3188650] "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
littlegreenmanfrommars Send message Joined: 28 Jan 06 Posts: 1410 Credit: 934,158 RAC: 0
|
OK... After Jason pointed out the ghost WU issue to me in another thread... thanks, mate... The ghost WU issue has been around for a while now, just not in such large numbers. I've noted before that some crunchers had inordinately large lists of results, which continued growing. I just haven't seen so many affected users at one time. The issue has hit me before, too. What I did was detach and re-attach. That had the effect of cancelling the ghost WUs, and I got new ones. HOWEVER!!!!!!!!!!!! Given the numbers I see at present, I am hesitant to recommend that course of action for everyone, as it would cause incredible overhead in bandwidth and server time.
|
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
OK... Certainly one possibility. However, keep in mind that clients with full caches tend not to hammer the server, whereas getting ghosts (that is, not getting the work) means a fairly immediate retry, further draining the results ready for download. Clearing the ghosts also makes them immediately available again (to fill caches). "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
littlegreenmanfrommars Send message Joined: 28 Jan 06 Posts: 1410 Credit: 934,158 RAC: 0
|
OK... Sure. However, my single cruncher already has 74 ghosts to its name... I've seen this issue cause lists of ghost WUs to rise into the thousands. Multiply that by thousands or more affected machines, and we could have another outage. At least, that's my feeling. I've renamed the app_info.xml file, and have the following log:
19/05/2007 11:33:29 PM|SETI@home|Sending scheduler request: Requested by user
19/05/2007 11:33:29 PM|SETI@home|Requesting 170459 seconds of new work
19/05/2007 11:33:34 PM|SETI@home|Scheduler RPC succeeded [server version 509]
19/05/2007 11:33:34 PM|SETI@home|Deferring communication for 11 sec
19/05/2007 11:33:34 PM|SETI@home|Reason: requested by project
19/05/2007 11:33:34 PM|SETI@home|Deferring communication for 1 min 0 sec
19/05/2007 11:33:34 PM|SETI@home|Reason: no work from project
19/05/2007 11:34:35 PM|SETI@home|Fetching scheduler list
19/05/2007 11:34:40 PM|SETI@home|Master file download succeeded
19/05/2007 11:34:45 PM|SETI@home|Sending scheduler request: To fetch work
19/05/2007 11:34:45 PM|SETI@home|Requesting 170470 seconds of new work
19/05/2007 11:36:20 PM|SETI@home|Scheduler request failed: HTTP bad gateway
19/05/2007 11:36:20 PM|SETI@home|Deferring communication for 1 min 0 sec
19/05/2007 11:36:20 PM|SETI@home|Reason: scheduler request failed
19/05/2007 11:37:20 PM|SETI@home|Sending scheduler request: To fetch work
19/05/2007 11:37:20 PM|SETI@home|Requesting 170485 seconds of new work
19/05/2007 11:37:25 PM|SETI@home|Scheduler RPC succeeded [server version 509]
19/05/2007 11:37:25 PM|SETI@home|Deferring communication for 11 sec
19/05/2007 11:37:25 PM|SETI@home|Reason: requested by project
19/05/2007 11:37:25 PM|SETI@home|Deferring communication for 1 min 0 sec
19/05/2007 11:37:25 PM|SETI@home|Reason: no work from project
19/05/2007 11:38:26 PM|SETI@home|Sending scheduler request: To fetch work
19/05/2007 11:38:26 PM|SETI@home|Requesting 170461 seconds of new work
19/05/2007 11:39:26 PM|SETI@home|Scheduler RPC succeeded [server version 509]
19/05/2007 11:39:26 PM|SETI@home|Deferring communication for 11 sec
19/05/2007 11:39:26 PM|SETI@home|Reason: requested by project
19/05/2007 11:39:26 PM|SETI@home|Deferring communication for 1 min 0 sec
19/05/2007 11:39:26 PM|SETI@home|Reason: no work from project
The workaround seems to have partially worked, as the master file download succeeded, according to the log. (I renamed app_info.xml to oldapp_info.xml.) I'm intrigued, though, by the "no work from project" message. My results list is still showing 74 in progress, which is about 15 days' work for my machine if it crunches only S@h. I think I'll have to wait a while to see if it clears. If not, I'll detach and re-attach, even though I'm loath to do so, unless a better plan surfaces in the next day or so.
|
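For anyone wanting to tally what a log like the one above is saying, here is a minimal Python sketch. It assumes each message line has the shape `timestamp|project|message` with a day-first date, as in the excerpt; the `SAMPLE` text is a handful of lines copied from that log, and the function names `parse_line` and `summarize` are made up for illustration, not part of BOINC.

```python
from collections import Counter
from datetime import datetime

# A few lines copied from the log above; a real run would read the saved message log.
SAMPLE = """\
19/05/2007 11:34:45 PM|SETI@home|Requesting 170470 seconds of new work
19/05/2007 11:36:20 PM|SETI@home|Scheduler request failed: HTTP bad gateway
19/05/2007 11:36:20 PM|SETI@home|Reason: scheduler request failed
19/05/2007 11:37:20 PM|SETI@home|Requesting 170485 seconds of new work
19/05/2007 11:37:25 PM|SETI@home|Reason: requested by project
19/05/2007 11:37:25 PM|SETI@home|Reason: no work from project
"""


def parse_line(line):
    """Split 'timestamp|project|message' into (datetime, project, message)."""
    stamp, project, message = line.split("|", 2)
    # Day-first date with a 12-hour clock, matching the excerpt above.
    when = datetime.strptime(stamp, "%d/%m/%Y %I:%M:%S %p")
    return when, project, message


def summarize(text):
    """Count deferral reasons and collect the requested work sizes in seconds."""
    reasons = Counter()
    requested = []
    for line in text.splitlines():
        if not line.strip():
            continue
        _, _, message = parse_line(line)
        if message.startswith("Reason: "):
            reasons[message[len("Reason: "):]] += 1
        elif message.startswith("Requesting ") and message.endswith(" seconds of new work"):
            requested.append(int(message.split()[1]))
    return reasons, requested


reasons, requested = summarize(SAMPLE)
```

Dividing a requested figure by 86400 converts it to days, so the requests of roughly 170,000 seconds in the log are each asking for a little under two days of work.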
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
... Could be the Ready to send queue running dry, two splitters are shown as 'Not Running'. [303,978 Ready to Send @ 1 hour ago, was up around 600k before .... ] "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874
|
Two points: 1) Nothing that we're doing here will get the existing ghost results sent to anyone's computers. What we're trying to do is (a) avoid creating any new ghost results (set 'no new work' while running with an app_info.xml file), and (b) download new results to keep the CPUs warm (run the workround). 2) I doubt we've run the queue dry yet, but remember that bit from Matt's final post before his vacation: ....And our old friend "slow feeder query" is back, probably just being aggravated by the heavy load. I suspect that the 'no new work' messages just indicate that the feeder cache has temporarily run dry, not the whole queue. |
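The renaming half of the workaround described above can be sketched in Python as follows. This is only an illustration under assumptions: the helper name `set_app_info_enabled` is hypothetical, the parked name `oldapp_info.xml` is the one used earlier in this thread, and the demo runs against a throwaway directory rather than a real BOINC project folder. It does not set 'no new work' (that is still done in BOINC itself), and whether BOINC needs a restart to notice the change is not settled in this thread.

```python
from pathlib import Path
import tempfile


def set_app_info_enabled(project_dir, enabled, parked_name="oldapp_info.xml"):
    """Rename app_info.xml out of the way (enabled=False) or back (enabled=True).

    Returns True if a rename happened, False if the file was already
    in the requested state (nothing to rename).
    """
    active = Path(project_dir) / "app_info.xml"
    parked = Path(project_dir) / parked_name
    src, dst = (parked, active) if enabled else (active, parked)
    if not src.exists():
        return False
    src.rename(dst)
    return True


# Demo against a temporary directory standing in for the project folder.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "app_info.xml").write_text("<app_info/>")
changed = set_app_info_enabled(demo_dir, enabled=False)
```

Calling the helper with `enabled=True` afterwards moves the file back, which matches the "fetch work, then restore app_info.xml" sequence several posters describe.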
Stan Pleban Send message Joined: 16 Jun 00 Posts: 13 Credit: 216,111 RAC: 0
|
I have detached and re-attached twice already... it has cleared the ghost WUs, which showed as Client Detached, and they were resent to other users.... Once you get your download of WUs into your task manager, exit BOINC, put the 2.2B files in SETI, then start BOINC again... but now set "no new work" on SETI. I am hoping that will stem the flow of ghost WUs... I admit I should have done that sooner. |
kittyman Send message Joined: 9 Jul 00 Posts: 51583 Credit: 1,018,363,574 RAC: 1,004
|
Silly question..... We know the server code has been broken by whatever changes were recently made. Can't the old server code be put back in place until Matt can analyze and fix what went wrong? You do keep a copy of programs before you modify them....correct? "Time is simply the mechanism that keeps everything from happening all at once."
|
littlegreenmanfrommars Send message Joined: 28 Jan 06 Posts: 1410 Credit: 934,158 RAC: 0
|
... Situation is changing rapidly: Now have 86 results in the list, but downloads of program and a dozen WUs are in the "transfers" tab of BOINC. They appear to have stalled. Keeping an eye on it.
|
littlegreenmanfrommars Send message Joined: 28 Jan 06 Posts: 1410 Credit: 934,158 RAC: 0
|
Have a dozen S@h WUs now in cache, but the ghosts haven't been cleaned up. I suppose I'll just have to wait for them to time out.
|
BORG Send message Joined: 3 Aug 99 Posts: 305 Credit: 6,157,052 RAC: 0
|
I don't know if others are experiencing this. But after renaming the app_info.xml file I was able to download and upload without renaming it back. I'm running optimized and the optimized client continues to run regardless of the app_info.xml file being present or not. WU's are authenticating and all seem to be working fine for now.
|
kittyman Send message Joined: 9 Jul 00 Posts: 51583 Credit: 1,018,363,574 RAC: 1,004
|
I don't know if others are experiencing this. But after renaming the app_info.xml file I was able to download and upload without renaming it back. I'm running optimized and the optimized client continues to run regardless of the app_info.xml file being present or not. I ran into this when I ran optimized over on Beta by mistake (don't do it!) and removed the app info file to go back to normal processing. What happens is that the WUs downloaded while the app info file was in place will run with the optimized app, but any new work will revert to the stock app. "Time is simply the mechanism that keeps everything from happening all at once."
|
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874
|
I don't know if others are experiencing this. But after renaming the app_info.xml file I was able to download and upload without renaming it back. I'm running optimized and the optimized client continues to run regardless of the app_info.xml file being present or not. Strangely, that is what I would expect and what I understand from the documentation. But it ain't so. I was using my own workround this morning, and downloaded 100 new WUs. 44 of them had the shortest deadline possible, so coupled with my 3-day connect setting, they went straight into EDF and started crunching - on all 8 cores - while I was still downloading. App_info was still disabled - I hadn't restarted BOINC since the successful work fetch request - and yet Task Manager confirmed that the active applications were all Chicken 2.2B. Go figure. Edit - case in point: result 535578535 |
kittyman Send message Joined: 9 Jul 00 Posts: 51583 Credit: 1,018,363,574 RAC: 1,004
|
I don't know if others are experiencing this. But after renaming the app_info.xml file I was able to download and upload without renaming it back. I'm running optimized and the optimized client continues to run regardless of the app_info.xml file being present or not. Actually, right now I'm not sure how this works. I had a rig this morning that had no Seti work. Took out the app info file, got some WUs. Put the app info file back and started crunching. Took the app info file back out, and it looks like it's still using the optimized app. Please note that each time, Boinc was stopped and restarted. When you start Boinc with the xml file in place, does it then tag all work in the cache to run with the optimized app, or just the ones it happens to be processing at the moment? If all the WUs get tagged, you could remove the app info xml file, get work, replace the file, start crunching, and remove it again to allow comms with the host as usual. What I'm not clear on is when Boinc will decide to use the stock app instead of the optimized one. "Time is simply the mechanism that keeps everything from happening all at once."
|
BORG Send message Joined: 3 Aug 99 Posts: 305 Credit: 6,157,052 RAC: 0
|
I don't know if others are experiencing this. But after renaming the app_info.xml file I was able to download and upload without renaming it back. I'm running optimized and the optimized client continues to run regardless of the app_info.xml file being present or not. I honestly don't know what's going on, only that additional WUs have been downloaded since yesterday. And with the app_info.xml file still renamed as app_info.xml_rename, crunching continues on the newly downloaded WUs with the optimized client. I've renamed the stock app so it's not running. BOINC has not downloaded a new one either. So I don't get it.....
|
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
|
... The apps seem to be mentioned in client_state.xml somehow; in my case, KWSN_2.2B_SSE2-P4_Ben-Joe.exe has inserted itself where the standard app used to be... worth a closer look. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
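For that closer look, a rough Python sketch of listing which executables client_state.xml associates with each application might look like this. The `SAMPLE` document is a hypothetical, heavily trimmed stand-in: the `app_version`, `app_name`, `version_num`, `file_ref` and `file_name` element names are assumed from the usual client_state.xml layout, and the app name and version number are invented placeholders. Only the executable name comes from the post above. Work on a copy of the real file rather than the live one.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for client_state.xml; real files carry many more fields.
SAMPLE = """
<client_state>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>515</version_num>
    <file_ref>
      <file_name>KWSN_2.2B_SSE2-P4_Ben-Joe.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</client_state>
"""


def list_app_versions(xml_text):
    """Return (app_name, version_num, [file names]) for every app_version element."""
    root = ET.fromstring(xml_text)
    rows = []
    for app_version in root.iter("app_version"):
        name = app_version.findtext("app_name", default="?")
        version = app_version.findtext("version_num", default="?")
        files = [ref.findtext("file_name", default="?")
                 for ref in app_version.iter("file_ref")]
        rows.append((name, version, files))
    return rows


rows = list_app_versions(SAMPLE)
```

If the optimized executable shows up as the file for the standard application's entry, that would be consistent with the behaviour reported here, where work keeps running under the optimized app after app_info.xml is renamed.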