Ghost WU issue (and some talk about deadlines)

kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 571813 - Posted: 19 May 2007, 22:41:22 UTC - in response to Message 571806.  
Last modified: 19 May 2007, 22:46:10 UTC



I have a couple of my rigs running that way right now. Renamed the app info file so it was inactivated, restarted Boinc, up/downloads now work, and they are still crunching with the optimized app. What seems to be the trick here is that this will continue to work unless the rig totally runs out of Seti WUs, at which point I think they would go back to the stock crunching app.
But as long as there are WUs in process using Chicken, it seems that the new ones downloaded will also continue to crunch with Chicken.

And the kitties say.....'Sure are a lot of spoons stirring that Chicken soup right now!'

Great, that's what I was originally going to try - did it upload/download on its own? I think I agree with you too on what will happen with/without WUs. Going back to that plan again, lol.


Yes, it seems to be working OK on its own. One of the rigs was in EDF mode, so it was only reporting results as they completed, but when it got down to 2 WUs left, Boinc decided it was time to get some more work, and after a couple of tries it downloaded 1 WU. Then it tried again and got about 17 more. They were all shorties, so it is crunching them now in EDF mode again, and not asking for more work to fill the cache. So all appears to be working, but until it happens to download WUs that do not toss it into EDF processing, it's not gonna build the cache up.


I'm not familiar with EDF mode, can you help me out on that one?

EDF is 'earliest deadline first': when Boinc thinks that a WU is due sooner than it would be processed in normal 'round robin' processing, it suspends work on the WU it is crunching at the moment (shows 'waiting to run') and starts crunching the new WU first. At the moment, it also appears to suspend requesting new work, even if it should be filling the cache. I am not sure if it always did that and I just never noticed it when the cache was fairly full.

OK, thanks, that's what mine do too: when a WU with an earlier expiry date is downloaded, it switches to that one. I've only been crunching since Feb 07, but it seems to me it's always done it that way on my computer. I assumed it to be normal, since it wouldn't want the WUs that expire sooner to go uncrunched.

The switching to a WU that is due soon is entirely normal, and is part of the way that Boinc manages its workload. What I don't know is if it always suspended work requests when in EDF processing. It may have.
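
To make the EDF idea concrete, here is a minimal Python sketch. It is not BOINC's actual scheduler: it assumes a single CPU, treats plain queue order as a stand-in for round robin, and the names `Task`, `deadline_at_risk` and `pick_next` are made up for illustration. A deadline counts as at risk if crunching the queue in order would finish some WU after it is due; only then does the earliest deadline jump ahead.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    hours_left: float         # estimated crunch time still needed
    hours_to_deadline: float  # time remaining until the report deadline


def deadline_at_risk(tasks):
    """Return True if crunching the tasks in queue order would make any of them late."""
    elapsed = 0.0
    for task in tasks:
        elapsed += task.hours_left
        if elapsed > task.hours_to_deadline:
            return True
    return False


def pick_next(tasks):
    """Pick the task to run: earliest deadline if anything is at risk, else queue order."""
    if not tasks:
        return None
    if deadline_at_risk(tasks):
        return min(tasks, key=lambda t: t.hours_to_deadline)
    return tasks[0]
```

With a long WU at the front of the queue and a freshly downloaded shorty that is due soon, `pick_next` would return the shorty, which matches the 'waiting to run' switch described above. Whether the real client also stops asking for work in that state is not something this sketch tries to model.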
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 571813
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 571816 - Posted: 19 May 2007, 22:43:50 UTC - in response to Message 571810.  

WooooHoooo, I just went back to see how much time was left on my 2 current crunches and, lo and behold, another one was downloaded, so the process is working on my computer.

Lordy lordy, I'm going to go see the boys and have some Buds and watch some NASCAR. (practicing my drawwwww and spitttting)

WooHoo, and just got a second one!!!


Yeah, and I gotta pull myself away from this monitor and keyboard for a few hours to go over to the GF's place for some dinner and DVDs.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 571816
Philadelphia
Volunteer tester
Joined: 12 Feb 07
Posts: 1590
Credit: 399,688
RAC: 0
United States
Message 571817 - Posted: 19 May 2007, 22:45:02 UTC - in response to Message 571816.  

WooooHoooo, I just went back to see how much time was left on my 2 current crunches and, lo and behold, another one was downloaded, so the process is working on my computer.

Lordy lordy, I'm going to go see the boys and have some Buds and watch some NASCAR. (practicing my drawwwww and spitttting)

WooHoo, and just got a second one!!!


Yeah, and I gotta pull myself away from this monitor and keyboard for a few hours to go over to the GF's place for some dinner and DVDs.


I'm outttta here, hopefully when I get back the process is still going. Enjoy your dinner and DVDs.

ID: 571817
Russell McGaha
Volunteer tester
Joined: 3 Apr 99
Posts: 11
Credit: 70,871,448
RAC: 106
United States
Message 571820 - Posted: 19 May 2007, 22:48:27 UTC

Folks;
I can confirm that on Macs with Boinc 5.9.10, just renaming app_info.xml results in an EMPTY client downloading WUs but NOT downloading the 'current' SAH ppc app. Five machines (1 G4 laptop, 1 G4 Desktop, 2 Intel Desktops, and 1 G5 Desktop), all completely empty of SAH WUs: I renamed app_info, WUs were downloaded, and the new WUs used the optimized apps to start crunching. 1 G4 Desktop is still having trouble getting WUs even though it's running stock 5.9.10.

Just more data points.
Russell
ID: 571820
cbocksta
Joined: 30 Nov 01
Posts: 5
Credit: 9,482,031
RAC: 0
Belgium
Message 571824 - Posted: 19 May 2007, 22:54:25 UTC - in response to Message 570201.  

Does someone have a solution (like renaming the app_info.xml) for running SETI on SPARC Solaris? I tried that workaround, but I receive the following message:

[SETI@home] Message from server: platform 'sparc64-sun-solaris' not found

After restoring the app_info.xml, I do not receive any new work, and I get the following messages:

2007-05-20 00:12:37 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2007-05-20 00:12:37 [SETI@home] Reason: Requested by user
2007-05-20 00:12:37 [SETI@home] Requesting 17280 seconds of new work
2007-05-20 00:13:13 [SETI@home] Scheduler request failed: HTTP internal server error
2007-05-20 00:13:13 [SETI@home] Scheduler request failed: HTTP internal server error
2007-05-20 00:13:13 [SETI@home] Deferring scheduler requests for 4 minutes and 0 seconds
2007-05-20 00:17:15 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2007-05-20 00:17:15 [SETI@home] Reason: Requested by user
2007-05-20 00:17:15 [SETI@home] Requesting 17280 seconds of new work
2007-05-20 00:17:45 [SETI@home] Scheduler request failed: HTTP internal server error
2007-05-20 00:17:45 [SETI@home] Scheduler request failed: HTTP internal server error
2007-05-20 00:17:45 [SETI@home] Deferring scheduler requests for 18 minutes and 13 seconds

ID: 571824
Speedy67 & Friends
Volunteer tester
Joined: 14 Jul 99
Posts: 335
Credit: 1,178,138
RAC: 0
Netherlands
Message 571830 - Posted: 19 May 2007, 23:05:43 UTC - in response to Message 571738.  

Most of the app_info.xml files report that they're compatible with everything from 5.12 to 5.17 -- from the past into the future.

What happens if one "edits down" the app_info.xml to just 5.15?

-- Ned


Tried that on my Ubuntu machine, all downloaded workunits (40 of them) instantly errored out because they were downloaded as 5.12 workunits without the app_info.xml... don't know why, because _with_ the app_info.xml it only downloads 5.17 workunits (using Simon's 2.2B client).

Greetings,
Sander



ID: 571830
crazyrabbit1
Joined: 17 Sep 06
Posts: 35
Credit: 2,282,319
RAC: 0
Germany
Message 571850 - Posted: 19 May 2007, 23:53:17 UTC

At the moment I am running the Chicken apps and still get work whenever I ask for it. I think msattler noticed it first, but it also seems to work after a restart; I just tested it.

1. How to get new work is clear, I think: stop Boinc, rename app_info.xml, start Boinc, download the original app and get new work.
2. Stop Boinc again, rename app_info.xml back to its original name and start Boinc again. Now it crunches with the opt app.
3. Stop Boinc again and rename app_info.xml again.
4. Restart Boinc; it runs with the opt app and gets work as if it were using the original app.
It works on my side even after a restart of the PC.

I hope this helps until the problem on the server can be fixed.
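
The rename part of the steps above can be sketched in Python. This is only an illustration under assumptions: the parked file name `app_info.xml.disabled` and the function name are made up (any other name works for the rename), the project directory is whatever your own install uses, and stopping and starting Boinc around the call is left to you.

```python
from pathlib import Path

ACTIVE_NAME = "app_info.xml"
PARKED_NAME = "app_info.xml.disabled"  # hypothetical name, chosen for this sketch


def set_app_info_enabled(project_dir, enabled):
    """Rename app_info.xml so Boinc will (or will not) find it at its next start.

    Stop Boinc before calling this and start it again afterwards.
    Returns True if a file was renamed, False if there was nothing to rename.
    """
    project_dir = Path(project_dir)
    src = project_dir / (PARKED_NAME if enabled else ACTIVE_NAME)
    dst = project_dir / (ACTIVE_NAME if enabled else PARKED_NAME)
    if not src.exists():
        return False
    if dst.exists():
        # Refuse to clobber an existing file rather than guess which one is right.
        raise FileExistsError(f"{dst} already exists; not overwriting it")
    src.rename(dst)
    return True
```

Step 1 would be `set_app_info_enabled(project_dir, False)`, step 2 the same call with `True`, and step 3 with `False` again, each one between a stop and a start of Boinc.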
ID: 571850
Juha
Volunteer tester
Joined: 7 Mar 04
Posts: 388
Credit: 1,857,738
RAC: 0
Finland
Message 571854 - Posted: 20 May 2007, 0:06:40 UTC
Last modified: 20 May 2007, 0:15:13 UTC

I tried the app_info.xml workaround and at the moment I'm happily crunching with Chicken's app but without app_info.xml. After reading this thread (and a bunch of others) I took a look at client_state.xml and sched_reply_setiathome.berkeley.edu.xml and I think I can make a guess as to why this works the way it does.

In client_state.xml I have the following:
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>515</version_num>
    <file_ref>
        <file_name>KWSN_2.2B_SSE_Ben-Joe.exe</file_name>
        <main_program/>
    </file_ref>
</app_version>


and in scheduler's reply:
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>515</version_num>
    <file_ref>
       <file_name>setiathome_5.15_windows_intelx86.exe</file_name>
       <main_program/>
    </file_ref>
    <file_ref>
        [some other files]
    </file_ref>
    <platform>windows_intelx86</platform>
</app_version>


I don't know if I'm jumping to conclusions here but it seems like BOINC just compares version numbers and since they are equal it doesn't bother to look at the filenames and just keeps whatever it originally had.

Some of you said that it would download the stock app once it starts to crunch fresh WUs but I don't think it's going to do that either. One more clip:
<workunit>
    <name>04mr05ab.17213.21808.934666.3.176</name>
    <app_name>setiathome_enhanced</app_name>
    <version_num>515</version_num>
        [snip]
</workunit>


All there is is the app name and version number, no executable names.
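
To spell the guess out, here is a small Python sketch. It is not taken from the BOINC source; it only encodes the assumption that the client keys an app on `app_name` plus `version_num` and ignores the file names. The function names are made up, and the two XML strings are trimmed copies of the clips above.

```python
import xml.etree.ElementTree as ET

LOCAL = """<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>515</version_num>
    <file_ref>
        <file_name>KWSN_2.2B_SSE_Ben-Joe.exe</file_name>
        <main_program/>
    </file_ref>
</app_version>"""

SERVER = """<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>515</version_num>
    <file_ref>
        <file_name>setiathome_5.15_windows_intelx86.exe</file_name>
        <main_program/>
    </file_ref>
    <platform>windows_intelx86</platform>
</app_version>"""


def app_version_key(xml_text):
    """Return (app_name, version_num) from an <app_version> element.

    Assumes both child elements are present, as in the clips above.
    """
    root = ET.fromstring(xml_text)
    return root.findtext("app_name"), int(root.findtext("version_num"))


def needs_download(local_xml, server_xml):
    """Guess: a new app is fetched only if the name or version number differs."""
    return app_version_key(local_xml) != app_version_key(server_xml)


print(needs_download(LOCAL, SERVER))  # False: same name and version, file names ignored
```

If the local entry carried a different `version_num`, the same check would come out True, which would fit the reports of the stock app being downloaded on some machines.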

And just to make my post even longer I quote this:


Note: if you decide to switch back to using the project-supplied executables, you must delete the app_info.xml file, then reset the project.


-Juha

[edit] Well this is one generous board. I used pre-tags to get nice formatting and it gave me all those linebreaks too. [/edit]
ID: 571854
littlegreenmanfrommars
Volunteer tester
Joined: 28 Jan 06
Posts: 1410
Credit: 934,158
RAC: 0
Australia
Message 571862 - Posted: 20 May 2007, 0:22:46 UTC - in response to Message 571569.  

...
What I'm not clear on is when Boinc will decide to use the stock app instead of the optimized one.


The app seems to be recorded in client_state.xml; somehow, in my case, KWSN_2.2B_SSE2-P4_Ben-Joe.exe has inserted itself where the standard app used to be... worth a closer look.



I also seem to recall that the first time I restarted Boinc without the app info file, some of the rigs went through some sort of update, downloading Seti program files,


This is what happened to my machine. However, the optimised app is apparently doing the crunching. (I renamed the app_info.xml to oldapp_info.xml, restarted, and got the WUs. Then I stopped the service again, renamed oldapp_info.xml to app_info.xml and restarted again.)
ID: 571862
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 14010
Credit: 208,696,464
RAC: 304
Australia
Message 571870 - Posted: 20 May 2007, 0:34:32 UTC - in response to Message 571749.  

My cache is set on 10 days, so I was a bit surprised I only got 2 WUs.

Try setting it to something more realistic such as 4.

Grant
Darwin NT
ID: 571870
zombie67 [MM]
Volunteer tester
Joined: 22 Apr 04
Posts: 758
Credit: 27,771,894
RAC: 0
United States
Message 571901 - Posted: 20 May 2007, 1:10:02 UTC - in response to Message 571749.  

My cache is set on 10 days, so I was a bit surprised I only got 2 WUs. I have a dual core so I guess it gave me one per core?

You have run into the "panic" problem (aka EDF, I think). S@H has WUs with a deadline as short as 4 days. The way the scheduling logic works, if you have your connect time set to anything longer than "shortest deadline -3", it will go into panic mode, and not download any more units. Short version: The longer you set your connect time, the smaller the queue will be. I know, it's backwards. But it is what it is.

The solution is to upgrade to 5.9.11, set your connect time to something like ".1", and set the "Maintain enough work for an additional" to something like "5". There is a risk that you will download more work than can be returned before the deadline. For example, if you downloaded 100% WUs with a 4 day deadline. So keep an eye on your queue to make sure you don't get into trouble.
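
The rule of thumb above can be written as a one-line check. This is only a sketch of the rule as described in this post, not code taken from the BOINC client; the function name and the `margin_days` parameter are made up for illustration.

```python
def goes_into_panic(connect_days, shortest_deadline_days, margin_days=3.0):
    """Rule of thumb: panic if connect time > shortest deadline - 3 days."""
    return connect_days > shortest_deadline_days - margin_days
```

With a 4 day shortest deadline the threshold works out to 1 day, so a connect time of 10 trips the check and a connect time of .1 does not.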
Dublin, California
Team: SETI.USA
ID: 571901
zombie67 [MM]
Volunteer tester
Joined: 22 Apr 04
Posts: 758
Credit: 27,771,894
RAC: 0
United States
Message 571907 - Posted: 20 May 2007, 1:13:43 UTC - in response to Message 571720.  

Crikey! it works on Macs :O

Oh, were we waiting for confirmation? Sorry, I've been doing this successfully on my macs for over a day now.

The problem is with the way the server deals with an app_info.xml file. So that makes the problem and solution platform independent.
Dublin, California
Team: SETI.USA
ID: 571907
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 571915 - Posted: 20 May 2007, 1:20:13 UTC - in response to Message 571901.  

My cache is set on 10 days, so I was a bit surprised I only got 2 WUs. I have a dual core so I guess it gave me one per core?

You have run into the "panic" problem (aka EDF, I think). S@H has WUs with a deadline as short as 4 days. The way the scheduling logic works, if you have your connect time set to anything longer than "shortest deadline -3", it will go into panic mode, and not download any more units. Short version: The longer you set your connect time, the smaller the queue will be. I know, it's backwards. But it is what it is.

The solution is to upgrade to 5.9.11, set your connect time to something like ".1", and set the "Maintain enough work for an additional" to something like "5". There is a risk that you will download more work than can be returned before the deadline. For example, if you downloaded 100% WUs with a 4 day deadline. So keep an eye on your queue to make sure you don't get into trouble.


Excuse me, but I didn't think that Boinc was using that setting anywhere yet. It says in the update preferences page that it requires 5.10 client, which AFAIK doesn't exist yet. Have they already implemented it in 5.9.11?
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 571915
zombie67 [MM]
Volunteer tester
Joined: 22 Apr 04
Posts: 758
Credit: 27,771,894
RAC: 0
United States
Message 571949 - Posted: 20 May 2007, 1:49:20 UTC - in response to Message 571915.  
Last modified: 20 May 2007, 1:50:22 UTC

Excuse me, but I didn't think that Boinc was using that setting anywhere yet. It says in the update preferences page that it requires 5.10 client, which AFAIK doesn't exist yet. Have they already implemented it in 5.9.11?

BOINC uses odd dot numbers for the development versions, and then changes it to an even number for the release versions. In other words, all the 5.9.* = 5.10.*.

5.9.* is just the development version of 5.10.*. So they will have the same feature set.
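
As a tiny illustration of that numbering convention, here is a Python sketch. It assumes the odd/even rule exactly as described in this post, and the function name is made up.

```python
def release_series(major, minor):
    """Map a BOINC (major, minor) series to the release series it leads to."""
    if minor % 2 == 1:
        # Odd minor number: a development series that becomes the next even one.
        return major, minor + 1
    # Even minor number: already a release series.
    return major, minor
```

So `release_series(5, 9)` gives `(5, 10)`, which is why a preference that asks for a 5.10 client would already be understood by 5.9.11.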
Dublin, California
Team: SETI.USA
ID: 571949
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 571953 - Posted: 20 May 2007, 1:54:21 UTC - in response to Message 571949.  

Excuse me, but I didn't think that Boinc was using that setting anywhere yet. It says in the update preferences page that it requires 5.10 client, which AFAIK doesn't exist yet. Have they already implemented it in 5.9.11?

BOINC uses odd dot numbers for the development versions, and then changes it to an even number for the release versions. In other words, all the 5.9.* = 5.10.*.

5.9.* is just the development version of 5.10.*. So they will have the same feature set.


Thank you for the clarification!

"Time is simply the mechanism that keeps everything from happening all at once."

ID: 571953
zombie67 [MM]
Volunteer tester
Joined: 22 Apr 04
Posts: 758
Credit: 27,771,894
RAC: 0
United States
Message 571954 - Posted: 20 May 2007, 1:55:59 UTC - in response to Message 571953.  

5.9.* is just the development version of 5.10.*. So they will have the same feature set.


Thank you for the clarification!

I just learned that myself. Check it out here for a bit more information:

http://boinc.berkeley.edu/dev/forum_thread.php?id=1818
Dublin, California
Team: SETI.USA
ID: 571954
Idefix
Volunteer tester
Joined: 7 Sep 99
Posts: 154
Credit: 482,193
RAC: 0
Germany
Message 571997 - Posted: 20 May 2007, 2:52:45 UTC - in response to Message 571854.  

I don't know if I'm jumping to conclusions here but it seems like BOINC just compares version numbers and since they are equal it doesn't bother to look at the filenames and just keeps whatever it originally had.

It looks like you are right.

Some users are using a different version number in their app_info.xml (e.g. those users who are using the app_info.xml which comes with the Chicken apps). In that case the version number no longer matches, so the client starts to download the stock application if you remove/rename the app_info.xml.

Regards,
Carsten
ID: 571997
Odysseus
Volunteer tester
Joined: 26 Jul 99
Posts: 1808
Credit: 6,701,347
RAC: 6
Canada
Message 572071 - Posted: 20 May 2007, 6:32:02 UTC - in response to Message 571731.  

Are you sure it is running the right app? I don't know much about [modern] Macs, sorry.

I’m not sure what you mean by “the right app”. It is the optimized app I’ve been using all along, but now BOINC’s Tasks tab says it’s the stock v5.13. The host is only halfway through the batch of WUs it downloaded, and none have been reported yet, so I don’t know what BOINC will tell the servers about it.
ID: 572071
Odysseus
Volunteer tester
Joined: 26 Jul 99
Posts: 1808
Credit: 6,701,347
RAC: 6
Canada
Message 572081 - Posted: 20 May 2007, 6:38:51 UTC - in response to Message 571742.  

Speculation on my part: since you mention Beta, do you have one of those multi-version app_info, with a version_number for stock and a version_number for Beta (and possibly some spares as well), all referring to the same application?

Yes, but just the two entries: one for v5.13 and one for v5.17 because of the ‘crossed wires’ with Beta. The tasks and results crunched by the optimized app have always been labelled v5.17 up to now.
ID: 572081
Misfit
Volunteer tester
Joined: 21 Jun 01
Posts: 21804
Credit: 2,815,091
RAC: 0
United States
Message 572107 - Posted: 20 May 2007, 7:03:44 UTC

200th GHOST!
me@rescam.org
ID: 572107