Post-Weekend Roundup (Feb 05 2007)

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 545633 - Posted: 13 Apr 2007, 20:20:29 UTC - in response to Message 545617.  

Additional information on Zombie creation. The records do not seem to be processed in report date order. They seem to be processed in download order instead. I watched this happen several times and have recorded one event as an example. The jobs waiting to run were, in list order, dated May 10, April 21 and May 10, with run times of 7, 2 and 7 hours. When the next job was selected, the first in the list (May 10) was processed. The system is an Apple OS X PPC machine and has been running for several days now. This is not a problem for me because I limit data to only one day's worth of processing, but people loading a week or more of data could have problems with this and create Zombies by having jobs time out before being processed.

If you have three results to process, and only three, and you've got "connect every 1 days" then these three work units represent no deadline pressure at all. They will be done in the order they were assigned.

If BOINC sees that it can't return a work unit before the deadline, it processes in order by deadline instead -- and as long as the work estimate is reasonably accurate (the 2 hour work unit really does take 2 hours) and the machine is on for a reasonable amount of time per day, everything will get reported on time.

There are only a couple of reasons that might not be true:

1) Really big cache and lots of short deadlines.

2) Change in the operating patterns of this machine.

But with 16 hours of work, connect every day, and the closest deadline more than a week away, the odds of something getting orphaned are very small.
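To make the rule above concrete, here is a minimal sketch of it in Python. It is not the BOINC client's code: the `Task` fields, the function names and the hour figures are assumptions made up for illustration (the deadlines are rough conversions of the April 21 and May 10 dates, counted from the day of this post), and the real scheduler also weighs resource shares and debts.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    received: int          # position in download order (hypothetical field)
    est_hours: float       # estimated remaining run time
    deadline_hours: float  # hours from now until the report deadline


def misses_deadline(tasks, hours_on_per_day=24.0):
    """Return True if running tasks in the given order would miss any deadline."""
    elapsed = 0.0
    for t in tasks:
        # Wall-clock time stretches when the machine is on only part of the day.
        elapsed += t.est_hours * 24.0 / hours_on_per_day
        if elapsed > t.deadline_hours:
            return True
    return False


def run_order(tasks, hours_on_per_day=24.0):
    """Download order when that is safe, earliest deadline first otherwise."""
    fifo = sorted(tasks, key=lambda t: t.received)
    if not misses_deadline(fifo, hours_on_per_day):
        return fifo
    return sorted(tasks, key=lambda t: t.deadline_hours)


# The three results from the report: 7, 2 and 7 hours of work.
queue = [
    Task("may10_a", 0, 7.0, 27 * 24.0),
    Task("apr21", 1, 2.0, 8 * 24.0),
    Task("may10_b", 2, 7.0, 27 * 24.0),
]
relaxed = [t.name for t in run_order(queue)]

# Same queue, but pretend the April 21 result were due in only 8 hours.
tight_queue = [queue[0], Task("apr21", 1, 2.0, 8.0), queue[2]]
tight = [t.name for t in run_order(tight_queue)]
```

With the deadlines as reported, nothing is at risk, so the sketch keeps download order. Only in the artificial tight case does the April 21 result jump to the front.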
ID: 545633
Dena Wiltsie
Volunteer tester
Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 545668 - Posted: 13 Apr 2007, 21:24:22 UTC

Avoiding orphaned data is the reason I limit myself to one day of work. Processing in received order can still bite. I had a World Community Grid work unit that uncovered a bug and required over 35 hours to complete when it should have taken less than 4 hours. It forced my system to stop swapping jobs until the work unit was complete. I just think it would make the code much cleaner to process by Report Deadline instead of placing special conditions on when the hurry-up code can be used. If the special code miscalculates when the deadline code should be used, you end up with expired work units. You can also have people who turn their system off over the weekend, which would cause the prediction software to fail because it was unable to run while the system was powered off. You would buy more time by processing early deadlines first.
ID: 545668
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 545695 - Posted: 13 Apr 2007, 22:13:10 UTC - in response to Message 545668.  

Dena,

The big problem is CPDN. Deadlines for CPDN are on the order of a year for most folks.

If you process strictly by deadline order and gave CPDN a 50% resource share, you'd do no work on CPDN until debt or deadlines got close, then nothing but CPDN for a very long time, and then no more CPDN for months.

Otherwise, I'd agree -- but that only works if the work units are all measured in hours or days, not months and years.
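A toy simulation of the point above, under loudly stated assumptions: a fresh short work unit with a 10 day (240 hour) deadline is always available, the CPDN model is due in one year (8760 hours), and the CPU is handed out in 6 hour slices. The function names and all of the numbers are made up for illustration, and debts and work fetch are ignored entirely.

```python
def strict_deadline_hours(sim_hours, slice_hours=6,
                          short_deadline=240, long_deadline=8760):
    """CPU hours per project when the earliest deadline always wins."""
    hours = {"short": 0, "cpdn": 0}
    now = 0
    while now < sim_hours:
        # A short work unit fetched now is due at now + short_deadline.
        if now + short_deadline < long_deadline:
            hours["short"] += slice_hours
        else:
            hours["cpdn"] += slice_hours
        now += slice_hours
    return hours


def resource_share_hours(sim_hours, cpdn_share=0.5):
    """CPU hours per project when time is simply split by resource share."""
    cpdn = sim_hours * cpdn_share
    return {"short": sim_hours - cpdn, "cpdn": cpdn}


first_month_strict = strict_deadline_hours(30 * 24)
first_month_shared = resource_share_hours(30 * 24)
whole_year_strict = strict_deadline_hours(8760)
```

In this sketch strict deadline order gives CPDN nothing in the first month, and over the whole year it only gets the final 240 hours, far too late for a model that needs months of crunching. A 50% split gives it 360 hours in the first month alone.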

-- Ned
ID: 545695
Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 3075
Credit: 5,631,463
RAC: 0
Canada
Message 546024 - Posted: 14 Apr 2007, 15:18:11 UTC
Last modified: 14 Apr 2007, 15:26:35 UTC

I've had a project make an error of a factor of 65 in its crunching estimate. 3 hours of crunch time slowly became 197 hrs, on a two week deadline. I had work on hand for all the other six projects I was connected to at the time. Boinc managed to process everything else I had on the computer on time and managed to return the errant WU only 3 or 4 hr later than my due time. By then, the project had noticed the problem and made adjustments at their end to accept and credit the past due WU. That was on my slow Duron system. As long as you keep a reasonable sized cache there is little reason to worry that Boinc will have problems under normal circumstances and it can even adapt to some degree when the projects mess up.

I've never lost any credits or wasted CPU time because Boinc wasn't able to process and return work on time. As far as I'm concerned, the Boinc work scheduler has always been extremely efficient at doing its job as intended. I can only tip my hat to JM7 and the other developers who are continuously tweaking and improving it to remove the few 'special situation' glitches that remain.



Boinc V7.2.42
Win7 i5 3.33G 4GB, GTX470
ID: 546024
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 546128 - Posted: 14 Apr 2007, 18:07:46 UTC
Last modified: 14 Apr 2007, 18:12:18 UTC

I had a situation as follows:

I had just set up a PIII 750 laptop for Boinc/Seti. My standard "home" computer at the time was an AMD Sempron 2400+, with a 4.5 day cache. For some reason, the BOINC client (5.4.11) thought that the LT had the same processing time as the "home" machine and downloaded results commensurate with the home machine's processing time. (About 20 WU's, each in the 7 hour range, for my "home" machine.) Unfortunately, those WU's take about 24-25 hours on the LT! Once I ran by deadline (10 days for all the WU's): a) the BOINC client had a better idea of how long a WU takes to crunch on the LT; and b) I manually cancelled all the WU's left on the machine (they showed as "compute error" in Berkeley), reset the project, got more WU's, and haven't had a problem since. (This was back in January of this year.)
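The arithmetic behind this can be checked with a short sketch. It assumes one CPU crunching one work unit at a time and the machine running around the clock; the function names are made up for illustration and are not anything from the BOINC client.

```python
def fits_before_deadline(n_tasks, hours_per_task, deadline_days,
                         hours_on_per_day=24.0):
    """True if n_tasks run back to back finish inside the deadline window."""
    total_hours = n_tasks * hours_per_task
    available_hours = deadline_days * hours_on_per_day
    return total_hours <= available_hours


def max_tasks(hours_per_task, deadline_days, hours_on_per_day=24.0):
    """Largest number of whole tasks that fit inside the deadline window."""
    return int(deadline_days * hours_on_per_day // hours_per_task)


# 20 WU's with a 10 day deadline, as in the post above.
ok_at_home_speed = fits_before_deadline(20, 7, 10)     # 140 h of work, 240 h available
ok_at_laptop_speed = fits_before_deadline(20, 24, 10)  # 480 h of work, 240 h available
laptop_limit_fast = max_tasks(24, 10)
laptop_limit_slow = max_tasks(25, 10)
```

At the home machine's 7 hours per WU the batch fits with room to spare. At 24 to 25 hours per WU only 9 or 10 of the 20 could finish in time, even with the laptop on 24 hours a day.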

[edit to add] My home computer has been upgraded since then, BTW [/edit]
.

Hello, from Albany, CA!...
ID: 546128
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 546271 - Posted: 14 Apr 2007, 22:40:51 UTC - in response to Message 546128.  

Did you copy the entire BOINC directory, or just the account_*.files? Or was it a clean install?


BOINC WIKI
ID: 546271
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 546645 - Posted: 15 Apr 2007, 15:01:49 UTC - in response to Message 546271.  
Last modified: 15 Apr 2007, 15:08:12 UTC

[edit to add] My home computer has been upgraded since then, BTW... both hardware (to Opteron 165) and Boinc (to 5.8.15)[/edit]


did you copy the entire BOINC directory, or just the account_*.files? Or was it a clean install?


Entirely clean install of BOINC on the LT... the LT came to me (it was used...) with Win XP Pro installed, (upgraded from Win Me) and Office 2K3 on it (although the installation seems faulty, the few times I've tried Office on the LT...)

The home machine runs XP Home Ed, and Win XP Pro x64, BTW (I dual boot...) Boinc is installed on x64, but I only run ClimatePrediction on x64...
.

Hello, from Albany, CA!...
ID: 546645
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.