Crunching in Chronological Order?

Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 676916 - Posted: 12 Nov 2007, 23:26:42 UTC - in response to Message 676910.  


Sorry Astro, I left out no 3 average CPU Efficiency which is 0.987436

Then those settings aren't the reason. If it started during an outage, did you happen to play with something and forget to set it back? For example "no new tasks", or did you accidentally invoke "local prefs" to override the web-based ones, etc.
ID: 676916 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 676947 - Posted: 13 Nov 2007, 0:01:52 UTC - in response to Message 676916.  


Sorry Astro, I left out no 3 average CPU Efficiency which is 0.987436

Then those settings aren't the reason. If it started during an outage, did you happen to play with something and forget to set it back?? Like "no new tasks", did you accidently invoke "local prefs" to override web based, etc.


No, nothing was touched on my wife's machine. I do all my tinkering, testing etc on my main system (NET-1)
ID: 676947 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 677040 - Posted: 13 Nov 2007, 2:47:04 UTC - in response to Message 676947.  


Sorry Astro, I left out no 3 average CPU Efficiency which is 0.987436

Then those settings aren't the reason. If it started during an outage, did you happen to play with something and forget to set it back?? Like "no new tasks", did you accidently invoke "local prefs" to override web based, etc.


No, nothing was touched on my wife's machine. I do all my tinkering, testing etc on my main system (NET-1)

[humour mode]
You might just have found the problem "wife's machine". At one time this lady turned your brain to mush so badly that you asked her to marry you. If you could be handled so easily, how the hell is a silly piece of silicon going to survive untouched.
[/humour mode]
ID: 677040 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 677146 - Posted: 13 Nov 2007, 12:14:19 UTC - in response to Message 677040.  

[humour mode]
You might just have found the problem "wife's machine". At one time this lady turned your brain to mush so badly that you asked her to marry you. If you could be handled so easily, how the hell is a silly piece of silicon going to survive untouched.
[/humour mode]


You could be right! 2 left with 1 at 76% crunched - AND it's running normally - no high priority - but still no downloads!
ID: 677146 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 677154 - Posted: 13 Nov 2007, 13:07:07 UTC - in response to Message 677146.  

[humour mode]
You might just have found the problem "wife's machine". At one time this lady turned your brain to mush so badly that you asked her to marry you. If you could be handled so easily, how the hell is a silly piece of silicon going to survive untouched.
[/humour mode]


You could be right! 2 left with 1 at 76% crunched - AND its running normal - no high priority - but still no downloads!


It just downloaded 3 new wu's then immediately switched to the 3rd one & started running in high priority.

This is really weird!!!
ID: 677154 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 677166 - Posted: 13 Nov 2007, 13:32:34 UTC - in response to Message 677154.  

Talk about weird!

Reset the project & it resent the 3 wu's I had aborted. Then it immediately went to the 3rd & started running in high priority.

Aborted again, deleted boinc, ran disk cleanup, defragged, ran antivirus then rebooted system.

Did clean install of boinc & it d/l'ed 6 new wu's & started running normal.

Could this be a bug in boinc?
ID: 677166 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 677168 - Posted: 13 Nov 2007, 13:35:09 UTC

break down "deleted boinc" for me
ID: 677168 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 677175 - Posted: 13 Nov 2007, 13:53:17 UTC - in response to Message 677168.  

break down "deleted boinc" for me


Aborted wu's, exited boinc. control panel, add/remove, then deleted folder.

Cleaned up system, rebooted, reinstalled boinc.

Downloaded 6 wu's & started crunching normally. (Still is)

This system is totally clean; any new program is first tested on mine &, if OK, a copy is loaded onto my wife's.

Since upgrading this system late Sept, nothing else has been done to it.
Only remaining upgrade is to add ram at the end of the month.

Nothing has been touched/changed on boinc & it has been running great until last weekend.
ID: 677175 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 677184 - Posted: 13 Nov 2007, 14:20:10 UTC

OK, you uninstalled, then deleted the folder, which gave you a fresh install. Most of the XML files aren't removed by "uninstalling" alone. So, when you deleted the folder, you deleted the XML file that was causing what you were seeing. It must have been some record/setting we didn't find.
ID: 677184 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 677186 - Posted: 13 Nov 2007, 14:23:24 UTC - in response to Message 677184.  

OK, you uninstalled, then deleted the folder creating a fresh install. most the xml's aren't removed by simply "uninstalling" alone. So, when you deleted the folder, you deleted the xml file which was causing what you were seeing. Must have been some record/setting we didn't find.


Richard put me onto RDCF. As well as the info everyone has provided, is there anything else you can think of that I should watch out for should it happen again?
ID: 677186 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 677188 - Posted: 13 Nov 2007, 14:28:07 UTC - in response to Message 677186.  

OK, you uninstalled, then deleted the folder creating a fresh install. most the xml's aren't removed by simply "uninstalling" alone. So, when you deleted the folder, you deleted the xml file which was causing what you were seeing. Must have been some record/setting we didn't find.


Richard put me onto RDCF. As well as the info everyone has provided, is there anything else you can think of that I should watch out for should it happen again?

If I knew of something, I'd have had you look at it.
ID: 677188 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 677191 - Posted: 13 Nov 2007, 14:30:18 UTC - in response to Message 677188.  

OK, you uninstalled, then deleted the folder creating a fresh install. most the xml's aren't removed by simply "uninstalling" alone. So, when you deleted the folder, you deleted the xml file which was causing what you were seeing. Must have been some record/setting we didn't find.


Richard put me onto RDCF. As well as the info everyone has provided, is there anything else you can think of that I should watch out for should it happen again?

? If I knew of something, I'd of had you look at it.



Ok. Thanks for the help, it was appreciated.

Regards

PJ
ID: 677191 · Report as offensive
archae86

Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 677199 - Posted: 13 Nov 2007, 14:50:38 UTC - in response to Message 677186.  

OK, you uninstalled, then deleted the folder creating a fresh install. most the xml's aren't removed by simply "uninstalling" alone. So, when you deleted the folder, you deleted the xml file which was causing what you were seeing. Must have been some record/setting we didn't find.


Richard put me onto RDCF. As well as the info everyone has provided, is there anything else you can think of that I should watch out for should it happen again?

Just one suggestion: if it happens again, you might try saving a copy of the BOINC directory as is before doing your uninstall (I assume if you hide a copy elsewhere with a different top-level directory name that the uninstall won't find it).

Then, if it starts back up healthy, and curiosity extends far enough, you'd have comparison good/bad copies to check if any of us had any bright ideas to offer.

Or you could even use the file compare feature of a program such as Textpad to look. Unfortunately, there would probably be such a mass of unimportant differences that the comparison method would only be helpful to check a specific suspicion.

It pretty much has to be a bad piece of state, and, as Astro has suggested, the .xml files seem the likely harbor for state that persists after rebooting. Unless, of course, BOINC uses the registry for such things.
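
The good/bad comparison suggested above could be sketched roughly as follows. This is only an illustration: it assumes both saved copies are plain directories with the .xml state files at the top level, and the function name `diff_xml_state` and the "bad"/"good" labels are made up for the example. As noted, expect plenty of unimportant differences in the output.

```python
import difflib
from pathlib import Path


def _read_lines(path):
    """Return the file's lines, or an empty list if the file is missing."""
    if not path.is_file():
        return []
    return path.read_text(errors="replace").splitlines()


def diff_xml_state(bad_dir, good_dir):
    """Map each .xml file name to its unified diff between the two copies.

    Files that are identical in both copies are left out; a file present
    in only one copy shows up as all-added or all-removed lines.
    """
    bad, good = Path(bad_dir), Path(good_dir)
    names = sorted(
        {p.name for p in bad.glob("*.xml")} | {p.name for p in good.glob("*.xml")}
    )
    report = {}
    for name in names:
        delta = list(
            difflib.unified_diff(
                _read_lines(bad / name),
                _read_lines(good / name),
                fromfile=f"bad/{name}",
                tofile=f"good/{name}",
                lineterm="",
            )
        )
        if delta:
            report[name] = delta
    return report
```

Calling `diff_xml_state("BOINC_bad_copy", "BOINC")` (both paths hypothetical) and printing only the entry for one suspected file keeps the output manageable.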

Thanks for sharing your experience. It sure was odd.

ID: 677199 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 677234 - Posted: 13 Nov 2007, 15:33:01 UTC - in response to Message 677188.  

OK, you uninstalled, then deleted the folder creating a fresh install. most the xml's aren't removed by simply "uninstalling" alone. So, when you deleted the folder, you deleted the xml file which was causing what you were seeing. Must have been some record/setting we didn't find.


Richard put me onto RDCF. As well as the info everyone has provided, is there anything else you can think of that I should watch out for should it happen again?

? If I knew of something, I'd of had you look at it.

Tony, do you think a cc_config.xml with <work_fetch_debug> turned on might reveal something useful if it happens again?
                                                         Joe
ID: 677234 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 677256 - Posted: 13 Nov 2007, 16:23:09 UTC - in response to Message 677199.  

OK, you uninstalled, then deleted the folder creating a fresh install. most the xml's aren't removed by simply "uninstalling" alone. So, when you deleted the folder, you deleted the xml file which was causing what you were seeing. Must have been some record/setting we didn't find.


Richard put me onto RDCF. As well as the info everyone has provided, is there anything else you can think of that I should watch out for should it happen again?

Just one suggestion: if it happens again, you might try saving a copy of the BOINC directory as is before doing your uninstall (I assume if you hide a copy elsewhere with a different top-level directory name that the uninstall won't find it).

Then, if it starts back up healthy, and curiosity extends far enough, you'd have comparison good/bad copies to check if any of us had any bright ideas to offer.

Or you could even use the file compare feature of a program such as Textpad to look. Unfortunately, there would probably be such a mass of unimportant differences that the comparison method would only be helpful to check a specific suspicion.

It pretty much has to be a bad piece of state, and, as Astro has suggested, the .xml files seem likely the harbor for state that persists after rebooting. Unless, of course, BOINC uses the registry for such things.

Thanks for sharing your experience. It sure was odd.


You're more than welcome. I have found the N/C board to be of tremendous value. These guys deserve every plaudit we can muster.
ID: 677256 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 677278 - Posted: 13 Nov 2007, 17:09:31 UTC - in response to Message 677234.  

Tony, do you think a cc_config.xml with <work_fetch_debug> turned on might reveal something useful if it happens again?

rr_simulation will show calculations like "misses deadline by NNN" and so on, so it should at least in theory be helpful.

work_fetch_debug will show messages like "project not contactable" and so on, so it will also be helpful in cases where the client doesn't ask for more work.

The disadvantage of both of these is that they'll very quickly generate a large log file, and rr_simulation especially will be difficult to understand...
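
For anyone who wants to try the two flags mentioned above, here is a minimal sketch that builds the text of a cc_config.xml. It assumes the usual layout in which logging flags sit inside a `<log_flags>` element under `<cc_config>`; the helper name `build_cc_config` is made up for the example, and the layout should be checked against the client configuration documentation for your BOINC version.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags=("work_fetch_debug", "rr_simulation")):
    """Return cc_config.xml text with each named log flag set to 1.

    Assumption: flags belong inside <cc_config><log_flags>...</log_flags>.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the text; save it as cc_config.xml in the BOINC data directory.
    print(build_cc_config())
```

Given how fast the log grows, it is probably best to enable the flags only while the problem is happening and remove them afterwards.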


While it doesn't tell you the reason, for anyone running BOINC v5.10.14 or later a fairly easy rule to remember is:
As long as at least one task for a project is marked "High priority", that project is blocked from asking for more work, unless a CPU is idle.
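
Written out as a tiny check, the rule of thumb above might look like this. It is only a restatement of the rule, not the client's actual work-fetch code; the function name and the idea of passing in the status strings from the Tasks tab are assumptions made for the example.

```python
def project_may_fetch_work(task_statuses, cpu_idle=False):
    """Apply the rule of thumb: any high-priority task blocks work fetch
    for that project, unless a CPU is sitting idle.

    task_statuses: status strings for one project's tasks, e.g. as shown
    on the Tasks tab ("Running", "Running, High Priority", ...).
    """
    has_high_priority = any("high priority" in s.lower() for s in task_statuses)
    return cpu_idle or not has_high_priority


if __name__ == "__main__":
    statuses = ["Running, High Priority", "Ready to run", "Ready to run"]
    print(project_may_fetch_work(statuses))                 # False
    print(project_may_fetch_work(statuses, cpu_idle=True))  # True
```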


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 677278 · Report as offensive
archae86

Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 677337 - Posted: 13 Nov 2007, 21:28:23 UTC - in response to Message 677278.  


As long as atleast one Task for a project is marked "High priority" == this project blocked from asking for more work, except if idle cpu.

Very interesting: I've not noticed this marking, and don't know where it would show up if I somehow got into this state. Would it be in the status column of the Tasks tab of BOINCmgr?

ID: 677337 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 677360 - Posted: 13 Nov 2007, 22:22:51 UTC - in response to Message 677337.  

Very interesting: I've not noticed this marking, and don't know where it would show up if I somehow got into this state. Would it be in the status column of the Tasks tab of BOINCmgr?

Yes, on Tasks tab, it will show-up as "Running, High Priority".

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 677360 · Report as offensive
Mac-Nic
Volunteer tester
Joined: 29 Jun 00
Posts: 165
Credit: 551,008
RAC: 0
Belgium
Message 677504 - Posted: 14 Nov 2007, 1:58:41 UTC

Another one to bear in mind.
I did the following test with Boinc ver 5.10.20

1 suspended a wu with a deadline of 08-01-2008
2 raised the cache from 3 to 4 days with the connection time set at 0.001 day
3 crunched 16 wu's
4 observation: no downloads
5 reactivated this 08-01-2008 wu
6 result: download back to normal

Conclusion: this is a Boinc feature to prevent the suspended task from timing out.

To say it in Ingleside's words:
As long as at least one task for a project is marked "Suspended", that project is blocked from asking for more work.
ID: 677504 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 677535 - Posted: 14 Nov 2007, 4:06:43 UTC - in response to Message 676632.  


At the same time, it is not d/l'ing any more wu's, whereas my other hosts are.

Since upgrading to newer BOINC versions (5.10.28 currently, but saw this before that), I've been seeing incidences in which it stops requesting work on one project on one machine, working the queue down to very near zero before requesting work.

I've mentioned a couple of instances here, and John McLeod has responded with a list of reasons for stopping fetch. However in the cases in point, none of the reasons he supplied were at hand.

Whether this is some sort of bug dependence on some internal state not obviously dependent on actual circumstances, or is just yet another intentional circumstance which applies to me but not mentioned by John, I don't know.

One quite recent example:
Q6600 Windows XP boinc 5.10.28
4 % SETI share, 96 % Einstein share
Short term and long term debt between projects in balance to less than 10
Normal alternation between 4 Einstein instances and 3 Einstein plus one SETI continued unaltered (i.e. mine did not look like EDF)
"Computer is connected to the Internet about every " 0.002 days
"Maintain enough work for an additional" 7.26 days
SETI Result Duration Correction Factor in this period .14 to .17

While Einstein work request continued, typically getting one result at time at intervals of about one to three hours, SETI work request just stopped. The amount of SETI work in queue, as estimated by BOINCView, declined from its usual about 150 hours (as prorated for resource share and such) to about 10 hours (near zero, really, given the resource share pro-rating) before a work request was finally generated.

I did not observe an abnormality of work order choice in these instances, but I was not looking for it.

This is not a match to the situation Sirius_B reports here, but is one in the category of work fetch oddities.

On the topic of "why would anyone care?", in this particular case it defeated my attempt to maintain a diverse stock of unprocessed work units of varied angle ranges and observation dates in order to provide a quick assessment of the correctness and relative speed of a hoped-for new SETI science app optimized code release. By the time fetch resumed, all of the stock I had husbanded for weeks was gone.

On the other hand, fetch did indeed resume, without any violent actions on my part, and absent some special interests such as the assessment I mentioned above, there was not much of a problem here. I'm just adding to the observation pool.

If there are 100 hours of S@H work with a 4% resource share, the RR simulator is going to believe that it will not complete for 2500 hours (100 / 0.04) - reason enough to stop work fetch and run in EDF.
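
The arithmetic behind that figure can be sketched as below. This is a deliberate simplification of what the round-robin simulator does: it assumes the project gets exactly its resource share of the available processing time and ignores deadlines, multiple CPUs and the duration correction factor. The function name is made up for the example.

```python
def projected_wall_hours(queued_hours, share_percent):
    """Estimate how long queued work takes if the project only ever gets
    its resource share of the processing time.

    queued_hours:  estimated hours of work in the queue
    share_percent: the project's resource share, in percent
    """
    if share_percent <= 0:
        raise ValueError("resource share must be positive")
    return queued_hours * 100 / share_percent


if __name__ == "__main__":
    # The example from the post: 100 hours of work at a 4% share.
    print(projected_wall_hours(100, 4))  # 2500.0
```

By the same estimate, the roughly 150 hours of SETI work mentioned in the quoted report would project to 3750 hours at a 4% share.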


BOINC WIKI
ID: 677535 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.