WFV- Worst case solution

Message boards : Number crunching : WFV- Worst case solution

Grenadier
Volunteer tester
Joined: 15 May 99
Posts: 63
Credit: 5,445,784
RAC: 0
United States
Message 155688 - Posted: 23 Aug 2005, 15:48:03 UTC

Wouldn't the new client come with a new version number (4.20, let's say), that would trump the old entries in app_info.xml? So, worrying that people will keep using the old optimized app is not a problem, since their old optimized client will not be used when the new version gets rolled out. They would need to install a new optimized client that was associated with the new version number.
ID: 155688 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 155689 - Posted: 23 Aug 2005, 15:50:16 UTC - in response to Message 155686.  

I also note increasingly that the pending units have two and three other accepted results but are still out there as pending.

Fear NOT. The second you "reported" the returned WUs, an entry was made into the DB with the date and time. It won't be overdue.

Unless, of course, I get proven wrong. It has happened, you know.
ID: 155689 · Report as offensive
Profile Tern
Volunteer tester
Joined: 4 Dec 03
Posts: 1122
Credit: 13,376,822
RAC: 44
United States
Message 155693 - Posted: 23 Aug 2005, 15:51:33 UTC - in response to Message 155686.  

You know, I am beginning to think a piece of that worst case scenario is happening. That is, as the WFV queue includes older and older submitted work units, when they finally get processed, no credit gets awarded as the submitted work is defined as past due.


Past due = RETURNED after the deadline. Not VALIDATED after the deadline. It makes no difference if the work is validated the day it's returned or six months later, as long as it was returned by you before the deadline, and is correct, you'll get credit...
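
A minimal sketch of that rule, assuming hypothetical function and parameter names (this is not BOINC's actual server code): whether a result is past due depends only on when it was reported relative to its deadline, and the validation time is never consulted.

```python
from datetime import datetime, timezone


def is_past_due(reported_at: datetime, deadline: datetime) -> bool:
    """A result is past due only if it was reported after its deadline."""
    return reported_at > deadline


def earns_credit(reported_at: datetime, deadline: datetime, is_correct: bool) -> bool:
    # The validation time is deliberately not a parameter: a backlog in
    # the validator queue cannot turn an on-time result into a late one.
    return is_correct and not is_past_due(reported_at, deadline)


# Hypothetical dates, for illustration only.
deadline = datetime(2005, 8, 30, tzinfo=timezone.utc)
on_time = datetime(2005, 8, 23, tzinfo=timezone.utc)
late = datetime(2005, 9, 1, tzinfo=timezone.utc)

print(earns_credit(on_time, deadline, is_correct=True))
print(earns_credit(late, deadline, is_correct=True))
```

The first call models a result returned a week early and validated at any later time; the second models a correct result that was returned too late.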
ID: 155693 · Report as offensive
Profile Speedy67 & Friends
Volunteer tester
Joined: 14 Jul 99
Posts: 335
Credit: 1,178,138
RAC: 0
Netherlands
Message 155695 - Posted: 23 Aug 2005, 15:52:42 UTC - in response to Message 155688.  

Wouldn't the new client come with a new version number (4.20, let's say), that would trump the old entries in app_info.xml? So, worrying that people will keep using the old optimized app is not a problem, since their old optimized client will not be used when the new version gets rolled out. They would need to install a new optimized client that was associated with the new version number.


When the 4.18 app replaced the 4.09 app my 4.11 with app_info.xml config didn't update to the new client. It seems to ignore a new version number.

Greetings,
Sander



ID: 155695 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 155699 - Posted: 23 Aug 2005, 15:58:10 UTC - in response to Message 155686.  

You know, I am beginning to think a piece of that worst case scenario is happening. That is, as the WFV queue includes older and older submitted work units, when they finally get processed, no credit gets awarded as the submitted work is defined as past due.


The validator only looks at whatever WUs are waiting in the validator queue, and doesn't care whether a WU has been sitting there for one second or one month.

As long as a result has been reported before its deadline, it doesn't matter whether the validator is backlogged or not.
ID: 155699 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 155702 - Posted: 23 Aug 2005, 16:03:35 UTC - in response to Message 155688.  

Wouldn't the new client come with a new version number (4.20, let's say), that would trump the old entries in app_info.xml? So, worrying that people will keep using the old optimized app is not a problem, since their old optimized client will not be used when the new version gets rolled out. They would need to install a new optimized client that was associated with the new version number.


It should come with a different application name, so anyone using the current optimized SETI@Home application will either get only "normal" WUs or, more likely, the project will stop splitting "normal" WUs, so after a very short time they will get a message like "no work for your platform" until they remove the old optimized version.
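
As a rough illustration of the idea, here is a simplified model of matching work to a host by application name. It is only a sketch: the function, the matching rule, and the name `setiathome_new` are hypothetical, and the real scheduler is considerably more involved.

```python
def scheduler_reply(host_app_names, project_app_names):
    """Toy model: a host only gets work for applications it has installed.

    host_app_names    -- application names the host's app_info.xml provides
    project_app_names -- application names the project currently splits work for
    """
    matches = sorted(set(host_app_names) & set(project_app_names))
    if not matches:
        return "no work for your platform"
    return "work available for: " + ", ".join(matches)


# A host that still runs only the old optimized application.
old_optimized_host = ["setiathome"]

# While the project still splits "normal" work, the host keeps getting it.
print(scheduler_reply(old_optimized_host, ["setiathome", "setiathome_new"]))

# Once only the new application has work, the old install gets nothing.
print(scheduler_reply(old_optimized_host, ["setiathome_new"]))
```

Under this model a new version number alone changes nothing for the old install, while a new application name cuts it off from new work until the user replaces it.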



ID: 155702 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 155710 - Posted: 23 Aug 2005, 16:15:30 UTC - in response to Message 155693.  

I realize that -- and hope it operates that way, but the sort of sustained fall off I'm now seeing (which is actually more severe the first day after the 3 hour deletion outage) suggests that something else is also going on. I hope I'm wrong here and that my daily credits will recover to the >1500 daily mark at some point.
I know part of the fall off (from >2K) will be a function of my revising down my SETI resource shares on my local farm and do expect to see that. Should SETI BOINC achieve stable operations across the board for some continuous run, I'll likely bump up resource shares in the future -- but even at the current shares, I'm expecting well over 1K a day from my various systems since some of them are still SETI only.



Past due = RETURNED after the deadline. Not VALIDATED after the deadline. It makes no difference if the work is validated the day it's returned or six months later, as long as it was returned by you before the deadline, and is correct, you'll get credit...


ID: 155710 · Report as offensive
Divide Overflow
Volunteer tester
Joined: 3 Apr 99
Posts: 365
Credit: 131,684
RAC: 0
United States
Message 155796 - Posted: 23 Aug 2005, 19:19:51 UTC - in response to Message 155702.  

It should come with a different application name, so anyone using the current optimized SETI@Home application will either get only "normal" WUs or, more likely, the project will stop splitting "normal" WUs, so after a very short time they will get a message like "no work for your platform" until they remove the old optimized version.


There's an interesting idea! Version control is a big weakness in the current anonymous platform application system.


ID: 155796 · Report as offensive
Profile Paul D. Buck
Volunteer tester

Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 156308 - Posted: 24 Aug 2005, 14:43:49 UTC

Or, UCB could begin to release the optimized versions too. I think that we are down to a stable of 4-6 different versions. I don't know how long it takes to do a compile, but, that is not that big of a set ... if they were real ambitious they could make it part of the current make processes.

Then all they would need is a two step install process where the first step would figure out which download was needed and then go and get the correct version.

If that was too hard, well, they could still let the volunteer compiler corps make the extra versions, but have them deliver the versions back to UCB. I mean, originally, the question for the optimizers was "what flags do we use and for which architecture?" That has an answer now.

Just more thoughts ...
ID: 156308 · Report as offensive
[ue] Toni_V

Joined: 6 Apr 03
Posts: 52
Credit: 141,788
RAC: 0
Finland
Message 156330 - Posted: 24 Aug 2005, 15:36:26 UTC

A worst case solution would be to stop Seti-Boinc until all files are purged. That means the old orphan files are deleted and every returned result is validated. This would take maybe a few days. After the total purge, the validators would be happy to crunch the loads of files that are waiting on users' HDDs.

Actually this might be a good thing too - once orphan deletion and validation are completely done, there should not be any returned files left. If there are, then there's some other bug too.
ID: 156330 · Report as offensive
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 156333 - Posted: 24 Aug 2005, 15:45:09 UTC - in response to Message 156330.  

A worst case solution would be to stop Seti-Boinc until all files are purged.

By Jove! I do hope they take the courtesy of closing down the Q&A forums then, for we're already getting a flood of people in who don't read the front page for news, or any other thread in there about not being able to upload/download/get a scheduler reply!



ID: 156333 · Report as offensive
TPR_Mojo
Volunteer tester

Joined: 18 Apr 00
Posts: 323
Credit: 7,001,052
RAC: 0
United Kingdom
Message 156345 - Posted: 24 Aug 2005, 16:21:40 UTC - in response to Message 156330.  

A worst case solution would be to stop Seti-Boinc until all files are purged. That means the old orphan files are deleted and every returned result is validated. This would take maybe a few days. After the total purge, the validators would be happy to crunch the loads of files that are waiting on users' HDDs.

Actually this might be a good thing too - once orphan deletion and validation are completely done, there should not be any returned files left. If there are, then there's some other bug too.


Maybe it is time to stop tickling this problem. I applaud the team for leaving things down over the last day or so to deal with the newly discovered RAID problem - sometimes the best solution won't be the most popular.

How about:

Close all up- and downloads
Close the website and message boards except for a one page message on the project URL saying "The project is currently down for maintenance"
Delete all orphans
Validate all results
Delete any new orphans ;)
Tidy up the user and host tables - remove any zeros, goneaways or other orphaned records
Backup
Bring all services back up

Draconian, yes, but nobody should lose any valid work, neither us nor the science team, and the filesystems and databases should shrink nicely.



ID: 156345 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 156351 - Posted: 24 Aug 2005, 16:43:30 UTC - in response to Message 156330.  


Actually this might be a good thing too - once orphan deletion and validation are completely done, there should not be any returned files left. If there are, then there's some other bug too.


You've forgotten all the WUs with 1 or 2 results, meaning not enough results to validate yet. ;) There will also be some with 3 or more "success" results that remain because there is "no consensus yet".
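
The two cases can be sketched as follows. This is a toy model under stated assumptions: a quorum of 3 (implied by "1 or 2 results" not being enough), a strict-majority agreement rule, and results reduced to simple comparable values. A real validator compares result files with some tolerance, and all names here are hypothetical.

```python
from collections import Counter

MIN_QUORUM = 3  # assumption: fewer than 3 "success" results cannot be validated


def validation_state(success_results, min_quorum=MIN_QUORUM):
    """Classify a work unit by the "success" results returned so far."""
    if len(success_results) < min_quorum:
        return "waiting for more results"
    # Assumed rule: more than half of the results must agree.
    _, top_count = Counter(success_results).most_common(1)[0]
    if top_count * 2 > len(success_results):
        return "validated"
    return "no consensus yet"


print(validation_state(["a", "a"]))       # too few results
print(validation_state(["a", "a", "b"]))  # two of three agree
print(validation_state(["a", "b", "c"]))  # three results, none agree
```

Both the first and the last work unit would still have files on disk after a complete orphan purge and validation run, without any bug being involved.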
ID: 156351 · Report as offensive
[ue] Toni_V

Joined: 6 Apr 03
Posts: 52
Credit: 141,788
RAC: 0
Finland
Message 156376 - Posted: 24 Aug 2005, 19:42:57 UTC - in response to Message 156351.  

You've forgotten all the WUs with 1 or 2 results, meaning not enough results to validate yet. ;) There will also be some with 3 or more "success" results that remain because there is "no consensus yet".


Yeah, after I posted I started to think about that too.

Of course, the Seti people could just calculate the number of files there should be and then see how close it is to the current file count. For example, if there are 1 million files waiting for validation and 0.5 million ready to send (I'm guessing there are 3.5 results per packet), that would give 3.5*1+0.5 = 4 million files.
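
The same estimate as a small sketch. It assumes, as the arithmetic implies, that the "1 million" counts packets (work units) waiting for validation, each holding about 3.5 result files; the function and parameter names are hypothetical and all figures are the guesses from this post.

```python
def expected_file_count(waiting_millions, results_per_packet, ready_to_send_millions):
    """Estimated number of files on disk, in millions."""
    return results_per_packet * waiting_millions + ready_to_send_millions


def unexplained_files(actual_millions, expected_millions):
    """Files (in millions) that the estimate does not account for."""
    return actual_millions - expected_millions


expected = expected_file_count(
    waiting_millions=1.0,        # guessed figure from the post
    results_per_packet=3.5,      # guessed figure from the post
    ready_to_send_millions=0.5,  # guessed figure from the post
)
print(expected)
```

A large positive difference between the real file count and this estimate would point to leftover orphans or some other bug.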
ID: 156376 · Report as offensive
Bill & Patsy
Joined: 6 Apr 01
Posts: 141
Credit: 508,875
RAC: 0
United States
Message 156425 - Posted: 24 Aug 2005, 21:44:50 UTC - in response to Message 155254.  

I would be interested to find out just what percentage of the file system is orphans. With continued growth in the participant base (new users, bigger farms, etc.), couldn't the project eventually have this many legitimate work unit files? In practice, this is throttled by the splitters, but it is something to think about from a design standpoint.


We estimate about 40% of all the files in the upload directories are these "antique" files. We are also about to release a client that does significantly more science, and therefore takes much longer - and has the wonderful side effect of reducing all our I/O bandwidth by as much as 75% (or more?).

- Matt


Which didn't really answer Octagon's question. Increasing the computational time by a factor of four just postpones the problem until the user base increases again by a factor of four. The question was: "With continued growth in the participant base ..., couldn't the project eventually have this many legitimate work unit files?" It's still a very good, very relevant question.

--Bill

ID: 156425 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 156561 - Posted: 25 Aug 2005, 3:23:52 UTC - in response to Message 156425.  

Which didn't really answer Octagon's question. Increasing the computational time by a factor of four just postpones the problem until the user base increases again by a factor of four. The question was: "With continued growth in the participant base ..., couldn't the project eventually have this many legitimate work unit files?" It's still a very good, very relevant question.


The computation times for the beta trial, which will probably become the new SETI client, are about 10 times the current ones, using optimised clients on both. My times are 12h:30m for beta and 1h:20m for normal SETI.
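
For reference, the ratio for the two times quoted above can be checked with a short sketch (the function name is hypothetical, and the figures are from one host only):

```python
from datetime import timedelta


def runtime_ratio(new_runtime: timedelta, old_runtime: timedelta) -> float:
    """How many times longer the new client runs per work unit."""
    return new_runtime / old_runtime


beta_time = timedelta(hours=12, minutes=30)
normal_time = timedelta(hours=1, minutes=20)

ratio = runtime_ratio(beta_time, normal_time)
print(f"beta takes about {ratio:.1f} times as long")
```

On these numbers the factor is a little over nine, so "about 10 times" rather than the factor of four discussed earlier in the thread.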

Andy
ID: 156561 · Report as offensive
Profile Mike Allen
Joined: 11 Aug 01
Posts: 21
Credit: 2,064,925
RAC: 0
United States
Message 156567 - Posted: 25 Aug 2005, 3:50:24 UTC

I have a really dumb question: What exactly is a "credit" and why do I want it?

ID: 156567 · Report as offensive
N/A
Volunteer tester

Joined: 18 May 01
Posts: 3718
Credit: 93,649
RAC: 0
Message 156586 - Posted: 25 Aug 2005, 4:50:05 UTC - in response to Message 156567.  

What exactly is a "credit"...

A measurement of processing as compared to a benchmark theoretical computer.
...and why do I want it?

Bragging rights as a measure of accomplishment.

Unless you're in it for the science, in which case it doesn't matter one iota as long as it goes up over time.

ID: 156586 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 156592 - Posted: 25 Aug 2005, 5:00:05 UTC - in response to Message 156586.  

Lad, it is not an either/or.


Bragging rights as a measure of accomplishment.

Unless you're in it for the science, in which case it doesn't matter one iota as long as it goes up over time.


ID: 156592 · Report as offensive
EclipseHA

Joined: 28 Jul 99
Posts: 1018
Credit: 530,719
RAC: 0
United States
Message 156593 - Posted: 25 Aug 2005, 5:01:19 UTC - in response to Message 156586.  

What exactly is a "credit"...

A measurement of processing as compared to a benchmark theoretical computer.
...and why do I want it?

Bragging rights as a measure of accomplishment.

Unless you're in it for the science, in which case it doesn't matter one iota as long as it goes up over time.



Even if you don't care what your current credits are, it's important to watch them increase even if you are only interested in the science. If they don't increase, it's a sign that your system's results are being tossed out (or ignored) for some reason. Without a steady increase in credits, it could mean that you are just heating the room and adding nothing to science. Credits are important for those who want the credits, and for those who want to provide science.
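
One way to act on this is a simple check on a host's total credit over time. The sketch below is only an illustration: the function name, the three-day window, and the sample totals are all made up. Note too that a validator backlog like the one discussed in this thread can delay credit without anything being wrong on the host.

```python
def credit_stalled(daily_totals, window=3):
    """Return True if total credit has not risen over the last `window` days.

    daily_totals -- total credit recorded once per day, oldest first
    """
    if len(daily_totals) <= window:
        return False  # not enough history to judge
    return daily_totals[-1] <= daily_totals[-1 - window]


# Made-up sample data for two hosts.
healthy_host = [1000, 1040, 1085, 1120, 1160]
suspect_host = [1000, 1040, 1040, 1040, 1040]

print(credit_stalled(healthy_host))
print(credit_stalled(suspect_host))
```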
ID: 156593 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.