Off-line cache facility (like "SETI Queue")

UBT - Timbo
Volunteer tester
Joined: 3 Apr 99
Posts: 157
Credit: 10,720,947
RAC: 362
United Kingdom
Message 849 - Posted: 23 Jun 2004, 22:17:54 UTC

I would love BOINC to adopt an off-line, "SETI Queue"-like facility, so that WUs can be cached and fed to a set of off-line (but inter-connected) PCs acting as a "farm".

Then I can carry on crunching using the CPU power of a few older (but still quite healthy) PCs.

I'm sure others would like this as we don't all have broadband connections.

Otherwise, that power will be wasted.

T.
ID: 849 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 941 - Posted: 24 Jun 2004, 1:51:44 UTC

The offline cache program will have to download work for a specific machine. When returned, that specific machine will have to have done the work, or it will not be accepted. If there is enough need, I am certain that someone will write it.
jm7
ID: 941 · Report as offensive
Obelix

Joined: 13 Jun 01
Posts: 1
Credit: 596,200
RAC: 0
South Africa
Message 1310 - Posted: 24 Jun 2004, 19:26:23 UTC - in response to Message 941.  

> The offline cache program will have to download work for a specific machine.
> When returned, that specific machine will have to have done the work, or it
> will not be accepted. If there is enough need, I am certain that someone will
> write it.
> jm7

With seti1 I wrote scripts that ran on my firewall and sent and received workunits (without the firewall doing any work itself), then distributed WUs to the various PCs as they needed them. When the firewall goes off-line (it's on dial-up), it feeds the PCs out of a queue until the queue is empty. When it connects to the net (which could be days apart), it sends results and fetches new ones.

BOINC works differently and I'm still trying to sort out what happens and when. For instance, what happens when it tries to send results and the system is offline? Does it wait? (How long between attempts?)

What if it were to look for a trigger file? When the file is present, it would not try to send or receive (meaning the system is off-line); when the file is missing, the system is on-line and sending is allowed. That way I can create the file when the system goes off-line and delete it when the system comes back on-line. Seeing as BOINC does queuing anyway, each PC can then queue its own set of workunits.
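A minimal sketch of the trigger-file idea, taking the reading in which the file being present means "off-line, do not talk to the server". The file name `offline.flag` and both function names are made up for illustration; nothing in this thread says BOINC itself looks for such a file.

```python
from pathlib import Path

# Hypothetical marker file: present = off-line, absent = on-line.
OFFLINE_MARKER = Path("offline.flag")


def network_allowed(marker: Path = OFFLINE_MARKER) -> bool:
    """Return True when the marker is absent, i.e. sending is allowed."""
    return not marker.exists()


def set_offline(offline: bool, marker: Path = OFFLINE_MARKER) -> None:
    """Create the marker when the link drops, remove it when the link is up."""
    if offline:
        marker.touch()
    else:
        # missing_ok avoids an error if the marker was already removed.
        marker.unlink(missing_ok=True)
```

The dial-up script on the firewall would call `set_offline(True)` when it hangs up and `set_offline(False)` after it connects; each client would check `network_allowed()` before trying to send or fetch.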

Just a thought
ID: 1310 · Report as offensive
Heffed
Volunteer tester

Joined: 19 Mar 02
Posts: 1856
Credit: 40,736
RAC: 0
United States
Message 1321 - Posted: 24 Jun 2004, 19:39:01 UTC - in response to Message 1310.  

> Boinc works differently and i'm still trying to sort out what happens and
> when. For instance, what happens when it tries to send results and the system
> is offline ? Does it wait ? ( how long between attempts ? ).

It waits. You can test it yourself by disabling network access under the file menu.
ID: 1321 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 1342 - Posted: 24 Jun 2004, 19:59:03 UTC - in response to Message 1310.  

> Boinc works differently and i'm still trying to sort out what happens and
> when. For instance, what happens when it tries to send results and the system
> is offline ? Does it wait ? ( how long between attempts ? ).

The retry is a random exponential backoff. There are several possibilities for the initial wait. If there is no contact at all, the minimum wait is 1 minute. If there is some contact but the scheduler is offline, then the minimum wait has to be longer than the minimum retry time at the server. Unfortunately, I am not certain whether the minimum retry time at the server is 10 or 20 minutes. There are a couple of different servers that would be contacted. There is a setting somewhere on the client that can be changed to disallow network access. I have no idea how to set this programmatically.
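For anyone unfamiliar with the term, here is a rough sketch of what a random exponential backoff looks like in general. It is not the BOINC client's actual formula: the 1-minute floor comes from the post above, while the 4-hour cap and the function name are assumptions made for the example.

```python
import random


def backoff_delay(attempt: int, min_wait: float = 60.0,
                  max_wait: float = 4 * 3600.0) -> float:
    """Seconds to wait before retry number `attempt` (0 = first retry).

    The upper bound doubles with every failed attempt until it reaches
    max_wait (an assumed cap); the actual delay is drawn at random
    between min_wait and that bound so hosts do not retry in lockstep.
    """
    # Clamp the exponent so very large attempt counts cannot overflow.
    ceiling = min(max_wait, min_wait * 2 ** min(attempt, 20))
    return random.uniform(min_wait, ceiling)
```

The randomness matters as much as the doubling: if every host that lost contact at the same moment retried on the same schedule, they would all hit the server again at the same moment.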
jm7
ID: 1342 · Report as offensive
ammo

Joined: 9 Dec 03
Posts: 2
Credit: 4,229
RAC: 0
United States
Message 1574 - Posted: 25 Jun 2004, 1:09:51 UTC

I would like to add my vote for a flexible caching system. I was using Lin-seti to run the old system (debian linux). I set my cache at 10 WUs and could merrily process them in the background while doing other stuff. Every few days I could replenish the cache while online (dialup). I was rarely idle due to lack of work and usually didn't have WUs more than 4 days old.

Does the new system allow replenishing work anytime or will I have to wait until I'm empty? Right now I'm idle and waiting for some time limit to run out. Is BOINC adding a time penalty because I'm offline most of the time?

Thanks...John
ID: 1574 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 1615 - Posted: 25 Jun 2004, 1:57:22 UTC - in response to Message 1574.  

> I would like to add my vote for a flexible caching system. I was using
> Lin-seti to run the old system (debian linux). I set my cache at 10 WUs and
> could merrily process them in the background while doing other stuff. Every
> few days I could replenish the cache while online (dialup). I was rarely idle
> due to lack of work and usually didn't have WUs more than 4 days old.
>
> Does the new system allow replenishing work anytime or will I have to wait
> until I'm empty? Right now I'm idle and waiting for some time limit to run
> out. Is BOINC adding a time penalty because I'm offline most of the time?
>
> Thanks...John
>
Yes. Set your cache size to, say, 5 to 5 days. Since you are on dial-up, you will want to go to the File menu (assuming Windows) and disable network access. When you want to connect again, turn network access back on, and everything can flow nicely. BOINC slightly overfills the queue: it requests enough work to cover every second requested, and the WU that crosses that limit is downloaded as well. For example, I have my queue length set to .1 to .2 days, and I get 1 WU at a time. Since I have always-on connections and have signed up for several projects (Alpha (by invite only), Beta, S@H, and Predictor), I figure at least one of them will have work any time one of my machines asks.
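To illustrate the overfill behaviour described above, here is a small sketch. It only models the description in this post, not the client's real work-fetch code; the function name and the 6-hour estimate per WU are assumptions.

```python
def wus_to_fetch(queued_seconds: float, target_days: float,
                 wu_seconds: float) -> int:
    """Count the WUs needed so queued work covers the target.

    Work is added one whole WU at a time, so the last WU usually
    pushes the queue past the target (the slight overfill).
    """
    if wu_seconds <= 0:
        raise ValueError("wu_seconds must be positive")
    target = target_days * 86400  # days -> seconds
    count = 0
    while queued_seconds < target:
        queued_seconds += wu_seconds
        count += 1
    return count
```

With an empty queue, a target of .2 days (17,280 seconds) and an assumed 6-hour WU (21,600 seconds), `wus_to_fetch(0, 0.2, 21600)` gives 1: a single WU already covers the target, which fits getting 1 WU at a time.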
jm7
ID: 1615 · Report as offensive
wedgenix

Joined: 3 Apr 99
Posts: 19
Credit: 0
RAC: 0
Message 1992 - Posted: 25 Jun 2004, 17:31:22 UTC

SetiQueue is r0x0rz! I run SETI classic on machines scattered all over town and they all report to my one machine running SetiQueue. It keeps nice stats, and I can hit the page from anywhere on the web and see if any machines are falling behind. I've actually found problems with some of my clients' machines this way and showed up before they even called to tell me their computer went down or they lost their internet connection. I once went in to work early because nothing in our building had hit the queue overnight. It turned out a router had failed, and I was able to replace it before anyone even showed up.

The caching system in BOINC is all right for individual computers, I suppose, but you either need to overload BOINC's servers by requesting gobs of work units to start (happening now... haven't received WU #1 yet in over 24 hrs), or risk not having enough to weather the inevitable outages. Take last weekend in classic, for example. My 100 or so units per day kept crunching and not a single CPU cycle was lost. That's actually more server-friendly to SETI when it comes back on line, too: I have only SetiQueue trying to replenish the queue and send results, one at a time, instead of 50 machines all trying at the same time. The BOINC caching method actually increases their load substantially, and they obviously cannot handle it at this time. A BOINC queue would reduce the load from users like me 50-fold each.

I'm sure there are people doing far more than my measly 100/day as well, and they likely ran through one queue. I know of at least one user running over a thousand seti1 clients through his SetiQueue, and there are many teams where the entire team runs through the team queue. Isn't that a FAR better way to do things than to have the BOINC server insist on hearing from each machine individually?

I'm confused as to why it was shocking that the server load was "more than double what was experienced with seti classic". It seems to me the expected load should be upwards of 100 times what it used to be on seti1, where the traffic of thousands upon thousands of clients could be taken in a nice, neat, slow-paced, orderly fashion from the big SetiQueues out there. Now all those thousands have to connect individually, and all try more or less at once. Double the load is a problem? Sheesh... just wait till some of the "big boys" convert. We definitely need a BOINC-queue type of application.
ID: 1992 · Report as offensive
[BOINCstats] Willy
Volunteer tester

Joined: 4 Mar 01
Posts: 202
Credit: 152,243
RAC: 0
Netherlands
Message 3682 - Posted: 3 Jul 2004, 13:32:05 UTC
Last modified: 3 Jul 2004, 13:32:21 UTC

Because BOINC insists on having all hosts connect to the server and every host has its own host-ID, I'm unable to continue to produce as much work as I did with SETI-1.

At my work we build systems, and we run SETI on them as a burn-in test. It's rare for a system to complete a full WU. Another system takes over, so several systems can complete a single WU. It's then transferred to a SETIQueue.

With BOINC, I can run a single WU on more systems (as far as I can tell right now; I haven't got that many WUs to test with), but I'm not sure what happens when I flush. The WU may be rejected because it was not processed on the system that downloaded it.

Or, if I make the BOINC server 'think' it was processed by the flush system, I will probably be banned, because one system can't process 100~250 WUs a day....

So, now what?
Twisted Hardware - Can you handle it?
ID: 3682 · Report as offensive
Heffed
Volunteer tester

Joined: 19 Mar 02
Posts: 1856
Credit: 40,736
RAC: 0
United States
Message 3769 - Posted: 3 Jul 2004, 20:26:17 UTC - in response to Message 3682.  

> With BOINC, I can run a single WU on more systems (as far as can tell right
> now, haven't got that many WU's to test with), but I'm not sure what happens
> when I flush. The WU may be rejected beacause it's not processed on the system
> that downloaded it.
>
> Or, if I make the BOINC server 'think' it is processed by the flush-system, I
> probably be banned, because one system can't process 100~250 WU's a day....

Yikes! Quit it!

Spreading a WU across multiple systems pretty much guarantees the scientific value is null and void. :(
ID: 3769 · Report as offensive
[BOINCstats] Willy
Volunteer tester

Joined: 4 Mar 01
Posts: 202
Credit: 152,243
RAC: 0
Netherlands
Message 3941 - Posted: 4 Jul 2004, 10:02:40 UTC - in response to Message 3769.  

> > With BOINC, I can run a single WU on more systems (as far as can tell right
> > now, haven't got that many WU's to test with), but I'm not sure what happens
> > when I flush. The WU may be rejected beacause it's not processed on the system
> > that downloaded it.
> >
> > Or, if I make the BOINC server 'think' it is processed by the flush-system, I
> > probably be banned, because one system can't process 100~250 WU's a day....
>
> Yikes! Quit it!
>
> Spreading a WU across multiple systems pretty much guarantees the scientific
> value is null and void. :(
>

I DON'T run the WU simultaneously on those systems! Example: system1 runs 1-34%, after that system2 runs 35-70%, and then system3 runs 71-100%. I don't see anything wrong with that. It's like shutting down and completing a WU after a reboot on the same system.
Twisted Hardware - Can you handle it?
ID: 3941 · Report as offensive
Heffed
Volunteer tester

Joined: 19 Mar 02
Posts: 1856
Credit: 40,736
RAC: 0
United States
Message 3951 - Posted: 4 Jul 2004, 10:21:49 UTC - in response to Message 3941.  

> I DON'T run the WU simultaniously on those systems! Example: system1 runs
> 1-34%, after that system2 35-70% and then system3 71-100%. I don't see
> anything wrong with that. It's like shutting down, and completing a WU after
> reboot on the same system.

I didn't say simultaneously. And no, it's not like shutting down and rebooting. A different system could get very different results. Your science will be buggered.
ID: 3951 · Report as offensive
[BOINCstats] Willy
Volunteer tester

Joined: 4 Mar 01
Posts: 202
Credit: 152,243
RAC: 0
Netherlands
Message 3981 - Posted: 4 Jul 2004, 13:09:32 UTC - in response to Message 3951.  

> A different system could get very different results. Your science
> will be buggered.


If that is the case, then 2 different systems will produce a different result for the same workunit, and thus all science will be buggered.

But maybe one of the executives on this project could settle this argument?
Twisted Hardware - Can you handle it?
ID: 3981 · Report as offensive
Thierry Van Driessche
Volunteer tester
Joined: 20 Aug 02
Posts: 3083
Credit: 150,096
RAC: 0
Belgium
Message 4020 - Posted: 4 Jul 2004, 14:49:06 UTC - in response to Message 3951.  
Last modified: 20 Jul 2004, 12:56:16 UTC

> > I DON'T run the WU simultaniously on those systems! Example: system1 runs
> > 1-34%, after that system2 35-70% and then system3 71-100%. I don't see
> > anything wrong with that. It's like shutting down, and completing a WU after
> > reboot on the same system.

The fact is, a WU is sent to 3 different PCs. When the results are returned, the server expects to receive each one from the PC it was sent to. Remember that each PC configuration is registered under a PC ID on the server at Berkeley. The PC IDs the WUs are returned from have to match the PC IDs they were sent to.
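As a rough illustration of the matching rule described above (a sketch of the rule as stated in this post, not the actual server code; the function name, the WU names and the host IDs are invented):

```python
def result_accepted(sent_to: dict, wu_name: str, reporting_host_id: int) -> bool:
    """Accept a result only if it comes back from the host it was sent to.

    sent_to maps a WU name to the ID of the host that downloaded it.
    An unknown WU name is rejected as well.
    """
    return sent_to.get(wu_name) == reporting_host_id


# Hypothetical bookkeeping: wu_a went to host 101, wu_b to host 102.
sent = {"wu_a": 101, "wu_b": 102}
```

Under this rule, `result_accepted(sent, "wu_a", 101)` is true, while the same WU flushed from host 102 would be refused.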


Greetings from Belgium.
ID: 4020 · Report as offensive
Thierry Van Driessche
Volunteer tester
Joined: 20 Aug 02
Posts: 3083
Credit: 150,096
RAC: 0
Belgium
Message 4025 - Posted: 4 Jul 2004, 14:57:07 UTC - in response to Message 3981.  

> If that is the case, then 2 different systems will produce a different result
> for the same workunit, and thus all science will be buggered.

No. The result will be the same. The only difference between the 2 systems will be the claimed credit, if the CPU time and/or benchmark are different.
ID: 4025 · Report as offensive
[BOINCstats] Willy
Volunteer tester

Joined: 4 Mar 01
Posts: 202
Credit: 152,243
RAC: 0
Netherlands
Message 4343 - Posted: 5 Jul 2004, 9:41:39 UTC - in response to Message 4020.  
Last modified: 5 Jul 2004, 16:05:29 UTC

> > > I DON'T run the WU simultaniously on those systems! Example: system1 runs
> > > 1-34%, after that system2 35-70% and then system3 71-100%. I don't see
> > > anything wrong with that. It's like shutting down, and completing a WU after
> > > reboot on the same system.
>
> The fact is, a WU is send to 3 different PC's. Once the results are turned
> back to the server, the server expects to receive the WU's back from the one
> were it has been send to. Remember each PC configuration is registered as a PC
> id on the server at Berkeley. The PC id's from where the WU's has been
> returned have to match the PC id's were they have been send to.
>
>
> Greetings from Belgium.
>
>

The WU's will be flushed from one system.

I'm not trying to mess with the science or with the WU's, I'm trying to find a way to crunch more WU's!
http://home.deds.nl/~th = Twisted Hardware - Can you handle it?
ID: 4343 · Report as offensive
Yavanius
Volunteer tester
Joined: 8 Jul 99
Posts: 50
Credit: 249,309
RAC: 0
Antarctica
Message 6399 - Posted: 11 Jul 2004, 7:11:50 UTC - in response to Message 3981.  

> If that is the case, then 2 different systems will produce a different result
> for the same workunit, and thus all science will be buggered.
>
> But maybe one of the executives on this project could settle this argument?
> Twisted Hardware - Can you handle it?


Since this is SETI, the results should not be different, ASSUMING the hardware is 100% properly functional at all times.

The only way to truly tell if you can run WUs acquired on a connected machine on an unconnected machine is to try it. However, some issues appear to me.

1. Can you simply copy and paste BOINC, or does it need to be installed?
2. If it needs to be installed, it needs to connect to the server to download the proper files including SETI@home.
3. Unless you are a programmer, it's going to be hell (literally) to copy the results and files and merge/sync them with the client on the connected system.


This is not SETI@home classic, where you simply paste some files and the client returns/works on the results/WU and everything is hunky-dory. Those XML files you see in the BOINC directory contain information about the status of the client, communications, and work. They are not something you want to go idly messing with. Screwing up those files can mean lost work if you have to reset the project to fix them.

I don't foresee a solution to this from Berkeley in the immediate future. For now you will have to stick with classic or run another project (such as Distributed Folding) that can run off-line. Remember that BOINC was set up to prevent the cheating and hacking we saw in classic. Yeah, it sucks we can't run it on unconnected machines, but the trade-off is small for the security we gain.

At any rate, invest in some cheap NICs and a cheap hub and connect those PCs up. =) If you can afford the electric bill to run them... ;)

~David
ID: 6399 · Report as offensive
[BOINCstats] Willy
Volunteer tester

Joined: 4 Mar 01
Posts: 202
Credit: 152,243
RAC: 0
Netherlands
Message 6474 - Posted: 11 Jul 2004, 11:52:46 UTC - in response to Message 6399.  
Last modified: 11 Jul 2004, 11:53:51 UTC

> Since this is SETI, the results should not be different, ASSUMING the hardware
> is 100% properly functional at all times.
>
> The only way to truly tell if you can run WUs acquired on a connected machine
> on an unconnected machine is to try it. However, some issues appear to me.
>
> 1. Can you simply copy and paste BOINC, or does it need to be installed?
> 2. If it needs to be installed, it needs to connect to the server to download
> the proper files including SETI@home.
> 3. Unless you are a programmer, its going to be hell (literally) to copy the
> results and files and merge/synch them with the client on the connected
> system.
>
>
> This is not SETI@home classic where you simply paste some files and the client
> returns/works on the results/WU and everything is hunky-dorey. Those XML files
> you see in the BOINC directory contain information about status of the client,
> communications, and work. It is not something you want to go idly messing
> with. Screwing up those files can mean lost work if you have to reset the
> project to fix them.
>
> I don't foresee a solution to this from Berkeley in the immediate future. For
> now you will have to stick with classic or run another project (such as
> Distributed Folding) that can run off-line. Remember that BOINC was setup to
> prevent the cheating and hacking we saw in classic. Yeah it sucks we can't run
> it on unconnected machines, but the trade-off is small for the security we
> gain.
>
> At any rate invest in some cheap NICs and and a cheap hub and connect those
> PCs up. =) If you can afford the electric bill to run them... ;)
>
> ~David
>
>
All PCs are connected to a network. They are preinstalled systems, ready to be quality checked, and after that they are sent to a customer.
I don't want to install BOINC, as I don't know whether the customer wants BOINC or not. We use BOINC (or in the past SETI classic) mainly for burn-in testing.

So, what I do is this:
On the preinstall server I have a directory BOINC, and in this directory I have subdirs called BOINC01~BOINC50.
In every subdir is a complete BOINC installation.
I make a network connection to the server, so that we have: R:/BOINC/BOINCxx.
I then run the CLI from that directory. To make sure that only one system runs in that dir, I lock a file in that dir, and the lock is checked before starting the CLI. If the dir is occupied, the program skips to the next subdir.
Now, if a system is removed, the subdir lock is gone, and another system can use this dir and complete the WU.
Important note: BOINC files are not moved around, they stay in the same location!
Important note 2: If a system crashes while running BOINC, we reinstall BOINC into the used subdir to prevent corrupt WUs.
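The "find a free subdir" step could be sketched roughly like this. It is only an approximation of the setup described above: it uses an exclusive-create marker file (the name `in_use.lock` and both function names are invented here) instead of a held file lock, so unlike a real lock on the share it is NOT released automatically when a machine is unplugged and would need a manual cleanup.

```python
import os
from pathlib import Path


def claim_slot(root: Path, slots: int = 50):
    """Return the first free BOINCxx subdir under root, or None if all are taken."""
    for n in range(1, slots + 1):
        slot = root / f"BOINC{n:02d}"
        if not slot.is_dir():
            continue
        try:
            # O_EXCL makes creation fail if the marker already exists,
            # so two machines cannot both claim the same subdir.
            fd = os.open(slot / "in_use.lock",
                         os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # occupied, skip to the next subdir
        os.close(fd)
        return slot
    return None


def release_slot(slot: Path) -> None:
    """Remove the marker so another system can take over this subdir."""
    (slot / "in_use.lock").unlink(missing_ok=True)
```

A burn-in machine would call `claim_slot(Path("R:/BOINC"))`, start the CLI from the returned directory, and call `release_slot()` on a clean shutdown.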

That was the easy part, now the flushing:
If I enable flushing from the 'guest' system, the BOINC server will detect a different system (I think) and reject the WUs, as they were not sent to this host. Therefore I have to flush from a single system (the preinstall server). I don't know yet if this works (I had a week off last week), or whether the BOINC server accepts it.
As one host can only download 50 WUs a day, I might have to spread the subdir flush over more systems (like the server and my workstation).
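A quick back-of-the-envelope check of how many flushing hosts that would take, assuming the 50 WUs per host per day limit mentioned above and the 100~250 WUs a day from my earlier post (the function name is just for illustration):

```python
def hosts_needed(wus_per_day: int, per_host_quota: int = 50) -> int:
    """Smallest number of hosts whose combined daily quota covers the load."""
    if per_host_quota <= 0:
        raise ValueError("per_host_quota must be positive")
    # Integer ceiling division, avoids floating-point rounding.
    return -(-wus_per_day // per_host_quota)
```

So 100 WUs a day needs 2 hosts and 250 WUs a day needs 5, which is more than just the server and my workstation.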

Willy.

=================
www.boincstats.tk
ID: 6474 · Report as offensive
Team Jolt Cola
Volunteer tester
Joined: 21 Aug 99
Posts: 31
Credit: 2,015,228
RAC: 0
United Kingdom
Message 9346 - Posted: 18 Jul 2004, 12:10:25 UTC - in response to Message 941.  

> The offline cache program will have to download work for a specific machine.
> When returned, that specific machine will have to have done the work, or it
> will not be accepted. If there is enough need, I am certain that someone will
> write it.
> jm7
>

There is a need. Some of OcUK's biggest users will NOT be running BOINC/SETI/anything when SETI classic finishes unless there is some caching tool that also supports time restrictions. These are people who do 1000+ WUs/day on SETI classic.

This is not just because of the security issues of downloading some unknown code from somewhere (which I don't like myself, but I think there is a way round this), but because they are not allowed to connect or use internet bandwidth during the working day. SETIqueue allows connections to be restricted to specific hours.

The use of proxies can remove the requirement for individual systems to connect directly, but it's still a problem.

I think once all these XML files are understood, a BOINCqueue will come along where the host ID is assigned by the queue machine, which hands out pseudo host IDs to the clients, then mucks around with the results returned to it and returns them under its own host ID. This may be what is happening with Mr.O having a lowly PIV with a higher RAC than the quad Opteron of Richard Smith at the top of the "Top Computers" list.

M.
ID: 9346 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 9592 - Posted: 19 Jul 2004, 4:27:18 UTC - in response to Message 9346.  

> > The offline cache program will have to download work for a specific machine.
> > When returned, that specific machine will have to have done the work, or it
> > will not be accepted. If there is enough need, I am certain that someone will
> > write it.
> > jm7
> >
>
> There is a need - some of OcUK's biggest users will NOT be running
> BOINC/SETI/anything when SETIclassic finishes unless there is some caching
> tool- which also has time restrictions. These are people who do 1000+ WU's/day
> on SETIclassic.
>
> This is not just because of the security issues of downloading some unknown
> code from somewhere (which I don't like myself, but I think there is a way
> round this) - but because they are not allowed to connect or use the internet
> bandwidth through the working day. SETIqueue allows restrictions of time for
> connections to between specific hours.
>
> The use of proxies can remove requirement for individual systems to connect
> directly - but its still a problem.
>
> I think once all these XML files are understood, then a BOINCqueue will come
> along where the HOST ID is assigned by the queue machin and it assigns pseudo
> host-id's to the clients and then mucks around with the results returned to it
> and returns the results under its own host id - This may be what is happening
> with Mr.O having a lowly PIV with a higher RAC than the quad opeteron of
> Richard Smith at the top of the "Top Computers" list.
>
> M.
>
Remember that there is a limit on the number of WUs that can be downloaded to a particular host in a day. This is currently 50 in S@H.

ID: 9592 · Report as offensive


 