Why not in order of date?

Steven Gaber

Joined: 19 Jan 13
Posts: 111
Credit: 2,834,186
RAC: 11
United States
Message 2009894 - Posted: 29 Aug 2019, 21:22:08 UTC

Right now, my computer has 155 S@H tasks ready to start, with due dates of 9 October, 14 October, and 21 October.

The computer is working on three tasks due on 21 October.

One would think that BOINC would have been programmed to work on the tasks with earlier due dates before the ones with later dates. Why is this not the case?

Cheers,
Steven Gaber
Oldsmar, FL
Wiggo
Joined: 24 Jan 00
Posts: 36654
Credit: 261,360,520
RAC: 489
Australia
Message 2009897 - Posted: 29 Aug 2019, 21:28:53 UTC

BOINC works in "First In, First Out" (FIFO) order unless deadlines become a problem. ;-)
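For the curious, here's a minimal sketch in Python (my own illustration, not BOINC's actual code) of that policy: run tasks in arrival order, but let a task jump the queue if it would otherwise miss its deadline.

```python
import time
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    received: float    # when the task arrived (epoch seconds)
    deadline: float    # when the result is due (epoch seconds)
    remaining: float   # estimated seconds of crunching left

def pick_next(tasks, now=None):
    """FIFO, unless some task would miss its deadline in FIFO order."""
    now = time.time() if now is None else now
    fifo = sorted(tasks, key=lambda t: t.received)
    backlog, at_risk = 0.0, []
    for t in fifo:
        if now + backlog + t.remaining > t.deadline:
            at_risk.append(t)          # would finish past its deadline
        backlog += t.remaining
    if at_risk:                        # deadline pressure: most urgent first
        return min(at_risk, key=lambda t: t.deadline)
    return fifo[0]                     # normal case: first in, first out
```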

Cheers.
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22508
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2009898 - Posted: 29 Aug 2019, 21:30:34 UTC

Because BOINC works on a first-in, first-out priority - if you look at the "sent" dates you will see that those being processed first are slightly older (maybe by only a few minutes) than those sitting waiting to be processed.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2009900 - Posted: 29 Aug 2019, 21:33:05 UTC - in response to Message 2009894.  
Last modified: 29 Aug 2019, 21:33:49 UTC

Consider this:

You have two projects added.
Project one has tasks with deadlines 3 weeks away.
Project two has tasks with deadlines 3 days away.

If BOINC were to run tasks in order of deadline, it would run the tasks from the 3-day-deadline project first, quite possibly all the way until the tasks from the 3-week-deadline project were past their deadline. So it's better to just run them in the order they were received.
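A toy calculation (one CPU, made-up but plausible numbers) shows the starvation:

```python
# 20 tasks from the 3-week project (all due at t = 504 h) are queued;
# the 3-day project sends a new 6 h task every 6 h, due 72 h after arrival.
LONG_DEADLINE, LONG_TASKS = 504, 20

clock = 0
for arrival in range(0, 600, 6):
    # Strict deadline order: the fresh short task (due arrival + 72)
    # always beats the long tasks until arrival + 72 >= 504.
    if arrival + 72 < LONG_DEADLINE:
        clock = max(clock, arrival) + 6
finish_long = clock + LONG_TASKS * 6   # long tasks only get a turn now
print(f"3-week tasks finish at t = {finish_long} h;",
      "LATE" if finish_long > LONG_DEADLINE else "ok")  # t = 552 h: LATE
```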
Steven Gaber

Joined: 19 Jan 13
Posts: 111
Credit: 2,834,186
RAC: 11
United States
Message 2009945 - Posted: 30 Aug 2019, 4:55:29 UTC - in response to Message 2009900.  

Thanks for those insights.

I understand the first-in, first-out principle. But in addition to the aforementioned 155 S@H tasks due in October, I have 65 Asteroids tasks with deadlines of 3 September, only four days away. I will probably abort most of them.

I formerly also had 75 Rosetta tasks with short deadlines, but had to abort 45 of them.

Steven Gaber
Oldsmar, FL
Steven Gaber

Joined: 19 Jan 13
Posts: 111
Credit: 2,834,186
RAC: 11
United States
Message 2009946 - Posted: 30 Aug 2019, 4:58:17 UTC - in response to Message 2009897.  

BOINC works in "First In, First Out" (FIFO) order unless deadlines become a problem. ;-)

Cheers.


For me, those long task lists and short deadlines have become a problem.

Salud.

Steven Gaber
Oldsmar, FL
W-K 666
Volunteer tester
Joined: 18 May 99
Posts: 19376
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2009949 - Posted: 30 Aug 2019, 5:15:24 UTC - in response to Message 2009946.  

I would decrease your cache size. You shouldn't run a 10-day cache when you have tasks with a 3-day deadline. A one-day cache should be more than sufficient.
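The arithmetic is simple enough to sketch (made-up throughput figures, just for illustration):

```python
def doomed_tasks(cache_days, deadline_days, tasks_per_day):
    """Tasks fetched that cannot possibly finish inside the deadline."""
    buffered = cache_days * tasks_per_day
    completable = deadline_days * tasks_per_day
    return max(0, buffered - completable)

# On a host that completes 20 tasks a day:
print(doomed_tasks(10, 3, 20))  # 10-day cache, 3-day deadline -> 140 doomed
print(doomed_tasks(1, 3, 20))   # 1-day cache -> 0 doomed
```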

And let's hope Dorian doesn't cause you problems.
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13848
Credit: 208,696,464
RAC: 304
Australia
Message 2009953 - Posted: 30 Aug 2019, 6:04:00 UTC - in response to Message 2009945.  

I have 65 Asteroids tasks with deadlines of 3 September, only four days away. I will probably abort most of them.

I formerly also had 75 Rosetta tasks with short deadlines, but had to abort 45 of them.
Why???
The manager will take into account how long your computer is on, how much of that time it can spend processing work, how long it takes to process each project's work, and what your resource share settings are, and then work to meet those requirements.
Downloading work, aborting it, and suspending & re-enabling crunching just make it more difficult for the manager to do what you have set it to do.

If you run more than one project, set a small cache (no more than a day), and let the manager work things out. It will still work things out with a larger cache, but it will take ages (several weeks to a month or more) to do so. The smaller the cache, the sooner it can reconcile your settings with the work it has to do.
Grant
Darwin NT
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22508
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2009957 - Posted: 30 Aug 2019, 6:53:04 UTC

Not only a small "primary cache" ("store at least x days"), but a really small "secondary cache" (the "store x additional days"). Something like 1 + 0.01.
Further, some projects will send you a very inflated amount of work and over-fill the caches, which is not good for cache management within BOINC.
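Roughly speaking, the client asks for enough seconds of work to top its buffer back up to the sum of those two settings (a rough sketch of the idea, not BOINC's actual formula; the real client also weights requests by resource share and per-resource backlog):

```python
def work_request_seconds(store_at_least, store_additional,
                         on_frac, active_frac, buffered):
    # store_at_least / store_additional in days; buffered in seconds.
    target = (store_at_least + store_additional) * 86400
    usable = on_frac * active_frac   # share of wall time spent crunching
    return max(0.0, target * usable - buffered)

# "1 + 0.01" days on a host that's on 90% of the time, crunching 95%
# of that, with one hour of work already queued:
print(work_request_seconds(1.0, 0.01, 0.9, 0.95, buffered=3600))  # ~71011 s
```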
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 2010033 - Posted: 30 Aug 2019, 20:45:19 UTC - in response to Message 2009957.  

Not only a small "primary cache" ("store at least x days"), but a really small "secondary cache" (the "store x additional days"). Something like 1 + 0.01.
Further, some projects will send you a very inflated amount of work and over-fill the caches, which is not good for cache management within BOINC.


Yup, I finally figured out the settings on the World Community Grid website to get that to stop. But I am not sure I ever got Einstein@Home to cooperate.

Tom
A proud member of the OFA (Old Farts Association).
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2010035 - Posted: 30 Aug 2019, 20:52:29 UTC - in response to Message 2010033.  

Not only a small "primary cache" ("store at least x days"), but a really small "secondary cache" (the "store x additional days"). Something like 1 + 0.01.
Further, some projects will send you a very inflated amount of work and over-fill the caches, which is not good for cache management within BOINC.


Yup, I finally figured out the settings on the World Community Grid website to get that to stop. But I am not sure I ever got Einstein@Home to cooperate.

Tom

+1
Only solution for Einstein is GET/NNT/EMPTY/GET (fetch work, set No New Tasks, run the cache empty, then fetch again).
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2010050 - Posted: 30 Aug 2019, 22:02:50 UTC - in response to Message 2010033.  
Last modified: 30 Aug 2019, 22:05:42 UTC

Not only a small "primary cache" ("store at least x days"), but a really small "secondary cache" (the "store x additional days"). Something like 1 + 0.01.
Further, some projects will send you a very inflated amount of work and over-fill the caches, which is not good for cache management within BOINC.


Yup, I finally figured out the settings on the World Community Grid website to get that to stop. But I am not sure I ever got Einstein@Home to cooperate.

Tom
I'm running Einstein too. I get what I ask for - no more, sometimes less. The Event Log records both request and allocation.

Edit - most recent this machine:

30/08/2019 21:04:28 | Einstein@Home | Sending scheduler request: To fetch work.
30/08/2019 21:04:28 | Einstein@Home | Reporting 1 completed tasks
30/08/2019 21:04:28 | Einstein@Home | Requesting new tasks for Intel GPU
30/08/2019 21:04:28 | Einstein@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
30/08/2019 21:04:28 | Einstein@Home | [sched_op] NVIDIA GPU work request: 0.00 seconds; 0.00 devices
30/08/2019 21:04:28 | Einstein@Home | [sched_op] Intel GPU work request: 4412.18 seconds; 0.00 devices
30/08/2019 21:04:32 | Einstein@Home | Scheduler request completed: got 1 new tasks
30/08/2019 21:04:32 | Einstein@Home | [sched_op] Server version 611
30/08/2019 21:04:32 | Einstein@Home | Project requested delay of 60 seconds
30/08/2019 21:04:32 | Einstein@Home | [sched_op] estimated total CPU task duration: 0 seconds
30/08/2019 21:04:32 | Einstein@Home | [sched_op] estimated total NVIDIA GPU task duration: 0 seconds
30/08/2019 21:04:32 | Einstein@Home | [sched_op] estimated total Intel GPU task duration: 18527 seconds
30/08/2019 21:04:32 | Einstein@Home | [sched_op] handle_scheduler_reply(): got ack for task LATeah1049ad_196.0_0_0.0_49327964_1
30/08/2019 21:04:32 | Einstein@Home | [sched_op] Deferring communication for 00:01:00
30/08/2019 21:04:32 | Einstein@Home | [sched_op] Reason: requested by project
30/08/2019 21:04:34 | Einstein@Home | Started download of templates_LATeah1049ae_0100_7551530.dat
30/08/2019 21:04:36 | Einstein@Home | Finished download of templates_LATeah1049ae_0100_7551530.dat
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2010071 - Posted: 31 Aug 2019, 0:44:16 UTC

Sure, if you have an Einstein-only project machine.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13848
Credit: 208,696,464
RAC: 304
Australia
Message 2010079 - Posted: 31 Aug 2019, 1:36:15 UTC
Last modified: 31 Aug 2019, 1:41:00 UTC

30/08/2019 21:04:28 | Einstein@Home | [sched_op] Intel GPU work request: 4412.18 seconds; 0.00 devices
30/08/2019 21:04:32 | Einstein@Home | [sched_op] estimated total Intel GPU task duration: 18527 seconds
Looks like about 4 times as much as was requested was delivered.

I guess it's a juggling act between "Store at least" and "Store up to an additional", along with system up time and BOINC crunching time while the system is up. Either the scheduler waits until the processing time requested is equal to or greater than the runtime of the smallest work unit available before issuing more (even to the point of the host running out of that project's work before getting more, or not getting any more until the resource-share debt grows to the point that more than the requested amount needs to be processed in order to honour the resource share settings), or it issues more work as soon as what the system has falls below the set minimum, even if that greatly exceeds the requested amount of processing time.

Adding projects, some with longer-running WUs, others with much shorter-running WUs, and a mixture of both, all with different deadlines, would make the manager's job of honouring Resource Share settings possible only over the long term.
Unfortunately people tend to expect to see their resource share settings honoured immediately, or at least on an hour-by-hour basis. But with multiple projects, limited system up time, limited processing time while the system is up, and limited system resources (i.e. low clock speeds, low core counts, low-performance GPUs, if any), the resource share can only be met over the long term - i.e. 4 or more weeks.
Grant
Darwin NT
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2010107 - Posted: 31 Aug 2019, 8:14:20 UTC - in response to Message 2010079.  

30/08/2019 21:04:28 | Einstein@Home | [sched_op] Intel GPU work request: 4412.18 seconds; 0.00 devices
30/08/2019 21:04:32 | Einstein@Home | [sched_op] estimated total Intel GPU task duration: 18527 seconds
Looks like about 4 times as much as was requested was delivered.
30/08/2019 21:04:32 | Einstein@Home | Scheduler request completed: got 1 new tasks
You left that line out of the quote - can't send a fractional task. It sent "smallest amount possible".
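In other words (toy arithmetic, using the figures from the log above):

```python
import math

requested = 4412.18   # seconds of Intel GPU work asked for
est_task = 18527.0    # server's estimate for the smallest task on offer
sent = max(1, math.ceil(requested / est_task))   # whole tasks only
print(sent, sent * est_task)   # 1 task, 18527 s: ~4x the request
```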

I guess it's a juggling act between "Store at least" and "Store up to an additional", along with system up time and BOINC crunching time while the system is up. Either the scheduler waits until the processing time requested is equal to or greater than the runtime of the smallest work unit available before issuing more (even to the point of the host running out of that project's work before getting more, or not getting any more until the resource-share debt grows to the point that more than the requested amount needs to be processed in order to honour the resource share settings), or it issues more work as soon as what the system has falls below the set minimum, even if that greatly exceeds the requested amount of processing time.

Adding projects, some with longer-running WUs, others with much shorter-running WUs, and a mixture of both, all with different deadlines, would make the manager's job of honouring Resource Share settings possible only over the long term.
Unfortunately people tend to expect to see their resource share settings honoured immediately, or at least on an hour-by-hour basis. But with multiple projects, limited system up time, limited processing time while the system is up, and limited system resources (i.e. low clock speeds, low core counts, low-performance GPUs, if any), the resource share can only be met over the long term - i.e. 4 or more weeks.
The machine in question is running four separate projects. Given that they all have very different deadlines (from 5 days to 7 weeks, typically) and different scientific requirements, my cache settings are 0.25 days + 0.05 days. So yes, there are multiple issues at play, and it's the overall balance of the machine and its various resources that matters. Did you notice the 'intel_gpu'?

There is one particular bug, which I've only seen very, very rarely - maybe four or five times in 15 years of crunching at Einstein (and it's only happened at Einstein). This is where the client requests a modest amount of work (as in my example), and gets a reasonable reply (again, as here). But a minute later, the client asks for the same amount of work again, and gets it. And a minute later, it asks again. And again. And again. And again....

But that's a client bug. It ends up asking for too much. And I've never been able to track it down, because (a) you can't trigger it deliberately, and (b) anything you do to investigate it, like setting new logging flags, stops the behaviour. It'll have to remain a mystery, unless it happens spontaneously while I have the logging flags set for another reason. But the critical point remains: it's the client that requests work - no project can send work unsolicited.
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13848
Credit: 208,696,464
RAC: 304
Australia
Message 2010122 - Posted: 31 Aug 2019, 10:26:22 UTC - in response to Message 2010107.  

You left that line out of the quote - can't send a fractional task. It sent "smallest amount possible".
Which I was fully aware of; I guess I just didn't make it as obvious as I thought I did when I discussed the scheduler supplying work even if the smallest amount available is considerably larger than the requested amount, along with the alternative option.

The point you were addressing was people saying that a project was giving them significantly more work than they needed to meet their cache & resource share settings. To me, 50% more of something is significant; 400% more qualifies as a huge amount more (even if the absolute amount of that something is very small).


Did you notice the 'intel_gpu'?
Yes, although I'm not sure why that is of particular significance in this case.


There is one particular bug, which I've only seen very, very rarely - maybe four or five times in 15 years of crunching at Einstein (and it's only happened at Einstein). This is where the client requests a modest amount of work (as in my example), and gets a reasonable reply (again, as here). But a minute later, the client asks for the same amount of work again, and gets it. And a minute later, it asks again. And again. And again. And again....
Which may be what was happening with the people who were talking about getting more work than they should have according to their settings, although I'd have thought they'd have noticed if that was what was actually occurring.
Grant
Darwin NT
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2010124 - Posted: 31 Aug 2019, 10:47:37 UTC - in response to Message 2010122.  

And a minute later, it asks again. And again. And again. And again....
Which may be what was happening with the people who were talking about getting more work than they should have according to their settings, although I'd have thought they'd have noticed if that was what was actually occurring.
Which was my primary reason for butting into this conversation. The client bug I've seen is, well, bugging me: but it's very elusive. Fixing it will rely on people making precise, formal observations - and reporting them accurately. Sloppy thinking like "Einstein sends...", well, bugs me again.
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2010167 - Posted: 31 Aug 2019, 16:59:01 UTC - in response to Message 2010124.  

I guess I am volunteering again. My last experiment with Einstein work requests used these parameters:
Einstein resource share =25
Seti resource share = 1000
Primary work cache = 0.01 days
Secondary work cache = 0.0 days
I unset NNT and received 260 Einstein Gamma Ray Pulsar tasks on the first connection. The second connection, 60 seconds later, netted another 225 tasks. I then set NNT again and started aborting all the work I couldn't finish in time. An Einstein task on my host crunches in between 465 and 585 seconds depending on the card.
I should have received only one task at a request of 864 seconds of work. So why did I get sent 130,000 seconds of work?
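For scale, the rough arithmetic from those figures:

```python
tasks = 260 + 225          # the two back-to-back allocations
delivered = 130_000        # seconds of estimated work received
print(delivered / tasks)   # ~268 s per task by the server's estimate
print(delivered / 864)     # ~150x the 864 s actually requested
```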
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22508
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2010172 - Posted: 31 Aug 2019, 17:19:04 UTC

Unlike many of the settings within BOINC, the cache settings do not treat zero as "use a default value" (or even as "I really mean zero"), but try to interpret zero as a real value, and probably in so doing end up with something very strange - try a setting of 0.01 or even 0.001.

I'm just about to set this machine up on Einstein to see what it does with a "virgin" machine with similar settings to those used by Keith, but a non-zero secondary cache.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22508
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2010173 - Posted: 31 Aug 2019, 17:29:08 UTC

A question before I do the actual test - roughly how long does a "typical" Einstein job take to run on a CPU?
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?