"got 0 new tasks" - why so?

Message boards : Number crunching : "got 0 new tasks" - why so?
Erich56

Joined: 31 Dec 14
Posts: 7
Credit: 828,717
RAC: 0
Austria
Message 1784663 - Posted: 4 May 2016, 6:57:12 UTC

For the past few hours, whenever the scheduler requests new tasks, it always says "scheduler request completed - got 0 new tasks" (without giving any detailed reason).
However, if I interpret the Server Status page correctly, there should be plenty of v8 tasks available.
What's the problem?
ID: 1784663 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5055
Credit: 376,588,116
RAC: 108,146
United States
Message 1784665 - Posted: 4 May 2016, 7:15:58 UTC - in response to Message 1784663.  

One possibility is the Green Bank Data.

Most of it is VLARs which are not sent to the Nvidia GPUs.

So while those tapes are being split, a large number of VLARs get sent to CPUs and AMD GPUs, which leaves all other machines without any work for several hours until those are drained.
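
As a rough illustration of the point above, here is a toy sketch in Python. It is not SETI@home's scheduler code: the task kinds, the `eligible` helper, and the all-VLAR batch are assumptions made up for the example. It only shows why a queue can look full on the Server Status page while a request for an Nvidia GPU still comes back with 0 tasks.

```python
# Toy model only: the task kinds, the eligibility rule, and the batch contents
# are illustrative assumptions, not the real scheduler logic.

def eligible(task_kind, resource):
    """VLAR tasks go to CPUs and AMD GPUs, but not to Nvidia GPUs."""
    if task_kind == "vlar":
        return resource in ("cpu", "amd_gpu")
    return True

def tasks_for(queue, resource):
    """Return the tasks in the queue that could be sent to this resource."""
    return [kind for kind in queue if eligible(kind, resource)]

# A freshly split batch that happens to be entirely VLAR (hypothetical).
queue = ["vlar"] * 200

nvidia = tasks_for(queue, "nvidia_gpu")
cpu = tasks_for(queue, "cpu")
print(len(nvidia), len(cpu))  # prints: 0 200
```

Once non-VLAR tasks reach the queue again, the same request would start returning work.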
ID: 1784665 · Report as offensive
Erich56

Joined: 31 Dec 14
Posts: 7
Credit: 828,717
RAC: 0
Austria
Message 1784668 - Posted: 4 May 2016, 7:21:54 UTC

thanks for the explanation.

Just a minute ago, I did get quite a number of tasks for my 2 high-end NVIDIA GPUs :-)
So let's hope that this goes on like this!
ID: 1784668 · Report as offensive
Erich56

Joined: 31 Dec 14
Posts: 7
Credit: 828,717
RAC: 0
Austria
Message 1785837 - Posted: 8 May 2016, 13:46:05 UTC

Since yesterday, I am again not receiving any new tasks for my NVIDIA GPU.

Plus, I have trouble telling from the Server Status page whether tasks for NVIDIA GPUs are available or not.
Could anyone please give me short instructions on how I can find this out?
Thanks in advance.
ID: 1785837 · Report as offensive
OzzFan Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15678
Credit: 80,056,840
RAC: 25,465
United States
Message 1785847 - Posted: 8 May 2016, 14:47:57 UTC - in response to Message 1785837.  

From the server's perspective, all workunits are the same. The server doesn't have CPU workunits, GPU workunits, Raspberry Pi workunits, or ARM workunits. It is the BOINC client on your machine that asks for work for a particular resource.

That being said, the server has a queue of only a few hundred, so a single client can grab them all before it is reloaded with more work to send. Seeing messages that your client was unable to download more work is perfectly normal and not necessarily indicative of a problem.
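
To picture the small send queue described above, here is a minimal Python sketch. It is a toy model, not the BOINC server code: `handle_request`, the queue size of 300, and the request sizes are all assumptions chosen for illustration.

```python
from collections import deque

# Toy model only: the function name, queue size, and request sizes are
# illustrative assumptions.

def handle_request(ready_queue, wanted):
    """Hand out up to `wanted` tasks from the shared ready-to-send queue."""
    sent = []
    while ready_queue and len(sent) < wanted:
        sent.append(ready_queue.popleft())
    return sent

ready_queue = deque(range(300))  # "a few hundred" tasks ready to send

first = handle_request(ready_queue, 300)   # one client grabs everything
second = handle_request(ready_queue, 50)   # next client arrives before a refill
print(len(first), len(second))  # prints: 300 0
```

The second client sees "got 0 new tasks" even though nothing is wrong; it simply asked between refills.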
ID: 1785847 · Report as offensive
Erich56

Joined: 31 Dec 14
Posts: 7
Credit: 828,717
RAC: 0
Austria
Message 1785851 - Posted: 8 May 2016, 14:57:27 UTC - in response to Message 1785847.  

... Seeing messages that your client was unable to download more work is perfectly normal and not necessarily indicative of a problem.

thanks a lot for the explanation :-)
ID: 1785851 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 17459
Credit: 392,958,583
RAC: 102,019
United Kingdom
Message 1785855 - Posted: 8 May 2016, 15:09:23 UTC

Cancel what I was typing as you appear to be picking up work for your CPUs again.....

Check your cache settings. To keep a reasonably full cache (100 CPU tasks and 100 tasks per GPU) you need to have "store at least x days" set to between 4 and 6, with the additional days set to a LOW value, say between 0.01 and 0.1 days. Setting "additional days" low makes sure that whenever your computer connects to the server it will attempt to get work, while a large value causes a delay of that many days - not quite what one would expect!
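
The effect of the two cache settings can be sketched roughly as follows. This is a simplified model in Python, not the actual BOINC client logic: the function name is hypothetical, and it assumes the client only asks for work once the cache drops below the "store at least" value, then tries to fill it up to that value plus the "additional days".

```python
# Simplified model of the two cache settings; not the actual BOINC client code.

def days_to_request(buffered_days, min_days, additional_days):
    """Days of work to ask for, given how much work is already cached.

    Assumption: a request is only made once the cache drops below `min_days`,
    and it then tries to fill the cache up to `min_days + additional_days`.
    """
    if buffered_days >= min_days:
        return 0.0
    return (min_days + additional_days) - buffered_days

# Low "additional days": the cache is topped up in small, frequent requests.
print(days_to_request(3.95, 4.0, 0.05))

# Large "additional days": after one big fill the cache sits above the minimum,
# so nothing is requested until it has drained back below `min_days`.
print(days_to_request(5.0, 4.0, 2.0))
```

Under this model, a large "additional days" value means big fills followed by a long stretch with no requests at all, which matches the delay described above.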
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1785855 · Report as offensive
Erich56

Joined: 31 Dec 14
Posts: 7
Credit: 828,717
RAC: 0
Austria
Message 1785869 - Posted: 8 May 2016, 16:12:39 UTC - in response to Message 1785855.  

yes, meanwhile I picked up work again - some 100 all at once.
My cache is presently at 0.1 days for both entries.

I set 0.1 on purpose because I am also crunching WCG, and I don't want WCG to download a tremendous amount of work: those tasks run between 4 and 10 hours each, so with too many tasks downloaded at a time there is a risk that they will not be crunched before their due date.

Do I remember correctly, from somewhere here in the forum, that regardless of the settings no more than 100 Seti tasks are ever downloaded at a time?
ID: 1785869 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2766
Credit: 498,898,621
RAC: 766,027
Canada
Message 1785878 - Posted: 8 May 2016, 16:45:28 UTC - in response to Message 1785869.  

100 is the maximum per device.
ID: 1785878 · Report as offensive
Erich56

Joined: 31 Dec 14
Posts: 7
Credit: 828,717
RAC: 0
Austria
Message 1785883 - Posted: 8 May 2016, 17:25:17 UTC - in response to Message 1785878.  

100 is the maximum per device.

Okay, thanks. So I already get this number of tasks with settings of 0.1 and 0.1 days of work; hence, no need to increase these values, I guess.
ID: 1785883 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1677
Credit: 429,152,221
RAC: 123,952
United States
Message 1785903 - Posted: 8 May 2016, 18:40:48 UTC

Can someone expound on the when and why of the 100 task limit? I believe it isn't a very new thing, but I know it hasn't been around forever, and I can't recall hearing an explanation of why it was implemented.

ID: 1785903 · Report as offensive
kittyman Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 50375
Credit: 981,151,631
RAC: 48,906
United States
Message 1785905 - Posted: 8 May 2016, 18:45:55 UTC - in response to Message 1785903.  

Can someone expound on the when and why of the 100 task limit? I believe it isn't a very new thing, but I know it hasn't been around forever, and I can't recall hearing an explanation of why it was implemented.

To lighten the load on the servers and cut down on the number of WUs being stored waiting for results to be reported. It cuts down the turnaround time.
"Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein
"With cats." kittyman

ID: 1785905 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5055
Credit: 376,588,116
RAC: 108,146
United States
Message 1785908 - Posted: 8 May 2016, 18:53:59 UTC - in response to Message 1785905.  

Like Mark said,

It was to lighten the load, and also to cut down on how many tasks people were hoarding. That way no one downloads a thousand work units that then time out.

But it didn't work like they wanted it to.

Instead of 100 total across GPUs and CPU, it became 100 PER GPU and 100 PER CPU chip.

They never went back to fix it but hey, it's a good thing they didn't ;)
ID: 1785908 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6533
Credit: 193,782,930
RAC: 12,506
United States
Message 1785937 - Posted: 8 May 2016, 21:04:26 UTC - in response to Message 1785908.  

Like Mark said,

It was to lighten the load, and also to cut down on how many tasks people were hoarding. That way no one downloads a thousand work units that then time out.

But it didn't work like they wanted it to.

Instead of 100 total across GPUs and CPU, it became 100 PER GPU and 100 PER CPU chip.

They never went back to fix it but hey, it's a good thing they didn't ;)

It was originally a fixed 100 CPU & 100 GPU. Then a change was made to the BOINC server code with the intention of allowing the limits to be applied per GPU vendor (ATI, Intel, or Nvidia), as the then-current limit could lead to resource starvation: if you had 1 ATI & 1 NVIDIA device, you could hit your task limit with work for only one device, leaving the other to sit idle.
The first version of the change was a bit bugged & the limit was applied based on the total number of GPUs in the system. So if you had 1 ATI, 1 Intel, & 1 Nvidia you could have 300 GPU tasks for any 1 vendor, instead of 100 for each separate vendor. Once the BOINC server code was working as intended, we were limited to 100 CPU tasks & 100 GPU tasks * N GPUs for each vendor.
So a system with 2 ATI, an iGPU, & 2 Nvidia would be limited to 600 total tasks:
100 CPU tasks
200 tasks shared across 2 ATI GPUs
100 tasks for the iGPU
200 tasks shared across 2 NVIDIA GPUs
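
The arithmetic in the breakdown above can be written out as a short Python sketch. This is not BOINC server code; the constant names, the `task_limits` function, and the vendor keys are hypothetical and only restate the rule as described in this post.

```python
# Sketch of the limit arithmetic described above; not BOINC server code.
PER_CPU_LIMIT = 100   # fixed CPU allowance, regardless of core count
PER_GPU_LIMIT = 100   # allowance per GPU, pooled within each vendor

def task_limits(gpus_by_vendor):
    """Return the task limit for the CPU and for each GPU vendor's pool."""
    limits = {"cpu": PER_CPU_LIMIT}
    for vendor, count in gpus_by_vendor.items():
        limits[vendor] = PER_GPU_LIMIT * count
    return limits

limits = task_limits({"ati": 2, "intel": 1, "nvidia": 2})
print(limits, sum(limits.values()))
# prints: {'cpu': 100, 'ati': 200, 'intel': 100, 'nvidia': 200} 600
```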
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1785937 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1677
Credit: 429,152,221
RAC: 123,952
United States
Message 1785999 - Posted: 9 May 2016, 1:39:30 UTC

Well, I guess my question then would be, has this limit been re-evaluated as time progresses and as the processing power of especially video cards has improved? I would think there would be a time, if we're not already seeing it happen, where systems would run out of tasks to process during the downtimes and relatively short outages. You can't expect to achieve the goals that were mentioned above while having enough WU's on hand to handle any possible outage of course, but as things become faster, possibly the limit might be raised enough to at least keep everyone full for the average period of time of normally planned outages?

Also, might it be possible to adjust the number of tasks on hand based on either the detected video card (though I doubt that would work, because I have systems with cards of different levels in them - a 950 and a 750, for example - and it doesn't currently even detect those correctly) or, probably even better, on correctly completed work and how fast its turnaround time is? That is info the server already has, and it is processor/brand independent: just set the limit to a number equivalent to what 100 was when it was implemented. So if the 570/580 was the current hot card back when the 100 limit was put in place, maybe it would be 200 for the 1070/1080 (assuming those work properly out of the chute of course, and process tasks the same as current cards do, just a bit faster)? Or would even this be too difficult to put in place currently or in the reasonably near future?

ID: 1785999 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6533
Credit: 193,782,930
RAC: 12,506
United States
Message 1786016 - Posted: 9 May 2016, 3:25:01 UTC - in response to Message 1785999.  
Last modified: 9 May 2016, 3:28:36 UTC

Well, I guess my question then would be, has this limit been re-evaluated as time progresses and as the processing power of especially video cards has improved? I would think there would be a time, if we're not already seeing it happen, where systems would run out of tasks to process during the downtimes and relatively short outages. You can't expect to achieve the goals that were mentioned above while having enough WU's on hand to handle any possible outage of course, but as things become faster, possibly the limit might be raised enough to at least keep everyone full for the average period of time of normally planned outages?

Also, might it be possible to adjust the number of tasks on hand based on either the detected video card (though I doubt that would work, because I have systems with cards of different levels in them - a 950 and a 750, for example - and it doesn't currently even detect those correctly) or, probably even better, on correctly completed work and how fast its turnaround time is? That is info the server already has, and it is processor/brand independent: just set the limit to a number equivalent to what 100 was when it was implemented. So if the 570/580 was the current hot card back when the 100 limit was put in place, maybe it would be 200 for the 1070/1080 (assuming those work properly out of the chute of course, and process tasks the same as current cards do, just a bit faster)? Or would even this be too difficult to put in place currently or in the reasonably near future?

The reason the limits were put in place had to do with the database server going belly up when there were too many Results out in the field. I want to say the tipping point was somewhere in the 11-14 million range. As far as I know that is unchanged. Implementing limits is the project's chosen solution for that problem. It also benefits users by having tasks validated sooner.
There isn't really any advantage for the project in increasing the limits. The data we are analyzing is not really time sensitive. A few times they had partially full shipments of drives sent up from Arecibo when we ran out of data to process. They generally try to accommodate as many hosts as possible, but catering to a handful of very fast hosts wouldn't really make sense.
I know some people will complain about having to wait a little while for more work once a week. It is apparently a huge inconvenience for them. I imagine they are the same kind of people that would complain if a blood drive told them they ran out of supplies to collect blood, or didn't have any more sugar cookies.

My personal preference for any changes to the limits would be to have limits applied to CPUs the way they are to GPUs, so multi-socket systems would be treated like multi-GPU systems. However, I don't think BOINC even detects the number of CPUs in a system, only the total number of processors (physical or virtual).
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1786016 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1677
Credit: 429,152,221
RAC: 123,952
United States
Message 1786062 - Posted: 9 May 2016, 6:33:58 UTC - in response to Message 1786016.  

I'm not trying to be argumentative, just trying to understand the mechanics. You mentioned results out in the field; does that mean tasks waiting to be run on hosts? Or does it mean tasks that have been processed and are sitting on people's computers waiting to be reported back to home base? The word "results" isn't clear to me.

Also, your answer brings up another question: if something out of the ordinary happens (I don't know, let's say JK Rowling decides one day soon that she is fascinated with SETI, the word gets out, her millions of fans think "Hey, I could do this", and we add tens or hundreds of thousands of new users), would the system automatically throttle back the number of tasks per person to avoid going over the 11-14 million number?

I was actually thinking my solution would possibly reduce the load on the system, because it would lower the number of tasks out there for the systems that can't process 100 tasks in a reasonable timeframe. Or does it already throttle them for those hosts, and the 100 limit is only for the fastest crunchers out there?

ID: 1786062 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5055
Credit: 376,588,116
RAC: 108,146
United States
Message 1786065 - Posted: 9 May 2016, 6:42:28 UTC - in response to Message 1786062.  

I'm guessing he means tasks that were sent out but haven't returned yet with results.

Can't speak for any more on this.
ID: 1786065 · Report as offensive
Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 8237
Credit: 11,853,903
RAC: 11,467
United States
Message 1786177 - Posted: 9 May 2016, 16:14:53 UTC - in response to Message 1786065.  

I'm guessing he means tasks that were sent out but haven't returned yet with results.

Can't speak for any more on this.

That's my understanding of the term.

Most of us think of "result" as the data file returned to S@H after we process a TASK. But for the lines on the SSP, "Results waiting to Send" and "Results out in the field" are actually the Tasks sent to our computers to be processed. Since each Task file sent out produces 1 single Result file, the two terms mean the same thing.
Donald
Infernal Optimist / Submariner, retired
ID: 1786177 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2766
Credit: 498,898,621
RAC: 766,027
Canada
Message 1786220 - Posted: 9 May 2016, 17:54:48 UTC - in response to Message 1786177.  

I think the term results is a result (pardon the pun) of the code being written in sections.

splitters have a task to do, the result is data sets are created.
web server has a task to do, the result is data is on our computers.
validator has a task to do, the result is pending.
etc, etc, etc
ID: 1786220 · Report as offensive



 