Task postponed for 30 seconds error and solution

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1745008 - Posted: 26 Nov 2015, 12:42:47 UTC

If you encounter the following situation with GPU tasks (I have just observed it on an ATI APU):

A task is launched by BOINC but almost immediately acquires the "postponed" tag, and stderr contains many repetitions of:
BOINC assigns device 0, slots 0 to -1 (including) will be checked
ERROR: Wait for free slot failed, app will restart now,used_Mutex=-1,Slot_Mutex=44071440,BOINCs_device=0
CPU affinity adjustment enabled
GPUlock enabled. Use -instances_per_device N switch to provide number of instances to run if BOINC is configured to launch few tasks per device.
Number of app instances per device set to:0
Running on device number: 0
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0, slots 0 to -1 (including) will be checked
ERROR: Wait for free slot failed, app will restart now,used_Mutex=-1,Slot_Mutex=16808464,BOINCs_device=0
CPU affinity adjustment enabled
GPUlock enabled. Use -instances_per_device N switch to provide number of instances to run if BOINC is configured to launch few tasks per device.
Number of app instances per device set to:0
Running on device number: 0
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0, slots 0 to -1 (including) will be checked
ERROR: Wait for free slot failed, app will restart now,used_Mutex=-1,Slot_Mutex=43547152,BOINCs_device=0
CPU affinity adjustment enabled
GPUlock enabled. Use -instances_per_device N switch to provide number of instances to run if BOINC is configured to launch few tasks per device.

And the BOINC message log has many entries like:
26/11/2015 15:30:11 | SETI@home | task postponed 30.000000 sec: Wait for free slot failed
26/11/2015 15:30:13 | SETI@home | task postponed 30.000000 sec: Wait for free slot failed
26/11/2015 15:30:14 | SETI@home | task postponed 30.000000 sec: Wait for free slot failed

Just restart BOINC completely. For some reason it sometimes goes haywire.
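
To gauge how often the client is postponing tasks, the message-log lines above can be tallied with a few lines of Python. This is only a minimal sketch: the regular expression is inferred from the three sample lines shown, and `count_postponements` is a hypothetical helper, not part of BOINC.

```python
import re

# Pattern inferred from the sample log lines in this post (an assumption,
# not an official BOINC log format specification).
POSTPONED = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| "
    r"(?P<project>[^|]+?) \| "
    r"task postponed (?P<seconds>[\d.]+) sec: (?P<reason>.+)$"
)


def count_postponements(lines):
    """Count 'task postponed' entries per (project, reason) pair."""
    counts = {}
    for line in lines:
        match = POSTPONED.match(line.strip())
        if match:
            key = (match.group("project"), match.group("reason"))
            counts[key] = counts.get(key, 0) + 1
    return counts


sample = [
    "26/11/2015 15:30:11 | SETI@home | task postponed 30.000000 sec: Wait for free slot failed",
    "26/11/2015 15:30:13 | SETI@home | task postponed 30.000000 sec: Wait for free slot failed",
    "26/11/2015 15:30:14 | SETI@home | task postponed 30.000000 sec: Wait for free slot failed",
]
counts = count_postponements(sample)
for (project, reason), n in counts.items():
    print(f"{project}: {n} x {reason}")
```

Several postponements with the same reason within a few seconds, as in the sample, would suggest the restart loop described above rather than a one-off delay.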
ID: 1745008
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1745018 - Posted: 26 Nov 2015, 13:40:00 UTC - in response to Message 1745008.  

How many slots are there in the BOINC Data/Slots directory?
ID: 1745018
Profile Raistmer
Message 1745024 - Posted: 26 Nov 2015, 14:11:43 UTC - in response to Message 1745018.  

How many slots are there in the BOINC Data/Slots directory?

26. In normal operation there should be only 6.
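
For anyone who wants to check the same thing, the slot directories can be counted with a short script. This is a sketch under assumptions: `count_slot_dirs` is a hypothetical helper, and the data directory path has to be supplied by the user (on Windows it is often `C:\ProgramData\BOINC`, but that depends on the installation). The demo runs against a throwaway directory rather than a real BOINC install.

```python
import tempfile
from pathlib import Path


def count_slot_dirs(data_dir):
    """Return the number of subdirectories under <data_dir>/slots (0 if absent)."""
    slots = Path(data_dir) / "slots"
    if not slots.is_dir():
        return 0
    return sum(1 for entry in slots.iterdir() if entry.is_dir())


# Demo on a temporary directory that mimics the layout (illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("0", "1", "2"):
        (Path(tmp) / "slots" / name).mkdir(parents=True)
    # A stray file should not be counted as a slot.
    (Path(tmp) / "slots" / "stray.txt").write_text("not a slot")
    demo_count = count_slot_dirs(tmp)

print(demo_count)
```

To use it on a real machine, call `count_slot_dirs` with the actual BOINC data directory and compare the result with the number of tasks expected to run at once.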
ID: 1745024
Profile Jord
Message 1745030 - Posted: 26 Nov 2015, 14:33:17 UTC - in response to Message 1745024.  

Okay, I would urge you to update to a version newer than 7.6.9, but it is best to wait until 7.6.17 or later is out for Windows before doing so.
ID: 1745030
Profile Jord
Message 1746253 - Posted: 2 Dec 2015, 8:49:01 UTC - in response to Message 1745008.  

One weird thing about the error message: Wait for free slot failed.
Why is the science application reporting that it was waiting for a free slot and failed to do so? Is that the same slot as the slots BOINC uses?

The science app shouldn't need to check for an open slot. That's like your car checking whether there's an open space in the parking lot. (Well, maybe self-driving cars can do that.)

So what kind of app is doing this checking, and what is it checking for?
ID: 1746253
Profile Raistmer
Message 1746372 - Posted: 2 Dec 2015, 21:35:36 UTC - in response to Message 1746253.  
Last modified: 2 Dec 2015, 21:38:11 UTC

Yep, think of this app as a self-driving car that existed long before the driver learned to drive ;) There is a terminology issue here: the app's slot is not the same as a BOINC slot directory, but it serves a similar purpose. The app needs to know how many instances of it are running for CPU affinity scheduling (BOINC still doesn't provide this) and, historically, for scheduling as a whole (since then BOINC has evolved to support GPU scheduling on its own, some rules that the driver has learned ;) ).

Technically, this bookkeeping is done via the OS mutex interface. Somehow it got screwed up. Why an app restart did not cure the situation but a BOINC restart did (with no OS restart required), I have no idea; I am just noting the fact.
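
As a rough illustration of this kind of bookkeeping (not the app's actual code), the sketch below uses in-process `threading.Lock` objects as stand-ins for OS named mutexes. The function `acquire_slot` and the slot layout are hypothetical. It also shows that if the instance count were 0, the range of slots to check would be empty (compare "slots 0 to -1" in the stderr above) and no slot could ever be claimed; whether that is what actually happened here is only a guess.

```python
import threading


def acquire_slot(slot_locks, instances_per_device):
    """Try slots 0 .. instances_per_device-1 without blocking.

    Returns the index of the slot claimed, or -1 if none is free.
    threading.Lock stands in for an OS mutex (an assumption for this sketch).
    """
    for index in range(instances_per_device):
        if slot_locks[index].acquire(blocking=False):
            return index
    return -1


# Two slots on one device: two instances succeed, a third finds nothing free.
slot_locks = [threading.Lock() for _ in range(2)]
first = acquire_slot(slot_locks, 2)
second = acquire_slot(slot_locks, 2)
third = acquire_slot(slot_locks, 2)

# With an instance count of 0 the loop body never runs, so the result is
# always -1, no matter how many locks are actually free.
empty = acquire_slot([threading.Lock()], 0)

print(first, second, third, empty)
```

In a real multi-process setup the locks would have to be system-wide objects, and a lock left held or miscounted by some process would keep every new instance from finding a free slot until that state is cleared.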
ID: 1746372
