work units in error state


Profile genfoch01
Joined: 23 Jun 99
Posts: 4
Credit: 873,571
RAC: 0
United States
Message 1116267 - Posted: 12 Jun 2011, 14:43:37 UTC

I am running BOINC on a Windows machine using the GPU. I have been getting work units (WUs) in an error state with this stderr output:

<core_client_version>6.12.26</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
]]>

Does anyone know what this means? It cannot be referring to the WU deadline, since that's over two weeks away.

I now have 7 of these, so I am curious what this is about and whether there is any way to eliminate it.
____________
Angels can fly because they take themselves lightly

- G.K.Chesterton

Profile Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 359,239
RAC: 31
Germany
Message 1116322 - Posted: 12 Jun 2011, 16:15:25 UTC - in response to Message 1116267.

When did you last reboot that machine?

It looks like the CUDA tasks are getting stuck on your GPU. Do they show any progress in the Tasks tab of BOINC Manager while running?

Regards,
Gundolf
____________
Computers aren't everything in life. (Just kidding)

SETI@home classic workunits 3,758
SETI@home classic CPU time 66,520 hours

Profile Ageless
Joined: 9 Jun 99
Posts: 12324
Credit: 2,626,600
RAC: 954
Netherlands
Message 1116409 - Posted: 12 Jun 2011, 21:20:22 UTC - in response to Message 1116267.

<core_client_version>6.12.26</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
]]>

Anyone know what this means?

What it means is this:
Each task comes with a 'runtime estimate', measured in floating-point operations (FLOPs). It's the <rsc_fpops_est>64788201036261.297000</rsc_fpops_est> value inside the <workunit> tags in the client_state.xml file (my example is from an Einstein task).

BOINC checks whether the application runs past the limit tied to this estimate (the limit itself comes from the related <rsc_fpops_bound> value); if it does, the application is stopped and that error is reported.

A similar error occurs when <rsc_disk_bound>100000000.000000</rsc_disk_bound> is exceeded; the message is then "Maximum disk space exceeded", meaning the task used more disk space than it is allowed.
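
For anyone who wants to look these values up, here is a minimal Python sketch that pulls the two tags out of a client_state.xml-style snippet. Only the two numbers come from this post; the sample structure, the task name example_task and the helper read_workunit_limits are assumptions made for illustration, and a real client_state.xml is much larger and may need more forgiving parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of client_state.xml. Only the two numeric values are
# taken from the post; the wrapper element and the task name are assumed.
SAMPLE = """
<client_state>
  <workunit>
    <name>example_task</name>
    <rsc_fpops_est>64788201036261.297000</rsc_fpops_est>
    <rsc_disk_bound>100000000.000000</rsc_disk_bound>
  </workunit>
</client_state>
"""


def read_workunit_limits(xml_text):
    """Return {workunit name: (rsc_fpops_est, rsc_disk_bound)} as floats."""
    root = ET.fromstring(xml_text)
    limits = {}
    for wu in root.iter("workunit"):
        name = wu.findtext("name", default="(unnamed)")
        # Missing tags become NaN so they are easy to spot in the output.
        fpops_est = float(wu.findtext("rsc_fpops_est", default="nan"))
        disk_bound = float(wu.findtext("rsc_disk_bound", default="nan"))
        limits[name] = (fpops_est, disk_bound)
    return limits


limits = read_workunit_limits(SAMPLE)
for name, (fpops_est, disk_bound) in limits.items():
    print(f"{name}: estimate {fpops_est:.3e} FLOPs, disk bound {disk_bound:.0f}")
```

To try it on a real file, read the file into a string and pass it to read_workunit_limits; work on a copy rather than the live file while BOINC is running.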
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 2789
Credit: 6,300,044
RAC: 7,495
Bulgaria
Message 1116464 - Posted: 13 Jun 2011, 0:13:48 UTC - in response to Message 1116409.
Last modified: 13 Jun 2011, 0:43:03 UTC


Simply put:
If a task for whatever reason runs 10 times longer than the initial estimate (column "To completion"),
BOINC aborts it automatically because it decides the task is hung/stuck.
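
The rule of thumb above can be written as a small Python sketch. This is not BOINC's actual code: the function name looks_stuck and the 20-minute estimate are made up for illustration, and the factor of 10 is simply the figure quoted in this post.

```python
def looks_stuck(elapsed_s, initial_estimate_s, factor=10.0):
    """True when elapsed time is more than `factor` times the initial estimate.

    The default factor of 10 is the figure quoted in this thread, not a value
    read from BOINC itself.
    """
    return elapsed_s > factor * initial_estimate_s


# Hypothetical task with a 20-minute initial estimate: the cutoff is 200 minutes.
estimate_s = 20 * 60
for hours in (1, 3, 4):
    elapsed_s = hours * 3600
    print(f"{hours} h elapsed -> stuck? {looks_stuck(elapsed_s, estimate_s)}")
```

A task that legitimately needs more than the cutoff would be aborted by the same test, which is why a badly wrong initial estimate can produce this error even when nothing is hung.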

(But I don't know why it happened to tasks with "Run time: 0.00":
http://setiathome.berkeley.edu/results.php?hostid=6014929&offset=0&show_names=0&state=5&appid=

Maybe because you just reached:
Consecutive valid tasks 11
http://setiathome.berkeley.edu/host_app_versions.php?hostid=6014929
)

Info on old threads:
http://setiathome.berkeley.edu/forum_thread.php?id=60413&nowrap=true
http://setiathome.berkeley.edu/forum_thread.php?id=62226


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)


Copyright © 2014 University of California