Want: EXIT_TIME_LIMIT_EXCEEDED to happen faster
Author | Message |
---|---|
Joseph Stateson Joined: 27 May 99 Posts: 309 Credit: 70,759,933 RAC: 3 |
There is a discussion here but I cannot make sense of it. My problem: I have tasks on my pair of 5850s that should take 20-40 seconds. Occasionally one hangs, and almost 2 hours later that exit timeout error occurs and the GPU finally starts crunching again. I want to somehow set the expected time to about 2 minutes, and it seems there may be a way to do it by estimating the flops usage, but I don't understand Claggy's instructions. There is a discussion here on how to set a timeout in the project job file, but a client restart erases the XML file, and I am back to babysitting the project in either case. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"There is a discussion here but I cannot make sense of it." That URL doesn't go anywhere. "My problem: I have tasks on my pair of 5850s that should take 20-40 seconds. Occasionally one hangs, and almost 2 hours later that exit timeout error occurs and the GPU finally starts crunching again. I want to somehow set the expected time to about 2 minutes, and it seems there may be a way to do it by estimating the flops usage, but I don't understand Claggy's instructions." No normal Seti v7 task should take only 20 to 40 seconds; I don't think there are any GPUs fast enough to do Seti v7 WUs that fast. You've also got no HD5850s attached to Seti. If you're talking about another project's apps, wouldn't the correct course of action be to stop those apps hanging? What were my alleged instructions? I don't recall giving you any instructions. "There is a discussion here on how to set a timeout in the project job file, but a client restart erases the XML file, and I am back to babysitting the project in either case." Claggy |
Joseph Stateson Joined: 27 May 99 Posts: 309 Credit: 70,759,933 RAC: 3 |
Sorry, I was busy looking for the correct URL, as I failed to bookmark the thread: http://setiathome.berkeley.edu/forum_thread.php?id=71148&postid=1347580#1347580 Anyway, you are correct: the problem should be addressed by the project, if they would get around to it. I probably should have posted this at the BOINC client forum. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14649 Credit: 200,643,578 RAC: 874 |
"Sorry, I was busy looking for the correct URL, as I failed to bookmark the thread." No need to bookmark that thread: the (very specific) BOINC upgrade problem being discussed there now has its own 'sticky' thread at the top of this forum, "Seti, BOINC 7.0.28 going to 7.0.6x and 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED error on GPU work". Unless you were making that specific transition across the BOINC v7.0.30 boundary, both threads are irrelevant to you. Most projects (including this one) won't have a 'job file': that's specific to the 'wrapper' application, which is only used by projects which can't, or won't, write their own native-to-BOINC science application. The only other control you have available is `<rsc_fpops_bound>`, but that would be even more wasteful of your babysitting time: it's sent in the `<workunit>` specification of each individual task you're allocated. If your tasks are indeed completing validly in 20-40 seconds (unlikely here), you'd need to do a big global search/replace and restart BOINC every few minutes. |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
The easiest way to unstick GPU tasks is to bounce the GPU at an interval. In the early days of doing MB work on my ATI GPUs I would run this every few hours: `boinccmd --set_gpu_mode never 10` That way BOINC would suspend GPU processing for 10 seconds and then resume. For Bitcoin Utopia you may want to run that every 10-15 min. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Joseph Stateson Joined: 27 May 99 Posts: 309 Credit: 70,759,933 RAC: 3 |
Thanks! This worked. I am setting up the Windows 7 Task Scheduler to perform this function. Two tasks that were hung actually completed successfully after restarting. One was hung for 5 hours. I noticed that the run time reported by the project did NOT show 5 hours, only the normal 40 or so seconds. Apparently it did not exit properly after successfully completing. |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"Thanks! This worked. I am setting up the Windows 7 Task Scheduler to perform this function. Two tasks that were hung actually completed successfully after restarting. One was hung for 5 hours. I noticed that the run time reported by the project did NOT show 5 hours, only the normal 40 or so seconds. Apparently it did not exit properly after successfully completing." If you suspend a BCU task it starts over. I don't think they checkpoint, because when you are running one it is doing stuff over the internet. |
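
The periodic GPU bounce HAL9000 describes in the thread could be scripted roughly as follows. This is only a sketch: the command `boinccmd --set_gpu_mode never 10` comes from the thread, while the function names, the injectable `runner` and `sleeper` parameters, and the assumption that `boinccmd` is on the PATH are illustrative and not from the original posts.

```python
import subprocess
import time


def build_bounce_command(pause_seconds=10, boinccmd="boinccmd"):
    """Return the argument list that suspends GPU work for pause_seconds.

    Assumes the boinccmd executable can be found under the given name.
    """
    if pause_seconds <= 0:
        raise ValueError("pause_seconds must be positive")
    return [boinccmd, "--set_gpu_mode", "never", str(pause_seconds)]


def bounce_periodically(interval_seconds, pause_seconds=10, iterations=None,
                        runner=subprocess.run, sleeper=time.sleep):
    """Issue the bounce command, then wait interval_seconds, repeatedly.

    iterations=None loops until interrupted. runner and sleeper are
    hypothetical hooks so the loop can be tried without BOINC installed.
    Returns the number of bounces issued.
    """
    count = 0
    while iterations is None or count < iterations:
        # check=False: a failed bounce should not stop the loop.
        runner(build_bounce_command(pause_seconds), check=False)
        count += 1
        sleeper(interval_seconds)
    return count
```

Calling `bounce_periodically(15 * 60)` would bounce the GPU about every 15 minutes, in line with the 10-15 minute interval suggested for Bitcoin Utopia. Depending on the installation, `boinccmd` may need to be started from the BOINC data directory or given a password to reach the client. A scheduled task that runs the single command, as Joseph set up, achieves the same result without a long-running script.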
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.