Normal for Astropulse units to slow down with time?
Author | Message |
---|---|
Thunder Joined: 3 May 03 Posts: 65 Credit: 993,581 RAC: 0 |
I have one Linux machine with an Nvidia GT730 that's acting a bit odd with an Astropulse v7 task. Quite a while back it downloaded one with an estimated completion time of about 2.25h. It started the task and I watched as it finished the first few percent, completing about 1% every minute to minute and a half, so I figured the estimate was about right. About 18 hours later, I had the chance to glance at that machine and was surprised to find it still working on the same task, with the rate slowed to about one thousandth of a percent every 10 minutes. However, it was beyond 99%, so I figured I'd let it finish. About 45 minutes later, I decided it wasn't progressing at all and ended up restarting BOINC. Sadly, I learned that apparently those don't checkpoint, because it reverted to 0% done. :-( Fast forward to today and it has restarted that same task. It's now at about 3.5h (the estimate was 2.25h) and has obviously slowed down to where it's finishing about 1% of the task every 10 minutes. I can't tell much about what the GPU is doing other than by temperature, and it's sitting at about 1 °C above the temperature I would expect at idle, so it's clearly not doing much. Is this just a bad task? Is it normal behavior? |
Thunder Joined: 3 May 03 Posts: 65 Credit: 993,581 RAC: 0 |
"4/14/2016 8:01:10 AM Aborting task ap_24jl10ac_B6_P1_00323_20160408_21551.wu_0: exceeded elapsed time limit 82718.30 (18210521.14G/220.15G)" This was the message I got after 23h 15m of run time, with the task at 99.995% complete. :-P Between the two attempts, that's about 41 hours of processing time that could have gone to something useful. In the absence of any response, I guess I'm just turning off Astropulse work. |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"About 45 minutes later, I decided it wasn't progressing at all and ended up restarting BOINC." Have you ever tried restarting the computer, to clear the GPU's memory in case something was stuck there? "Sadly, I learned that apparently those don't checkpoint because it reverted to 0% done. :-(" Then something was definitely wrong with your computer, because even though Astropulse tasks don't checkpoint as often as Multibeam tasks do, they do checkpoint, something like 10 times or so. Of course, there are APs that don't run well on all computers, but normally they err in different ways; they don't tend to hold up the GPU for many hours before crashing. Next time, though, try rebooting the computer first. You never know what may have been stuck until you try. :) |
Thunder Joined: 3 May 03 Posts: 65 Credit: 993,581 RAC: 0 |
Well, the good news is that another AP task finally floated down to that machine and it finished completely normally. I'm going to guess the previous one just had "issues". |
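
For anyone reading the abort message quoted in the thread, the numbers in it can be checked with a little arithmetic. The sketch below is a reading of the message and not something stated in the thread: it assumes the two figures in parentheses are the task's operation bound and the client's estimated device speed, both in giga-units, so that their ratio gives the time limit in seconds. The function name `parse_time_limit` and the regular expression are made up for illustration.

```python
import re

# Log line exactly as quoted in the thread.
LOG_LINE = (
    "Aborting task ap_24jl10ac_B6_P1_00323_20160408_21551.wu_0: "
    "exceeded elapsed time limit 82718.30 (18210521.14G/220.15G)"
)

# Hypothetical pattern for the numeric part of the message.
PATTERN = re.compile(
    r"exceeded elapsed time limit (?P<limit>[\d.]+) "
    r"\((?P<bound>[\d.]+)G/(?P<speed>[\d.]+)G\)"
)


def parse_time_limit(line):
    """Extract the limit and compare it with bound / speed.

    Assumption: the first parenthesised figure is an operation bound and
    the second an estimated speed, so bound / speed is a time in seconds.
    """
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not an elapsed-time-limit message")
    limit_s = float(match.group("limit"))
    bound_g = float(match.group("bound"))
    speed_g = float(match.group("speed"))
    return {
        "limit_seconds": limit_s,
        "limit_hours": limit_s / 3600,
        "ratio_seconds": bound_g / speed_g,
    }


if __name__ == "__main__":
    info = parse_time_limit(LOG_LINE)
    print(f"limit: {info['limit_seconds']:.2f} s "
          f"({info['limit_hours']:.2f} h)")
    print(f"bound / speed: {info['ratio_seconds']:.2f} s")
```

Under that reading, the ratio agrees with the printed limit to within a second, and the limit works out to just under 23 hours, roughly ten times the 2.25h estimate mentioned in the first post.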
©2023 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.