Long time workunits

Message boards : Number crunching : Long time workunits
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1435672 - Posted: 30 Oct 2013, 18:34:56 UTC - in response to Message 1435667.  

Thanks. Now Julie has the AMD links she needs for her Hosts.
Now, if she will just install them :-)
ID: 1435672 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1436464 - Posted: 1 Nov 2013, 13:55:21 UTC

Oh Noes, it did it again.

Everything was going great for a couple of days, then I found the ATI AP sitting at 46% complete with an elapsed time of 10 hours. Suspending/resuming did nothing. I stopped BOINC 7.0.44 and changed the plan class on that task to nvidia. After relaunching BOINC, the task now says 46% complete, elapsed time 2:57 (hours:minutes). That would be about normal for an ATI AP task taking 6.5 hours. The nVidia MB task and the AP CPU task are unaffected; it's just the ATI task that stalled.
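For anyone wanting to check the arithmetic, here is a rough Python sketch. It assumes progress is roughly linear in run time, which is only an approximation for AP tasks. The function names and the 1.5x tolerance are made up for illustration and are not anything BOINC provides.

```python
def expected_elapsed_hours(progress, typical_total_hours):
    """Elapsed time a task should show at a given progress fraction (0..1),
    assuming progress grows linearly with run time."""
    return progress * typical_total_hours


def looks_stalled(progress, elapsed_hours, typical_total_hours, tolerance=1.5):
    """True if elapsed time is more than `tolerance` times the expected value.

    `tolerance` is an arbitrary illustrative threshold, not a BOINC setting.
    """
    expected = expected_elapsed_hours(progress, typical_total_hours)
    return elapsed_hours > tolerance * expected


def as_h_mm(hours):
    """Format fractional hours as H:MM."""
    total_minutes = round(hours * 60)
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


# Figures from the post: 46% complete, about 6.5 h for a normal ATI AP task.
print(as_h_mm(expected_elapsed_hours(0.46, 6.5)))  # 2:59
print(looks_stalled(0.46, 10.0, 6.5))              # True: 10 h elapsed at 46%
print(looks_stalled(0.46, 2 + 57 / 60, 6.5))       # False: 2:57 after the relaunch
```

So 46% of a 6.5 hour task works out to about 2:59, which is why the 2:57 shown after the relaunch looks normal and the 10 hours before it does not.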

This is starting to remind me of the DELL P4 XP machine I have, where BOINC locks up after 2 days when running just CPU tasks. I fixed the DELL by switching to Linux. Unfortunately, you can't run an nVidia MB task and an ATI AP task at the same time in Linux. At least not with Ubuntu, AFAIK.

Hmmmm...
ID: 1436464 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1437939 - Posted: 4 Nov 2013, 20:22:08 UTC

So far so good. I decided to go back to what worked the last time the 4670 was working with XP. From around Oct to Feb it worked fine in my other Dual core Host with XP. Then from Feb to Aug it worked fine with Windows 8 in the same other Host. Right now it's using BOINC 7.0.36 and AstroPulse r1316. Seems there is less screen lag with BOINC 7.0.36 on that host. For some reason both BOINC 7.0.44 & 7.0.64 invoke quite a bit of Kernel Activity on that Host, and there is quite a bit of redraw flickering in the BOINC Manager. Because of that, I usually kept the Manager minimized, which caused the Kernel Activity to be reduced. In fact, the last couple times an AP stalled, it was overnight with the Manager minimized. I think it was just a coincidence the AP stalled when minimized. If it makes it another day without stalling, I'll try it when minimized :-)
ID: 1437939 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1439661 - Posted: 7 Nov 2013, 14:36:36 UTC - in response to Message 1437939.  

It's not a coincidence. The only ATI Stall suffered in the last few days occurred overnight when the BOINC Manager was minimized. The CPU & nVidia tasks were not affected. As soon as the Manager was Maximized, the ATI Progress resumed. This doesn't happen on my other Hosts.

Strange things when viewing CPU activity in SIV. The 'heavy' boincmgr Kernel Activity is back. When Maximized, SIV shows boincmgr as using 20-40% 'Kernel' (RED). If you just scroll and hide the 3 active tasks, the Kernel Activity drops to almost nothing. Restarting BOINC does nothing. Whut?
ID: 1439661 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1439672 - Posted: 7 Nov 2013, 14:58:11 UTC - in response to Message 1439661.  
Last modified: 7 Nov 2013, 15:03:26 UTC

It's not a coincidence. The only ATI Stall suffered in the last few days occurred overnight when the BOINC Manager was minimized. The CPU & nVidia tasks were not affected. As soon as the Manager was Maximized, the ATI Progress resumed. This doesn't happen on my other Hosts.

Strange things when viewing CPU activity in SIV. The 'heavy' boincmgr Kernel Activity is back. When Maximized, SIV shows boincmgr as using 20-40% 'Kernel' (RED). If you just scroll and hide the 3 active tasks, the Kernel Activity drops to almost nothing. Restarting BOINC does nothing. Whut?

Which BOINC version was that particular experience with?

Charlie Fenton (BOINC developer with primary responsibility for the User Interface - i.e. the Manager - for all versions), has been upgrading wxWidgets (the control responsible for displaying the grid for the task list) - and he tried some techniques for reducing the flicker, but backed out of them again.

My v7.2.28 (x64) shows wxWidgets Version 2.8.10 - could you post comparative figures from your Manager, please? (from Help|About).

If there's a problem, we need to catch it quickly before this build is made 'recommended' - and I've delayed it three times already. I'm going to have to be very, very sure of my ground before I try that again.

Edit - having read your second-to-last post again, I see you're using an older Alpha manager at the moment. So, maybe testing the release candidate v7.2.28 would be a good idea? You can find the links here. I'd still like to see the wxWidgets version number for the Manager you're using, please.
ID: 1439672 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1439678 - Posted: 7 Nov 2013, 15:05:47 UTC - in response to Message 1439672.  

I've experienced ATI stalls and Heavy Kernel activity with BOINC 7.0.36, 7.0.44, and 7.0.64. I'll update to 7.0.64 and play with SIV.
ID: 1439678 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1439712 - Posted: 7 Nov 2013, 16:12:05 UTC - in response to Message 1439672.  
Last modified: 7 Nov 2013, 16:36:15 UTC

After a fresh system restart, BOINC 7.0.64 shows boincmgr at around 10-35% Kernel; the Apps show less than 1% Kernel apiece. I've seen boincmgr higher with 7.0.64; it will probably take a couple of days to get there, as with 7.0.44/36. Hiding the active tasks reduces boincmgr Kernel to around 5%. Total CPU usage in the BOINC Status window is around 55%...with the tasks hidden.

I'll give the candidate a try.
ID: 1439712 · Report as offensive
Profile Michael W.F. Miles
Avatar

Send message
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1439720 - Posted: 7 Nov 2013, 16:50:43 UTC

I have noticed the CPU v7 multibeam tasks going to 5 hours per WU, compared to 2 hours, or 2.5 hours at most.
I have noticed this increase in run time since yesterday, and CPU usage is half of what it was before.
The cores were 95% - 100% loaded; now the tasks are using 50%.

Long time for these units.
ID: 1439720 · Report as offensive
Profile Michael W.F. Miles
Avatar

Send message
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1439742 - Posted: 7 Nov 2013, 17:22:03 UTC

Well, after thinking it was 7.2.27, I downloaded 7.2.28, and upon restarting BOINC I had 100% loading on all cores. 10 minutes later, while on the same WU, the load dropped to 50%.

Weird long run times




ID: 1439742 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1439764 - Posted: 7 Nov 2013, 17:41:06 UTC - in response to Message 1439672.  

...having read your second-to-last post again, I see you're using an older Alpha manager at the moment. So, maybe testing the release candidate v7.2.28 would be a good idea? You can find the links here. I'd still like to see the wxWidgets version number for the Manager you're using, please.

7.2.28, wxWidgets 2.8.10, appears to be about the same as 7.0.64. The average Kernel use in SIV is around 15% then drops to around 5% when I hide the three active tasks. It drops to just about nothing when minimized. It might be awhile to see if the ATI task stalls when the Manager is minimized. I changed the ATI AP App back to r1843.
ID: 1439764 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1440018 - Posted: 7 Nov 2013, 22:21:48 UTC - in response to Message 1439764.  

The first task using 7.2.28 finished normally. The next ones will be run with the Manager minimized. I just checked my Windows 8.1 Host running BOINC 7.0.64; SIV shows the boincmgr Kernel usage at around 1 to 2% while displaying basically the same three types of tasks as the XP Host.
ID: 1440018 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1440960 - Posted: 10 Nov 2013, 17:08:17 UTC - in response to Message 1440018.  
Last modified: 10 Nov 2013, 17:21:23 UTC

Still no ATI AP stalls using BOINC 7.2.28. I must admit though, since determining that it only stalls when the BOINC Manager is Minimized, I haven't been minimizing it that much. It did make it through a couple of nights minimized. So, for anyone having this problem I would recommend trying BOINC 7.2.28. The GPU Driver updates mentioned earlier wouldn't hurt either. I've also noticed this Host, Computer 7102466 is still needlessly trashing APs and depriving the Raccoons of much needed points. A properly functioning 5870 should produce around 30,000 RAC a day running APs correctly.

Also, after posting about the DELL P4, I realized I had never run it with SETI v7 tasks. Back on the 5th, I set it up to run the Lunatics v0.41 SSE2 CPU App. Other than the accumulated XP updates and updating to BOINC 7.0.64, that was the only change made since turning it off back in March. It is now approaching 6 days of 24/7 operation and the BOINC Manager still hasn't crashed. I think I will let it run a couple more days then give up on it.
ID: 1440960 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1441281 - Posted: 11 Nov 2013, 17:40:42 UTC

Oh well, just as soon as you post about it, it happens again. Same scenario. Minimize BOINC Manager, go to bed, then wake up to the ATI App showing much longer elapsed time than it should. It had just started the AP; almost 7 hours later it was showing 55% complete, when it should have finished at 6.5 hours. After maximizing the Manager, it was showing normal progress. The nVidia CUDA App and CPU AP App were not affected. The CUDA App shows normal progress during the same period. The main Monitor is running off the ATI card with an S-video adapter connected to the nVidia card. The ATI App works fine as long as you keep the BOINC Manager Maximized. There wasn't any Blanking on the ATI AP. Restarting the BOINC Manager didn't reset the Elapsed time.

It's working on another AP now, and this one does have some blanking. Based on past times, I'd say it will finish the partially blanked AP in about 7 hours, much better than the 9.5 it just took on the un-blanked one.

Just keep it Maximized....
ID: 1441281 · Report as offensive
Urs Echternacht
Volunteer tester
Avatar

Send message
Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 1441446 - Posted: 12 Nov 2013, 2:59:37 UTC - in response to Message 1441281.  

...
Just keep it Maximized....

Did you try to set <no_priority_change>1</no_priority_change> in cc_config.xml already ?
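
For readers who have not edited cc_config.xml before, here is a small Python sketch that builds the fragment in question. It assumes the flag belongs inside an `<options>` block under the `<cc_config>` root, which is how the BOINC client configuration documentation lays the file out. The helper name `build_cc_config` is made up for illustration, and the sketch only prints the XML; it does not touch the real file in the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_cc_config(no_priority_change=True):
    """Return a minimal cc_config.xml document as a string.

    Assumption: <no_priority_change> sits inside <options>, which sits
    inside the <cc_config> root element.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    flag = ET.SubElement(options, "no_priority_change")
    flag.text = "1" if no_priority_change else "0"
    return ET.tostring(root, encoding="unicode")


print(build_cc_config())
```

If you already have a cc_config.xml, merge the `<no_priority_change>` line into the existing `<options>` block by hand rather than overwriting the file, then restart the client so it picks up the change.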
_\|/_
U r s
ID: 1441446 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1441453 - Posted: 12 Nov 2013, 3:35:51 UTC - in response to Message 1441446.  

...
Just keep it Maximized....

Did you try to set <no_priority_change>1</no_priority_change> in cc_config.xml already ?

The ATI card is set to -HP in the CL file, the nVidia card is set to Normal in the mbcuda file, you can see it in the Stderr output. I'd rather not raise the priority of the CPU task. What's strange is the inconsistency. I've been noticing a few longer than normal tasks since I moved the card to this host. You can see the OP host also has inconsistency, Error tasks for computer 7061796...?
ID: 1441453 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1443112 - Posted: 16 Nov 2013, 17:24:40 UTC - in response to Message 1441453.  

Well, it's been 5 days since the last 'Stall'. The previous Stalls were occurring about every 2 or 3 days. Since leaving the BOINC Manager Maximized, the Stalls have disappeared. The un-blanked APs are all completing in a pretty consistent run-time, AstroPulse v6 tasks for computer 6979629
So, what's up with that?
ID: 1443112 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1456460 - Posted: 22 Dec 2013, 21:17:11 UTC - in response to Message 1443112.  

It appears I have found the solution on MY Host, Computer 6979629.
Leaving the BOINC Manager Maximized worked for the then-existing setup: the HD4670 in the 16x slot driving the monitor and the GTS250 in the 1x slot connected to an S-video adapter. Then I changed the setup to having an 8800GT in the 16x slot running the monitor and an HD6770 in the 1x slot connected to the S-Video connector. The 6770 began going idle, sometimes within 30 minutes. The Task Time Elapsed kept right on going even though SIV showed the GPU had ceased output, down-clocked, and cooled down. Again, the nVidia card wasn't affected. WTH?

After some time, I determined it must be the ATI Driver with Catalyst 12.1, since the nVidia card with driver 266.58 wasn't affected. Then I changed the Windows XP Power Setting from putting the monitor to sleep after 2 hours to NEVER. So far, no more idle moments for the 6770. Apparently, the ATI driver thought that setting meant it was fine to kill the GPU whenever it wanted, as long as there wasn't any movement on the screen.

Now to see what happens overnight after I turn the monitor Off...

ID: 1456460 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1457947 - Posted: 27 Dec 2013, 15:37:23 UTC - in response to Message 1456460.  

Well, that worked for about a day....

I went back to what worked before...Just Leave It Maximized.
Since I had the XP ATI desktop to the right of the NV desktop, I just extended the Event Log window so part of it was on the ATI desktop. That works, but sometimes it's a few minutes before there is activity in the Log, and the ATI card will Sleep for those few minutes. I needed something small and simple to create motion on the ATI desktop constantly. A simple fix that works:
http://softadvice.informer.com/Free_Desktop_Clock_For_Windows_Xp.html
ClocX. It Works. Install it, right-click on the tray icon, choose Options.../Appearance, and check 'Draw second hand'. Then drag it to the ATI Desktop. The second hand is all the motion you need to keep the ATI card from Sleeping.
Look at the Consistent Runtimes for the tasks with a CPU time of around 200 seconds. They have all been within a few seconds of each other since I started using the clock, because the ATI card is Not Sleeping.
All AstroPulse v6 tasks for computer 6979629
Works for me.
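
If you would rather not install a clock program, the same idea can be sketched in a few lines of Python: a tiny window whose text is redrawn once per second. This is an untested stand-in for ClocX, not something that was run on the host above. It assumes a Python install with tkinter, and it assumes that any once-a-second redraw on the ATI desktop counts as enough "motion" for the driver. The names `clock_text` and `run_clock` are made up for illustration.

```python
import time


def clock_text(now=None):
    """Return HH:MM:SS for the given struct_time (defaults to local time)."""
    return time.strftime("%H:%M:%S", now if now is not None else time.localtime())


def run_clock():
    """Open a small window whose label is redrawn once per second."""
    import tkinter as tk  # imported here so clock_text() works without a display

    root = tk.Tk()
    root.title("tick")
    label = tk.Label(root, font=("Courier", 24))
    label.pack(padx=10, pady=10)

    def tick():
        label.config(text=clock_text())
        root.after(1000, tick)  # schedule the next redraw in 1000 ms

    tick()
    root.mainloop()
```

Call `run_clock()` to open the window, then drag it onto the ATI desktop the same way as the ClocX clock.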
ID: 1457947 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.