VLARs may be compounding the Scheduler problems

Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 32170
Credit: 79,922,639
RAC: 181
Germany
Message 1298577 - Posted: 25 Oct 2012, 9:38:02 UTC - in response to Message 1298541.  

Oh, I prefer it if VLAR WUs stay off ATI GPUs. I know they don't suffer nearly as badly as NV ones in terms of processing time, but my GUI responsiveness goes down dramatically with VLAR WUs. If the scheduler were to go back to sending VLAR WUs to ATI GPUs, I would hope there'd be an option to disable that, please.


You can just use a higher period_iterations_num value, so it's no real problem.
New apps will have it as the default.

With each crime and every kindness we birth our future.
ID: 1298577 · Report as offensive
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 1,255
Australia
Message 1298541 - Posted: 25 Oct 2012, 5:53:16 UTC

Oh, I prefer it if VLAR WUs stay off ATI GPUs. I know they don't suffer nearly as badly as NV ones in terms of processing time, but my GUI responsiveness goes down dramatically with VLAR WUs. If the scheduler were to go back to sending VLAR WUs to ATI GPUs, I would hope there'd be an option to disable that, please.
Soli Deo Gloria
ID: 1298541 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1296795 - Posted: 19 Oct 2012, 4:32:49 UTC - in response to Message 1296794.  

I've been running 296.10 since it came out. Don't remember
what I was running before that.
ID: 1296795 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 12990
Credit: 208,696,464
RAC: 690
Australia
Message 1296794 - Posted: 19 Oct 2012, 4:25:05 UTC - in response to Message 1296754.  

It may have something to do with the driver version

Could be.
When the VLARs were coming through for the GPU, I had no problem running 2 at a time on my GTX 560Ti (Vista) or GTX 460 (Win7), using driver 275.33 on both.


Grant
Darwin NT
ID: 1296794 · Report as offensive
Wiggo "Democratic Socialist"
Joined: 24 Jan 00
Posts: 18397
Credit: 261,360,520
RAC: 1,109
Australia
Message 1296754 - Posted: 18 Oct 2012, 23:57:46 UTC - in response to Message 1296750.  
Last modified: 18 Oct 2012, 23:58:05 UTC


(snippage)

What happened to mine was that as each VLAR was processed it seemed to leave a bit behind in video memory, and as more VLARs were processed more video RAM was taken up until the cards just ran out of memory.
Now I could've babysat each one through 1 at a time, but it saved more time just to abort the lot of them that I had (and there were quite a lot of them).

Cheers.


Well, ran and am still running, VLARs on my GTX560Ti cards 3 at a time
and no problems seen, even after one stretch of over 12 hours straight.

No increase in memory utilization seen using GPUz.
Perhaps your using Vista has something to do with it.

That could be possible but then why did my Win7 rig with dual GTX550Ti's do the exact same thing?

It may also have something to do with the driver version I use, but until I have to, I'm not updating it just to try them again, as other software here must also be stable with the drivers I use.

Cheers.
ID: 1296754 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1296750 - Posted: 18 Oct 2012, 23:14:24 UTC - in response to Message 1296010.  


(snippage)

What happened to mine was that as each VLAR was processed it seemed to leave a bit behind in video memory, and as more VLARs were processed more video RAM was taken up until the cards just ran out of memory.
Now I could've babysat each one through 1 at a time, but it saved more time just to abort the lot of them that I had (and there were quite a lot of them).

Cheers.


Well, ran and am still running, VLARs on my GTX560Ti cards 3 at a time
and no problems seen, even after one stretch of over 12 hours straight.

No increase in memory utilization seen using GPUz.
Perhaps your using Vista has something to do with it.
ID: 1296750 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1296060 - Posted: 17 Oct 2012, 5:01:42 UTC - in response to Message 1296010.  

Sorry 'bout that, twas in a bit of a hurry.

So far I have done 12 of them, 3 at a time, on 2 GTX560Ti's.
They're averaging ~2h 15min apiece.

The CPUs were averaging 2h 25min.

No video problems noticed yet. I think I'll let it run overnight and see what it looks like in the morning.
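
For anyone wanting to compare those timings, here is a rough throughput sketch in Python. It assumes "3 at a time" means three tasks running concurrently on each card, and that the CPU figure is for one task per core; neither is stated outright in the post, and the function name is made up for illustration.

```python
def tasks_per_hour(concurrent: int, hours_per_task: float) -> float:
    """Completed tasks per hour for one device running `concurrent` tasks at once."""
    return concurrent / hours_per_task

# Assumption: 3 VLARs at once per GTX560Ti, ~2h 15min each.
gpu_rate = tasks_per_hour(3, 2 + 15 / 60)
# Assumption: 1 VLAR per CPU core, 2h 25min each.
cpu_rate = tasks_per_hour(1, 2 + 25 / 60)

print(f"GPU card: {gpu_rate:.2f} tasks/hour, CPU core: {cpu_rate:.2f} tasks/hour")
```

Under those assumptions one card gets through roughly three times as many VLARs per hour as one CPU core, even though each individual task takes about as long.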
ID: 1296060 · Report as offensive
Wiggo "Democratic Socialist"
Joined: 24 Jan 00
Posts: 18397
Credit: 261,360,520
RAC: 1,109
Australia
Message 1296010 - Posted: 17 Oct 2012, 1:25:29 UTC - in response to Message 1295993.  

Early NVidia GPUs had a problem running VLAR tasks.

Sorry but it's not just early ones that suffer the problem as I found that out myself recently. They started out fine at 1 or 2 now and again but once they got to a constant stream they would eventually leave the system bogged down in molasses plus start to produce invalid/errored work that only a complete system restart would temporarily fix (obviously they got aborted really quick).

I was hoping you had done some investigation into exactly what happened
to cause the bog down.

Some systems may be fine with them so long as the card is only doing 1 at a time, but I can't say for certain as I prefer my later cards to do 2 at a time.

Mine have no problem with two at a time. I think I'll do some reshuffling to see
what happens with three at a time.


Also for those chasing RAC forget them as they take twice as long at least to process for less than half the usual credit (sorry I'm not one but these people also need to know this too).

Cheers.



I nearly missed this, Bill. You really need to learn how to quote and reply better, mate, but to answer your question:

I was hoping you had done some investigation into exactly what happened
to cause the bog down.


What happened to mine was that as each VLAR was processed it seemed to leave a bit behind in video memory, and as more VLARs were processed more video RAM was taken up until the cards just ran out of memory.
Now I could've babysat each one through 1 at a time, but it saved more time just to abort the lot of them that I had (and there were quite a lot of them).

Cheers.
ID: 1296010 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1295993 - Posted: 16 Oct 2012, 23:30:56 UTC - in response to Message 1295931.  

Early NVidia GPUs had a problem running VLAR tasks.

Sorry but it's not just early ones that suffer the problem as I found that out myself recently. They started out fine at 1 or 2 now and again but once they got to a constant stream they would eventually leave the system bogged down in molasses plus start to produce invalid/errored work that only a complete system restart would temporarily fix (obviously they got aborted really quick).

I was hoping you had done some investigation into exactly what happened
to cause the bog down.

Some systems may be fine with them so long as the card is only doing 1 at a time, but I can't say for certain as I prefer my later cards to do 2 at a time.

Mine have no problem with two at a time. I think I'll do some reshuffling to see
what happens with three at a time.


Also for those chasing RAC forget them as they take twice as long at least to process for less than half the usual credit (sorry I'm not one but these people also need to know this too).

Cheers.

ID: 1295993 · Report as offensive
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 32170
Credit: 79,922,639
RAC: 181
Germany
Message 1295962 - Posted: 16 Oct 2012, 14:06:48 UTC


I actually still have 235 VLARs in my cache.
Assigned to CPU of course.

With each crime and every kindness we birth our future.
ID: 1295962 · Report as offensive
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1295956 - Posted: 16 Oct 2012, 13:57:49 UTC - in response to Message 1295931.  
Last modified: 16 Oct 2012, 13:58:21 UTC

Early NVidia GPUs had a problem running VLAR tasks.

Sorry but it's not just early ones that suffer the problem as I found that out myself recently. They started out fine at 1 or 2 now and again but once they got to a constant stream they would eventually leave the system bogged down in molasses plus start to produce invalid/errored work that only a complete system restart would temporarily fix (obviously they got aborted really quick).

Some systems may be fine with them so long as the card is only doing 1 at a time, but I can't say for certain as I prefer my later cards to do 2 at a time.

Also for those chasing RAC forget them as they take twice as long at least to process for less than half the usual credit (sorry I'm not one but these people also need to know this too).

Cheers.


Haven't seen any VLAR WUs, neither for CPU nor for ATI GPU?! Mostly ~0.4 AR to VHAR work.
ID: 1295956 · Report as offensive
Wiggo "Democratic Socialist"
Joined: 24 Jan 00
Posts: 18397
Credit: 261,360,520
RAC: 1,109
Australia
Message 1295931 - Posted: 16 Oct 2012, 11:25:49 UTC - in response to Message 1295782.  
Last modified: 16 Oct 2012, 11:30:42 UTC

Early NVidia GPUs had a problem running VLAR tasks.

Sorry but it's not just early ones that suffer the problem as I found that out myself recently. They started out fine at 1 or 2 now and again but once they got to a constant stream they would eventually leave the system bogged down in molasses plus start to produce invalid/errored work that only a complete system restart would temporarily fix (obviously they got aborted really quick).

Some systems may be fine with them so long as the card is only doing 1 at a time, but I can't say for certain as I prefer my later cards to do 2 at a time.

Also for those chasing RAC forget them as they take twice as long at least to process for less than half the usual credit (sorry I'm not one but these people also need to know this too).

Cheers.
ID: 1295931 · Report as offensive
Mark St. Denis
Joined: 20 May 99
Posts: 27
Credit: 8,577,900
RAC: 0
United States
Message 1295899 - Posted: 16 Oct 2012, 8:37:18 UTC - in response to Message 1295782.  

Thank you for the info. I just found the other thread about it. :)
ID: 1295899 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 45
United States
Message 1295782 - Posted: 15 Oct 2012, 23:11:55 UTC - in response to Message 1295723.  
Last modified: 15 Oct 2012, 23:14:58 UTC

Just noticed that I have 28 errors from VLARs. All of them had quick turnaround times, some of them under 8 minutes. I think we still have a scheduling problem.

Looks like you got hit with the "No VLARs to NVidia GPUs" rule. There is a thread about it, but the short answer is:

Early NVidia GPUs had a problem running VLAR tasks. So a change was made to the Scheduler to prevent VLARs from being sent to NVidia GPUs. What probably happened was that your computer requested CPU work and got assigned 28 VLAR tasks. But the pipe was congested, so the scheduler reply with the assignment info got "lost", and your computer never got the message to download those tasks. A short time later, your computer asked for more work, this time for both CPU and GPU. The scheduler realized that you did not get the previously assigned VLARs, and tried to "resend" them. But since the Scheduler tries to fill the faster resource (GPU) first, it said "oops, can't send VLARs to an Nvidia GPU" and cancelled them/timed them out.

Maybe someday, the scheduler will be able to look at the model of GPU and apply the "No VLARs" rule only to older models that have the problem, but until then, this is how it works.
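
The sequence described above can be sketched roughly as follows. This is only an illustration of the behaviour as I understand it, not the actual scheduler code; the `Task` class, the `resend_lost_tasks` function and the resource names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    is_vlar: bool

def resend_lost_tasks(lost_tasks, requested_resources):
    """Return (resent, cancelled) for one work request (hypothetical model)."""
    # Assumption from the post: the faster resource (GPU) is filled first.
    target = "nvidia_gpu" if "nvidia_gpu" in requested_resources else "cpu"
    resent, cancelled = [], []
    for task in lost_tasks:
        if task.is_vlar and target == "nvidia_gpu":
            # "No VLARs to NVidia GPUs" rule: the resend is refused,
            # so the task ends up cancelled / timed out.
            cancelled.append(task)
        else:
            resent.append(task)
    return resent, cancelled

# 28 VLARs were assigned to the CPU, but the reply never arrived.
lost = [Task(f"vlar_{i}", is_vlar=True) for i in range(28)]

# The next request asks for both CPU and GPU work.
resent, cancelled = resend_lost_tasks(lost, {"cpu", "nvidia_gpu"})
print(len(resent), "resent,", len(cancelled), "cancelled")
```

In this model a CPU-only follow-up request would have got all 28 tasks resent, which is why the errors only show up on hosts that ask for GPU work as well.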
Donald
Infernal Optimist / Submariner, retired
ID: 1295782 · Report as offensive
Mark St. Denis
Joined: 20 May 99
Posts: 27
Credit: 8,577,900
RAC: 0
United States
Message 1295723 - Posted: 15 Oct 2012, 19:40:34 UTC

Just noticed that I have 28 errors from VLARs. All of them had quick turnaround times, some of them under 8 minutes. I think we still have a scheduling problem.
ID: 1295723 · Report as offensive
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 18642
Credit: 416,307,556
RAC: 863
United Kingdom
Message 1291604 - Posted: 5 Oct 2012, 13:30:49 UTC

Sounds sensible Clive, maybe too sensible?
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1291604 · Report as offensive
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 156
United Kingdom
Message 1291401 - Posted: 4 Oct 2012, 22:03:46 UTC

You may think I have my head unscrewed, but...
I am wondering if it would be possible to have a switch in `Project preferences` to make VLAR a separate kind of work unit.
It already is, by definition, a `VLAR`.
Have it as an `opt in` choice.
My thinking is that we would have:

MB - S@H Enhanced
AP - Astropulse
VLAR on GPU
Other Applications

And if MB, AP and `other apps` were unticked, VLAR would be the only type of work unit to download.
I am nuts enough to run it that way if I could :¬)
This way the longest-running work units go to the quickest processor: the GPU.
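
To make the idea concrete, here is a minimal sketch of how such an opt-in switch might filter GPU work. The preference names and the `allowed_on_gpu` function are invented for illustration; nothing like this exists in the real project preferences as far as this thread says.

```python
from dataclasses import dataclass

@dataclass
class Prefs:
    mb: bool = True            # MB - S@H Enhanced
    ap: bool = True            # AP - Astropulse
    vlar_on_gpu: bool = False  # proposed switch, opt-in, so off by default
    other: bool = True         # Other Applications

def allowed_on_gpu(kind: str, prefs: Prefs) -> bool:
    """Would a work unit of this kind be sent to the GPU under these prefs?"""
    switches = {
        "mb": prefs.mb,
        "ap": prefs.ap,
        "vlar": prefs.vlar_on_gpu,
        "other": prefs.other,
    }
    return switches[kind]

# Everything unticked except the VLAR switch: only VLARs come down.
vlar_only = Prefs(mb=False, ap=False, vlar_on_gpu=True, other=False)
print([k for k in ("mb", "ap", "vlar", "other") if allowed_on_gpu(k, vlar_only)])
```

With the defaults, nobody gets VLARs on the GPU unless they tick the box, so the current behaviour would stay the same for everyone who doesn't opt in.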
ID: 1291401 · Report as offensive
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 18642
Credit: 416,307,556
RAC: 863
United Kingdom
Message 1291361 - Posted: 4 Oct 2012, 20:01:41 UTC
Last modified: 4 Oct 2012, 20:03:17 UTC

Wasn't that the rule that was put in place to flush the VLARs that were wrongly sent to Nvidia GPUs? For a few weeks I was only getting VLARs on the CPU; now they are popping up regularly on the GPU.

Another load have just arrived as resends - indeed the whole of the most recent batch of downloads - which are not tagged as "resends" - are VLARS going to an Nvidia GPU.



edit:
(call me a twit - the latest batch are "long file name" ones not VLAR)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1291361 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 12990
Credit: 208,696,464
RAC: 690
Australia
Message 1291359 - Posted: 4 Oct 2012, 19:53:06 UTC - in response to Message 1291241.  

It certainly looks as though the "don't send VLARs to (Nvidia) GPUs" rule has been turned off - over the last few days I've had a few dozen, which have timed out within a few hours.

From what I can tell, the "Do not send VLARs to Nvidia GPUs" rule is still in effect. That's why some WUs time out straight away on lost WU resends: they were initially allocated to the CPU, but get resent to the GPU, which isn't meant to get them.
Grant
Darwin NT
ID: 1291359 · Report as offensive
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 18642
Credit: 416,307,556
RAC: 863
United Kingdom
Message 1291241 - Posted: 4 Oct 2012, 16:18:51 UTC

It certainly looks as though the "don't send VLARs to (Nvidia) GPUs" rule has been turned off - over the last few days I've had a few dozen, which have timed out within a few hours. Not good, but given this is a few dozen out of over a thousand WUs, it's not a big impact on the servers.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1291241 · Report as offensive
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.