CUDA App Memory Usage

Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1231704 - Posted: 13 May 2012, 20:22:25 UTC - in response to Message 1231695.  


Yes, it has only 1 GB RAM, NVIDIA GeForce 6150SE nForce 430 (driver 191.07)

I just say that probably >90% of users that run 32 bit Windows use PAE without even knowing that it is enabled.
(DEP is enabled by default after Win XP SP2 and today's CPUs support hardware DEP, so PAE is ON by default on most computers)



Agreed! Sadly, that doesn't make the drivers LME compliant, nor the older graphics card DEP & PAE capable. I'd love to see a memory map from such a 32 bit Windows system with PAE adding the VRAM to maxed-out system RAM, but I won't hold my breath ;)

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1231712 - Posted: 13 May 2012, 20:31:00 UTC - in response to Message 1231699.  


Where did you check?
There is no option to disable DEP (and, with it, PAE on "new" CPUs) unless you edit boot.ini directly.




(But if the CPU is older, e.g. a P4 (which does not qualify as made 2006+), and has no hardware support for DEP,
you are right: you have software DEP and no PAE enabled.)


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
Wembley
Volunteer tester
Joined: 16 Sep 09
Posts: 429
Credit: 1,844,293
RAC: 0
United States
Message 1231716 - Posted: 13 May 2012, 20:35:35 UTC

[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional" /fastdetect /NoExecute=OptOut /OneCPU




/NoExecute turns on /PAE, and I still have 3.5G of usable ram.
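
As a minimal sketch (not part of the original post), the switches on a boot.ini entry like the one above can be pulled apart in Python to see what is set. The function names `parse_boot_switches` and `pae_expected` are made up for illustration, and the rule that any `/NoExecute` policy other than `AlwaysOff` brings up the PAE kernel is taken from the discussion in this thread and assumes a CPU with hardware DEP.

```python
def parse_boot_switches(entry: str) -> dict:
    """Split the switches off a boot.ini [operating systems] entry.

    Returns a dict mapping lower-cased switch names to their value,
    or None for bare switches such as /fastdetect.
    """
    # Everything after the closing quote of the description holds the switches.
    _, _, tail = entry.rpartition('"')
    switches = {}
    for token in tail.split():
        if not token.startswith("/"):
            continue
        name, _, value = token[1:].partition("=")
        switches[name.lower()] = value or None
    return switches


def pae_expected(switches: dict) -> bool:
    """Assumption from the thread: with hardware DEP, /NoExecute implies PAE."""
    if "pae" in switches:
        return True
    policy = switches.get("noexecute")
    return policy is not None and policy.lower() != "alwaysoff"


entry = (r'multi(0)disk(0)rdisk(0)partition(1)\WINDOWS='
         r'"Microsoft Windows XP Professional" /fastdetect /NoExecute=OptOut /OneCPU')
switches = parse_boot_switches(entry)
print(switches)
print("PAE expected:", pae_expected(switches))
```

This only inspects the text of the entry; whether the PAE kernel is actually running still depends on the CPU and the installed kernel.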
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1231719 - Posted: 13 May 2012, 20:40:11 UTC - in response to Message 1231716.  

/NoExecute turns on /PAE, and I still have 3.5G of usable ram.


Correct. The card & driver are not LME compliant, so it still gets mapped into the 32 bit system space.
Profile BilBg
Volunteer tester
Avatar

Send message
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1231734 - Posted: 13 May 2012, 21:00:55 UTC - in response to Message 1231719.  


But his video card has 1 GB of VRAM, and still only 0.5 GB of system RAM is hidden/unused:
http://setiathome.berkeley.edu/show_host_detail.php?hostid=5108663

How does that happen?

Wembley, can you run SIV and show us the PCI BARs?
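
The arithmetic behind that question, as a rough Python sketch. The figures are the ones quoted in this thread (3.5 GB usable, 1 GB VRAM); assuming 4 GB installed and treating the whole gap below 4 GB as device-mapped address space are simplifications, and the function name is invented.

```python
GIB = 1024 ** 3


def reserved_below_4g(usable_ram_bytes: int) -> int:
    """Address space below 4 GiB that is not backed by usable system RAM."""
    return 4 * GIB - usable_ram_bytes


usable = int(3.5 * GIB)   # what Windows reports as usable
vram = 1 * GIB            # VRAM on the card
hole = reserved_below_4g(usable)

print(f"reserved: {hole / GIB:.2f} GiB, VRAM: {vram / GIB:.2f} GiB")
print("whole VRAM fits in the hole:", vram <= hole)
```

So under these assumptions the reserved range is smaller than the VRAM, which is why the full 1 GB cannot be mapped at once.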






 


Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1231739 - Posted: 13 May 2012, 21:06:11 UTC - in response to Message 1231734.  
Last modified: 13 May 2012, 21:07:04 UTC

How does that happen?


New-architecture drivers (hybrid XPDM/WDDM) are virtualised, so can switch in different areas. This is where the newer drivers' extra overhead comes in, to emulate WDDM, and if you use bigger than some amount performance will be very bad (worse than the old XP driver).
Profile BilBg
Volunteer tester
Avatar

Send message
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1231751 - Posted: 13 May 2012, 21:25:15 UTC - in response to Message 1231739.  

so can switch in different areas

You mean that different parts (pages?) of VRAM are dynamically mapped to be visible at different times in the same address space/range?
Can the driver be forced to use a smaller address range (a smaller "hole") into which to map e.g. 1/4 of the VRAM rather than 1/2 (even if it may be slower)?


and if you use bigger than some amount ...

"bigger than some amount" of what? ;) (I really don't know)


 


Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1231755 - Posted: 13 May 2012, 21:31:11 UTC - in response to Message 1231751.  

"bigger than some amount" of what? ;) (I really don't know)


VRAM. The concept is called 'paging' (virtualised video memory), and it isn't supported under the original XP Driver Model (XPDM, drivers <= 190.38 or so, which use physical addressing). The newer hybrid 'compatibility' model will page in the upper or lower portion, at extreme cost (to/from disk).
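
A toy model of what 'paging' means here, in Python. It only illustrates the general idea (a small resident window backed by a slower store) and is not how the NVIDIA driver is implemented; the class name, the page size and the dict standing in for the backing store are all invented for the example.

```python
from collections import OrderedDict


class PagedMemory:
    """Keeps at most `resident_limit` pages resident and evicts the least
    recently used page to a (slow) backing store when full."""

    def __init__(self, resident_limit: int):
        self.resident_limit = resident_limit
        self.resident = OrderedDict()   # page number -> data, oldest first
        self.backing_store = {}         # stands in for disk
        self.faults = 0                 # accesses to non-resident pages
        self.page_outs = 0              # evictions to the backing store

    def touch(self, page: int) -> bytearray:
        if page in self.resident:
            self.resident.move_to_end(page)   # mark as most recently used
            return self.resident[page]
        self.faults += 1
        if len(self.resident) >= self.resident_limit:
            victim, data = self.resident.popitem(last=False)
            self.backing_store[victim] = data  # the expensive step in real life
            self.page_outs += 1
        # Bring the page back from the store, or create it on first use.
        data = self.backing_store.pop(page, bytearray(4096))
        self.resident[page] = data
        return data


mem = PagedMemory(resident_limit=2)
for page in [0, 1, 0, 2, 1]:
    mem.touch(page)
print("faults:", mem.faults, "page-outs:", mem.page_outs)
```

Once the working set is larger than what can stay resident, every extra access turns into a round trip to the slow store, which is the "very bad" performance case described above.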
Profile BilBg
Volunteer tester
Avatar

Send message
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1231762 - Posted: 13 May 2012, 22:02:12 UTC - in response to Message 1231755.  

... to/from disk!?
Oh, I didn't expect this!
I expected something like: the video card (directed by the driver) knows which parts/pages of VRAM have to be currently "exposed" in/through the physical address range and "just do it" ;)
(as if the VRAM is 1/2 size and maps "the old way")
Not even a memcopy, more like flipping buffers.

Why is disk involved?


 


Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1231824 - Posted: 14 May 2012, 2:46:50 UTC - in response to Message 1231762.  

... to/from disk!?
Oh, I didn't expect this!
I expected something like: the video card (directed by the driver) knows which parts/pages of VRAM have to be currently "exposed" in/through the physical address range and "just do it" ;)
(as if the VRAM is 1/2 size and maps "the old way")
Not even a memcopy, more like flipping buffers.

Why is disk involved?


The older XP driver model uses a physical memory model for VRAM; the newer one is paged, to allow CUDA to operate by doing some of what WDDM does. (Notice the ~10% cost for old pre-Fermi cards without DMA engines on drivers after ~190.38 or so.)

For effective sharing of hardware between OS & applications, including security & reliability features, the NT/XP virtual memory model has a certain structure (without video under the earlier XP display model):
- non-paged kernel memory: must hold core OS stuff (including kernel drivers) & page tables for the rest, plus some memory page-locked by the kernel & applications
- paged kernel memory: must hold any resident pages of non-user memory that are actively in use ('paged in'); the rest is paged out to disk
- paged user memory: for applications & any later user-mode driver stub ends; inactive applications get aggressively paged out to disk on XP, to keep space for foreground applications, active background services, etc.

From the PAE article on Wikipedia:
Microsoft Windows implements PAE if booted with the appropriate option, but current 32-bit desktop editions enforce the physical address space within 4GB even in PAE mode. According to Geoff Chappell, Microsoft limits 32-bit versions of Windows to 4GB as a matter of its licensing policy,[2] and Microsoft Technical Fellow Mark Russinovich says that some drivers were found to be unstable when encountering physical addresses above 4GB.[3] Unofficial kernel patches for Windows Vista and Windows 7 32-bit are available[4] that break this enforced limitation, though the stability is not guaranteed.


So you're stuck with a 4 GB total address space due to policy & driver limitations.
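
For the numbers behind that, a small Python sketch: 32 address bits reach 4 GiB, and PAE as originally specified widens physical addresses to 36 bits (64 GiB). The 4 GiB clamp models the licensing limit described in the quote above; the function names are made up for illustration.

```python
GIB = 1024 ** 3


def addressable_bytes(address_bits: int) -> int:
    """Size of an address space with the given number of address bits."""
    return 2 ** address_bits


def usable_physical(address_bits: int, licence_cap: int) -> int:
    """Physical address space the OS will actually use under a cap."""
    return min(addressable_bytes(address_bits), licence_cap)


print(addressable_bytes(32) // GIB)           # 4
print(addressable_bytes(36) // GIB)           # 64
print(usable_physical(36, 4 * GIB) // GIB)    # 4
```

Anything that has to be mapped below the 4 GiB line (such as VRAM apertures) therefore comes out of the same 4 GiB, PAE or not.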

Jason
Wembley
Volunteer tester
Joined: 16 Sep 09
Posts: 429
Credit: 1,844,293
RAC: 0
United States
Message 1231848 - Posted: 14 May 2012, 3:47:48 UTC


Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1232117 - Posted: 14 May 2012, 19:07:50 UTC
Last modified: 14 May 2012, 19:14:32 UTC


This is good to read:

Licensed Memory in 32-Bit Windows (by Geoff Chappell)
http://www.geoffchappell.com/notes/windows/license/memory.htm

At the end is the part about Windows XP SP2/SP3

"As much as anyone talks of defective 32-bit drivers that simply assume a 32-bit physical address space, it will forever remain that the most prominent example of such an assumption is what Microsoft itself coded into the 32-bit HAL for Windows XP SP2.
The difference is that Microsoft would have us believe that when Microsoft writes code that makes such an ordinarily faulty assumption, it’s not a defect but a defence. ......."


 


Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1232126 - Posted: 14 May 2012, 19:22:31 UTC - in response to Message 1232117.  

Great, that clarifies that :). So, according to that, the drivers should 'probably' work if they could, but Enterprise or Datacenter editions/licences would be needed for >4 GB (incl. mapped VRAM), and there are some other performance-related effects to consider. Not sure it's any less of a dead end, but at least it's clearer ;)

Jason


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.