AP task (NV) hitting 2GB memory limit and resetting over and over again
Author | Message |
---|---|
Harri Liljeroos Joined: 29 May 99 Posts: 4264 Credit: 85,281,665 RAC: 126 |
Hi, here's the task: http://setiathome.berkeley.edu/workunit.php?wuid=1835824773 It is running on my GTX970, and both CPU usage and memory usage are very high. CPU usage is about 95%, and memory usage climbs to 2 GB, at which point the application restarts itself. The application is Lunatics AP7_win_x86_SSE2_OpenCL_NV_r2721.exe with parameters: -unroll 18 -oclFFT_plan 256 16 256 -use_sleep -ffa_block 16384 -ffa_block_fetch 8192 -hp -tune 1 64 8 1 -tune 2 64 8 1 Is there something I could try before aborting it? |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
"Is there something I could try before aborting it?" That depends on how many units you are running at one time on the GPU. How many units is the CPU doing at one time? Are you using the iGPU? |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
I must ask, why do you think the unroll command is doing it? I would delete the unroll command, restart BOINC and see if that stops it, instead of aborting the unit, if that is what you meant. |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
I also just looked at your machine. The Nvidia driver is 344 and it has OpenCL 1.1, but your iGPU has OpenCL 1.2. Possible problems there, a driver mismatch? Upgrade your Nvidia driver or, better, roll back the iGPU HD 4000 driver to an earlier one. Sorry, I can't tell you which one, I haven't got an iGPU. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
Hi, upgrade your Lunatics to v0.43a, which contains the r2737 app that eliminates this problem. |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Hi, just wondering, Richard: I thought there was only a problem with the older cards, and that's why Eric made the 0.43a version? Just being curious. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Hi Eric didn't make the Lunatics 0.43a Installer, Richard did that. Claggy |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
Hi *I* made the v0.43a version, because after v0.43 went live, it turned out that some special cases - like this workunit - had been missed in testing, and were badly handled by the application which Raistmer had originally supplied to me for distribution. |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Oops, sorry Richard, I thought it was one of you but wasn't sure. Getting old, the brain doesn't remember well. |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Oops, sorry, and thanks for the info. |
Harri Liljeroos Joined: 29 May 99 Posts: 4264 Credit: 85,281,665 RAC: 126 |
I just installed Lunatics 0.43a, let's see what happens. A few answers to your questions: - I am running 3 WUs at a time on the 970. Two were APs and one was Einstein BRP6. - Two CPU cores were free to feed the GPU. - The Intel GPU was not used. - 6 WUs were running on the CPU: 2 MBs and 4 CPDN PNWs. Thank you, everybody, for helping. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I have 970s and have run Lunatics 0.43a for quite a while now. I too typically run two SETI tasks and one BRP6 task on each card at the same time with no issues. Or two SETI tasks and one MW task on each card. That is the only time I see an increase in completion times for SETI tasks on that card. MW tasks take more computing power and more PCIe bus time and slow down other tasks. Keith Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Harri Liljeroos Joined: 29 May 99 Posts: 4264 Credit: 85,281,665 RAC: 126 |
The problematic WU finished without a major hitch with the new app, and so have the rest of the AP WUs since then. The only thing to point out is that the app seems to use a high amount of CPU from time to time, which causes some lag on the computer. For example, the clock app stops for maybe 5 seconds before catching up again. And when running two APs at the same time, the apps seem to sync that high CPU usage, making it seem even worse. Maybe removing the -hp parameter from the command line or allowing only one AP at a time in app_config could improve the user experience. |
rob smith Joined: 7 Mar 03 Posts: 22286 Credit: 416,307,556 RAC: 380 |
CPU use depends on a number of things, such as the amount of radar blanking that was applied to the data. (I think that if there is a lot of blanking, a lot of CPU use is required.) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"CPU use depends on a number of things such as the amount of radar blanking that was applied to the data. (I think if there is a lot of blanking a lot of CPU use is required)" That was with Astropulse v6. We've moved on to Astropulse v7 now, which doesn't have to do those calculations, so no extra CPU use. Claggy |
rob smith Joined: 7 Mar 03 Posts: 22286 Credit: 416,307,556 RAC: 380 |
...That's strange, because I just did a check on one of my crunchers. It has 2 GTX980 GPUs and an AMD FX8 CPU. I'm running 2 AP tasks per GPU. Windows Task Manager reports: 4 off AP7_win_64_AVX_CPU_r2692.exe, each at ~13% (1 CPU core each), and 4 off AP7_win_x86_SSE2_OpenCL_NV-r2737.exe*32, each at ~13% (1 CPU core each). All 8 CPU cores are "red lined" at 100%. The BOINC GUI manager says I'm running four GPU (openCL_Nvidia_100) and four CPU tasks (sse2). So what is happening? Are the "CPU" tasks actually using two cores each, or are the GPU cores using "50%" of a GPU plus a CPU core each, or what? Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
"So what is happening? Are the "CPU" tasks actually using two cores each, or are the GPU cores using "50%" of a GPU plus a CPU core each, or what?" It's the OpenCL_NV application which has the high level of CPU use, nothing to do with radar blanking in that case. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"So what is happening? Are the "CPU" tasks actually using two cores each, or are the GPU cores using "50%" of a GPU plus a CPU core each, or what?" That's because of the Nvidia OpenCL 100% CPU usage bug that has been in existence since the 27x.xx drivers. There is also a school of thought that the NV OpenCL apps could be programmed differently to work round that problem. Claggy |
Mike Joined: 17 Feb 01 Posts: 34283 Credit: 79,922,639 RAC: 80 |
The AP app has the -use_sleep switch to reduce CPU usage significantly. Read the read me file please. With each crime and every kindness we birth our future. |
rob smith Joined: 7 Mar 03 Posts: 22286 Credit: 416,307,556 RAC: 380 |
Thanks all. Edited the AP command line file on the machine in question. Argh, why does it decide to run a pile of MBs just before I started to do the edit, so now I'll have to wait until a pile of APs start.... (In fact bigger ARGHHH - no APs around for the GPUs, so an even longer wait...) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
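
Harri's closing suggestion in the thread (dropping -hp from the command line, or allowing only one AP task at a time through app_config) could be sketched roughly as below. This is a minimal Python sketch and not something posted in the thread: the helper names `drop_flag` and `build_app_config` are made up for illustration, and the app name `astropulse_v7` is an assumption that should be checked against the `<name>` entries in your own client_state.xml. The command line is the one quoted in the first post, and `<max_concurrent>` is a standard app_config.xml element.

```python
import xml.etree.ElementTree as ET

# Command line quoted in the first post of the thread.
CMDLINE = (
    "-unroll 18 -oclFFT_plan 256 16 256 -use_sleep -ffa_block 16384 "
    "-ffa_block_fetch 8192 -hp -tune 1 64 8 1 -tune 2 64 8 1"
)


def drop_flag(cmdline: str, flag: str) -> str:
    """Remove a standalone switch (one that takes no value) from a command line."""
    return " ".join(tok for tok in cmdline.split() if tok != flag)


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text limiting how many tasks of one app run at once.

    app_name is an ASSUMPTION here; use the name listed in client_state.xml.
    """
    text = "\n".join([
        "<app_config>",
        "  <app>",
        f"    <name>{app_name}</name>",
        f"    <max_concurrent>{max_concurrent}</max_concurrent>",
        "  </app>",
        "</app_config>",
    ]) + "\n"
    ET.fromstring(text)  # raises ParseError if the text is not well-formed XML
    return text


if __name__ == "__main__":
    print(drop_flag(CMDLINE, "-hp"))
    print(build_app_config("astropulse_v7", 1))
```

The generated text would go into app_config.xml in the project's folder under the BOINC data directory and take effect once the client re-reads its config files (for example after a BOINC restart). Note that max_concurrent counts every task of that app, CPU ones included, so on a host that also crunches AP on the CPU it limits those too.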