OpenCL AstroPulse crash after processing completion - write here.
Author | Message |
---|---|
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I shall limp along until the new builds are available. My 1763 WU cache now has 1704 AP WUs in it. So even opting out of AP would take me a long time to clear. Thank you for your reply, I shall watch this thread for further news. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
BOINC is latest version. I even have 2 cores "free" of work for a single GPU. If there isn't any special need to run a certain video driver, Version: 266.58 WHQL should work fine on your Host. It DOESN'T use a full CPU core to process Astropulse tasks and is also good up to, and including, CUDA 3.2. I use it on both my NV 8800 & GTS 250 to run APs when I desire. From my tests, there isn't any slowdown from using 266.58 instead of a driver that uses a full CPU core. The release notes say it is good for GeForce GTX 580s and below. You would need to run the Clean Install Option... GeForce/ION Driver Release 266.58 WHQL |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
BOINC is latest version. I even have 2 cores "free" of work for a single GPU. That would be worth a try for Mark also then. With each crime and every kindness we birth our future. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
BOINC is latest version. I even have 2 cores "free" of work for a single GPU. Except that I need higher driver revisions to run 4.2 and 5.0 Cuda, which are the versions best suited to the MB apps on my GPUs. Meowsigh. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
BOINC is latest version. I even have 2 cores "free" of work for a single GPU. It's not connected with free cores. It shouldn't be, at least. As an experiment, try downgrading BOINC. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Ibo Send message Joined: 22 Mar 00 Posts: 6 Credit: 6,075,931 RAC: 0 |
As an experiment, try downgrading BOINC. Back on 7.0.28. It might take a while until I get some AP units (one is in the queue). So far the error rate has been 2 out of 3. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
As an experiment, I raised the priority of boinc.exe to high along with the app on the 3 problem rigs I have. Nope, that didn't stop the errors either. Oh well, it was a thought. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
As an experiment, I raised the priority of boinc.exe to high along with the app on the 3 problem rigs I have. Nope, that didn't stop the errors either. Oh well, it was a thought. Sorry....I have been trying to mitigate the damage. But it's still more than I am willing to accept. Hoping that Raistmer can come up with a build that will work better on those rigs. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I have one idea... but it doesn't explain why you see issues on older BOINC too. What about trying older NV drivers for testing? I know it will hurt your CUDA MB performance, and I don't consider this a solution or even a workaround, but it is worth trying to see whether it reduces the number of failures. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
I have one idea... but it doesn't explain why you see issues on older BOINC too. What I've suggested already. With each crime and every kindness we birth our future. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I have one idea... but it doesn't explain why you see issues on older BOINC too. I had older drivers on them....304.79, I believe. Mike had suggested updating them as one of the first things to try to get the errors to stop. I also just had another thought. Until Eric patches the servers to stop sending VLAR work to NV hosts (which he hoped to do today or tomorrow), I am still getting VLAR GPU work. Now, there has been much documentation about how badly VLAR on NV Cuda hosts can tie up the whole system, and will in fact slow down another non-VLAR task running at the same time. I am wondering if a VLAR running whilst the AP task is trying to finish could be having some impact here. They can and do cause problems with system hangs for some people. If the fix is implemented and my caches clear the VLAR tasks, it would be interesting to see if THAT might be having an impact here. I suspect it's not the answer, but... And the odd thing is that the app is running fine on my top host with 3 580s. Not a single AP error turned in. Same Boinc, same OS, still on 304.79 drivers. Ditto for my number 2 and 3 rigs....no AP errors. Now, the rigs having the problems are older mobos, processors, and are slower. One is my only dual core. Just more thinking out loud. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
VLARs are very useful for pulse-type signals because they allow better pulse accumulation (better sensitivity) and wider pulse sizes (a bigger parameter space for the search). Unfortunately, they don't map nicely to current GPU hardware (more precisely, our current algorithm implementation doesn't map too well). This can and will be improved over time. But there is no need to perceive VLAR tasks as some kind of trash tasks; they aren't trash tasks at all. And because some tapes contain mostly VLAR tasks, it would be good if GPUs were able to process them too. AFAIK ATi cards take less of a performance hit than NV ones, but new NV cards take less of a hit than pre-FERMI ones. So it was interesting to see whether modern GPUs can handle VLARs well enough. The flaming on the boards shows that it is "still not the right time". Maybe we could indeed get a separate checkbox for VLARs on GPU, to enable VLARs on GPU when no other GPU work is available. SETI apps news We're not gonna fight them. We're gonna transcend them. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
VLARs are very useful for pulse-type signals because they allow better pulse accumulation (better sensitivity) and wider pulse sizes (a bigger parameter space for the search). Unfortunately, they don't map nicely to current GPU hardware (more precisely, our current algorithm implementation doesn't map too well). This can and will be improved over time. But there is no need to perceive VLAR tasks as some kind of trash tasks; they aren't trash tasks at all. For now, I think Eric is just going to stop sending VLAR to NV. Same as he did for v6. I would not mind having the other approaches available as well. Of course, the ultimate would be an app for NV MB that does not choke on VLAR work so it would no longer be an issue. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Try this build for NV, and put the pdb file alongside the exe. https://dl.dropboxusercontent.com/u/60381958/AP6_win_x86_SSE2_OpenCL_NV_r1857.7z SETI apps news We're not gonna fight them. We're gonna transcend them. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
try this build for NV. Tried it on one host, did not work so well. Copied everything over except the readme, Authors, Copying, and Copyright files. Ran aimerge. Rebooted. It did start up and run 1857 instead of 1843, noticed in task manager right after startup that it was using a whole CPU core (might just have been getting things ready to run), but then it went into a loop restarting the task that had already been running, and then after a few seconds came back with exited but no finished file or something like that. You might have to reset the project, yada yada. And just kept looping like that. Re-ran the installer to get back to 1843. Should there have been a problem restarting a task that was already 60% completed with 1843? Or might I have done something wrong? Meow? "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Ibo Send message Joined: 22 Mar 00 Posts: 6 Credit: 6,075,931 RAC: 0 |
As an experiment, try downgrading BOINC. 7.0.28 did no good: http://setiathome.berkeley.edu/result.php?resultid=3033287726 I just started one using build 1857. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Yes, there is a problem swapping Apps in mid task. You might want to finish the ongoing one. The program creates binary files when it runs, if it finds that the App & Binaries don't match, it could cause problems. You need to start with a fresh task, so it places the correct files in the Slots. Since you have trashed so many, I don't see a problem trashing another. You might want to suspend all the tasks, stop, install the new App, start, then unsuspend one new task and see how it goes. |
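For anyone who would rather script the suspend / stop / install / start / unsuspend sequence described above, here is a rough sketch, not a tested procedure. It only builds the `boinccmd` command lines and does not run them. It assumes `boinccmd` is on the PATH, and the task names in the usage lines are made up for illustration. Installing the new app and restarting the client remain manual steps in between the two phases.

```python
from typing import List, Tuple

BOINCCMD = "boinccmd"  # assumption: the BOINC command-line tool is on PATH


def swap_plan(project_url: str,
              task_names: List[str],
              fresh_task: str) -> Tuple[List[List[str]], List[List[str]]]:
    """Build the command lines for swapping apps without touching a running task.

    Returns (before_install, after_install):
      before_install: suspend every task, then ask the client to quit.
      after_install:  resume exactly one task that has not started yet.
    The app install and the client restart happen by hand between the two.
    """
    if fresh_task not in task_names:
        raise ValueError("fresh_task must be one of task_names")

    # Phase 1: suspend everything so no task is mid-run, then stop the client.
    before_install = [
        [BOINCCMD, "--task", project_url, name, "suspend"]
        for name in task_names
    ]
    before_install.append([BOINCCMD, "--quit"])

    # Phase 2 (after installing the new app and restarting BOINC):
    # resume a single unstarted task so the new build writes its own files.
    after_install = [[BOINCCMD, "--task", project_url, fresh_task, "resume"]]
    return before_install, after_install


if __name__ == "__main__":
    # Hypothetical task names, for illustration only.
    before, after = swap_plan(
        "http://setiathome.berkeley.edu/",
        ["ap_task_a", "ap_task_b"],
        "ap_task_b",
    )
    for argv in before + after:
        print(" ".join(argv))
```

Each inner list can be passed to `subprocess.run` once you are happy with what it prints. Pick as the "fresh" task one that shows no elapsed time yet, so it starts from scratch with the new build.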
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Yes, there is a problem swapping Apps in mid task. You might want to finish the ongoing one. The program creates binary files when it runs, if it finds that the App & Binaries don't match, it could cause problems. You need to start with a fresh task, so it places the correct files in the Slots. Since you have trashed so many, I don't see a problem trashing another. You might want to suspend all the tasks, stop, install the new App, start, then unsuspend one new task and see how it goes. I'll not get that fancy. I'll just recopy the aistub and run aimerge again, and then manually abort the WU that is underway. We'll see if it starts a new one successfully. It's getting late here, but I'll give it a go. I would like to have it running whilst I get some sleep to see what's up in the morning. And thanks for that bit of info, TBar. Meowf. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Ibo Send message Joined: 22 Mar 00 Posts: 6 Credit: 6,075,931 RAC: 0 |
I just started one using build 1857. It is getting even more strange. I just saw that it set back the elapsed time from 1:53 to 1:30 several times. The following is contained several times in the stderr.txt. No idea if it is related. Are there any other things that might be of interest? ### Restart at 0.00 percent. state.fold_buf_size_short=65536; state.fold_buf_size_long=262144 s0=0;s1=0,s2=0 ERROR: some exception inside short FFA, probably video-driver restart, restarting app... Running on device number: 0 Priority of worker thread raised successfully Priority of process adjusted successfully, below normal priority class used OpenCL platform detected: NVIDIA Corporation BOINC assigns device 0 Info: BOINC provided device ID used Used GPU device parameters are: Number of compute units: 2 Single buffer allocation size: 128MB max WG size: 512 FERMI path used: no Build features: Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION USE_SSE2 x86 CPUID: Intel(R) Core(TM) i7 CPU Q 740 @ 1.73GHz Cache: L1=64K L2=256K CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 |
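To see how often a task went through that restart loop, the stderr.txt text can be counted up with a few lines of Python. This is a minimal sketch that only looks for the two strings visible in the log quoted above ("Restart at ... percent" and "probably video-driver restart"). The function name and the returned keys are my own invention, and other builds may word these lines differently.

```python
import re

# Matches lines such as "### Restart at 0.00 percent."
RESTART_RE = re.compile(r"Restart at\s+(\d+(?:\.\d+)?)\s+percent")

# Substring taken from the error line in the quoted stderr output.
DRIVER_RESTART_MARKER = "probably video-driver restart"


def summarize_stderr(text: str) -> dict:
    """Count app restarts and suspected driver restarts in a stderr dump."""
    percents = [float(m.group(1)) for m in RESTART_RE.finditer(text)]
    return {
        "restarts": len(percents),
        "restart_percents": percents,
        "driver_restart_errors": text.count(DRIVER_RESTART_MARKER),
    }
```

Read the file with something like `open("stderr.txt").read()` and pass the text in. If every restart is at 0.00 percent, as in the log above, the task never got past its first checkpoint before being thrown back to the start.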
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
OK, stop using this one and await the next attempt. SETI apps news We're not gonna fight them. We're gonna transcend them. |