AstroPulse v6 v6.04 (opencl_nvidia_100)
Author | Message |
---|---|
Roger Clark Joined: 6 Dec 12 Posts: 5 Credit: 2,990,609 RAC: 0 |
I just happened to look at running task 3136251711 and saw that it was stuck at 55.885% completion for several minutes while running on 0.1xx CPU + 1 NVIDIA GPU. When I brought up the GPU monitor, the Power% was fluttering between 70 and 95%, whereas other GPU tasks from SETI@home usually sit at ~80% power and burn through in about 10 minutes. I aborted the task because I'm not sure why it was stuck there. Has anyone else seen this kind of issue? |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
On the GPU, the 10-minute tasks are Multibeam tasks. The Astropulse tasks take about 1000-3000 seconds on the GPU, or more if a) they have high blanking and thus must use the CPU to do some calculations, or b) you run your 3930K with hyper-threading enabled (12 logical cores) and you have "use 100% of the processors" set. The remedy: "use 50% of the processors". -- petri33 To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Gatekeeper Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0 |
[Quote] I just happened to look at running task 3136251711 and saw that it was stuck at 55.885% completion for several minutes while running on 0.1xx CPU + 1 NVIDIA GPU. When I brought up the GPU monitor, the Power% was fluttering between 70 and 95%, whereas other GPU tasks from SETI@home usually sit at ~80% power and burn through in about 10 minutes. I aborted the task because I'm not sure why it was stuck there. [/Quote] Also, while progress on MB tasks is relatively linear, i.e., the "progress bar" continues to advance regularly throughout processing, the same is not the case with AP tasks. IMX, it isn't unusual for progress to "freeze" for upwards of 30 seconds or so on a high-end GPU (I use 580s and 590s). I suspect this could be longer on less powerful cards. I'd say you shouldn't have aborted the WU; it most likely would have finished normally. EDIT: Your GPU finished this task normally, so you don't have a problem with APs or your GPU. |
Roger Clark Joined: 6 Dec 12 Posts: 5 Credit: 2,990,609 RAC: 0 |
Thanks for the quick feedback. The previous Astropulse unit took 2258 sec; this one hit 55.xx% at 1991 sec. I'm running a GTX 670 with 4GB GDDR5 and 1344 CUDA cores, and the i7-3930 is running 50% of the processors at 100% of CPU time. It might have finished fine; I guess I was just a little jumpy with a 2-week-old rig that I'm "burning in" to check everything out, and I noticed the power fluttering rather than cycling... It's also the first time I'd seen an AstroPulse WU come down with GPU capabilities (the others have been v6 6.01 running on the main processor). |
Gatekeeper Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0 |
[Quote] Thanks for the quick feedback. The previous Astropulse unit took 2258 sec; this one hit 55.xx% at 1991 sec. [/Quote] The completed AP was only 2.44% blanked. Most likely the aborted one had a higher blanking % and therefore would have run longer. While most of my APs complete in under an hour, some highly blanked ones will take upwards of 2-2.5 hours. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
It's inevitable with the current app design. Blanking uses the Mersenne Twister random number generator in double precision on the CPU. If someone wants to implement it on the GPU, they're welcome to. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Roger Clark Joined: 6 Dec 12 Posts: 5 Credit: 2,990,609 RAC: 0 |
I'm curious what the app design looks like. NVIDIA CUDAZone says it's got a "drop-in" library, cuRAND, to do the twister on the GPU? |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
How about creating an eight-megabyte random file just once, storing it on disk, and blanking with that data on all subsequent AP tasks? To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
ivan Joined: 5 Mar 01 Posts: 783 Credit: 348,560,338 RAC: 223 |
[Quote] I'm curious what the app design looks like. NVIDIA CUDAZone says it's got a "drop-in" library, cuRAND, to do the twister on the GPU? [/Quote] The Mersenne Twister in that is single-precision (i.e. float rather than double). Not all CUDA devices can handle doubles a) natively or b) efficiently. [Edit] Although, looking at the header file, the generator can return doubles, the engine itself appears to be 32-bit: CURAND_RNG_PSEUDO_MTGP32. [/Edit] |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
It appears we are being set up for another multiweek AP outage. There is now a large number of 'tapes' and a large disparity in 'total channels to do:'. This usually means a long wait for the channel numbers to equalize. We could go another couple of weeks without APs, since most semi-fast machines run out in a few days. Is there some reason this is being done? I would much rather see the AP outage last for just a few days versus weeks. Is there some reason so many files are being loaded at once? Server status page |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
[Quote] It's inevitable with the current app design. [/Quote] Does that mean that an AP will use the CPU for some of its work even if the host is set to use GPU only? David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
[Quote] How about creating an eight-megabyte random file just once, storing it on disk, and blanking with that data on all subsequent AP tasks? [/Quote] Could work if the seed is always the same... SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
[Quote] It's inevitable with the current app design. [/Quote] Surprised? The CPU is always used, to one degree or another. Only the CPU can handle interrupts from the PC's other devices. SETI apps news We're not gonna fight them. We're gonna transcend them. |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
[Quote] It's inevitable with the current app design. [/Quote] Well, I know the CPU is used a little (.04 for Seti and .2 for Einstein, or vice versa), but I didn't know it got involved more than that. I think I'll have to turn off AP for my GPU-only machine. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
rob smith Joined: 7 Mar 03 Posts: 22190 Credit: 416,307,556 RAC: 380 |
It might have something to do with the size of the discs the "tapes" have been loaded onto. There appear to be about 50 tapes in a batch, which would make sense as they are using 3 and 3 terabyte drives for transfer. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
[Quote] How about creating an eight-megabyte random file just once, storing it on disk, and blanking with that data on all subsequent AP tasks? [/Quote] The seed is the same at the beginning of each AP WU, but the generator is not restarted for each of the 14080 passes through the input data. The generator continues, with its current state saved in the state file so that starting from a checkpoint will work. So rather than 8 megabytes you'd need ~118 gigabytes of saved data. The Mersenne Twister actually produces 32-bit unsigned values, which may then be converted to other numerical forms. The original version, which will run on any CPU, is what is used; the later SIMD version does not produce the same sequence of pseudo-random numbers, although it is equally good. Conversion of the uniformly distributed Twister output to normally distributed pseudo-random numbers using the Box–Muller transform probably accounts for much of the time required for blanking. Joe |
ivan Joined: 5 Mar 01 Posts: 783 Credit: 348,560,338 RAC: 223 |
[Quote] How about creating an eight-megabyte random file just once, storing it on disk, and blanking with that data on all subsequent AP tasks? [/Quote] That's starting to sound to me like the curand library is usable after all, especially if it's the *quality* of the PRNG that's the issue, rather than the reproducibility of the sequence. As a matter of interest (since I hope to have my Xeon Phi compiling OpenCL code in the next month...), where are the OpenCL AstroPulse sources available? |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
[Quote] As a matter of interest (since I hope to have my Xeon Phi compiling OpenCL code in the next month...), where are the OpenCL AstroPulse sources available? [/Quote] https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AP Claggy |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
[Quote] ...How about creating an eight-megabyte random file just once, storing it on disk, and blanking with that data on all subsequent AP tasks? [/Quote] Unfortunately the sequence must be exactly the same. [Quote] As a matter of interest (since I hope to have my Xeon Phi compiling OpenCL code in the next month...), where are the OpenCL AstroPulse sources available? [/Quote] https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt is our repository for AP as well as MB sources. Joe |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Indeed, AP and MB share common parts of the utility code. And, unfortunately, I'm not aware of any way to use CUDA library calls from OpenCL code. Porting the whole app to CUDA would be required. SETI apps news We're not gonna fight them. We're gonna transcend them. |
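
For anyone following the blanking discussion, the points Josef W. Segur makes above (a fixed seed, generator state saved at checkpoints, 32-bit Twister outputs turned into normally distributed values with Box–Muller, and the ~118 gigabyte estimate) can be sketched roughly in Python. This is only an illustration under stated assumptions, not the AstroPulse code. Python's `random` module uses a Mersenne Twister as its core generator, but its seeding, the uniform mapping chosen here, and the basic (trigonometric) form of Box–Muller are all assumptions, so the values will not match the real application's sequence. The function names are hypothetical, and "eight megabyte" is read as 8 MiB.

```python
import math
import random

PASSES = 14080                # passes through the input data, from the thread
BYTES_PER_PASS = 8 * 2**20    # "eight megabyte" read as 8 MiB (assumption)


def stored_sequence_bytes(passes=PASSES, bytes_per_pass=BYTES_PER_PASS):
    """Bytes needed if every pass consumes fresh random data."""
    return passes * bytes_per_pass


def uniform_open01(rng):
    """Map one 32-bit generator output to a float strictly inside (0, 1)."""
    return (rng.getrandbits(32) + 0.5) / 2**32


def box_muller_pair(rng):
    """Basic Box-Muller: two uniform values in, two standard normals out."""
    u1 = uniform_open01(rng)
    u2 = uniform_open01(rng)
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def normal_pairs(rng, pairs):
    """Return 2 * pairs normally distributed values drawn from rng."""
    values = []
    for _ in range(pairs):
        values.extend(box_muller_pair(rng))
    return values


def resumes_identically(seed=12345, pairs_before=500, pairs_after=500):
    """Check that restoring saved state continues the same sequence."""
    # Uninterrupted run from a fixed seed.
    expected = normal_pairs(random.Random(seed), pairs_before + pairs_after)

    # Same run, interrupted: save the generator state as a checkpoint would.
    first = random.Random(seed)
    part1 = normal_pairs(first, pairs_before)
    saved_state = first.getstate()

    # Fresh generator after a "restart", restored from the saved state.
    resumed = random.Random()
    resumed.setstate(saved_state)
    part2 = normal_pairs(resumed, pairs_after)
    return expected == part1 + part2


if __name__ == "__main__":
    total = stored_sequence_bytes()
    print(f"{total} bytes, about {total / 1e9:.0f} GB")
    print("resume matches uninterrupted run:", resumes_identically())
```

With those assumptions, 14080 passes of 8 MiB come to 118,111,600,640 bytes, which agrees with the ~118 gigabyte figure. The resume check passes only because the restored state carries on the exact sequence where it left off, which is the behaviour that the state file is described as providing.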
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.