AstroPulse v6 v6.04 (opencl_nvidia_100)

Roger Clark

Joined: 6 Dec 12
Posts: 5
Credit: 2,990,609
RAC: 0
United States
Message 1409423 - Posted: 29 Aug 2013, 17:30:28 UTC

I just happened to look at running task 3136251711 and saw that it was stuck at 55.885% completion for several minutes while running on 0.1xx CPU + 1 NVIDIA GPU. When I brought up the GPU monitor, the Power% was fluttering between 70 and 95%, whereas other GPU tasks from SETI@home usually sit at ~80% power and burn through in about 10 minutes. I aborted the task because I wasn't sure why it was stuck there.

Anyone else seen this kind of issue?
ID: 1409423 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1409440 - Posted: 29 Aug 2013, 17:57:02 UTC - in response to Message 1409423.  
Last modified: 29 Aug 2013, 17:57:34 UTC

On GPU the 10 min tasks are Multibeam tasks.
The Astropulse tasks take about 1000-3000 seconds on GPU, or more if
a) they have high blanking and thus must use the CPU to do some calculations
b) you run your 3930K with hyper-threading enabled (12 logical cores) and you have "use 100% of the processors" set.

The remedy - "use 50% of the processors".

--
petri33
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1409440 · Report as offensive
Profile Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1409456 - Posted: 29 Aug 2013, 18:38:11 UTC - in response to Message 1409423.  
Last modified: 29 Aug 2013, 18:41:59 UTC

I just happened to look at running task 3136251711 and saw that it was stuck at 55.885% completion for several minutes while running on 0.1xx CPU + 1 NVIDIA GPU. When I brought up the GPU monitor, the Power% was fluttering between 70 and 95%, whereas other GPU tasks from SETI@home usually sit at ~80% power and burn through in about 10 minutes. I aborted the task because I wasn't sure why it was stuck there.

Anyone else seen this kind of issue?


Also, while progress on MB tasks is relatively linear, i.e., the "progress bar" will continue to advance regularly throughout processing, the same is not the case with AP tasks. IMX, it isn't unusual to see "freezes" in progress of upwards of 30 seconds or so on a high-end GPU (I use 580s and 590s). I suspect this could be longer on less powerful cards. I'd say you shouldn't have aborted the WU; it most likely would have finished normally.

EDIT: Your GPU finished this task normally, so you don't have a problem with AP's, or your GPU.
ID: 1409456 · Report as offensive
Roger Clark

Joined: 6 Dec 12
Posts: 5
Credit: 2,990,609
RAC: 0
United States
Message 1409480 - Posted: 29 Aug 2013, 19:08:34 UTC - in response to Message 1409456.  

Thanks for the quick feedback. The previous Astropulse unit took 2258sec, this one hit the 55.xx% at 1991sec.

I'm running a GTX670 with 4GB GDDR5 and 1344 CUDA cores, the i7-3930 running 50% of the processors 100% of CPU time.

Might have finished fine, guess I was just a little jumpy with a 2 week old rig that I'm "burning in" to check everything out and noticed power fluttering rather than cycling... It's also the first time I'd seen an AstroPulse WU come down with GPU capabilities (others have been v6 6.01 running on the main processor).
ID: 1409480 · Report as offensive
Profile Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1409494 - Posted: 29 Aug 2013, 19:45:59 UTC - in response to Message 1409480.  

Thanks for the quick feedback. The previous Astropulse unit took 2258sec, this one hit the 55.xx% at 1991sec.

I'm running a GTX670 with 4GB GDDR5 and 1344 CUDA cores, the i7-3930 running 50% of the processors 100% of CPU time.

Might have finished fine, guess I was just a little jumpy with a 2 week old rig that I'm "burning in" to check everything out and noticed power fluttering rather than cycling... It's also the first time I'd seen an AstroPulse WU come down with GPU capabilities (others have been v6 6.01 running on the main processor).


The completed AP was only 2.44% blanked. Most likely the aborted one had a higher blanking % and therefore would run longer. While most of my AP's complete in under an hour, some highly blanked ones will take upwards of 2-2.5 hours.
ID: 1409494 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1409751 - Posted: 30 Aug 2013, 12:02:46 UTC - in response to Message 1409494.  

It's inevitable with current app design.
Blanking uses Mersenne twister random number generator in double precision on CPU.
If someone wants to implement it on GPU, they're welcome to.

SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1409751 · Report as offensive
Roger Clark

Joined: 6 Dec 12
Posts: 5
Credit: 2,990,609
RAC: 0
United States
Message 1409774 - Posted: 30 Aug 2013, 14:07:15 UTC - in response to Message 1409751.  

I'm curious what the app design looks like. NVIDIA CUDAZone says it's got a "drop-in" library, cuRAND, to do the twister on GPU?
ID: 1409774 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1409775 - Posted: 30 Aug 2013, 14:08:35 UTC - in response to Message 1409751.  

How about creating an eight megabyte random file just once and storing to disk and blanking using that data on all subsequent AP tasks?
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1409775 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1409777 - Posted: 30 Aug 2013, 14:10:16 UTC - in response to Message 1409774.  
Last modified: 30 Aug 2013, 14:24:04 UTC

I'm curious what the app design looks like. NVIDIA CUDAZone says it's got a "drop-in" library, cuRAND, to do the twister on GPU?

The Mersenne Twister in that is single-precision (i.e. float rather than double). Not all CUDA devices can handle doubles a) natively or b) efficiently.
[Edit] Although, looking at the header file, the generator can return doubles, the engine appears to be 32-bit, CURAND_RNG_PSEUDO_MTGP32. [/Edit]
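
To illustrate how a 32-bit engine can still hand back doubles, here is a rough Python sketch. CPython's `random` module is built on the MT19937 Mersenne Twister, and its `random()` call assembles a 53-bit double from two consecutive 32-bit outputs. This is CPython's recipe, shown only as an analogy: that cuRAND's MTGP32 does anything comparable is an assumption, and the helper name `double_from_two_words` is made up for the example.

```python
import random


def double_from_two_words(rng):
    """Build one double in [0, 1) from two 32-bit Mersenne Twister outputs.

    Mirrors the 53-bit recipe CPython uses for random.random(). The function
    name is hypothetical; nothing here is taken from cuRAND or AstroPulse.
    """
    a = rng.getrandbits(32) >> 5   # keep the top 27 bits of the first word
    b = rng.getrandbits(32) >> 6   # keep the top 26 bits of the second word
    return (a * 67108864 + b) / 9007199254740992  # (a * 2**26 + b) / 2**53


if __name__ == "__main__":
    manual = double_from_two_words(random.Random(2013))   # illustrative seed
    builtin = random.Random(2013).random()
    print(manual, builtin, manual == builtin)
```

So a double-precision result does not by itself require a 64-bit engine; it only costs two 32-bit draws per value.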
ID: 1409777 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1409784 - Posted: 30 Aug 2013, 14:30:46 UTC

It appears we are being set up for another multiweek AP outage. There is now a large number of 'tapes' and a large disparity in 'total channels to do:'. This usually means a long wait for the channel number to equalize. We could go another couple of weeks without APs since most semi-fast machines run out in a few days. Is there some reason this is being done? I would much rather see the AP outage last for just a few days versus weeks.

Is there some reason so many files are being loaded at once?
Server status page
ID: 1409784 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1409787 - Posted: 30 Aug 2013, 14:46:45 UTC - in response to Message 1409751.  

It's inevitable with current app design.
Blanking uses Mersenne twister random number generator in double precision on CPU.
If someone wants to implement it on GPU, they're welcome to.

Does that mean that an AP will use the CPU for some of its work even if the host is set to use GPU only?

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1409787 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1409826 - Posted: 30 Aug 2013, 16:08:26 UTC - in response to Message 1409775.  

How about creating an eight megabyte random file just once and storing to disk and blanking using that data on all subsequent AP tasks?

Could work if seed always the same...
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1409826 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1409828 - Posted: 30 Aug 2013, 16:09:52 UTC - in response to Message 1409787.  

It's inevitable with current app design.
Blanking uses Mersenne twister random number generator in double precision on CPU.
If someone wants to implement it on GPU, they're welcome to.

Does that mean that an AP will use the CPU for some of its work even if the host is set to use GPU only?

Surprised? The CPU is always used to some degree. Only the CPU can handle interrupts from other PC devices.

SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1409828 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1409831 - Posted: 30 Aug 2013, 16:14:44 UTC - in response to Message 1409828.  

It's inevitable with current app design.
Blanking uses Mersenne twister random number generator in double precision on CPU.
If someone wants to implement it on GPU, they're welcome to.

Does that mean that an AP will use the CPU for some of its work even if the host is set to use GPU only?

Surprised? The CPU is always used to some degree. Only the CPU can handle interrupts from other PC devices.

Well I know the CPU is used a little (.04 for Seti and .2 for Einstein (or vice versa)), but I didn't know it got into it more than that. I think I'll have to turn off AP for my GPU-only machine.

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1409831 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1409875 - Posted: 30 Aug 2013, 17:59:03 UTC

It might be something to do with the size of the discs the "tapes" have been loaded onto. There appear to be about 50 tapes in a batch, which would make sense as they are using 3 and 3 terabyte drives for transfer.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1409875 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1409879 - Posted: 30 Aug 2013, 18:19:12 UTC - in response to Message 1409826.  

How about creating an eight megabyte random file just once and storing to disk and blanking using that data on all subsequent AP tasks?

Could work if seed always the same...

The seed is the same at the beginning of each AP WU, but not restarted for each of the 14080 passes through the input data. The generator continues, with its current state saved in the state file so starting from a checkpoint will work. So rather than 8 megabytes you'd need to have ~118 Gigabytes of saved data.
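
As a quick check of that figure and of the checkpoint idea, here is a small Python sketch. It multiplies out 14080 passes of eight megabytes each (taking a megabyte as 2**20 bytes, which is an assumption that happens to reproduce ~118 GB), then saves and restores generator state with Python's `random` module, which also uses a Mersenne Twister. The seed and variable names are illustrative only; the real AP state file format is not shown here.

```python
import random

PASSES = 14080                   # passes through the input data, per the post
BYTES_PER_PASS = 8 * 2**20       # "eight megabytes", assumed to mean 8 MiB

total_bytes = PASSES * BYTES_PER_PASS
print(f"{total_bytes / 1e9:.1f} GB of precomputed random data")

# Checkpointing: save the generator state, keep going, then resume from it.
rng = random.Random(12345)       # illustrative seed, not the one AP uses
for _ in range(1000):            # pretend part of the work is already done
    rng.getrandbits(32)

checkpoint = rng.getstate()      # what a state file would need to hold
expected = [rng.getrandbits(32) for _ in range(5)]

resumed = random.Random()
resumed.setstate(checkpoint)     # restart from the checkpoint
replayed = [resumed.getrandbits(32) for _ in range(5)]
print(replayed == expected)      # the resumed stream matches the original
```

Saving the generator state is tiny compared with saving its output, which is why a checkpoint works where a precomputed file would not be practical.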

The Mersenne Twister actually produces 32 bit unsigned values, which may then be converted to other numerical forms. The original version, which will run on any CPU, is what is used; the later SIMD version does not produce the same sequence of pseudo-random numbers although it is equally good.

Conversion of the equally distributed Twister output to normally distributed pseudo-random numbers using the Box–Muller transform probably accounts for much of the time required for blanking.
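
For anyone who hasn't met it, the Box–Muller step looks roughly like the Python sketch below. It is a minimal illustration, not the AP code: the helper names are made up, and mapping each 32-bit value into (0, 1) by adding 0.5 and dividing by 2**32 is an assumption chosen so that the logarithm never sees zero.

```python
import math
import random


def uniform_open(rng):
    """Map one 32-bit Twister output into the open interval (0, 1)."""
    return (rng.getrandbits(32) + 0.5) / 2**32


def box_muller_pair(rng):
    """Turn two uniform values into two independent standard normal values."""
    u1 = uniform_open(rng)
    u2 = uniform_open(rng)
    radius = math.sqrt(-2.0 * math.log(u1))   # one log and one sqrt per pair
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


if __name__ == "__main__":
    rng = random.Random(2013)  # illustrative seed
    samples = [z for _ in range(50000) for z in box_muller_pair(rng)]
    mean = sum(samples) / len(samples)
    var = sum((z - mean) ** 2 for z in samples) / len(samples)
    print(f"mean={mean:.3f} variance={var:.3f}")  # should be near 0 and 1
```

Each pair costs a log, a sqrt, a sin and a cos in double precision, which fits the suggestion that the transform rather than the Twister itself dominates the blanking time.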
                                                                   Joe
ID: 1409879 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1409979 - Posted: 30 Aug 2013, 23:00:24 UTC - in response to Message 1409879.  
Last modified: 30 Aug 2013, 23:01:08 UTC

How about creating an eight megabyte random file just once and storing to disk and blanking using that data on all subsequent AP tasks?

Could work if seed always the same...

The seed is the same at the beginning of each AP WU, but not restarted for each of the 14080 passes through the input data. The generator continues, with its current state saved in the state file so starting from a checkpoint will work. So rather than 8 megabytes you'd need to have ~118 Gigabytes of saved data.

The Mersenne Twister actually produces 32 bit unsigned values, which may then be converted to other numerical forms. The original version, which will run on any CPU, is what is used; the later SIMD version does not produce the same sequence of pseudo-random numbers although it is equally good.

Conversion of the equally distributed Twister output to normally distributed pseudo-random numbers using the Box–Muller transform probably accounts for much of the time required for blanking.
                                                                   Joe

That's starting to sound to me like the curand library is usable after all, especially if it's the *quality* of the PRNG that's the issue, rather than the reproducibility of the sequence.
As a matter of interest (since I hope to have my Xeon Phi compiling OpenCL code in the next month...) where are the OpenCL AstroPulse sources available?
ID: 1409979 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1409980 - Posted: 30 Aug 2013, 23:17:39 UTC - in response to Message 1409979.  

As a matter of interest (since I hope to have my Xeon Phi compiling OpenCL code in the next month...) where are the OpenCL AstroPulse sources available?

https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AP

Claggy
ID: 1409980 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1409996 - Posted: 31 Aug 2013, 0:53:28 UTC - in response to Message 1409979.  

How about creating an eight megabyte random file just once and storing to disk and blanking using that data on all subsequent AP tasks?

Could work if seed always the same...
...
The Mersenne Twister actually produces 32 bit unsigned values, which may then be converted to other numerical forms. The original version, which will run on any CPU, is what is used; the later SIMD version does not produce the same sequence of pseudo-random numbers although it is equally good.

Conversion of the equally distributed Twister output to normally distributed pseudo-random numbers using the Box–Muller transform probably accounts for much of the time required for blanking.
                                                                   Joe

That's starting to sound to me like the curand library is usable after all, especially if it's the *quality* of the PRNG that's the issue, rather than the reproducibility of the sequence.

Unfortunately the sequence must be exactly the same.

As a matter of interest (since I hope to have my Xeon Phi compiling OpenCL code in the next month...) where are the OpenCL AstroPulse sources available?

https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt is our repository for AP as well as MB sources.
                                                                  Joe
ID: 1409996 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1410082 - Posted: 31 Aug 2013, 9:31:13 UTC - in response to Message 1409996.  
Last modified: 31 Aug 2013, 9:32:15 UTC

Indeed, AP and MB share common parts of utility code.
And, unfortunately, I'm not aware of any way to call CUDA libraries from OpenCL code. Porting the whole app to CUDA would be required.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1410082 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.