Posts by Raistmer


1) Message boards : Number crunching : AP running for 11+ hours on GPU then erroring out (Message 1683997)
Posted 4 days ago by Profile Raistmer
Actually, if NV shows the same issue under a loaded CPU as ATi does (without affinity management), then a sporadic several-fold execution slowdown is exactly what should be expected.
I would add -cpu_lock.

We need to find ways of distinguishing 'slowdown' from 'complete dead stop'. I find CPU efficiency monitoring helps - 0.0000 efficiency is a very specific marker, which I don't see on any task simply running slow. And if I do see it, "suspend and resume" restores normal progress, but usually winds back elapsed time to a 'last checkpoint' several hours ago.

Zero CPU consumption means the driver restarted and the runtime is stuck. So the app simply never returns from its last runtime call.
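A minimal sketch of the efficiency check described above, assuming you can sample a task's CPU time and elapsed time over some interval. The function names and the sampling interval are illustrative assumptions, not part of BOINC or the app.

```python
def cpu_efficiency(cpu_seconds, elapsed_seconds):
    """CPU time used per second of wall-clock time over one sampling interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return cpu_seconds / elapsed_seconds


def classify_interval(cpu_seconds, elapsed_seconds):
    """Label an interval 'stalled' when efficiency prints as 0.0000, else 'running'.

    The 0.0000 marker comes from the post above; treating anything that
    rounds to it as a dead stop is an assumption of this sketch.
    """
    efficiency = cpu_efficiency(cpu_seconds, elapsed_seconds)
    return "stalled" if f"{efficiency:.4f}" == "0.0000" else "running"


if __name__ == "__main__":
    # Hypothetical samples: (CPU seconds used, wall-clock seconds) per interval.
    for cpu, wall in [(3.0, 600.0), (0.0, 600.0)]:
        print(cpu, wall, classify_interval(cpu, wall))
```

A task that is merely slow still consumes some CPU time in each interval, so it stays in the 'running' class; only a task stuck in a runtime call drops to zero.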
2) Message boards : Number crunching : AP running for 11+ hours on GPU then erroring out (Message 1683795)
Posted 4 days ago by Profile Raistmer
Actually, if NV shows the same issue under a loaded CPU as ATi does (without affinity management), then a sporadic several-fold execution slowdown is exactly what should be expected.
I would add -cpu_lock.
3) Message boards : Number crunching : Cl file build failure (Message 1683651)
Posted 5 days ago by Profile Raistmer
OpenCL has vector types (<base_type>N) for almost all base types, but not for bool.
Actually, I was quite surprised when this issue first appeared in Urs' OS X build. A bug report was even filed with NV on this matter.
petri33's use of bool2 in the improved dechirping code looked absolutely natural and in line with the other OpenCL vector types. It is also in the spirit of C++, where user classes seamlessly augment the built-in ones. It has worked OK on ATi and iGPU so far, but NVidia decided to play "dog in the manger": it forbids users from defining bool2 while not implementing bool2 in its own runtime. So the ugly name bool_2, which doesn't follow the vector type naming scheme, had to be used.
4) Message boards : Number crunching : Updated OpenCL AstroPulse coming main (Message 1683171)
Posted 6 days ago by Profile Raistmer
As of 21 May, this round of AstroPulse OpenCL app updates is finished for all supported platforms.
5) Message boards : Number crunching : Intel® iGPU AP bench test run (e.g. @ J1900) (Message 1682821)
Posted 6 days ago by Profile Raistmer
In ap_cmdline_win_x86_SSE2_OpenCL_Intel.txt: -cpu_lock -instances_per_device 1
In ap_cmdline_win_x86_SSE2_OpenCL_NV.txt: -cpu_lock -instances_per_device 3

The same result in Task-Manager:
1 Intel iGPU app at Core#0
1 NV GPU app at Core#0 and 1 NV GPU app at Core#1

You could try -cpu_lock_fixed_cpu 3 (or 4) for the Intel iGPU. That should work for now.

I expect Raistmer may modify the mutex implementation to make some future versions cooperate between different brand GPUs, but no promises...
Joe


There will be a special command-line option to handle such cases in new builds:

-total_GPU_instances_num N : Use together with -cpu_lock on multi-vendor GPU hosts. Set N to the total number of simultaneously running GPU OpenCL SETI apps on the host (the total across all GPUs in use, from all vendors). The app needs this number to select the proper logical CPU for execution in affinity-management (-cpu_lock) mode.


Mutexes remain unchanged, but with this new option their usage is decoupled from affinity management. It can be used on multi-vendor hosts without messing up the GPUlock feature.
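A rough sketch of how N could be worked out for the host quoted above and turned into the command lines. The helper names are made up for illustration, one device per vendor is an assumption, and it also assumes every configured instance slot is actually running at once.

```python
def total_gpu_instances(layout):
    """Sum devices * instances_per_device over all vendors on the host."""
    return sum(devices * per_device for devices, per_device in layout.values())


def build_cmdline(instances_per_device, total):
    """Compose the option string for one vendor's ap_cmdline file."""
    return (
        f"-cpu_lock -instances_per_device {instances_per_device} "
        f"-total_GPU_instances_num {total}"
    )


if __name__ == "__main__":
    # Hypothetical host: vendor -> (number of devices, instances per device).
    layout = {"Intel": (1, 1), "NV": (1, 3)}
    total = total_gpu_instances(layout)
    for vendor, (_, per_device) in layout.items():
        print(vendor, ":", build_cmdline(per_device, total))
```

The point is that every vendor's command line carries the same N, while -instances_per_device stays specific to that vendor.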
6) Message boards : Number crunching : If I am going to start crunching with my car, it has a radiator. (Message 1682042)
Posted 8 days ago by Profile Raistmer
At least some car computers use the QNX RTOS, which has a POSIX subsystem and even an Intel compiler (ICC) for it. GNU C is supported too. So it should be possible to port the current SETI apps there. The real issue is access to the car computer's initialization scripts, that is, how to put your own binary there.
It can hardly be done without reflashing the OS image, that is, without changing the firmware. That will almost definitely void the car's warranty and could even make driving such a car illegal (just imagine if the brakes are busy with a VLAR computation ;D ;D ).
A much more promising area, IMHO, is ATMs. There are thousands and thousands of computerized devices without real-time needs (unlike cars) that waste electricity showing silly advertisements while waiting for someone to put a credit card into them (24/7 !! ). What a waste of computational power... But again, the same issue: no easy way to put your own binary there.
7) Message boards : Number crunching : Why use CPU on SETi@home? (Message 1682035)
Posted 8 days ago by Profile Raistmer
Quite a strange thread.
The OP proposes to stop CPU crunching for SETI... but to use the CPU for other BOINC projects. What about _their_ inefficiency and damage to the environment, then, if these arguments matter so much to the OP??

Maybe it is worth considering, not now but on a timescale of 2-3 years, cutting off your own left hand? Because it operates less effectively than the right one (left-handed people should read this in reverse)...

Pointless, just pointless.
8) Message boards : Number crunching : Loading APU to the limit: performance considerations - ongoing research (Message 1681449)
Posted 10 days ago by Profile Raistmer
Since I think focus is on memory contention right now.

Yes, I think so.
9) Message boards : Number crunching : NV GPU - AP bench test run (e.g. @ GT730) (Message 1680413)
Posted 13 days ago by Profile Raistmer
The KWSN bench re-enables BOINC on completion, so a multi-instance test needs separate handling for this.
Again, look at the provided scripts; they handle this.
10) Message boards : Number crunching : Intel® iGPU AP bench test run (e.g. @ J1900) (Message 1679589)
Posted 14 days ago by Profile Raistmer
Try -instances_per_device 3
11) Message boards : Number crunching : NV GPU - AP bench test run (e.g. @ GT730) (Message 1679544)
Posted 15 days ago by Profile Raistmer
Here: http://lunatics.kwsn.info/1-discussion-forum/loading-apu-to-the-limit-performance-considerations.0.html I put everything needed for this kind of testing, in addition to the standard KWSN bench pack.
12) Message boards : Number crunching : Server-side/project achievements/records thread (Message 1678879)
Posted 16 days ago by Profile Raistmer
Current result creation rate     41.8528/sec   0.0365/sec   5m
Results out in the field         3,609,225     45,746       0m
Results received in last hour    89,778        1,212        0m
13) Message boards : Number crunching : AMD ATI Radeon HD 2300/2400/3200/4200 (RV610) (336MB) - SETI/AP possible? (Message 1678625)
Posted 17 days ago by Profile Raistmer
I received confirmation that on the target hardware the app works OK and returns valid results in the benchmark. (My own ATi cards are no longer target hardware: the driver dropped real CAL support, though the app still detects the GPU as a CAL one.)
So it seems it can indeed be used.
14) Message boards : Number crunching : AMD ATI Radeon HD 2300/2400/3200/4200 (RV610) (336MB) - SETI/AP possible? (Message 1677815)
Posted 18 days ago by Profile Raistmer
OK, have fun:
https://dl.dropboxusercontent.com/u/60381958/AP7_win_x86_SSE_Brook_r2904.7z
It crashes on my ATi hosts, though, so it is fully UNtested.
15) Message boards : Number crunching : 2899 oclFFT: NV local memory shortage debugging (Message 1675125)
Posted 21 days ago by Profile Raistmer
So, how long will it last? I was going to post a combined Update this weekend.
Now this...

Until the issue is solved.
You can manually disable that print.
BTW, if you have some NV hosts, try to build and run the NV version. What numbers does it give?
16) Message boards : Number crunching : 2899 oclFFT: NV local memory shortage debugging (Message 1675122)
Posted 21 days ago by Profile Raistmer


Is this permanent?

No, it's temporary, for debugging.
17) Message boards : Number crunching : Loading APU to the limit: performance considerations - ongoing research (Message 1674842)
Posted 21 days ago by Profile Raistmer

I am not sure whether the iGPU running at ~39% with the CPU times still that high is a bad sign or helps point to something else to try. In the meantime, I plan to try more configs to see if there is one that will make everything magically work together as well as on BayTrail.
-unroll 4 -ffa_block 64 -ffa_block_fetch 16
-unroll 4 -ffa_block 128 -ffa_block_fetch 32
Then play around with unroll again.


To make your search more targeted:
In AP, the main loop is interleaved with FFAs. From time to time, in addition to the short FFA, a much bigger long FFA is performed. So with Clean01 you should see a periodic load in GPU-Z, and the allocated memory will also change periodically. To separate main loop time from FFA time, one can vary the sizes of the corresponding memory buffers and watch GPU memory usage in GPU-Z. -unroll changes the main loop; -ffa_block changes the FFA buffers.
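One possible way to organise such a search is to change one factor at a time, so that a change in the GPU-Z memory pattern can be attributed either to the main loop or to the FFA buffers. The sketch below only generates the option strings; the baseline values are taken from the configs quoted above, while the extra -unroll values and the helper names are assumptions.

```python
def options(unroll, ffa_block, ffa_block_fetch):
    """Format one command-line configuration for a bench run."""
    return f"-unroll {unroll} -ffa_block {ffa_block} -ffa_block_fetch {ffa_block_fetch}"


def one_factor_sweeps(base_unroll, base_ffa, unrolls, ffa_pairs):
    """Return two lists: main-loop-only changes and FFA-buffer-only changes."""
    vary_main_loop = [options(u, *base_ffa) for u in unrolls]
    vary_ffa = [options(base_unroll, block, fetch) for block, fetch in ffa_pairs]
    return vary_main_loop, vary_ffa


if __name__ == "__main__":
    main_loop_runs, ffa_runs = one_factor_sweeps(
        base_unroll=4,
        base_ffa=(64, 16),
        unrolls=[2, 4, 8],               # hypothetical values to try
        ffa_pairs=[(64, 16), (128, 32)],  # pairs from the quoted configs
    )
    for line in main_loop_runs + ffa_runs:
        print(line)
```

Each printed line can be pasted into the bench command-line file for one run, with GPU-Z memory usage noted alongside.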
18) Message boards : Number crunching : AMD ATI Radeon HD 2300/2400/3200/4200 (RV610) (336MB) - SETI/AP possible? (Message 1674660)
Posted 22 days ago by Profile Raistmer
In the Brook build, the FFA part of the processing was done on the GPU. To increase throughput, it was recommended to run a few AP tasks on the GPU at once, maybe even overcommitting the CPU part.
But currently, with the absence of AP tasks, making a Brook AP7 seems a little impractical.
19) Message boards : Number crunching : Updated OpenCL AstroPulse coming main (Message 1674317)
Posted 23 days ago by Profile Raistmer
BTW, we now have the new Linux OpenCL AP deployed on Beta, so try to test it.
And those who know what they are doing can take the binaries from there.

Hmmmm, while looking at the Applications page I noticed the MB Mac Apps are at 7.05 dated Apr 23rd. I was about to comment on it and then noticed how it just now changed to 7.05. So, we should have been receiving 7.05 for 2 weeks and weren't? hmmmm....

New apps were released May 5.
Linux/x86_64 7.08 (cuda_opencl_100) 5 May 2015, 20:18:35 UTC 0 GigaFLOPS
Linux/x86_64 7.08 (cuda_opencl_cc1) 5 May 2015, 20:18:35 UTC 0 GigaFLOPS
Linux/x86_64 7.08 (opencl_ati_100) 5 May 2015, 20:18:35 UTC 0 GigaFLOPS
Linux/x86_64 7.08 (opencl_nvidia_100) 5 May 2015, 20:18:35 UTC 0 GigaFLOPS
Linux/x86_64 7.08 (opencl_nvidia_cc1) 5 May 2015, 20:18:35 UTC 0 GigaFLOPS

And, additionally, new MB v7 apps were released on 6 May:

Mac OS X/64-bit Intel 7.05 (opencl_ati5zc_mac) 6 May 2015, 0:51:12 UTC 0 GigaFLOPS
Mac OS X/64-bit Intel 7.05 (opencl_ati5zc_mac_nocal) 6 May 2015, 0:51:12 UTC 11 GigaFLOPS

Again, this info about the new BETA apps is for those who don't look at Beta very often.
20) Message boards : Number crunching : MBv7 re-observation of MBv6 processed data - optimization is possible? (Message 1674316)
Posted 23 days ago by Profile Raistmer
Then let's recall: in what way is SETI MB v7 different from v6?
Only autocorrelation?



Copyright © 2015 University of California