Continuing SETI Problems with 2 ATI Cards Installed.

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1356752 - Posted: 14 Apr 2013, 6:42:37 UTC - in response to Message 1356728.  

Just because I could... I replaced the 3650 with a real 2600, the one that shipped in my Mac back in 2008;



Lookie;
4/14/2013 2:19:26 AM |  | Starting BOINC client version 7.0.62 for windows_intelx86
4/14/2013 2:19:26 AM |  | CAL: ATI GPU 0: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 960 GFLOPS peak)
4/14/2013 2:19:26 AM |  | CAL: ATI GPU 1: (not used) ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 256MB, 224MB available, 336 GFLOPS peak)
4/14/2013 2:19:26 AM |  | OpenCL: AMD/ATI GPU 0: ATI Radeon HD 4600 series (R730) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 960 GFLOPS peak)
4/14/2013 2:19:26 AM | SETI@home | Config: excluded GPU.  Type: ATI.  App: astropulse_v6.  Device: 1
4/14/2013 2:19:26 AM |  | [work_fetch] Request work fetch: Prefs update
4/14/2013 2:19:26 AM |  | [work_fetch] Request work fetch: Startup
4/14/2013 2:19:27 AM | SETI@home | Restarting task ap_19se12ab_B3_P0_00306_20130408_21511.wu_2 using astropulse_v6 version 601 in slot 1
4/14/2013 2:19:27 AM | SETI@home | Restarting task ap_31my12ae_B5_P1_00361_20130325_27308.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 0
4/14/2013 2:19:27 AM |  | [work_fetch] work fetch start
4/14/2013 2:19:27 AM |  | [work_fetch] choose_project() for ATI: buffer_low: no; sim_excluded_instances 0
4/14/2013 2:19:27 AM |  | [work_fetch] choose_project() for CPU: buffer_low: yes; sim_excluded_instances 0
4/14/2013 2:19:27 AM |  | [work_fetch] no eligible project for CPU
4/14/2013 2:19:27 AM |  | [work_fetch] ------- start work fetch state -------
4/14/2013 2:19:27 AM |  | [work_fetch] target work buffer: 172800.00 + 8640.00 sec
4/14/2013 2:19:27 AM |  | [work_fetch] --- project states ---
4/14/2013 2:19:27 AM | SETI@home | [work_fetch] REC 72442.158 prio -1.868449 can't req work: "no new tasks" requested via Manager
4/14/2013 2:19:27 AM |  | [work_fetch] --- state for CPU ---
4/14/2013 2:19:27 AM |  | [work_fetch] shortfall 26605.86 nidle 0.00 saturated 154834.14 busy 0.00
4/14/2013 2:19:27 AM | SETI@home | [work_fetch] fetch share 0.000
4/14/2013 2:19:27 AM |  | [work_fetch] --- state for ATI ---
4/14/2013 2:19:27 AM |  | [work_fetch] shortfall 0.00 nidle 0.00 saturated 804054.07 busy 0.00
4/14/2013 2:19:27 AM | SETI@home | [work_fetch] fetch share 0.000
4/14/2013 2:19:27 AM |  | [work_fetch] ------- end work fetch state -------


...for ATI: buffer_low: no; sim_excluded_instances 0
[work_fetch] --- state for ATI ---
shortfall 0.00
nidle 0.00 saturated...


Big Difference. Note BOINC lists the 2600XT as (not used). Too bad the old 2600XT is about shot: noisy fan, only 256MB of memory, and that memory is toasted and leaves artifacts on the screen. But it's a Mac card that will show the Mac boot screen. A non-Mac card, like the 6850, will not show the boot screen with the system chooser.

Note GPUz also lists the 2600 as having OpenCL....
ID: 1356752
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1356805 - Posted: 14 Apr 2013, 10:16:27 UTC - in response to Message 1356752.  
Last modified: 14 Apr 2013, 10:31:01 UTC

Don't you want to swap the GPUs around so the HD2600 is detected first, then see whether the revised Boinc.exe detects it or the HD4600 as an OpenCL device?

From your first post:
3/21/2013 6:53:33 PM | | Starting BOINC client version 7.0.58 for windows_intelx86
3/21/2013 6:53:33 PM | | CAL: ATI GPU 0 (ignored by config): ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
3/21/2013 6:53:33 PM | | CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 992 GFLOPS peak)
3/21/2013 6:53:33 PM | | OpenCL: AMD/ATI GPU 0 (ignored by config): ATI Radeon HD 2600 (RV630) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 348 GFLOPS peak)


a later post says:
Progress. Swapping Card Slots got me back to where I was in November. The 4670 was listed as the Card with OpenCL, things were looking up;



3/23/2013 12:22:05 AM | | Starting BOINC client version 7.0.58 for windows_intelx86
3/23/2013 12:22:05 AM | | CAL: ATI GPU 0: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 960 GFLOPS peak)
3/23/2013 12:22:05 AM | | CAL: ATI GPU 1: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
3/23/2013 12:22:05 AM | | OpenCL: AMD/ATI GPU 0: ATI Radeon HD 4600 series (R730) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 960 GFLOPS peak)


and now it still says:

4/14/2013 1:13:15 AM | | Starting BOINC client version 7.0.62 for windows_intelx86
4/14/2013 1:13:15 AM | | CAL: ATI GPU 0: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 992 GFLOPS peak)
4/14/2013 1:13:15 AM | | CAL: ATI GPU 1: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
4/14/2013 1:13:15 AM | | OpenCL: AMD/ATI GPU 0: ATI Radeon HD 4600 series (R730) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 992 GFLOPS peak)


Claggy
ID: 1356805
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1356807 - Posted: 14 Apr 2013, 10:18:15 UTC - in response to Message 1356752.  


Note GPUz also lists the 2600 as having OpenCL....

Report the bug to the GPU-Z dev team.

SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1356807
Charlie Fenton
Joined: 3 Apr 99
Posts: 35
Credit: 6,090,490
RAC: 5
Message 1356811 - Posted: 14 Apr 2013, 10:25:38 UTC - in response to Message 1356728.  
Last modified: 14 Apr 2013, 11:04:35 UTC

I've replaced the 7.0.60 BOINC.exe with the new one. I'm still running 7.0.60 with the exception of the new BOINC.exe. I don't see any change...so far.

This looks really good to me. BOINC now recognizes that only the 4670 is supported by your drivers for OpenCL, but both cards are supported for CAL, which is correct. And it is only trying to run a single SETI@home GPU task. Am I missing something?

The misidentification of the HD 3650 as HD 2600 is a minor cosmetic issue that I understand and is easily fixed. (The HD 2600 series is RV 630 and the HD 3650 is RV 635, but the information from CAL does not distinguish between the two.)

What happens if you now remove the <exclude_gpu> option from your cc_config.xml file?

What happens if you put the HD 3650 in the lower slot and the HD 4670 in the higher slot?
Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 1356811
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1356815 - Posted: 14 Apr 2013, 11:42:06 UTC - in response to Message 1356811.  
Last modified: 14 Apr 2013, 12:30:39 UTC

I've replaced the 7.0.60 BOINC.exe with the new one. I'm still running 7.0.60 with the exception of the new BOINC.exe. I don't see any change...so far.

This looks really good to me. BOINC now recognizes that only the 4670 is supported by your drivers for OpenCL, but both cards are supported for CAL, which is correct. And it is only trying to run a single SETI@home GPU task. Am I missing something?

The misidentification of the HD 3650 as HD 2600 is a minor cosmetic issue that I understand and is easily fixed. (The HD 2600 series is RV 630 and the HD 3650 is RV 635, but the information from CAL does not distinguish between the two.)

What happens if you now remove the <exclude_gpu> option from your cc_config.xml file?

What happens if you put the HD 3650 in the lower slot and the HD 4670 in the higher slot?

It's still trying to download more ATI work even though it has over 8 days worth and the Buffer is set to 2 days. Even if it had two working 4670s, 34 APs is 4 days worth of work, and the buffer is set to 2 days.
From above;
4/14/2013 12:38:57 AM | | [work_fetch] choose_project() for ATI: buffer_low: yes; sim_excluded_instances 2
4/14/2013 12:38:57 AM | | [work_fetch] no eligible project for ATI
4/14/2013 12:38:57 AM | | [work_fetch] choose_project() for CPU: buffer_low: yes; sim_excluded_instances 0
4/14/2013 12:38:57 AM | | [work_fetch] no eligible project for CPU
4/14/2013 12:38:57 AM | | [work_fetch] ------- start work fetch state -------
4/14/2013 12:38:57 AM | | [work_fetch] target work buffer: 172800.00 + 8640.00 sec
4/14/2013 12:38:57 AM | | [work_fetch] --- project states ---
4/14/2013 12:38:57 AM | SETI@home | [work_fetch] REC 72465.712 prio -0.462817 can't req work: "no new tasks" requested via Manager
4/14/2013 12:38:57 AM | | [work_fetch] --- state for CPU ---
4/14/2013 12:38:57 AM | | [work_fetch] shortfall 21966.20 nidle 0.00 saturated 159473.80 busy 0.00
4/14/2013 12:38:57 AM | SETI@home | [work_fetch] fetch share 0.000
4/14/2013 12:38:57 AM | | [work_fetch] --- state for ATI ---
4/14/2013 12:38:57 AM | | [work_fetch] shortfall 181440.00 nidle 1.00 saturated 0.00 busy 0.00

[work_fetch] choose_project() for ATI: buffer_low: yes - instances 2 - ATI shortfall 181440.00
With the real 2600, it is not asking for more ATI work.
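
The work-fetch figures quoted in this thread can be cross-checked with a little arithmetic. The sketch below is an editorial illustration, not BOINC code: the helper names are hypothetical, and the relation it checks (shortfall plus saturated equalling the target buffer) is only an observation about the numbers posted here, not a description of how the client computes them.

```python
import math

# Values copied from the "[work_fetch] target work buffer" lines in this thread.
MIN_BUFFER_SEC = 172800.00   # first term: the 2-day buffer setting
EXTRA_BUFFER_SEC = 8640.00   # second term: the additional 0.1 day


def target_buffer(min_sec: float, extra_sec: float) -> float:
    """Total seconds of work the client is trying to keep on hand."""
    return min_sec + extra_sec


def to_days(seconds: float) -> float:
    """Convert seconds to days."""
    return seconds / 86400.0


target = target_buffer(MIN_BUFFER_SEC, EXTRA_BUFFER_SEC)

# (label, shortfall, saturated) triples copied from the posted logs.
samples = [
    ("CPU, 2:19:27 AM", 26605.86, 154834.14),
    ("CPU, 12:38:57 AM", 21966.20, 159473.80),
    ("ATI with the 3650, 12:38:57 AM", 181440.00, 0.00),
]

for label, shortfall, saturated in samples:
    total = shortfall + saturated
    print(f"{label}: shortfall + saturated = {total:.2f} sec "
          f"(target {target:.2f} sec, {to_days(target):.1f} days)")

# With the real 2600 installed, the ATI state shows no shortfall at all.
saturated_with_2600 = 804054.07
print(f"ATI with the 2600: saturated for {to_days(saturated_with_2600):.1f} days")
```

With a 2-day buffer plus the 8640-second extra, the target is 181440 seconds, which is exactly the ATI shortfall logged while the 3650 was installed.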

This is what happens when I remove the Exclude GPU. If I let it go, it will go down the list starting every AP and then Fail to run it. When it runs out of the 33 APs it can start, it will go back to the beginning and try again. After a few tries on each of the 33 APs, it will start Erring them out, one by one, very quickly. This is what it did, a couple of times, back in November. I've been down this road before...

4/14/2013 7:49:04 AM |  | Re-reading cc_config.xml
4/14/2013 7:49:04 AM |  | Using proxy info from GUI
4/14/2013 7:49:04 AM |  | Using HTTP proxy 192.168.1.3:5555
4/14/2013 7:49:04 AM |  | log flags: file_xfer, sched_ops, task
4/14/2013 7:49:04 AM | SETI@home | Starting task ap_30no12ae_B4_P1_00309_20130331_13224.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 2
4/14/2013 7:49:06 AM | SETI@home | Starting task ap_31oc12aa_B1_P0_00375_20130331_12152.wu_1 using astropulse_v6 version 604 (ati_opencl_100) in slot 3
4/14/2013 7:49:07 AM | SETI@home | Starting task ap_31my12ae_B6_P0_00305_20130325_28354.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 4
4/14/2013 7:49:08 AM | SETI@home | Starting task ap_30no12ae_B5_P0_00178_20130331_19812.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 5
4/14/2013 7:49:09 AM | SETI@home | Starting task ap_31oc12aa_B1_P1_00070_20130331_21193.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 6
4/14/2013 7:49:10 AM | SETI@home | Starting task ap_30no12ae_B5_P0_00138_20130331_19812.wu_1 using astropulse_v6 version 604 (ati_opencl_100) in slot 7
4/14/2013 7:49:12 AM | SETI@home | Starting task ap_30no12ae_B4_P1_00336_20130331_13224.wu_2 using astropulse_v6 version 604 (ati_opencl_100) in slot 8
4/14/2013 7:49:13 AM | SETI@home | Starting task ap_30no12ae_B5_P1_00019_20130331_22505.wu_1 using astropulse_v6 version 604 (ati_opencl_100) in slot 9
4/14/2013 7:49:14 AM | SETI@home | Starting task ap_31oc12aa_B1_P1_00026_20130331_21193.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 10
4/14/2013 7:49:15 AM | SETI@home | Starting task ap_30no12ae_B5_P0_00139_20130331_19812.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 11
4/14/2013 7:49:16 AM | SETI@home | Starting task ap_30no12ae_B5_P0_00098_20130331_19812.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 12
4/14/2013 7:49:17 AM | SETI@home | Starting task ap_31oc12aa_B1_P1_00109_20130331_21193.wu_1 using astropulse_v6 version 604 (ati_opencl_100) in slot 13
4/14/2013 7:49:18 AM | SETI@home | Starting task ap_30no12ae_B6_P0_00018_20130331_22506.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 14
4/14/2013 7:49:19 AM | SETI@home | Starting task ap_31oc12aa_B1_P1_00069_20130331_21193.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 15
4/14/2013 7:49:20 AM | SETI@home | Starting task ap_31oc12aa_B1_P1_00141_20130331_21193.wu_1 using astropulse_v6 version 604 (ati_opencl_100) in slot 16
4/14/2013 7:49:21 AM |  | [task] Suspending GPU computation - user request
...


Now, unless I edit the State file and remove all those Active tasks, they will stay until the 4670 runs them, and that will take a while...
4/14/2013 7:49:21 AM |  | [task] Suspending GPU computation - user request
4/14/2013 8:02:30 AM |  | Re-reading cc_config.xml
4/14/2013 8:02:30 AM |  | Using proxy info from GUI
4/14/2013 8:02:30 AM |  | Using HTTP proxy 192.168.1.3:5555
4/14/2013 8:02:30 AM | SETI@home | Config: excluded GPU.  Type: ATI.  App: astropulse_v6.  Device: 1
4/14/2013 8:02:30 AM |  | log flags: file_xfer, sched_ops, task
4/14/2013 8:02:40 AM |  | [task] Resuming GPU computation
4/14/2013 8:02:40 AM | SETI@home | Restarting task ap_31my12ae_B5_P1_00361_20130325_27308.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 0
ID: 1356815
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1356828 - Posted: 14 Apr 2013, 13:45:13 UTC - in response to Message 1356815.  
Last modified: 14 Apr 2013, 14:01:52 UTC

The 4670 just finished the task it was working on and started one of the 'Failed' 3650 tasks. This is the stderr text from Slot 2;
4/14/2013 9:25:49 AM | SETI@home | Computation for task ap_31my12ae_B5_P1_00361_20130325_27308.wu_0 finished
4/14/2013 9:25:49 AM | SETI@home | Restarting task ap_30no12ae_B4_P1_00309_20130331_13224.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 2

Running on device number: 1
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
WARNING: BOINC supplied wrong platform!
BOINC assigns device 1
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
ERROR: clCreateContext: -33
Creating Command Queue. (clCreateCommandQueue) -34
Running on device number: 0
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0
Info: BOINC provided device ID used
Used GPU device parameters are:
	Number of compute units: 8
	Single buffer allocation size: 128MB
	max WG size: 128

Build features: Non-graphics	OpenCL	OCL_ZERO_COPY	COMBINED_DECHIRP_KERNEL	FFTW	USE_INCREASED_PRECISION	USE_SSE2	x86	
     CPUID: Pentium(R) Dual-Core  CPU      E5200  @ 2.50GHz 

     Cache: L1=64K L2=2048K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 
AstroPulse v.6
Non-graphics	FFTW	USE_CONVERSION_OPT	
Windows x86 rev 1761, V6 match, by Raistmer with support of Lunatics.kwsn.net team.	SSE2

OpenCL version by Raistmer...


I've seen that Warning in quite a few other host results;
"WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities"
ID: 1356828
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1356833 - Posted: 14 Apr 2013, 14:34:52 UTC - in response to Message 1356828.  

I've seen that Warning in quite a few other host results;
"WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities"

It's nothing to worry about. Boinc 6 tells the app to start on a CAL device; the app just converts that to the OpenCL device ID and starts. Eric has made an app_version for Boinc 6 hosts, i.e. (ati_opencl_100).
Boinc 7 has OpenCL detection, so it asks the app to start on an OpenCL device ID; its plan class is opencl_ati_100. (Boinc 7 hosts might also still get work from the ati_opencl_100 plan class; I did ask Eric to stop that.)

Claggy
ID: 1356833
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1356845 - Posted: 14 Apr 2013, 15:34:01 UTC - in response to Message 1356833.  
Last modified: 14 Apr 2013, 15:39:40 UTC

The ati_opencl_* plan class is actually a CAL plan class.
So BOINC attempts to launch on a CAL device. The app should not allow that. But in the log we see that, with the wrong BOINC proposal, the app's own queue creation fails too.

TBar, could you try adding -gpu_lock -instances_per_device 1 to the ap_cmdline*.txt file?

WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
ERROR: clCreateContext: -33
Creating Command Queue. (clCreateCommandQueue) -34

these lines worry me.

#define CL_INVALID_DEVICE                           -33
#define CL_INVALID_CONTEXT                          -34


Hence, the app says it will use its own enumeration but fails with that.
What happens if -gpu_lock is in place?
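
To make the two error lines easier to read, the sketch below pulls the OpenCL call name and status code out of the stderr text and looks up the symbolic name. It is only an illustration: the lookup table contains just the two codes quoted above plus success, and `decode_cl_errors` and the regular expression are hypothetical helpers, not part of the AstroPulse app or of BOINC.

```python
import re

# Only the codes that appear in this thread, plus success. The two negative
# values match the #define lines quoted above.
CL_STATUS_NAMES = {
    0: "CL_SUCCESS",
    -33: "CL_INVALID_DEVICE",
    -34: "CL_INVALID_CONTEXT",
}

# Matches both "clCreateContext: -33" and "(clCreateCommandQueue) -34".
CL_CALL_RE = re.compile(r"(cl[A-Z]\w*)\)?:?\s*(-\d+)")


def decode_cl_errors(stderr_text):
    """Return (api_call, code, symbolic_name) for each OpenCL error found."""
    results = []
    for call, code_text in CL_CALL_RE.findall(stderr_text):
        code = int(code_text)
        results.append((call, code, CL_STATUS_NAMES.get(code, "unknown")))
    return results


# The three stderr lines under discussion, copied from the post.
SAMPLE_STDERR = """\
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
ERROR: clCreateContext: -33
Creating Command Queue. (clCreateCommandQueue) -34
"""

decoded = decode_cl_errors(SAMPLE_STDERR)
for call, code, name in decoded:
    print(f"{call} returned {code} ({name})")
```

Read this way, the context is refused for an invalid device, and the command queue then fails because there is no valid context to attach it to.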
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1356845
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1356850 - Posted: 14 Apr 2013, 15:55:28 UTC - in response to Message 1356815.  
Last modified: 14 Apr 2013, 15:59:27 UTC

This is what happens when I remove the Exclude GPU.

Don't remove the exclude GPU. You're running an app_info, and while you can have multiple <app_versions> in an app_info for backward compatibility when upgrading from Stock,
you can't have an <app_version> for CAL devices and another for AMD/ATI OpenCL devices, Boinc will just start work on both GPUs because you're supplying the apps in the app_info:

Anonymous platform

If you try running Stock apps you might find it'll work differently, then Boinc might only try starting OpenCL apps on the OpenCL device.

Claggy
ID: 1356850
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1356879 - Posted: 14 Apr 2013, 17:50:39 UTC - in response to Message 1356850.  
Last modified: 14 Apr 2013, 17:50:56 UTC


you can't have an <app_version> for CAL devices and another for AMD/ATI OpenCL devices, Boinc will just start work on both GPUs because you're supplying the apps in the app_info:

Claggy


??? If app_info contains only an OpenCL plan class, why would BOINC use a CAL device?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1356879
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1356909 - Posted: 14 Apr 2013, 18:42:05 UTC - in response to Message 1356845.  
Last modified: 14 Apr 2013, 18:44:35 UTC

The ati_opencl_* plan class is actually a CAL plan class.
So BOINC attempts to launch on a CAL device. The app should not allow that. But in the log we see that, with the wrong BOINC proposal, the app's own queue creation fails too.

TBar, could you try adding -gpu_lock -instances_per_device 1 to the ap_cmdline*.txt file?

WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
ERROR: clCreateContext: -33
Creating Command Queue. (clCreateCommandQueue) -34

these lines worry me.

#define CL_INVALID_DEVICE                           -33
#define CL_INVALID_CONTEXT                          -34


Hence, the app says it will use its own enumeration but fails with that.
What happens if -gpu_lock is in place?

I added the -gpu_lock -instances_per_device 1 to the cmd file. The next ATI task should start in about 1.2 hours using slot 3. Slot 3 already has the files from the failed attempt with the stderr text file reading;
Running on device number: 1
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
WARNING: BOINC supplied wrong platform!
BOINC assigns device 1
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
ERROR: clCreateContext: -33
Creating Command Queue. (clCreateCommandQueue) -34
ID: 1356909
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1356912 - Posted: 14 Apr 2013, 18:44:54 UTC - in response to Message 1356879.  


you can't have an <app_version> for CAL devices and another for AMD/ATI OpenCL devices, Boinc will just start work on both GPUs because you're supplying the apps in the app_info:

Claggy


??? If app_info contains only an OpenCL plan class, why would BOINC use a CAL device?

From the app_info we have one <coproc> type, and it's just called ATI, or CUDA, or intel_gpu. That's what determines which device Boinc will start an app on; I don't think the <plan_class> matters to Boinc at all:

<app_version>
<app_name>astropulse_v6</app_name>
<version_num>604</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>0.04</avg_ncpus>
<max_ncpus>0.2</max_ncpus>
<plan_class>ati_opencl_100</plan_class>
<cmdline></cmdline>
<coproc>
<type>ATI</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>AP6_win_x86_SSE2_OpenCL_ATI_r1812.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libfftw3f-3.dll</file_name>
</file_ref>
<file_ref>
<file_name>ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt</file_name>
<open_name>ap_cmdline.txt</open_name>
</file_ref>
</app_version>
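
To see at a glance which fields are in play, the sketch below reads a trimmed copy of the fragment above with Python's standard library and prints the plan class next to the coprocessor type. The `summarize_app_version` helper is hypothetical and only inspects the XML; it says nothing about how the BOINC client itself interprets these fields.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <app_version> fragment quoted above (file_ref entries omitted).
APP_VERSION_XML = """\
<app_version>
<app_name>astropulse_v6</app_name>
<version_num>604</version_num>
<platform>windows_intelx86</platform>
<plan_class>ati_opencl_100</plan_class>
<coproc>
<type>ATI</type>
<count>1</count>
</coproc>
</app_version>
"""


def summarize_app_version(xml_text):
    """Extract the fields discussed in the thread from one <app_version>."""
    root = ET.fromstring(xml_text)
    return {
        "app_name": root.findtext("app_name"),
        "plan_class": root.findtext("plan_class"),
        "coproc_type": root.findtext("coproc/type"),
        "coproc_count": float(root.findtext("coproc/count")),
    }


summary = summarize_app_version(APP_VERSION_XML)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The point Claggy makes is visible in the output: the only hardware selector in the entry is the generic coproc type ATI, while the CAL versus OpenCL distinction appears only in the plan class string.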

Claggy
ID: 1356912
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1356930 - Posted: 14 Apr 2013, 19:25:47 UTC - in response to Message 1356912.  

I see, thanks, Claggy. Looks like a BOINC design flaw to me.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1356930
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1356949 - Posted: 14 Apr 2013, 20:50:00 UTC
Last modified: 14 Apr 2013, 21:07:22 UTC

These are the results from the new cmd entries;
-unroll 4 -ffa_block 2048 -ffa_block_fetch 1024 -gpu_lock -instances_per_device 1 -hp

Running on device number: 1
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
WARNING: BOINC supplied wrong platform!
BOINC assigns device 1
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
ERROR: clCreateContext: -33
Creating Command Queue. (clCreateCommandQueue) -34
Running on device number: 0
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Old way GPU lock enabled. Use -instances_per_device N switch to provide number of instances to run.
Number of app instances per device set to:1
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0, slots 0 to 0 (including) will be checked
Used slot is 0;	Info: BOINC provided device ID used
Used GPU device parameters are:
	Number of compute units: 8
	Single buffer allocation size: 128MB
	max WG size: 128
ID: 1356949
Charlie Fenton
Joined: 3 Apr 99
Posts: 35
Credit: 6,090,490
RAC: 5
Message 1356955 - Posted: 14 Apr 2013, 21:24:49 UTC - in response to Message 1356949.  
Last modified: 14 Apr 2013, 21:25:11 UTC

Thanks, Claggy, for your explanation of the need for the exclude GPU.

TBar, please understand that the issue I am working on here is BOINC's identification of GPUs. Work fetch is a different matter right now. Please try it with the 3650 or 2600 in the high slot and the 4670 in the low slot, and post the output from BOINC identifying the GPUs. As Claggy suggested earlier, this is the test result I really need to see.

Note that you will probably need to change the exclude GPU to GPU 1 when you do this.

Thank you.
Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 1356955
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1356961 - Posted: 14 Apr 2013, 21:58:13 UTC - in response to Message 1356955.  

Thanks, Claggy, for your explanation of the need for the exclude GPU.

TBar, please understand that the issue I am working on here is BOINC's identification of GPUs. Work fetch is a different matter right now. Please try it with the 3650 or 2600 in the high slot and the 4670 in the low slot, and post the output from BOINC identifying the GPUs. As Claggy suggested earlier, this is the test result I really need to see.

Note that you will probably need to change the exclude GPU to GPU 1 when you do this.

Thank you.

There are other people using the machine at present. You will have to wait until late tonight, when I can pull the machine apart again. I can just about guarantee you it will say the same thing it did last time;

3/21/2013 6:53:33 PM | | CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
3/21/2013 6:53:33 PM | | CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 992 GFLOPS peak)
3/21/2013 6:53:33 PM | | OpenCL: AMD/ATI GPU 0: ATI Radeon HD 2600 (RV630) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 348 GFLOPS peak)
3/21/2013 6:53:33 PM | SETI@home | Found app_info.xml; using anonymous platform
3/21/2013 6:53:33 PM | | App version needs OpenCL but GPU doesn't support it


This is due to the fact that the Work Fetch is still identifying the machine as having 2 OpenCL cards, and BOINC is still trying to start a task using the 3650. Nothing has changed. Back in November the machine WAS running Stock, without an App_Info file, and the results were the same as they are now. Note the title of the thread; How to get 2 Stock ATI AP tasks run on your 4670 at the same time, Sorta

The easiest way to check if BOINC is working correctly with the 3650 is to see if Work Fetch is still asking for work for two cards. When Work Fetch asks for work for One Card, the rest will work. I'd bet on it.
ID: 1356961
Charlie Fenton
Joined: 3 Apr 99
Posts: 35
Credit: 6,090,490
RAC: 5
Message 1357062 - Posted: 15 Apr 2013, 5:24:32 UTC - in response to Message 1356961.  
Last modified: 15 Apr 2013, 5:29:23 UTC

I can just about guarantee you it will say the same thing it did last time;

3/21/2013 6:53:33 PM | | CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
3/21/2013 6:53:33 PM | | CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 992 GFLOPS peak)
3/21/2013 6:53:33 PM | | OpenCL: AMD/ATI GPU 0: ATI Radeon HD 2600 (RV630) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 348 GFLOPS peak)
3/21/2013 6:53:33 PM | SETI@home | Found app_info.xml; using anonymous platform
3/21/2013 6:53:33 PM | | App version needs OpenCL but GPU doesn't support it

I am hoping it will say something like this instead:
CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 992 GFLOPS peak)
OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 960 GFLOPS peak)

(It should identify the OpenCL device as the 4600 series, and as GPU 1, not GPU 0.)

Once again, work fetch is a related issue but not the one I am trying to solve right now. We can't begin to tackle that until I get BOINC's basic identification of the GPUs working (i.e., get the display to correctly identify which GPU is which.)

Also, I think I had the exclude GPU thing backwards: I think you may need to change it from GPU 1 to GPU 0.
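
For reference, the exclude entry being discussed lives in cc_config.xml. The sketch below builds one with Python's standard library so the device number can be changed in one place. Treat it as an assumption-laden illustration: the child element names (url, device_num, type, app) are taken from BOINC's client configuration documentation and should be checked against your client version, the project URL is inferred from the project directory name in the logs, and `exclude_gpu_fragment` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed project URL; verify against the URL shown in your own BOINC Manager.
SETI_URL = "http://setiathome.berkeley.edu/"


def exclude_gpu_fragment(url, device_num, gpu_type, app):
    """Build an <exclude_gpu> element as a string for cc_config.xml."""
    node = ET.Element("exclude_gpu")
    ET.SubElement(node, "url").text = url
    ET.SubElement(node, "device_num").text = str(device_num)
    ET.SubElement(node, "type").text = gpu_type
    ET.SubElement(node, "app").text = app
    return ET.tostring(node, encoding="unicode")


# Excluding device 0 instead of device 1, as suggested for the swapped slots.
fragment = exclude_gpu_fragment(SETI_URL, 0, "ATI", "astropulse_v6")
print(fragment)
```

The generated element belongs inside the options section of cc_config.xml; the type and app values mirror the "Config: excluded GPU" line that BOINC prints at startup.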

Cheers,
--Charlie
Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 1357062
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1357079 - Posted: 15 Apr 2013, 6:31:27 UTC - in response to Message 1357062.  

Seems you're right, it's not the same. It's even stranger now. A task was running before the swap. I didn't change the Exclude, now, it's still running. The start-up says;
4/15/2013 2:04:05 AM |  | CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
4/15/2013 2:04:05 AM |  | CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 960 GFLOPS peak)
4/15/2013 2:04:05 AM |  | OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 960 GFLOPS peak)
4/15/2013 2:04:05 AM | SETI@home | Config: excluded GPU.  Type: ATI.  App: astropulse_v6.  Device: 1
4/15/2013 2:04:05 AM | SETI@home | Restarting task ap_30no12ae_B5_P0_00178_20130331_19812.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 1
4/15/2013 2:04:05 AM | SETI@home | Restarting task ap_21fe13ab_B6_P0_00399_20130411_01014.wu_0 using astropulse_v6 version 601 in slot 0
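
The detection lines above can be compared mechanically. The sketch below is a hypothetical helper, not part of BOINC: it parses the three startup lines from this post and reports which device numbers appear under CAL but not under OpenCL. The regular expression covers only the plain line format shown here; variants such as "(not used)" or "(ignored by config)" seen earlier in the thread would need extra handling.

```python
import re

# The three detection lines from the startup log in this post.
STARTUP_LOG = """\
4/15/2013 2:04:05 AM |  | CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
4/15/2013 2:04:05 AM |  | CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 960 GFLOPS peak)
4/15/2013 2:04:05 AM |  | OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 960 GFLOPS peak)
"""

# Captures the API (CAL or OpenCL), the device number, and the device name.
GPU_LINE_RE = re.compile(
    r"(CAL|OpenCL): (?:AMD/)?ATI GPU (\d+): (.+?) \((?:CAL|driver) version"
)


def detected_gpus(log_text):
    """Map each API to {device_number: device_name} from BOINC startup lines."""
    found = {"CAL": {}, "OpenCL": {}}
    for line in log_text.splitlines():
        match = GPU_LINE_RE.search(line)
        if match:
            api, index, name = match.groups()
            found[api][int(index)] = name
    return found


gpus = detected_gpus(STARTUP_LOG)
cal_only = sorted(set(gpus["CAL"]) - set(gpus["OpenCL"]))

for index in cal_only:
    print(f"GPU {index} ({gpus['CAL'][index]}) is listed for CAL only")
for index, name in sorted(gpus["OpenCL"].items()):
    print(f"GPU {index} ({name}) is listed for OpenCL")
```

For this log, the CAL-only device is GPU 0 and the OpenCL device is GPU 1, while the exclude entry in the same log still names Device 1.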


OK, the Slot data says;

Running on device number: 0
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Number of app instances per device set to:1
Old way GPU lock enabled. Use -instances_per_device N switch to provide number of instances to run.
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0, slots 0 to 0 (including) will be checked
Used slot is 0;	Info: BOINC provided device ID used
Used GPU device parameters are:
	Number of compute units: 8
	Single buffer allocation size: 128MB
	max WG size: 128

Build features: Non-graphics	OpenCL	OCL_ZERO_COPY	COMBINED_DECHIRP_KERNEL	FFTW	USE_INCREASED_PRECISION	USE_SSE2	x86	
     CPUID: Pentium(R) Dual-Core  CPU      E5200  @ 2.50GHz 

     Cache: L1=64K L2=2048K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 
AstroPulse v.6
Non-graphics	FFTW	USE_CONVERSION_OPT	
Windows x86 rev 1761, V6 match, by Raistmer with support of Lunatics.kwsn.net team.	SSE2

OpenCL version by Raistmer

oclFFT fix for ATI GPUs by Urs Echternacht
ffa threshold mods by Joe Segur
SSE3 dechirping by JDWhale
Combined dechirp kernel by Frizz
Number of OpenCL platforms:				 1


 OpenCL Platform Name:					 AMD Accelerated Parallel Processing
Number of devices:				 1
  Max compute units:				 8
  Max work group size:				 128
  Max clock frequency:				 775Mhz
  Max memory allocation:			 134217728
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 536870912
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Global
  Local memory size:				 16384
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 ATI RV730
  Vendor:					 Advanced Micro Devices, Inc.
  Driver version:				 CAL 1.4.1734
  Version:					 OpenCL 1.0 AMD-APP (937.2)
  Extensions:					 cl_khr_gl_sharing cl_amd_device_attribute_query cl_khr_d3d10_sharing 


state.fold_buf_size_short=65536; state.fold_buf_size_long=262144
WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\clFFTplan_ATIRV730_32768_r1761.bin, continue with recompile...
WARNING: patching required max_kernel_wg_size=32
GPU device synched
Termination request detected or computations are finished. GPU device synched,  exiting...
Running on device number: 0
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Number of app instances per device set to:1
Old way GPU lock enabled. Use -instances_per_device N switch to provide number of instances to run.
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0, slots 0 to 0 (including) will be checked
Used slot is 0;	Info: BOINC provided device ID used
Used GPU device parameters are:
	Number of compute units: 8
	Single buffer allocation size: 128MB
	max WG size: 128

Build features: Non-graphics	OpenCL	OCL_ZERO_COPY	COMBINED_DECHIRP_KERNEL	FFTW	USE_INCREASED_PRECISION	USE_SSE2	x86	
     CPUID: Pentium(R) Dual-Core  CPU      E5200  @ 2.50GHz 

     Cache: L1=64K L2=2048K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 
### Restart at 13.51 percent.
state.fold_buf_size_short=65536; state.fold_buf_size_long=262144

I'm still using -unroll 4 -ffa_block 2048 -ffa_block_fetch 1024 -instances_per_device 1 -gpu_lock -hp

According to the above, it's running an AP on the 3650. But, we all know that can't be the case. GPUz agrees, the 4670 is at 99% while the 3650 is at 1% usage. At least it's still working....for now.
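
For anyone who wants to check the same thing on their own host, here is a minimal sketch that pulls the two index numbers out of the text above and compares them. The two input strings are copied from the logs in this post. The function names are made up for illustration and are not part of BOINC or the AstroPulse app.

```python
import re

# Line copied from the BOINC start-up messages above.
STARTUP_LINE = (
    "4/15/2013 2:04:05 AM |  | OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) "
    "(driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), "
    "1024MB, 992MB available, 960 GFLOPS peak)"
)

# Lines copied from the slot data (app stderr) above.
SLOT_STDERR = """Running on device number: 0
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0, slots 0 to 0 (including) will be checked
"""


def boinc_opencl_index(line):
    """Return N from an 'OpenCL: AMD/ATI GPU N:' start-up line, or None."""
    match = re.search(r"OpenCL: AMD/ATI GPU (\d+):", line)
    return int(match.group(1)) if match else None


def app_device_number(stderr_text):
    """Return N from the app's 'Running on device number: N' line, or None."""
    match = re.search(r"^Running on device number: (\d+)", stderr_text, re.MULTILINE)
    return int(match.group(1)) if match else None


boinc_idx = boinc_opencl_index(STARTUP_LINE)
app_idx = app_device_number(SLOT_STDERR)

print(f"BOINC start-up log: OpenCL GPU {boinc_idx}; app stderr: device {app_idx}")
if boinc_idx != app_idx:
    # The script only reports the mismatch; it cannot tell which number is "right".
    print("The two index numbers differ.")
```

It only compares the printed numbers. Which card actually does the work still has to be checked with something like GPUz.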
ID: 1357079
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1357106 - Posted: 15 Apr 2013, 7:41:24 UTC - in response to Message 1357062.  


OpenCL: AMD/ATI GPU 1: OpenCL: AMD/ATI GPU 0: ATI Radeon HD 4600 series (R730) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 960 GFLOPS peak)

(It should identify the OpenCL device as the 4600 series, and as GPU 1, not GPU 0.)

--Charlie


Charlie, GPU 1 or GPU 0? Both are mentioned in a single string.
Also, what device number (via --device N) will BOINC report to the science app?
If it is --device 1 while the host has only a single OpenCL-capable card and counting starts from 0, that would be strange at least.

SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1357106
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1357110 - Posted: 15 Apr 2013, 7:51:22 UTC - in response to Message 1357079.  
Last modified: 15 Apr 2013, 7:53:57 UTC

The start-up says;
[code]
4/15/2013 2:04:05 AM | | CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 992MB available, 348 GFLOPS peak)
4/15/2013 2:04:05 AM | | CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 1024MB, 992MB available, 960 GFLOPS peak)
4/15/2013 2:04:05 AM | | OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) (driver version CAL 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 960 GFLOPS peak)
4/15/2013 2:04:05 AM | SETI@home | Config: excluded GPU. Type: ATI. App: astropulse_v6. Device: 1
[/code]
OK, the Slot data says;

[code]
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0, slots 0 to 0 (including) will be checked
Used slot is 0; Info: BOINC provided device ID used
[/code]


I see an inconsistency here.
The app was apparently started with --device 0, while BOINC reports the card to the user as device 1 (compare "OpenCL: AMD/ATI GPU 1" in the start-up log with "BOINC assigns device 0" in the slot data).
The app did its own OpenCL-only enumeration and, of course, sees a single OpenCL-capable device, enumerated as device 0 (as it should be, since by convention indexes start from 0, just as in a C array).
IMHO it would be better to change the wording in the BOINC report to "OpenCL: AMD/ATI GPU 0".
If you want to make it apparent which GPU is which, add the PCIe number (for example) to both devices.
Then the same PCIe identifier will appear next to CAL device 1 and OpenCL device 0, where device N means the index/offset in a homogeneous device list. If the user has an NV device, or let's say a RAID controller, in an earlier slot, we would not count our devices as device 2, 3 and so on, right? Each list of similar devices starts from zero.
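
To illustrate the idea, here is a rough sketch of per-API enumeration. It is not BOINC or app code. The `Gpu` class, the helper names and the PCI bus numbers are all invented for the example. Only the card names and the CAL ordering are taken from the start-up log quoted above.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    pci_bus: int          # hypothetical bus number, not taken from the log
    supports_cal: bool
    supports_opencl: bool


# Physical cards in the order CAL lists them in the start-up log.
CARDS = [
    Gpu("ATI Radeon HD 2600 (RV630)", pci_bus=1,
        supports_cal=True, supports_opencl=False),
    Gpu("ATI Radeon HD 4600 series (R730)", pci_bus=2,
        supports_cal=True, supports_opencl=True),
]


def enumerate_api(cards, capability):
    """Build one API's device list; its indexes always start from 0."""
    return [card for card in cards if getattr(card, capability)]


def find_index(devices, pci_bus):
    """Return the index of the card with this bus number in one API's list."""
    for index, card in enumerate(devices):
        if card.pci_bus == pci_bus:
            return index
    return None


cal_list = enumerate_api(CARDS, "supports_cal")
ocl_list = enumerate_api(CARDS, "supports_opencl")

# Print each list with the bus number attached, as suggested above.
for api, devices in (("CAL", cal_list), ("OpenCL", ocl_list)):
    for index, card in enumerate(devices):
        print(f"{api} GPU {index}: {card.name} [PCI bus {card.pci_bus}]")
```

With the bus number printed next to each entry, the same card can be recognised as CAL device 1 and OpenCL device 0 without the two index numbers having to agree.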
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1357110


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.