Downloading tasks for offline re-check

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1991708 - Posted: 27 Apr 2019, 15:03:06 UTC

Oh my.... seems another Fermi card has hooked up with the CUDA 75 App, which we know isn't supposed to work. Unfortunately, it seems you can't block the older cards from being sent the CUDA 75 App, just as you can't block the ATI HD4 cards from being sent the HD5 App. The Fermi cards are supposed to be blocked from the CUDA 75 App. Anyway, it seems some of the tasks are Validating somehow, and since the times are better, somehow, than the OpenCL App the server insists on sending the CUDA App instead. I'm not real sure how some tasks are validating, the results only display Errors. Really strange stuff going on here, https://setiathome.berkeley.edu/results.php?hostid=7568678&offset=440
ID: 1991708
MFWBBB

Joined: 8 Feb 01
Posts: 3
Credit: 2,173,541
RAC: 2
United States
Message 1991715 - Posted: 27 Apr 2019, 16:54:20 UTC

I'm unable to download any new tasks. Just updated to the new bionic program, same results. Been down for a while now. Any ideas as to what I should do?
ID: 1991715
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1991724 - Posted: 27 Apr 2019, 19:07:28 UTC - in response to Message 1991617.  

Hmmmm, the amount of Memory Booster needed to fix that post would probably pose a health risk. Perhaps you should ask Raistmer where he was getting all those Mac Apps he passed on to Eric.

Well, the reminder I actually need is: have we had any of Petri's builds on beta yet, or still not?
For the OpenCL issues on Mac, a good solution could be to abandon OpenCL in favor of Petri's CUDA special (which would require passing through beta), at least for NV.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1991724
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1991734 - Posted: 27 Apr 2019, 20:27:48 UTC - in response to Message 1991724.  

At this time there isn't a CUDA Special App at Beta. I do think the current v0.98b1 would be an excellent candidate though. The VRAM requirements have been lowered to where it will work acceptably on the many 2 GB GPUs. Also, the problem with the Arecibo tasks has been greatly reduced to where there are very few bad best pulses on the VHAR tasks while the speed on the VLARs is much better. The problem might be trying to block the older pre-Maxwell GPUs. As we've just seen, apparently it isn't possible to block the pre-Kepler GPUs from the Old CUDA 75 App. If it isn't possible to block the Older CUDA cards from the Special App it could be a horrible display of failed tasks. There is an 'updated' version 0.99 of the Special App in the works; however, the more I think about it, the less I think its 'features' would be desirable on the SETI Server. There are problems with the Mac 0.98 version right now, but the Linux version appears to be ready for Beta, IMO, provided the older GPUs could be blocked from being sent the App.
ID: 1991734
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1991736 - Posted: 27 Apr 2019, 20:58:45 UTC - in response to Message 1991734.  
Last modified: 27 Apr 2019, 21:12:24 UTC

I'm not sure whether Eric has preferred to specify plan classes in XML or plan classes in C++. If the former, he'd have to switch to C++, but then there'd be no problem in specifying both the minimum and maximum compute capability: for NVidia, that's enough to pick out card generations precisely. You can set min display driver version and min video RAM as well, of course.

The only thing that's needed is for the developer to make it explicitly clear what the requirements of his or her application are, so that Eric can deploy it properly.

Edit - the big problem comes in when a user has both a new generation card and an older generation card in the same computer, and manually sets BOINC to use all GPUs. The plan class mechanism can't overrule the user's choices, to keep the restricted application off the lesser card.
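
As a rough illustration of the kind of check a plan class expresses, here is a minimal Python sketch. It is not BOINC's scheduler code: the class names, the plan class names, and the driver and VRAM thresholds are all hypothetical, chosen only to show a minimum and maximum compute capability combined with driver and memory limits. The compute capabilities used for the two cards (2.0 for a Fermi GTX 580, 5.2 for a GTX 970) are the real values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, int]  # (major, minor); tuples compare element by element


@dataclass(frozen=True)
class Gpu:
    name: str
    compute_capability: Version
    driver_version: Version  # illustrative values only
    vram_mb: int


@dataclass(frozen=True)
class PlanClass:
    """Hypothetical stand-in for a plan class definition."""
    name: str
    min_cc: Version
    max_cc: Optional[Version]  # None means no upper bound
    min_driver: Version
    min_vram_mb: int

    def accepts(self, gpu: Gpu) -> bool:
        if gpu.compute_capability < self.min_cc:
            return False
        if self.max_cc is not None and gpu.compute_capability > self.max_cc:
            return False
        if gpu.driver_version < self.min_driver:
            return False
        return gpu.vram_mb >= self.min_vram_mb


# Assumed limits: "Maxwell or newer, 2 GB" for a special app,
# and "Fermi through Kepler" for a legacy app.
SPECIAL = PlanClass("cuda_special_example", (5, 0), None, (384, 0), 2048)
LEGACY = PlanClass("cuda_legacy_example", (2, 0), (3, 7), (300, 0), 1024)

fermi = Gpu("GeForce GTX 580", (2, 0), (390, 0), 1536)
maxwell = Gpu("GeForce GTX 970", (5, 2), (418, 0), 4096)

for plan in (SPECIAL, LEGACY):
    for gpu in (fermi, maxwell):
        print(plan.name, gpu.name, plan.accepts(gpu))
```

The point of the upper bound is that a window of compute capabilities picks out a card generation, which a minimum alone cannot do.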
ID: 1991736
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1991785 - Posted: 28 Apr 2019, 7:35:26 UTC - in response to Message 1991736.  


Edit - the big problem comes in when a user has both a new generation card and an older generation card in the same computer, and manually sets BOINC to use all GPUs. The plan class mechanism can't overrule the user's choices, to keep the restricted application off the lesser card.

The BOINC client could obey plan class restrictions too. It actually knows all the needed info: it's the client that reports precise host info to the server, so that the server can choose what to send and what not to send.
In return, the client receives the plan class along with its restrictions. Why not do correct scheduling then?

I think the app itself should have fail-safe measures like those I embed in the OpenCL apps: the app checks GPU compatibility and, if the check fails, just ends with an error code instead of producing wrong results.
It will still be a problem on mixed-GPU hosts, but at least it will not result in any problems with the validity of reported results.
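
The fail-safe idea can be sketched roughly as follows. This is only an illustration in Python, not the check used in the real OpenCL apps: the minimum compute capability, the exit code values, and the function names are assumptions, and a real app would query the device itself rather than take the compute capability as an argument.

```python
import sys
from typing import Callable, Tuple

MIN_SUPPORTED_CC: Tuple[int, int] = (5, 0)  # assumed requirement, for illustration
EXIT_OK = 0
EXIT_UNSUPPORTED_GPU = 1  # hypothetical code, not a real BOINC constant


def is_supported(device_cc: Tuple[int, int]) -> bool:
    """Return True if the device meets the app's minimum compute capability."""
    return device_cc >= MIN_SUPPORTED_CC


def run_task(device_cc: Tuple[int, int], crunch: Callable[[], None]) -> int:
    """Refuse to compute on an unsupported GPU instead of risking bad results."""
    if not is_supported(device_cc):
        print(f"unsupported GPU, compute capability {device_cc}", file=sys.stderr)
        return EXIT_UNSUPPORTED_GPU
    crunch()
    return EXIT_OK


if __name__ == "__main__":
    # A Fermi-class device (2.0) is rejected before any work is done.
    code = run_task((2, 0), lambda: print("crunching..."))
    print("exit code:", code)
```

The task still fails on the unsupported card, but it fails as an error rather than as a result that has to be caught by validation.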
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1991785
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1991788 - Posted: 28 Apr 2019, 8:25:31 UTC - in response to Message 1991785.  

Actually, not. The client knows everything about your machine, but it only reports the best bits.

If I look inside the sched_request file on this machine, it starts

<coproc_cuda>
   <count>2</count>
   <name>GeForce GTX 970</name>
   <available_ram>3215065088.000000</available_ram>
...
- but actually, that's one GTX 970 and one GTX 750 Ti. There's no mention of the 750 Ti in any of the information which the server uses to decide on work allocation.
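
To make the consequence concrete, here is a small Python sketch that parses a fragment like the one quoted above and expands it the only way the reported fields allow. The closing tag is added so the fragment parses, the elided fields are simply left out, and `naive_server_view` is a hypothetical helper, not the server's actual logic.

```python
import xml.etree.ElementTree as ET

# Fields copied from the quoted fragment; closing tag added so it parses.
SCHED_REQUEST_FRAGMENT = """
<coproc_cuda>
   <count>2</count>
   <name>GeForce GTX 970</name>
   <available_ram>3215065088.000000</available_ram>
</coproc_cuda>
"""

# What is physically installed in the host, per the post.
INSTALLED = ["GeForce GTX 970", "GeForce GTX 750 Ti"]


def naive_server_view(xml_text: str) -> list:
    """Expand a count plus a single name into a list of GPUs.

    With only these fields available, every GPU looks like the named one.
    """
    node = ET.fromstring(xml_text.strip())
    count = int(node.findtext("count"))
    name = node.findtext("name")
    return [name] * count


if __name__ == "__main__":
    print("installed:", INSTALLED)
    print("reported: ", naive_server_view(SCHED_REQUEST_FRAGMENT))
```

Any decision made from the reported view treats the second GPU as another GTX 970, which is why a plan class alone cannot keep a restricted app off the lesser card.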

David made that design decision way back before GPU computing was first launched in 2008, and by the time of the 2014 BOINC Workshop in Budapest, he was acknowledging in public that it was a mistake. But it would be a huge job to rectify it in both client and server now, and I don't think anybody has the strength to tackle it. So, for the foreseeable future, we have to work around and with what we've got, and try not to fight against it.
ID: 1991788
Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 715
Credit: 8,032,827
RAC: 62
France
Message 1991791 - Posted: 28 Apr 2019, 9:37:46 UTC

it seems that this IS the problem on these
two new hosts https://setiathome.berkeley.edu/forum_thread.php?id=69782&postid=1991790#1991790

bad app selected .
ID: 1991791
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1991802 - Posted: 28 Apr 2019, 13:55:40 UTC - in response to Message 1991791.  

it seems that this IS the problem on these
two new hosts https://setiathome.berkeley.edu/forum_thread.php?id=69782&postid=1991790#1991790

bad app selected .
No, neither of those - they each have just the one (mobile) GPU. There must be a reason they're getting a bad app, but it can't be because a good GPU and a bad GPU (in app terms) are being muddled up in the work allocation process.

Perhaps the slightly different specs of the mobile versions (they tend to have less dedicated memory?) weren't taken account of when the plan class limits were being drawn up for stock deployment?
ID: 1991802
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1993100 - Posted: 8 May 2019, 12:58:30 UTC - in response to Message 1991724.  

...Well, what reminder actually I need is do we had any of Petri's builds on beta already or still not?
It appears we've narrowed down one of the last problems, where the App selects the wrong Best Pulse when running the Arecibo VHAR tasks. Seems the determining factor is how many Compute Units the GPU has. In offline tests it was noticed the GPU with 8 CUs always selected the correct Best Pulse whereas the GPUs with 5 or 6 CUs tended to select the Wrong Pulse as Best. Since running the GPU with 8 CUs at Beta the Bad Best Pulses have disappeared and the only Inconclusive results are from the Quick Overflows and "misbehaving apps/machines". Notice the difference when changing from a 750 Ti with 5 CUs to a 960 with 8 CUs, https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=72229&state=3 As soon as that is fixed...
ID: 1993100


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.