.vlar WUs to NVIDIA GPUs (Problem Solved)

Message boards : Number crunching : .vlar WUs to NVIDIA GPUs (Problem Solved)

AuthorMessage
S@NL - John van Gorsel
Volunteer tester
Avatar

Send message
Joined: 5 Jul 99
Posts: 193
Credit: 139,673,078
RAC: 0
Netherlands
Message 1269911 - Posted: 10 Aug 2012, 8:36:25 UTC

Found 3 more vlar tasks sent to the GPU on 2 other PCs. Two were sent in response to a request at 23:12 CET, which would be before the new scheduler was active; the third was received after a request at 4:39 CET.
ID: 1269911 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1269912 - Posted: 10 Aug 2012, 8:39:13 UTC - in response to Message 1269911.  


I just got another VLAR on my GTX460.
Grant
Darwin NT
ID: 1269912 · Report as offensive
S@NL - John van Gorsel
Volunteer tester
Avatar

Send message
Joined: 5 Jul 99
Posts: 193
Credit: 139,673,078
RAC: 0
Netherlands
Message 1269919 - Posted: 10 Aug 2012, 9:18:31 UTC

And it gets worse: this host just received 6 vlars, the request is from 10:59 CET (8:59 UTC).

I will set No New Tasks on all of my PCs until this is fixed.
ID: 1269919 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1269924 - Posted: 10 Aug 2012, 10:05:50 UTC - in response to Message 1269919.  

And it gets worse: this host just received 6 vlars, the request is from 10:59 CET (8:59 UTC).

I will set No New Tasks on all of my PCs until this is fixed.

John - that host is running Linux. I'm not 100% sure whether the 'no VLAR to NV' policy used to be applied on all platforms, or just Windows. Have you ever seen VLARs before this?

I'll mention it in my next report to Eric, when the lab opens.
ID: 1269924 · Report as offensive
S@NL - John van Gorsel
Volunteer tester
Avatar

Send message
Joined: 5 Jul 99
Posts: 193
Credit: 139,673,078
RAC: 0
Netherlands
Message 1269928 - Posted: 10 Aug 2012, 10:24:03 UTC - in response to Message 1269924.  

John - that host is running Linux. I'm not 100% sure whether the 'no VLAR to NV' policy used to be applied on all platforms, or just Windows. Have you ever seen VLARs before this?

I'll mention it in my next report to Eric, when the lab opens.


Richard, I have 3 Linux hosts and before this, I never saw tasks with the vlar suffix going to the GPU
ID: 1269928 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1269930 - Posted: 10 Aug 2012, 10:39:48 UTC - in response to Message 1269928.  

John - that host is running Linux. I'm not 100% sure whether the 'no VLAR to NV' policy used to be applied on all platforms, or just Windows. Have you ever seen VLARs before this?

I'll mention it in my next report to Eric, when the lab opens.

Richard, I have 3 Linux hosts and before this, I never saw tasks with the vlar suffix going to the GPU

Thanks, I think that confirms that the previous policy applied cross-platform.

I wasn't sure, because of course the project only has stock applications for Windows.
ID: 1269930 · Report as offensive
Profile Gatekeeper
Avatar

Send message
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1270216 - Posted: 11 Aug 2012, 1:55:49 UTC
Last modified: 11 Aug 2012, 1:57:37 UTC

The fixed scheduler might be catching .0 and .1 vlars, but apparently not resends. Example
I've got 31 of these on my main rig since Thursday night Berkeley time, and over 110 altogether. With a cache of over 5000 WUs, I'm disinclined to mess with corrections; I'm just going to let them run out in their own time.
ID: 1270216 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1270319 - Posted: 11 Aug 2012, 10:07:13 UTC - in response to Message 1270216.  

The fixed scheduler might be catching .0 and .1 vlars, but apparently not resends. Example
I've got 31 of these on my main rig since Thursday night Berkeley time, and over 110 altogether. With a cache of over 5000 WUs, I'm disinclined to mess with corrections; I'm just going to let them run out in their own time.

Unfortunately not. I got 25ap11ac.19317.20930.3.10.218.vlar_1 sent to my laptop this morning.
ID: 1270319 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Send message
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1270361 - Posted: 11 Aug 2012, 13:53:55 UTC

My BOINC is still getting .vlar WUs for the NVIDIA GPU...

11 Aug 2012 09:37:59 UTC - x.vlar_1
11 Aug 2012 10:38:05 UTC - x.vlar_1
11 Aug 2012 10:38:05 UTC - x.vlar_1
11 Aug 2012 12:57:30 UTC - x.vlar_1


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1270361 · Report as offensive
Profile Fred J. Verster
Volunteer tester
Avatar

Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1270383 - Posted: 11 Aug 2012, 14:38:09 UTC - in response to Message 1270361.  

And I haven't seen VLAR WUs on my ATI GPU for days; most, if not all, are ~0.4 AR WUs.
Could this be a coincidence?

And 1200 Resends, glad this is ON.


ID: 1270383 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1270574 - Posted: 11 Aug 2012, 21:59:20 UTC - in response to Message 1270383.  


Still getting VLARs on my Nvidia card here too.
Grant
Darwin NT
ID: 1270574 · Report as offensive
Mark Lybeck

Send message
Joined: 9 Aug 99
Posts: 245
Credit: 216,677,290
RAC: 173
Finland
Message 1270706 - Posted: 12 Aug 2012, 7:12:56 UTC - in response to Message 1270574.  

Hello just got a bunch of _vlar2:

49 _vlar2
1 _vlar3

How will this turn out correctly?
Is there a sticky post on how to resolve it?
What was the issue with the new preferences settings?
Will we need to install a new BOINC client version?



ID: 1270706 · Report as offensive
Profile Gatekeeper
Avatar

Send message
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1270710 - Posted: 12 Aug 2012, 7:37:55 UTC - in response to Message 1270706.  

Hello just got a bunch of _vlar2:

49 _vlar2
1 _vlar3

How will this turn out correctly?
Is there a sticky post on how to resolve it?
What was the issue with the new preferences settings?
Will we need to install a new BOINC client version?




I think if you read the posts in this thread, you will have most of your questions answered.

This problem has nothing to do with preferences or your BOINC version.
ID: 1270710 · Report as offensive
Mark Lybeck

Send message
Joined: 9 Aug 99
Posts: 245
Credit: 216,677,290
RAC: 173
Finland
Message 1270711 - Posted: 12 Aug 2012, 7:42:16 UTC - in response to Message 1269021.  

Richard is working on 'how to get the server to resend the VLAR to the CPU' instructions, which he will post once he's confirmed the procedure works reliably.

OK - as the Lady says...

First, these instructions are a first draft, and pretty telegraphic. They assume you're already familiar with the terminology, you know where to find the various BOINC files, and you know the rules for making changes to them. That's what we used to call ADVANCED USERS ONLY.

That's the only warning you're going to get. Read the instructions through carefully: check that you understand every point, and how to do it. If you're at all uncomfortable, don't even start. You're on your own from here.

  • Ensure you have a CPU application active for MB tasks
  • Unset 'Use NV GPU' (web preferences)
  • Set 'Use CPU' (web preferences)
  • Set 'No new tasks' (BOINC Manager)
  • Update project (BOINC Manager - if needed, some versions will report work immediately when NNT is set)
  • Suspend networking
  • Stop BOINC
  • Make backup copy of all .vlar datafiles
  • Edit client_state.xml: remove all '<result>' blocks for .vlar tasks
  • Restart BOINC
  • Restore all .vlar datafiles
  • Resume networking
  • Allow new work
  • Wait until all VLAR work has been resent to CPU
  • Set 'Use NV GPU' (web preferences)
  • Rinse and repeat




* Make backup copy of all .vlar datafiles
* Restore all .vlar datafiles

Does this mean "ALL" meaning also those .vlar datafiles that are already assigned to CPU?


ID: 1270711 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1270730 - Posted: 12 Aug 2012, 8:47:55 UTC - in response to Message 1270711.  

* Make backup copy of all .vlar datafiles
* Restore all .vlar datafiles

Does this mean "ALL" meaning also those .vlar datafiles that are already assigned to CPU?

At the time I wrote that, we were hoping that Eric would be able to restore the status quo quickly and easily, so we wouldn't need to apply the treatment more than once.

What I do is to sort the file list in Windows Explorer on the 'Type' column (in 'detail' view). That groups all "VLAR Files" together, and I can right-click them into whatever archiver I have handy - WinZip, 7-zip, WinRAR, or plain old 'compressed folder'.

After the BOINC restart step, files for the results you've deleted will have disappeared from the list. When you copy files back from the archive, the missing files will drop straight back in, but the old files, still being present, will prompt Windows to say 'do you want to overwrite this file?', or something like that. It really doesn't matter, but you may as well say no to all, to save wear and tear on the disk.

But let Windows do the decision-making - it's really not worth the effort to pick them out individually yourself. If your machine is fast enough to have lots of VLAR files that need treating, it should be able to handle the backup process quickly too. [hint: you could save a few seconds of shutdown time by making the VLAR file archive before you stop BOINC - that's OK]
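
For anyone who would rather script the backup and restore steps than use Explorer and an archiver, here is a minimal sketch in Python. It is only an illustration of the approach described above, not a tested tool: `project_dir` and `archive_path` are hypothetical names for your SETI@home project folder and the backup archive, and the function names are made up for this example. It archives every file whose name ends in `.vlar`, and on restore it copies back only the files that have gone missing (the scripted equivalent of answering 'no to all').

```python
import zipfile
from pathlib import Path


def backup_vlar_files(project_dir, archive_path):
    """Archive every file in project_dir whose name ends in '.vlar'."""
    project_dir = Path(project_dir)
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(project_dir.iterdir()):
            # Only workunit datafiles end in '.vlar'; anything with extra
            # characters after '.vlar' is skipped.
            if path.is_file() and path.name.endswith(".vlar"):
                archive.write(path, arcname=path.name)
                names.append(path.name)
    return names


def restore_missing_vlar_files(project_dir, archive_path):
    """Extract only the archived files that are no longer in project_dir."""
    project_dir = Path(project_dir)
    restored = []
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            # Existing files are left untouched ("no to all" on overwrite).
            if not (project_dir / name).exists():
                archive.extract(name, project_dir)
                restored.append(name)
    return restored
```

Run the restore only after BOINC has been restarted with networking still suspended, as in the step list. Keep the archive outside the project folder so it cannot be mistaken for a datafile.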

Just don't apply this process to the .vlar_0_0 files generated while the CPU is working on the tasks you processed last time. And don't delete the <result> blocks for CPU tasks (version 603, no plan_class) - see the discussion on Notepad++ and regular expressions earlier in this thread.
ID: 1270730 · Report as offensive
Profile Link
Avatar

Send message
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1270736 - Posted: 12 Aug 2012, 8:55:49 UTC - in response to Message 1270711.  
Last modified: 12 Aug 2012, 8:57:41 UTC

Does this mean "ALL" meaning also those .vlar datafiles that are already assigned to CPU?

Actually not, but it's probably a lot easier to just back up all .vlar files and skip those that are still there when restoring them. However, "Edit client_state.xml: remove all '<result>' blocks for .vlar tasks" applies only to those that are assigned to the GPU, i.e. version 608, 609 or 610 (see also the NPP discussion above on how to easily find and remove those).
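
As a rough illustration of that rule (this is not the Notepad++ expression discussed earlier, just a sketch of the same idea in Python), the filter below drops a `<result>` block only when the task name contains `.vlar_` and the version is 608, 609 or 610. It assumes each `<result>` block in client_state.xml carries `<name>` and `<version_num>` tags; check that against your own file first. The function and constant names are made up for this example. Work on a copy of the file, with BOINC stopped.

```python
import re

# GPU application versions named in this thread; CPU tasks are version 603.
GPU_VERSIONS = {"608", "609", "610"}

# One whole <result>...</result> block, including its trailing newline.
RESULT_BLOCK = re.compile(r"[ \t]*<result>.*?</result>[ \t]*\r?\n?", re.DOTALL)


def _tag(block, tag):
    """Return the text of the first <tag>...</tag> in block, or ''."""
    match = re.search(r"<{0}>(.*?)</{0}>".format(tag), block, re.DOTALL)
    return match.group(1).strip() if match else ""


def strip_gpu_vlar_results(state_text):
    """Remove <result> blocks for .vlar tasks assigned to a GPU version.

    Returns the edited text and the list of task names that were removed.
    """
    removed = []

    def replace(match):
        block = match.group(0)
        name = _tag(block, "name")
        is_vlar = ".vlar_" in name
        is_gpu = _tag(block, "version_num") in GPU_VERSIONS
        if is_vlar and is_gpu:
            removed.append(name)
            return ""
        return block  # CPU tasks and non-VLAR tasks are kept as they are

    return RESULT_BLOCK.sub(replace, state_text), removed
```

Printing the returned list of removed names before saving gives a quick sanity check that only GPU VLAR tasks were touched.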

EDIT: I really need to learn to hit F5 before posting to a thread.
ID: 1270736 · Report as offensive
Wedge009
Volunteer tester
Avatar

Send message
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1270741 - Posted: 12 Aug 2012, 9:21:31 UTC
Last modified: 12 Aug 2012, 9:23:48 UTC

I didn't notice any VLARs assigned to my CUDA GPUs when this issue was first raised, but now I'm starting to get them with increasing frequency. I'm not complaining at all, just noting that the issue appears to be ongoing.

Side note: I know that VLAR on ATI/AMD OpenCL is nowhere near as big a performance hit as with CUDA, but I do notice a run-time about 3 times as long as normal WUs, plus sometimes I experience an increase in GUI lag.
Soli Deo Gloria
ID: 1270741 · Report as offensive
Mark Lybeck

Send message
Joined: 9 Aug 99
Posts: 245
Credit: 216,677,290
RAC: 173
Finland
Message 1270743 - Posted: 12 Aug 2012, 9:26:42 UTC - in response to Message 1270730.  

* Make backup copy of all .vlar datafiles
* Restore all .vlar datafiles

Does this mean "ALL" meaning also those .vlar datafiles that are already assigned to CPU?

At the time I wrote that, we were hoping that Eric would be able to restore the status quo quickly and easily, so we wouldn't need to apply the treatment more than once.

Just don't apply this process to the .vlar_0_0 files generated while the CPU is working on the tasks you processed last time. And don't delete the <result> blocks for CPU tasks (version 603, no plan_class) - see the discussion on Notepad++ and regular expressions earlier in this thread.


I got it regarding the backup. The .vlar_0_0 was however a new point. Will all the active tasks get an extra _0 after the filename?


ID: 1270743 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1270750 - Posted: 12 Aug 2012, 9:48:02 UTC - in response to Message 1270743.  

* Make backup copy of all .vlar datafiles
* Restore all .vlar datafiles

Does this mean "ALL" meaning also those .vlar datafiles that are already assigned to CPU?

At the time I wrote that, we were hoping that Eric would be able to restore the status quo quickly and easily, so we wouldn't need to apply the treatment more than once.

Just don't apply this process to the .vlar_0_0 files generated while the CPU is working on the tasks you processed last time. And don't delete the <result> blocks for CPU tasks (version 603, no plan_class) - see the discussion on Notepad++ and regular expressions earlier in this thread.

I got it regarding the backup. The .vlar_0_0 was however a new point. Will all the active tasks get an extra _0 after the filename?

To explain about the names:

I'll take my WU 1045795612 as an example.

The workunit is '25ap11ac.2053.25024.4.10.250.vlar', and that's the name of the file that both I and my wingmate have downloaded.

My Task is '25ap11ac.2053.25024.4.10.250.vlar_0', and my wingmate's task is '25ap11ac.2053.25024.4.10.250.vlar_1'.

My computer will create a result file '25ap11ac.2053.25024.4.10.250.vlar_0_0', and my wingmate will create '25ap11ac.2053.25024.4.10.250.vlar_1_0' - those two files will be uploaded to the server and compared.

So you may see any variation of _x_0 for tasks which are running. They are created as soon as the job starts to run (with a copy of the telescope recording information from the header of the WU file), but added to at intervals during the run, as signals are found. It would be safe to back them up and restore them while BOINC is stopped, but not safe while BOINC is running - the file might have changed between backup and restore.
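
The naming scheme above can be sketched as a small Python helper that tells the three kinds of name apart. This is only an illustration: the pattern and the function name `classify` are made up for this example, and it assumes the suffixes are always plain numbers as in the names quoted above.

```python
import re

# '<workunit>.vlar', optionally followed by '_<task number>',
# optionally followed by '_<file number>'.
NAME_PATTERN = re.compile(
    r"^(?P<workunit>.+\.vlar)(?:_(?P<task>\d+)(?:_(?P<file>\d+))?)?$"
)


def classify(name):
    """Return (kind, workunit name) for a VLAR name, or None if it is not one."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    if match.group("file") is not None:
        kind = "result file"        # e.g. ...vlar_0_0, uploaded to the server
    elif match.group("task") is not None:
        kind = "task"               # e.g. ...vlar_0 or ...vlar_1
    else:
        kind = "workunit datafile"  # e.g. ...vlar, the downloaded file
    return kind, match.group("workunit")
```

For example, `classify('25ap11ac.2053.25024.4.10.250.vlar_1_0')` reports a result file belonging to workunit '25ap11ac.2053.25024.4.10.250.vlar'.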
ID: 1270750 · Report as offensive
bill

Send message
Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1270964 - Posted: 12 Aug 2012, 22:36:52 UTC - in response to Message 1270706.  

Hello just got a bunch of _vlar2:

49 _vlar2
1 _vlar3

How will this turn out correctly?
Is there a sticky post on how to resolve it?
What was the issue with the new preferences settings?
Will we need to install a new BOINC client version?




I see you are using GTX 560ti cards. I recently installed
one of these and was wondering just what kind of problems
you are running into with vlar work units?

Mine is set to run three WUs at one time and I haven't had
any trouble so far, and they run faster on the GPU than
on the CPU. I'm looking for any problems that I may encounter
so I'll be prepared with a solution.
ID: 1270964 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.