Impossible Deadline

Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1112071 - Posted: 1 Jun 2011, 15:46:51 UTC - in response to Message 1112057.  
Last modified: 1 Jun 2011, 15:50:19 UTC

I don't know if you mean me.

In my experience, if BOINC asks for both CPU and nVIDIA GPU WUs, it gets only GPU WUs until the WU cache is filled.
So during that time BOINC gets no .vlar WUs, because they are only for CPUs.

If the 'ghost/resend' function tries to resend 'missed' WUs after BOINC has asked for CPU and nVIDIA GPU WUs, and there were .vlar WUs among them, BOINC gets the 'Didn't resend lost task xxxxxxxxxxxxxxxxxxxxxxxxx.vlar_x (expired)' message, and these WUs get a new deadline (one that has already passed).


- Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. -
ID: 1112071
Profile Steven Meyer
Joined: 24 Mar 08
Posts: 2333
Credit: 3,428,296
RAC: 0
United States
Message 1112259 - Posted: 2 Jun 2011, 4:30:16 UTC

On this page, there is an explanation of the "Time reported or Deadline" column in the tasks list...

Not reported yet, deadline in the past.
Deadline, shown in red.


...from which I take it that when the date is red, it is a deadline that has expired, which is what I was thinking originally.
ID: 1112259
Profile AI4FR
Joined: 13 Apr 11
Posts: 57
Credit: 23,590,991
RAC: 0
United States
Message 1112268 - Posted: 2 Jun 2011, 6:53:47 UTC

I got one of these as well yesterday.

It was sent 31 May 2011 | 9:09:50 UTC
with a deadline of 31 May 2011 | 9:15:10 UTC


http://setiathome.berkeley.edu/result.php?resultid=1929727574
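
For scale, the gap between those two timestamps can be worked out directly. This is just a quick sketch using Python's standard library; the variable names are made up for the example, and only the two times quoted above are taken from the task page.

```python
from datetime import datetime, timezone

# Times as quoted in the post (both UTC).
sent = datetime(2011, 5, 31, 9, 9, 50, tzinfo=timezone.utc)
deadline = datetime(2011, 5, 31, 9, 15, 10, tzinfo=timezone.utc)

# How long the host nominally had to finish the task.
window = deadline - sent
print(window.total_seconds())  # 320.0 seconds, i.e. 5 min 20 s
```

So the task nominally had a little over five minutes to be downloaded, crunched and reported.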




ID: 1112268
Profile Steven Meyer
Joined: 24 Mar 08
Posts: 2333
Credit: 3,428,296
RAC: 0
United States
Message 1112318 - Posted: 2 Jun 2011, 12:29:52 UTC - in response to Message 1112268.  

I got one of these as well yesterday.

It was sent 31 May 2011 | 9:09:50 UTC
with a deadline of 31 May 2011 | 9:15:10 UTC


http://setiathome.berkeley.edu/result.php?resultid=1929727574


...and again it is a VLAR.

I wonder if there is something more happening here with the VLAR WU.

ID: 1112318
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1112328 - Posted: 2 Jun 2011, 13:51:42 UTC - in response to Message 1112318.  

...and again it is a VLAR.

I wonder if there is something more happening here with the VLAR WU.

Yes, and again - as we all told you yesterday (and please read back, if you're still not clear) - it is precisely part of the coping strategy for VLARs that causes us to see this.

Back to first principles. SETI tasks are all the same. There is no such thing as a 'CPU task' or a 'GPU task'. BUT: experience told us that VLAR tasks run very, very slowly on GPUs (nVidia GPUs only, I mean, before any of the resident pedants chip in). The user desktop experience is also horrible when VLARs are running on a nV GPU.

So, a policy decision was taken that the servers wouldn't send a VLAR task to nVidia GPUs. That turned out to be easy to manage when tasks are first issued - you just don't see VLARs assigned to GPU crunching any more.

But when the communications are congested (as they have been recently), and messages between the servers and your computer are failing to battle their way through the cacophony, the question of 'lost results' comes up. And, once volunteer crunchers had contributed enough funds to purchase more powerful servers, the project was once more in a position to enable the code for "resend lost results". It turned out that VLARs were once more being sent out to GPUs.

The solution - admittedly, a bit of a kludge - was to fiddle with the deadline for VLAR tasks in danger of being sent to a GPU.

So, as Carola said, the sequence of events is:

  • Your computer asks for CPU work (with or without a concurrent request for GPU work).
  • The server assigns a VLAR for CPU processing, and sets a normal deadline (some 47 days ahead).
  • The server sends a message to tell your computer about the work assignment.
  • That message fails to reach your computer.
  • Your computer still needs work, and five minutes later asks again - this time, with a request that includes work for a GPU.
  • The server realises that the previous message didn't get through.
  • It looks at the work which was previously assigned but lost, and tries to assign it again - starting by trying to fill the GPU request.
  • The server finds the VLAR task, previously assigned to the CPU, and realises that it can't be assigned to the GPU.
  • The server changes the deadline of the task from "six weeks ahead" to "now" - "now" being the time of your second work request.
  • It's now too late to do the work, so it doesn't get sent - the object of the exercise.
  • The server moves on, and finds other work for your GPU.
  • The left-behind VLAR task stays in the system, for assignation against another CPU request in due course.


It may not be elegant or intuitive, but that's how it's done. And it works.
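
For anyone who finds code easier to follow than a list, the sequence above can be sketched roughly as below. This is only an illustration in Python, not the actual BOINC scheduler code: the `Task` class, the `on_host` flag and the function names `assign_to_cpu` and `resend_lost` are made up for the example, the task name is hypothetical, and it assumes that lost non-VLAR work is simply resent unchanged.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

NORMAL_DEADLINE = timedelta(days=47)  # "some 47 days ahead", per the post


@dataclass
class Task:
    name: str             # hypothetical task name
    is_vlar: bool
    sent: datetime
    deadline: datetime
    on_host: bool = True  # False once released back for another host


def assign_to_cpu(name: str, is_vlar: bool, now: datetime) -> Task:
    """First request: assign the task for CPU with a normal deadline."""
    return Task(name, is_vlar, sent=now, deadline=now + NORMAL_DEADLINE)


def resend_lost(lost: List[Task], wants_gpu: bool,
                now: datetime) -> Tuple[List[Task], List[str]]:
    """Later request: resend lost tasks, but expire VLARs bound for a GPU."""
    resent, log = [], []
    for task in lost:
        if task.is_vlar and wants_gpu:
            task.deadline = now   # deadline becomes "now": too late to send
            task.on_host = False  # left in the system for a later CPU request
            log.append(f"Didn't resend lost task {task.name} (expired)")
        else:
            resent.append(task)   # assumption: other lost work is resent as-is
    return resent, log


# The first reply never arrives; five minutes later the host asks again,
# this time with a request that includes GPU work.
first = datetime(2011, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
second = first + timedelta(minutes=5)
vlar = assign_to_cpu("example.vlar_0", is_vlar=True, now=first)
resent, log = resend_lost([vlar], wants_gpu=True, now=second)
print(log, vlar.deadline - vlar.sent)
```

Under those assumptions, the VLAR ends up with a deadline only five minutes after its "sent" time, which is the pattern showing up in the task lists.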

ID: 1112328
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1112352 - Posted: 2 Jun 2011, 15:39:38 UTC

Thank you, Richard, for the concise explanation. When the servers were jammed up before, I saw those timeouts and thought it was just a lack of communication. Now I know how they are caused. One less thing to fret over.

Old James
ID: 1112352
Profile Steven Meyer
Joined: 24 Mar 08
Posts: 2333
Credit: 3,428,296
RAC: 0
United States
Message 1112543 - Posted: 3 Jun 2011, 1:19:11 UTC - in response to Message 1112328.  


Thank you, Richard, for the concise explanation. When the servers were jammed up before, I saw those timeouts and thought it was just a lack of communication. Now I know how they are caused. One less thing to fret over.


Yes, a concise explanation is what was needed.

As Ricky Ricardo would have said . . .
"Lucy! You've got some 'splainin to do."
ID: 1112543
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.