Impossible Deadline

Sutaru Tsureku (Volunteer tester) Joined: 6 Apr 07, Posts: 7105, Credit: 147,663,825, RAC: 5
Message 1112071 - Posted: 1 Jun 2011, 15:46:51 UTC - in response to Message 1112057. Last modified: 1 Jun 2011, 15:50:19 UTC
I don't know if you mean me. In my experience, when BOINC asks for both CPU and nVIDIA GPU WUs, it receives only GPU WUs until the WU cache is full. During that time BOINC gets no .vlar WUs, because those are sent only to CPUs. If the 'ghost/resend' function tries to send 'missed' WUs after BOINC has asked for CPU and nVIDIA GPU WUs, BOINC gets the 'Didn't resend lost task xxxxxxxxxxxxxxxxxxxxxxxxx.vlar_x (expired)' message for any .vlar WUs among them, and those WUs get a new deadline (finished).
Best regards! Sutaru Tsureku, team seti.international founder.

Steven Meyer Joined: 24 Mar 08, Posts: 2333, Credit: 3,428,296, RAC: 0
Message 1112259 - Posted: 2 Jun 2011, 4:30:16 UTC
On this page, there is an explanation of the "Time reported or Deadline" column in the tasks list: "Not reported yet, deadline in the past. Deadline, shown in red." From this I take it that when the date is red, it is a deadline that has expired, which is what I originally thought.

AI4FR Joined: 13 Apr 11, Posts: 57, Credit: 23,590,991, RAC: 0
Message 1112268 - Posted: 2 Jun 2011, 6:53:47 UTC
I got one of these as well yesterday. It was sent 31 May 2011 | 9:09:50 UTC with a deadline of 31 May 2011 | 9:15:10 UTC
http://setiathome.berkeley.edu/result.php?resultid=1929727574

Steven Meyer Joined: 24 Mar 08, Posts: 2333, Credit: 3,428,296, RAC: 0
Message 1112318 - Posted: 2 Jun 2011, 12:29:52 UTC - in response to Message 1112268.
AI4FR wrote: "I got one of these as well yesterday. It was sent 31 May 2011 | 9:09:50 UTC with a deadline of 31 May 2011 | 9:15:10 UTC http://setiathome.berkeley.edu/result.php?resultid=1929727574"
...and again it is a VLAR. I wonder if there is something more happening here with the VLAR WU.

Richard Haselgrove (Volunteer tester) Joined: 4 Jul 99, Posts: 14650, Credit: 200,643,578, RAC: 874
Message 1112328 - Posted: 2 Jun 2011, 13:51:42 UTC - in response to Message 1112318.
Steven Meyer wrote: "...and again it is a VLAR. I wonder if there is something more happening here with the VLAR WU."
Yes, and again - as we all told you yesterday (and please read back, if you're still not clear) - it is precisely part of the coping strategy for VLARs that causes us to see this.
Back to first principles. SETI tasks are all the same. There is no such thing as a 'CPU task' or a 'GPU task'. BUT: experience told us that VLAR tasks run very, very slowly on GPUs (nVidia GPUs only, I mean, before any of the resident pedants chip in). The user desktop experience is also horrible when VLARs are running on an nVidia GPU.
So, a policy decision was taken that the servers wouldn't send a VLAR task to nVidia GPUs. That turned out to be easy to manage when tasks are first issued - you just don't see VLARs assigned to GPU crunching any more.
But when the communications are congested (as they have been recently), and messages between the servers and your computer are failing to battle their way through the cacophony, the question of 'lost results' comes up. And, once volunteer crunchers had contributed enough funds to purchase more powerful servers, the project was once more in a position to enable the code for "resend lost results". It turned out that, through that resend path, VLARs were once more being sent out to GPUs.
The solution - admittedly, a bit of a kludge - was to fiddle with the deadline for VLAR tasks in danger of being sent to a GPU. So, as Carola said, the sequence of events is:
1. Your computer asks for CPU work (with or without a concurrent request for GPU work).
2. The server assigns a VLAR for CPU processing, and sets a normal deadline (some 47 days ahead).
3. The server sends a message to tell your computer about the work assignment.
4. That message fails to reach your computer.
5. Your computer still needs work, and five minutes later asks again - this time, with a request that includes work for a GPU.
6. The server realises that the previous message didn't get through. It looks at the work which was previously assigned but lost, and tries to assign it again - starting by trying to fill the GPU request.
7. The server finds the VLAR task, previously assigned to the CPU, and realises that it can't be assigned to the GPU.
8. The server changes the deadline of the task from "six weeks ahead" to "now" - "now" being the time of your second work request.
9. It's now too late to do the work, so it doesn't get sent - the object of the exercise.
10. The server moves on, and finds other work for your GPU.
11. The left-behind VLAR task stays in the system, for assignation against another CPU request in due course.
It may not be elegant or intuitive, but that's how it's done. And it works.
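The sequence Richard describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual BOINC or SETI@home server code: the `Task` class, the `resend_lost_tasks` function, and the task names are all hypothetical, and the sketch simplifies the "fill the GPU request first" step by treating any request that includes GPU work as blocking lost VLARs. The 47-day deadline comes from the post above, the log text is the message Sutaru Tsureku quoted, and the two timestamps are the ones AI4FR reported.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple

# "some 47 days ahead", as stated in the post; everything else is illustrative.
NORMAL_DEADLINE = timedelta(days=47)


@dataclass
class Task:
    """Hypothetical stand-in for a task the server believes a host has lost."""
    name: str
    is_vlar: bool
    deadline: datetime


def resend_lost_tasks(
    lost: List[Task], wants_gpu: bool, now: datetime
) -> Tuple[List[Task], List[str]]:
    """Return (tasks resent to the host, log messages).

    Simplifying assumption: if the request includes GPU work, a lost VLAR
    is never resent. Its deadline is moved to `now`, so it is already
    expired and is left behind for a later CPU request.
    """
    resent: List[Task] = []
    log: List[str] = []
    for task in lost:
        if wants_gpu and task.is_vlar:
            task.deadline = now  # the kludge: deadline becomes "now"
            log.append(f"Didn't resend lost task {task.name} (expired)")
        else:
            resent.append(task)
    return resent, log


# Timestamps from AI4FR's report: sent 9:09:50, deadline 9:15:10 (same day).
first_request = datetime(2011, 5, 31, 9, 9, 50)
second_request = datetime(2011, 5, 31, 9, 15, 10)

vlar = Task("example.vlar_0", True, first_request + NORMAL_DEADLINE)
other = Task("example_1", False, first_request + NORMAL_DEADLINE)

resent, log = resend_lost_tasks([vlar, other], wants_gpu=True, now=second_request)
print(log)            # the VLAR is reported as expired, not resent
print(vlar.deadline)  # now equal to the time of the second request
```

Under these assumptions, the task list would show a task sent at the time of the first request with a deadline at the time of the second, which matches the few-minute gap in the reports earlier in the thread.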

James Sotherden Joined: 16 May 99, Posts: 10436, Credit: 110,373,059, RAC: 54
Message 1112352 - Posted: 2 Jun 2011, 15:39:38 UTC
Thank you Richard for the concise explanation. When the servers were jammed up before, I saw those timeouts and thought it was just a lack of communication. Now I know how they are caused. One less thing to fret over.
Old James

Steven Meyer Joined: 24 Mar 08, Posts: 2333, Credit: 3,428,296, RAC: 0
Message 1112543 - Posted: 3 Jun 2011, 1:19:11 UTC - in response to Message 1112328.
James Sotherden wrote: "Thank you Richard for the concise explanation. When the servers were jammed up before, I saw those timeouts and thought it was just a lack of communication. Now I know how they are caused. One less thing to fret over."
Yes, a concise explanation is what was needed. As Ricky Ricardo would have said . . . "Lucy! You've got some 'splainin to do."

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.