Impossible Deadline
Author | Message |
---|---|
Steven Meyer Send message Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
I was just now taking a look at the "Error" tasks listed for my account, and this one is listed as "Timed out - no response". It was sent 1 Jun 2011 1:22:51 UTC with a deadline of 1 Jun 2011 1:29:56 UTC. 7 minutes and 5 seconds is just a bit too quick, don't you think? |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
It's a resent WU, but because it's a VLAR it can't be resent to your GPU, so it times out. Claggy |
Steven Meyer Send message Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
Claggy wrote: "It's a resent Wu, but because it's a VLAR it can't get resent to your GPU, so times out," Something is screwy with the deadline calculation. I doubt that it could be downloaded, computed, and then uploaded in 7 minutes and 5 seconds, even with a GPU. |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
It's a resent WU. Your computer didn't get the task on the first attempt; then, about 5 minutes later, it made another request and asked for GPU work this time. But VLARs can't be sent to GPUs, so the time in red is when it timed out. If you had got the WU, completed it, uploaded it and reported it, then it would show the time when it was reported. Claggy |
Steven Meyer Send message Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
I guess I am not being clear, or maybe I am being too dense. As I understand it, a task is sent out on some date at some time, with a deadline some days in the future. That particular work unit is now out on two other computers with a deadline on July 17 -- more than a month away. When it was sent to me 1 June at 1:22:51 UTC, it had a deadline of 1 June at 1:29:56 UTC, or just about 7 minutes later. Even with an idle server, an idle network, and an idle GPU -- if VLARs could be processed on a GPU -- one would need more than 7 minutes to download, compute and upload the results. I am wondering how the deadline was set to just a few minutes after it was sent. Can you explain why the deadline was set so short? Here are the messages around the time of the only message with this WU's name. The whole process, starting with "Sending scheduler request: To fetch work" and ending with "03mr11ab.14997.11519.10.10.227.vlar_0 (expired)", took just 4 seconds.

5/31/2011 6:30:07 PM SETI@home Sending scheduler request: To fetch work.
5/31/2011 6:30:07 PM SETI@home Reporting 1 completed tasks, requesting new tasks for CPU and GPU
5/31/2011 6:30:11 PM SETI@home Scheduler request completed: got 4 new tasks
5/31/2011 6:30:11 PM SETI@home Message from server: Resent lost task 02ap11ab.16001.8247.7.10.212_1
5/31/2011 6:30:11 PM SETI@home Message from server: Resent lost task 02ap11ab.16001.8247.7.10.213_1
5/31/2011 6:30:11 PM SETI@home Message from server: Didn't resend lost task 03mr11ab.14997.11519.10.10.227.vlar_0 (expired)
5/31/2011 6:30:11 PM SETI@home Message from server: Resent lost task 02ap11ab.16001.8247.7.10.225_1
5/31/2011 6:30:11 PM SETI@home Message from server: Resent lost task 02ap11aa.17681.20517.12.10.27_1
5/31/2011 6:30:13 PM SETI@home Started download of 02ap11ab.16001.8247.7.10.212
5/31/2011 6:31:23 PM SETI@home Started upload of 03mr11ac.31817.23380.15.10.89_0_0
5/31/2011 6:31:28 PM SETI@home Finished upload of 03mr11ac.31817.23380.15.10.89_0_0
5/31/2011 6:35:16 PM SETI@home Sending scheduler request: To fetch work.
5/31/2011 6:35:16 PM SETI@home Reporting 1 completed tasks, requesting new tasks for CPU and GPU
5/31/2011 6:35:18 PM SETI@home Scheduler request completed: got 5 new tasks |
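The two intervals mentioned in this post can be checked with a few lines of Python. This is only a sketch of the arithmetic: the timestamps are copied from the thread, while the helper name `fmt` and the assumed log format string are illustrative and not part of BOINC.

```python
from datetime import datetime

# Server-side timestamps quoted in the thread (UTC).
sent = datetime(2011, 6, 1, 1, 22, 51)
timed_out = datetime(2011, 6, 1, 1, 29, 56)

# Client-side log timestamps quoted above (local time).
# The format string is an assumption based on how the log lines look.
LOG_FORMAT = "%m/%d/%Y %I:%M:%S %p"
request = datetime.strptime("5/31/2011 6:30:07 PM", LOG_FORMAT)
reply = datetime.strptime("5/31/2011 6:30:11 PM", LOG_FORMAT)


def fmt(delta):
    """Render a timedelta as 'M min S s' (hypothetical helper)."""
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    return f"{minutes} min {seconds} s"


window = timed_out - sent      # gap between "sent" and the time shown in red
round_trip = reply - request   # scheduler request to scheduler reply

print(fmt(window))      # 7 min 5 s
print(fmt(round_trip))  # 0 min 4 s
```

The first interval uses the server's UTC times and the second uses the client's local times, so each subtraction stays within one clock and no time-zone conversion is needed.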
Gundolf Jahn Send message Joined: 19 Sep 00 Posts: 3184 Credit: 446,358 RAC: 0 |
"I guess I am not being clear, or maybe I am being too dense." I'm afraid it's the latter. ;-) And besides, you have set the emphasis wrong in the log: "5/31/2011 6:30:07 PM SETI@home Sending scheduler request: To fetch work." So, the task wasn't resent at all, because it had already expired, probably for the reason Claggy explained. Regards, Gundolf |
Donald L. Johnson Send message Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
Steven, the point you are missing is that the 5 tasks in question were not originally assigned to you on 5/31/2011, the date you actually downloaded them. They were assigned earlier, but due to network congestion you did not receive the assignment confirmation, and so your BOINC never downloaded them. When your BOINC sent the scheduler request on 5/31/2011 at 6:30:07 PM, the Scheduler process recognized that you had not downloaded those tasks, and so it "resent" 4 of them. The one exception was the VLAR, which was originally assigned to your CPU. When a "lost" task is resent, if the Scheduler believes it can be completed within the original deadline, it keeps that deadline. But if it does not expect your host to complete it on time, it will reset the deadline. When a request for both CPU and GPU work is received, the Scheduler will fill the CPU portion first, if it can, and then fill the GPU request. If the first 2 resends filled the CPU request, the rest had to go to fill the GPU request. Except that the Scheduler process is not allowed to send a VLAR task to a GPU. So since the Scheduler now had nowhere to send the VLAR task, it expired it from your task list and resent it to someone else. That is why there appeared to be such a short deadline on that task. Edit: Gundolf posted while I was drafting this, but I didn't see it until after I posted. Donald Infernal Optimist / Submariner, retired |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
... No, it first tries to send tasks to the "best" application and that would be the GPU unless the system has an extremely fast CPU and one of the least capable GPUs. And there's no logic to switch the assignment to CPU just because the special .vlar test keeps it from being sent to GPU. Joe |
Donald L. Johnson Send message Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
... Thank you, Joe. I stand corrected. But on resends, are they not already assigned to an application, and have to be resent to that same application? Donald Infernal Optimist / Submariner, retired |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
... Yes, but the application the server is concerned about is "setiathome_enhanced". The server doesn't mind which application version is used - CPU or CUDA will work just as well. |
Donald L. Johnson Send message Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
... Okay, I think I have it straight now. Version doesn't matter, except that marked .VLARs can only be sent to 6.03. Bottom line: the Task in question could not be resent to the CPU, so the Scheduler expired it and reissued it to another host. Donald Infernal Optimist / Submariner, retired |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
... Yes. I think that when work has been 'lost' like this, the server will deal with the resends first, and only consider new work when it's all been dealt with. Your BOINC client will sometimes ask for GPU work only, sometimes CPU work only, and sometimes both. If you happen to be asking 'GPU only' at a time when .vlar tasks are waiting to be resent, you'll get the 'not resent - expired' message. |
Steven Meyer Send message Joined: 24 Mar 08 Posts: 2333 Credit: 3,428,296 RAC: 0 |
OK, so let me see if I have this right ... I am just trying to understand how this happens ... Some time around a month ago, the VLAR task was supposed to be sent to me but it did not arrive here because of network congestion, along with four other tasks. Eventually, the Scheduler decided to fulfill one of my many requests for work with four of those lost tasks. Also, at the same time, the Scheduler decided that I would not be able to complete the VLAR task before the original deadline which was looming near. As a result, the Scheduler changed the sent date and the deadline to seven minutes and five seconds in the past and the current time, respectively, and expired the task -- in this scenario I am assuming that the server's system clock and mine are set very nearly the same since we are in the same time zone. Or maybe the original deadline was left unchanged and only the sent date was adjusted to seven minutes and five seconds in the past... Or... the server's system clock and mine are out of sync by 00:07:05 and the sent time was set to "now" according to the server, and the deadline was set to 00:07:05 in the future... By the way, my system clock is 11 seconds fast according to http://tycho.usno.navy.mil/simpletime.html. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Steven, I think the big picture you are not seeing here is that the time of "1 Jun 2011 | 1:29:56 UTC" is/was not the deadline. It is the reported time, or the time your computer was done dealing with that task, as the column header "Time reported or deadline" states. Had you looked at the status of the task before that time, it would have shown "not downloaded", or whatever the actual message they use is, and the deadline. However, the time stamp displayed now is the time reported. If you want to find your log of when you originally tried to download 03mr11ab.14997.11519.10.10.227.vlar_0, look in stdoutdae.txt or stdoutdae.old. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
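The behaviour of the "Time reported or deadline" column described in this post can be modelled in a few lines. This is a minimal sketch, not the project's web code: the class `TaskRow`, its fields, and the function name are invented for illustration, and the deadline value is a hypothetical placeholder because the task's original deadline is not given in the thread.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TaskRow:
    """One row of a task list (hypothetical model)."""
    status: str
    deadline: datetime
    closed_at: Optional[datetime] = None  # set once reported or timed out


def time_reported_or_deadline(row: TaskRow) -> datetime:
    """Show the closing time if the task is finished, else its deadline."""
    return row.closed_at if row.closed_at is not None else row.deadline


# Hypothetical placeholder; the real original deadline is not in the thread.
placeholder_deadline = datetime(2011, 7, 1, 0, 0, 0)

in_progress = TaskRow("In progress", placeholder_deadline)
timed_out = TaskRow(
    "Timed out - no response",
    placeholder_deadline,
    closed_at=datetime(2011, 6, 1, 1, 29, 56),  # time shown in the thread
)

print(time_reported_or_deadline(in_progress))  # the deadline
print(time_reported_or_deadline(timed_out))    # 2011-06-01 01:29:56
```

Under this model the same column holds two different kinds of value, which is why a time only minutes after "sent" can look like an impossible deadline.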
Miep Send message Joined: 23 Jul 99 Posts: 2412 Credit: 351,996 RAC: 0 |
Ok, once again. The task was created 31 May 2011 | 23:56:40 UTC and sent 1 Jun 2011 | 1:22:51 UTC. It was a VLAR task for CPU, so you will have requested CPU work about a minute earlier. 1:22:51 is the time the task was assigned to your machine. Because of congestion on the download link, your machine did not receive the scheduler reply and you did not, in fact, download the task. 6-7 minutes later the backoff on your machine ended and it asked again for work, this time for GPU. It also told the server about the tasks it had at that time. The server saw that the task was missing from the list, so it initiated a resend. It then realised that it was a VLAR and a GPU was asking. Therefore it timed out the task (as 'not doable') at 1:29:56. As HAL9000 mentioned, you should have a log of the two scheduler requests/replies in stdoutdae.txt or .old at the corresponding local time (UTC +- your time difference to UTC). Carola ------- I'm multilingual - I can misunderstand people in several languages! |
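The resend behaviour described in this thread can be summarised as a small model. It is a simplified sketch assembled from the posts, not the actual BOINC scheduler code: `SchedulerRequest`, `handle_lost_tasks`, and the rule "target the GPU whenever GPU work is requested" are assumptions that follow Joe's and Miep's descriptions. The task names come from the log earlier in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerRequest:
    """What the host tells the server (hypothetical model)."""
    wants_cpu: bool
    wants_gpu: bool
    tasks_on_host: set = field(default_factory=set)


def handle_lost_tasks(assigned, request):
    """Return {task_name: 'resent' | 'expired'} for tasks the host lacks."""
    outcomes = {}
    for name in sorted(set(assigned) - request.tasks_on_host):
        # Assumption from the thread: the "best" app (normally the GPU)
        # is tried first, with no fallback to the CPU.
        target = "gpu" if request.wants_gpu else "cpu"
        if ".vlar" in name and target == "gpu":
            outcomes[name] = "expired"  # VLARs may not go to a GPU
        else:
            outcomes[name] = "resent"
    return outcomes


assigned = [
    "02ap11ab.16001.8247.7.10.212_1",
    "02ap11ab.16001.8247.7.10.213_1",
    "03mr11ab.14997.11519.10.10.227.vlar_0",
    "02ap11ab.16001.8247.7.10.225_1",
    "02ap11aa.17681.20517.12.10.27_1",
]

mixed = handle_lost_tasks(assigned, SchedulerRequest(True, True))
cpu_only = handle_lost_tasks(assigned, SchedulerRequest(True, False))

for name, outcome in mixed.items():
    print(f"{outcome:8s} {name}")
```

With a request that includes GPU work, the model reproduces the log: four tasks resent and the `.vlar` task expired. With a CPU-only request the same `.vlar` task would be resent, which matches Richard's remark that the outcome depends on what the client happens to ask for.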
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
... I don't think so. My machine asked for CPU and nVIDIA GPU tasks and BOINC got: Didn't resend lost task xxxxxxxxxxxxxxxxxxxxxxxxx.vlar_1 (expired) NC subforum : 'canceled deadline because of re-sent function - Message 1098989' In my experience, if BOINC asks for CPU and nVIDIA GPU tasks simultaneously, only GPU WUs come down, never CPU WUs. |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
... I have got CPU and GPU work in the same request before. Normally it comes in the form of Multibeam for the Nvidia GPU and Astropulse for the CPU (I don't have CPU Multibeam in the app_info), and I have also got ATI and Nvidia GPU work together in the same request. It just depends on which device is the most needy and what the scheduler has available when BOINC asks for work. What it won't do is send VLAR work to the CPU if the Nvidia GPU is the most needy. Claggy |
Fred J. Verster Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
On 2 of my hosts, X9650 + GTX480 and i7-2600 + 2x EAH5870, I've only got MB CUDA/OpenCL and AstroPulse; fortunately I have a few CPU-only projects, like CPDN, Docking, and Leiden Classical. One of my hosts gets stuck doing only CUDA tasks and once gave me a Chassis Intrusion warning? I only have to manage uploads by hand (pushing the button); downloads appear to be all right. Tomorrow, 2 June 2011, is a holiday in the Netherlands (Hemelvaartsdag, Ascension Day). And since this day has just started in Berkeley, I hope all tasks will be uploaded; downloads came through. New ghosts are appearing in my fault list! Faults? This host. |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
I don't think so. This is another story if you have no S@h Enhanced (MultiBeam) for CPU. ;-) On my machine, most of the time only S@h Enhanced (MultiBeam) is enabled in the project settings. And my experience shows that BOINC asks for CPU and nVIDIA GPU WUs and gets only GPU WUs until the set WU cache is filled up, then asks for CPU WUs only. - Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. - |
Miep Send message Joined: 23 Jul 99 Posts: 2412 Credit: 351,996 RAC: 0 |
It would be nice if people actually read before posting, especially posts from people who know what they are talking about (like Joe). This would greatly reduce the number of times people who know what they are talking about have to repeat themselves. The machine asks for work - mixed CPU/GPU. The server first honours the demand of the more efficient app, normally the GPU. It does so until it has either fulfilled the request or run out of tasks (remember the queue only holds 100 tasks at a time). If there are tasks left in the queue after the GPU has had its share, the CPU gets some. If no tasks are left, the machine will only receive part of what it asked for and is going to ask again. What has this got to do with deadlines? Carola ------- I'm multilingual - I can misunderstand people in several languages! |
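The allocation order described in this post (GPU first, then CPU from whatever is left in a queue of at most 100 tasks) can be sketched as follows. This is a toy model under stated assumptions: `fill_request` is a made-up name, and requests are counted in tasks here for simplicity, which is not necessarily how the real scheduler measures a work request.

```python
QUEUE_SIZE = 100  # queue capacity mentioned in the post


def fill_request(queue, gpu_wanted, cpu_wanted):
    """Split queued tasks between GPU and CPU, serving the GPU first.

    Returns (gpu_tasks, cpu_tasks). Counts are in tasks, a simplifying
    assumption made for this sketch.
    """
    gpu_tasks = queue[:gpu_wanted]
    remaining = queue[len(gpu_tasks):]
    cpu_tasks = remaining[:cpu_wanted]
    return gpu_tasks, cpu_tasks


queue = [f"task_{i}" for i in range(QUEUE_SIZE)]  # placeholder task names

# Small GPU request: tasks are left over, so the CPU gets its share too.
gpu, cpu = fill_request(queue, gpu_wanted=30, cpu_wanted=10)
print(len(gpu), len(cpu))  # 30 10

# Large GPU request: the queue is drained and the CPU gets nothing,
# so the host receives only part of what it asked for and asks again.
gpu, cpu = fill_request(queue, gpu_wanted=120, cpu_wanted=10)
print(len(gpu), len(cpu))  # 100 0
```

The second case would be consistent with the observation earlier in the thread that a host with a large GPU cache setting tends to receive only GPU work until that cache is full.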
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.