"Lost Task"
Author | Message |
---|---|
John S. (Low IFR) Jenkins Joined: 22 Feb 01 Posts: 53 Credit: 24,515,908 RAC: 0 |
What is a "lost task", as in the "Sending Lost task" message? |
W-K 666 Joined: 18 May 99 Posts: 19057 Credit: 40,757,560 RAC: 67 |
John S. (Low IFR) Jenkins wrote: "What is a 'lost task', as in the 'Sending Lost task' message?" A task that failed to reach your computer the first time it was sent. |
Area 51 Joined: 31 Jan 04 Posts: 965 Credit: 42,193,520 RAC: 0 |
John S. (Low IFR) Jenkins wrote: "What is a 'lost task', as in the 'Sending Lost task' message?" A task that the servers allocated to your host but that effectively never arrived, or never finished arriving, perhaps due to network congestion. When your host reports, it tells the servers what it has in its cache. If that doesn't agree with what the server thinks you have, the servers resend any tasks your host hasn't got but should have. These lost tasks are often called ghosts. |
W-K 666 Joined: 18 May 99 Posts: 19057 Credit: 40,757,560 RAC: 67 |
A further bit of info. Tasks marked VLAR do not run on the GPU, so on the first sending they are sent only to the CPU. When they are resent, however, they can be sent to the GPU. If this happens, a quick and dirty piece of code kicks in and gives them an impossible deadline, which effectively aborts the task. |
Sakletare Joined: 18 May 99 Posts: 132 Credit: 23,423,829 RAC: 0 |
What I don't understand about lost tasks is why they are only resent in batches of 20. |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Sakletare wrote: "What I don't understand about lost tasks is why they are only resent in batches of 20." Because that is the arbitrary number they selected for the upper limit. I would guess it is to limit the amount of data the servers have to move from the disk array or something of that nature. SETI@home classic workunits: 93,865 CPU time: 863,447 hours. Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Sakletare wrote: "What I don't understand about lost tasks is why they are only resent in batches of 20." When the tasks are first sent out, there's some fairly rigorous checking done on the server (in theory, at least) to ensure that no task is sent which might possibly run into deadline trouble, given the speed of the host, what proportion of the time it's crunching, how much work it has on board from other projects, etc. I think the 'resend' logic is much simpler, which is one reason why we have this problem with VLARs. And don't forget, your computer might have gone off and collected a bucketful of work from another project in the meantime, thinking that SETI hadn't given it anything. I think the 20 task limit is just a simple-minded precaution against overwork. You can always ask for another 20 five minutes later. |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Sakletare wrote: "What I don't understand about lost tasks is why they are only resent in batches of 20." Yes, the 20 comes from the <max_wus_to_send> project setting. The resend of lost tasks uses that setting directly rather than multiplying it by the number of processing resources. Even for normal work assignments, a single-core CPU system without a usable GPU is restricted to 20 per request. Joe |
Francesco Forti Joined: 24 May 00 Posts: 334 Credit: 204,421,005 RAC: 15 |
These days I see a lot of messages like these: "896 SETI@home 24.05.2012 10:18:39 Resent lost task 07my10ag.14625.51365.15.10.220_1" and "916 SETI@home 24.05.2012 10:18:39 [error] Already have task 07my10ag.14625.51365.15.10.220_1". Of course, I see 20 "Resent lost task" messages followed by 20 "Already have task" errors. This happens mainly on two hosts. [Edit: now three.] |
Fred E. Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0 |
Francesco Forti wrote: "These days I see a lot of 896 SETI@home 24.05.2012 10:18:39 Resent lost task 07my10ag.14625.51365.15.10.220_1 916 SETI@home 24.05.2012 10:18:39 [error] Already have task 07my10ag.14625.51365.15.10.220_1 Of course I see 20 lost task resent followed by 20 Already have task. This mainly on two hosts. [Edit: now three.]" This is an unexpected consequence of a change to the server code that was made during Tuesday's maintenance window. It will occur when you try to report more than 64 results, so it should only occur on your fastest hosts. The project has been alerted to this issue. There is more info in the thread "Unannounced Server-Side Change?" Another Fred Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop. |
Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Typically, "Resent lost task" happens as follows. Your machine sends a scheduler request asking for work. The request makes it to the server, gets processed, and WUs get assigned to you. The server's reply, which informs you of these tasks and where to download them from, gets dropped due to network congestion. After 360 seconds without a response from the server, your machine will say something like "time-out window reached". Usually about 60 seconds later, it will send another scheduler request. Each scheduler request carries a copy of client_state.xml, which says what work you have, how long it is expected to take, etc. The server compares what your machine says it has with what you're supposed to have and sends you the missing tasks again, this time as "lost tasks". Resends are limited to 20 at a time, but as long as there are no crazy hardware limitations, other projects haven't handed you a whole pile of work in the meantime, and you're not getting too close to deadlines, you can keep getting batches of 20 resends until you have all that you're supposed to have. The resending of lost tasks is a relatively new feature. Such tasks used to be called "ghosts" because they would show up on your tasks page here on the website, but they weren't on your machine. They would simply sit there until the deadline passed and then get sent to somebody else. Ghost tasks happen only when the network pipe is maxed out, which in the past 6+ months has been nearly 24/7. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
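
For anyone who finds it easier to read as code, the comparison described in this thread (what the server thinks a host has versus what the host reports, with resends capped at 20 per request) can be sketched roughly as below. This is only an illustration, not the actual BOINC scheduler code: the function names, the task names, and the constant `MAX_WUS_TO_SEND` are hypothetical stand-ins, and the value 20 is simply the batch size mentioned above.

```python
from typing import Iterable, List

# Assumed cap, taken from the "batches of 20" discussed in the thread.
MAX_WUS_TO_SEND = 20


def find_lost_tasks(server_assigned: Iterable[str],
                    client_reported: Iterable[str],
                    limit: int = MAX_WUS_TO_SEND) -> List[str]:
    """Return up to `limit` tasks the server assigned but the client does not report."""
    reported = set(client_reported)
    lost = [name for name in server_assigned if name not in reported]
    return lost[:limit]


def requests_until_recovered(server_assigned: Iterable[str],
                             client_reported: Iterable[str],
                             limit: int = MAX_WUS_TO_SEND) -> int:
    """Count the scheduler requests needed before the client holds every assigned task."""
    assigned = list(server_assigned)
    held = set(client_reported)
    requests = 0
    while True:
        batch = find_lost_tasks(assigned, held, limit)
        if not batch:
            return requests
        held.update(batch)  # simplifying assumption: every resend arrives this time
        requests += 1


if __name__ == "__main__":
    assigned = [f"task_{i}" for i in range(50)]  # hypothetical task names
    held = ["task_0", "task_1"]                  # what the client reports having
    print(find_lost_tasks(assigned, held)[:3])   # ['task_2', 'task_3', 'task_4']
    print(requests_until_recovered(assigned, held))  # 3 (batches of 20, 20 and 8)
```

The sketch ignores the deadline and resource checks mentioned earlier in the thread, so it shows only why a host with many ghosts needs several requests to get them all back.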