Impossible Deadline

Message boards : Number crunching : Impossible Deadline

Profile Virtual Boss*
Volunteer tester
Joined: 4 May 08
Posts: 417
Credit: 6,440,287
RAC: 0
Australia
Message 1403453 - Posted: 15 Aug 2013, 14:37:44 UTC
Last modified: 15 Aug 2013, 14:40:31 UTC

I had a batch of 11 CPU tasks allocated to one of my machines that all had a deadline of 1 hr 7 min 20 sec. One is - Here

I checked my event log and the tasks were never downloaded.
There was also 1 AP allocated that is missing from my task list.

Here are the event log entries around the time the server allocated them.

15/08/2013 6:49:40 AM | SETI@home | Requesting new tasks for CPU
15/08/2013 6:49:56 AM | SETI@home | Scheduler request failed: Couldn't resolve host name
15/08/2013 6:51:03 AM | SETI@home | Sending scheduler request: To fetch work.
15/08/2013 6:51:03 AM | SETI@home | Requesting new tasks for CPU
15/08/2013 6:51:19 AM | SETI@home | Scheduler request failed: Couldn't resolve host name
15/08/2013 6:52:14 AM | SETI@home | Sending scheduler request: To fetch work.
15/08/2013 6:52:14 AM | SETI@home | Requesting new tasks for CPU
15/08/2013 6:54:13 AM | SETI@home | Scheduler request failed: Failure when receiving data from the peer
15/08/2013 6:55:09 AM | SETI@home | Sending scheduler request: To fetch work.
15/08/2013 6:55:09 AM | SETI@home | Requesting new tasks for CPU
15/08/2013 6:55:25 AM | SETI@home | Scheduler request failed: Couldn't resolve host name
15/08/2013 6:56:16 AM | SETI@home | Sending scheduler request: To fetch work.
15/08/2013 6:56:16 AM | SETI@home | Requesting new tasks for CPU
15/08/2013 6:56:32 AM | SETI@home | Scheduler request failed: Couldn't resolve host name
15/08/2013 6:57:12 AM | SETI@home | Sending scheduler request: To fetch work.
15/08/2013 6:57:12 AM | SETI@home | Requesting new tasks for CPU
15/08/2013 6:57:28 AM | SETI@home | Scheduler request failed: Couldn't resolve host name


The 12 tasks must have gone missing on the request that ended in "Failure when receiving data from the peer". For roughly 15 minutes either side of it, every other request shows "Couldn't resolve host name" because my wireless data modem had lost signal.

However, that does not explain why the deadline was impossibly short!
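
For anyone who wants to run the same check on their own event log, here is a minimal Python sketch. It assumes the `date time | project | message` layout shown above and an English locale for the AM/PM marker; the function names are made up for illustration and are not part of BOINC. The idea is to separate DNS failures, which never left the machine, from failures that happened after a connection was apparently open.

```python
from datetime import datetime

# A few lines copied from the event log above.
LOG = """\
15/08/2013 6:49:56 AM | SETI@home | Scheduler request failed: Couldn't resolve host name
15/08/2013 6:51:19 AM | SETI@home | Scheduler request failed: Couldn't resolve host name
15/08/2013 6:54:13 AM | SETI@home | Scheduler request failed: Failure when receiving data from the peer
15/08/2013 6:55:25 AM | SETI@home | Scheduler request failed: Couldn't resolve host name
"""

PREFIX = "Scheduler request failed: "


def failed_requests(log_text):
    """Return (timestamp, reason) for every failed scheduler request."""
    failures = []
    for line in log_text.splitlines():
        # Split into at most three fields: time, project, message.
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3 or not parts[2].startswith(PREFIX):
            continue
        when = datetime.strptime(parts[0], "%d/%m/%Y %I:%M:%S %p")
        failures.append((when, parts[2][len(PREFIX):]))
    return failures


def not_dns_failures(failures):
    """Keep only failures that are not 'Couldn't resolve host name'."""
    return [(t, r) for t, r in failures if "resolve host name" not in r]


failures = failed_requests(LOG)
suspects = not_dns_failures(failures)
for when, reason in suspects:
    print(when, reason)
```

On the lines above this leaves only the 6:54:13 request, which fits the guess that the tasks were assigned during that contact and the reply was lost on the way back.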
Flying high with Team Sicituradastra.
ID: 1403453 · Report as offensive
Profile William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1403466 - Posted: 15 Aug 2013, 15:28:44 UTC
Last modified: 15 Aug 2013, 15:29:16 UTC

http://lunatics.kwsn.net/1-discussion-forum/faq-read-only.msg47867.html#msg47867

Now, if I could be bothered to do about 15 more FAQ entries...
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1403466 · Report as offensive
Profile William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1403469 - Posted: 15 Aug 2013, 15:33:03 UTC

Actually, that rig only has CPU...

Do you have the log entries from the time the units were timed out? That must have been another scheduler contact.

Such deadlines mean the server timed those tasks out - must have decided it couldn't send them to you any more, which is strange - none of the usual reasons for that apply.
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1403469 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1403472 - Posted: 15 Aug 2013, 15:39:01 UTC - in response to Message 1403469.  
Last modified: 15 Aug 2013, 16:13:28 UTC

So what you're saying is the server is [heuristically challenged]?

[Edit:] Changed appropriate wording to politically correct version, at the behest of William the conquering munchkin.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1403472 · Report as offensive
Profile Virtual Boss*
Volunteer tester
Joined: 4 May 08
Posts: 417
Credit: 6,440,287
RAC: 0
Australia
Message 1403491 - Posted: 15 Aug 2013, 16:41:34 UTC - in response to Message 1403469.  

Actually, that rig only has CPU...

Do you have the log entries from the time the units were timed out? That must have been another scheduler contact.

Such deadlines mean the server timed those tasks out - must have decided it couldn't send them to you any more, which is strange - none of the usual reasons for that apply.


Ah ha!

Event log at expiry time.

15/08/2013 7:59:47 AM | SETI@home | Sending scheduler request: To fetch work.
15/08/2013 7:59:47 AM | SETI@home | Requesting new tasks for CPU
15/08/2013 8:04:54 AM | SETI@home | Scheduler request failed: Timeout was reached
15/08/2013 8:05:22 AM | | Project communication failed: attempting access to reference site
15/08/2013 8:10:28 AM | | BOINC can't access Internet - check network connection or proxy configuration.


Signal on modem was still poor - happens sometimes as I am 44 km (28 miles) from the cell tower, and behind a hill!


So what you're saying is the server is [heuristically challenged]?


Looks like it. :)

The server must have decided it was unable to send me the MB tasks, so it would cancel them and send me an AP instead.

Flying high with Team Sicituradastra.
ID: 1403491 · Report as offensive
Profile William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1403512 - Posted: 15 Aug 2013, 17:14:45 UTC

Would have to walk the code to ascertain why/if such a failed contact times them out, and the reasoning behind it (if any).

I take it your question is answered?
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1403512 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1403518 - Posted: 15 Aug 2013, 17:29:44 UTC - in response to Message 1403512.  

...I take it your question is answered?


Whose? Mine? No, mine's not answered, and I don't expect it to be.

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1403518 · Report as offensive
Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1403522 - Posted: 15 Aug 2013, 17:35:01 UTC - in response to Message 1403491.  

Event log at expiry time.

15/08/2013 7:59:47 AM | SETI@home | Sending scheduler request: To fetch work.
15/08/2013 7:59:47 AM | SETI@home | Requesting new tasks for CPU
15/08/2013 8:04:54 AM | SETI@home | Scheduler request failed: Timeout was reached
15/08/2013 8:05:22 AM | | Project communication failed: attempting access to reference site
15/08/2013 8:10:28 AM | | BOINC can't access Internet - check network connection or proxy configuration.


Signal on modem was still poor - happens sometimes as I am 44 km (28 miles) from the cell tower, and behind a hill!

Bad cell signal garbles scheduler requests and corrupts file transfers. We've heard that before.....
Donald
Infernal Optimist / Submariner, retired
ID: 1403522 · Report as offensive
Profile Virtual Boss*
Volunteer tester
Joined: 4 May 08
Posts: 417
Credit: 6,440,287
RAC: 0
Australia
Message 1403531 - Posted: 15 Aug 2013, 17:44:53 UTC - in response to Message 1403512.  

Would have to walk the code to ascertain why/if such a failed contact times them out, and the reasoning behind it (if any).

I take it your question is answered?


Hmmm ... maybe a change in thinking

That rig just asked for more tasks, and the AP that was not in its task list got timed out by the server at the time of the scheduler request.

It now seems to me that the server no longer resends tasks that have not been received by the client - it cancels them by expiring the deadline instead!


Flying high with Team Sicituradastra.
ID: 1403531 · Report as offensive
OzzFan · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 1403604 - Posted: 15 Aug 2013, 20:11:33 UTC - in response to Message 1403518.  

...I take it your question is answered?


Whose? Mine? No, mine's not answered, and I don't expect it to be.


I suspect the answer is 42.
ID: 1403604 · Report as offensive
Juha
Volunteer tester

Joined: 7 Mar 04
Posts: 388
Credit: 1,857,738
RAC: 0
Finland
Message 1403610 - Posted: 15 Aug 2013, 20:15:21 UTC - in response to Message 1403531.  

It now seems to me that the server no longer resends tasks that have not been received by the client - it cancels them by expiring the deadline instead!

I was going to say it doesn't do that and include a log snippet that shows it still resends lost tasks, but... um, I didn't expect this:

Thu 15 August 2013 23:00:55 | SETI@home | Requesting new tasks for CPU
Thu 15 August 2013 23:00:57 | SETI@home | Scheduler request completed: got 1 new tasks
Thu 15 August 2013 23:00:57 | SETI@home | Didn't resend lost task 14se08ae.29771.328957.13.12.134.vlar_0 (expired)


I think the server may expire tasks if it thinks your host won't get them done before deadline. While the turn-around time for my host is a bit high, the server should have seen my host had more than enough time to complete the task.

What our hosts have in common is that both are on their first v7 tasks. Maybe the server doesn't trust them enough yet or something like that. Although I'm not quite satisfied with that explanation either.
ID: 1403610 · Report as offensive
Profile Virtual Boss*
Volunteer tester
Joined: 4 May 08
Posts: 417
Credit: 6,440,287
RAC: 0
Australia
Message 1403758 - Posted: 16 Aug 2013, 3:55:07 UTC - in response to Message 1403610.  

I think the server may expire tasks if it thinks your host won't get them done before deadline.

That does not seem likely; there was still plenty of time.

What our hosts have in common is that both are on their first v7 tasks. Maybe the server doesn't trust them enough yet or something like that. Although I'm not quite satisfied with that explanation either.

Not true. My host has previously completed v7 tasks.
Flying high with Team Sicituradastra.
ID: 1403758 · Report as offensive
Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1403780 - Posted: 16 Aug 2013, 6:31:07 UTC - in response to Message 1403758.  
Last modified: 16 Aug 2013, 6:40:38 UTC

I think the server may expire tasks if it thinks your host won't get them done before deadline.

That does not seem likely; there was still plenty of time.

What our hosts have in common is that both are on their first v7 tasks. Maybe the server doesn't trust them enough yet or something like that. Although I'm not quite satisfied with that explanation either.

Not true. My host has previously completed v7 tasks.

But not the minimum 10 for the Scheduler to establish trustworthy APR/turnaround times. See Details for Host 5798096. As of this writing, it shows only 8 completed v7 tasks and a turnaround time of over 30 days.

[Edit] And the task you cited in your original post was issued on 14 Aug and had an initial deadline of 4 September - 21 days (a shorty).
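
The figures quoted in this thread can be lined up with a few lines of Python. This is only a sketch of the arithmetic, not the scheduler's code: the variable names are made up, the 30-day turnaround is the rounded-down "over 30 days" figure, and whether the server really compares turnaround against the deadline window is a guess that nobody here has confirmed.

```python
from datetime import date, timedelta

# Dates quoted for the task in the original post.
issued = date(2013, 8, 14)
initial_deadline = date(2013, 9, 4)
allowed = initial_deadline - issued  # the window the task originally had

# The "impossible" deadline reported in the first post.
observed = timedelta(hours=1, minutes=7, seconds=20)

# Host figures quoted above: 8 completed v7 tasks against a minimum of 10.
MIN_COMPLETED = 10  # threshold as stated in the post, not read from BOINC source
completed_v7 = 8
trusted = completed_v7 >= MIN_COMPLETED

# Assumption: treat "over 30 days" as exactly 30 days for the comparison.
turnaround = timedelta(days=30)
turnaround_exceeds_window = turnaround > allowed

print(allowed.days, observed.total_seconds(), trusted, turnaround_exceeds_window)
```

So the original window was 21 days, the shortened deadline was 4040 seconds, the host was still below the 10-task threshold, and its recorded turnaround was longer than the window.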
Donald
Infernal Optimist / Submariner, retired
ID: 1403780 · Report as offensive
