Panic Mode On (54) Server problems?
Author | Message |
---|---|
Blake Bonkofsky Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0 |
Indeed, it looks as if the scheduler is assigning work again. The cricket graph is back to maxed out on the download side; however, uploads still aren't going through. I believe you are the first to report receiving AP tasks. Perhaps someone at the lab is having an early morning, or maybe a late night ;) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14655 Credit: 200,643,578 RAC: 874 |
I am in the process of downloading 6 AP WUs with completion times of 10x. I thought AP downloads were turned off. I'm beginning to wonder if it might be working like this: BOINC servers aren't supposed to send out work if it can't be completed in time. The servers know, obviously, what work has been allocated to a host, and they also know what work the host knows about (otherwise 'resend lost results' wouldn't work). What if (and I'm guessing blind here) you have something approaching a 10-day cache? The server is suddenly thinking your host is ten times slower than it really is (because of the APR cap). That would make the server think your 10-day cache is going to take you 100 days. That would take you beyond deadline, so there wouldn't be any point in sending you work that (it thinks) you wouldn't finish in time. But once your (real) cache has dropped below 2.5 days, the server's (inflated) estimate of your cache size would come down to 25 days, and you'd start to be capable of completing extra work before deadline. It's a theory, anyway. |
Blake Bonkofsky Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0 |
Makes sense to me. Geoff, what are the estimated times for those tasks? And what is your current DCF? I know my Q8300 machine does AP tasks with the Lunatics app in ~12 hours. With the current calculations being crazily overestimated, I'm interested to see just how long the estimates are. My CPU work was off by about 4x on my i7 machine, with GPU work being off by about 60x for the same machine's GTX 460s. My other machines have been mostly unable to download new work during this time, so I can't compare them intelligently. |
geoff Send message Joined: 25 Apr 00 Posts: 123 Credit: 34,100,351 RAC: 18 |
DCF 1.1. Completion times 87.55 hrs! Normal completion time is about 9 hrs. |
Blake Bonkofsky Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0 |
So it looks like it's off by about a factor of 9. By itself that isn't AWFUL, but when you figure the tasks are usually 8-24 hours long, they could fill up even a 10-day cache VERY fast. Using your machine as an example, a quad-core taking 80 hours per task (the 87.5 hrs adjusted back to a DCF of 1.0), it would only take 12 tasks to fill 10 days. |
W-K 666 Send message Joined: 18 May 99 Posts: 19136 Credit: 40,757,560 RAC: 67 |
I can shoot that theory down immediately, as the DCF on my computer has stayed in the range 0.5 to 2.0. This is thanks to the incorrect APR for AP, which thought my computer was faster than it is and gave me 16 AP tasks just before the maintenance. By forcing the computer to always have at least one AP task running, every time one finishes it forces the DCF up; the last increase was to 1.88xxxx. |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Well, AP work is obviously being handed out. Last night, ready to send was above 30k, and now it is below 10k, so 20k have been issued already... or thrown at /dev/null and are being re-split. I'm hoping to pick some up really soon here though. My 10-day cache is probably going to be empty during maintenance tomorrow. I'll start to have idle cores in about 10 hours. With a completely empty cache, I wonder if I can get one of these 9x ETA tasks. Presently I'm asking for ~3.1M seconds of work and getting nothing in return. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving up) |
Floyd Send message Joined: 19 May 11 Posts: 524 Credit: 1,870,625 RAC: 0 |
I can shoot that theory down immediately as the DCF on my computer has kept in the range 0.5 to 2.0. This is thanks to the incorrect APR for AP that thought my computer was faster than it is and gave me 16 AP tasks just before the maintenance. Ok... as a noob I am wondering about the DCF: what is a good one? According to the computer/details page, my DCF is: Task duration correction factor 0.265752. Yet I am only getting work 1 or 2 WUs at a time... as I finish one, it (after reporting) only sends me 1 or 2 more. Yet when a WU starts, it still says about 25x more time than it actually takes to do the unit!!! So how does that work? |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13765 Credit: 208,696,464 RAC: 304 |
Yet when the WU starts it still says about 25x time more than it actually takes to do the Unit... !!! Check the "Shorties estimate up from three minutes to six hours" thread. In short: a server-side update borked things badly. The fact that the upload server is also down isn't helping. Grant Darwin NT |
SciManStev Send message Joined: 20 Jun 99 Posts: 6653 Credit: 121,090,076 RAC: 0 |
Will there ever come a day again, when AP flows freely, or will this drought of AP units force me to stray away from the path of AP only? At this point, I would be happy just being able to obtain and hold a full cache, regardless of what was in it. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website |
Dimly Lit Lightbulb 😀 Send message Joined: 30 Aug 08 Posts: 15399 Credit: 7,423,413 RAC: 1 |
Will there ever come a day again, when AP flows freely, or will this drought of AP units force me to stray away from the path of AP only? I don't know, the last AP I got took over twenty minutes to download :). |
Jim_S Send message Joined: 23 Feb 00 Posts: 4705 Credit: 64,560,357 RAC: 31 |
At this point in time has anyone got anything to upload or download? I Desire Peace and Justice, Jim Scott (Mod-Ret.) |
Sunny129 Send message Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0 |
At this point in time has anyone got anything to upload or download? Well, considering that I had well over 100 WUs in my cache before the server crash on the 13th, I probably have 30 or so AP tasks waiting to upload right now... |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Can't upload right now (for a few hours already) |
bryn Send message Joined: 6 Mar 01 Posts: 7 Credit: 145,339 RAC: 0 |
I don't think anything can be working right at the moment. I have only ever returned 1 AP, and I am currently crunching my second one, yet my account only shows the one that I have returned! Managed to upload 13 WUs earlier, but they went through very slowly over about 5/6 hours. I now have another 6 ready to return! Wow, just as I post that they all went through! |
Dimly Lit Lightbulb 😀 Send message Joined: 30 Aug 08 Posts: 15399 Credit: 7,423,413 RAC: 1 |
At this point in time has anyone got anything to upload or download? Seven tasks still waiting to upload. Zero to download. |
MikeN Send message Joined: 24 Jan 11 Posts: 319 Credit: 64,719,409 RAC: 85 |
Yes, I can get WUs to upload, but it takes a lot of effort. It has just taken 5 minutes and three attempts to upload 1 WU. The good news is that once they are uploaded, reporting works fine. Edit: the cricket graph also shows uploads currently at over 20 Mbit/s. Edit 2: but the server status page is currently showing the upload server as disabled, and its date/time seems to be up to date. If the upload server is disabled, where is all the uploaded data going??? |
Khangollo Send message Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0 |
9/19/2011 10:58:22 PM SETI@home Sending scheduler request: To fetch work. 9/19/2011 10:58:22 PM SETI@home Requesting new tasks for CPU 9/19/2011 10:58:27 PM SETI@home Scheduler request completed: got 0 new tasks 9/19/2011 10:58:27 PM SETI@home No tasks sent 9/19/2011 10:58:27 PM SETI@home This computer has reached a limit on tasks in progress Do we have a new surprise (?) :) This is happening on two of my CPU-only hosts and a GPU one, too. Cache nowhere near the usual 5 days I run. Edit: I hope this is a temporary measure to prevent overfetch after they revert back to normal run times. Does that mean they will fix it? Yay, if so! |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14655 Credit: 200,643,578 RAC: 874 |
9/19/2011 10:58:22 PM SETI@home Sending scheduler request: To fetch work. That does sound like a very sensible precaution by the project, to protect against runaway work-fetch during recovery from the APR and DCF problem. We've seen the limit applied before during recovery from problems: it will be a limit on the number of tasks in progress, not taking any account of their expected or actual run-time. Roughly how many tasks do you have in progress on the affected boxes? |
Khangollo Send message Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0 |
That does sound like a very sensible precaution by the project, to protect against runaway work-fetch during recovery from the APR and DCF problem. Yes, I was thinking it's for that reason, too. At least we have confirmation that the fix is on the way. We've seen the limit applied before during recovery from problems: it will be a limit on the number of tasks in progress, not taking any account of their expected or actual run-time. Roughly how many tasks do you have in progress on the affected boxes? 340 or so shorties on the CPU one, but that will be enough for 2-3 days, so I'm not panicking. |
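
Editor's note: the arithmetic behind Richard Haselgrove's cache theory and Blake Bonkofsky's 12-task figure earlier in the thread can be sketched in a few lines of Python. This only illustrates the posters' reasoning and is not BOINC scheduler code. The function names are made up for this example, and the 30-day deadline is an assumed placeholder, since the thread does not state the actual deadline.

```python
HOURS_PER_DAY = 24


def server_view_of_cache_days(real_cache_days, overestimate_factor):
    """Cache size as the server would see it if it overestimates runtimes."""
    return real_cache_days * overestimate_factor


def would_send_work(real_cache_days, overestimate_factor, deadline_days):
    """Richard's guess: no new work while the inflated cache exceeds the deadline."""
    inflated = server_view_of_cache_days(real_cache_days, overestimate_factor)
    return inflated < deadline_days


def estimate_at_unit_dcf(estimated_hours, dcf):
    """Strip the client's DCF from an estimate, as Blake does for geoff's 87.55 hrs."""
    return estimated_hours / dcf


def tasks_to_fill_cache(cache_days, cores, estimated_hours_per_task):
    """Number of tasks whose estimated runtime fills the cache across all cores."""
    return cache_days * HOURS_PER_DAY * cores / estimated_hours_per_task


if __name__ == "__main__":
    ASSUMED_DEADLINE_DAYS = 30  # hypothetical value, not taken from the thread

    # A 10-day cache seen through a 10x overestimate, then a 2.5-day cache.
    for real_days in (10, 2.5):
        inflated = server_view_of_cache_days(real_days, 10)
        allowed = would_send_work(real_days, 10, ASSUMED_DEADLINE_DAYS)
        print(f"real {real_days} days -> server sees {inflated} days, sends work: {allowed}")

    # geoff's estimate normalised to a DCF of 1.0 (Blake rounds this to 80 hours).
    print(f"estimate at DCF 1.0: {estimate_at_unit_dcf(87.55, 1.1):.1f} hrs")

    # Quad-core, 80 hours per task, 10-day cache.
    print(f"tasks to fill cache: {tasks_to_fill_cache(10, 4, 80)}")
```

With the figures quoted in the thread, the 10-day cache inflates to 100 days and the 2.5-day cache to 25 days, and a quad-core at 80 estimated hours per task reaches a 10-day cache with 12 tasks. Whether the server actually withholds work on this basis is Richard's guess, which W-K 666 disputes in the following post.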