The Server Issues / Outages Thread - Panic Mode On! (117)
Author | Message |
---|---|
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"In my case, I have to report frequently, more than 70 tasks reported at a time after outages results in Scheduler issues and nothing gets reported." False logic. Even if you manage to report everything outstanding right at the beginning of an outage, you still have every task completed during the outage waiting at the end. There are other ways of ensuring that the 'end of outage report' isn't too large, and they work even if we have a 12 hour outage or a 2 day outage next time. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Hi Grant, . . Or maybe only use it if the machine has not reported results in over 1 hour. Stephen :) |
Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
Once I used the parameter in the cc_config.xml file to reduce the # of reported tasks to something lower than 400-500, the server stopped having a hiccup every time I tried to report after the last Tuesday outage. Something like this: `<cc_config> <log_flags> <sched_op_debug>1</sched_op_debug> </log_flags> <options> <use_all_gpus>1</use_all_gpus> <save_stats_days>90</save_stats_days> <max_file_xfers>16</max_file_xfers> <max_file_xfers_per_project>8</max_file_xfers_per_project> <max_tasks_reported>150</max_tasks_reported> </options> </cc_config>` Tom A proud member of the OFA (Old Farts Association). |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"If I remember correctly, Richard can confirm . . . . the client is hard-wired in the scheduler module to contact the project at least once an hour." Yep, in cs_scheduler.cpp: `#define MAX_REPORT_DELAY 3600`, commented there as "report results within this time after completion". Seti@Home classic workunits: 20,676 CPU time: 74,226 hours. A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
Yes to both of those - `MAX_REPORT_DELAY 3600` and `<max_tasks_reported>150</max_tasks_reported>` - though I'd personally take `<max_tasks_reported>` even lower, to perhaps 64. Only caveat: neither of those was included in the earlier versions of BOINC. If you're still using one of those, I'd seriously suggest you consider upgrading it: there are some good things in the newer versions, even if you have to do a little re-learning. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"False logic. Even if you manage to report everything outstanding right at the beginning of an outage, you still have every task completed during the outage still waiting at the end." "`<max_tasks_reported>150</max_tasks_reported>`" Which is why I have that set to 75 to avoid Scheduler problems when reporting. Avoid letting it build up to problem levels (when possible), and limit the number to avoid problems when it's not possible to avoid the build-up in the first place. Edit - oh, and it's taken a while, but finally my Linux system is able to get work on most requests. Grant Darwin NT |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"Yes to both of those -" . . It's OK Wiggo, it is ONLY a recommendation, you can keep using your prehistoric version of BOINC :) Stephen :) |
Joined: 24 Jan 00 Posts: 37554 Credit: 261,360,520 RAC: 489 |
". . It's OK Wiggo, it is ONLY a recommendation, you can keep using your prehistoric version of BOINC :)" But I'm not using Windows any more Stephen. ;-) Cheers. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
". . It's OK Wiggo, it is ONLY a recommendation, you can keep using your prehistoric version of BOINC :)" "But I'm not using Windows any more Stephen. ;-)" And he's using BOINC v7.14.2 :-) Well done, that man. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
". . It's OK Wiggo, it is ONLY a recommendation, you can keep using your prehistoric version of BOINC :)" "But I'm not using Windows any more Stephen. ;-)" . . D'oh! < Stephen slaps himself with a wet trout ... > . . Damn, a perfectly good joke ruined/wasted ... Stephen :) |
Lazydude Joined: 17 Jan 01 Posts: 45 Credit: 96,158,001 RAC: 136 |
". . It's OK Wiggo, it is ONLY a recommendation, you can keep using your prehistoric version of BOINC :)" "But I'm not using Windows any more Stephen. ;-)" ". . Damn, a perfectly good joke ruined/wasted ..." Nope, it just backfired and got much funnier ... |
Kiska Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0 |
"I'll put that in once I remember how I set up munin :D" "Excellent!" OK, I've added both graphs. But I don't have historical data since I just added them. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"OK, I've added both graphs. But I don't have historical data since I just added them." Thanks for that. The graphs make it so much easier to see how things are going than having just the current numbers. Grant Darwin NT |
Kiska Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0 |
"OK, I've added both graphs. But I don't have historical data since I just added them." "Thanks for that." Also, I'm sorry that anyone wanting those graphs has to load the other million on the page... Just a side effect of graphing most projects. |
Lazydude Joined: 17 Jan 01 Posts: 45 Credit: 96,158,001 RAC: 136 |
"I'll put that in once I remember how I set up munin :D" "Excellent!" Thank you very much! On my wishlist: Result turnaround time (last hour average). It's a good indication of when there are a lot of shorties in the system. Earlier this year (Aug), if the value went under 30h, I suspected that the system would be in trouble within a couple of hours. I have not yet seen at what value the trouble starts again; 26h seems to be fine. Thanks again! |
Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
I'm starting to see a small amount in the Ready to Send Queue... 40K. I take this as a good sign. Are some of the faster machines now getting some WUs to fill the cache?? |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"But I'm not using Windows any more Stephen. ;-)" . . Glad it wasn't wasted ... Stephen :) |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"I'm starting to see a small amount in the Ready to Send Queue... 40K. I take this as a good sign. Are some of the faster machines now getting some WUs to fill the cache??" Think that is the effect of all the spoofed clients reducing their GPU count to reasonable levels, owing to the 400 per GPU limit now. I certainly backed off considerably on all my hosts. Still working through the overabundance of GPU tasks, trying to find the new reduced cache floor. Haven't asked for GPU work since discovering the new limits this morning. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours. A proud member of the OFA (Old Farts Association) |
Joined: 1 Apr 13 Posts: 1858 Credit: 268,616,081 RAC: 1,349 |
+1 !! |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13903 Credit: 208,696,464 RAC: 304 |
"I'm starting to see a small amount in the Ready to Send Queue... 40K. I take this as a good sign. Are some of the faster machines now getting some WUs to fill the cache??" "Think that is the effect of all the spoofed clients reducing their gpu count to reasonable levels owing the 400 per gpu limit now." And hosts such as my Linux one, which kept getting mostly "Project has no tasks available" responses when trying for work: now that it's regularly getting work, it's finally managed to fill its cache. The Results-in-progress line is now more horizontal than vertical; it's still going to take a while for things to settle down, but the end is in sight. Looks like there will be an extra 1.8 million or so WUs out with hosts now (around 6.8 million in total). Grant Darwin NT |
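
Editor's sketch of the `<max_tasks_reported>` arithmetic discussed in the thread above. This is not BOINC code. It is a rough Python model that assumes the client simply reports its backlog in consecutive scheduler requests of at most `max_tasks_reported` tasks each, and that a value of 0 means "no limit". The function name `report_batches` is hypothetical; the numbers (a backlog of around 500, limits of 150, 75 and 64) are the ones quoted by posters.

```python
def report_batches(pending, max_tasks_reported):
    """Return the batch sizes needed to report `pending` completed tasks.

    Assumption (not taken from the BOINC source): each scheduler request
    carries at most `max_tasks_reported` results, and 0 means no limit.
    """
    if pending < 0 or max_tasks_reported < 0:
        raise ValueError("counts must be non-negative")
    if pending == 0:
        return []                      # nothing to report
    if max_tasks_reported == 0:
        return [pending]               # unlimited: one big request
    full, rest = divmod(pending, max_tasks_reported)
    batches = [max_tasks_reported] * full
    if rest:
        batches.append(rest)           # final, smaller request
    return batches


if __name__ == "__main__":
    # Backlog of 500 tasks after an outage, with the limits from the thread.
    for limit in (0, 150, 75, 64):
        sizes = report_batches(500, limit)
        print(f"limit={limit:>3}: {len(sizes)} request(s), largest={max(sizes)}")
```

Under this model a lower limit trades one large request for several smaller ones, which is the behaviour the posters rely on to avoid Scheduler problems after an outage.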