The Server Issues / Outages Thread - Panic Mode On! (118)
Author | Message |
---|---|
Richard Haselgrove Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
@Grant: The solution doesn't match the problem. 'Update' will (amongst other things) try to get more work allocated by the scheduler, but this work is already allocated; it just needs to be downloaded. What's needed is some variation on --file_transfer URL filename {retry | abort}, but as it stands that needs a filename. You could use --get_file_transfers and pipe the output to a file, then pick filenames from that. |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
Also, running update unconditionally will force a scheduler request while you are still within the cooldown from the previous request. This makes the scheduler refuse to send you any work and resets the cooldown to the full five minutes. When it happens every 15 min, you may lose up to 33% of your opportunities to get more work, which puts you at a disadvantage in a situation where work generation is being throttled. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
for i in `boinccmd --get_file_transfers | sed -n -e 's/^.*name: //p'`;do boinccmd --file_transfer http://setiathome.berkeley.edu $i retry;done Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
"Also unconditional running of update will force a scheduler request to happen when you are still within the cooldown from the previous request. This makes the scheduler refuse to send you any work and reset the cooldown back to full five minutes. When it happens every 15 min, you may lose up to 33% of your opportunities to get more work, which puts you at a disadvantage in a situation where work generation is being throttled." I run it on an interval of 930s. While you might miss a couple of times, it doesn't seem to have any real adverse effects. My systems still get topped up all the way. It's a lot better than going into extended backoffs while you're sleeping and waking up to a cold system. |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
"I run it on an interval of 930s. While you might miss a couple of times, it doesn't seem to have any real adverse effects. My systems still get topped up all the way. It's a lot better than going into extended backoffs while you're sleeping and waking up to a cold system." Your cron can use smaller units than minutes? |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
I don’t run it in cron. I just open a terminal window, use the watch command, and let it run. |
Harri Liljeroos Joined: 29 May 99 Posts: 4829 Credit: 85,281,665 RAC: 126 |
BoincTasks can do the upload/download retries for you automatically. There is an example config.xml file under "Program Files\eFMer\BoincTasks\examples". Copy it to the folder where the BoincTasks exe files are, edit it, and restart BoincTasks. By default it checks the upload/download queue every 4000 seconds and retries the transfers. If file transfers still fail, it automatically shortens the interval by 360 seconds (or something like that, I don't remember exactly). The interval is decreased after each failed retry until it reaches 180/360 seconds (sorry, I don't remember the exact value). If file transfers succeed, the interval is increased by the same step until it is back at 4000 seconds. The config.xml also lets you control work requests, but I haven't experimented with those parameters. The file has comments describing the different parameters and their purpose. You can see how it has operated under the BoincTasks menu Show->Log. |
Jimbocous Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
"The solution doesn't match the problem. 'Update' will (amongst other things) try to get more work allocated by the scheduler: but this work is already allocated - it just needs to be downloaded." That's what I get for posting so late (early) at night. Sorry, I was conflating two different issues. What I was addressing there was the issue of BOINC "forgetting" to even go asking for work for extended periods when there was none. NVM. Sounds like the BoincTasks option may be worth exploring ... appreciate it. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
for i in `boinccmd --get_file_transfers | sed -n -e 's/^.*name: //p'`;do boinccmd --file_transfer http://setiathome.berkeley.edu $i retry;done This command will give the boot to stuck transfers if you don’t want to run additional software. |
Buckeye4LF Joined: 19 Jun 00 Posts: 173 Credit: 54,916,209 RAC: 833 |
for i in `boinccmd --get_file_transfers | sed -n -e 's/^.*name: //p'`;do boinccmd --file_transfer http://setiathome.berkeley.edu $i retry;done Do you just run this every time you power up the machine, or do you wait for an issue to come up? I am new to Linux as well. What options, if any, do you set for the watch command? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I would only run the watch command when you expect or are getting issues and aren't going to be monitoring the machine while you are away or asleep, because it can preempt the normal 305 second scheduler connection and force a "too soon" timeout backoff interval. Generally that doesn't cause an issue because it is infrequent; the worst case is that it synchronizes with the stock interval and keeps preempting it. But if you choose an interval much longer than 305 seconds, and an odd number, that won't happen very often. It is safe to use. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Do you need to panic? I'm not panicking in the slightest. Results ready to send: 0; 0; 391,092; 9m. Current result creation rate **: 0/sec; 0.0184/sec; 3.9030/sec; 5m. Results out in the field: 0; 32,248; 6,382,603; 9m. |
Niteryder Joined: 1 Mar 99 Posts: 64 Credit: 22,663,988 RAC: 18 |
Are we back? |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
"Are we back?" Yup, but expect some recovery lag as the system gets pounded by EVERYONE returning finished WUs and asking for fresh data all at once. The system will be wonky for a while as it deals with the volume. |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
hello? |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
"Are we back?" Yep. Much Wonkiness. That's the computer equivalent of Jerkiness in Calculus. |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
"hello?" Good evening, Cruncher-American. |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
"hello?" Good afternoon in NZ! Not a lot of crunching going on right now. Seems things are a little stuck at the moment. Validations are slow, and downloads aren't happening despite a growing number of results ready to send. |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
"hello?" All part of the recovery process. |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
Splitters stopped too late. There are now over 1.2 million tasks in the RTS queue, and the total number of tasks in the db is way past the 20 million stability limit :( |
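
For readers who want the transfer-retry one-liner from this thread as a reusable script, here is a minimal sketch. It relies on the same two boinccmd options quoted in the thread (--get_file_transfers and --file_transfer URL filename retry) and on the same assumption as the original sed expression, namely that each transfer is listed with a "name: <file>" line. The BOINCCMD and PROJECT_URL variables are hypothetical conveniences added here, not part of boinccmd.

```shell
#!/bin/sh
# Retry every pending BOINC file transfer for one project.
# PROJECT_URL defaults to the URL used in the thread.
# BOINCCMD is a hypothetical override so the loop can be exercised
# without a running client; it is not a boinccmd feature.
PROJECT_URL="${PROJECT_URL:-http://setiathome.berkeley.edu}"
BOINCCMD="${BOINCCMD:-boinccmd}"

# Print one transfer file name per line.
# Assumption: the client reports each transfer with a "name: <file>" line,
# which is what the sed expression in the thread relies on.
transfer_names() {
    "$BOINCCMD" --get_file_transfers | sed -n -e 's/^.*name: //p'
}

# Ask the client to retry each listed transfer.
# Reading line by line and quoting "$name" keeps unusual file names intact.
retry_transfers() {
    transfer_names | while IFS= read -r name; do
        "$BOINCCMD" --file_transfer "$PROJECT_URL" "$name" retry </dev/null
    done
}

# Only contact the client if the command is actually available.
if command -v "$BOINCCMD" >/dev/null 2>&1; then
    retry_transfers
fi
```

Saved under a hypothetical name such as retry_transfers.sh, it could be left running in a terminal with `watch -n 930 sh retry_transfers.sh`, using the 930 second interval mentioned above. Note that this only retries transfers; it does not issue a project update, so it should not trigger the scheduler cooldown discussed earlier in the thread.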
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.