Project Backoff too extreme...
Author | Message |
---|---|
hiamps Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0 |
When the servers are slow, the "Project Backoff" sucks. If I leave it alone, I keep having more waiting to upload than I have uploading. It wants to keep waiting an hour or two when uploads are working but just slow. On its own BOINC managed to upload 18 units; in the last 20 minutes I made it upload 185 units. I'll take the flak, as I feel this is too extreme. The scheduler should know the difference between a slow connection and one that is not working. If they don't see anything is wrong and want to tell us, maybe I can break it the rest of the way so they will fix it! Official Abuser of Boinc Buttons... And no good credit hound! |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"The scheduler should know..." Uploads and downloads aren't done via the scheduler. Simply put, only reports and requests for new work use the scheduler. Uploads and downloads just transfer data between the project's server (disk) and your computer (disk). |
hiamps Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0 |
"The scheduler should know..." What tells it how long to wait from one try to the next? Official Abuser of Boinc Buttons... And no good credit hound! |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65759 Credit: 55,293,173 RAC: 49 |
Yes it does. I can see the retry in x minutes, but backoff frequently means longer and longer times, and when it does try, it's for a minute or two only and then it goes right back to backoff. Somebody rip it out. If not, I could go back to a version of BOINC that doesn't support it, like 6.10.18 (I think that one doesn't, at least). The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
JohnDK Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127 |
I use v6.10.18 and it has project backoff. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65759 Credit: 55,293,173 RAC: 49 |
"I use v6.10.18 and it has project backoff." I wasn't sure, thanks. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"Somebody rip it out" It's there to stop computers DDoS'ing the project. Max back-off will be 24 hours. If you don't like it ... The first version to have it was 6.6.38. |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"When the servers are slow, the 'Project Backoff' sucks. If I leave it alone, I keep having more waiting to upload than I have uploading. It wants to keep waiting an hour or two when uploads are working but just slow. On its own BOINC managed to upload 18 units; in the last 20 minutes I made it upload 185 units. I'll take the flak, as I feel this is too extreme. The scheduler should know the difference between a slow connection and one that is not working. If they don't see anything is wrong and want to tell us, maybe I can break it the rest of the way so they will fix it!" I'm sorry, my friend, but the backoffs are not big enough. Your complaint is that when a transaction fails, it takes too long for the next retry. My statement is "if the backoff was long enough, the very next try would almost always be successful." According to the "30 day" graphs on Scarecrow's site, something just under 50,000 results per hour are reported. That implies that there are about 50,000 uploads per hour (call it just under fourteen per second). SETI was struggling for a bit over 30 hours. That's a backlog of 1,500,000 uploads, and if I remember correctly, the cap on retries is 4 hours, meaning we're asking the upload servers to take over 100 completed uploads per second. That won't ever happen. So the project-wide backoff helps by saying "if one upload failed, another upload in a few seconds will also likely fail, so there is no great need to try." But there are many clients that don't have the project-wide backoff, and even if the project-wide backoff cut the load in half, it's still going to be a monster number. What is needed is a mechanism to tell the clients to slow down at times like this. If we knew that the upload servers could handle 20 uploads per second, and we could tell all the clients to back off to the point where there were exactly 20 per second, then backlogs would clear very much faster.
The biggest single problem is that our normal reaction is that anything will fit if we just push very hard. Sometimes we still think "push harder" is the right answer even after we break the tool. |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"I use v6.10.18 and it has project backoff." You'd have to go back to whatever the last version of 6.6 was, IIRC. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"The scheduler should know..." It is called an "exponential back-off": the back-off is calculated based on the number of times the upload has failed. The first few back-offs are fairly short, and it increases to an upper limit. The average, as I remember, is two hours. It doesn't go all the way to four hours and stay there (which is what a normal exponential back-off would do). I think most of us know (directly or indirectly) what happens if you jam stuff into the garbage disposer too fast. Feed it at a reasonable rate, and it grinds up the garbage into nice tiny bits that are flushed down the drain. Feed it too fast, and the motor jams, the circuit breaker pops, and you have to get the tool and unjam the motor, if you're lucky. If you clog the drain you might have to disassemble the trap or even call the rooter-dude. |
perryjay Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
"The biggest single problem is that our normal reaction is that anything will fit if we just push very hard. Sometimes we still think 'push harder' is the right answer even after we break the tool." Get a bigger hammer?? :-) PROUD MEMBER OF Team Starfire World BOINC |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65759 Credit: 55,293,173 RAC: 49 |
"When the servers are slow, the 'Project Backoff' sucks. If I leave it alone, I keep having more waiting to upload than I have uploading. It wants to keep waiting an hour or two when uploads are working but just slow. On its own BOINC managed to upload 18 units; in the last 20 minutes I made it upload 185 units. I'll take the flak, as I feel this is too extreme. The scheduler should know the difference between a slow connection and one that is not working. If they don't see anything is wrong and want to tell us, maybe I can break it the rest of the way so they will fix it!" My backoffs have been ongoing for nearly a week. SETI needs to get AP off its turf and have a few more servers so that acks can be sent out, as I realize the CPUs are under what amounts to a heavy bombardment. It's that or forbid any new accounts for SETI@home use until something changes. At least the CPUs haven't turned molten. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"My backoffs have been ongoing for nearly a week. SETI needs to get AP off its turf and have a few more servers so that acks can be sent out, as I realize the CPUs are under what amounts to a heavy bombardment. It's that or forbid any new accounts for SETI@home use until something changes. At least the CPUs haven't turned molten." But the problem comes from the fact that, given a choice between letting things cool off and turning up the heat, too many people want to try to melt the CPUs. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
"My backoffs have been ongoing for nearly a week. SETI needs to get AP off its turf and have a few more servers so that acks can be sent out" Or they can just figure out what the present problem is and fix it. That is unlikely to require new hardware. Grant Darwin NT |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"The biggest single problem is that our normal reaction is that anything will fit if we just push very hard. Sometimes we still think 'push harder' is the right answer even after we break the tool." Exactly. |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"My backoffs have been ongoing for nearly a week. SETI needs to get AP off its turf and have a few more servers so that acks can be sent out" Look at the numbers. The problem isn't the hardware on the server side; it's 100 upload attempts per second. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
"Somebody rip it out" Project jack off... IMHO. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Bill Walker Joined: 4 Sep 99 Posts: 3868 Credit: 2,697,267 RAC: 0 |
"The biggest single problem is that our normal reaction is that anything will fit if we just push very hard. Sometimes we still think 'push harder' is the right answer even after we break the tool." Well, that shows everybody who said Ned doesn't know Jack. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65759 Credit: 55,293,173 RAC: 49 |
"My backoffs have been ongoing for nearly a week. SETI needs to get AP off its turf and have a few more servers so that acks can be sent out" And what is the servers' maximum capacity to generate acks? Are we 50% over? 100% over? Or what? The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"My backoffs have been ongoing for nearly a week. SETI needs to get AP off its turf and have a few more servers so that acks can be sent out" Probably more like 250%. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
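
For readers who want to check the load estimate and the "exponential back-off" described in the thread, here is a minimal Python sketch. The figures (50,000 uploads per hour, a 30-hour outage, a 4-hour retry cap, a 20-per-second server capacity) are the ones quoted by the posters, some of them from memory. The function names, the 60-second `base`, and the uniform jitter are assumptions made for illustration; this is not BOINC's actual back-off code.

```python
import random

# Figures quoted in the thread (the 4-hour cap was given "if I remember correctly").
UPLOADS_PER_HOUR = 50_000
OUTAGE_HOURS = 30
RETRY_CAP_SECONDS = 4 * 3600


def load_estimate():
    """Return (normal uploads/s, backlog size, uploads/s if the whole
    backlog is retried within one capped back-off window)."""
    normal_rate = UPLOADS_PER_HOUR / 3600
    backlog = UPLOADS_PER_HOUR * OUTAGE_HOURS
    retry_rate = backlog / RETRY_CAP_SECONDS
    return normal_rate, backlog, retry_rate


def hours_to_clear(capacity_per_sec):
    """Hours needed to drain the backlog at a given accepted rate while new
    results keep arriving at the normal rate; None if it never drains."""
    normal_rate, backlog, _ = load_estimate()
    spare = capacity_per_sec - normal_rate
    if spare <= 0:
        return None
    return backlog / spare / 3600


def backoff_delay(failures, base=60.0, cap=RETRY_CAP_SECONDS, rng=random):
    """Capped exponential back-off with full jitter.

    Hypothetical parameters: `base` (the first ceiling, in seconds) and the
    uniform jitter are assumptions, not the client's real formula.
    """
    # Clamp the exponent so very large failure counts cannot overflow a float.
    ceiling = min(cap, base * 2 ** min(failures, 30))
    return rng.uniform(0, ceiling)


normal_rate, backlog, retry_rate = load_estimate()
print(f"normal load: {normal_rate:.1f} uploads/s")
print(f"backlog after the outage: {backlog:,} uploads")
print(f"load if everything retries within the cap: {retry_rate:.1f} uploads/s")
print(f"hours to clear at 20 uploads/s: {hours_to_clear(20):.1f}")
for failures in (0, 3, 6, 9):
    print(f"after {failures} failures: wait {backoff_delay(failures):.0f} s")
```

Once the ceiling reaches the 4-hour cap, a uniform draw between zero and four hours averages two hours. That is consistent with the average recalled in the thread, but the match does not confirm how the client actually computes its delay.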
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.