Server Run, July 30 - August 2 2010

Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1022113 - Posted: 3 Aug 2010, 1:16:25 UTC - in response to Message 1022084.  
Last modified: 3 Aug 2010, 1:16:58 UTC

Sutaru,
That VLAR_2 was just one of the new ones that got sent back for some reason or another. So long as it doesn't end up on your GPU it should be no problem.

?

I was talking about hiamps' VLAR WU... ;-)

I use Fred's BOINC Rescheduler (V1.6) to keep VLAR WUs off my GPU and to avoid the famous -177 errors. :-)

Got 4 more VLARs and used the 1.6 this time... Looked through the list but didn't see 4 that looked any different.

I got ~20 VLAR WUs for the GPU which weren't marked as .vlar_x.
So I ran the BOINC Rescheduler, and all is fine. ;-)
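
As an aside, here is a minimal sketch in Python of the detection step a rescheduler-style tool needs. It is not Fred's actual tool, and it assumes VLAR tasks can be recognized by a '.vlar' marker in the workunit name inside BOINC's client_state.xml, which, as noted above, is not always the case.

```python
# Illustrative sketch only, NOT Fred's BOINC Rescheduler.
# Assumes VLAR workunits can be recognized by a ".vlar" marker in their
# name inside client_state.xml (the post above notes this is not always true).
import xml.etree.ElementTree as ET

def find_vlar_workunits(client_state_path="client_state.xml"):
    """Return the names of workunits whose name contains '.vlar'."""
    tree = ET.parse(client_state_path)
    vlar_names = []
    for wu in tree.getroot().iter("workunit"):
        name = wu.findtext("name", default="")
        if ".vlar" in name:
            vlar_names.append(name)
    return vlar_names

if __name__ == "__main__":
    for name in find_vlar_workunits():
        print("VLAR candidate:", name)
```
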
ID: 1022113 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1022115 - Posted: 3 Aug 2010, 1:21:54 UTC - in response to Message 1022112.  

It sure would be cool if they could raise the work unit cache from 100 to 1000 or something due to the limited amount of download time.

You mean the daily WU quota for GPU?

Your top-RAC host currently shows 'Max tasks per day: 453' (GPU). That value × 8 = 3,624 WUs/GPU/day.

It's a pity that they don't show the correct WU value there.
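
For anyone who wants to check that arithmetic, a tiny sketch; the ×8 multiplier is taken from the post as given, and what it represents (e.g. a per-resource or per-GPU factor) is an assumption here.

```python
# Rough quota arithmetic from the post above. The multiplier of 8 is taken
# as given; what it represents (e.g. a per-GPU or per-resource factor) is
# an assumption here.
max_tasks_per_day = 453   # 'Max tasks per day' shown on the host page
multiplier = 8            # host-specific factor quoted in the post

effective_daily_quota = max_tasks_per_day * multiplier
print(f"Effective daily quota: {effective_daily_quota} WUs/day")  # 3624
```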

ID: 1022115 · Report as offensive
Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1022134 - Posted: 3 Aug 2010, 5:07:41 UTC - in response to Message 1022115.  
Last modified: 3 Aug 2010, 5:08:15 UTC

It sure would be cool if they could raise the work unit cache from 100 to 1000 or something due to the limited amount of download time.

You mean the daily WU quota for GPU?

Your top-RAC host currently shows 'Max tasks per day: 453' (GPU). That value × 8 = 3,624 WUs/GPU/day.

It's a pity that they don't show the correct WU value there.


Sutaru, I think he is talking about expanding the Download Feeder Process, which has slots for only 100 Results at a time.



Hiamps, if the Feeder slots could be expanded to 200 or more, what might that do to the already maxed-out download bandwidth, not to mention all the issues some folks have had with "ghosts" - WUs that were assigned but were never properly downloaded?

A larger Download Feeder would also require more memory space on the server - is there enough available?
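
A quick illustration of what "slots" means here: the feeder keeps a small, fixed-size pool of ready-to-send results topped up from a larger backlog, so the scheduler doesn't have to hit the database for every request. The toy Python model below is only a sketch of that idea, not the actual BOINC feeder; the slot count and refill logic are illustrative.

```python
# Toy model of a fixed-slot feeder: a small buffer of ready-to-send results
# that is topped up from a larger backlog. Purely illustrative; the real
# BOINC feeder works against shared memory and the project database.
from collections import deque

class ToyFeeder:
    def __init__(self, backlog, slots=100):
        self.slots = slots                # fixed number of feeder slots
        self.backlog = deque(backlog)     # results waiting to enter the feeder
        self.buffer = deque()             # results currently "in the feeder"
        self.refill()

    def refill(self):
        """Top the buffer back up to the slot limit from the backlog."""
        while len(self.buffer) < self.slots and self.backlog:
            self.buffer.append(self.backlog.popleft())

    def handout(self, n):
        """Hand out up to n results to a scheduler request, then refill."""
        sent = [self.buffer.popleft() for _ in range(min(n, len(self.buffer)))]
        self.refill()
        return sent

# Example: 100 slots drained by a request of ~20 results, then refilled.
feeder = ToyFeeder(backlog=range(1000), slots=100)
print(len(feeder.handout(20)), "results sent;", len(feeder.buffer), "left in the feeder")
```
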
ID: 1022134 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1022155 - Posted: 3 Aug 2010, 7:38:59 UTC - in response to Message 1022134.  

It sure would be cool if they could raise the work unit cache from 100 to 1000 or something due to the limited amount of download time.

You mean the daily WU quota for GPU?

Your top-RAC host currently shows 'Max tasks per day: 453' (GPU). That value × 8 = 3,624 WUs/GPU/day.

It's a pity that they don't show the correct WU value there.


Sutaru, I think he is talking about expanding the Download Feeder Process, which has slots for only 100 Results at a time.



Hiamps, if the Feeder slots could be expanded to 200 or more, what might that do to the already maxed-out download bandwidth, not to mention all the issues some folks have had with "ghosts" - WUs that were assigned but were never properly downloaded?

A larger Download Feeder would also require more memory space on the server - is there enough available?

Not sure how that works anymore... It seems that if 10,000 people have downloads going when the servers go down, then when they come back up the servers handle it OK? Just going by what I see... Looks like they opened the gates wide anyway, as I took a nap and came back to 3,500 downloads.
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1022155 · Report as offensive
Profile RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1022372 - Posted: 4 Aug 2010, 2:18:24 UTC

What happened this outage is exactly why ALL LIMITS NEED TO BE REMOVED AFTER 24 HOURS.

Check the cricket graphs: the server problems started about 10 hours after the limits were removed.
ID: 1022372 · Report as offensive
Profile Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 1022378 - Posted: 4 Aug 2010, 2:30:56 UTC

I disagree.........................

When all limits are removed is precisely when "ghost" work units are more likely to occur, and the probability of the project going down increases substantially, due to excessive database activity and the bandwidth being maxed out.

I think if the cricket graph could be kept at a maximum of 80 (or less, due to reduced demand) for the entire weekend, then everything would work a lot better.
Boinc....Boinc....Boinc....Boinc....
ID: 1022378 · Report as offensive
Profile RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1022380 - Posted: 4 Aug 2010, 2:43:37 UTC - in response to Message 1022378.  

And then they have a chance to fix it before the outage, rather than having just a few hours to fill my belly... :P
ID: 1022380 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1022381 - Posted: 4 Aug 2010, 2:44:52 UTC - in response to Message 1022372.  

What happened this outage is exactly why ALL LIMITS NEED TO BE REMOVED AFTER 24 HOURS.

Check the cricket graphs: the server problems started about 10 hours after the limits were removed.

If you really convince them of that relationship, they absolutely won't make the final limits boost earlier and incur some obligation to fix things during the weekend. They're not paid for that kind of support, though they usually do it when needed if they can.

It might be possible to have one set of limits at the Friday morning start, a boost near their quitting time Friday afternoon, then the final boost Monday morning. But that first Friday morning set might need to be lower in order to reach a steady state before quitting time Friday afternoon. I doubt they'll be inclined to experiment along those lines next weekend, assuming there are new AP validators to watch.
                                                                  Joe
ID: 1022381 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1022414 - Posted: 4 Aug 2010, 8:07:18 UTC - in response to Message 1022381.  

What happened this outage is exactly why ALL LIMITS NEED TO BE REMOVED AFTER 24 HOURS.

Check the cricket graphs: the server problems started about 10 hours after the limits were removed.

If you really convince them of that relationship, they absolutely won't make the final limits boost earlier and incur some obligation to fix things during the weekend. They're not paid for that kind of support, though they usually do it when needed if they can.

It might be possible to have one set of limits at the Friday morning start, a boost near their quitting time Friday afternoon, then the final boost Monday morning. But that first Friday morning set might need to be lower in order to reach a steady state before quitting time Friday afternoon. I doubt they'll be inclined to experiment along those lines next weekend, assuming there are new AP validators to watch.
                                                                  Joe

I would like them to move the three-day outage to Monday through Wednesday, so that they can start up again on Thursday with limited downloads, which can then be ramped up on Friday under supervision.
At the moment UK office computers only have Monday and Tuesday to download, and if there are weekend outages even that option is dead. At present the restart on Fridays is after UK office hours.
ID: 1022414 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1022424 - Posted: 4 Aug 2010, 9:19:48 UTC - in response to Message 1022381.  

What happened this outage is exactly why ALL LIMITS NEED TO BE REMOVED AFTER 24 HOURS.

Check the cricket graphs: the server problems started about 10 hours after the limits were removed.

If you really convince them of that relationship, they absolutely won't make the final limits boost earlier and incur some obligation to fix things during the weekend. They're not paid for that kind of support, though they usually do it when needed if they can.

It might be possible to have one set of limits at the Friday morning start, a boost near their quitting time Friday afternoon, then the final boost Monday morning. But that first Friday morning set might need to be lower in order to reach a steady state before quitting time Friday afternoon. I doubt they'll be inclined to experiment along those lines next weekend, assuming there are new AP validators to watch.
                                                                  Joe


If I were an admin of this project (and I guess it shouldn't be a problem to adjust the limit from home via remote access to the server), it wouldn't be a problem for me to look at the cricket graph twice a day over the weekend and increase the limit in steps.

;-)
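
That "raise it in steps while watching the cricket graph" idea amounts to a simple feedback rule. Here is a hedged sketch of such a rule; the thresholds, step size and cap are entirely hypothetical and do not reflect the project's real configuration or tooling.

```python
# Hypothetical step-up rule for the per-host limit, driven by observed
# bandwidth utilization. Thresholds, step size and cap are invented for
# illustration only; this does not model the project's real setup.
def next_limit(current_limit, utilization, step=20, cap=500, headroom=0.80):
    """Raise the limit by one step while utilization stays under the headroom."""
    if utilization < headroom and current_limit < cap:
        return min(current_limit + step, cap)
    return current_limit  # hold steady (an admin could also step back down)

# Checking twice a day over a weekend, as suggested above:
limit = 60
for utilization in (0.55, 0.70, 0.78, 0.92, 0.76, 0.81):
    limit = next_limit(limit, utilization)
    print(f"utilization {utilization:.0%} -> limit {limit}")
```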

ID: 1022424 · Report as offensive
kittyman Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1022496 - Posted: 4 Aug 2010, 15:04:43 UTC

I thought Jeff's original idea was to script the increases gradually after the outage ended. What happened to that plan?
Idling bandwidth over the weekend and then setting up a mad dash for everybody to fill their tanks on Monday just makes no sense to me.

And did anybody ever figure out what was behind all the difficulty folks were having connecting to the servers, even when the bandwidth usage was rather reasonable this run?

Meow meow.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1022496 · Report as offensive
IFRS
Volunteer tester
Joined: 21 May 99
Posts: 1736
Credit: 259,180,282
RAC: 0
Brazil
Message 1022520 - Posted: 4 Aug 2010, 16:37:03 UTC

Not to mention that the babysitting of the big crunchers, which was never light, is roughly doubled now, and sometimes you can't fill them enough for the outage = frustration. I don't know how long I will be able to keep up this pace before I get sick of it. Working on the machines 3 or 4 hours every day just to keep the farm running is a price I may not be able to pay FOREVER.

ID: 1022520 · Report as offensive
Profile Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 1022546 - Posted: 4 Aug 2010, 18:13:31 UTC - in response to Message 1022496.  

I agree. The weekend that Jeff seemed to be on call went very well. He kept us informed and made small increases throughout the weekend. By Monday I had very little to "top off" with for my 5 day cache.

Nobody ever explained the connection problem last weekend. As usual, no information flows from the top down. We get no respect!
Boinc....Boinc....Boinc....Boinc....
ID: 1022546 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1022579 - Posted: 4 Aug 2010, 21:52:03 UTC

DA knows there is a problem with ghosts, yet allows them to reset the DCF (duration correction factor) so you can't get work...
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1022579 · Report as offensive
Profile Ghery S. Pettit
Joined: 7 Nov 99
Posts: 325
Credit: 28,109,066
RAC: 82
United States
Message 1022604 - Posted: 4 Aug 2010, 23:52:25 UTC

I don't know what all the hoopla is about. I just leave the machines alone and they take care of themselves. Not much choice when I've been traveling and can only access one of them remotely. Plenty of work for them. Now, I only have a functional GPU on one of them, but it seems to do alright, as well. Good thing as I won't be able to babysit them the next two shutdowns, either.
ID: 1022604 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1022711 - Posted: 5 Aug 2010, 14:35:58 UTC
Last modified: 5 Aug 2010, 14:38:24 UTC

If your PC had at least 4 GPUs (for example GTX 260-216s), you would need at least ~700 normal WUs/day.
For the three-day outage you need at least ~2,100 normal WUs.
To have a bit of a safety reserve (a 4-day WU cache), that makes ~2,800 WUs.

If you just left the BOINC Manager running on its own and your PC kept getting 'no tasks available' plus a backoff before the next work request, it would not be possible to fill the WU cache up to ~2,800 WUs.

If you get 1/4 shorties and 3/4 normal WUs, you would need ~2,100 normal and ~2,800 shorty WUs = ~4,900 WUs for 4 days. BTW, I don't think the BOINC client/manager can handle a WU cache that big.

If the SETI@home scheduler sends your BOINC ~20 WUs per request, your BOINC needs ~245 successful contacts.

I think this example shows quite well that only ~24 hours with no limit (or an increased limit) is too little.
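
Restating that arithmetic as a sketch; every input is the post's own rough estimate (per-host throughput, shorty mix, WUs per scheduler reply), not a measured value.

```python
# The post's own cache arithmetic for a hypothetical 4-GPU host; every
# input below is the post's rough estimate, not a measured value.
import math

normal_per_day = 700          # ~700 normal WUs/day for a 4-GPU host (post's estimate)
outage_days = 3
cache_days = 4                # one extra day as a safety reserve

print("for the outage:", normal_per_day * outage_days, "normal WUs")    # ~2,100
print("for a 4-day cache:", normal_per_day * cache_days, "normal WUs")  # ~2,800

# With a 1/4 shorty, 3/4 normal mix the post arrives at ~2,100 normal
# plus ~2,800 shorty WUs for the 4 days:
total_wus = 2100 + 2800                                                 # ~4,900

wus_per_request = 20          # ~20 WUs per scheduler reply (post's estimate)
print("scheduler contacts needed:", math.ceil(total_wus / wus_per_request))  # ~245
```
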
ID: 1022711 · Report as offensive
IFRS
Volunteer tester
Joined: 21 May 99
Posts: 1736
Credit: 259,180,282
RAC: 0
Brazil
Message 1022723 - Posted: 5 Aug 2010, 15:03:53 UTC - in response to Message 1022711.  

If your PC had at least 4 GPUs (for example GTX 260-216s), you would need at least ~700 normal WUs/day.
For the three-day outage you need at least ~2,100 normal WUs.
To have a bit of a safety reserve (a 4-day WU cache), that makes ~2,800 WUs.

If you just left the BOINC Manager running on its own and your PC kept getting 'no tasks available' plus a backoff before the next work request, it would not be possible to fill the WU cache up to ~2,800 WUs.

If you get 1/4 shorties and 3/4 normal WUs, you would need ~2,100 normal and ~2,800 shorty WUs = ~4,900 WUs for 4 days. BTW, I don't think the BOINC client/manager can handle a WU cache that big.

If the SETI@home scheduler sends your BOINC ~20 WUs per request, your BOINC needs ~245 successful contacts.

I think this example shows quite well that only ~24 hours with no limit (or an increased limit) is too little.


Exactly. A 10k-RAC machine with a single CUDA card won't need any babysitting, or nearly none. But the top-100 hosts, which I guess are mostly owned by the people who use these forums, need a good amount of babysitting, which is doubled now under this system.
Some of them, as Sutaru says, can't hold enough work for the outage. And even if you can, BOINC can't handle that many WUs without hanging the machine, or even report them when they are uploaded.
All I know is the big guns are suffering. Not complaining, just saying. If something can be done to avoid it, I think it should be, because the hardcore users put big money and time into the project, for the science of it.
ID: 1022723 · Report as offensive