Message boards :
Number crunching :
Panic Mode On (94) Server Problems?
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242
I don't know about you, but this is the most APs my computers have seen in over 2 months. One has 118, the other 112. Finally had to turn the heater down as all the GPUs are now revved up!!
Lionel Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597
Yes, I get that message too, even though I have only 4 CPU APs in the queue, when normally it is 100. It raises an interesting question: how can one be at a limit when 96% below the maximum value?
JaundicedEye Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1
Almost back to pre-Bruno-crash RAC.........Keep 'em coming! "Sour Grapes make a bitter Whine." <(0)>
Aurora Borealis Joined: 14 Jan 01 Posts: 3075 Credit: 5,631,463 RAC: 0
AP splitters have started running on Beta again.
Lionel Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597
@Sten-Arne: I'm not worried about it at all. The only thing that annoys me is the 4+ hours I have to sit in front of the puters, clicking the update button to get to 100 CPU tasks. After getting to 100 CPU tasks, the caches glide back down to circa 5 WUs again. It's just poor software.
Dena Wiltsie Joined: 19 Apr 01 Posts: 1628 Credit: 24,230,968 RAC: 26
@Sten-Arne: I'm not worried about it at all. I am running the current release, and after it picked up the first few AP units it filled my queue to 100 work units each and has maintained that level. I gave up and decided to maintain a 2-day queue, which is 100 work units each.
Lionel Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597
I don't have a problem maintaining work in the cache for GPU WUs; only CPU WUs are an issue, as the cache is not really re-stocked as they are completed.

[side issue] I would like to see the limits raised. I know there are people out there with slower machines, but 100 MB CPU WUs is about 1 day's worth of work, and 100 AP CPU WUs is about 3.5 days' worth (if you can get 100 WUs for your CPU, that is). 200 MB GPU WUs is about 8 hours of work (based on 10-minute completion times - appreciate shorties take less than 6 minutes) and 200 AP GPU WUs is about 40 hours of work (based on circa 50-minute completion times).

The other thing I would like to see is the 8-week deadline put on a slow glide path down to, say, 2 weeks. It was set at 8 weeks some 14+ years ago, when processors were much, much slower. I think it's time for a review here. Probably got an ice cream's chance in hell of any of the above.
OTS Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0
It looks like there might be less AP work being issued again. If I am reading the SSP correctly, there are three splitters again working on a single file. For a short time my CPU cache was holding steady at the maximum of 100, but now the total in progress is slowly going downhill.
JanniCash Joined: 17 Nov 03 Posts: 57 Credit: 1,276,920 RAC: 0
[side issue] I would like to see the limits raised. I know there are people out there with slower machines but 100 MB CPU WUs is about 1 day's worth of work. Not sure I understand your problem 100%, so ignore from here on if my comment doesn't compute. If all you want is to download more WUs in advance, to bridge a server outage for example, why not create multiple virtual machines with a total number of vCPUs equal to your physical cores (or threads)? With VT-x (or similar), virtualization has almost zero performance loss these days, and each VM would appear as a separate BOINC client, maintaining its own cache.
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12
Almost back to pre-Bruno-crash RAC.........Keep 'em coming! I'm not. Last summer I was around 15K. I've only recovered about half of my drop. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri.
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12
I'm surprised no one has posted PANIC!!! about the MB RTS being so low this morning. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri.
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874
Here's my run-through of my job_log: with help from Cosmic_Ocean, I've been able to assemble a data distribution history for AP, with records of 3,051 tapes split. But that's still a long way short of the 5,544 I know have been split for MB (as at 31 Dec) - the difference is far more likely to be our incomplete job logs than a vast stash of unprocessed tapes. So, if anyone would actually like to know how we're getting along, I'd need help with logs from some of you assiduous AP-hounds out there, please.
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874
I'm surprised no one has posted PANIC!!! about the MB RTS being so low this morning. The tapes which have been added to the splitter queue today and over the weekend have all been split for MB before, and it looks as if re-splitting for MB has been inhibited this time - the tapes are skimmed through very quickly without any new work appearing, as was happening for AP last week.
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242
Yup, was my computer. Sorry. Forget what I said. Bad driver.....
betreger Joined: 29 Jun 99 Posts: 11362 Credit: 29,581,041 RAC: 66
One might wonder why they still haven't started the AP assimilators? There are, as of now, 477,099 AP workunits waiting for assimilation. I was thinking this was part of a stress test for the new database.
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57
One might wonder why they still haven't started the AP assimilators? There are, as of now, 477,099 AP workunits waiting for assimilation. Until the assimilators start, no data is going into the AP science db. sah_assimilator/ap_assimilator: takes scientific data from validated results and puts it in the SETI@home (or Astropulse) database for later analysis. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
Dena Wiltsie Joined: 19 Apr 01 Posts: 1628 Credit: 24,230,968 RAC: 26
I don't have a problem maintaining work in the cache for GPU WUs, only CPU WUs are an issue as it does not really re-stock the cache as they are completed. After the outage today I burned off a fair amount of work, and one work request returned 34 MB work units. I am still a bit low on AP work, but I suspect I will get more after some of the other people get theirs.
Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13
I had an interesting occurrence on my slow Sempron machine. It was crunching away on an AP, and at that time it had not had 11 "completed" APs yet, so the estimates were still astronomical (pun sort-of intended).. like ~230 hours. Once it got to about 75% complete, the remaining time was low enough for the 2.5+1-day cache to ask for more work. By this time, the 11th "completed task" had happened, so the estimates that came with new tasks were much more realistic. Sort of. New tasks were coming with an estimate of 31 hours, when it normally takes ~46 for a <10%-blanked AP.

And then it got a few more APs. And then the one that was running finished.. and I thought, since the duration of that one was more than 10% off the estimate of the others, the others would change their estimates to match what that one actually took. But I was wrong. The estimates for all the ones that were in the cache at that time dropped to 18 hours... and then work fetch happened and got more. I realized this and set it to NNT, then counted how many that machine had and multiplied by 48 hours, and I shouldn't run past the deadline on any of them. But then a thought occurred... there's a newfangled error these days that bails on a WU when it runs for more than 2x the estimated duration.

So... to avoid basically losing everything after crunching most of the way through them first.. time for some client_state editing. I suspended network comms in BOINC, ran 'net stop boinc', made a copy of the data directory, then opened up client_state and added a 0 to the left of the decimal in the flops_est fields. Save, 'net start boinc', and checked.. now they all show 188 hours.. and of course it's running in high-priority mode. I didn't want crazy things to happen, so I did 'net stop boinc' again, opened client_state back up, and decided to cut the values down by approximately a third. The high-order digits were 28, so I changed them to 10. Save, 'net start boinc', and now they're 68 hours. Close enough.

It should be able to smooth itself out from here. Of course, I'll keep an eye on it. I have already set that machine back to a MB+AP venue (I had it on AP-only just to try to get at least the 11th "completed task" done), so hopefully it doesn't gobble up too many more of those APs from the rest of you. I know some of you just cringe at the fact that your GPU does them in 30 minutes and then the wingmate ends up taking 47 hours with the Lunatics CPU app. That little machine has been pretty good to me over the years. It's been crunching for just over 7 years and is edging closer to 1M credits.

I just wanted to share that weird scenario. I'm pretty sure that if I had set it to NNT and only gotten one AP assigned at a time, the whole debacle would have been avoided. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving up)
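The client_state surgery described above can be scripted rather than hand-edited, which avoids typos in the exponent. A hedged sketch: the post calls the fields "flops_est" and the exact tag name varies by client version, so the tag is a parameter here; the precautions described (stop the client first, keep a backup of the data directory) still apply:

```python
import re

# Sketch of the client_state edit described above: scale the numeric
# value inside a given XML tag by a constant factor. ASSUMPTIONS: the
# tag name (e.g. rsc_fpops_est) and that values are plain numbers in
# scientific or decimal notation. Stop the BOINC client and back up
# client_state.xml before writing anything back.

def scale_tag(xml_text, tag, factor):
    pattern = re.compile(r"(<{0}>)([0-9.eE+-]+)(</{0}>)".format(tag))
    def bump(m):
        return "{0}{1:.6e}{2}".format(m.group(1), float(m.group(2)) * factor, m.group(3))
    return pattern.sub(bump, xml_text)

sample = "<workunit><rsc_fpops_est>2.800000e+13</rsc_fpops_est></workunit>"
print(scale_tag(sample, "rsc_fpops_est", 10))
# <workunit><rsc_fpops_est>2.800000e+14</rsc_fpops_est></workunit>
```

One caveat, again hedged: the "2x the estimate" abort is governed by a separate bound field in some client versions, so it may be worth checking both fields before relying on this trick.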
Lionel Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597
Still not getting any AP CPU WUs (and I have 1 box with none at the moment)...
JohnDK Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127
Seems we're back to problems getting work. Before Tuesday's outage I had max GPU cache for the first time in about 2 months; now I'm down to half on the GPUs and it continues going down.
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.