Panic Mode On (109) Server Problems?
Author | Message |
---|---|
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
34 channels left on the three remaining tapes. Could be getting dried up ... |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
34 channels left on the three remaining tapes. Could be getting dried up ... That'll take a load off of the servers. Edit- now down to 24 on 1 file. Edit- make that 16. Edit- make that 0. Last 6 are in progress. No more till they load up some new files, hopefully tomorrow some time. Grant Darwin NT |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
No more till they load up some new files, hopefully tomorrow some time. Soon, I hope. Already below freezing and snow and colder temps are in the forecast. May have to add some Einstein when I awake ... |
Ghia Send message Joined: 7 Feb 17 Posts: 238 Credit: 28,911,438 RAC: 50 |
No more till they load up some new files, hopefully tomorrow some time. Nothing coming down the pipe for hours now, "no tasks available". Doesn't bode well for tomorrow's outrage. Humans may rule the world...but bacteria run it... |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
No more till they load up some new files, hopefully tomorrow some time. Yeah, no tapes in queue to get split, so no work until some get loaded, apparently manually. I might have to borrow a cat to stay warm :) |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Eric tried to get things going last night by remote, but could not. He said he will go at it again this morning after he confers with Jeff. Meow. "Time is simply the mechanism that keeps everything from happening all at once." |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
These comments have led me to a possibly naive, but very basic question. Is BOINC truly scalable? My gut feeling, having never earned a single credit on any other project and thus not knowing their volumes, is that it must be. But if so, is the problem with how our work is configured and distributed, with how SETI was originally designed to do work, with hardware that just isn't up to the task, or something else? If I/O contention is the issue, I suppose it would depend on what I/O is causing the bottleneck. The GBT splitters are all on Centurion, and I don't see any other processes that would contend with them there. However, they must be hitting the BOINC DB, probably on Oscar, and feeding the split files to the scheduler over on Synergy. The file deleters are over on Georgem and Bruno, so it wouldn't seem as if they would contend with the splitters anywhere but at the DB. I believe (but don't quote me) that SETI was the original distributed computing project. Which is cool and all that, but that might carry some drawbacks as well. Being the originator of something (oh, I don't know, cell phones, "high speed" internet) sometimes locks you into a certain, usually quite expensive set of circumstances, and when the 2nd gen of whatever comes along from competitors who take your thing and improve on it, you're often stuck with what you have, and it's very expensive to upgrade, especially if there has been a paradigm shift. I am wondering if we are sort of in that situation right now? 
We were the first, blazed the trail and led the way, and then those that followed us saw it was great, but saw the shortcomings in how we initially did it as well, and then made adjustments and improvements to theirs that we couldn't easily make, due to the time and treasure already invested, as well as possibly in trying to keep with the stated goal of supporting almost every device, old to new (not really, but you know what I mean), and that inability to optimize might be helping cause these headaches? I honestly have no idea if what I proposed has any basis in reality or not, as I haven't been on the other side of things from the day I processed my first WU. I am just tossing out an idea as to what might be part of the reason we're dealing with DB issues and such. Is it hardware limitations? Software limitations? Inherent design limitations? I guess that in the scheme of all things computing, we really are pretty small fry. I mean, think of all the data that the NSA is processing every day. Yes, I know, look at their budget, all the hardware and personnel they can toss at any problem. I guess I'm just babbling about proof of concept, or maybe nothing at all, and just wanted to get some thoughts from others who know all of this stuff _Much_ more deeply than I will ever know. |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
I wonder if they could sort the database and move most of it into an archive database. All the work that is done and has nothing left in the field, thus leaving a much more manageable database active and online. Then the weekly outage would just sort the active database and move whatever has been completed during the week to the archive. I know this sounds too simple to be possible, but might that be possible? Some database queries would have to be rewritten to access the archived information. I dunno, just spitballing. Meow? "Time is simply the mechanism that keeps everything from happening all at once." |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
Tapes are beginning to appear, slowly. IIRC, there's quite a long pre-processing phase, which we don't see: as they emerge from that, they pop up automatically in the splitter queue. |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Tapes are beginning to appear, slowly. IIRC, there's quite a long pre-processing phase, which we don't see: as they emerge from that, they pop up automatically in the splitter queue. Well done, Eric and Jeff! "Time is simply the mechanism that keeps everything from happening all at once." |
Chris904395093209d Send message Joined: 1 Jan 01 Posts: 112 Credit: 29,923,129 RAC: 6 |
Looks like break time is over. Work is starting to slowly build up again. ~Chris |
betreger Send message Joined: 29 Jun 99 Posts: 11408 Credit: 29,581,041 RAC: 66 |
I wonder if they could sort the database and move most of it into an archive database. All the work that is done and has nothing left in the field, thus leaving a much more manageable database active and online. Then the weekly outage would just sort the active database and move whatever has been completed during the week to the archive. +1 |
Piotr Send message Joined: 24 May 17 Posts: 18 Credit: 20,069,282 RAC: 41 |
Hello, Every time the WUs run out, my crunchers show the same behavior: the slower one with fewer cores, which still has a few WUs left to crunch, receives new WUs before the second cruncher (faster, with more cores), which has completely run out of CPU WUs, receives its new batch. Also, when it does receive work, it gets GPU WUs first (of which it already has some from the previous batch) and no CPU WUs. How should I configure it so that more CPU WUs are downloaded in advance instead of GPU WUs? Every outage I would need about 100 more CPU WUs to keep the faster machine busy. |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Results ready to send : one ???? EDIT: Just changed to zero. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Hello, Generally, if you allow both CPU and GPU work, the servers determine which to send you. You can temporarily turn off fetching of GPU WUs in BOINC (in Advanced mode) under the Activity tab. Until you turn GPU fetch back on, you will get only CPU WUs. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
So would now be a good time to panic?? ;) Just kidding....that's why I have backup projects. Giving beta a test, though I'm not sure what we are looking at there...(only 1 machine and only 1 work unit per card)...Einstein is getting a boost from this. |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
My backup project is to improve the code and run local tests. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
My backup project is to improve the code and run local tests. Recovering some of those 5,300+ ghosts you've created wouldn't be a bad use of your time, either. Having those locked away when other people are starving for work is not particularly helpful. |
Stargate (SA) Send message Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
How does one have so many errors and invalids? |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
He is testing a CUDA alpha application, so there are bound to be some teething issues. |
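
kittyman's archive idea from earlier in the thread (keep a small active database and move finished work into a separate archive during the weekly outage) can be sketched roughly as below. This is only an illustration under loud assumptions: the table names `result` and `result_archive`, the `state` column, and its `'done'` value are all hypothetical and are not the real BOINC schema, and SQLite stands in for whatever database the project actually runs.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical active/archive table pair."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE result (id INTEGER PRIMARY KEY, wu_name TEXT, state TEXT);
        CREATE TABLE result_archive (id INTEGER PRIMARY KEY, wu_name TEXT, state TEXT);
        """
    )
    return conn


def archive_completed(conn):
    """Move every row marked 'done' from the active table into the archive.

    Both statements run inside one transaction, so a failure part-way
    through rolls back and no row is lost or duplicated.
    Returns the number of rows removed from the active table.
    """
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO result_archive SELECT * FROM result WHERE state = 'done'"
        )
        cur = conn.execute("DELETE FROM result WHERE state = 'done'")
    return cur.rowcount


if __name__ == "__main__":
    conn = make_db()
    # Made-up sample rows, purely for illustration.
    conn.executemany(
        "INSERT INTO result (wu_name, state) VALUES (?, ?)",
        [("wu_a", "done"), ("wu_b", "in_progress"), ("wu_c", "done")],
    )
    conn.commit()
    print("rows archived:", archive_completed(conn))
```

As kittyman noted, the catch is on the read side: any query that needs finished work would have to look in the archive table as well, so the sketch covers only the weekly move itself.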
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.