Message boards : Technical News : Status / Credit (Nov 03 2010)
Matt Lebofsky (Joined: 1 Mar 99, Posts: 1444, Credit: 957,058, RAC: 0)
Maybe there's some confusion about this credit granting process. By cleanup I wasn't referring to any corruption. In fact, the data in the databases is as clean as ever; it's just that both the mysql (boinc) and informix (science) databases are too darn big. This mega-outage is to get new servers online for both that can handle the job. Before we can do that, we'd like to reduce the size of the databases to make the transitions easier. Of course there's some gunk in the databases pertaining to things like ghost WUs, orphaned results or signals, etc. Nothing new - but when the databases are quiet and vastly reduced in size, it's a golden opportunity to analyse and understand these various long-standing issues. That's the cleanup I'm talking about. Meanwhile, we've been dealing with all kinds of long-planned and unplanned outages, causing users much grief with timeouts. This credit granting procedure ensures that people get the credit they deserve for work done, without having to wait for wingmen who never show up. And we're not throwing out any science. By the way, the download servers are starting up, so we're starting to clear out *that* queue.

- Matt

-- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
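(For the curious: BOINC keeps per-result claimed and granted credit in its `result` table. Below is a hypothetical sketch of the kind of one-off grant Matt describes - crediting completed results whose wingmen never showed up. The field names are from the standard BOINC schema, but the connection details and the exact WHERE clause are assumptions, not the project's actual procedure.)

```python
# Hypothetical sketch only - not SETI@home's actual cleanup script.
# Grants each successfully returned but never-credited result the
# credit it claimed, so volunteers aren't penalised for missing wingmen.
import MySQLdb  # provided by the mysqlclient package

conn = MySQLdb.connect(host="localhost", user="boincadm",
                       passwd="secret", db="boinc_db")  # placeholders
cur = conn.cursor()
cur.execute("""
    UPDATE result
       SET granted_credit = claimed_credit
     WHERE server_state = 5       -- OVER: result is finished
       AND outcome = 1            -- SUCCESS: returned without error
       AND granted_credit = 0     -- not already credited
""")
conn.commit()
print("credited %d results" % cur.rowcount)
```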
soft^spirit (Joined: 18 May 99, Posts: 6497, Credit: 34,134,168, RAC: 0)
> Maybe there's some confusion about this credit granting process.

YAY!! Tight limits should give fast turnaround.. and AWESOME!!

Janice
Swibby Bear (Joined: 1 Aug 01, Posts: 246, Credit: 7,945,093, RAC: 0)
Matt -- If only one download server were running, wouldn't that make more efficient use of the internet connection and reduce or eliminate the dropped connections? Overall throughput might actually improve. And it might eliminate the creation of ghosts.

Whit
Richard Haselgrove (Joined: 4 Jul 99, Posts: 14674, Credit: 200,643,578, RAC: 874)
I've got my first allocations since the scheduler was switched back on, and I see:

1) The tasks I'm getting are all _0 and _1 - i.e. new work; we're not down to the resends yet.
2) A very high proportion of them are 'shorties'.

That's hardly the most restful convalescence for poor Jocelyn, Anakin, and Vader - but I suppose it just represents the tapes the splitters were working on just before the last shutdown. There's probably AP work in the mix as well. Hardly surprising that the download link went straight to 93 Mbit/s and has stayed there ever since.

To avoid creating yet more ghosts, might it be a good idea to periodically pause the 'new work allocation' side of the scheduler - now you have the tool to do that - while still allowing reporting? That would allow the download rate for allocated work to decay for a while, thus both allowing uploads/reporting to flow freely and minimising the risk of work being trapped in limbo should, heaven forfend, the sudden rush of work trigger another nervous breakdown server-side.
zoom3+1=4 (Joined: 30 Nov 03, Posts: 66218, Credit: 55,293,173, RAC: 49)
> PS I do expect pics of the new servers though :)

Ditto, Asahp please.

Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST
Ingrid Brouwer (Joined: 8 Jan 00, Posts: 6, Credit: 1,562,268, RAC: 0)
> <<<SNIP>>>
> PS I do expect pics of the new servers though :)

I agree, LOL. Would love to see them, in action preferably ;-D

Life is not measured by the number of breaths we take, but by the moments that take our breath away.
Mike.Gibson (Joined: 13 Oct 07, Posts: 34, Credit: 198,038, RAC: 0)
I think the priority of most crunchers is not the receipt of credits. It is to prevent the waste of crunching time that results from WUs being resent when the primary crunching has been held up by an outage. There should be some sort of moratorium, say for 24 hours, after an outage to give enough time for completed work to be reported. During that time only new work should be sent out, or resends whose deadlines had already expired before the start of the latest outage (one way to express that rule is sketched below).

Mike
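(A minimal sketch of Mike's proposed rule - not anything the project has implemented; the 24-hour window and all the timestamps are illustrative assumptions:)

```python
# Sketch of the proposed post-outage moratorium on resends.
# During the window, a task may be dispatched only if it is brand-new
# work, or a resend whose deadline had already passed before the
# outage began (the original host was late regardless of the outage).
MORATORIUM_SECS = 24 * 3600  # proposed 24-hour window

def may_send(now, outage_start, outage_end, is_resend, orig_deadline):
    """Return True if this task may be dispatched at Unix time `now`."""
    in_moratorium = now < outage_end + MORATORIUM_SECS
    if not in_moratorium:
        return True           # normal scheduling has resumed
    if not is_resend:
        return True           # new work is always allowed
    # Resends go out only if the deadline expired before the outage.
    return orig_deadline < outage_start

# Example: one hour after a six-hour outage, a resend whose deadline
# expired mid-outage is held back, while new work flows.
t0, t1 = 1000000, 1021600                          # outage start/end
print(may_send(t1 + 3600, t0, t1, True, t0 + 100))  # False (held back)
print(may_send(t1 + 3600, t0, t1, False, 0))        # True  (new work)
```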
RottenMutt (Joined: 15 Mar 01, Posts: 1011, Credit: 230,314,058, RAC: 0)
Matt -- from what I've observed, downloads seem to work better when both are up.
RottenMutt (Joined: 15 Mar 01, Posts: 1011, Credit: 230,314,058, RAC: 0)
Yes, much better with two. Matt has been enabling and disabling one of the download servers, if ya haven't noticed.
RottenMutt (Joined: 15 Mar 01, Posts: 1011, Credit: 230,314,058, RAC: 0)
Downloads work better right after the second server is enabled, then slow to a crawl.
Grant (SSSF) (Joined: 19 Aug 99, Posts: 13835, Credit: 208,696,464, RAC: 304)
> I've got my first allocations since the scheduler was switched back on, and I see:

My first batch of new work was all initial-issue VLARs; after a couple of them had been done I got one new shortie. Since then all of my new work has been VLAR re-issues, not a shortie in sight.

Grant
Darwin NT
Raistmer (Joined: 16 Jun 01, Posts: 6325, Credit: 106,370,077, RAC: 121)
> Does it make people feel good to get credit for work which probably now has no chance of contributing to the science? Has the project underestimated the quality of the participants?

+1. With arbitrary credit granting (and granting credit for nothing is arbitrary), credits lose the last of their (already very small) meaning. Better to leave credits in the automatic state and work on database cleanup; I'm sure there is enough work to do :P
Richard Haselgrove (Joined: 4 Jul 99, Posts: 14674, Credit: 200,643,578, RAC: 874)
> I've got my first allocations since the scheduler was switched back on, and I see:

Yes, the newly-split work didn't last long, and it's been nothing but resent ghosts or other fallout ever since. Which is exactly what we needed to happen. I haven't seen any VLARs here, but I wouldn't expect to - I've selected not to request CPU work, so the CPUs work on other projects while I run CUDA.

Observation: the backlog of results 'ready to send' ran out, for both MB and AP, around 10:00 UTC, or a few minutes before. There's still a trickle, but effectively no new downloads will have been added to the download server queue since then. The download pipe remained saturated at 93 Mbit/s until about 10:50 UTC, the best part of an hour later.

That gives us some idea of the relative speeds of the various project components. I think we need to use the remaining time before the new servers arrive to work out some way of helping the scheduler request/reply messages get through (and hence avoid creating a new set of ghosts), while Oscar and friend keep the pipe full.
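(A rough back-of-the-envelope reading of that observation - the 50-minute figure and the saturated 93 Mbit/s rate come from the post; treating the pipe as fully saturated for the whole interval is an assumption:)

```python
# Estimate how much data was still queued on the download servers
# when the 'ready to send' backlog ran dry, from Richard's timing:
# the pipe stayed saturated at ~93 Mbit/s from ~10:00 to ~10:50 UTC.
link_bps = 93e6                  # observed saturated link rate
drain_secs = 50 * 60             # ~50 minutes of drain time

queued_bytes = link_bps * drain_secs / 8
print("~%.1f GB still queued when the feed stopped"
      % (queued_bytes / 1e9))    # ~34.9 GB
```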
Fred J. Verster (Joined: 21 Apr 04, Posts: 3252, Credit: 31,903,643, RAC: 0)
Over the past hours, starting probably late last night local time (UTC+1), all SETI Beta WUs were uploaded and new ones were downloaded. All 3 quads + CUDA are running SETI MB WUs. I can't check; the result page option is turned off, diminishing server load. I haven't changed any settings in BOINC for a few months; most of the time it wasn't possible at all..... :)

But I do hope WUs are validated and granted credit once a canonical result is reached. (I think giving credit for work not done isn't serving science, so why do it?) And I agree with Raistmer as to the 'value' of credits; the differences between projects are out of balance, I think.

With both download servers turned on (as of 5 Nov 2010 14:30:05 UTC), all downloads go through almost immediately; quite a difference from the last days or weeks.
platium (Joined: 5 Jul 10, Posts: 212, Credit: 262,426, RAC: 0)
I thought we were here to donate processor time. Credits are interesting, but not what we are here for. Or am I alone in thinking the SETI programme is what counts?
Fred J. Verster (Joined: 21 Apr 04, Posts: 3252, Credit: 31,903,643, RAC: 0)
IMO you're right, and no, you're not alone - although exceptions prove the rule, or so 'they say'. You could ask: why not just count the tasks done? That would not have been fair. Unlike most projects, SETI@home Multibeam and Astropulse get their data by piggybacking on the Arecibo Observatory telescope.

Credits are a way to establish the amount of work done by your hosts. Instead of using raw FLOPS* or FLOPs per WU, Jeff Cobb made a formula for determining the work done: not only FLOPs but also memory bus, hard disk, and network are taken into account when measuring the work done for a WU. Besides giving an incomplete picture, counting only FLOPs would be hardly manageable, let alone handy; the numbers would easily have 18, 21, 23 or even more than 25 digits. (The account page shows personal and all-project results.)

Now that new WUs, MB as well as AP, have been given out, is this the last task flow before the new servers are up and running?

(*Whetstone is a more reliable way of measuring floating-point operations per second.)
Josef W. Segur (Joined: 30 Oct 99, Posts: 4504, Credit: 1,414,761, RAC: 0)
... The 1,148,513 MB and 26,888 AP in "ready to send" at the start, plus about 12,000 MB and 5,000 AP reissues created during the saturation period, would have taken nearly 19 hours to download at an actual throughput of 90 Mbps. I suspect ghost creation of 100,000 or more (a rough check of the arithmetic follows below).

With the queues drained as much as possible by other means, I hope the staff will try <resend_lost_results>1</resend_lost_results> in the project config.xml, at least for short periods. Bursts of activity caused by that should be tolerable, and it would go a long way toward having the server's "in progress" count actually match what is on participants' hosts. This being Friday, I expect whoever is in the lab has other plans, but maybe next week?

Joe
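(A quick sanity check of Joe's figure - the per-task download sizes are assumed typical values of the era, roughly 366 KB for an MB task and 8 MB for an AP task, not numbers given in the thread:)

```python
# Rough check of the 'nearly 19 hours at 90 Mbps' estimate.
# Task sizes below are assumptions, not figures from the thread.
MB_TASK_BYTES = 366 * 1024       # assumed MultiBeam task size
AP_TASK_BYTES = 8 * 1024 ** 2    # assumed Astropulse task size

mb_tasks = 1148513 + 12000       # backlog + reissues (from the post)
ap_tasks = 26888 + 5000

total_bits = (mb_tasks * MB_TASK_BYTES + ap_tasks * AP_TASK_BYTES) * 8
hours = total_bits / 90e6 / 3600
print("%.1f hours at 90 Mbps" % hours)   # ~17.3 h with these sizes
```

With these assumed sizes the total comes out around 17 hours; per-request protocol overhead and retries would push it toward Joe's 19.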
-BeNt- (Joined: 17 Oct 99, Posts: 1234, Credit: 10,116,112, RAC: 0)
> it's just that both the mysql (boinc) and informix (science) databases are too darn big.

http://publib.boulder.ibm.com/infocenter/idshelp/v115/index.jsp?topic=/com.ibm.adref.doc/ids_adr_0719.htm

Judging by what IBM says on its site about Informix databases, SETI was hitting hardware limitations. IBM says the maximum number of databases an instance can hold is 21 million, with 477,102,080 tables in a dynamic server, 32K threads, and a maximum database size of 4 TB. (I wouldn't imagine SETI has anything larger than that going on, but you never know!) So I suppose it would be safe to say it was a hardware limitation they were fighting. These new machines hopefully will be more than enough to handle things for a while; I don't see them ordering machines that wouldn't be.

Traveling through space at ~67,000mph!
Richard Haselgrove (Joined: 4 Jul 99, Posts: 14674, Credit: 200,643,578, RAC: 874)
... I was surprised that the query loading on Jocelyn showed no sign of abating when the backlog of work available for allocation ran dry - with downloads a mere 20% of what they were this morning, what are all those other queries?

(static copy taken from http://bluenorthernsoftware.com/scarecrow/sahstats/db.php?t=48)
Grant (SSSF) (Joined: 19 Aug 99, Posts: 13835, Credit: 208,696,464, RAC: 304)
> I was surprised that the query loading on Jocelyn showed no sign of abating when the backlog of work available for allocation ran dry - with downloads a mere 20% of what they were this morning, what are all those other queries?

Validations, assimilations, deletions?

Grant
Darwin NT