Posts by Bart Barenbrug

1) Message boards : Number crunching : RAC falling even though granted credit is 60+ (Message 424373)
Posted: 21 Sep 2006 by Bart Barenbrug
Post:
My rac dipped slightly too (and is back up now), though I did have my computer do some other work too. It was never out of work during the outages, but since at that time the validators were probably off too, the timing of credits coming in was probably different from usual, causing the rac fluctuations.
2) Message boards : Number crunching : What's your lowest DCF (Duration Correction Factor)? (Message 409394)
Posted: 28 Aug 2006 by Bart Barenbrug
Post:
It really depends on the WUs you get. The RDCF just goes up and down. Up and down...

Here's my RDCF for (almost) the last month:

3) Message boards : Number crunching : Are you ready for the next generation CPU? (Message 398766)
Posted: 15 Aug 2006 by Bart Barenbrug
Post:
Indeed. Parallel is the way to go (us boinc users should know a thing or two about that), and working towards using this kind of parallelism effectively is a great step forward. One day, when we're all using dual-processor machines, with each processor being quad-core, and each of those cores hyperthreaded, we'll still be benefiting from this work (I just don't want to be the one to write the task balancing and task migration code for such a beast, with all the different penalties of migrating a task between hyperthreads on the same core, between cores on the same processor, or between processors etc. *g*).
4) Message boards : Number crunching : What's your lowest DCF (Duration Correction Factor)? (Message 389894)
Posted: 6 Aug 2006 by Bart Barenbrug
Post:
So previously I guessed that my dcf could go as low as .40 looking at the graph I posted. I thought I'd test that. Given the subsequent DCF values and the knowledge that 10% of the difference is taken (as so eloquently explained by WinterKnight earlier in this topic), it is of course easy to compute the real ratio for a certain WU by amplifying that difference by a factor of 10 (only in case the DCF dropped of course: if it went up, just take the value itself, since then it's a short WU we're dealing with). This of course only works if exactly one WU completed between the times I measure the DCF (every 2 hours, and thereby not synced to WU completion), so it can also be the case that no WU completed (in that case the DCF remains the same, so there's no information, and in the graph below I basically just repeat the previous value), or that more WUs completed (in which case amplifying the difference gives way too low a ratio). Computing that for the dcfs I measured over the last week gives the following graph:



Apart from a few high values (corresponding to short WUs) and a few very low values (corresponding to periods between measurements where more than 1 WU finished; I'm running 2 WUs simultaneously on an HT CPU after all), the rest of the values show a lot less variance, and thereby give a prediction of what the dcf would be after stabilising in the absence of short WUs. The median of the values in the graph (which filters out the outliers discussed earlier) is .424885 so it turns out I was too optimistic with my estimation of .40 earlier.
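
A minimal sketch of the computation described in this post, assuming the update rule as stated above (a drop applies 10% of the difference, a rise takes the new value directly). The function name `per_wu_ratios`, the `step` parameter, and the sample values are hypothetical and only for illustration; they are not the measured data.

```python
from statistics import median


def per_wu_ratios(dcf_samples, step=0.1):
    """Estimate the per-WU ratio behind each change in a series of DCF samples.

    Assumes at most one WU completed between two consecutive samples.
    """
    ratios = []
    previous_ratio = None
    for before, after in zip(dcf_samples, dcf_samples[1:]):
        if after < before:
            # DCF dropped: only `step` of the difference was applied,
            # so amplify the difference by 1/step (a factor of 10).
            ratio = before + (after - before) / step
        elif after > before:
            # DCF rose: take the value itself (a short WU came by).
            ratio = after
        else:
            # No change, so no information: repeat the previous estimate.
            ratio = previous_ratio
        if ratio is not None:
            ratios.append(ratio)
        previous_ratio = ratio
    return ratios


# Illustrative (made-up) samples taken at fixed intervals.
samples = [0.50, 0.49, 0.49, 0.80, 0.76]
estimates = per_wu_ratios(samples)
typical = median(estimates)  # the median damps the short-WU outliers
```

If more than one WU finished between two samples, the amplified value comes out too low, which is why the median is used rather than the mean.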
5) Message boards : Number crunching : What's your lowest DCF (Duration Correction Factor)? (Message 389320)
Posted: 5 Aug 2006 by Bart Barenbrug
Post:
For those interested, here's how my dcf varied over the last week. You can clearly see it rise suddenly as a short WU came by, and then decay exponentially afterwards as regular WUs are being processed which have a better benchmark-to-compute-time ratio. During this week (all in the beginning actually), my highest dcf was .834425 and the lowest .454350 (quite a large range), though if I were not to get any short WUs anymore, it looks like it might drop to about .40.
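
The jump-then-decay shape can be reproduced with a simplified model. This is a sketch under the assumption discussed in this topic (the DCF jumps straight up to a higher ratio, but moves only 10% of the way towards a lower one); it is not the actual BOINC client code, and `simulate_dcf` and its inputs are hypothetical.

```python
def simulate_dcf(start, wu_ratios, step=0.1):
    """Return the DCF after each completed WU under the simplified rule."""
    dcf = start
    history = [dcf]
    for ratio in wu_ratios:
        if ratio > dcf:
            # A short WU with a high ratio: the DCF jumps up at once.
            dcf = ratio
        else:
            # A regular WU: move only `step` of the way down.
            dcf += step * (ratio - dcf)
        history.append(dcf)
    return history


# One short WU (ratio 0.8) followed by regular WUs (ratio 0.4).
history = simulate_dcf(0.5, [0.8] + [0.4] * 20)
```

With a constant ratio for regular WUs, the distance to that ratio shrinks by a factor of 0.9 per WU, which is the exponential decay seen in the graph.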

6) Message boards : Number crunching : Please post ~6 month old WUs, here - Revisited. (Message 388308)
Posted: 4 Aug 2006 by Bart Barenbrug
Post:
Doesn't look like anyone who can do anything about this is reading this topic. And even then: we know that these old WUs are left-overs from the problems we had around August last year (and maybe this February), so the easiest way to clean them up would be to do a database query to look for the results from that period, and then delete them. So reporting specific WUs here is probably not going to lead to the specific deletion of those WUs. Still: the posts here do indicate how many are still left in limbo, and if enough of them are reported, maybe that is enough of a reason for someone at Seti to spend some time on this. Otherwise this is likely a low priority for the people at Seti (and rightly so, imho).
7) Message boards : Number crunching : validate eror after outage (Message 388236)
Posted: 4 Aug 2006 by Bart Barenbrug
Post:
The online database gets purged on a regular basis, to keep it from getting out of hand. There are old results hanging on because of a glitch at that time period. Last August there was a glitch, also. One of the glitches that cause this is that the validators got behind, and caused too many files in directories (this has been fixed by making deeper directory structure), and it slowed the systems down, then files became orphaned because they could not be easily found when the deleters went through to delete, then some manual deleting went on, so the purger could not purge the information because all the pieces were not in place.

Thanks for the explanation. But these results are in the database, so it should be easy to query the database for all results within the time period of the glitches and remove them from the database, even if the corresponding files can't be found anymore? Or is there a worry that the files are still around somewhere, and eliminating the references in the database would remove the last hope of finding them?

Regarding the original topic: I lost four WUs to this problem, but they are now old enough that they disappear from my results page. So no chance of re-validation. Oh well: it's only some computation time and some credits that got lost: the science will get re-done, and that's the important thing.
8) Message boards : Number crunching : Seti getting stuck at "downloading"... (Message 386308)
Posted: 2 Aug 2006 by Bart Barenbrug
Post:
And the ones I uploaded afterwards seem to validate well so far.
9) Message boards : Number crunching : Seti getting stuck at "downloading"... (Message 386188)
Posted: 2 Aug 2006 by Bart Barenbrug
Post:
There is a problem at berkeley. Everyone is getting error 403 on uploads since yesterday afternoon.
Which is probably a good thing since many WUs that were uploaded earlier failed to validate (as reported for example here). So I wouldn't be surprised if the uploading failures were on purpose to prevent more validation problems while they work out the problems with, and test, the validators (and to prevent the servers from filling up with results quickly, which would happen if they only turned the validators off). All of this is pure speculation on my part of course.
10) Message boards : Number crunching : What's your lowest DCF (Duration Correction Factor)? (Message 383505)
Posted: 31 Jul 2006 by Bart Barenbrug
Post:
I suspect yours are AMD's as you a seeing a big jump for the high AR (short) units.
It's an Intel (a Northwood to be exact). Which is why I was so surprised to see it jump that high. Of course I happened to get three of those very short units right after each other, which can't have helped.

I guess the moral of the story is that if you post your DCF here to find the minimum DCF, it might pay to keep an eye on the value for a while, as it might vary considerably (so you can pick your lowest to look good...).

I'm wondering what it is we're benchmarking anyway: the speedup a seti cruncher can get with respect to a generic benchmark, or the benchmark itself behaving differently on different types of cpus... ;)
11) Message boards : Number crunching : What's your lowest DCF (Duration Correction Factor)? (Message 383398)
Posted: 31 Jul 2006 by Bart Barenbrug
Post:
Trying to get a feel for how much and how the DCF varies, I started logging my dcf every two hours last Friday, using a little script that greps the dcf from the client_state file to a csv formatted file that's easily included in a spreadsheet program. I quickly noticed the exponential decay, and then the sudden jumps up as a short WU came up. Yesterday and this morning my computer had some other things to do, so it wasn't available for seti for a while. Around this time, I also happened to notice quite a big jump up in the dcf: whereas I'm normally around or below .5 it was now briefly over .8. Here's how my dcf has progressed over the last 65 hours or so (and I'll keep monitoring):



That sudden jump up after not crunching for a bit is probably a coincidence, as boinclogx history also shows three WUs finishing early this morning which ran for only a little over a minute each (this being one of them, which seems to have finished successfully, yet with some overflow errors). Or does the dcf somehow relate to the total elapsed (as opposed to CPU) time taken for WUs (which makes sense if you're basing cache sizes and work overcommitment etc. on it)?
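
The logging script itself is not shown in the post. A rough sketch of what such a logger could look like follows, assuming the state file is `client_state.xml` and stores the value in a `<duration_correction_factor>` element; both names are assumptions to check against your own BOINC installation, and `read_dcfs` and `append_sample` are hypothetical helpers.

```python
import csv
import re
import time
from pathlib import Path

# Assumed element name; verify it against your own client state file.
DCF_PATTERN = re.compile(
    r"<duration_correction_factor>\s*([0-9.eE+-]+)\s*</duration_correction_factor>"
)


def read_dcfs(state_text):
    """Return every DCF value found in the client state text, in file order."""
    return [float(value) for value in DCF_PATTERN.findall(state_text)]


def append_sample(state_path, csv_path, now=None):
    """Append one timestamped row of DCF values to a CSV file."""
    values = read_dcfs(Path(state_path).read_text(errors="replace"))
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(csv_path, "a", newline="") as handle:
        csv.writer(handle).writerow([stamp, *values])
    return values
```

Running `append_sample` every two hours is left to the operating system's scheduler (cron, Task Scheduler, or similar); the resulting CSV opens directly in a spreadsheet program.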
12) Message boards : Number crunching : What's your lowest DCF (Duration Correction Factor)? (Message 380948)
Posted: 28 Jul 2006 by Bart Barenbrug
Post:
Update on my Northwood: DCF is currently at 0.477 but sort of varies around .5
13) Message boards : Number crunching : What's your lowest DCF (Duration Correction Factor)? (Message 377508)
Posted: 25 Jul 2006 by Bart Barenbrug
Post:
Now I want a DCF graph in the statistics tab of boinc, so I can keep an eye on it... ;)
14) Message boards : Number crunching : What's your lowest DCF (Duration Correction Factor)? (Message 370321)
Posted: 17 Jul 2006 by Bart Barenbrug
Post:
I've only got one, for a 2.6GHz P4 HT PC running your P4 SSE2 v1.3 client. The DCF is not really stable (yet). I just did a manual update to see how much it would change, and it changed from 0.485556 to 0.56794 (just from one update).

I just had a few very short WUs (done in an hour and a half). And weirdly enough, for example this one seems to have claimed a different amount of credit than another cruncher (maybe due to a different boinc core client 4.45 vs 5.49: does the 4.45 still claim credit according to computation time and benchmark results?). I've just installed BoincLogX, so I should be able to keep better track in the future.
15) Message boards : Number crunching : Word Wrapping in the Forum (Message 368439)
Posted: 15 Jul 2006 by Bart Barenbrug
Post:
My apologies for the long line in the other topic. I don't post enough here to realise that the "code" formatting would cause this problem (and now it's too late to edit). I was just looking for a way to "quote" the output of a program, and the "code" formatting seemed close enough. Anyone with enough privileges to edit it (moderators only, probably): feel free to wrap some lines there. I'll try to be more careful in the future...

Edit: I just tried to report it as an offending post (asking for some line breaks) by clicking the red cross below my wide post. Unfortunately I got an "unable to handle request" error when trying to send the report, further stating that "User with id FORUM_MODERATION_EMAIL_USER_ID created but nothing returned from DB layer". So there seems to be one more thing not as it should be with the forum's software configuration...
16) Message boards : Number crunching : New cpu test/benchmark app........... (Message 367212)
Posted: 14 Jul 2006 by Bart Barenbrug
Post:
Ah yes. Stupid me. I should learn how to read... Thanks! It looks like I only had some left-overs from the .Net framework on my computer (I'm sure I installed it when fiddling around with boinc-spy quite a while back). I just (re)installed .Net framework 2.0 and now it runs like a charm and tells me I chose correctly when I went Chicken last week. ;)

Minor cosmetic detail: the text in all caps at the top right urging me to quit boinc before running the tests gets truncated after "STARTING THE" on my screen (if it would help I can post a screenshot if that's not the case on your system).

Also: the stderr.txt file simply gets appended to. It's nice to see your output there already to show what the outcome should be like, but maybe it's nice to add a separator between runs of the different clients to indicate that a new set of tests starts here, by also letting the benchmark app write a few lines to stderr.txt before starting the tests (this extra line can then also include when the tests were run etc. or whatever other information might be useful). I don't know how many people actually look at that file, so it might not be worth the effort, but I thought I'd post the idea anyhow.
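
To make the separator idea concrete, here is a small sketch in Python (the benchmark app itself is a .NET program, so this only illustrates the idea). The helper names and the separator layout are hypothetical.

```python
import time


def run_separator(client_name, now=None):
    """Build a one-line separator naming the client and the start time."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return f"===== benchmark run: {client_name} started {stamp} ====="


def append_separator(path, client_name, now=None):
    """Append the separator to the log before a new set of tests starts."""
    with open(path, "a") as handle:
        handle.write("\n" + run_separator(client_name, now) + "\n")
```

Calling `append_separator("stderr.txt", "P4 SSE2")` before each client's tests would mark where one run ends and the next begins.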

I love the idea you describe in the other topic, Simon, to extend the app in the future to include configurable benchmarking and automatic result sharing. It'll be a sort of boinc manager of your own, starting up seti clients and reporting the results online, but then collecting data on how seti/boinc is doing. So more like a meta-boinc manager (or meta-seti). Let me reiterate my appreciation for all you're doing. Thanks!
17) Message boards : Number crunching : New cpu test/benchmark app........... (Message 367169)
Posted: 14 Jul 2006 by Bart Barenbrug
Post:
The new benchmarking app doesn't run here (but the client does, and that's the important thing). When starting it, it comes with an error message "The application failed to initialize properly (0xc0000135). Click on OK to terminate the application". It looks like the only dll that it depends on is "mscoree.dll" which on my system is present and located in c:\windows\system32\URTTemp (seems a bit of a suspect location: could that be a problem, though the error message doesn't seem to point to a missing dll?). Is this .net related (I think I installed that at some point, but is there a sure way to check)?

Also, is this a debug build? There seems to be some debug data in it:
Debug Formats in File:
  Type            Size       Address    FilePtr    Charactr   TimeData   Version
  CODEVIEW        0x000000B4 0x0000C01C 0x0000B01C 0x00000000 0x44B5E65B 0.00
    (RSDS, C:\Dokumente und Einstellungen\Administrator\Eigene Dateien\Visual Studio Projects\KWSN - CPU Detector\obj\Debug\KWSN - CPU Test & Benchmark Tool BETA1.pdb)
(at least the full path in there is useless on user's pcs)
18) Message boards : Number crunching : Completed Workunit gets 0 credit??? (Message 362563)
Posted: 10 Jul 2006 by Bart Barenbrug
Post:
From the result page it looks like the others all handed in their work unit about five days earlier, so then it was already validated and not kept long enough for your "late" result to be processed. I would hope that positive validation would result in credit if you're later than the rest, as long as you don't get past the deadline. But it looks like the unit you're referring to also passed its deadline (only by some 11 hours, but still), which is why it was already purged from the validator database...
19) Message boards : Number crunching : Revisit the ~6 month + old work units... (Message 362487)
Posted: 10 Jul 2006 by Bart Barenbrug
Post:
Here's one that's almost a year old now...
20) Message boards : Number crunching : KWSN Windows optimized science apps - Share your results and problems! (Message 362285)
Posted: 10 Jul 2006 by Bart Barenbrug
Post:
As per Simon's request at the start of this topic: here are the results for my computer. All results reported after July 9th noon-ish have been computed with the new SSE2 client. No failures yet (and given Simon's thoroughness, I don't expect any).

Herzlichen Dank, Simon!

Btw: the last result shown in that list is from 2005: I wonder why that's still in the database...


©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.