Posts by Principia

1) Message boards : Number crunching : why the deadlines are so long (Message 671567)
Posted 3 Nov 2007 by Principia
Post:
A few thoughts on SAH's long-ish timeouts:

SAH and BOINC in general place a lot of faith in the users. I haven't seen any project principles that say it explicitly, but there is a definite philosophy behind BOINC that says the users should be treated as highly trusted. It reminds me of the spirit of the internet in the old days, when viruses and malware were unknown.

Think of it: a group (or an individual with a botnet) with naughty intentions could play havoc with a project by downloading large caches and then doing nothing with them. There are enough anarchists and 14-year-olds out there that such a thing isn't unimaginable.

Fortunately this doesn't seem to have happened, to my knowledge. We are, however, in a time where netiquette has become quaint. Large caches left to languish would cause me embarrassment but I know that sort of social thinking isn't the majority view any more.

Another point is that the long outage SAH suffered last summer probably motivated a lot of people to increase their cache size so they won't experience a credit slowdown if it happens again. At the top of the leader board things are pretty competitive. Credit competition is probably overall good since it drives people to dedicate high-performance hardware (I'm looking at you, NEZ) to the task but the cache-size issue is maybe the flip side of that.

I can't speak for these gentlemen http://setiathome.berkeley.edu/sah_about.php but I have been an admin before and here's what they have to deal with: over a million users for whom any significant change to policy might cause unhappiness (and the dreaded feedback), a budget you can count on your fingers and toes (I have a niece studying astrophysics at UCSD and I point to SAH and ask her, are you sure you want to do this?), "colourful" hardware, and worst of all, Jodie Foster. So be kind.

SAH and BOINC are one of the few things you can do on the internet where you are assumed competent until proven otherwise instead of the other way around. It's refreshing. Part of the price for that is work units like this one: http://setiathome.berkeley.edu/workunit.php?wuid=149772744

2) Message boards : Number crunching : Barcelona appears on SETI (Message 659943)
Posted 14 Oct 2007 by Principia
Post:
...Let's take the example of Seti, you very often do Int to float and float to int conversion. In the case of K10, you can see a serious slow down when doing this, due to a synch required between the 2 queues. It is funny how it is spun as a positive...


Ok I'm not an opcode expert, but I notice this kind of thing:

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/40546.pdf

Page 165:

Floating-point-to-integer conversion in C and C++ requires the use of truncation. Use one of the instructions from CVTTSS2SI, CVTTSD2SI to convert a floating-point number to integer when truncation is required.

and page 166:

On AMD Family 10h processors, some SIMD conversion instructions are VectorPath and/or add a false dependency on previous instructions that change the same destination register. In the cases for which there are alternatives in Tables 9 and 10, these instruction sequences use DirectPath instructions and provide better performance. (All recommendations apply to both 32-bit and 64-bit software, unless stated otherwise.)
Several instructions may be required to perform some conversions from unsigned integer to floating-point, due to the lack of a suitable conversion instruction, therefore signed integers should be favored when converting to floating-point.
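
To make the truncation point in the quoted guide concrete, here is a small sketch in Python (used only for illustration; the guide itself is about C/C++ and SSE instructions). A C cast such as (int)x discards the fractional part, which is the behaviour math.trunc shows here; it differs from both flooring and round-to-nearest for negative or half-way values. The function name and the sample values are arbitrary choices for this sketch.

```python
import math

def conversions(x):
    """Return (truncated, floored, nearest) integer conversions of x."""
    truncated = math.trunc(x)   # drop the fraction (round toward zero), like a C cast
    floored = math.floor(x)     # round toward negative infinity
    nearest = round(x)          # round half to even (Python's default)
    return truncated, floored, nearest

# Arbitrary sample values, chosen to show where the three modes disagree.
for x in (2.7, -2.7, 2.5, -0.5):
    t, f, n = conversions(x)
    print(f"{x:5.1f}  trunc={t:2d}  floor={f:2d}  round={n:2d}")
```

The modes only disagree on negative or half-way inputs, which is why code that needs C semantics has to ask for truncation explicitly rather than rely on the current rounding mode.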


Could it be that the code you are talking about used less-than-optimal k10 instructions? I didn't understand the point about synching the int and fp queues; the conversions go to a simd register, no? How does that involve communication between the queues? Honest question, I really don't know about such things.

3) Message boards : Number crunching : Barcelona appears on SETI (Message 659871)
Posted 14 Oct 2007 by Principia
Post:
...It's already a big issue for AMD, and they're trying to make the best of it,
without spreading the news further that optimised C2s are more than twice as
quick as Barcelonas for FPU... The international press could easily pick up
on this and make it a worldwide issue!

AMD can still do well in data centres where throughput is much more important.

What is important is that SETI can only benefit from the competition and
optimisation.


If you read this

http://www.xbitlabs.com/articles/cpu/display/amd-k10_7.html

"As we see, the FPU of K10 processor became much more flexible. It acquired some unique features that Intel processors don’t have yet, namely, efficient unaligned loading, including Load-Execute instructions, and two 128-bit reads per clock cycle. Unlike Core 2, floating-point and integer schedulers use separate queues. Separate queues eliminate operations conflicts caused by use of the same execution ports. However, K10 still shares the FSTORE unit for SSE save operations with some data transformation instructions, which may sometimes affect their processing speed."

Architecturally there should be no reason why the k10 shouldn't be as fast as or faster than the C2. I wouldn't expect a "worldwide issue" to result from the comparison of optimized C2 code to unoptimized k10 code.

Also, as a general statement, the k10 seems to be well received in datacenters:

http://www.crn.com/white-box/202402138

For example

"There are no hardware conflicts and the power draw is as promised. They delivered on their technicals. On these high-performance compute and memory-intensive applications, they're kicking Intel's butt," said Brian Corn, VP of marketing and business development at Waltham, Mass.-based Source Code.

The biggest bummer was that they were six months late with the part. But I think the chip itself is performing ok so far.

4) Message boards : Number crunching : I still have units pending from Sept. (Message 659181)
Posted 13 Oct 2007 by Principia
Post:
My recent Einstein@Home workunits have a 2-week timeout compared to 2 months for SETI@home. I find this strange, given that the SAH wu's are smaller.

It's a philosophical decision on the part of the SAH admins; it seems they want to accommodate much slower clients than EAH does. However, I'd guess that going 4 times further down the tail of the runtime distribution of the machine population gets them a minuscule amount of additional processing.
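
The tail argument can be sketched numerically. The snippet below assumes, purely for illustration, that host turnaround times follow a log-normal distribution with a median of 3 days and a shape parameter of 1.0; neither number comes from SETI@home or Einstein@Home data, and the function name is made up for this sketch. Under that assumption it compares the fraction of hosts that report within a 2-week deadline against an 8-week one.

```python
import math

def lognormal_cdf(t, median, sigma):
    """Fraction of hosts with turnaround <= t under a log-normal model."""
    z = (math.log(t) - math.log(median)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical parameters, not measured from any project.
MEDIAN_DAYS = 3.0
SIGMA = 1.0

within_2_weeks = lognormal_cdf(14.0, MEDIAN_DAYS, SIGMA)
within_8_weeks = lognormal_cdf(56.0, MEDIAN_DAYS, SIGMA)
extra = within_8_weeks - within_2_weeks

print(f"within 2 weeks: {within_2_weeks:.1%}")
print(f"within 8 weeks: {within_8_weeks:.1%}")
print(f"gained by the longer deadline: {extra:.1%}")
```

Changing the assumed median or spread changes the answer considerably, so this only shows how one might check the guess against real turnaround data. It also counts hosts, not work done; the slow hosts in the tail return fewer results each.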

The SAH philosophy also allows clients to download large caches and process them offline, only contacting the server once in a blue moon. But honestly I haven't been able to think of the scenario where such long timeouts would be needed.

Anyway, my pending credit queue keeps trending upwards, even after I reduced SAH's resource share to 25%. It seems like something is haywire in SAH client land, maybe with time it will sort itself out. I think it had something to do with the switchover to the 5.27 client in August.

5) Message boards : Number crunching : Had an old P2-400 laying around... (Message 583831)
Posted 8 Jun 2007 by Principia
Post:
Here is my Aptiva K6-2 550 running Debian and boinc-app-seti 5.13.

There's an interesting story about this one. Debian Etch has prebuilt packages for boinc-client & boinc-manager 5.4.11 and boinc-app-seti 5.13 but when I tried them I only clocked 250 Mfips, about half what I got from the default install on Win 98. Interestingly, the turnaround times were about the same, 5-6 days per wu.

After a bit of digging I learned that fftw3 version 3.1.x, which is the Etch version, only supports 3DNow (K6-2's simd instructions) for K7 and up.

However, fftw3 3.0.x supports 3DNow on k6-2 & 3, so I got the source for that package from oldstable and rebuilt it with -m3dnow (and -ffast-math and -march=k6-2). I also got the boinc-app-seti source and rebuilt it with the same switches making sure any args to the configure script were set so the code would build correctly for the different configs.

I was hopeful about that, but it didn't work: the boinc-app-seti client wouldn't run. Almost at random, I tried rebuilding boinc-client and boinc-manager (from the boinc pkg) and luckily that was the answer. boinc-app-seti's setiathome_enhanced links with boinc-client (I think), and there was some kind of binary incompatibility with the pre-packaged version.

With that fixed, setiathome_enhanced (running as anonymous platform) clocked about 450 Mfips.

Another thing I came across in the fftw docs is that for x86 cpu's, fftw runs faster if it is statically linked. That took a bit of tweakage in the guts of boinc-app-seti's autotools build system, but I managed to get it right. With that, and building all three packages with -march=k6-2, -ffast-math, and -m3dnow, I'm clocking about 515 Mfips and the turnaround times are about 2.5 days.
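
As a quick check on the numbers above, the relative gains can be worked out directly. The figures are the ones quoted in this post; the variable and function names are just labels for this sketch.

```python
# Benchmark figures quoted in the post (units as given there).
stock_debian = 250.0      # prebuilt Etch packages
rebuilt = 450.0           # fftw3 3.0.x and the BOINC packages rebuilt with the 3DNow flags
static_linked = 515.0     # same, with fftw statically linked

def speedup(new, old):
    """Ratio of new to old; above 1.0 means an improvement."""
    return new / old

print(f"rebuild vs stock:        {speedup(rebuilt, stock_debian):.2f}x")
print(f"static link vs rebuild:  {speedup(static_linked, rebuilt):.2f}x")
print(f"overall:                 {speedup(static_linked, stock_debian):.2f}x")

# Turnaround: 5-6 days before, about 2.5 days after.
for before in (5.0, 6.0):
    print(f"turnaround {before:.0f} d -> 2.5 d: {speedup(before, 2.5):.1f}x shorter")
```

Most of the gain comes from the rebuild with 3DNow-enabled fftw; static linking adds a smaller increment on top.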

Also, since I worked from Debian pkg sources, everything is built into .deb packages that I can install and remove like any other pkg.

Oh, and my kernel is a rebuilt 2.6.18 with -march=k6-2 and 3dnow activated as well.

6) Message boards : Technical News : Can't talk.. Debugging.. (May 15 2007) (Message 568729)
Posted 16 May 2007 by Principia
Post:
Just before the outage I did some minor tweaks to my old K6's hardware to try to reduce the wu run times from "forever" to "an eternity". Never really got a chance to test it out.

But then, since that machine is no good for any of the other BOINC projects I know about (SAH seems to have the smallest wu's) I thought it might be a good time to upgrade from Win98 to Debian. That was surprisingly easy and Debian even has the BOINC packages, boy is the installer better than it used to be. The benchmarks came in about half what I was getting under W98. Then again, half of negligible is still, you know, who cares.

Anyway, there's a wu on it that's trying to download. One of these days. Meantime, Einstein gets the spare cycles from my other machines.

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.