64-bit App Build Windows XP x64

Bob Delkhoon
Joined: 15 May 99
Posts: 11
Credit: 201,827
RAC: 0
United States
Message 419535 - Posted: 12 Sep 2006, 22:26:49 UTC
Last modified: 12 Sep 2006, 22:30:45 UTC

I installed Windows x64 on my main machine over the weekend, so I wanted to see how a 64-bit client would run. I used the sources from the site of Simon Zadra (KWSN - Chicken of Angnor) as a base for the build. I let it run overnight on my x64 box, and all the results that have been verified are good. (The 3 errors on my results list are accidents from changing the app_info.xml.)

I don't see a significant change in performance yet on my machine, but I'm hoping for some feedback from other machines. I'm assuming there's more that can be changed to get more gains from the 64-bit build. I've only had x64 installed for a few days, so I'm still kind of inexperienced with the code optimizations that can be done.

If anyone has any tips on 64-bit optimizations, that would be awesome.

If you'd like to try it, please let me know how it works for you and what your CPU type is.

PLEASE keep an eye on your results; I have only tested this for about 24 hours on 1 box.

Windows XP x64 64-bit client
http://tiger.towson.edu/~bdelkh1/setiathome-5.15-DeNitro-emt64.zip

Thanks guys =)

-Bob Delkhoon (DeNitro)

ID: 419535
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 419540 - Posted: 12 Sep 2006, 22:33:22 UTC
Last modified: 12 Sep 2006, 22:40:16 UTC

Hi Bob,

I saw you online at my site recently :o)

Good job on the 64-bit app; I haven't gotten one to link correctly yet.

I'm very interested in your build settings, including which SDK you used and which versions of Visual Studio/ICC/IPP and such, as well as any necessary source edits (there were a couple, as I recall).

I'm running some tests with your app on 64-bit Windows (Pentium-D) as I type this message. Let's see what it can do!

Regards,
Simon.
Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 419540
EricVonDaniken
Joined: 17 Apr 04
Posts: 177
Credit: 67,881
RAC: 0
United States
Message 419541 - Posted: 12 Sep 2006, 22:35:55 UTC

Going from IA32 to x86-64 gives you a few optimization opportunities:
1= The number of registers of each type doubles from 8 to 16
2= Each register is also 2x wider. This is particularly important when talking about MMX/SIMD registers.
3= There are a few instructions in x86-64 that are not in IA32.

The easiest way to take advantage of all this is to get Intel's compiler, icc, and their MKL libraries and build s@h using them as aggressively as possible while still getting correct results.
These tools cost money.

Sun's Performance Evaluator tool is also useful, as it will profile the code for you, suggest useful source transformations, suggest in-memory data structure layouts, and so on.
This tool is free.
ID: 419541
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 419551 - Posted: 12 Sep 2006, 22:51:22 UTC
Last modified: 12 Sep 2006, 22:55:32 UTC

Eric,

That is exactly why I am keen on invading 64-bit space ;o) I'm well aware of the architectural advantages.

Also (I have posted this before) I managed to acquire commercial ICC and IPP licenses quite a while ago. For anyone interested in building their own, trial versions are available from Intel online (as noted in the Windows and Linux How-To threads stickied on this board).

So really, there is nothing holding anyone back from participating. Obviously, releasing apps to the public is a different thing, as the trial licenses do not really allow such usage.

To work around this issue, I've tried to assemble as many capable people as possible in my test and development group on http://lunatics.at. I can only say that their pace is hard to keep up with :o) I will continue to offer apps compiled from the sources this team produces, as the licenses themselves do not run out (only updates and premium support do, next July).

You posted elsewhere that you would like to stay anonymous but still like to participate in coding. As someone else noted, you could always acquire a generic email address to register with at my site and use the same username as here - and let me know about it.

As things stand, you would not be the only one. That has not stopped Mr. and Ms. Anonymous from contributing, though.

Bob, sorry for sort of hijacking your thread - the same goes for you. If you're interested in working together on the S@H code, you're welcome! Let me know, and I'll bump your access.

Regards,
Simon.
ID: 419551
Sutehk
Volunteer tester
Joined: 11 Jun 99
Posts: 42
Credit: 1,443,674
RAC: 0
United States
Message 419566 - Posted: 12 Sep 2006, 23:32:59 UTC

Is this optimization for the Intel 64-bit processors or the AMD 64-bit ones?
ID: 419566
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 419575 - Posted: 13 Sep 2006, 0:01:16 UTC

Hm,

It didn't run on my XP64 system at all. In fact, it didn't even produce an stderr.txt file.

Not sure why. Did you try it on other XP64 systems, Bob?

Regards,
Simon.
ID: 419575
BORG
Volunteer tester
Joined: 3 Aug 99
Posts: 305
Credit: 6,157,052
RAC: 0
Canada
Message 419602 - Posted: 13 Sep 2006, 1:03:45 UTC - in response to Message 419575.  

Anyone try it yet?

I just did, and got lots of client errors:

Result ID  Work unit ID  Sent                     Time reported            Server state  Outcome       Client state  CPU time (sec)  Claimed credit  Granted credit
378761173  90781347      12 Sep 2006 9:08:03 UTC  13 Sep 2006 0:59:00 UTC  Over          Client error  Done          0.00            0.00            ---
378761169  90781360      12 Sep 2006 9:08:03 UTC  13 Sep 2006 0:59:00 UTC  Over          Client error  Done          0.00            0.00            ---
378761166  90781338      12 Sep 2006 9:08:03 UTC  13 Sep 2006 0:59:00 UTC  Over          Client error  Done          0.00            0.00            ---
378761161  90781346      12 Sep 2006 9:08:03 UTC  13 Sep 2006 0:59:00 UTC  Over          Client error  Done          0.00            0.00            ---
378761158  90781361      12 Sep 2006 9:08:03 UTC  13 Sep 2006 0:59:00 UTC  Over          Client error  Done          0.00            0.00            ---
378761154  90781341      12 Sep 2006 9:08:03 UTC  13 Sep 2006 0:59:00 UTC  Over          Client error  Done          0.00            0.00            ---
378761151  90781333      12 Sep 2006 9:08:03 UTC  13 Sep 2006 0:59:00 UTC  Over          Client error  Done          0.00            0.00            ---


Running it on Windows Server 2003 Standard x64 Edition - Dual core 3.4.


ID: 419602
Bob Delkhoon
Joined: 15 May 99
Posts: 11
Credit: 201,827
RAC: 0
United States
Message 419613 - Posted: 13 Sep 2006, 1:39:27 UTC

Bah, getting errors? It's been working fine for me. I may take the link down until I can work with it some more.

This is the machine I've been running it on. All the results since the 3 client errors are from the exact build I posted. (The 3 client errors were accidents from me messing with the app_info.xml.)
http://setiathome.berkeley.edu/show_host_detail.php?hostid=2624714
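
For anyone else editing app_info.xml by hand, a small consistency check can catch the kind of slip that produces instant client errors. The following is only a minimal sketch in Python, assuming the usual anonymous-platform layout (app, file_info, app_version with file_ref). The executable name `example_app_x64.exe` is a made-up placeholder, not the file in the zip, and the app name and version number are assumptions for the 5.15 enhanced app.

```python
import xml.etree.ElementTree as ET

# Hypothetical app_info.xml; the executable name is a placeholder,
# not the file shipped in the zip.
SAMPLE = """\
<app_info>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>example_app_x64.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>515</version_num>
    <file_ref>
      <file_name>example_app_x64.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""


def check_app_info(xml_text):
    """Return a list of problems found; an empty list means none were detected."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    # Names declared by <app> and <file_info> blocks.
    app_names = {e.text.strip() for e in root.findall("app/name") if e.text}
    declared = {e.text.strip() for e in root.findall("file_info/name") if e.text}

    # Every app_version must point at a declared app and declared files.
    for version in root.findall("app_version"):
        app_name = (version.findtext("app_name") or "").strip()
        if app_name not in app_names:
            problems.append(f"app_version names unknown app {app_name!r}")
        for ref in version.findall("file_ref"):
            file_name = (ref.findtext("file_name") or "").strip()
            if file_name not in declared:
                problems.append(f"file_ref points at undeclared file {file_name!r}")
    return problems


print(check_app_info(SAMPLE))
```

This only checks that the file is well-formed and internally consistent; it cannot tell whether the names match what is actually in the project directory.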

The machine is a Core 2 Duo running Windows XP x64. I may have accidentally linked IPP wrong, but I'm pretty sure it's static and correct.

The build environment was Visual Studio 2005 with Intel Compiler 9.1 and IPP 5.1.1.

Unfortunately, this is the only 64-bit machine I have access to, so it's hard to test on a clean machine.

--Bob Delkhoon (DeNitro)


ID: 419613
BORG
Volunteer tester
Joined: 3 Aug 99
Posts: 305
Credit: 6,157,052
RAC: 0
Canada
Message 419614 - Posted: 13 Sep 2006, 1:42:11 UTC - in response to Message 419613.  

Question, DeNitro: which SSE level did you use? SSE4? SSE3? I believe the Core 2 Duo has SSE4.




ID: 419614
Bob Delkhoon
Joined: 15 May 99
Posts: 11
Credit: 201,827
RAC: 0
United States
Message 419619 - Posted: 13 Sep 2006, 1:49:14 UTC

OK, I *may* have something. I was doing 32-bit client testing before 64-bit, and I had /QaxT and /QaT set (Core 2 Duo flags). I read that /QaxT and similar settings (SSE/SSE2 etc.) are ignored when compiling for EMT64, as it takes precedence, BUT /QaT *may* not be; I'm not sure. I'm going to remove them both just to be safe and do another build. I'll replace the link in a few minutes (and let you know). It should not affect speed, since SSE/SSE2/MMX don't work with EMT64, as EMT64 overrides them.

I may be wrong, as I'm new to 64-bit coding, but I'll give it a shot.

--Bob Delkhoon (DeNitro)
ID: 419619
Bob Delkhoon
Joined: 15 May 99
Posts: 11
Credit: 201,827
RAC: 0
United States
Message 419627 - Posted: 13 Sep 2006, 2:04:30 UTC
Last modified: 13 Sep 2006, 2:04:46 UTC

OK, the new client is built and up at the exact same link. The only change is that it's built without the Core 2 Duo specific flags (/QaxT and /QaT).

It may or may not work still, since I thought EMT64 would override the SSE4.

PLEASE use this on a test work unit, or wait to hear if it works on test work units from others, BEFORE you use this on the actual BOINC app. It *may produce errors!!*

Thanks again guys,

--Bob Delkhoon (DeNitro)
ID: 419627
BORG
Volunteer tester
Joined: 3 Aug 99
Posts: 305
Credit: 6,157,052
RAC: 0
Canada
Message 419644 - Posted: 13 Sep 2006, 2:39:54 UTC

That seems to have fixed something, DeNitro. It's running now. I first tested it with the Chicken's kwsn-test program.

I'm running it now on an AMD 3500 with Windows XP 64-bit edition. Will post results tomorrow.

Thanks a million DeNitro.



ID: 419644
Bob Delkhoon
Joined: 15 May 99
Posts: 11
Credit: 201,827
RAC: 0
United States
Message 419661 - Posted: 13 Sep 2006, 3:01:48 UTC
Last modified: 13 Sep 2006, 3:07:55 UTC

Great to hear it's working; very sorry about that mix-up. Again, I'm not sure what the speed difference will be. On my machine, I don't see a significant difference over Simon's (awesome) optimized app that this was based on, but hopefully it's a start.

Simon, sure, I'd like to work together. There's some more I want to change in the code base to make it totally cross-compatible with 32-bit and 64-bit builds. There are also still a few warnings I want to look into. This was just a rough edit to get valid results and check initial performance, since I'm kinda time-limited with school atm. As long as it's not any slower than the 32-bit apps, I hope it's an OK start.

Again, PLEASE remember: this is just a test. Make sure you do a test run first on your platform with test work units before you try it on the actual BOINC app!!!

Thanks,
--Bob Delkhoon (DeNitro)
ID: 419661
Bob Delkhoon
Joined: 15 May 99
Posts: 11
Credit: 201,827
RAC: 0
United States
Message 419707 - Posted: 13 Sep 2006, 3:52:12 UTC
Last modified: 13 Sep 2006, 4:01:56 UTC

Cool, thanks. You can probably tell from my post count that I'm not a frequent poster here, but I love the project =)

The filesize outside the zip should be 10,616,832.
If you re-downloaded after I put up the new build, your ISP may have pulled it from a proxy/cache or something.
Just in case, I also put it up under a modified filename. (same build as top post link)
http://tiger.towson.edu/~bdelkh1/setiathome-5.15-DeNitro-emt64_test1.zip
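
If you want to confirm that you have the new build rather than a cached copy, comparing the extracted file's size against the figure above is a quick first check. A minimal sketch in Python follows; the path you pass in is up to you, and a matching size does not prove the contents are identical.

```python
import os

EXPECTED_SIZE = 10_616_832  # bytes, the unzipped size quoted in the post


def size_matches(path, expected=EXPECTED_SIZE):
    """Return True if the file at `path` is exactly `expected` bytes long."""
    return os.path.getsize(path) == expected
```

Calling `size_matches()` with the path of the extracted executable returns `False` for a stale copy of a different size.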

Thanks,
--Bob Delkhoon (DeNitro)
ID: 419707
Bob Delkhoon
Joined: 15 May 99
Posts: 11
Credit: 201,827
RAC: 0
United States
Message 419728 - Posted: 13 Sep 2006, 4:15:45 UTC

Oops, thanks, title changed. Even on my Intel chip, I felt like the SSE2 app worked faster than SSE3. I didn't really test SSE4, since I was eager to try out emt64. When I have some time this week I might try an SSE4 app too.

Please Note: Most of the work that went into the app over the default was already in place in Simon's source. I just made the modifications to get it to do a 64-bit compile.

--Bob Delkhoon (DeNitro)
ID: 419728
EricVonDaniken
Joined: 17 Apr 04
Posts: 177
Credit: 67,881
RAC: 0
United States
Message 419810 - Posted: 13 Sep 2006, 9:46:52 UTC - in response to Message 419728.  
Last modified: 13 Sep 2006, 9:48:20 UTC

Bob Delkhoon wrote:
> Oops, thanks, title changed. Even on my Intel chip, I felt like the SSE2 app worked faster than SSE3. I didn't really test SSE4, since I was eager to try out emt64. When I have some time this week I might try an SSE4 app too.

On dual core AMD and Intel chips, one should compile for at least SSE3.

Intel Core2 chips have support for SSE4 in them, and if you have icc and Intel's MKL you should compile w/ SSE4 support.

If you are getting weird results like SSE2 being better than SSE3 or SSE3 being better than SSE4, something is wrong.
Note that making the best use of Intel's MKL may require some source changes to use functions that Intel has optimized for the Core2 architecture rather than their more generic equivalents.

Benher mentioned that the code is not taking good advantage of all the registers available in the IA32 or x86-64 architecture. Fixing this might require non-trivial understanding of the source.

Also, it's =E=xtended =M=emory =64=b =T=echnology. EM64T.


ID: 419810
BORG
Volunteer tester
Joined: 3 Aug 99
Posts: 305
Credit: 6,157,052
RAC: 0
Canada
Message 419834 - Posted: 13 Sep 2006, 11:48:17 UTC

With the exception of the unit that was crunched partially by the Chicken's app and finished by DeNitro's, which finished with a computation error, the next 4 finished without error. One is verified and three are pending as of this time.

Times show little improvement over the 32-bit app, but I would need to pull a unit over for a comparison run with both apps.
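
For a comparison run like that, the number worth reporting is the relative change in run time on the same work unit. A minimal sketch in Python; the timings in the example call are made up for illustration and are not measurements from this thread.

```python
def percent_faster(old_seconds, new_seconds):
    """Percentage reduction in run time going from the old app to the new one."""
    if old_seconds <= 0:
        raise ValueError("old_seconds must be positive")
    return (old_seconds - new_seconds) / old_seconds * 100.0


# Made-up timings for illustration only.
print(round(percent_faster(8000.0, 7600.0), 1))
```

A negative result means the new app was slower on that unit. One unit is a small sample, so several runs per app would give a more trustworthy figure.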


ID: 419834
Sutehk
Volunteer tester
Joined: 11 Jun 99
Posts: 42
Credit: 1,443,674
RAC: 0
United States
Message 419836 - Posted: 13 Sep 2006, 11:58:13 UTC

Downloaded and installed on an AMD X2 3800+. It seems to be running a few hundred Mflops faster than the SSE2 app I was running earlier, at least according to BoincView. Hope this helps some.
ID: 419836
BORG
Volunteer tester
Joined: 3 Aug 99
Posts: 305
Credit: 6,157,052
RAC: 0
Canada
Message 419860 - Posted: 13 Sep 2006, 13:30:07 UTC

DeNitro

I would appreciate very much the steps you took to get this client working. I've been trying for a couple months with no luck.

Maybe the Chicken can do a how-to with your help.

Borg :-)
ID: 419860
Alex Kan
Volunteer developer
Joined: 4 Dec 03
Posts: 127
Credit: 29,269
RAC: 0
United States
Message 419914 - Posted: 13 Sep 2006, 16:22:07 UTC - in response to Message 419810.  

EricVonDaniken wrote:
> On dual core AMD and Intel chips, one should compile for at least SSE3.
>
> Intel Core2 chips have support for SSE4 in them, and if you have icc and Intel's MKL you should compile w/ SSE4 support.
>
> If you are getting weird results like SSE2 being better than SSE3 or SSE3 being better than SSE4, something is wrong.
> Note that making the best use of Intel's MKL may require some source changes to use functions that Intel has optimized for the Core2 architecture rather than their more generic equivalents.

It seems a bit simplistic to assume that using SSE3 and SSE4 wherever possible will automatically be faster than not using it. For example, using HADDPS to sum across a register has greater latency and lower throughput than other combinations of instructions, especially on Core2 chips.
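
To make "sum across a register" concrete, here is a plain-Python model of the lane arithmetic for the two approaches: two horizontal adds versus a move/shuffle plus ordinary adds. This is only a sketch of the data movement on 4-element lists standing in for a 128-bit register. It is not SSE code, the function names are made up for illustration, and it says nothing about latency or throughput, which is the actual point of the comparison.

```python
def haddps(a, b):
    """Model of HADDPS on two 4-lane vectors: adjacent pairs of a, then of b."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def hsum_with_hadd(v):
    """Horizontal sum using two horizontal adds."""
    t = haddps(v, v)          # [v0+v1, v2+v3, v0+v1, v2+v3]
    return haddps(t, t)[0]    # lane 0 now holds the full sum


def hsum_with_shuffles(v):
    """Horizontal sum using a high-to-low move, a shuffle, and two ordinary adds."""
    hi = [v[2], v[3], v[2], v[3]]          # MOVHLPS-style: copy upper half down
    s = [x + y for x, y in zip(v, hi)]     # lanes 0 and 1 hold v0+v2 and v1+v3
    return s[0] + s[1]                     # shuffle lane 1 to lane 0, then add
```

Both functions return the same total for exactly representable inputs, but they add the lanes in a different order, so results on real floating-point data can differ in the last bits.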

As for SSE4 (which I suppose I should be calling SSSE3 now), I think it's been discussed a couple times before--the only new instructions that could be useful for us are PSHUFB and PALIGNR, but only if we find places where using them is faster. If you can find a use for all those other SIMD integer instructions in a primarily floating-point application, all the more power to you.

I imagine the Intel compiler knows much more about choosing and scheduling these instructions than I do, but it's definitely interesting that the SSE2 app has performed so well.

EricVonDaniken wrote:
> Benher mentioned that the code is not taking good advantage of all the registers available in the IA32 or x86-64 architecture. Fixing this might require non-trivial understanding of the source.

I think when Ben originally said this, he wasn't implying that it was a problem to be fixed. :) A lot of SETI code is simple loopy code, and there are arguments against doing a ton of loop unrolling on Core2, like a 64-byte loop buffer and the increase in code size that you've taken by moving to x86-64.

So, EricVonDaniken, have you signed up on Simon's board yet?
ID: 419914