Windows port of Alex v8 code

Message boards : Number crunching : Windows port of Alex v8 code

JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 727273 - Posted: 17 Mar 2008, 8:36:15 UTC

After 13 hours nonstop porting, the first WU processed validates...resultid=783586431

The processor isn't very impressive: an E4500 running @ 2418 MHz on a crippled ECS 945GCT-M MoBo with dysfunctional Crucial Ballistix memory that fails about every 6-7 hours. OS is Windows XP running diskless with BoincPE.

Build is using :
MS Visual Studio 2005 Professional
Intel C++ 10.1.020
Intel IPP 5.3.2.073


Here is the compile commandline:

/c /O3 /Og /Ob2 /Oi /Ot /Oy /GT /GA /I "C:\Program Files\Intel\IPP\5.3.2.073\ia32\tools\staticlib" /I "C:\Program Files\Intel\IPP\5.3.2.073\ia32\include" /I "C:\Program Files\Intel\Compiler\C++\10.1.020\IA32\include" /I "../../../boinc/win_build" /I "../../../boinc" /I "../../../boinc/api" /I "../../../boinc/api/win" /I "../../../boinc/client/win" /I "../../../boinc/" /I "../../../boinc/lib" /I "../../image_libs" /I "../../jpeglib" /I "../../db" /I "../../glut" /I "../../" /I "../" /D "USE_IPP" /D "USE_SSSE3" /D "USE_I386_OPTIMIZATIONS" /D "USE_I386_CORE2" /D "__INTEL_COMPILER" /D "WIN32" /D "_MT" /D "NDEBUG" /D "_WINDOWS" /D "CLIENT" /D "_CONSOLE" /D "NBOINC_APP_GRAPHICS" /D "_MBCS" /D "_VC80_UPGRADE=0x0710" /GF /FD /EHsc /MT /Zp16 /GS- /Gy /GR /Yc"..\StdAfx.h" /Fp".\Release/seti_boinc.pch" /Fo".\Release/" /W3 /nologo /Zi /Gd /TP /FI "win-config.h" /fp:fast /Qprec-div- /Qprec-sqrt- /Qfp-speculationfast /QxO

I'm not very familiar developing on Windows with these tools (first try, I just installed VS2005PRO & Intel Compiler yesterday). If anyone has suggestions for better options, please let me know and I'll try.


I'll let it run while I get a few hours' sleep and check some other ARs. Now it's 3:35am, time to get some shut-eye.

These WUs have already been detached by the project, so nothing is lost by trying to crunch them with a first attempt at ported code.

Regards,
JDWhale
ID: 727273 · Report as offensive
David
Volunteer tester
Joined: 19 May 99
Posts: 411
Credit: 1,426,457
RAC: 0
Australia
Message 727274 - Posted: 17 Mar 2008, 8:39:41 UTC

Interesting, thanks.

ID: 727274 · Report as offensive
John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 727275 - Posted: 17 Mar 2008, 9:12:14 UTC

Many thanks for the effort. We look forward to the results.
It's good to be back amongst friends and colleagues



ID: 727275 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 727280 - Posted: 17 Mar 2008, 9:25:30 UTC
Last modified: 17 Mar 2008, 9:30:30 UTC

1,574 seconds for a VHAR at 2.4GHz is very honorable - impressive even, especially for a first run after a session like that. Congratulations! Enjoy your well-earned sleep, but I hope you had time to crack a celebratory beer first.

Once you get it bedded down and checked at other ARs, I'd be happy to give it a benchmarking run on one of the quaddies where we've already got plots for stock, Chicken and Crunch3r.

Edit - 781770844 has come in too. 5,897 seconds for 63.98 cr, and valid. Looking good.
ID: 727280 · Report as offensive
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 19012
Credit: 40,757,560
RAC: 67
United Kingdom
Message 727290 - Posted: 17 Mar 2008, 10:20:43 UTC - in response to Message 727280.  
Last modified: 17 Mar 2008, 10:21:03 UTC

That's an impressive speed-up; your previous 63.98 cr units took ~7,100 secs.
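For anyone who wants to put a number on that, here is a minimal sketch of the arithmetic. The two run times are the ones quoted in this thread (~7,100 s is only approximate), and the function names are made up for illustration.

```cpp
#include <cassert>

// Fraction of run time saved when a task drops from old_secs to new_secs.
double time_reduction(double old_secs, double new_secs) {
    assert(old_secs > 0.0);
    return (old_secs - new_secs) / old_secs;
}

// How many times more work gets done per unit of time.
double speedup_factor(double old_secs, double new_secs) {
    assert(new_secs > 0.0);
    return old_secs / new_secs;
}
```

Calling time_reduction(7100.0, 5897.0) gives a bit under 17% off the run time, and speedup_factor(7100.0, 5897.0) a throughput gain of roughly 1.2x. Since these are single results and the old figure is rounded, treat both as ballpark values.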
ID: 727290 · Report as offensive
David
Volunteer tester
Joined: 19 May 99
Posts: 411
Credit: 1,426,457
RAC: 0
Australia
Message 727291 - Posted: 17 Mar 2008, 10:23:06 UTC - in response to Message 727280.  

Edit - 781770844 has come in too. 5,897 seconds for 63.98 cr, and valid. Looking good.


Wow, it takes my Quaddies 5,500-5,800 seconds to earn that sort of credit (give or take a minute), and that's clocked at 3033 MHz for the media PC and 3105 MHz for mine. Compare that to JDWhale's PC, which is a reasonable bit slower, so I'd say the code is really working well.

As long as the science is accurate then I'd say it's a real winner.
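A rough way to compare hosts at different clock speeds is to multiply run time by clock frequency, which approximates the cycles spent on the task. The sketch below assumes that cycles are comparable between the machines and ignores memory speed, cache contention and how many tasks run at once, so it is only a first-order estimate. The helper name is hypothetical.

```cpp
#include <cassert>

// Approximate cost of a task in millions of clock cycles:
// seconds * MHz = (seconds) * (millions of cycles per second).
double mega_cycles(double secs, double mhz) {
    assert(secs >= 0.0 && mhz > 0.0);
    return secs * mhz;
}
```

With the figures from this thread, mega_cycles(5897, 2418) for the E4500 comes out lower than mega_cycles(5500, 3033) for the faster of the quad results, which is consistent with the impression that the new code does more per clock.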
ID: 727291 · Report as offensive
Adri
Volunteer tester
Joined: 27 Apr 07
Posts: 56
Credit: 132,673
RAC: 0
Malaysia
Message 727301 - Posted: 17 Mar 2008, 11:35:41 UTC

WOW!!! THIS IS AWESOME!!!!

Now, we shall wait till this gets incorporated into another update of Crunchers app... eeekkk!!!!

ID: 727301 · Report as offensive
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 727303 - Posted: 17 Mar 2008, 11:55:29 UTC

Looks like you're going to be the most popular guy on NC for a while ;)

For a .38 AR WU, you've knocked 12.5% off the time it takes my Q6600 clocked at 3336MHz running Crunch3r's SSSE3.

I'll post a quick comparison chart when there are a few more results to go on (perhaps this evening UTC), but for a true measure I support Richard H's proposal that, once it is bedded down you let him run it on a box that has already been used for benchmarking other apps - that way we reduce the variability introduced by hardware.

Excellent piece of work so far,

F.
ID: 727303 · Report as offensive
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 727306 - Posted: 17 Mar 2008, 12:25:49 UTC - in response to Message 727303.  

*** Hi there, ***

Impressive piece of work, John. I think a lot of the WINDOWERS are waiting for such results.

Great job!
ID: 727306 · Report as offensive
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 727308 - Posted: 17 Mar 2008, 12:46:21 UTC - in response to Message 727273.  
Last modified: 17 Mar 2008, 12:47:28 UTC


I'm not very familiar developing on Windows with these tools (first try, I just installed VS2005PRO & Intel Compiler yesterday). If anyone has suggestions for better options, please let me know and I'll try.


I'll admit that I didn't check that slew of options you have already, and I admit that I'm a bit rusty in Visual Studio, but if you don't have it enabled, you may want to enable runtime checks. I would guess this probably has already been done, but it won't hurt. What it will do is give you the ability to handle runtime problems, such as array out of bounds, among other things... It will also give you the capability of generating output to file and continue on. I found it useful for tracking down array problems with null terminated strings where the prior developer had not allowed for the null. IIRC, it will also report on uninitialized variables...
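To illustrate the null-terminator kind of bug: a buffer sized for the visible characters only is one byte too short for the string. In MSVC the runtime checks are switched on with /RTCs (stack frame checks), /RTCu (uninitialized locals) or /RTC1 (both); as far as I know they can't be combined with /O2-style optimization, so this is for a debug build, and I'm assuming the Intel compiler accepts them the same way. The sketch below is the defensive version of such a copy, with a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies src into dst only if dst has room for every character
// plus the terminating '\0'. Returns false instead of overrunning.
bool copy_with_terminator(char* dst, std::size_t dst_size, const char* src) {
    assert(dst != nullptr && src != nullptr);
    const std::size_t needed = std::strlen(src) + 1;  // +1 for the terminator
    if (needed > dst_size) {
        return false;  // copying would write past the end of dst
    }
    std::memcpy(dst, src, needed);  // copies the characters and the '\0'
    return true;
}
```

A five-byte buffer takes "hell" but not "hello", which needs six bytes. An unchecked strcpy in the second case is the sort of overrun a stack frame check should flag when the buffer is a local array.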

I'm heading out to a friend's house...and probably won't be able to post until after 10pm EDT tonight...
ID: 727308 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 727312 - Posted: 17 Mar 2008, 12:57:10 UTC
Last modified: 17 Mar 2008, 13:29:28 UTC

Hi JDWhale,
Below are the compiler options for comparison [fairly similar] from the seti_boinc project of 'our' pre-alpha port of the same AK_v8 code... Expect significant speedup, particularly with regard to pulse finding, on Core 2 [SSSE3] and higher [SSE4.1] machines, and some decent improvement on SSE3 P4 builds also.

Testing offline with knabench (available from the Lunatics site) or similar would allow direct comparison between apps (say against 2.4V SSSE3 or a stock app) with the same WU. It can also run pre-shortened workunits, which speeds development and improves repeatability.

Yes, /O2 optimisations test a few percent faster than /O3 with our builds, as they [Lunatics] did also with 2.2b and 2.4 / 2.4V. I would be interested to know if you showed improvement using /O3.

Jason
[P.S my new Intel stuff arrives soon, so I haven't bothered updating for a while ... Must go to IPP5.3 ASAP]

/c /O2 /Og /Ob2 /Oi /Ot /GA /I "C:\Program Files\Intel\IPP\5.2\ia32\tools\staticlib" /I "C:\Program Files\Intel\IPP\5.2\ia32\include" /I "C:\Program Files\Intel\Compiler\C++\10.1.013\IA32\include" /I "../../../boinc/win_build" /I "../.." /I "../../../boinc" /I "../../../boinc/api" /I "../../../boinc/api/win" /I "../../../boinc/client/win" /I "../../../boinc/" /I "../../../boinc/lib" /I ".." /I "../../image_libs" /I "../../jpeglib" /I "../../db" /I "../../glut" /I "../../" /I "../../../lib/fftw-3.0.1/api" /D "WIN32" /D "_WIN32" /D "_MT" /D "NDEBUG" /D "_WINDOWS" /D "CLIENT" /D "_CONSOLE" /D "USE_I386_OPTIMIZATIONS" /D "USE_IPP" /D "USE_I386_CORE2" /D "__SSE__" /D "__SSE2__" /D "__SSE3__" /D "__INTEL_COMPILER" /D "_ATL_MIN_CRT" /D "_MBCS" /D "_VC80_UPGRADE=0x0710" /GF /Gm /MT /GS- /Gy /GR /Fo"D:\BoincSeti_Prog\sinbad_repositories\AK_v8\client\win_build\\Release SSSE3\Intermediate\seti_boinc/" /W3 /nologo /Zi /Gd /TP /FI "win-config.h" /fp:fast /Qprec-div- /Qprec-sqrt- /Qipo- /Qftz /QxO

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 727312 · Report as offensive
Adrian Taylor
Volunteer tester
Joined: 22 Apr 01
Posts: 95
Credit: 10,933,449
RAC: 0
United Kingdom
Message 727322 - Posted: 17 Mar 2008, 13:39:37 UTC - in response to Message 727273.  

well done JDWhale

i'm so glad that someone has gone from the land of BS on this and into the world of actually doing something...!

i hope you have great success, although as a mac user i should be miffed, but it's all for the science eh ?

also glad that ? isn't involved, how refreshing :-)

keep up the good work

regards

adrian
63. (1) (b) "music" includes sounds wholly or predominantly characterised by the emission of a succession of repetitive beats
ID: 727322 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 727333 - Posted: 17 Mar 2008, 14:52:30 UTC - in response to Message 727273.  

Impressive. Sounds very promising, so I'll keep an eye on this.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 727333 · Report as offensive
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 727347 - Posted: 17 Mar 2008, 16:15:03 UTC
Last modified: 17 Mar 2008, 16:17:18 UTC

Hi all,

Thanks for all the positive feedback... but all is not well with the WHALEapp just yet. Some WUs enter what seems to be a "wait state", mostly within the first minute of execution. Not what you want for benchmarking or a daily driver :-( But those VHAR WUs sure haul a**! I'll try changing some compiler options as suggested and try again. I really don't want to debug this; too many years on the job have burned me out.

I'll answer the obvious question before someone asks... Yes those constant 913 second run times are true for the 17 credit VHAR WUs. Only thing was I set BOINC to run on only "1" processor on this E4500 Core2 Duo. No memory or cache contention, the whole 2MB L2 mostly for the single process. Just what "High Priority" WUs deserve!

I had to shut down the system to replace that flaky Ballistix memory; the only sticks I had lying around were unmatched 2 x 1GB DDR2-667. This system actually runs better single channel than with the unmatched sticks, but because I've configured the RAM drive (remember BOINC_PE) to use 768MB, that didn't leave a whole lot of memory for the OS & processes, so I went with the unmatched sticks for now.

Anyway, I'm putting the little C2Duo back on KWSN_2.4V_SSSE3_MB.exe for the time being.

Cheers to all,
John
ID: 727347 · Report as offensive
Gecko
Volunteer tester
Joined: 17 Nov 99
Posts: 454
Credit: 6,946,910
RAC: 47
United States
Message 727365 - Posted: 17 Mar 2008, 17:16:13 UTC
Last modified: 17 Mar 2008, 17:16:25 UTC

JDWhale: GREAT job!!! Thanks for your efforts & the surprise. It's always awesome to see continued examples of the talent and generosity that make up the community.

Hats off to you, sir!

BTW, please check your PM.
ID: 727365 · Report as offensive
Sir Ulli
Volunteer tester
Joined: 21 Oct 99
Posts: 2246
Credit: 6,136,250
RAC: 0
Germany
Message 727408 - Posted: 17 Mar 2008, 19:53:59 UTC

Very good job, I am looking at this...

Thanks @JDWhale for your good work...

Greetings from Germany NRW
Ulli


ID: 727408 · Report as offensive
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 727432 - Posted: 17 Mar 2008, 21:20:27 UTC
Last modified: 17 Mar 2008, 21:21:35 UTC

Okay... we're back on... I found the error of my ways in gaussFit.cpp. The substitution for the convolution call was incorrect and was causing a buffer overrun in the output buffer. [Edit] Thanks for the tip, you know who you are![/edit] That's what I get for pulling a replacement function out of thin air, not knowing the exact behavior of the original, and not carefully reading the user guide for the IPP replacement. 12 hours and a pint of Vodka can have that effect. I digress...
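For anyone curious how a convolution swap can overrun its output, here is a minimal sketch. It is an assumption on my part that the replacement computes a full linear convolution; this post doesn't spell out the exact function or buffer sizes, and the code below is plain C++, not the IPP call or the gaussFit.cpp code. A full convolution of n and m samples produces n + m - 1 outputs, so an output buffer sized like the input is m - 1 elements too short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Full linear convolution of a and b.
// The result always has a.size() + b.size() - 1 samples.
std::vector<float> full_convolve(const std::vector<float>& a,
                                 const std::vector<float>& b) {
    assert(!a.empty() && !b.empty());
    std::vector<float> out(a.size() + b.size() - 1, 0.0f);
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) {
            out[i + j] += a[i] * b[j];  // highest index is n + m - 2
        }
    }
    return out;
}
```

Convolving three samples with a two-tap kernel gives four outputs. If the original routine only wrote as many outputs as inputs and the caller's buffer was sized for that, a drop-in full convolution would write one element past the end here.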

Unfortunately, I also changed some compile options and don't have any VHARs left for comparison. Previously crunched WUs cannot be used for comparison, as the host was left unattended and WUs were stalling, thus favoring the WU left running without contention.

I've launched a direct comparison of WhaleApp vs. 2.4V...on WU's from same splitter block (I hope that term is correct)...

2.4V
wuid=236103546
wuid=236103433

vs.

WhaleApp
wuid=236103598
wuid=236103652


The WhaleApp WUs haven't reported yet; I just kicked them off simultaneously and will post an update as soon as they finish, I predict before 23:00 UTC.

Similarly, the 2.4V WUs above were kicked off simultaneously, at virtually the same time. (Never mind the crud at the beginning of the list; it just shows that I tried to crunch them with the earlier WhaleApp, but they both stalled less than a minute into the run.)
--------------------------

I know that this is not the best host, but it is representative of what is currently available for folks on a budget.

Intel Core2Duo E4500 @ 2418 MHz on ECS 945GCT-M with mismatched DDR2-667 Running WindowsXP diskless via BoincPE




BOINC On..On...
ID: 727432 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 727445 - Posted: 17 Mar 2008, 21:58:45 UTC - in response to Message 727432.  

I always wondered what whales did with those large brains: drink vodka and do math like crazy. ;) Glad you got it sorted. Now, as soon as it validates, you'll need some guinea pigs, er, lab rats, er, volunteers. Yeah, that's it (couldn't help myself).
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 727445 · Report as offensive
SATAN
Joined: 27 Aug 06
Posts: 835
Credit: 2,129,006
RAC: 0
United Kingdom
Message 727449 - Posted: 17 Mar 2008, 22:16:25 UTC

John, well done and congratulations on putting your money where your mouth is.

Let's hope they validate.
ID: 727449 · Report as offensive
Logan
Volunteer tester
Joined: 26 Jan 07
Posts: 743
Credit: 918,353
RAC: 0
Spain
Message 727452 - Posted: 17 Mar 2008, 22:22:44 UTC - in response to Message 727445.  
Last modified: 17 Mar 2008, 22:57:51 UTC

:)

SSE3 2.4 from lunatics (C2D version, really SSSE3) and 6.10 v3 from Crunch3r inside.;) Good combination, Joker... Try it!

It's the best I have tested to date...
Logan.

BOINC FAQ Service (Now also available in Spanish)
ID: 727452 · Report as offensive