Status report...

Message boards : SETI@home Enhanced : Status report...

Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 6059 - Posted: 15 Aug 2006, 17:09:24 UTC
Last modified: 15 Aug 2006, 17:09:40 UTC

Linux version 5.17 has been released.

If you are having problems with beta coexisting with public while running an anonymous platform/app_info.xml application, that was a BOINC bug that has been fixed in the most recent alpha release. You can get it here.

We've gotten quite a few bug fixes (most in the BOINC code) so we'll probably release 5.18 fairly quickly.

Eric
ID: 6059 · Report as offensive
KWSN - Chicken of Angnor
Volunteer tester
Joined: 14 Aug 06
Posts: 22
Credit: 190,000
RAC: 0
Austria
Message 6060 - Posted: 15 Aug 2006, 20:09:15 UTC

Hi Eric,

thanks for the status update.

I've got a question - rather, a favour - to ask: could you please update the tarballs at http://setiathome.berkeley.edu/~korpela/build/ as well?

Your home dirs seem more current than CVS, which (yesterday, anyway) still had its most recent x86-related edit sometime in July.

It would be really nice to have a more recent image of the build dirs (and not have to mirror the whole dir with wget or the like and suck up bandwidth).

Thank you for your time,
Simon.
ID: 6060 · Report as offensive
Sir Ulli
Volunteer tester
Joined: 16 Jun 05
Posts: 47
Credit: 147,346
RAC: 0
Germany
Message 6062 - Posted: 15 Aug 2006, 21:21:59 UTC - in response to Message 6059.  

> Linux version 5.17 has been released.
>
> If you are having problems with beta coexisting with public while running an anonymous platform/app_info.xml application, that was a BOINC bug that has been fixed in the most recent alpha release. You can get it here.
>
> We've gotten quite a few bug fixes (most in the BOINC code) so we'll probably release 5.18 fairly quickly.
>
> Eric


One of the most frequently asked questions is:

When will we get the new WUs?

When will we get data from the Multi-Beam Receiver?

...

BTW @Eric, thanks for your work here...

and for the information you give us...

Greetings from Germany NRW
Ulli
ID: 6062 · Report as offensive
Steve Cressman
Volunteer tester
Joined: 14 Nov 05
Posts: 296
Credit: 13,874
RAC: 0
Canada
Message 6177 - Posted: 29 Aug 2006, 2:54:01 UTC

Eric
How is the work going on 5.18? With your last statement I kinda thought we would see it before now. Hope you have not run into any problems that are causing a delay. I'm just a little anxious to get back to work here since I can't do any units at the moment on win98se.

Steve
98SE XP2500+ @ 2.1 GHz Boinc v5.8.8
ID: 6177 · Report as offensive
Fischer-Kerli
Volunteer tester

Joined: 25 Mar 06
Posts: 100
Credit: 61,559
RAC: 0
Germany
Message 6210 - Posted: 7 Sep 2006, 11:14:58 UTC

Hello? Is there anybody out there?
ID: 6210 · Report as offensive
Astro
Volunteer tester
Joined: 14 Jun 05
Posts: 200
Credit: 68,273
RAC: 0
Message 6217 - Posted: 8 Sep 2006, 0:34:25 UTC - in response to Message 6210.  

> Hello? Is there anybody out there?

nope. LOL
ID: 6217 · Report as offensive
Tetsuji Maverick Rai
Project developer
Volunteer developer
Joined: 15 Jun 05
Posts: 399
Credit: 16,571,350
RAC: 0
Japan
Message 6229 - Posted: 9 Sep 2006, 12:14:03 UTC
Last modified: 9 Sep 2006, 12:40:53 UTC

For those who are wondering about the status.

Eric is now in a sort of "maniac" mode, optimizing the cruncher for several architectures. But he doesn't seem to be so maniacal as to use the GPU for calculation. He is writing hand-assembly code for SIMD. I'm not sure which is faster, his code or ICC's code in each function. Either way, the new cruncher chooses the faster routines, so the effort will be rewarded. As for Intel processors, I asked him to add a flag for the new "CORE" processors with IPP. If you use CVS, you can see what's going on.
Luckiest in the world. WMD = Weapon of Mass Distraction
ID: 6229 · Report as offensive
Nightbird
Volunteer tester

Joined: 16 Jun 05
Posts: 22
Credit: 12,583
RAC: 0
France
Message 6235 - Posted: 9 Sep 2006, 18:34:06 UTC - in response to Message 6177.  

> Eric
> How is the work going on 5.18? With your last statement I kinda thought we would see it before now. Hope you have not run into any problems that are causing a delay. I'm just a little anxious to get back to work here since I can't do any units at the moment on win98se.
>
> Steve

Don't forget Win9x please; I can't crunch until the 'annoying little problem' is fixed.


ID: 6235 · Report as offensive
Mike
Volunteer tester
Joined: 16 Jun 05
Posts: 2531
Credit: 1,074,556
RAC: 0
Germany
Message 6238 - Posted: 10 Sep 2006, 7:03:27 UTC

Thanks for the update, Tetsuji.

Mike

With each crime and every kindness we birth our future.
ID: 6238 · Report as offensive
Honza
Volunteer tester

Joined: 19 Jun 05
Posts: 42
Credit: 9,057
RAC: 0
Czech Republic
Message 6249 - Posted: 11 Sep 2006, 8:31:18 UTC - in response to Message 6229.  

> I'm not sure which is faster, his code or ICC's code in each function. Either way, the new cruncher chooses the faster routines, so the effort will be rewarded. As for Intel processors, I asked him to add a flag for the new "CORE" processors with IPP. If you use CVS, you can see what's going on.

Thanks for the update, Tetsuji.
I have 3 Conroes running 24/7 at home and would like to give a "Core 2 Duo/SSE4" app a shot. Is there a link to download a compiled version for Windows? Thanks...
ID: 6249 · Report as offensive
Crunch3r
Volunteer tester

Joined: 11 Sep 05
Posts: 51
Credit: 27,831
RAC: 0
Germany
Message 6251 - Posted: 11 Sep 2006, 22:11:22 UTC - in response to Message 6229.  

> For those who are wondering about the status.
>
> Eric is now in a sort of "maniac" mode, optimizing the cruncher for several architectures. But he doesn't seem to be so maniacal as to use the GPU for calculation. He is writing hand-assembly code for SIMD. I'm not sure which is faster, his code or ICC's code in each function. Either way, the new cruncher chooses the faster routines, so the effort will be rewarded. As for Intel processors, I asked him to add a flag for the new "CORE" processors with IPP. If you use CVS, you can see what's going on.


I'm not sure if "maniacing" into asm is the right way to go, but hell, if he likes it, why not.

Personally I'd go for intrinsics, which by the way will be more efficient, even more so in combination with icc etc...

Any update on when a final source or the "nightly tarballs" will show up again?

ID: 6251 · Report as offensive
KWSN - Chicken of Angnor
Volunteer tester
Joined: 14 Aug 06
Posts: 22
Credit: 190,000
RAC: 0
Austria
Message 6253 - Posted: 12 Sep 2006, 1:19:30 UTC - in response to Message 6249.  
Last modified: 12 Sep 2006, 1:20:31 UTC

>> I'm not sure which is faster, his code or ICC's code in each function. Either way, the new cruncher chooses the faster routines, so the effort will be rewarded. As for Intel processors, I asked him to add a flag for the new "CORE" processors with IPP. If you use CVS, you can see what's going on.
>
> Thanks for the update, Tetsuji.
> I have 3 Conroes running 24/7 at home and would like to give a "Core 2 Duo/SSE4" app a shot. Is there a link to download a compiled version for Windows? Thanks...


Thanks Tetsuji :o)

Honza, I've compiled several SSE4-optimized applications and tested them vs. SSE3 ones on Core 2 and Woodcrest (Core 2 Xeon) systems - they were identical in size and speed. SSE4 adds mostly integer SIMD operations, which do not make up a lot of the total processing time.

Regards,
Simon.
ID: 6253 · Report as offensive
Honza
Volunteer tester

Joined: 19 Jun 05
Posts: 42
Credit: 9,057
RAC: 0
Czech Republic
Message 6254 - Posted: 12 Sep 2006, 7:13:34 UTC - in response to Message 6253.  

> Honza, I've compiled several SSE4-optimized applications and tested them vs. SSE3 ones on Core 2 and Woodcrest (Core 2 Xeon) systems - they were identical in size and speed. SSE4 adds mostly integer SIMD operations, which do not make up a lot of the total processing time.
Thanks for the answer.
I thought SSE4 would be of no big benefit and that Core 2 would need low-level optimization like akosf's in order to benefit from the new architecture (instruction fusion etc.).

ID: 6254 · Report as offensive
Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 6256 - Posted: 12 Sep 2006, 16:29:52 UTC - in response to Message 6254.  
Last modified: 12 Sep 2006, 16:33:37 UTC

Hi all,

Sorry again about the lack of news. I've been running around with my hair on fire as usual. The reason I created the assembly SSE versions of the power spectrum routine was that I was totally fried on everything else I was doing (primarily the multibeam splitter and the pointing correction code it requires) and I needed a diversion. If I do the same thing for too long, my brain freezes in that position. An entirely different problem can unstick it. Looking at the timings, it wasn't much of an improvement. I'm not surprised because the power spectrum calculation timing is dominated by memory access speeds. Even adding prefetch instructions didn't help.

At some point I hope to get the Intel compiler and GCC 4, so things can be autovectorized for various processors.

If anyone wants to write an SSE/SSE2/SSE3/SSSE3 matrix transpose and a chirp function, feel free (and please send it to me). Shouldn't be too difficult to base one on Alex Kan's chirp function. Also on the agenda is getting a function timer/validator for gaussian fitting and pulse finding so optimized versions can be used. If anyone wants to do these things, I'll gladly accept the help.
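For readers unfamiliar with the term, a chirp here means multiplying each complex sample by a unit phasor whose phase grows quadratically with time, which shifts the signal's frequency linearly. The sketch below is a generic illustration in plain Python, not the project's chirp routine or Alex Kan's version; the function name `chirp`, its parameters, and the sign and scaling conventions are all assumptions.

```python
import cmath
import math

def chirp(samples, chirp_rate, sample_rate):
    """Apply a linear chirp to a list of complex samples.

    Each sample n is multiplied by exp(i * pi * chirp_rate * t^2) with
    t = n / sample_rate, so the instantaneous frequency shift is
    chirp_rate * t (Hz, if chirp_rate is in Hz/s). The sign and scaling
    conventions are assumptions made for this sketch.
    """
    out = []
    for n, s in enumerate(samples):
        t = n / sample_rate
        phase = math.pi * chirp_rate * t * t
        out.append(s * cmath.exp(1j * phase))
    return out

data = [complex(1, 0), complex(0, 1), complex(-1, 0.5), complex(2, -2)]
chirped = chirp(data, 3.0, 4.0)
restored = chirp(chirped, -3.0, 4.0)  # the opposite rate undoes the chirp
```

Because every multiplier has magnitude one, the chirp changes phases only. The per-sample sine and cosine evaluations are the expensive part, which is why fast trig approximations matter for this function.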

Or if someone wants to develop the equivalent functions in 3DNow!, or VIS, or OpenGL shader language for that matter, it would be fine with me. Now that the code for timing and validating functions is in there, it should be fairly easy.

I'm hoping recent BOINC changes have fixed some of the Windows 98 problems, but until I get a new version out, I'm uncertain. If anyone else wants to do a compile and give it out to some Win 98 testers, again, feel free.

I wasn't aware that tarballs had stopped being generated. I'll try to fix it today.

This week I have a couple proposals to do, and I still need to verify and test the splitter mods. We have a couple of 500GB disks full of multi-beam data coming back from Arecibo tomorrow. So I probably won't get a release out this week.

[edit]BTW, welcome back Crunch3r.[/edit]

Eric
ID: 6256 · Report as offensive
BenHer
Volunteer tester

Joined: 12 Sep 06
Posts: 9
Credit: 0
RAC: 0
United States
Message 6257 - Posted: 12 Sep 2006, 22:21:23 UTC - in response to Message 6256.  
Last modified: 12 Sep 2006, 22:22:51 UTC

> Looking at the timings, it wasn't much of an improvement. I'm not surprised because the power spectrum calculation timing is dominated by memory access speeds. Even adding prefetch instructions didn't help.

Converting FPU code to vectorized SSE can make it about twice as fast (depending on FSB, memory speed, etc.). It's generally not CPU bound.

> If anyone wants to write an SSE/SSE2/SSE3/SSSE3 matrix transpose and a chirp function, feel free (and please send it to me). Shouldn't be too difficult to base one on Alex Kan's chirp function.

Posting them over at Simon's KWSN site. Found a way not to need separate buffer or separate transpose function...same bin reordering.

> Also on the agenda is getting a function timer/validator for gaussian fitting and pulse finding so optimized versions can be used. If anyone wants to do these things, I'll gladly accept the help.

Have a new benchmark/validator source file. Uses CPU timer ticks. Currently works with find_pulse, getPeakPower, getChiSq, chirpData, f_sum_table, sumTables2 (subset of find_pulse). Detects CPU abilities (currently x86), only tests what can run. Easy to add additional functions to test.

Josef Segur found a header file that has tick-reading versions for perhaps 10 different CPU types (PowerPC, SPARC, etc.) and compiler variations for GCC, IPP, and MSVCC versions...so CPU ticks should be fine.
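The time-then-validate idea can be sketched as follows. This is a rough illustration in plain Python, not the benchmark source file mentioned above: `pick_fastest`, `matches`, `power_ref`, and `power_alt` are hypothetical names, and `time.perf_counter_ns` stands in for reading CPU tick counters.

```python
import time

def power_ref(re, im):
    """Reference routine: power = re^2 + im^2, one element at a time."""
    out = []
    for r, i in zip(re, im):
        out.append(r * r + i * i)
    return out

def power_alt(re, im):
    """Candidate routine computing the same thing another way."""
    return [r * r + i * i for r, i in zip(re, im)]

def matches(expected, got, rel_tol):
    """True if got agrees with expected element by element (relative)."""
    return len(expected) == len(got) and all(
        abs(g - e) <= rel_tol * max(abs(e), abs(g))
        for e, g in zip(expected, got)
    )

def pick_fastest(reference, candidates, args, rel_tol=1e-6, repeats=5):
    """Time the reference and every candidate, drop candidates whose
    output disagrees with the reference, and return the fastest name
    together with the best time (in ns) of each valid routine."""
    expected = reference(*args)
    routines = {"reference": reference}
    routines.update(candidates)
    timings = {}
    for name, func in routines.items():
        best = None
        got = None
        for _ in range(repeats):
            start = time.perf_counter_ns()
            got = func(*args)
            elapsed = time.perf_counter_ns() - start
            best = elapsed if best is None else min(best, elapsed)
        if matches(expected, got, rel_tol):
            timings[name] = best
    return min(timings, key=timings.get), timings
```

Keeping the best of several runs reduces noise from other activity on the machine. The reference always stays in the pool, so there is a valid fallback even if every optimized candidate fails validation.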

Regarding the current verify loop: accuracy += pow(diff, 2) squares (orig[i]-test[i]), but if both orig and test are small values (1e-7 for example), the test values can be radically wrong and still add very little to the accuracy total. Suggest something relative, like accuracy += fabs(1-orig[i]/test[i]).
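A small numeric illustration of that point, assuming made-up sample values around 1e-7 in which every test value is off by a factor of two. The function names are hypothetical, and the relative metric as written assumes no test value is zero.

```python
def sum_squared_error(orig, test):
    """Accumulate squared absolute differences."""
    return sum((o - t) ** 2 for o, t in zip(orig, test))

def sum_relative_error(orig, test):
    """Accumulate |1 - orig/test|; assumes no test value is zero."""
    return sum(abs(1.0 - o / t) for o, t in zip(orig, test))

orig = [1e-7, 2e-7, 3e-7]
test = [2.0 * o for o in orig]  # every value is wrong by a factor of two

squared = sum_squared_error(orig, test)    # on the order of 1e-13
relative = sum_relative_error(orig, test)  # about 0.5 per element
print(squared, relative)
```

The squared-difference total looks like rounding noise even though every result is wrong by 100%, while the relative total makes the error obvious.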
ID: 6257 · Report as offensive
Tetsuji Maverick Rai
Project developer
Volunteer developer
Joined: 15 Jun 05
Posts: 399
Credit: 16,571,350
RAC: 0
Japan
Message 6259 - Posted: 13 Sep 2006, 1:44:24 UTC - in response to Message 6256.  
Last modified: 13 Sep 2006, 4:07:54 UTC

> Hi all,
> I'm not surprised because the power spectrum calculation timing is dominated by memory access speeds. Even adding prefetch instructions didn't help.
>
> Eric


PowerSpectrumCalculation can be drastically improved if the input arrays are in separate real/imag parts. As you notice, no shuffle instructions are required, so only 3 instructions calculate 4 spectra (mulps, mulps, addps), and it's amazingly fast. But as you see, the output of an FFT is in real/imag/real/imag... format (or, for some, both input and output are in separate real/imag arrays), and rearranging takes time. I tried it, and overall performance was a bit worse... :( So I was looking for an FFT function whose input is in real/imag/real/imag... format and whose output is in separate real/imag arrays, but I cannot find one. With such an FFT, PowerSpectrum could be calculated very quickly.

I think I know much of SSE3, but unfortunately so far I cannot produce a faster PowerSpectrum than icc with inline assembly. Ironically, icc is the best. I made an assembly version of v_GetPowerSpectrum() in the era of yaoscw-8.1 for Linux, but icc was faster.
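The two data layouts can be compared in a few lines. This is only an illustration in plain Python of the layout difference, not the SSE code itself, and the function names are hypothetical.

```python
def power_interleaved(data):
    """data = [re0, im0, re1, im1, ...], the usual FFT output layout."""
    return [data[k] * data[k] + data[k + 1] * data[k + 1]
            for k in range(0, len(data), 2)]

def split_planar(data):
    """Rearrange interleaved data into separate real and imaginary lists.
    This is the extra pass over memory that costs time."""
    return data[0::2], data[1::2]

def power_planar(re, im):
    """With split arrays, each output needs two multiplies and one add,
    which maps onto mulps, mulps, addps over four floats at a time."""
    return [r * r + i * i for r, i in zip(re, im)]

fft_out = [3.0, 4.0, 1.0, 2.0]          # two complex bins: 3+4i and 1+2i
re, im = split_planar(fft_out)
print(power_interleaved(fft_out))       # [25.0, 5.0]
print(power_planar(re, im))             # [25.0, 5.0]
```

Both paths give the same result; the planar version only wins if the split comes for free, which is why an FFT that writes separate real and imaginary arrays would help.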

That's why I stick to the original v_GetPowerSpectrum, especially with icc. I also tried this with hand-assembly, but icc beats me even though it produces longer code, because it often inlines functions (while it doesn't inline functions that contain inline assembly). IPP also provides GetPowerSpectrum, but icc produces a faster one.

So I personally conclude that under most circumstances ICC is the fastest for Intel, ironically. The best way is inlining the function :) (yes, it works with v_GetPowerSpectrum)

And I also tried to approximate sine/cosine with polynomials (up to 6th factor) http://setiathome.berkeley.edu/forum_thread.php?id=19865&nowrap=true#171295 but icc's math library beats it in speed and also in accuracy (as a matter of course!). But glibc's math library cannot beat it. I think I referred to http://www.weblearn.hs-bremen.de/risse/RST/docs/Parallel/03-041.pdf.

but I cannot afford time to do it again.
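For anyone who wants to experiment, here is a minimal sketch of a polynomial sine/cosine in plain Python. It uses a truncated Taylor series with range reduction, which is an assumption for illustration and not necessarily the polynomial from the linked post or paper; `poly_sin` and `poly_cos` are hypothetical names.

```python
import math

def poly_sin(x):
    """Approximate sin(x) with the Taylor terms up to x^7."""
    # Reduce to [-pi, pi] using the 2*pi periodicity.
    x = x - 2.0 * math.pi * round(x / (2.0 * math.pi))
    # Fold into [-pi/2, pi/2] using sin(pi - x) = sin(x).
    if x > math.pi / 2:
        x = math.pi - x
    elif x < -math.pi / 2:
        x = -math.pi - x
    x2 = x * x
    # Horner form of x - x^3/3! + x^5/5! - x^7/7!
    return x * (1.0 + x2 * (-1.0 / 6.0 + x2 * (1.0 / 120.0 - x2 / 5040.0)))

def poly_cos(x):
    """cos(x) = sin(x + pi/2)."""
    return poly_sin(x + math.pi / 2)

# Worst-case error against the math library over [-10, 10].
max_err = max(abs(poly_sin(k * 0.01) - math.sin(k * 0.01))
              for k in range(-1000, 1001))
```

With these four terms the worst-case error stays below 2e-4, largest near the ends of the folded interval. A fitted (minimax) polynomial of the same degree would do better, which is one reason a tuned math library is hard to beat on accuracy.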

regards

-Tetsuji

PS: Now I understand why you moved to devcpp on Window$.
Luckiest in the world. WMD = Weapon of Mass Distraction
ID: 6259 · Report as offensive
BenHer
Volunteer tester

Joined: 12 Sep 06
Posts: 9
Credit: 0
RAC: 0
United States
Message 6262 - Posted: 13 Sep 2006, 16:40:44 UTC - in response to Message 6259.  

> And I also tried to approximate sine/cosine with polynomials (up to 6th factor) http://setiathome.berkeley.edu/forum_thread.php?id=19865&nowrap=true#171295 but icc's math library beats it in speed and also in accuracy (as a matter of course!). But glibc's math library cannot beat it. I think I referred to http://www.weblearn.hs-bremen.de/risse/RST/docs/Parallel/03-041.pdf


Alex Kan's sse3 vectorized sin/cos approximation is the fastest chirp I've seen yet. Faster than the 32MB table. And accurate too...
        Funcname: cpu ticks  speedup  accuracy
  orig_ChirpData: 286898730  x1.00    0
       TrigArray:  60655122  x4.73    1.4e+006
        unrolled:  58550761  x4.90    1.4e+006
  aks_sse3_chirp:  41761096  x6.87    8.9e-009
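The speedup factors in the table are simply the baseline tick count divided by each routine's tick count. A short sketch that recomputes them from the posted numbers (the `speedups` helper is a hypothetical name):

```python
def speedups(ticks_by_name, baseline):
    """Return baseline_ticks / ticks for every routine."""
    base = ticks_by_name[baseline]
    return {name: base / ticks for name, ticks in ticks_by_name.items()}

ticks = {
    "orig_ChirpData": 286898730,
    "TrigArray": 60655122,
    "unrolled": 58550761,
    "aks_sse3_chirp": 41761096,
}

factors = speedups(ticks, "orig_ChirpData")
for name, factor in factors.items():
    print(f"{name:>16}: x{factor:.2f}")
```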


ID: 6262 · Report as offensive
KWSN - Chicken of Angnor
Volunteer tester
Posts: 22
Credit: 190,000
RAC: 0
Austria
Message 6264 - Posted: 14 Sep 2006, 1:58:36 UTC

Hi,

Eric and Tetsuji, you are of course both welcome to join in at http://lunatics.at.
I had not specifically offered this to you only because I assumed your timetable is too swamped as it is (not wholly wrong there, from your posts).

Kind regards,
Simon.
ID: 6264 · Report as offensive
HTH
Volunteer tester

Joined: 3 Mar 06
Posts: 261
Credit: 223,125
RAC: 0
Finland
Message 6379 - Posted: 1 Oct 2006, 6:29:58 UTC - in response to Message 6256.  

> I wasn't aware that tarballs had stopped being generated. I'll try to fix it today.

What is a tarball?

Mars 2019 Petition <-- Sign, please.
ID: 6379 · Report as offensive
EdwardPF
Volunteer tester

Joined: 8 Sep 05
Posts: 82
Credit: 545,522
RAC: 0
United States
Message 6382 - Posted: 1 Oct 2006, 13:29:20 UTC

I believe it's a Tape ARchive (tar) file (Unix)

see
http://en.wikipedia.org/wiki/Tar.gz


ID: 6382 · Report as offensive

©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.