Lower Lamarck (Feb 02 2009)

Message boards : Technical News : Lower Lamarck (Feb 02 2009)
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 861270 - Posted: 2 Feb 2009, 21:54:21 UTC

Happy Monday everybody. I guess I should move on from the January thread title theme (odd little towns/places/features in southern Utah which I've been to during many nearly-annual backpacking/hiking adventures in the area - easily one of the best parts of the U.S.).

We did almost run out of data files to split (to generate workunits) over the weekend. This was due to (a) waiting on data drives to be shipped up from Arecibo, and (b) HPSS (the offsite archival storage) being down for several days last week for an upgrade - so we couldn't download any unanalyzed data from there until the weekend. Jeff got that transfer started once HPSS was back up. We also got the data drives, and I'm reading in some now.

The Astropulse splitters have been deliberately off for several reasons, including to allow SETI@home to catch up. We may also increase the dispersion measure analysis range, which will vastly increase the scientific output of Astropulse while having the beneficial side effect of making workunits take longer to process (and thus helping to reduce our bandwidth constraint woes). However, word on the street is that some optimizations have been uncovered which may speed Astropulse back up again. We shall see how this all plays out. I'm all for optimized code, even if that means bandwidth headaches.
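A rough sketch of why widening the DM range slows processing (illustrative numbers only, not SETI@home's actual pipeline): a brute-force dedispersion search does a fixed amount of work per dispersion measure trial, so runtime scales roughly linearly with the number of trials searched.

```python
# Illustrative sketch, not SETI@home code: in a brute-force dedispersion
# search, each dispersion measure (DM) trial shifts and sums every
# frequency channel of every time sample, so total work scales linearly
# with the number of DM trials searched.
def dedispersion_ops(n_samples, n_channels, n_dm_trials):
    """Rough operation count for an incoherent dedispersion search."""
    return n_samples * n_channels * n_dm_trials

base = dedispersion_ops(n_samples=1_000_000, n_channels=256, n_dm_trials=100)
wider = dedispersion_ops(n_samples=1_000_000, n_channels=256, n_dm_trials=300)
print(wider / base)  # 3.0 - tripling the DM trials triples the work
```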

Speaking of bandwidth, we seem to be either maxed out or at zero lately. This is mostly due to massive indigestion - a couple weeks ago a bug in the scheduler sent out a ton of excess work, largely to CUDA clients. It took forever for these clients to download the workunits but they eventually did, and now the results are coming back en masse. This means the queries/sec rate on mysql went up about 50% on average for the past several days, which in turn caused the database to start paging to the point where queries backed up for hours, hence the traffic dips (and some web site slowness). We all agreed this morning that this would pass eventually and it'll just be slightly painful until it does. Maybe the worst is behind us.
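The "queries backed up for hours" effect follows from simple queueing arithmetic (the rates below are made up for illustration, not measured SETI@home numbers): whenever sustained arrivals exceed what the paging database can service, the backlog grows linearly until the surge of returning results tails off.

```python
# Illustrative queueing arithmetic (rates are made-up placeholders, not
# measured SETI@home numbers): when sustained query arrivals exceed what
# the paging database can service, unanswered queries pile up linearly.
def backlog_after(arrival_per_sec, service_per_sec, seconds):
    """Queries still waiting after `seconds` of sustained overload."""
    return max(0, (arrival_per_sec - service_per_sec) * seconds)

# A 50% jump in queries/sec against a server that can only sustain the
# old rate piles up 180,000 queries in a single hour:
print(backlog_after(arrival_per_sec=150, service_per_sec=100, seconds=3600))
```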

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 861284 - Posted: 2 Feb 2009, 22:28:54 UTC

How much unanalyzed data is there on the HPSS?
piper69
Joined: 25 Sep 08
Posts: 49
Credit: 3,042,244
RAC: 0
Romania
Message 861288 - Posted: 2 Feb 2009, 22:37:53 UTC

thx as usual for the update Matt
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 861299 - Posted: 2 Feb 2009, 22:53:45 UTC - in response to Message 861284.  

How much unanalyzed data is there on the HPSS?


Unclear - it would take some scripting/scanning to get an accurate answer. My gut says we're talking hundreds of files. At least a hundred.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 18996
Credit: 40,757,560
RAC: 67
United Kingdom
Message 861304 - Posted: 2 Feb 2009, 22:56:35 UTC

I hope that Josh and Eric don't think that AP 5.01 is ready for release yet. It cannot have been fully tested yet, because the processing time on a Windows Core 2 Quad has gone from <40 hrs to ~120 hrs.
As it is now less than 6 days since it was released (Josh's announcement was 28 Jan 2009, 1:52:29 UTC), only those hosts that run Beta 24/7 and allow Beta at least one CPU core will have returned any results yet.
Also, given the limited posts on the subject at Beta, I assume there are, relatively speaking, very few people testing this app, just the normal hardcore testers.

Let's not have another CUDA disaster.
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 861308 - Posted: 2 Feb 2009, 23:02:41 UTC - in response to Message 861299.  

How much unanalyzed data is there on the HPSS?


Unclear - it would take some scripting/scanning to get an accurate answer. My gut says we're talking hundreds of files. At least a hundred.

- Matt


You mean 100 files, each about 50 GB? Should we consider this a lot? thx.
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 861311 - Posted: 2 Feb 2009, 23:06:32 UTC - in response to Message 861308.  

You mean 100 files, each about 50 GB? Should we consider this a lot? thx.


Yes (100 50GB files), and it depends what you mean (it's more than enough data to get us through several dry spells like this past weekend).
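A quick back-of-envelope check of the figure above (illustrative only, using the round numbers from the thread):

```python
# Back-of-envelope check using the round numbers quoted in the thread:
# roughly 100 files at about 50 GB each.
files = 100
gb_per_file = 50
total_gb = files * gb_per_file
print(total_gb)  # 5000 GB, i.e. roughly 5 TB of unanalyzed data
```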

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 861330 - Posted: 2 Feb 2009, 23:38:30 UTC - in response to Message 861304.  

I hope that Josh and Eric don't think that AP 5.01 is ready for release yet. It cannot have been fully tested yet, because the processing time on a Windows Core 2 Quad has gone from <40 hrs to ~120 hrs.
As it is now less than 6 days since it was released (Josh's announcement was 28 Jan 2009, 1:52:29 UTC), only those hosts that run Beta 24/7 and allow Beta at least one CPU core will have returned any results yet.
Also, given the limited posts on the subject at Beta, I assume there are, relatively speaking, very few people testing this app, just the normal hardcore testers.

Let's not have another CUDA disaster.

According to Joe Segur posting at Lunatics (and he's good on this sort of thing), they only split 202 WUs at Beta for this test run. I don't think that's enough for a valid test. I'm crunching one on my fastest machine, and it's only at 65% after 3 days 16 hours.
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 861331 - Posted: 2 Feb 2009, 23:42:05 UTC - in response to Message 861330.  

3d16h?? :) Matt did say he wanted to 'slow' things down a bit. LOL

Is it at all viable to test these 'new' algorithms using a dedicated supercomputer before releasing to beta and then to 'us'? From personal experience, supercomputer time is available for small projects at the national centers.
speedimic
Volunteer tester
Joined: 28 Sep 02
Posts: 362
Credit: 16,590,653
RAC: 0
Germany
Message 861343 - Posted: 3 Feb 2009, 0:05:48 UTC

I'm not quite sure if it's good to make the APs even longer.
Many of those are already aborted because they take much longer than the 'usual' WUs. As a consequence, it takes ages to get two valid results together (and 'pay out'), so RAC is falling and people opt out.
Not to mention those half-done WUs lingering on the server.

Might be better to split bigger MB chunks...

mic.


Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 861367 - Posted: 3 Feb 2009, 0:38:40 UTC - in response to Message 861331.  

3d16h?? :) Matt did say he wanted to 'slow' things down a bit. LOL

Is it at all viable to test these 'new' algorithms using a dedicated supercomputer before releasing to beta and then to 'us'? From personal experience, supercomputer time is available for small projects at the national centers.

It's OK, don't panic.

Debug Beta builds are often compiled without optimisation - makes it easier for the developers to track what's happening (allegedly).

The next Astropulse release, once optimised, will maybe run 50% longer than the current version (whether in stock or Lunatics versions) - and that is because of genuine additional searching ('negative DM', in the jargon). Not a problem.
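The "about 50% longer" estimate is consistent with simple arithmetic (the per-trial speedup below is a hypothetical placeholder, not a measured number): searching negative DMs roughly doubles the trial count, and a per-trial optimisation can pull the net runtime back down.

```python
# Illustrative arithmetic (the 25% per-trial speedup is a hypothetical
# placeholder, not a measured number): extending the search from
# [0, DMmax] to [-DMmax, DMmax] doubles the number of DM trials, and a
# per-trial optimisation can pull the net runtime back toward the
# "about 50% longer" figure mentioned above.
old_trials = 1.0
new_trials = 2.0 * old_trials       # negative DMs double the trial count
cost_per_trial = 0.75               # hypothetical 25% per-trial speedup
print(new_trials * cost_per_trial)  # 1.5 -> about 50% longer than before
```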
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 861389 - Posted: 3 Feb 2009, 1:35:08 UTC - in response to Message 861330.  

According to Joe Segur posting at Lunatics (and he's good on this sort of thing), they only split 202 WUs at Beta for this test run. I don't think that's enough for a valid test. I'm crunching one on my fastest machine, and it's only at 65% after 3 days 16 hours.

That was just the first run after 5.01 install at Beta. They've split more as needed, total 1806 so far.
                                                                   Joe
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 861393 - Posted: 3 Feb 2009, 1:49:21 UTC - in response to Message 861367.  


The next Astropulse release, once optimised, will maybe run 50% longer than the current version (whether in stock or Lunatics versions) - and that is because of genuine additional searching ('negative DM', in the jargon). Not a problem.


Should I infer that the wu's already processed will need to be reprocessed to apply the 'genuine' additional searching?
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 18996
Credit: 40,757,560
RAC: 67
United Kingdom
Message 861408 - Posted: 3 Feb 2009, 2:14:53 UTC - in response to Message 861389.  

According to Joe Segur posting at Lunatics (and he's good on this sort of thing), they only split 202 WUs at Beta for this test run. I don't think that's enough for a valid test. I'm crunching one on my fastest machine, and it's only at 65% after 3 days 16 hours.

That was just the first run after 5.01 install at Beta. They've split more as needed, total 1806 so far.
                                                                   Joe

They probably were not needed. It was just a perception by BOINC based on DCF, est_flops, etc. Mine arrived with an estimated time to completion of ~32 hrs, so it downloaded three tasks. But with Beta's resource share and the actual processing time, even one task was OTT. I had to adjust things so that the first task can be returned in a reasonable time frame. Set to NNT, and I will await news before processing the other two tasks.
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 861558 - Posted: 3 Feb 2009, 9:26:41 UTC - in response to Message 861270.  

However, word on the street is that some optimizations have been uncovered which may speed Astropulse back up again. We shall see how this all plays out. I'm all for optimized code, even if that means bandwidth headaches.

Thanks for the endorsement of optimized code.

But ..... ahem, how to put this delicately? Matt, you really ought to get out more - specifically to Number Crunching.

The optimized code of which you speak has been on public release since 10 October 2008. That was Astropulse v4.35, of course: optimised v5.00 applications were made publicly available on 21 November 2008, just a couple of days after that version was launched by the project.

I think you'll find quite a lot of optimised work in your database already - all fully validated against the stock application, of course, otherwise it wouldn't get into the database.
Karsten Vinding
Volunteer tester
Joined: 18 May 99
Posts: 239
Credit: 25,201,931
RAC: 11
Denmark
Message 861631 - Posted: 3 Feb 2009, 14:52:32 UTC

Richard:

I think what Matt is talking about is upcoming changes to the AP application that use revised radar blanking code and do FFAs on negative dispersion as well as positive (the current version only searches on positives, AFAIK).

Optimized code for this is well on its way; I think that is what he is referring to.

©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.