Lower Lamarck (Feb 02 2009)



Message boards : Technical News : Lower Lamarck (Feb 02 2009)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 861270 - Posted: 2 Feb 2009, 21:54:21 UTC

Happy Monday everybody. I guess I should move on from the January thread title theme (odd little towns/places/features in southern Utah which I've been to during many nearly-annual backpacking/hiking adventures in the area - easily one of the best parts of the U.S.).

We did almost run out of data files to split (to generate workunits) over the weekend. This was due to (a) awaiting data drives to be shipped up from Arecibo and (b) HPSS (the offsite archival storage) being down for several days last week for an upgrade - so we couldn't download any unanalyzed data from there until the weekend. Jeff got that transfer started once HPSS was back up. We also got the data drives, and I'm reading in some now.

The Astropulse splitters have been deliberately off for several reasons, including to allow SETI@home to catch up. We also may increase the dispersion measure analysis range which will vastly increase the scientific output of Astropulse while having the beneficial side effect of taking longer to process (and thus helping to reduce our bandwidth constraint woes). However, word on the street is that some optimizations have been uncovered which may speed Astropulse back up again. We shall see how this all plays out. I'm all for optimized code, even if that means bandwidth headaches.
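For readers unfamiliar with the dispersion measure (DM) jargon: each DM value searched amounts to an extra dedispersion pass over the data, so widening the DM range increases processing time roughly linearly. Here is a rough illustrative sketch - not SETI@home's actual code, and the band edges, DM range, and step size are made-up placeholders:

```python
# Rough illustration of why a wider dispersion-measure (DM) search
# takes longer: each DM trial is one more pass over the data.
# Not SETI@home's actual code; all numbers are illustrative.
K_DM = 4.148808e3  # dispersion constant in MHz^2 pc^-1 cm^3 s

def dispersion_delay_s(dm, f_low_mhz, f_high_mhz):
    """Arrival-time delay across the band for a given DM (cold-plasma law)."""
    return K_DM * dm * (f_low_mhz ** -2 - f_high_mhz ** -2)

def dm_trials(dm_min, dm_max, step):
    """List of DM values to search; cost scales with the length of this list."""
    n = int(round((dm_max - dm_min) / step)) + 1
    return [dm_min + i * step for i in range(n)]

positive_only = dm_trials(0.0, 800.0, 1.0)     # hypothetical current range
with_negative = dm_trials(-800.0, 800.0, 1.0)  # hypothetical extended range
print(len(positive_only), len(with_negative))  # 801 1601
```

Adding the mirror-image negative-DM range roughly doubles the trial count, and hence the dedispersion work per workunit, before any compensating code optimizations.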

Speaking of bandwidth, we seem to be either maxed out or at zero lately. This is mostly due to massive indigestion - a couple weeks ago a bug in the scheduler sent out a ton of excess work, largely to CUDA clients. It took forever for these clients to download the workunits but they eventually did, and now the results are coming back en masse. This means the queries/sec rate on mysql went up about 50% on average for the past several days, which in turn caused the database to start paging to the point where queries backed up for hours, hence the traffic dips (and some web site slowness). We all agreed this morning that this would pass eventually and it'll just be slightly painful until it does. Maybe the worst is behind us.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 22,219,503
RAC: 4,361
United States
Message 861284 - Posted: 2 Feb 2009, 22:28:54 UTC

How much unanalyzed data is there on the HPSS?

piper69
Joined: 25 Sep 08
Posts: 49
Credit: 3,042,244
RAC: 0
Romania
Message 861288 - Posted: 2 Feb 2009, 22:37:53 UTC

thx as usual for the update Matt

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 861299 - Posted: 2 Feb 2009, 22:53:45 UTC - in response to Message 861284.

How much unanalyzed data is there on the HPSS?


Unclear - it would take some scripting/scanning to get an accurate answer. My gut says we're talking hundreds of files. At least a hundred.

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8644
Credit: 24,393,751
RAC: 26,349
United Kingdom
Message 861304 - Posted: 2 Feb 2009, 22:56:35 UTC

I hope that Josh and Eric don't think AP 5.01 is ready for release yet. It cannot have been adequately tested: the processing time on a Windows Core 2 Quad has gone from <40 hrs to ~120 hrs.
Since it is now less than 6 days since it was released (Josh's announcement was 28 Jan 2009, 1:52:29 UTC), only those hosts that run Beta 24/7 and allow Beta at least one CPU core will have returned any results yet.
Also, given the limited posts on the subject at Beta, I assume there are, relatively speaking, very few people testing this app - just the usual hardcore testers.

Let's not have another CUDA disaster.

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 22,219,503
RAC: 4,361
United States
Message 861308 - Posted: 2 Feb 2009, 23:02:41 UTC - in response to Message 861299.

How much unanalyzed data is there on the HPSS?


Unclear - it would take some scripting/scanning to get an accurate answer. My gut says we're talking hundreds of files. At least a hundred.

- Matt


You mean 100 files, each about 50 GB? Should we consider this a lot? thx.

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 861311 - Posted: 2 Feb 2009, 23:06:32 UTC - in response to Message 861308.

You mean 100 files, each about 50 GB? Should we consider this a lot? thx.


Yes (100 files of ~50 GB each), and it depends what you mean (it's more than enough data to get us through several dry spells like this past weekend).

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
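For scale, the figures in Matt's reply work out as below. The daily consumption rate is a hypothetical placeholder, purely to show the shape of the calculation, not a project figure:

```python
# Back-of-the-envelope scale of the unsplit data discussed above.
files = 100                 # "at least a hundred" files on HPSS
gb_per_file = 50            # ~50 GB each
total_gb = files * gb_per_file
daily_consumption_gb = 200  # hypothetical splitter throughput
print(total_gb, total_gb / daily_consumption_gb)  # 5000 25.0
```

So roughly 5 TB in round numbers - weeks of work at any plausible splitting rate, consistent with "several dry spells" of headroom.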

Richard HaselgroveProject donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8550
Credit: 50,380,753
RAC: 50,632
United Kingdom
Message 861330 - Posted: 2 Feb 2009, 23:38:30 UTC - in response to Message 861304.

I hope that Josh and Eric don't think AP 5.01 is ready for release yet. It cannot have been adequately tested: the processing time on a Windows Core 2 Quad has gone from <40 hrs to ~120 hrs.
Since it is now less than 6 days since it was released (Josh's announcement was 28 Jan 2009, 1:52:29 UTC), only those hosts that run Beta 24/7 and allow Beta at least one CPU core will have returned any results yet.
Also, given the limited posts on the subject at Beta, I assume there are, relatively speaking, very few people testing this app - just the usual hardcore testers.

Let's not have another CUDA disaster.

According to Joe Segur posting at Lunatics (and he's good on this sort of thing), they only split 202 WUs at Beta for this test run. I don't think that's enough for a valid test. I'm crunching one on my fastest machine, and it's only at 65% after 3 days 16 hours.

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 22,219,503
RAC: 4,361
United States
Message 861331 - Posted: 2 Feb 2009, 23:42:05 UTC - in response to Message 861330.

3d16h?? :) Matt did say he wanted to 'slow' things down a bit. LOL

Is it at all viable to test these 'new' algorithms using a dedicated supercomputer before releasing to beta and then to 'us'? From personal experience, supercomputer time is available for small projects at the national centers.

Profile speedimic
Volunteer tester
Joined: 28 Sep 02
Posts: 362
Credit: 16,590,653
RAC: 0
Germany
Message 861343 - Posted: 3 Feb 2009, 0:05:48 UTC

I'm not quite sure it's good to make the APs even longer.
Many of them are already aborted because they take much longer than the 'usual' WUs. As a consequence, it takes ages to get two valid results together (and 'pay out'), so RAC falls and people opt out.
Not to mention the half-done WUs lingering on the server.

Might be better to split bigger MB chunks...

____________
mic.


Richard HaselgroveProject donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8550
Credit: 50,380,753
RAC: 50,632
United Kingdom
Message 861367 - Posted: 3 Feb 2009, 0:38:40 UTC - in response to Message 861331.

3d16h?? :) Matt did say he wanted to 'slow' things down a bit. LOL

Is it at all viable to test these 'new' algorithms using a dedicated supercomputer before releasing to beta and then to 'us'? From personal experience, supercomputer time is available for small projects at the national centers.

It's OK, don't panic.

Debug Beta builds are often compiled without optimisation - makes it easier for the developers to track what's happening (allegedly).

The next Astropulse release, once optimised, will maybe run 50% longer than the current version (whether in stock or Lunatics versions) - and that is because of genuine additional searching ('negative DM', in the jargon). Not a problem.

Josef W. SegurProject donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4252
Credit: 1,050,380
RAC: 249
United States
Message 861389 - Posted: 3 Feb 2009, 1:35:08 UTC - in response to Message 861330.

According to Joe Segur posting at Lunatics (and he's good on this sort of thing), they only split 202 WUs at Beta for this test run. I don't think that's enough for a valid test. I'm crunching one on my fastest machine, and it's only at 65% after 3 days 16 hours.

That was just the first run after 5.01 install at Beta. They've split more as needed, total 1806 so far.
Joe

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 22,219,503
RAC: 4,361
United States
Message 861393 - Posted: 3 Feb 2009, 1:49:21 UTC - in response to Message 861367.


The next Astropulse release, once optimised, will maybe run 50% longer than the current version (whether in stock or Lunatics versions) - and that is because of genuine additional searching ('negative DM', in the jargon). Not a problem.


Should I infer that the wu's already processed will need to be reprocessed to apply the 'genuine' additional searching?

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8644
Credit: 24,393,751
RAC: 26,349
United Kingdom
Message 861408 - Posted: 3 Feb 2009, 2:14:53 UTC - in response to Message 861389.

According to Joe Segur posting at Lunatics (and he's good on this sort of thing), they only split 202 WUs at Beta for this test run. I don't think that's enough for a valid test. I'm crunching one on my fastest machine, and it's only at 65% after 3 days 16 hours.

That was just the first run after 5.01 install at Beta. They've split more as needed, total 1806 so far.
Joe

They probably weren't needed. It was just BOINC's perception, based on DCF, est_flops, etc. Mine arrived with an estimated time to completion of ~32 hrs, so it downloaded three tasks. But given Beta's resource share and the actual processing time, even one task was OTT. I had to adjust things so that the first task can be returned in a reasonable time frame. Set to NNT, and I will await news before processing the other two tasks.
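The DCF/est_flops effect described here can be sketched as a simplified model of how an older BOINC client estimates task runtime: the workunit's FLOP estimate divided by the host's benchmarked speed, scaled by the per-project duration correction factor. The numbers below are invented for illustration, not taken from a real AP 5.01 workunit:

```python
# Simplified model of a classic BOINC runtime estimate:
#   estimate = workunit rsc_fpops_est / benchmarked FLOPS * DCF
# All numbers below are illustrative, not real AP 5.01 values.
def estimated_hours(rsc_fpops_est, benchmark_flops, dcf):
    return rsc_fpops_est / benchmark_flops * dcf / 3600.0

# If the server-side fpops estimate still reflects the old, faster app,
# the client predicts ~32 h and happily fetches several tasks...
optimistic = estimated_hours(2.5e14, 2.2e9, 1.0)
# ...while the real task takes far longer; DCF only corrects future
# estimates after the first result is actually returned.
print(round(optimistic, 1))  # 31.6
```

This is why a slower app can trigger over-fetching: the correction is retroactive, arriving only once a task has completed at the true speed.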

Richard HaselgroveProject donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8550
Credit: 50,380,753
RAC: 50,632
United Kingdom
Message 861558 - Posted: 3 Feb 2009, 9:26:41 UTC - in response to Message 861270.

However, word on the street is that some optimizations have been uncovered which may speed Astropulse back up again. We shall see how this all plays out. I'm all for optimized code, even if that means bandwidth headaches.

Thanks for the endorsement of optimized code.

But ..... ahem, how to put this delicately? Matt, you really ought to get out more - specifically to Number Crunching.

The optimized code of which you speak has been on public release since 10 October 2008. That was Astropulse v4.35, of course: optimised v5.00 applications were made publicly available on 21 November 2008, just a couple of days after that version was launched by the project.

I think you'll find quite a lot of optimised work in your database already - all fully validated against the stock application, of course, otherwise it wouldn't get into the database.

Profile Karsten Vinding
Volunteer tester
Joined: 18 May 99
Posts: 140
Credit: 16,608,240
RAC: 2,905
Denmark
Message 861631 - Posted: 3 Feb 2009, 14:52:32 UTC

Richard:

I think what Matt is talking about is upcoming changes to the AP application, which use revised radar-blanking code and do FFAs on negative dispersion as well as positive (the current app only searches positive DMs, AFAIK).

Optimized code for this is well on its way; I think that is what he is referring to.
____________


Copyright © 2014 University of California