Rain (Aug 04 2008)

Message boards : Technical News : Rain (Aug 04 2008)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 792874 - Posted: 4 Aug 2008, 21:37:18 UTC

Another wacky weekend for us. Astropulse is still ramping up - we're creating work, sending it out, receiving results back and assimilating them. However, the validator stopped granting credit for these workunits - something we'll fix, and we can also retroactively give people their credit. The workunit storage server ran low on room again, which is the bottleneck that's been giving everybody headaches over the weekend: the splitters could only create work as fast as workunits got deleted off disk. Right now things are generally running slow as I'm moving stuff off the workunit server to make room, causing lots of excess internal I/O. As an added bonus, the MySQL database replica server crashed this morning - it ran out of memory. No harm done, but it looks like it'll take a while to catch up again (it's been lagging behind all weekend). I would like to try to split the numbers on the status page between the two different applications (SETI@home/Astropulse), but those extra "where" clauses make the queries run forever.

In better news, it looks like we got our new home-grown NAS/RAID box working as we'd like it, so we may start employing that sooner rather than later (thus freeing up lots of room/power in our server closet). Also, all drive issues on our science database server over the past couple of weeks have been completely dealt with at this point. Well... there's one lingering corrupted index, which we'll try to rebuild tomorrow during the outage.

I was actually out of the loop since Thursday as I went up to Seattle to play a gig on the main stage at the Microsoft Techready conference at Bell Harbor. Anybody around here attend that thing? Fun show/event, but the stage tent was completely inadequate and the entire band got soaked by rain and sea mist. I'm amazed none of us were electrocuted.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 792874
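
For readers wondering what "splitting the numbers between the two applications" could look like, here is a minimal sketch. It is not the project's actual status-page code: the `app` and `result` tables, their columns, the state codes and the index are all hypothetical stand-ins, and it runs against an in-memory SQLite database rather than the MySQL replica described above. It shows one grouped query returning per-application counts in a single pass, as an alternative to issuing one filtered count per application. Whether that would help on the real database is not something this sketch can show.

```python
import sqlite3


def counts_by_app(conn):
    """Return {(app_name, server_state): count} using one grouped query."""
    rows = conn.execute(
        "SELECT app.name, result.server_state, COUNT(*) "
        "FROM result JOIN app ON app.id = result.appid "
        "GROUP BY app.name, result.server_state "
        "ORDER BY app.name, result.server_state"
    ).fetchall()
    return {(name, state): n for name, state, n in rows}


# Hypothetical schema and toy data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE result (
        id INTEGER PRIMARY KEY,
        appid INTEGER,
        server_state INTEGER
    );
    -- A composite index lets the grouping be answered from the index alone.
    CREATE INDEX result_app_state ON result (appid, server_state);
    INSERT INTO app VALUES (1, 'setiathome'), (2, 'astropulse');
    INSERT INTO result (appid, server_state)
        VALUES (1, 2), (1, 2), (1, 4), (2, 4), (2, 5);
    """
)

status_counts = counts_by_app(conn)
for (app_name, state), n in status_counts.items():
    print(app_name, state, n)
```

The state codes 2, 4 and 5 are arbitrary placeholders; only the shape of the query matters here.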
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 792877 - Posted: 4 Aug 2008, 21:44:38 UTC - in response to Message 792874.  

I was actually out of the loop since Thursday as I went up to Seattle to play a gig on the main stage at the Microsoft Techready conference at Bell Harbor.



Sshhhhh! Nobody tell ML1 about this. Everybody, stand in front of the letters so ML1 can't read them. He doesn't have to know about this. :)
ID: 792877
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20147
Credit: 7,508,002
RAC: 20
United Kingdom
Message 792885 - Posted: 4 Aug 2008, 22:14:40 UTC - in response to Message 792877.  

I was actually out of the loop since Thursday as I went up to Seattle to play a gig on the main stage at the Microsoft Techready conference at Bell Harbor.

Sshhhhh! Nobody tell ML1 about this. Everybody, stand in front of the letters so ML1 can't read them. He doesn't have to know about this. :)

Someone's got to do it, and they're welcome to fund some hardware. Might even be fun! Just watch out for the EULA...

Shame about the soaking.

Happy crunchin',
Martin

ps: Thanks for the special mention!


See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 792885
Mumps [MM]
Volunteer tester
Joined: 11 Feb 08
Posts: 4454
Credit: 100,893,853
RAC: 30
United States
Message 792906 - Posted: 4 Aug 2008, 22:41:29 UTC
Last modified: 4 Aug 2008, 23:04:58 UTC

Well, if the disk is full because of all the WU's "in the wild" wouldn't those "Ghost WU's" for AstroPulse be a large culprit? I have just peeked at one of my hosts that has no work and the server side thinks it has 2 AP WU's. But it's running the AK 8.0 Opti App, and is not configured to run AP. So that's two of those "larger" WU's that will be sitting there for a month before it gets "No Reply"d out and hopefully processed by a more suitable host. :-)

If this is the case for even a small percentage of the hosts out there, we could find a significant amount of disk space tied up that'll wait a month before the system realizes it needs to go to some other host.

And I see at least on one of them, my wingmate is also running an older Opti App. (Which implies to me they don't even read the forums so they wouldn't know to have manually updated the app_info.xml.) So neither party will be returning that one.

EDIT: Then there's the other problem mentioned in the Number Crunching threads of hosts that shouldn't be getting work that suddenly are. Not the most eloquent of threads, but each host out there like this could also reflect a significant number of WU's "in the wild" that won't come back until they miss deadline...
ID: 792906
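
The worry in this post comes down to simple multiplication, sketched below. Only the "2 ghost WUs on one host" figure comes from the post. The host count and the per-workunit size in the usage example are made-up illustrative inputs, not measurements, and the function name is hypothetical.

```python
def ghost_disk_bytes(affected_hosts: int, ghosts_per_host: int, wu_bytes: int) -> int:
    """Upper-bound estimate of storage held by ghost workunits.

    Assumes every ghost result pins a distinct workunit file on the server
    until its deadline passes. Two wingmates ghosting the same workunit
    would be counted twice, so this overestimates in that case.
    """
    return affected_hosts * ghosts_per_host * wu_bytes


# Illustrative inputs only: 1,000 affected hosts and 8 MiB per workunit are
# assumptions; 2 ghosts per host matches the single host described above.
example_bytes = ghost_disk_bytes(
    affected_hosts=1_000,
    ghosts_per_host=2,
    wu_bytes=8 * 1024**2,
)
example_gib = example_bytes / 1024**3
print(f"{example_gib:.3f} GiB tied up until the deadlines pass")
```

Swapping in real host counts and workunit sizes would show whether ghosts matter next to the total storage in use.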
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 792918 - Posted: 4 Aug 2008, 23:08:20 UTC - in response to Message 792906.  

Well, if the disk is full because of all the WU's "in the wild" wouldn't those "Ghost WU's" for AstroPulse be a large culprit? I have just peeked at one of my hosts that has no work and the server side thinks it has 2 AP WU's. But it's running the AK 8.0 Opti App, and is not configured to run AP. So that's two of those "larger" WU's that will be sitting there for a month before it gets "No Reply"d out and hopefully processed by a more suitable host. :-)


Don't think so... there's not that many AP WU's out there, yet...


If this is the case for even a small percentage of the hosts out there, we could find a significant amount of disk space tied up that'll wait a month before the system realizes it needs to go to some other host.

And I see at least on one of them, my wingmate is also running an older Opti App. (Which implies to me they don't even read the forums so they wouldn't know to have manually updated the app_info.xml.) So neither party will be returning that one.

EDIT: Then there's the other problem mentioned in the Number Crunching threads of hosts that shouldn't be getting work that suddenly are. Not the most eloquent of threads, but each host out there like this could also reflect a significant number of WU's "in the wild" that won't come back until they miss deadline...


.

Hello, from Albany, CA!...
ID: 792918
Mumps [MM]
Volunteer tester
Joined: 11 Feb 08
Posts: 4454
Credit: 100,893,853
RAC: 30
United States
Message 792926 - Posted: 4 Aug 2008, 23:23:01 UTC - in response to Message 792918.  

Well, if the disk is full because of all the WU's "in the wild" wouldn't those "Ghost WU's" for AstroPulse be a large culprit? I have just peeked at one of my hosts that has no work and the server side thinks it has 2 AP WU's. But it's running the AK 8.0 Opti App, and is not configured to run AP. So that's two of those "larger" WU's that will be sitting there for a month before it gets "No Reply"d out and hopefully processed by a more suitable host. :-)


Don't think so... there's not that many AP WU's out there, yet...

But according to this post by Joe, an AP WU may be roughly 40 times the size of a MB WU. So wouldn't that consume that much more space on the Download Server while it's hanging around waiting to get returned? :-)
ID: 792926
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 793085 - Posted: 5 Aug 2008, 4:35:11 UTC - in response to Message 792926.  

But according to this post by Joe, an AP WU may be roughly 40 times the size of a MB WU. So wouldn't that consume that much more space on the Download Server while it's hanging around waiting to get returned? :-)

AP WUs have exactly 32 times as much data as MB ones. The size is slightly less than 32 times since AP WUs use pure binary for the data section rather than having it broken up into lines, and the header is shorter too.

The 40.4 ratio for number of WUs is because MB WUs start at about 85 second intervals even though they have a duration of 107.37 seconds; that overlap means Gaussians, Pulses, or Triplets can be found reliably and don't get lost at the gap. AP is primarily looking for very fast stuff so isn't using overlap; at least in those I looked at on SETI Beta, there wasn't any.
Joe
ID: 793085
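
Joe's 40.4 figure can be checked with a few lines of arithmetic. The sketch below assumes, as his post implies, that an AP workunit with no overlap spans 32 times the 107.37-second duration of an MB workunit, and that MB workunits start every 85 seconds (he says "about 85", so the result is approximate). The constant and function names are chosen for this illustration.

```python
MB_DURATION_S = 107.37       # duration of data in one MB workunit (from the post)
MB_START_INTERVAL_S = 85.0   # approximate spacing between MB workunit starts
AP_DATA_FACTOR = 32          # an AP workunit holds 32x the data of an MB one


def mb_workunits_per_ap_workunit(
    duration_s: float = MB_DURATION_S,
    start_interval_s: float = MB_START_INTERVAL_S,
    data_factor: int = AP_DATA_FACTOR,
) -> float:
    """How many MB workunits start within the time span of one AP workunit."""
    ap_span_s = data_factor * duration_s  # assumes AP uses no overlap
    return ap_span_s / start_interval_s


ratio = mb_workunits_per_ap_workunit()
overlap_s = MB_DURATION_S - MB_START_INTERVAL_S  # data shared by adjacent MB WUs

print(f"MB workunits per AP workunit: {ratio:.1f}")
print(f"Overlap between adjacent MB workunits: {overlap_s:.2f} s")
```

The count ratio (about 40.4) exceeds the data ratio (32) because each MB workunit shares roughly 22 seconds of data with the next one.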

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.