Maxed (Dec 16 2010)



Message boards : Technical News : Maxed (Dec 16 2010)

Andy Lee Robinson
Joined: 8 Dec 05
Posts: 615
Credit: 40,936,133
RAC: 42,793
Hungary
Message 1057655 - Posted: 19 Dec 2010, 0:06:53 UTC - in response to Message 1057462.

Compressing the data is a waste of time. The radio signals are random noise.
If you want to compress the WUs, then it would require another boinc project for it!

Like concrete vs a sponge - there are no holes to squeeze out because there are no patterns to match. This is what we are doing - looking for patterns so tiny that it needs petaflops of work.

The only sensible option we have is to change the transport encoding in the workunit from:

<data length=354991 encoding="x-setiathome">
4!06>[Z"^-=U3F &)+L2$%3?N:MQZ$SS TZQ^'.U9D>EB5$3-NBBHZ<V[$:M]6FX
#AN(80/"%'!3)31Q[&HH$A]9[2*LIH274ITL'/(B%LN7]B[+C$N6J X(TXT\21CQ
+BEJ;]PXG9'$#3@7S]GI83NXJ-$C\35<7@TJG=_,=J)R>/DSEV3C^4%DHT$,%W"R
...

to raw CDATA octets.

Base64 encoding increases the size of the raw data by about 37%, and is unnecessary because XML supports binary octets as CDATA.
Why not use it? The WUs don't have to go through ancient mailservers, and transfer is point to point, so there are no barriers apart from will.
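
To put a rough number on the transport overhead, here is a minimal sketch. It uses standard Base64 as a stand-in for the project's own "x-setiathome" encoding, assumes a 256 KiB raw payload (the thread does not state the raw size), and assumes 64-character lines with a single newline each. The helper name encoded_size is made up for the illustration.

```python
import base64
import os

RAW_SIZE = 256 * 1024  # assumed raw payload size; not stated in the thread


def encoded_size(raw: bytes, line_len: int = 64) -> int:
    """Bytes needed for Base64 text wrapped at line_len chars, one newline per line."""
    text = base64.b64encode(raw)
    lines = [text[i:i + line_len] for i in range(0, len(text), line_len)]
    return sum(len(line) + 1 for line in lines)


raw = os.urandom(RAW_SIZE)          # noise-like stand-in data
size = encoded_size(raw)
overhead = size / len(raw) - 1      # fractional growth over the raw octets
print(f"raw={len(raw)} encoded={size} overhead={overhead:.1%}")
```

Under those assumptions the encoded size comes out close to the length attribute shown above, which would fit a 6-bit-per-character encoding. That is an inference from the numbers, not something the project has confirmed here. With two-byte line endings the overhead creeps a little higher.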

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8466
Credit: 48,994,216
RAC: 73,838
United Kingdom
Message 1057660 - Posted: 19 Dec 2010, 0:25:28 UTC - in response to Message 1057655.

Compressing the data is a waste of time. The radio signals are random noise.
If you want to compress the WUs, then it would require another boinc project for it!

Like concrete vs a sponge - there are no holes to squeeze out because there are no patterns to match. This is what we are doing - looking for patterns so tiny that it needs petaflops of work.

The only sensible option we have is to change the transport encoding in the workunit from:
<data length=354991 encoding="x-setiathome">
4!06>[Z"^-=U3F &)+L2$%3?N:MQZ$SS TZQ^'.U9D>EB5$3-NBBHZ<V[$:M]6FX
#AN(80/"%'!3)31Q[&HH$A]9[2*LIH274ITL'/(B%LN7]B[+C$N6J X(TXT\21CQ
+BEJ;]PXG9'$#3@7S]GI83NXJ-$C\35<7@TJG=_,=J)R>/DSEV3C^4%DHT$,%W"R
...

to raw CDATA octets.

Base64 encoding increases the size of the raw data by about 37%, and is unnecessary because XML supports binary octets as CDATA.
Why not use it? The WUs don't have to go through ancient mailservers, and transfer is point to point, so there are no barriers apart from will.

Exactly as has been done with Astropulse workunit files, since the very start of that sub-project.

It's entirely up to the lab staff how they handle it, but either compression or raw CDATA for MB would shave ~30% off that part of the project's download requirements. Whether modifying the splitter code to generate CDATA format, or post-processing the existing format with compression, I'll leave to them. I'll only mention that post-processing would also compress the (non-trivial) XML header, whereas CDATA would not.

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 1057664 - Posted: 19 Dec 2010, 1:02:32 UTC

FYI, limits have been raised...

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Gary Charpentier
Project donor
Volunteer tester
Joined: 25 Dec 00
Posts: 12409
Credit: 6,716,966
RAC: 8,573
United States
Message 1057665 - Posted: 19 Dec 2010, 1:05:49 UTC - in response to Message 1057664.

FYI, limits have been raised...

- Matt

Thanks for the update.

____________

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3406
Credit: 19,633,183
RAC: 18,220
Sweden
Message 1057666 - Posted: 19 Dec 2010, 1:09:16 UTC - in response to Message 1057664.

FYI, limits have been raised...

- Matt


Yeah, I thought so when the bandwidth suddenly became maxed out again. However, we're soon running out of files to split, so unless more are added, the raise is more academic really :-)

I don't care much for myself. My machines have WUs to last them well past the next three-day outage.

____________

Andy Lee Robinson
Joined: 8 Dec 05
Posts: 615
Credit: 40,936,133
RAC: 42,793
Hungary
Message 1057680 - Posted: 19 Dec 2010, 2:50:00 UTC - in response to Message 1057660.
Last modified: 19 Dec 2010, 2:54:28 UTC

It's entirely up to the lab staff how they handle it, but either compression or raw CDATA for MB would shave ~30% off that part of the project's download requirements. Whether modifying the splitter code to generate CDATA format, or post-processing the existing format with compression, I'll leave to them. I'll only mention that post-processing would also compress the (non-trivial) XML header, whereas CDATA would not.


They could either compress the WU as is, or make a CDATA version and compress that too. The XML header would be compressed in both cases, even if the payload grows a bit.

A quick test - I converted an existing WU, decoded the payload with ascii85 and made a CDATA version:

wu.xml           = 375353
wu.xml.gz        = 273716
wu-cdata.xml     = 295612
wu-cdata.xml.gz  = 279310


Interesting in this case that gzipping the current format WUs made a smaller file.
So, all they need to do is make and store the current WUs gzipped and add Content-Encoding: gzip to the HTTP response header on transfer, and all should be fine.
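
For anyone who wants to repeat this kind of comparison without a real WU, here is a rough sketch. It assumes random bytes as a stand-in for the noise-like payload and standard Base64 in place of the project's own encoding, and it leaves out the XML header entirely, so the absolute numbers will not match the ones above.

```python
import base64
import gzip
import os

raw = os.urandom(256 * 1024)        # stand-in for the payload (assumption)
encoded = base64.encodebytes(raw)   # Base64 text with line breaks

raw_gz = gzip.compress(raw)         # raw octets, gzipped
encoded_gz = gzip.compress(encoded) # text-encoded payload, gzipped

for name, blob in [
    ("raw", raw),
    ("raw.gz", raw_gz),
    ("encoded", encoded),
    ("encoded.gz", encoded_gz),
]:
    print(f"{name:12s} {len(blob)}")
```

The point of the sketch is only that gzip cannot shrink the noise itself, but it can squeeze most of the text-encoding overhead back out, so the two gzipped sizes should end up fairly close to each other.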

Blurf
Volunteer tester
Joined: 2 Sep 06
Posts: 7526
Credit: 6,703,798
RAC: 6,404
United States
Message 1057691 - Posted: 19 Dec 2010, 3:50:06 UTC - in response to Message 1057664.

FYI, limits have been raised...

- Matt


To.....what levels?
____________


RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,654,737
RAC: 1
United States
Message 1057731 - Posted: 19 Dec 2010, 8:16:39 UTC - in response to Message 1057664.

FYI, limits have been raised...

- Matt


that's nice, but there is very little work to be split. please feed the splitters.
____________

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8466
Credit: 48,994,216
RAC: 73,838
United Kingdom
Message 1057773 - Posted: 19 Dec 2010, 10:48:04 UTC - in response to Message 1057731.

FYI, limits have been raised...

- Matt

that's nice, but there is very little work to be split. please feed the splitters.

Please enjoy your Saturday night away from the lab.

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3406
Credit: 19,633,183
RAC: 18,220
Sweden
Message 1057774 - Posted: 19 Dec 2010, 11:07:58 UTC - in response to Message 1057773.
Last modified: 19 Dec 2010, 11:12:02 UTC

FYI, limits have been raised...

- Matt

that's nice, but there is very little work to be split. please feed the splitters.

Please enjoy your Saturday night away from the lab.


I agree. There is no reason whatsoever that the staff should bother about this small insignificant problem of no work to send out, on a weekend. Anyone who disagrees with that should immediately sign up to babysit the servers at the lab on weekends, for no pay whatsoever.

Nobody is going to die if their computers don't get fed with WUs on a weekend. That is really a NON ISSUE.

Edit: PS. It's snowing like mad here. According to weather statistics, this is so far the coldest winter in this part of Sweden for over 150 years. That my friends is much more important than a temporary lack of WU's :-)
____________

[seti.international] Dirk Sadowski
Project donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7062
Credit: 60,021,139
RAC: 21,405
Germany
Message 1057776 - Posted: 19 Dec 2010, 11:32:00 UTC
Last modified: 19 Dec 2010, 11:40:16 UTC

FYI, limits have been raised...

- Matt


Thanks for the news!

But it's a pity that there weren't many WUs available.


I agree. There is no reason whatsoever that the staff should bother about this small insignificant problem of no work to send out, on a weekend. Anyone who disagrees with that should immediately sign up to babysit the servers at the lab on weekends, for no pay whatsoever.

Nobody is going to die if their computers don't get fed with WUs on a weekend. That is really a NON ISSUE.

Edit: PS. It's snowing like mad here. According to weather statistics, this is so far the coldest winter here in this part of Sweden for over 150 years.


Now we again have two different kinds of people: members with enough WUs for at least 6 or 7 days, and members who have WUs for maybe only 3 days and worry that they can't bridge the next 3-day outage.
I'm one of the latter. I guess my PCs will idle during the next 3-day outage, because they will not download enough WUs from Monday to Tuesday morning (Berkeley time). But that's nothing to get upset about. It's just a pity that SETI@home can't use my PCs then.

I would look in on the lab at weekends, but what could I do? Clean the lab?
That wouldn't help.
OTOH, Berkeley is too far away from where I live.

The weather news from my home (Germany/Baden-Württemberg): no snow on the main street; the small street in front of my house has ~10 cm of snow/ice, which is melting; half of my car's windshield is covered with snow (there is no other snow on the car), and that is melting too.
From today to the middle of the week we will have about +4 to +6 °C outside during the day. Maybe no white X-Mas.



I wish you all a nice 4th Advent.


____________
BR



>Das Deutsche Cafe. The German Cafe.<

Cheopis
Joined: 17 Sep 00
Posts: 139
Credit: 10,934,816
RAC: 8,950
United States
Message 1057780 - Posted: 19 Dec 2010, 12:42:09 UTC

With the servers running out of work units so fast these days, have there been any thoughts on how to process more work units between each required physical visit to the server room to load tapes, disks, or whatever media the raw data is stored on these days?

Is this process already optimized, or would it be possible to perhaps store the data on multiple external hard drives or on its own separate RAID array?

If it is possible to read and verify the data from the media faster than the machines can initially process said data to create workunits, then it should be possible to build up a store of unprocessed data when humans are available to feed the machine. The "extra" data could then be queued for access whenever one of the splitter CPUs has no available data to make work units from.

Great for weekends, or after outages. Not sure if this would be something that the project would actually want to run full time, as I understand the bottleneck for the actual science is in the post-home-user stages.

I'm imagining a couple really big USB external hard drives for "extra" data storage. These are getting pretty cheap.
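
The queueing idea can be sketched in a few lines. This is purely illustrative: the class and method names (StagingBuffer, stage, next_for_splitter) and the file names are invented, and nothing here reflects how the lab's splitters actually pick up data.

```python
from collections import deque
from typing import Optional


class StagingBuffer:
    """FIFO store of verified raw-data files waiting for an idle splitter."""

    def __init__(self) -> None:
        self._files: deque = deque()

    def stage(self, filename: str) -> None:
        """Called while staff are on site: add a verified file to the backlog."""
        self._files.append(filename)

    def next_for_splitter(self) -> Optional[str]:
        """Called by an idle splitter: oldest staged file, or None if empty."""
        return self._files.popleft() if self._files else None

    def __len__(self) -> int:
        return len(self._files)


buffer = StagingBuffer()
buffer.stage("raw_data_A")   # hypothetical file names
buffer.stage("raw_data_B")
print(buffer.next_for_splitter(), len(buffer))
```

The backlog only helps if reading and verifying really is faster than splitting, which is the open question in the post above.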

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38925
Credit: 579,135,130
RAC: 511,836
United States
Message 1057796 - Posted: 19 Dec 2010, 14:11:24 UTC

Thanks for raising the limits, Matt.

However, it appears that Worf is having trouble keeping up with the many kibble bowls to fill.
Might Oscar help him out with the radar blanking work?
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

JohnDK
Project donor
Volunteer tester
Joined: 28 May 00
Posts: 840
Credit: 43,105,271
RAC: 75,306
Denmark
Message 1057824 - Posted: 19 Dec 2010, 15:28:55 UTC - in response to Message 1057773.

FYI, limits have been raised...

- Matt

that's nice, but there is very little work to be split. please feed the splitters.

Please enjoy your Saturday night away from the lab.

Weren't there too few tapes to split to begin with when they left on Friday?

Robert Gammon
Joined: 29 Aug 01
Posts: 19
Credit: 754,043
RAC: 409
United States
Message 1057826 - Posted: 19 Dec 2010, 15:50:52 UTC

I am not sure how this could be done, but some of us ache for a fairness doctrine as applied to downloaded WUs.

While the system was up for the last couple of weeks, I managed to snag 2 Astropulse and 8-9 Seti Enhanced WUs. Now my last 6 Seti Enhanced WUs are pending validation, as the second computer is one of the machines that downloads 4-10 WUs per day. Granted, most of them get returned inside the window of time, but still, I turn them around in 2 days and wait 3-6 weeks for validation to happen.

Seti is out of WUs now so my Boinc is working on other projects JUST to stay busy
____________

soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,631,059
RAC: 49
United States
Message 1057831 - Posted: 19 Dec 2010, 16:04:51 UTC - in response to Message 1057826.

I am not sure how this could be done, but some of us ache for a fairness doctrine as applied to downloaded WUs.

While the system was up for the last couple of weeks, I managed to snag 2 Astropulse and 8-9 Seti Enhanced WUs. Now my last 6 Seti Enhanced WUs are pending validation, as the second computer is one of the machines that downloads 4-10 WUs per day. Granted, most of them get returned inside the window of time, but still, I turn them around in 2 days and wait 3-6 weeks for validation to happen.

Seti is out of WUs now so my Boinc is working on other projects JUST to stay busy


I think the "fairness" doctrine is known as "grab and growl". Grab what is there, and growl about it all you want, it will not change a thing.
____________

Janice

Bill Walker
Joined: 4 Sep 99
Posts: 3353
Credit: 2,041,839
RAC: 1,932
Canada
Message 1057869 - Posted: 19 Dec 2010, 18:22:28 UTC - in response to Message 1057831.

I think the "fairness" doctrine is known as "grab and growl". Grab what is there, and growl about it all you want, it will not change a thing.


Grrrrrrrrrrrrrrr.
____________

RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,654,737
RAC: 1
United States
Message 1057897 - Posted: 19 Dec 2010, 19:51:19 UTC
Last modified: 19 Dec 2010, 19:59:56 UTC

how come the people with a low RAC are always ridiculing the people, usually high RAC users, who are almost out of work when they are reporting a problem.

i'm for a more fair work distribution method in which RAC is the basis. machines/users with a higher RAC should be rewarded more work, directly proportional to RAC, during the rationing periods.

then we all could listen to the "computationally challenged" users complain they don't have enough work...
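
For what it's worth, "directly proportional to RAC" could look something like the sketch below. It is only an illustration of the proposal: the function name allocate_by_rac, the host names and the numbers are all made up, and the real scheduler works nothing like this as far as the thread says.

```python
def allocate_by_rac(racs: dict, available: int) -> dict:
    """Split `available` WUs across hosts in proportion to each host's RAC.

    Integer division rounds down, so a few WUs may be left unassigned.
    """
    total = sum(racs.values())
    if total == 0:
        return {host: 0 for host in racs}
    return {host: available * rac // total for host, rac in racs.items()}


# Hypothetical hosts and RAC values, for illustration only.
shares = allocate_by_rac({"fast_rig": 300, "old_laptop": 100}, available=40)
print(shares)
```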
____________

Albireo380
Joined: 21 Mar 08
Posts: 118
Credit: 1,371,523
RAC: 682
United Kingdom
Message 1057902 - Posted: 19 Dec 2010, 20:08:15 UTC - in response to Message 1057897.

how come the people with a low RAC are always ridiculing the people, usually high RAC users, who are almost out of work when they are reporting a problem.

i'm for a more fair work distribution method in which RAC is the basis. machines/users with a higher RAC should be rewarded more work, directly proportional to RAC, during the rationing periods.

then we all could listen to the "computationally challenged" users complain they don't have enough work...


I have a low RAC, RottenMutt, and I haven't ridiculed anyone. I guess some people with low RAC are unhappy that high RAC crunchers seem to get lots of work units. Personally, as long as I can contribute a little, my life won't end if I run out of WUs for a while. But some folk get very "antsy" about it - just people I suppose - such is life.

Never mind though, SETI is back up and running - that has got to be good.

Regards & keep crunching : )

Tom

Robert Gammon
Joined: 29 Aug 01
Posts: 19
Credit: 754,043
RAC: 409
United States
Message 1057908 - Posted: 19 Dec 2010, 20:21:50 UTC - in response to Message 1057897.

Fairness lies in the eye of the beholder.

I see the rationale for having as much seti work as your machine can handle, as the seti folks have had extended downtime over the last several months.

High RAC, low RAC, that all depends on just how many computers you can employ and the relative hotness of each of the machines.

Some of us only have one computer, and that one is several years old. If the High RAC users get all the WUs then we can be shut out.

A slightly better approach would reward users who consistently turn in units that pass validation and turn them in on time, with a higher grade for short turnaround time. People who fail to turn in units on time get throttled on how many work units they can hold at one time. People who have compute errors on their work get limited on how many WUs can be outstanding at one time. People who turn in work units well inside the time limit (let's assume that means inside 5 days, but the sysadmin makes the decision) get rewarded with more WUs.
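
A rough sketch of how such a grading rule might be written down follows. Everything in it is an assumption made for illustration: the HostRecord fields, the wu_quota name, the base quota of 10, the 90% reliability cutoff and the doubling bonus are invented, and only the 5-day figure comes from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class HostRecord:
    returned_on_time: int       # valid results returned before the deadline
    returned_late: int          # results that missed the deadline
    compute_errors: int         # results that ended in a compute error
    avg_turnaround_days: float  # mean days between download and report


def wu_quota(host: HostRecord, base_quota: int = 10, fast_days: float = 5.0) -> int:
    """Number of WUs a host may hold at once under this hypothetical policy."""
    total = host.returned_on_time + host.returned_late + host.compute_errors
    if total == 0:
        return base_quota                     # no history yet: neutral quota
    reliability = host.returned_on_time / total
    quota = base_quota * reliability          # late or errored work throttles the host
    if reliability > 0.9 and host.avg_turnaround_days <= fast_days:
        quota *= 2                            # reliable and fast: reward with more WUs
    return max(1, round(quota))               # never shut a host out completely


print(wu_quota(HostRecord(100, 0, 0, 2.0)))   # reliable, fast host
print(wu_quota(HostRecord(50, 25, 25, 8.0)))  # half of its work late or errored
```

The floor of one WU is there so that a single old machine with a bad week is throttled rather than shut out.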


____________


Copyright © 2014 University of California