Working as Expected (Jul 13 2009)

Message boards : Technical News : Working as Expected (Jul 13 2009)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 917472 - Posted: 13 Jul 2009, 22:11:48 UTC

The data pipeline over the weekend seemed to be more or less okay, thanks to running out of Astropulse workunits and not having any raw data to split to create new ones. Of course, I shovelled some more raw data onto the pile this morning, and our bandwidth shot right back up again. This pretty much proves that our recent headaches have been largely due to the disparity in workunit sizes/compute times between multibeam and Astropulse, but that's all academic at this point, as Eric is close to implementing a configuration change which will increase the resolution of chirp rates (thus increasing analysis/sensitivity) and also slow clients down so they don't contact our servers as often. We should be back to lower levels of traffic soon enough.
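
As a rough illustration of why longer workunits mean fewer server contacts: if each host reports back about once per finished workunit, the average contact rate is roughly the number of active hosts divided by the time per workunit. The sketch below assumes that simple model. The host count and run times are made-up numbers, not SETI@home figures, and `contacts_per_second` is a hypothetical helper.

```python
def contacts_per_second(active_hosts: int, hours_per_workunit: float) -> float:
    """Average workunit completions per second across all hosts.

    Assumes one server contact per finished workunit (a simplification).
    """
    if hours_per_workunit <= 0:
        raise ValueError("hours_per_workunit must be positive")
    return active_hosts / (hours_per_workunit * 3600.0)


# Hypothetical numbers, for illustration only.
hosts = 200_000
for hours in (1.0, 2.0, 4.0):
    rate = contacts_per_second(hosts, hours)
    print(f"{hours:.0f} h per workunit -> {rate:.1f} contacts/s")
```

Under this model, doubling the compute time per workunit halves the contact rate, whatever the actual host count is.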

We are running fairly low on data from our archives, which is a bit scary. We're burning through it rather quickly. Luckily, Andrew is down at Arecibo now, with one of our new drive bays - he'll plug it in perhaps today and we'll hopefully be collecting data later tonight...?

To be clear, we actually have hundreds of raw data files in our archives, but most of them suffer from (a) lack of embedded hardware radar signals (therefore making it currently impossible to analyse without being blitzed by RFI), or (b) accidental extra coordinate precession, or (c) both of the above. Software is in the works (mostly waiting on me) to solve all the above.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 917472 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 917475 - Posted: 13 Jul 2009, 22:25:02 UTC - in response to Message 917472.  

Thanks for the update Matt,

Claggy
ID: 917475 · Report as offensive
Profile Borgholio
Joined: 2 Aug 99
Posts: 654
Credit: 18,623,738
RAC: 45
United States
Message 917483 - Posted: 13 Jul 2009, 23:19:11 UTC

Good deal. One of the commonly suggested methods to work around the bandwidth and database issues is to increase the amount of science done with the various apps, so that clients will take longer to do them. Will changing the resolution of chirping cause any problems for GPU crunchers, or will it basically be the same thing as it is now, just slower?
You will be assimilated...bunghole!

ID: 917483 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 917506 - Posted: 14 Jul 2009, 2:46:14 UTC

I hope that the deadlines are also increased along with the increased work.


BOINC WIKI
ID: 917506 · Report as offensive
zpm
Volunteer tester
Joined: 25 Apr 08
Posts: 284
Credit: 1,659,024
RAC: 0
United States
Message 917510 - Posted: 14 Jul 2009, 3:25:08 UTC - in response to Message 917506.  
Last modified: 14 Jul 2009, 3:26:02 UTC

I hope that the deadlines are also increased along with the increased work.



For sure, please. I grab some WUs for GPU and have work stacked for CPU, and I end up running ggrid until there are like 5 days left before the deadline on SETI work... sometimes.
ID: 917510 · Report as offensive
Profile RandyF
Volunteer tester
Joined: 8 Jan 07
Posts: 15
Credit: 12,296,855
RAC: 1
United States
Message 917514 - Posted: 14 Jul 2009, 4:20:18 UTC - in response to Message 917472.  

Send me the largest WU's you got, Matt! :D
ID: 917514 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 917527 - Posted: 14 Jul 2009, 5:24:01 UTC - in response to Message 917514.  

Send me the largest WU's you got, Matt! :D


"you have mail! 857,000 AP wu's!" lol

Thanks for the update Matt!
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 917527 · Report as offensive
CryptokiD
Joined: 2 Dec 00
Posts: 150
Credit: 3,216,632
RAC: 0
United States
Message 917557 - Posted: 14 Jul 2009, 8:16:28 UTC

Any ideas how much longer the workunits will take?

With how fast modern computers are, coupled with the optimized apps, I would suggest making everything take 10x longer. This would go reasonably well with Moore's law, for a little while anyway.
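
A back-of-the-envelope check on how long "a little while" might be, assuming (and it is only an assumption) that typical host throughput doubles every 18 to 24 months. Under that assumption a 10x slowdown is absorbed after log2(10) doublings. `years_until_absorbed` is a hypothetical helper name.

```python
import math


def years_until_absorbed(slowdown_factor: float, doubling_period_years: float) -> float:
    """Years of hardware growth needed to cancel out a given slowdown.

    Assumes throughput doubles once every doubling_period_years.
    """
    if slowdown_factor <= 0 or doubling_period_years <= 0:
        raise ValueError("both arguments must be positive")
    return math.log2(slowdown_factor) * doubling_period_years


# Assumed doubling periods of 1.5 and 2 years.
for period in (1.5, 2.0):
    years = years_until_absorbed(10.0, period)
    print(f"doubling every {period} years -> 10x absorbed in about {years:.1f} years")
```

So, on these assumptions, a 10x increase would buy somewhere around five to seven years before run times are back where they started.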
ID: 917557 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 917558 - Posted: 14 Jul 2009, 8:41:36 UTC


I have a lot of results ready for UL, but the UL server is offline?

Longer crunching time for MB.. I hope the credits will increase also.. ;-)
The applications don't need to be adjusted?

But.. BTW.. if AP overloads the internet traffic to the server.. why not 'disable' AP until faster traffic is possible ('bigger cable')?

ID: 917558 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 917564 - Posted: 14 Jul 2009, 9:21:40 UTC - in response to Message 917472.  

...
This pretty much proves that our recent headaches have been largely due to the disparity in workunit sizes/compute times between multibeam and Astropulse, but that's all academic at this point, as Eric is close to implementing a configuration change which will increase the resolution of chirp rates (thus increasing analysis/sensitivity) and also slow clients down so they don't contact our servers as often. We should be back to lower levels of traffic soon enough.
...


Is this really good/better for the science, or is it only to resolve the bandwidth problem?

Isn't MB then like AP? (The only difference being narrowband vs. broader band.)

ID: 917564 · Report as offensive
Dave_In_Oz

Joined: 26 Jul 99
Posts: 6
Credit: 12,797,191
RAC: 22
Australia
Message 917571 - Posted: 14 Jul 2009, 9:58:41 UTC

I am smiling as I read this thread. Over the weekend I built an i7-950 with an NVIDIA GTX295. For the first 24hrs I did not run any GPU CUDA work just so I could get a feel for how quick my new toy was. Having 8 threads increased my throughput significantly, but when I upgraded the graphics driver to a release that supported CUDA it all took off. I have had a dramatic increase in completed units. I participate in a number of projects, and went from about 900 to 1000 credits a day to over 5,000 today. And the new toy is stock, no overclocking at all, and temps are all very reasonable. Wow!!!!

But alas I am probably contributing to the infrastructure stress - I have over 30 results waiting to upload, the majority of them were done on the i7 in the past 12 hrs.
ID: 917571 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21209
Credit: 7,508,002
RAC: 20
United Kingdom
Message 917572 - Posted: 14 Jul 2009, 9:59:59 UTC

Indeed so... Working exactly as expected.

For the link limits and congestion... Note:

In an email that was sent to SETI staff: at one point in time the 100 Megabit link was full duplex, meaning uploads should not interfere with downloads and vice versa (each is in its own channel).

We forget that TCP is a sliding window protocol. If the 100 megabit line is saturated inbound, part of that inbound traffic is the ACKs for the outbound traffic.

When the ACKs are delayed or lost, at some point the sender stops sending new data, and waits. When the ACKs don't arrive (because they were lost) data is resent.

In either direction, when the load is very high, data in the other direction will suffer too.

That's a very 'subdued' way of describing the situation.

Lose the TCP control packets in either direction and the link is DOSed with an exponentially increasing stack of resend attempts that DOS for further attempts that then DOS for... Until the link disgracefully degrades to being totally blocked. Max link utilisation but no useful information gets through.

The only limiting factors are the TCP timeouts and the rate of new connection attempts.
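
To make the ACK point concrete, here is a toy sketch of a windowed sender whose data path is clean but whose ACKs are dropped on the congested reverse path. It is deliberately simplified and is not real TCP: it assumes per-segment ACKs, a fixed window, resend of anything unacknowledged each round, and no cumulative ACKs, backoff or congestion control. The function name, parameters and loss rates are all made up for illustration.

```python
import random


def simulate(segments: int, window: int, data_loss: float, ack_loss: float, seed: int = 1) -> dict:
    """Toy sliding-window transfer with per-segment ACKs and blind resends.

    Each round the sender transmits every unacknowledged segment inside its
    window. A segment only counts as done once its ACK makes it back.
    """
    if not (0.0 <= data_loss < 1.0 and 0.0 <= ack_loss < 1.0):
        raise ValueError("loss rates must be in [0, 1)")
    rng = random.Random(seed)
    acked = [False] * segments
    base = 0      # oldest segment not yet acknowledged
    sent = 0      # total transmissions, including resends
    rounds = 0
    while base < segments:
        rounds += 1
        for seq in range(base, min(base + window, segments)):
            if acked[seq]:
                continue
            sent += 1
            delivered = rng.random() >= data_loss
            # The ACK travels on the reverse path and can be lost there.
            if delivered and rng.random() >= ack_loss:
                acked[seq] = True
        # Slide the window past everything acknowledged in order.
        while base < segments and acked[base]:
            base += 1
    return {"rounds": rounds, "sent": sent, "efficiency": segments / sent}


# Hypothetical loss rates: forward path is clean, reverse path is congested.
for ack_loss in (0.0, 0.2, 0.5):
    result = simulate(segments=1000, window=10, data_loss=0.0, ack_loss=ack_loss)
    print(ack_loss, result)
```

Even with no data loss at all, every lost ACK costs a resend in this model, so the share of link capacity carrying new data falls as the reverse direction gets busier.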


And I thought the smooth 71Mb/s was due to some cool traffic management. OK, so restricting the available WUs is also a clumsy way to "traffic manage"!


In short, keep the link at never anything more than 89Mb/s MAX and everyone is happy!

Happy smooth crunchin',
Martin



Regards,
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 917572 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 917574 - Posted: 14 Jul 2009, 10:24:08 UTC


BTW.

~ 400 results in the BOINC UL overview, and increasing every few minutes..

ID: 917574 · Report as offensive
Profile Sebastian M. Bobrecki
Volunteer tester

Joined: 7 Feb 02
Posts: 23
Credit: 38,375,443
RAC: 0
Poland
Message 917598 - Posted: 14 Jul 2009, 13:50:16 UTC - in response to Message 917558.  

Hello everyone


I have a lot of results ready for UL, but the UL server is offline?

Longer crunching time for MB.. I hope the credits will increase also.. ;-)
The applications don't need to be adjusted?
...




It seems that it only needs changes to <analysis_cfg> in the result header. Am I right?

But I'm curious: what about already computed workunits?
Will they be resent to be analysed with the new, more sensitive settings? That sounds good to me.
Or maybe only new ones will be treated that way. Hm, that also sounds good to me :)


ID: 917598 · Report as offensive
DJStarfox

Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 917610 - Posted: 14 Jul 2009, 14:41:23 UTC - in response to Message 917472.  
Last modified: 14 Jul 2009, 14:43:01 UTC

We are running fairly low on data from our archives, which is a bit scary. We're burning through it rather quickly....


Matt,
We have 3.5M tasks out in the field. Would it be such a tragedy for SETI to take a break, turn in a couple million outstanding tasks, do some software development, and set up a server or two while the project is down? You could even run the splitters for a couple of days before bringing the project back online, to build up a cache of tasks. If there's enough of a reward in terms of project accomplishment, then I see no problem with a week or two of PLANNED project downtime to focus on some much needed work. Many BOINC projects have planned downtime in their lifecycle for upgrades, different steps in their research, etc. There's no reason SETI needs to be different and expect 100% continuous work availability.
ID: 917610 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 917633 - Posted: 14 Jul 2009, 16:23:47 UTC

Matt,

Please turn uploads back on during maintenance, while we have nice clean bandwidth to use.
ID: 917633 · Report as offensive
Profile # Bob Ahlers #

Joined: 30 Mar 01
Posts: 18
Credit: 10,209,954
RAC: 0
Netherlands
Message 917637 - Posted: 14 Jul 2009, 16:36:20 UTC - in response to Message 917633.  
Last modified: 14 Jul 2009, 16:38:28 UTC

I have 10 E5420 CPUs waiting for WUs and about 50 WUs ready to be uploaded; the lists are getting really long :-(
Please activate the upload server for a couple of hours.
ID: 917637 · Report as offensive
PSY0NIC

Joined: 17 Apr 02
Posts: 1
Credit: 34,667
RAC: 0
United States
Message 917695 - Posted: 14 Jul 2009, 23:46:54 UTC - in response to Message 917637.  

I have 10 E5420 CPUs waiting for WUs and about 50 WUs ready to be uploaded; the lists are getting really long :-(
Please activate the upload server for a couple of hours.


Aye... I had to suspend mine as well. 50+- uploads and stacking up fast. It's downloading plenty of replacements though.
ID: 917695 · Report as offensive
Profile Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 917697 - Posted: 15 Jul 2009, 0:01:34 UTC

Reading Matt's Tech News the other day and the Cricket Graphs when the Feeder was putting out just MultiBeam... You could see that Uploads (Uploads and Scheduler Requests) were getting through and eating about 17.5% of the bandwidth. Downloads, due to the feeder settings, were keeping up and eating ~63%.
In that circumstance the balance was very visible.

Now as Uploads have been turned Off, I presume that Matt is tuning the "Feeder" to see what settings might be better. Uploads would interfere with the calibrations needed to set this accurately.

Regards






Please consider a Donation to the Seti Project.

ID: 917697 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 917704 - Posted: 15 Jul 2009, 0:24:12 UTC - in response to Message 917697.  

Reading Matt's Tech News the other day and the Cricket Graphs when the Feeder was putting out just MultiBeam... You could see that Uploads (Uploads and Scheduler Requests) were getting through and eating about 17.5% of the bandwidth. Downloads, due to the feeder settings, were keeping up and eating ~63%.
In that circumstance the balance was very visible.

Now as Uploads have been turned Off, I presume that Matt is tuning the "Feeder" to see what settings might be better. Uploads would interfere with the calibrations needed to set this accurately.

Regards


That makes good sense. Let's hope they get it all tuned before there are so many work units out trying to be returned that it just clogs the uploads for days.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 917704 · Report as offensive


 