Working as Expected (Jul 13 2009)

Message boards : Technical News : Working as Expected (Jul 13 2009)

Jim H
Joined: 28 Nov 06
Posts: 12
Credit: 2,186,439
RAC: 0
United States
Message 919718 - Posted: 20 Jul 2009, 15:47:53 UTC - in response to Message 919679.  

Thanks for all the hard work.
Makes my head hurt when considering all the factors in play to keep SETI and BOINC running.

I've noted the difficulties over the last several weeks and, more importantly, I've noted the efforts the folks are making in order to get it all moving.
THX
Clear Skies to all amateur Astronomers out there...
ID: 919718
zpm
Volunteer tester
Joined: 25 Apr 08
Posts: 284
Credit: 1,659,024
RAC: 0
United States
Message 919864 - Posted: 20 Jul 2009, 21:45:11 UTC - in response to Message 919718.  

ozzy hit the head of the nail on #5.... old equipment.....
ID: 919864
Toeman
Joined: 31 Mar 01
Posts: 2
Credit: 498,430
RAC: 0
United States
Message 919884 - Posted: 20 Jul 2009, 22:30:41 UTC

Another frustrating Mon. Have been trying to upload/download work units for three weeks. Very sporadic at best. Managing only one connection per week for two in a row. I wish someone would let us know what's up. I started running Boinc for Seti@home when the classic S@H was shut down and am not really interested in running other "filler" projects. Thanks, and any information would be very welcome.
ID: 919884
Borgholio
Joined: 2 Aug 99
Posts: 654
Credit: 18,623,738
RAC: 45
United States
Message 919897 - Posted: 20 Jul 2009, 23:24:10 UTC - in response to Message 919884.  

Another frustrating Mon. Have been trying to upload/download work units for three weeks. Very sporadic at best. Managing only one connection per week for two in a row. I wish someone would let us know what's up. I started running Boinc for Seti@home when the classic S@H was shut down and am not really interested in running other "filler" projects. Thanks, and any information would be very welcome.


I would hardly call searching for a cure for AIDS or cancer filler...but that's just me.
You will be assimilated...bunghole!

ID: 919897
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 919903 - Posted: 20 Jul 2009, 23:48:28 UTC - in response to Message 919897.  

Another frustrating Mon. Have been trying to upload/download work units for three weeks. Very sporadic at best. Managing only one connection per week for two in a row. I wish someone would let us know what's up. I started running Boinc for Seti@home when the classic S@H was shut down and am not really interested in running other "filler" projects. Thanks, and any information would be very welcome.


I would hardly call searching for a cure for AIDS or cancer filler...but that's just me.

"filler" is in the eye of the beholder.
ID: 919903
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 919912 - Posted: 21 Jul 2009, 0:04:13 UTC - in response to Message 919884.  

Another frustrating Mon. Have been trying to upload/download work units for three weeks. Very sporadic at best. Managing only one connection per week for two in a row. I wish someone would let us know what's up. I started running Boinc for Seti@home when the classic S@H was shut down and am not really interested in running other "filler" projects. Thanks, and any information would be very welcome.


Eric and co. are keeping things going as best they can. Matt is on vacation and will be back soon.
ID: 919912
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 919968 - Posted: 21 Jul 2009, 2:26:46 UTC - in response to Message 919912.  

Another frustrating Mon. Have been trying to upload/download work units for three weeks. Very sporadic at best. Managing only one connection per week for two in a row. I wish someone would let us know what's up. I started running Boinc for Seti@home when the classic S@H was shut down and am not really interested in running other "filler" projects. Thanks, and any information would be very welcome.


Eric and co. are keeping things going as best they can. Matt is on vacation and will be back soon.

As of approximately 1:30am UTC, it looks like there is a nice fat spike in uploaded (probably actually reported) results.
ID: 919968
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 919995 - Posted: 21 Jul 2009, 4:28:14 UTC - in response to Message 919968.  

Another frustrating Mon. Have been trying to upload/download work units for three weeks. Very sporadic at best. Managing only one connection per week for two in a row. I wish someone would let us know what's up. I started running Boinc for Seti@home when the classic S@H was shut down and am not really interested in running other "filler" projects. Thanks, and any information would be very welcome.


Eric and co. are keeping things going as best they can. Matt is on vacation and will be back soon.

As of approximately 1:30am UTC, it looks like there is a nice fat spike in uploaded (probably actually reported) results.


That's just Vistro and TCP Jesus/MC Hammer or whatever he decides to call himself this week pressing the retry button too many times.
ID: 919995
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20084
Credit: 7,508,002
RAC: 20
United Kingdom
Message 920042 - Posted: 21 Jul 2009, 10:55:02 UTC - in response to Message 919995.  
Last modified: 21 Jul 2009, 10:55:24 UTC

As of approximately 1:30am UTC, it looks like there is a nice fat spike in uploaded (probably actually reported) results.

That's just Vistro and TCP Jesus/MC Hammer or whatever he decides to call himself this week pressing the retry button too many times.

Hey! Shame on you... Cynicism doesn't become you.

Ya just got to admire their interest and dedication to be sat there clicking away at the button. ... They may even get to learn more of how Boinc works, and why, and how, and also find out something of the exploration in how Boinc is put together.

All very good fun!

Meanwhile, I leave Boinc to its own devices. It usually muddles through.

(I will admit to the occasional prod for the sake of my own experiments in GPU WU selection :-o )

Meanwhile #2, the Cricket graphs form a very good study in TCP effects on a saturated link! It is also a good reminder that for any system, overall 'control' is exerted by the most significant bottleneck (or whatever system resource limit gets hit the hardest).

Happy crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 920042
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20084
Credit: 7,508,002
RAC: 20
United Kingdom
Message 920043 - Posted: 21 Jul 2009, 11:01:49 UTC - in response to Message 919968.  
Last modified: 21 Jul 2009, 11:42:01 UTC

As of approximately 1:30am UTC, it looks like there is a nice fat spike in uploaded (probably actually reported) results.

That upload spike there shows very nicely how the upload rate can more than double when the downlink is non-saturated.

Note: Green = downlink, blue line = uplink.

There also appears to be a tail-off for a while when the download link becomes saturated once more, until a short while later the uploads settle back to the saturation average. Is that the exponential backoff coming into play but only for individual upload attempts? The backoffs appear rather too quickly to average out to a high background noise level...
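
For readers unfamiliar with the exponential backoff mentioned here, the following is a minimal Python sketch of the general idea. The base delay, the cap, and the jitter range are made-up illustration values, not settings taken from the BOINC client, and `backoff_delays` is a hypothetical helper name.

```python
import random


def backoff_delays(attempts, base=60.0, cap=4 * 3600.0, seed=None):
    """Return the wait (seconds) before each retry of one failed transfer.

    base, cap and the 50-100% jitter range are illustrative assumptions,
    not values taken from the BOINC client.
    """
    rng = random.Random(seed)
    delays = []
    for attempt in range(attempts):
        # The ceiling doubles with every failure until it reaches the cap.
        ceiling = min(cap, base * 2 ** attempt)
        # Random jitter keeps many clients from retrying in lockstep.
        delays.append(rng.uniform(0.5 * ceiling, ceiling))
    return delays


if __name__ == "__main__":
    for n, delay in enumerate(backoff_delays(8, seed=1), start=1):
        print(f"retry {n}: wait about {delay:.0f} s")
```

If the delay is tracked per transfer rather than per host, as the question supposes, a host with many queued results would still produce a fairly steady stream of attempts even while each individual transfer waits longer and longer.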

Regards,
Martin


Note the download dip and matching upload peak at Monday 19:00 ->



(Snapshot image from Cricket. Don't do this directly to Cricket itself for obvious reasons!)
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 920043
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 920045 - Posted: 21 Jul 2009, 11:12:38 UTC - in response to Message 920043.  
Last modified: 21 Jul 2009, 11:20:42 UTC

Thought this might be of interest to some.

Using an SSD for an OLTP log disk
"On a "normal" SLES 10 SP2 we achieved 1400 tr/s on a quad core (an anonymous CPU for now ;-). But Anand's article really got us curious and we replaced our mighty Cheetah disk with the Intel x25-M SSD (80 GB). All of a sudden we achieved 1900 tr/s! No less than 35% more transactions, just by replacing the disk that holds the log with the fastest SSD of the moment. That is pretty amazing if you consider that there is no indication whatsoever that we were bottlenecked by our log disk.

....

So our conclusion so far seems to be that in case of MySQL OLTP, sizing for IO/s seems to be less important than the individual write latency. To put it more blunt: in many cases even tens of spindles will not be able to beat one SSD as each individual disk spindle has a relatively high latency."


EDIT- and from a RAID review,

"However, placing your database data files on an Intel X25-E is an excellent strategy. One X25-E is 66% faster than eight (!) 15000RPM SAS drives. That means if you don't need capacity, you can replace about 13 SAS disks with one SSD to get the same performance. You can keep the SAS disks as your log drives as they are a relatively cheap way to obtain good logging performance."
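
The two headline figures in these quotes can be checked with a few lines of Python. This is only a back-of-envelope sketch: the helper names are hypothetical, and `equivalent_disks` assumes performance scales linearly with the number of SAS disks, which the review does not claim in general.

```python
def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


def equivalent_disks(disk_count, speedup_percent):
    """How many of the slower disks match one device that is
    `speedup_percent` faster than `disk_count` of them combined.
    Assumes linear scaling with disk count (an assumption)."""
    return disk_count * (1.0 + speedup_percent / 100.0)


if __name__ == "__main__":
    print(f"log disk swap: {percent_gain(1400, 1900):.1f}% more tr/s")
    print(f"one X25-E is worth about {equivalent_disks(8, 66):.1f} SAS disks")
```

Under that assumption the numbers agree with the quotes: a little over 35% for the log disk swap, and roughly 13 SAS disks per SSD.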

If only Intel would donate a few dozen X25-Es to the cause. Might help with some of the database and replica performance issues...
Grant
Darwin NT
ID: 920045
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 920046 - Posted: 21 Jul 2009, 11:18:29 UTC - in response to Message 920043.  

As of approximately 1:30am UTC, it looks like there is a nice fat spike in uploaded (probably actually reported) results.

That upload spike there shows very nicely how the upload rate can more than double when the downlink is non-saturated.

There also appears to be a tail-off for a while when the download link becomes saturated once more, until a short while later the uploads settle back to the saturation average. Is that the exponential backoff coming into play but only for individual upload attempts? The backoffs appear rather too quickly to average out to a high background noise level...

Regards,
Martin

Because the Cricket graphs record the raw number of bits passing through the router (or packets ditto, if you look at the wrong page :-) ), they won't distinguish between successful uploads and those maddening (and wasteful) ones which get to 100% and then die.

Maybe it's an artefact of the way the link re-saturates after whatever it is that causes the dips (I don't think we've ever got to the bottom of those, have we?). Perhaps there's a phase where a higher number than usual succeed in connecting, and at least partially uploading, before the concrete finally sets again.
ID: 920046
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 920047 - Posted: 21 Jul 2009, 11:22:33 UTC - in response to Message 920045.  


Thought this might be of interest to some.

Using an SSD for an OLTP log disk
"On a "normal" SLES 10 SP2 we achieved 1400 tr/s on a quad core (an anonymous CPU for now ;-). But Anand's article really got us curious and we replaced our mighty Cheetah disk with the Intel x25-M SSD (80 GB). All of a sudden we achieved 1900 tr/s! No less than 35% more transactions, just by replacing the disk that holds the log with the fastest SSD of the moment. That is pretty amazing if you consider that there is no indication whatsoever that we were bottlenecked by our log disk.

....

So our conclusion so far seems to be that in case of MySQL OLTP, sizing for IO/s seems to be less important than the individual write latency. To put it more blunt: in many cases even tens of spindles will not be able to beat one SSD as each individual disk spindle has a relatively high latency."

I wonder if a manufacturer could be persuaded to "lend" SETI a suitable drive to test that assertion under field conditions. An extended test, to include SSD lifetime cycle limits, of course.
ID: 920047
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 920048 - Posted: 21 Jul 2009, 11:26:18 UTC - in response to Message 920046.  
Last modified: 21 Jul 2009, 11:28:28 UTC

If you look at the network traffic graphs & match them up with Scarecrow's graphs, it's interesting to see that while the upload data rate might move about a bit (5Mb/s or so), the number of uploads per hour steadily climbs.
I've always attributed that to the gradually reducing number of attempted connections resulting in more successful connections. End result: more results being returned even though the traffic remains (relatively) unchanged.

When you see the large spikes in upload traffic, that's when you see the huge spikes in results returned per hour. And not long after that you see the database transaction rate increase & the replica fall behind as the validators start churning out more work & the assimilators fall behind. Once they catch up, the database transaction rate drops & the replica can catch up again.
Grant
Darwin NT
ID: 920048
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 920050 - Posted: 21 Jul 2009, 11:31:13 UTC - in response to Message 920047.  

I wonder if a manufacturer could be persuaded to "lend" SETI a suitable drive to test that assertion under field conditions. An extended test, to include SSD lifetime cycle limits, of course.

It would be good if they could.
I think the Seti servers would benefit greatly from them, and it'd give the manufacturers some solid data to work with in their development.
Grant
Darwin NT
ID: 920050
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20084
Credit: 7,508,002
RAC: 20
United Kingdom
Message 920052 - Posted: 21 Jul 2009, 11:39:15 UTC - in response to Message 920045.  
Last modified: 21 Jul 2009, 11:40:11 UTC

Thought this might be of interest to some.

Using an SSD for an OLTP log disk
"On a "normal" SLES 10 SP2 we achieved 1400 tr/s on a quad core (an anonymous CPU for now ;-). But Anand's article really got us curious and we replaced our mighty Cheetah disk with the Intel x25-M SSD (80 GB). All of a sudden we achieved 1900 tr/s! No less than 35% more transactions, just by replacing the disk that holds the log with the fastest SSD of the moment. That is pretty amazing if you consider that there is no indication whatsoever that we were bottlenecked by our log disk. ...

Good note there.

The critical bits are:

In MySQL each user thread can issue a write when the transaction is committed. More importantly, it is completely serial: there doesn't seem to be a separate log I/O thread which would allow our user thread to "fire" a disk operation "and forget". As we want to be fully ACID compliant our database is configured with
innodb_flush_log_at_trx_commit = 1

So after each transaction is committed, there is a "pwrite" first, then followed by a flush to the disk. So the actual transaction performance is also influenced by the disk write latency even if the disk is nowhere near its limits.
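
The write-then-flush pattern described in that quote can be imitated with a toy Python loop. This is only a sketch under stated assumptions: it is not MySQL's log code, `timed_commits` is a hypothetical helper, and a plain `os.write` followed by `os.fsync` on a temporary file stands in for the "pwrite" and flush.

```python
import os
import tempfile
import time


def timed_commits(n_commits, payload=b"x" * 512):
    """Append `payload` and fsync after every write, timing each round trip."""
    latencies = []
    fd, path = tempfile.mkstemp()
    try:
        for _ in range(n_commits):
            start = time.perf_counter()
            os.write(fd, payload)   # stands in for the log "pwrite"
            os.fsync(fd)            # ask the OS to flush the file to the device
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    lat = timed_commits(200)
    mean = sum(lat) / len(lat)
    print(f"mean commit latency: {mean * 1000:.3f} ms")
    print(f"serial commit ceiling: {1 / mean:.0f} commits/s")
```

Because every iteration waits for its own flush, the loop can never run faster than one over the mean flush latency, however idle the disk is otherwise. Results will vary a lot between a spinning disk, an SSD, and a drive with a write cache.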


And in the comments:

quote: "* typical average I/O latency is 0.23 ms (90%), with about 10% spikes of 7 to 12 ms

That reassured us that our transaction log disk was not a bottleneck"

No, that shows exactly that your disk latency is the limit: If these numbers are in the right ballpark, the average latency is at least (0.9*0.23 + 0.1*7) ~= 0.9 ms, which limits the number of transactions per second to ~1100. Your performance is limited by the 10% of transactions that actually incur a disk related latency.


Very nice when the numbers add up.
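
The arithmetic in the quoted comment can be reproduced in a few lines of Python. The function names are hypothetical; the inputs are the figures from the comment (90% of commits at 0.23 ms, 10% at 7 ms, taking the low end of the 7 to 12 ms spikes).

```python
def mean_latency_ms(buckets):
    """Weighted mean of (fraction_of_commits, latency_ms) pairs."""
    total = sum(fraction for fraction, _ in buckets)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(fraction * latency for fraction, latency in buckets)


def serial_commit_ceiling(latency_ms):
    """Commits per second if each commit waits for the previous one."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Figures from the quoted comment: 90% at 0.23 ms, 10% at 7 ms or more.
    latency = mean_latency_ms([(0.9, 0.23), (0.1, 7.0)])
    print(f"mean latency: {latency:.3f} ms")
    print(f"ceiling: {serial_commit_ceiling(latency):.0f} tr/s")
```

That gives a mean of about 0.907 ms and a ceiling of roughly 1100 transactions per second for a strictly serial log, matching the comment. Using 12 ms for the spikes instead would lower the ceiling further.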


Now... It troubles me that we have a potentially hugely parallel system with Boinc, and yet ALL Boinc server state change must go through just the ONE central database that is itself limited by the rate that ONE serial log can be updated!

So... We can only go as fast as that one log file can be updated.

(Note, the present bottleneck is the saturated download link. If that is cleared, we'll likely hit the MySQL log update bottleneck again.)
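
The "clear one bottleneck and hit the next" point can be sketched as a simple minimum over stage capacities. Everything here is an assumption for illustration: the stage names are loose labels, the capacities are invented numbers, and `bottleneck` is a hypothetical helper, not anything measured on the SETI@home servers.

```python
def bottleneck(capacities):
    """Return (stage, capacity) for the stage with the lowest capacity."""
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]


if __name__ == "__main__":
    # Results per second each stage could sustain on its own.
    # All numbers are invented for illustration only.
    stages = {
        "download link": 20.0,
        "upload handler": 60.0,
        "database log": 35.0,
    }
    print(bottleneck(stages))

    # Relieve the link and the next-lowest stage takes over as the limit.
    stages["download link"] = 100.0
    print(bottleneck(stages))
```

With these invented numbers the link is the limit at first, and once it is relieved the database log becomes the new limit, which is the pattern described above.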


A very good find there, yay!

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 920052
DJStarfox
Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 920086 - Posted: 21 Jul 2009, 14:12:28 UTC - in response to Message 920052.  

I agree that the MySQL BOINC database is a serious bottleneck. I wonder how much work it would be to update the code to use a different database system, such as an object-oriented database? Or, perhaps as a first step, could we update the code to use a different RDBMS such as Oracle or MS SQL?

ID: 920086
W-K 666 (Project Donor)
Volunteer tester
Joined: 18 May 99
Posts: 18996
Credit: 40,757,560
RAC: 67
United Kingdom
Message 920087 - Posted: 21 Jul 2009, 14:13:13 UTC
Last modified: 21 Jul 2009, 14:13:39 UTC

Just noticed that even though there was a download dip and upload spike in the graph that ML1 displayed, when looking at the packets there is virtually no variation on the uploads.
ID: 920087
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 920092 - Posted: 21 Jul 2009, 14:26:05 UTC - in response to Message 920042.  

As of approximately 1:30am UTC, it looks like there is a nice fat spike in uploaded (probably actually reported) results.

That's just Vistro and TCP Jesus/MC Hammer or whatever he decides to call himself this week pressing the retry button too many times.

Hey! Shame on you... Cynicism doesn't become you.


Then you obviously don't know me very well, nor my sense of humor. It's ok, a lot of people don't get my sense of humor until they really get a chance to know me.
ID: 920092
DJStarfox
Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 920093 - Posted: 21 Jul 2009, 14:27:58 UTC - in response to Message 920087.  

Just noticed that even though there was a download dip and upload spike in the graph that ML1 displayed, when looking at the packets there is virtually no variation on the uploads.


Yes, good point. This tells me that downloads are limiting upload speeds. This was discussed earlier. As you know, we should expect upload and download speeds to be highly correlated.
ID: 920093


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.