Panic Mode On (7) Server Problems! Closed for Renovation

Message boards : Number crunching : Panic Mode On (7) Server Problems! Closed for Renovation

1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 739581 - Posted: 15 Apr 2008, 23:40:14 UTC - in response to Message 739528.  

Look. You can all do what you like, but I'm saying there is no need to be so anally retentive over workunits. I ran out too, but did abc for a few hours and now have new seti WUs, it's no big deal!

If everyone had 10 day caches, there would be (10 x WU per day x number of users) workunits in circulation and they would need to be stored on disk and kept track of in the database, so those with large caches are actually contributing to the problems you are so desperately trying to avoid!

If everyone had 0-2 day caches, then the Tuesday downtime would be shorter and the servers and storage space wouldn't be so stressed, and everyone can have shorter pending lists and go home earlier... There are about 3 million WUs in circulation, each taking up about a third of a megabyte - go figure!
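As a rough check on those figures, here is a minimal sketch. Only the 3 million WUs and the one-third megabyte per WU come from the post; the user count and WUs per day are made-up placeholders, and the function names are hypothetical.

```python
# Rough storage estimate for workunits (WUs) in circulation.
# Only WUS_IN_CIRCULATION and MB_PER_WU come from the post; every other
# number below is a hypothetical input for illustration.

WUS_IN_CIRCULATION = 3_000_000
MB_PER_WU = 1 / 3


def storage_gb(wu_count: float, mb_per_wu: float = MB_PER_WU) -> float:
    """Disk space in GB (1 GB = 1000 MB) needed to hold wu_count WUs."""
    return wu_count * mb_per_wu / 1000


def wus_in_circulation(cache_days: float, wu_per_day: float, users: int) -> float:
    """The post's estimate: cache days x WUs per day x number of users."""
    return cache_days * wu_per_day * users


current_gb = storage_gb(WUS_IN_CIRCULATION)
print(f"3 million WUs: about {current_gb:.0f} GB")

# Hypothetical population: 100,000 users crunching 5 WUs a day each.
for days in (2, 10):
    n = wus_in_circulation(days, wu_per_day=5, users=100_000)
    print(f"{days}-day caches: {n:,.0f} WUs, about {storage_gb(n):.0f} GB")
```

The point of the sketch is only that the number of WUs in circulation grows in direct proportion to cache length; the absolute sizes depend entirely on the assumed inputs.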

It just makes sense to act responsibly to help smooth things along, that's all.

As for ridiculously long deadlines - there needs to be a method of cancelling orphaned WUs if someone bites off more than they can chew and throws up...

You seemed to miss many people's point...We are on the Seti board because a lot of us feel loyal to Seti and do not run other projects...When Seti goes away, unless some new project comes along that I am as passionate about, I will probably just turn things off again like I did when Boinc took over. Maybe finding ABC's is important to you, but it's not worth my electric bill....

I see Andy's point -- if the average "major" outage is a couple of days, then carrying ten days work (or more) is overkill.

I agree that "enough" generally is enough.

At the same time, there is no reason to criticize someone for carrying around a huge cache because there is no real harm in doing so. Some work will be returned later as a result, but as long as it is returned, why worry?

There is an issue with carrying a really large cache if you run more than one project -- there is a case where the current BOINC versions can't meet deadlines and honor resource share. In that case, a one-day cache solves many problems.

On deadlines: if the deadlines are too short, slower computers can't crunch them in time -- and there are people out there participating on "dated" systems. I won't argue to exclude slower systems if the owners want to participate.
ID: 739581
Blu Dude
Volunteer tester
Joined: 28 Dec 07
Posts: 83
Credit: 34,940
RAC: 0
United States
Message 739582 - Posted: 15 Apr 2008, 23:41:04 UTC - in response to Message 739573.  

You seemed to miss many people's point...We are on the Seti board because a lot of us feel loyal to Seti and do not run other projects...When Seti goes away, unless some new project comes along that I am as passionate about, I will probably just turn things off again like I did when Boinc took over.


Yep I totally agree. If S@H disappears that's it for me, I won't crunch any of the other projects as none really catch my interest and are not worth the power bill to me. I have a 4 day cache and that should get me through all but the most severe WU drought.

If people can't wait 4 days for me to crunch, then bad luck for them. About 70% of my WU's seem to go into pending for a day or so anyway, so in real terms a few people will have to wait the extra day for me to crunch, but most won't even be slightly affected at all.


I agree with the pending stuff - who really cares if 1 or 2 out of 20 WU's have to wait an extra day. No one's going to throw a temper tantrum over it. Sure it loads the database, but all the board views and posts do too.
I'm a Prefectionist ;)
ID: 739582
Profile AndyW Project Donor
Volunteer tester
Joined: 23 Oct 02
Posts: 5862
Credit: 10,957,677
RAC: 18
United Kingdom
Message 739707 - Posted: 16 Apr 2008, 6:09:58 UTC - in response to Message 739525.  

Look. You can all do what you like, but I'm saying there is no need to be so anally retentive over workunits. I ran out too, but did abc for a few hours and now have new seti WUs,


But I choose not to run any other projects. The machines are 24/7 SETI and that's my choice, as is the 10 day cache. If it were a problem the setting wouldn't be there or would be changed in the next revision. If it is, I'll obviously accept that because if I choose to participate in the project I choose to play by their rules.


it's no big deal!


Glad you think so, so we agree on something then :)
ID: 739707
Profile littlegreenmanfrommars
Volunteer tester
Joined: 28 Jan 06
Posts: 1410
Credit: 934,158
RAC: 0
Australia
Message 739750 - Posted: 16 Apr 2008, 9:33:26 UTC - in response to Message 739707.  

Look. You can all do what you like, but I'm saying there is no need to be so anally retentive over workunits. I ran out too, but did abc for a few hours and now have new seti WUs,


But I choose not to run any other projects. The machines are 24/7 SETI and that's my choice, as is the 10 day cache. If it were a problem the setting wouldn't be there or would be changed in the next revision. If it is, I'll obviously accept that because if I choose to participate in the project I choose to play by their rules.


it's no big deal!


Glad you think so, so we agree on something then :)


The longest outage I can remember was 9 days.
In those days, I kept a 4 day cache, so I ran out of WU's.
Big deal!
It was my choice to run Einstein as a backup project, so they benefitted.
I would not criticise anyone for turning their PCs off if they run out of work... it's their choice what to do with their equipment. (AND power bill!)

I run a 10 day cache, as I have a pretty fast machine. It hurts no-one, as the results are returned pretty darn fast.
However, my own pending folder is now worth over 2,000 creds. If I was that way inclined, it would concern me, but it doesn't, because the WU's will get returned eventually.

It's not as if I had $2,000 stuck where I couldn't get at them, is it?

Respectfully,

lgm
ID: 739750
Profile Andy Lee Robinson
Joined: 8 Dec 05
Posts: 630
Credit: 59,973,836
RAC: 0
Hungary
Message 739757 - Posted: 16 Apr 2008, 10:36:11 UTC - in response to Message 739581.  

Ned, Americans aren't used to English debate - they see a strong point of view as a criticism, when it isn't!

I am merely stating my seasoned opinion as a highly qualified system administrator.

Last year's 9 day outage was an extreme event, with delays caused by negotiations, physical ordering and transportation and reconfiguring a server that is the core of the whole system.
It has now proved itself to be more resilient, and as older machines are being updated the system's reliability is improving. Outages now should be much less than two days.

The real problem with large caches is that all those WUs are tied up and have to be tracked and stored while in circulation, which is 'unfriendly' to server resources.

Those with maximum caches probably have low pending credit because everyone else is waiting for them! They also run the risk that if their machine crashes, all those WUs disappear and hold up workflow for many weeks, and they linger in the database, requiring longer backup times and more storage.

So, I stand by my opinion:

If a person wants to be a good citizen, respect system resources and their fellow crunchers, then they should have a minimum cache necessary to get over normal outages of a day or two max.

If they live on an island and can only row to shore once a week to connect to the net, then a 10 day or more cache is perfectly appropriate. That is what the cache is *really* for!

Andy.
ID: 739757
Profile Andy Lee Robinson
Joined: 8 Dec 05
Posts: 630
Credit: 59,973,836
RAC: 0
Hungary
Message 739759 - Posted: 16 Apr 2008, 10:45:45 UTC - in response to Message 739750.  
Last modified: 16 Apr 2008, 10:46:23 UTC

I run a 10 day cache, as I have a pretty fast machine. It hurts no-one, as the results are returned pretty darn fast.


The cache is a First In First Out queue (under normal conditions).

You may think you are processing them quickly, but all you are doing is processing WUs that have already been waiting for 10 days on your machine for their number to come up.

You still return your WUs 10 days later than you got them, and your machine's speed is totally irrelevant.

Faster machine = more WUs in the cache.
Slower machine = fewer WUs in the cache.
Still taking 10 days waiting for their turn.
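The queueing argument can be sketched with a small simulation. This is an illustration only, assuming the client works through its cache strictly in arrival order and fetches one replacement WU each time it finishes one; it is not how the BOINC client is implemented, and the function name and parameters are hypothetical.

```python
from collections import deque


def average_turnaround_days(cache_days: float, wu_per_day: int, sim_days: int = 60) -> float:
    """Simulate a cache kept topped up to cache_days of work, processed in arrival order.

    Returns the mean time between downloading a WU and finishing it, measured
    only for WUs downloaded after the initial fill. sim_days must be longer
    than cache_days so that some of those WUs actually get finished.
    """
    target = int(cache_days * wu_per_day)   # WUs needed to hold cache_days of work
    if target < 1:
        raise ValueError("cache must hold at least one WU")
    step = 1.0 / wu_per_day                 # days needed to crunch one WU
    queue = deque([0.0] * target)           # download times; initial fill at t = 0
    now, waits = 0.0, []
    for _ in range(sim_days * wu_per_day):
        downloaded = queue.popleft()        # oldest WU first
        now += step                         # crunch it
        if downloaded > 0.0:
            waits.append(now - downloaded)
        queue.append(now)                   # fetch a replacement to refill the cache
    return sum(waits) / len(waits)


# A slow machine (4 WUs/day) and a fast one (40 WUs/day), both with a 10 day cache.
for rate in (4, 40):
    print(rate, "WU/day ->", round(average_turnaround_days(10, rate), 2), "days")
```

Under these assumptions the turnaround comes out at the cache length for both machines, because a faster machine simply holds proportionally more WUs in front of each new arrival.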

Now, does anyone see the logic of why *unnecessarily* large caches are not good?
ID: 739759
Profile KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 944
Credit: 52,956,491
RAC: 67
United Kingdom
Message 739762 - Posted: 16 Apr 2008, 10:56:52 UTC - in response to Message 739759.  

Now, does anyone see the logic of why *unnecessarily* large caches are not good?


Please forgive me when I am on holiday next week. I intend to disconnect from the Internet but leave the machine slowly crunching away. I shall need about 8 days of cache. If I was away for 2 weeks then I should want to have 14 days' worth. I suspect there may be a fair amount of that during the summer days and not just people trying to avoid S@H downtime.

ID: 739762
Profile David
Volunteer tester
Joined: 19 May 99
Posts: 411
Credit: 1,426,457
RAC: 0
Australia
Message 739768 - Posted: 16 Apr 2008, 11:24:39 UTC - in response to Message 739757.  

If a person wants to be a good citizen, respect system resources and their fellow crunchers, then they should have a minimum cache necessary to get over normal outages of a day or two max.


Yep I agree - get them through a normal outage with room (or WU's) to spare. I had a 3 day cache, and after the last outage of < 2 days I had only a few WU's left (less than 6 hrs) on each PC, so that was good. I bumped the cache up by a day so I have a little more breathing room for future outages, but at the moment I have not resorted to a 10 day cache, but if outages end up long enough then I might have to lol



ID: 739768
Profile SATAN
Joined: 27 Aug 06
Posts: 835
Credit: 2,129,006
RAC: 0
United Kingdom
Message 739771 - Posted: 16 Apr 2008, 11:26:16 UTC

Andy Lee, what exactly is the reason behind the old woman style moaning?

If it doesn't bother you, why bring the subject matter up?
ID: 739771
Profile Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 739784 - Posted: 16 Apr 2008, 11:59:10 UTC

Last year's Thumper outage lasted about 12-14 days including catchup time.

It started on 1st May 2007 and ended around 12 May 2007.

That was an exceptional outage. The most recent one was ~ 42 hours i.e. less than 2 days.

BOINC is designed to cope with whole project outages. OK, some of you want to run SETI only. That is your choice. I have SETI as my highest resource share, but I also manage to run 8 other projects (including SETI Beta); it should be 9, but I just joined orbit@home and they don't currently have any WU's.

I run a 2.5 to 3 day cache across all projects which usually gives me about 2 - 5 SETI WU's in reserve. Recent SETI WU crunch times have varied between 2 to 6.5 hours. It works for me.

If you are running SETI only, I don't see any good reason to go beyond 5 days cache, 3 days should be enough to cover most normal outages.

Even if you got a whole bunch of -9 overflows, you would be very unlucky to get more than 10 in a row.

I think some people run the big caches because they see that option and just "go for the biggest". If the software allowed it they would probably cache 21 or 30 days worth of WU's!

Just my thoughts, btw this thread seems to have gone a bit off topic.
Sir Arthur C Clarke 1917-2008
ID: 739784
Profile SATAN
Joined: 27 Aug 06
Posts: 835
Credit: 2,129,006
RAC: 0
United Kingdom
Message 739791 - Posted: 16 Apr 2008, 12:34:06 UTC

Let's just hope NEZ's timings coincide with a splitter problem; if his machines decide to request 200,000 or so units at once then we could see a lot of issues. His RAC has tripled since I joined the project, so God knows what new machines he has managed to get working on Boinc. The guys are still having splitter problems now; if this spreads then we could see another outage.
ID: 739791
Profile littlegreenmanfrommars
Volunteer tester
Joined: 28 Jan 06
Posts: 1410
Credit: 934,158
RAC: 0
Australia
Message 739801 - Posted: 16 Apr 2008, 13:13:03 UTC - in response to Message 739759.  

I run a 10 day cache, as I have a pretty fast machine. It hurts no-one, as the results are returned pretty darn fast.



You may think you are processing them quickly, but all you are doing is processing WUs that have already been waiting for 10 days on your machine for their number to come up.

Since I have a pending file nearly three times as much as my machines can process in a day, I'd say I'm getting stuff done faster than the rest of the quorum, in most cases. The 10 day cache isn't holding anyone up.

You still return your WUs 10 days later than you got them, and your machine's speed is totally irrelevant.

See above.

I hate to sound rude, because this isn't meant to be rude.
I think you're wide of the mark.
However, I shall respect your right to have an opinion.

Respectfully,

lgm
ID: 739801
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19171
Credit: 40,757,560
RAC: 67
United Kingdom
Message 739806 - Posted: 16 Apr 2008, 13:31:22 UTC

With seven day deadlines on VHAR units, can you actually get a 10 day cache? I wouldn't have thought so.

The last time we looked at the spread of units across the AR spectrum, VHAR's made up 30% of all units, but only ~10% of crunching time. The percentage of VHAR's may have fallen since then but not by much.
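Taking those two percentages at face value, the relative crunch time of a VHAR unit follows from simple arithmetic. The sketch below is only that arithmetic; the function name is hypothetical and the inputs are the approximate shares quoted in the post.

```python
def relative_crunch_time(unit_share: float, time_share: float) -> float:
    """Average time per unit inside a group, relative to units outside it.

    unit_share: fraction of all units that belong to the group.
    time_share: fraction of total crunching time spent on the group.
    """
    inside = time_share / unit_share                # time per unit, in the group
    outside = (1 - time_share) / (1 - unit_share)   # time per unit, everything else
    return inside / outside


ratio = relative_crunch_time(unit_share=0.30, time_share=0.10)
print(f"A VHAR takes about {ratio:.2f}x as long as an average non-VHAR unit")
```

In other words, if the shares are right, a VHAR unit finishes in roughly a quarter of the time of the other units, which is why so many of them are needed to fill a cache measured in days.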
ID: 739806
Profile AndyW Project Donor
Volunteer tester
Joined: 23 Oct 02
Posts: 5862
Credit: 10,957,677
RAC: 18
United Kingdom
Message 739816 - Posted: 16 Apr 2008, 14:11:57 UTC - in response to Message 739806.  

With seven day deadlines on VHAR units, can you actually get a 10 day cache? I wouldn't have thought so.




Any WUs with a short deadline run as "High Priority" in Boinc, so in effect jump the queue. In theory the cache would stay at 10 days as you are never going to get a run of hundreds of VHAR units.
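The "jump the queue" behaviour can be pictured as earliest-deadline-first ordering. The sketch below is only an illustration of that idea using a heap; it is not the BOINC client's scheduler, and the WU names and the 30 day deadline are hypothetical (only the seven day VHAR deadline comes from the thread).

```python
import heapq
from typing import List, Tuple


def run_order(cache: List[Tuple[str, float]]) -> List[str]:
    """Return WU names in earliest-deadline-first order.

    cache holds (name, days_until_deadline) pairs in download order.
    Ties on deadline are broken by download order.
    """
    heap = [(deadline, position, name)
            for position, (name, deadline) in enumerate(cache)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


# Two ordinary WUs downloaded first, then a short-deadline VHAR unit.
cache = [("normal_a", 30.0), ("normal_b", 30.0), ("vhar_c", 7.0)]
print(run_order(cache))
```

Here the VHAR unit runs first even though it was downloaded last, while the two ordinary units keep their original relative order.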
ID: 739816
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19171
Credit: 40,757,560
RAC: 67
United Kingdom
Message 739819 - Posted: 16 Apr 2008, 14:15:45 UTC - in response to Message 739816.  

With seven day deadlines on VHAR units, can you actually get a 10 day cache? I wouldn't have thought so.




Any WUs with a short deadline run as "High Priority" in Boinc, so in effect jump the queue. In theory the cache would stay at 10 days as you are never going to get a run of hundreds of VHAR units.

But as soon as you are in EDF, you are inhibited from downloading more units.
And were you not here at the end of last Dec? Until Matt changed the splitter sequence, we had nothing but VHAR's.
ID: 739819
Profile AndyW Project Donor
Volunteer tester
Joined: 23 Oct 02
Posts: 5862
Credit: 10,957,677
RAC: 18
United Kingdom
Message 739826 - Posted: 16 Apr 2008, 14:37:47 UTC - in response to Message 739819.  



And were you not here at the end of last Dec? Until Matt changed the splitter sequence, we had nothing but VHAR's.



I missed that fun as I had a 2 year break from SETI after moving house and selling all my belongings. Only started crunching again in February this year.

Nothing but VHAR's? How on Earth did the servers/network stand up to that hammering...or didn't they?!
ID: 739826
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19171
Credit: 40,757,560
RAC: 67
United Kingdom
Message 739843 - Posted: 16 Apr 2008, 15:35:53 UTC - in response to Message 739826.  
Last modified: 16 Apr 2008, 15:44:12 UTC



And were you not here at the end of last Dec? Until Matt changed the splitter sequence, we had nothing but VHAR's.



I missed that fun as I had a 2 year break from SETI after moving house and selling all my belongings. Only started crunching again in February this year.

Nothing but VHAR's? How on Earth did the servers/network stand up to that hammering...or didn't they?!

The servers did quite well, you can see on the Yearly Cricket graphs that the average comms rates were at about the highest between ~20th Dec to ~5th Jan.

Matt's Tech News post from when he first noticed the high traffic volumes and lots of VHAR's is post 696817, Happy 2454466.5! (Jan 02 2008)
ID: 739843
Profile Andy Lee Robinson
Joined: 8 Dec 05
Posts: 630
Credit: 59,973,836
RAC: 0
Hungary
Message 739865 - Posted: 16 Apr 2008, 16:29:40 UTC - in response to Message 739762.  

Now, does anyone see the logic of why *unnecessarily* large caches are not good?


Please forgive me when I am on holiday next week. I intend to disconnect from the Internet but leave the machine slowly crunching away. I shall need about 8 days of cache. If I was away for 2 weeks then I should want to have 14 days' worth. I suspect there may be a fair amount of that during the summer days and not just people trying to avoid S@H downtime.


Which bit of *unnecessarily* didn't you understand? You don't need my forgiveness!
That's a perfectly valid use of the cache.
ID: 739865
Profile Andy Lee Robinson
Joined: 8 Dec 05
Posts: 630
Credit: 59,973,836
RAC: 0
Hungary
Message 739874 - Posted: 16 Apr 2008, 16:42:41 UTC - in response to Message 739801.  

I hate to sound rude, because this isn't meant to be rude.
I think you're wide of the mark.


No rudeness inferred! I'd die for your right to say I'm wide of the mark, but I also respectfully don't agree with you.
I am looking at the process from a system and resource allocation perspective, instead of limiting my horizons to the extent of my cache!

If everyone behaved this way at the dinner table the cook would be trying to make room for supplies all the time instead of actually doing the cooking, while those that want to take as many cookies as they can in case they're hungry later cause some to go without! ...or something like that.

Just good manners - take what you need when you need it, and keep enough for downtime, not more unless you only have intermittent net access.

Andy.
ID: 739874
Profile Logan
Volunteer tester
Joined: 26 Jan 07
Posts: 743
Credit: 918,353
RAC: 0
Spain
Message 740083 - Posted: 16 Apr 2008, 22:07:19 UTC
Last modified: 16 Apr 2008, 22:15:32 UTC

Upppssss....!

Uploads don't work. (Downloads are working fine, for the moment...)


Best regards.
Logan.

BOINC FAQ Service (now also available in Spanish)
ID: 740083


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.