Working as Expected (Jul 13 2009)


AuthorMessage
Profile # Bob Ahlers #

Send message
Joined: 30 Mar 01
Posts: 18
Credit: 10,209,954
RAC: 0
Netherlands
Message 918079 - Posted: 15 Jul 2009, 16:39:22 UTC - in response to Message 918077.  

We're back online :-), GREAT WORK MATT!
ID: 918079 · Report as offensive
Profile HAL9000
Volunteer tester
Avatar

Send message
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 918109 - Posted: 15 Jul 2009, 17:21:16 UTC - in response to Message 917951.  

Poor You!

I have 165 Work Units that are waiting to be uploaded and have almost run out of units to crunch.

I am beginning to wonder if Boinc is "really" looking for "Aliens"? All this talk about slowing things down is starting to irritate me.

How about getting the Boinc end of the data receiving right???!

Not a week goes by when there is no breakdown in one or another of the equipment on the Boinc end and I have Work Units stacking up waiting to be up-loaded.

My question is: Do you want us to crunch the data, OR, don't you want us to crunch the data?

Joe


By enhancing the resolution they will be able to detect smaller signals than can currently be detected.

Their situation is like posting a message that you need help moving and then having 50,000 people show up at your door. The saying "Be careful what you ask for because you might just get it." fits really well here I think.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 918109 · Report as offensive
agnawt

Send message
Joined: 25 Jun 07
Posts: 15
Credit: 4,838,223
RAC: 0
Israel
Message 918132 - Posted: 15 Jul 2009, 18:01:28 UTC - in response to Message 918123.  
Last modified: 15 Jul 2009, 18:02:11 UTC

As usual with everything in life, I do not expect it to work at all. In that way if it indeed doesn't work, I do not get disappointed, but if it works I do get very happy :-)


Now there's an idea! Makes a nice QOTD too.
ID: 918132 · Report as offensive
Profile Profi
Volunteer tester

Send message
Joined: 8 Dec 00
Posts: 19
Credit: 20,552,123
RAC: 0
Poland
Message 918133 - Posted: 15 Jul 2009, 18:01:44 UTC - in response to Message 918123.  



As usual with everything in life, I do not expect it to work at all. In that way if it indeed doesn't work, I do not get disappointed, but if it works I do get very happy :-)



OK - you have an ambivalent approach to the whole situation. Fair enough. Nevertheless, I expect S@H to work; otherwise it's a bit of a waste of time for me :).

BTW the subject may be obsolete because something is going on (the server is not just dropping connections with connect() failures or HTTP errors - it's up but probably overwhelmed with requests).

Cheers,

ID: 918133 · Report as offensive
Jim Brynelson

Send message
Joined: 7 Jun 09
Posts: 1
Credit: 8,198
RAC: 0
Canada
Message 918140 - Posted: 15 Jul 2009, 18:13:44 UTC

I'm glad I checked the forum or I woulda never known why the 6 tasks that have been completed were still sitting in there in "upload" mode & have been for the last 4 days. I thought maybe it was my setup, but since I know it isn't that..I guess I'll just have to be patient.
Take care & have a great day
ID: 918140 · Report as offensive
Profile Bill Walker
Avatar

Send message
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 918179 - Posted: 15 Jul 2009, 19:16:48 UTC - in response to Message 918140.  

I'm glad I checked the forum or I woulda never known why the 6 tasks that have been completed were still sitting in there in "upload" mode & have been for the last 4 days. I thought maybe it was my setup, but since I know it isn't that..I guess I'll just have to be patient.
Take care & have a great day


Now THAT post reflects a thoughtful and caring Canadian. ;)

ID: 918179 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918189 - Posted: 15 Jul 2009, 19:48:25 UTC - in response to Message 918049.  

p.s.
Certainly the situation is the opposite of the thread title "Working as expected" - especially when looking from a cruncher's point of view..

I had the same comment as Sten-Arne.

It is working as you'd expect it to work when you have 180,000 BOINC clients trying to connect to one or two upload servers.

It's easy to say "more servers" but SETI has a finite amount of money and equipment, a finite amount of room in the server closet, a finite number of electrical outlets, and a finite amount of A/C.
ID: 918189 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918191 - Posted: 15 Jul 2009, 19:51:13 UTC - in response to Message 918070.  

p.s. I may be wrong. I'm just concerned about the frequency of the downtimes and lack of info. I'm wishing all the best for Matt and other guys @ Berkeley trying to get the best of the limited hardware resources.

Remember that the SETI@Home operations staff is basically three people.

Given the choice of frequent updates, or Matt giving the tasks his undivided attention, I vote for undivided attention.
ID: 918191 · Report as offensive
Profile Bill Walker
Avatar

Send message
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 918230 - Posted: 15 Jul 2009, 21:29:31 UTC - in response to Message 918191.  

Given the choice of frequent updates, or Matt giving the tasks his undivided attention, I vote for undivided attention.


Agreed.

ID: 918230 · Report as offensive
Profile CT-ET@seti
Avatar

Send message
Joined: 23 Jul 02
Posts: 19
Credit: 967,620
RAC: 0
United States
Message 918231 - Posted: 15 Jul 2009, 21:29:59 UTC

Well, as all of you know, the upload server is offline and my 2 computers are full of work to upload. I would offer my new monster 'puter as a server, but I'm really not sure how it works. Oh well, I'll have to wait till it's fixed, I guess, and lose all the work that goes out of date.
Me in front of the RED BEAR INN in Freiburg, Germany in 2000
ID: 918231 · Report as offensive
Zebra3
Avatar

Send message
Joined: 22 Oct 01
Posts: 186
Credit: 13,658,148
RAC: 0
Canada
Message 918238 - Posted: 15 Jul 2009, 21:47:02 UTC - in response to Message 918077.  

I'm wishing all the best for Matt and other guys @ Berkeley trying to get the best of the limited hardware resources.


I see that the donations for July are still increasing.

http://setiathome.berkeley.edu/donation_history.php?month=07&year=2009

Nothing wrong with the small donations. Any cash donation would help increase the physical resources needed to operate the project.

I realize that the majority donate to the project with their computer resources, and that the minority contribute to the project with their financial resources.

For the welfare of the project, I'm concerned enough to make a small financial donation each month, something that I should have done a long time ago.

While I am not eligible to get a tax receipt (non U.S.), it is not important. What is important is the continuation of the project.

Joe



Actually Joe, I had a chat with CCRA the other day on this exact topic and I was informed that donations to accredited foreign universities are deductible under Revenue Canada rules. If you look on the reverse of your receipt you will see the Revenue Canada tax number. So send as many Loonies and Twonies as you like and you will be able to deduct the donation on your taxes.
http://www.novascotia.com
ID: 918238 · Report as offensive
Val

Send message
Joined: 13 Sep 08
Posts: 1
Credit: 665,382
RAC: 0
Canada
Message 918258 - Posted: 15 Jul 2009, 22:48:24 UTC

thks Z3, just posted $10 and waiting for the receipt b4 tax time. Deductible too, eh :)
ID: 918258 · Report as offensive
Profile Profi
Volunteer tester

Send message
Joined: 8 Dec 00
Posts: 19
Credit: 20,552,123
RAC: 0
Poland
Message 918271 - Posted: 15 Jul 2009, 23:30:20 UTC - in response to Message 918189.  


I had the same comment as Sten-Arne.


Good for you. Agreed. As many people, as many approaches. Mine is different - less "I don't give a s*** about .....". I care whether S@H is working or not - for you it seems like "if it works - it works - cool....if not - ok - who cares - it's just a (stupid) project". It's always a question of how you see things - one can see a pot as half full, others as half empty.... you know what I mean? (please don't argue about who has which approach - it doesn't matter here)


It is working as you'd expect it to work when you have 180,000 BOINC clients trying to connect to one or two upload servers.


NO - it's your opinion. If the project aims to be a distributed project, its designers have to bear in mind its growth rate, the increasing number of users and many other things. One of them may be distributing the servers that hand out WUs and collect results - many projects have done so because they realized that they could not withstand the demand from so many clients. 180,000 clients is not a huge number if you consider OSSI or other services... The number of servers has to be increased and load-balanced one way or another. Imagine that processing power is increasing at a geometric (exponential) rate (it doubles every 18 months) and in addition you are enabling technologies such as CUDA.... - no way that a single farm of servers (limited to one or two racks) and a single 100 Base-T connection (sigh...) can support such a project. The solution is: more load-balanced servers and a much faster connection, or distribution of the load to many sites that exchange information between them (I still don't know how - I realize that this issue was already discussed with Matt and "atomization" is a barrier to overcome - it simply has to be done one way or another - suggestions welcome here). I know Matt is trying to squeeze as much as he can from what he's got, but he's only human... (give some ideas here to help him)


It's easy to say "more servers" but SETI has a finite amount of money and equipment, a finite amount of room in the server closet, a finite number of electrical outlets, and a finite amount of A/C.


If you are trying to maintain a project such as S@H localized in one place, you have to have the infrastructure to keep up with demand - otherwise you'll end up with the problems currently bugging S@H. I imagine that among those 180,000 active users there are people/institutions able to support the distribution of the workload (WU distribution, collection of the results etc.). Current demand from the S@H clients exceeds the capability of the SSL S@H hardware and it will only get worse as time goes by. So to keep the project going, some solution has to be found now - otherwise, due to downtimes, people will migrate to other projects such as Folding@Home, Rosetta@Home or World Community Grid (a list of projects can be found at www.distributedcomputing.info). And concerning limited resources - I know it's always money....and I know that Matt et al. DO NOT HAVE the money to buy new equipment - so they have to come up with an idea that will overcome the current problem.

Cheers,
ID: 918271 · Report as offensive
Profile Profi
Volunteer tester

Send message
Joined: 8 Dec 00
Posts: 19
Credit: 20,552,123
RAC: 0
Poland
Message 918281 - Posted: 15 Jul 2009, 23:43:00 UTC - in response to Message 918191.  


Remember that the SETI@Home operations staff is basically three people.


OK - so spread the workload. Many people have started their own BOINC projects on the basis of the BOINC platform developed @ Berkeley (usually with help from the BOINC developers) - I imagine that there is a possibility to launch a site to support the clients from Europe, another site for Asia etc. within S@H, especially with support from Matt and the other brilliant guys running SETI now. There must be a way to distribute the workunits to different sites...


Given the choice of frequent updates, or Matt giving the tasks his undivided attention, I vote for undivided attention.


To drop a line once a day is not a big deal. A quick note is enough, e.g. "We have an upload server problem and we are working on it", and after the problem is solved, some more elaboration on what the problem was and what solution was found. I'm against a situation where people who don't have experience with S@H or sufficient computer knowledge don't know what is going on. I realize that such a note may ignite some discussion, but at least everyone will know what the situation is.

Cheers,
ID: 918281 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918289 - Posted: 15 Jul 2009, 23:56:54 UTC - in response to Message 918271.  


I had the same comment as Sten-Arne.


Good for you. Agreed. As many people, as many approaches. Mine is different - less "I don't give a s*** about .....". I care whether S@H is working or not - for you it seems like "if it works - it works - cool....if not - ok - who cares - it's just a (stupid) project". It's always a question of how you see things - one can see a pot as half full, others as half empty.... you know what I mean? (please don't argue about who has which approach - it doesn't matter here)


It is working as you'd expect it to work when you have 180,000 BOINC clients trying to connect to one or two upload servers.


NO - it's your opinion. If the project aims to be a distributed project, its designers have to bear in mind its growth rate, the increasing number of users and many other things. One of them may be distributing the servers that hand out WUs and collect results - many projects have done so because they realized that they could not withstand the demand from so many clients.

The first step in solving a problem is defining the problem.

Your statement is "the project must throw hardware (money) at the problem because 180,000 clients are going to cause this problem."

My statement is "The problem is caused by 180,000 clients accessing the servers all at once."

They can find some funding to bring in the fiber (not fast), they can move the servers someplace else on campus (probably faster, but still not fast), or there may be some way to get the BOINC clients to slow down and patiently take turns -- also not fast, but far less expensive than adding facilities.
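
One common way to make clients "slow down and patiently take turns" is randomized exponential backoff. The sketch below is only an illustration of that general technique, not BOINC's actual client logic; the function names, the 60-second base delay and the 4-hour cap are hypothetical values picked for the example.

```python
import random

def backoff_delay(attempt, base=60.0, cap=4 * 3600.0, rng=random):
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical parameters: `base` is the first retry window and
    `cap` is the longest window a client will ever use.
    """
    # Double the window on each failed attempt, but never exceed the cap.
    # The exponent is clamped so very large attempt counts stay cheap.
    ceiling = min(cap, base * (2 ** min(attempt, 20)))
    # "Full jitter": pick a random point in the window so that many
    # clients that failed at the same moment do not retry in lockstep.
    return rng.uniform(0, ceiling)

def retry_schedule(max_attempts, seed=None):
    """Return the list of waits one client would use across its retries."""
    rng = random.Random(seed)
    return [backoff_delay(n, rng=rng) for n in range(max_attempts)]
```

The randomness matters as much as the doubling: without it, every client that was turned away during an outage would come back at the same instant once its timer expired.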
ID: 918289 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918293 - Posted: 16 Jul 2009, 0:05:53 UTC - in response to Message 918271.  


Good for you. Agreed. As many people, as many approaches. Mine is different - less "I don't give a s*** about .....". I care whether S@H is working or not - for you it seems like "if it works - it works - cool....if not - ok - who cares - it's just a (stupid) project". It's always a question of how you see things - one can see a pot as half full, others as half empty.... you know what I mean? (please don't argue about who has which approach - it doesn't matter here)

Please do not put words in my mouth -- because that is not my position at all.

I want to see it work, perfectly, every day, but there are two very common fallacies that color your thinking:

1) The servers are down right now, and this time-critical work has to get back right now!

2) The project requires smooth flow at all times.

On the first point, we're working on signals that are decades old -- assuming that the source is relatively nearby. It likely isn't.

On the second point, the amazing thing about BOINC that is not true of most other internet based activities is that the project itself can tolerate some bad times.

The problem, which you have illustrated quite well, is that the BOINC servers aren't unhappy. The BOINC clients are retrying on their own, and will get work through when they can.

... and many of the users who watch closely are worried and upset.

I'm not just sitting back and saying "this is how it is" -- when I'm not correcting people when they mis-state my position, I'm reading BOINC source code and thinking about ways to stop the BOINC clients from hammering the BOINC servers.

I think I can see ways to make things better. They're worth trying.

In the meantime, I think we all get to choose: we can be anxious or worried or angry, or we can see if there is some way to make BOINC work better, or we can see if there isn't some way to help get the gigabit upgrade.

ID: 918293 · Report as offensive
Profile Profi
Volunteer tester

Send message
Joined: 8 Dec 00
Posts: 19
Credit: 20,552,123
RAC: 0
Poland
Message 918305 - Posted: 16 Jul 2009, 0:48:58 UTC - in response to Message 918293.  


Please do not put words in my mouth -- because that is not my position at all.


I don't put any words in your mouth - I'm not claiming that you've said so - it's just my perception of what you have said - or maybe more correctly - what I have understood of what you've said.


I want to see it work, perfectly, every day,[..]


Totally agree.



but there are two very common fallacies that color your thinking:

1) The servers are down right now, and this time-critical work has to get back right now!

2) The project requires smooth flow at all times.



1) this happens - what worries me is the downtime frequency nowadays.
It's not a time-critical work - and will never be such.


On the first point, we're working on signals that are decades old -- assuming that the source is relatively nearby. It likely isn't.



The signals are decades old - so? The very same workunits have been mangled several times, each time the processing algorithm changed - no big deal. If not processed with the new app - they certainly will be..with increased resolution, sensitivity, chirp rate, etc. There are so many knobs to turn to improve the quality of the search - but it was not possible to do when S@H started - now we have our "big shiny powerful" machines...


On the second point, the amazing thing about BOINC that is not true of most other internet based activities is that the project itself can tolerate some bad times.


Not forever, nor on a frequent basis - if you bear in mind that S@H has a big connection/disk IO problem, outages do not help. It may happen that the project will not recover between outages, and then people will be more and more frustrated. The BOINC people encourage participants to devote CPU time to many projects, but there are people (including me) who wish to dedicate their CPU cycles to one project (or at least the lion's share to one of them).


The problem, which you have illustrated quite well, is that the BOINC servers aren't unhappy. The BOINC clients are retrying on their own, and will get work through when they can.


As mentioned above, here is an example cause-and-effect chain: machine is crunching mostly S@H -> the WU queue has drained -> machine is waiting to upload the results to get new WUs -> due to the outage the machine is idle -> switch to another project to occupy the idle time, or wait for the project to recover.
If the last part is repeated on a frequent basis, one may switch to a more stable project.


... and many of the users who watch closely are worried and upset.


As for me - more worried than upset - no hard feelings towards the S@H team - they are doing exceptionally good work with what they have. And it seems like S@H has grown beyond its developers' imagination. Unfortunately, the popularity of S@H may (or may not) stop, or at least significantly slow, the growth of the project.


I'm not just sitting back and saying "this is how it is" -- when I'm not correcting people when they mis-state my position, I'm reading BOINC source code and thinking about ways to stop the BOINC clients from hammering the BOINC servers.


Uff.... - then you'll have to think quicker.. :) - no - don't get offended.. just kidding. Keep up the good work - all help is appreciated.


I think I can see ways to make things better. They're worth trying.

Sure - trac'em. Since the number of S@H staff members is so limited, they may not come up with every idea, nor have knowledge as great as that of 180k users.


In the meantime, I think we all get to choose: we can be anxious or worried or angry, or we can see if there is some way to make BOINC work better, or we can see if there isn't some way to help get the gigabit upgrade.


anxious - maybe a bit
worried - sure - I am a bit S@H addicted..:P
angry - certainly not - there is no one to blame - moreover I'm doing this mostly for fun.

Gigabit is one solution, but looking at the growth rate of SETI it is only a temporary one - S@H will face the hardware problem again in the future - I think the better way is to distribute the backend of the project across many sites, preferably spread around the globe.
ID: 918305 · Report as offensive
Nicolas
Avatar

Send message
Joined: 30 Mar 05
Posts: 161
Credit: 12,985
RAC: 0
Argentina
Message 918356 - Posted: 16 Jul 2009, 3:55:05 UTC - in response to Message 918305.  

Gigabit is one solution, but looking at the growth rate of SETI it is only a temporary one - S@H will face the hardware problem again in the future - I think the better way is to distribute the backend of the project across many sites, preferably spread around the globe.

Many people suggested putting multiple servers around the world for clients to download from, so the bandwidth is distributed. But how do you get the data to those servers in the first place? It would have to go from the servers at Berkeley to the distributed download servers through the current 100mbit pipe. I don't see how that lowers bandwidth at all.

Same for upload. Yes, you can put multiple upload servers around the world, but eventually, somehow, the data has to go back to setiathome.berkeley.edu.


Contribute to the Wiki!
ID: 918356 · Report as offensive
Profile Geek@Play
Volunteer tester
Avatar

Send message
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 918370 - Posted: 16 Jul 2009, 4:30:53 UTC - in response to Message 918356.  

Gigabit is one solution, but looking at the growth rate of SETI it is only a temporary one - S@H will face the hardware problem again in the future - I think the better way is to distribute the backend of the project across many sites, preferably spread around the globe.

Many people suggested putting multiple servers around the world for clients to download from, so the bandwidth is distributed. But how do you get the data to those servers in the first place? It would have to go from the servers at Berkeley to the distributed download servers through the current 100mbit pipe. I don't see how that lowers bandwidth at all.

Same for upload. Yes, you can put multiple upload servers around the world, but eventually, somehow, the data has to go back to setiathome.berkeley.edu.


Also I don't believe the project scientists would like to lose direct control of the project data. It's like a "chain of custody" in an investigation. If you can't account for the security of the data throughout the process then questions will arise if something of some importance is claimed at a later date.
Boinc....Boinc....Boinc....Boinc....
ID: 918370 · Report as offensive
Nicolas
Avatar

Send message
Joined: 30 Mar 05
Posts: 161
Credit: 12,985
RAC: 0
Argentina
Message 918371 - Posted: 16 Jul 2009, 4:38:23 UTC - in response to Message 918370.  

Also I don't believe the project scientists would like to lose direct control of the project data. It's like a "chain of custody" in an investigation. If you can't account for the security of the data throughout the process then questions will arise if something of some importance is claimed at a later date.

You can be sure files won't be modified. There are checksums on all files, and additionally digital signatures for executable files.

For uploads, the client first sends the file to the upload server, then sends checksums when reporting the task (which wouldn't go through any intermediate server).
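
The check described above can be sketched roughly as follows. This is a minimal illustration, not BOINC's code: the thread does not say which hash algorithm or message format BOINC uses, so SHA-256 and the function names `file_digest` and `verify_upload` are assumptions made for the example.

```python
import hashlib
import hmac

def file_digest(data: bytes) -> str:
    """Hex digest the client would compute over the result file.

    SHA-256 is an assumption here; the algorithm is not named in the thread.
    """
    return hashlib.sha256(data).hexdigest()

def verify_upload(received: bytes, reported_digest: str) -> bool:
    """Project-side check: does the file that arrived via the upload
    server match the digest the client reported directly?"""
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(file_digest(received), reported_digest)
```

Because the digest travels on a separate path from the file, an intermediate upload server that altered or truncated the file would produce a mismatch when the task is reported.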

Contribute to the Wiki!
ID: 918371 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.