Panic Mode On (21) Server problems

1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 919153 - Posted: 18 Jul 2009, 22:55:58 UTC - in response to Message 919100.  

I had recently shut down all of my crunchers due to a spell in hospital. When I try to restart them, all I get is continued failures to upload or download. I have been with SETI@Home since 1999 and would really love to continue to crunch, but you cannot have several machines sitting around waiting to upload, and you can't switch them off as that will lose the work and upset your wingman.

I will set no new tasks and crunch something that might actually be useful.

Bernie

So, pick another project that interests you and set the resource shares.

BOINC is designed to meet resource shares even when there are outages. You can crunch something else and when SETI is working well you'll do all SETI for a while.

SETI has always promised that there will be times when you can't get work, and has always recommended crunching for more than one project.

You should not punish them for doing what they've always promised.
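
As a rough illustration of the resource-share idea described in this reply, the sketch below splits compute time in proportion to each project's share and hands an unavailable project's time to the others. It is only a simplified model: the function name and the example shares are hypothetical, and the real BOINC client also tracks how much each project is owed over time, which this does not attempt to reproduce.

```python
def share_fractions(shares, available=None):
    """Return each project's fraction of compute time, proportional to its share.

    shares:    dict mapping project name -> resource share.
    available: optional set of projects that currently have work;
               projects outside it get 0 and the rest split the time.
    """
    # Keep only projects that can actually supply work right now.
    active = {p: s for p, s in shares.items()
              if available is None or p in available}
    total = sum(active.values())
    return {p: (active.get(p, 0) / total if total else 0.0) for p in shares}


# Hypothetical setup: two projects with equal shares.
shares = {"SETI@home": 100, "Milkyway@home": 100}

print(share_fractions(shares))                                # both projects up
print(share_fractions(shares, available={"Milkyway@home"}))   # SETI outage
```

With equal shares each project gets half the time; during an outage the remaining project gets all of it, so the machine never sits idle.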
ID: 919153 · Report as offensive
Profile Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 919154 - Posted: 18 Jul 2009, 23:02:23 UTC
Last modified: 18 Jul 2009, 23:53:59 UTC

All 4 of my boxes have crunched the work they have on hand and a short time ago the 50 work units they were trying to upload went through. I only had to press "retry" a couple of times. So now my machines are on an extended holiday.

[edit]I hope Berkeley can get these problems worked out soon.
Boinc....Boinc....Boinc....Boinc....
ID: 919154 · Report as offensive
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 919206 - Posted: 19 Jul 2009, 1:57:42 UTC

Ran out of work on both machines yesterday. Gave the P4 some CPU rest and let the Mac crunch Milkyway full bore. Saw that Lunatics had the new version of the unified installer out, so I put that on the P4 last night.
Oh, I forgot, I installed the 6.6.36 version on the Mac. The P4 already had that put on a couple of outages ago.
There is always something to do when there is an outage. Next one I'll blow out the dust bunnies.

Old James
ID: 919206 · Report as offensive
Profile Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 919207 - Posted: 19 Jul 2009, 2:03:29 UTC - in response to Message 919154.  

All 4 of my boxes have crunched the work they have on hand and a short time ago the 50 work units they were trying to upload went through. I only had to press "retry" a couple of times. So now my machines are on an extended holiday.

[edit]I hope Berkeley can get these problems worked out soon.


Geek@Play

Glad to hear that. Have a wonderful vacation.

Regards

Please consider a Donation to the Seti Project.

ID: 919207 · Report as offensive
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 919289 - Posted: 19 Jul 2009, 11:07:33 UTC
Last modified: 19 Jul 2009, 11:09:56 UTC

And it starts again - upload server "bruno" is marked as down (Disabled) on the status page.

__W___
ID: 919289 · Report as offensive
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 919299 - Posted: 19 Jul 2009, 11:49:00 UTC
Last modified: 19 Jul 2009, 12:00:49 UTC

I have read Douglas Adams and so I "don't panic" - I just disabled the network connection, am having a real cup of coffee and will wait some hours 'til it comes back up again.
And because I am not a heavy cruncher running for credits (only running a notebook with a T7200 at 85% CPU time) I calmly lean back, look at the stats and wait to see what's going on.
Meanwhile my machine can crunch its 3-day cache. The time to upload will come back - sooner or later ;-)
__W__
ID: 919299 · Report as offensive
Zebra3
Joined: 22 Oct 01
Posts: 186
Credit: 13,658,148
RAC: 0
Canada
Message 919305 - Posted: 19 Jul 2009, 12:11:05 UTC - in response to Message 919299.  

I have read Douglas Adams and so I "don't panic" - I just disabled the network connection, am having a real cup of coffee and will wait some hours 'til it comes back up again.
And because I am not a heavy cruncher running for credits (only running a notebook with a T7200 at 85% CPU time) I calmly lean back, look at the stats and wait to see what's going on.
Meanwhile my machine can crunch its 3-day cache. The time to upload will come back - sooner or later ;-)
__W__



Exactly!! Why get in a tizzy about something that is out of the user's control... it will get fixed when it gets fixed... until then crunch on other projects if you run out of Seti work and come back when the WUs are there to be gotten.
http://www.novascotia.com
ID: 919305 · Report as offensive
Zebra3
Joined: 22 Oct 01
Posts: 186
Credit: 13,658,148
RAC: 0
Canada
Message 919324 - Posted: 19 Jul 2009, 13:04:50 UTC - in response to Message 919313.  
Last modified: 19 Jul 2009, 13:10:48 UTC

I thought Seti was a science project to find life other than ours and the credits were just a by-product of that project...
http://www.novascotia.com
ID: 919324 · Report as offensive
Profile # Bob Ahlers #

Joined: 30 Mar 01
Posts: 18
Credit: 10,209,954
RAC: 0
Netherlands
Message 919327 - Posted: 19 Jul 2009, 13:15:54 UTC - in response to Message 919324.  

I thought Seti was a science project to find life other than ours and the credits were just a by-product of that project...


Science project or not, it is just crazy that you ask people to help with this project and invest power and time, and then are not able to keep the servers online!

I guess that this project costs a lot of money, which is mostly subsidized and bartered.
Please take my advice and barter bandwidth and knowledge from a professional data-centre.
ID: 919327 · Report as offensive
Profile # Bob Ahlers #

Joined: 30 Mar 01
Posts: 18
Credit: 10,209,954
RAC: 0
Netherlands
Message 919330 - Posted: 19 Jul 2009, 13:24:27 UTC - in response to Message 919329.  

I thought Seti was a science project to find life other than ours and the credits were just a by-product of that project...


Science project or not, it is just crazy that you ask people to help with this project and invest power and time, and then are not able to keep the servers online!

I guess that this project costs a lot of money, which is mostly subsidized and bartered.
Please take my advice and barter bandwidth and knowledge from a professional data-centre.


They ask for spare CPU cycles, not dedicated super cruncher farms.

Don't complain if you provide more than asked for.



Don't ask for help if you cannot handle the help, or else notify users that there is a limit on the number of machines you can use for crunching.

Why the discussion? Just talk with the right people about the possibilities of hosting the WU servers at a location that is able to handle the demand!

Nothing more..
ID: 919330 · Report as offensive
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 919333 - Posted: 19 Jul 2009, 13:27:51 UTC - in response to Message 919327.  

I thought Seti was a science project to find life other than ours and the credits were just a by-product of that project...


Science project or not, it is just crazy that you ask people to help with this project and invest power and time, and then are not able to keep the servers online!


They ask for spare CPU cycles, but they never promised they'd have work available 100% of the time. This project has had server issues since day 1. It's the nature of the beast with hand-me-down or beta hardware and little money to finance everything else.

I guess that this project costs a lot of money, which is mostly subsidized and bartered.
Please take my advice and barter bandwidth and knowledge from a professional data-centre.


The situation is far more complex than simply needing to barter with companies. There are certain rules and guidelines of the University of California @ Berkeley that need to be followed. There's a lot of red tape and bureaucracy that needs to happen before they can proceed with their plans.

They are already working on solutions to these problems, but due to the aforementioned red tape, it can sometimes take months before they can continue. In the meantime, people simply need to be patient.

It may be like a video game for you, but it is their life's work for them. They take it very seriously and they are doing the best they can.
ID: 919333 · Report as offensive
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 919335 - Posted: 19 Jul 2009, 13:32:02 UTC - in response to Message 919330.  
Last modified: 19 Jul 2009, 13:33:24 UTC

Don't ask for help if you cannot handle the help, or else notify users that there is a limit on the number of machines you can use for crunching.


The problem is they have far more work than they have crunchers, but the infrastructure can only handle so much. How much precisely wasn't previously known.

They do not want to limit the number of machines you can use for crunching because this would be quite totalitarian in nature, and they prefer to leave it as user choice. Besides, where do you draw the line? Not all machines are built alike. I have a bunch of AMD K6's, Pentium III's and Pentium 4's that the servers can handle just fine. It's the faster machines pounding for so much work that the servers just can't handle.

All they ever asked for was spare cycles from your typical machine. People who take it upon themselves to build larger farms are not discouraged because it's ultimately their choice - but in that choice they also need to understand that the project never promised work 100% of the time.

Why the discussion? Just talk with the right people about the possibilities of hosting the WU servers at a location that is able to handle the demand!

Nothing more..


Again, it's not that simple. The entire server closet has to be moved or none of it can be moved. Matt already stated that it's an all-or-nothing type of thing. They also need access to the servers, and they need money to run those servers, or lease them from someone else as you suggest.
ID: 919335 · Report as offensive
Zebra3
Joined: 22 Oct 01
Posts: 186
Credit: 13,658,148
RAC: 0
Canada
Message 919336 - Posted: 19 Jul 2009, 13:34:47 UTC - in response to Message 919327.  

I thought Seti was a science project to find life other than ours and the credits were just a by-product of that project...


Science project or not, it is just crazy that you ask people to help with this project and invest power and time, and then are not able to keep the servers online!

I guess that this project costs a lot of money, which is mostly subsidized and bartered.
Please take my advice and barter bandwidth and knowledge from a professional data-center.


I see you have 46 processors running... that's a lot of kitties to feed, and right now I'm sure you are having problems filling their caches like most users are, but as Sten-Arne says, the project is not asking for supercomputers to crunch 24/7.

It all comes down to the funding that is available to the project through donations from volunteers, and in this current economy I am sure those donations are down, thus affecting purchases of equipment, upgrades, etc.

Another point is that Matt is on vacation so they are shorthanded this week.

Things will improve; just give it time to happen.
http://www.novascotia.com
ID: 919336 · Report as offensive
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 919341 - Posted: 19 Jul 2009, 13:43:44 UTC - in response to Message 919289.  

And it starts again - upload server "bruno" is marked as down (Disabled) on the status page.

And because no-one can upload, eventually no-one will be able to download, therefore the loading problem on the download servers and the bandwidth troubles will be solved. It's working already; check the Cricket graphs.

Then we will all have to wait till Matt gets back from holidays and finds the ethernet cable he accidentally kicked out while working in the back of the rack just before he left :-)

Fiendishly cunning these aliens
ID: 919341 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 919342 - Posted: 19 Jul 2009, 13:46:18 UTC
Last modified: 19 Jul 2009, 13:52:48 UTC


@ john deneer

Thanks for feeling with me.. ;-)

Uhh.. well done.. your Core i7 940 from 2.93 to 4.0 GHz!

BTW, are you running optimized apps?
I saw you have the old 182.50 nVIDIA driver.. with CUDA_V2.2, the opt. CUDA_V12_app and the 186.18_nVIDIA_driver it should run a little faster.. ;-)
Look here and take the new installer: http://lunatics.kwsn.net/index.php?module=Downloads;catd=9
The nVIDIA driver.. from their homepage..

I would leave one CPU core idle per GPU, because the BOINC client [boinc.exe] sometimes has big CPU peaks (depending on your WU cache size).
Because of this, everything running below 'normal' priority gets disturbed - CPU and GPU tasks. It could be that the CPU support for the GPUs stops, and then the GPU calculation takes a break.

That's why I don't crunch on the CPU of my GPU cruncher.
1 CPU core for boinc.exe and one CPU core for 'System' (whatever this is in Task Manager). And if two CUDA tasks finish at the same time, the new CUDA tasks can be prepared simultaneously on the CPU without disturbing the other two GPUs.
If 4 CUDA WUs finish simultaneously.. two GPUs idle until the other two CUDA WUs are prepared on the CPU. The CPU preparation on my GPU cruncher takes ~ 12 sec./CUDA WU.
AMD Phenom II X4 940 BE @ 4 x 3.0* GHz with 4 x OCed GTX260-216
[* for now stock speed, without OC]


@ -= Vyper =-

Yes.. 'remember the time'.. half a year ago.. everything was fine.. ;-)


@ all

Please be patient.. we all need to be..
AFAIK, the kind man (Matt) who normally manages the servers at SETI@home is on vacation.
Maybe only for one week? If so, then tomorrow he will be in the lab and will 'kick' every server he can see.. ;-)

BTW. Yes, my GPU cruncher will idle again in ~ 216 WUs or ~ 6 hours (if all are normal AR WUs),
because the UL server is offline and BOINC can't UL ~ 300 results, so there is no new work request.
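
The figures in this post can be sanity-checked with a little arithmetic. The sketch below is only an estimate built from the numbers quoted here (216 cached WUs, about 6 hours, 4 GPUs, about 12 seconds of CPU preparation per CUDA WU, two CPU cores kept free); the function names are made up for illustration, and the per-WU runtime it prints is inferred, not something the poster stated.

```python
import math


def seconds_per_wu(cached_wus, gpus, hours_until_idle):
    """Average GPU time per work unit implied by a cache draining in a given time."""
    return hours_until_idle * 3600 * gpus / cached_wus


def extra_gpu_wait(finished_at_once, free_cpu_cores, prep_seconds):
    """Extra idle time for the last GPUs served when several tasks finish together.

    Each free CPU core prepares one new task at a time, so tasks are
    prepared in batches; the final batch waits for all earlier ones.
    """
    batches = math.ceil(finished_at_once / free_cpu_cores)
    return (batches - 1) * prep_seconds


print(seconds_per_wu(216, 4, 6))   # average seconds of GPU time per WU
print(extra_gpu_wait(4, 2, 12))    # extra seconds two of the four GPUs sit idle
```

Under these assumptions each WU takes roughly 400 seconds on a GPU, and when all four tasks finish together the second pair of GPUs waits about 12 seconds longer than the first.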

ID: 919342 · Report as offensive
Profile # Bob Ahlers #

Joined: 30 Mar 01
Posts: 18
Credit: 10,209,954
RAC: 0
Netherlands
Message 919350 - Posted: 19 Jul 2009, 14:00:13 UTC - in response to Message 919333.  

They ask for spare CPU cycles, but they never promised they'd have work available 100% of the time. This project has had server issues since day 1. It's the nature of the beast with hand-me-down or beta hardware and little money to finance everything else.


Knowing that Seti has now been running for 10 years, and that within this period the stability of their servers and network has not improved, makes me ask a lot of questions.

It cannot be true that a project as well known as Seti isn't able to attract help from larger companies that can help with hardware and bandwidth. (Marketing)

The situation is far more complex than simply needing to barter with companies. There are certain rules and guidelines of the University of California @ Berkeley that need to be followed. There's a lot of red tape and bureaucracy that needs to happen before they can proceed with their plans.

They are already working on solutions to these problems, but due to the aforementioned red tape, it can sometimes take months before they can continue. In the meantime, people simply need to be patient.

It may be like a video game for you, but it is their life's work for them. They take it very seriously and they are doing the best they can.


I can understand that this project is really complex in terms of rules and bureaucracy. Just make sure that the foundation of this huge project is stable and able to handle the responses that it gets.

Anyway, I guess my view on this matter is an overreaction and I would like to rest my case, but it makes me think about stopping my help for Seti if these stability issues stay.
ID: 919350 · Report as offensive
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 919354 - Posted: 19 Jul 2009, 14:09:25 UTC - in response to Message 919350.  
Last modified: 19 Jul 2009, 14:20:35 UTC

They ask for spare CPU cycles, but they never promised they'd have work available 100% of the time. This project has had server issues since day 1. It's the nature of the beast with hand-me-down or beta hardware and little money to finance everything else.


Knowing that Seti has now been running for 10 years, and that within this period the stability of their servers and network has not improved, makes me ask a lot of questions.


Do you also take into account that they have hand-me-down servers, and beta hardware donated that doesn't always work right? Do you also take into account the added processing power over the last 10 years?

If you want stability to increase at the same rate, you need money for better servers, UPSs, more staff. SETI@Home is not a business - they do not have a steady source of income. They must make do with what they have. Their servers have not increased in power at the same rate as the average cruncher's machine. This was bound to happen without a constant influx of money. It's no wonder they have the stability issues they have.

It cannot be true that a project as well known as Seti isn't able to attract help from larger companies that can help with hardware and bandwidth. (Marketing)


Why can't it be true? Because looking for aliens is still as popular now as it was back when it started? Because people are more interested in the economy and global warming than they are in finding "little green men"?

But you're right. Intel has donated several servers to help out. They're all beta hardware of course, and the hardware is flaky because of it. They were able to negotiate a 1Gbit connection to the internet with a company for cheaper than their previous 100Mbit connection, but due to several University policies, they don't have 1Gbit access directly to the lab.

Most companies that donate want something in return. For donating beta hardware, SETI gets to be a real life guinea pig to find the performance/stress point of said hardware before it gets fixed in later revisions and released to other companies for profit.

The situation is far more complex than simply needing to barter with companies. There are certain rules and guidelines of the University of California @ Berkeley that need to be followed. There's a lot of red tape and bureaucracy that needs to happen before they can proceed with their plans.

They are already working on solutions to these problems, but due to the aforementioned red tape, it can sometimes take months before they can continue. In the meantime, people simply need to be patient.

It may be like a video game for you, but it is their life's work for them. They take it very seriously and they are doing the best they can.


I can understand that this project is really complex in terms of rules and bureaucracy. Just make sure that the foundation of this huge project is stable and able to handle the responses that it gets.

Anyway, I guess my view on this matter is an overreaction and I would like to rest my case, but it makes me think about stopping my help for Seti if these stability issues stay.


I'm sure Eric would love to alleviate the server issues and make all the crunchers happy. I'm sure Eric would also love to be a millionaire.

Realistically, all the project can do is what they are already doing and hope that the volunteers will understand. Some will not, and will leave out of frustration, and the damage to their reputation will be almost irreparable. I'm sure this will also frustrate Admins who are doing the best they can, but ultimately, they can't do any more than they already are and have to simply accept it for what it is: user choice.
ID: 919354 · Report as offensive
Richard Haselgrove (Project Donor)
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 919362 - Posted: 19 Jul 2009, 14:20:53 UTC

Note that in the last hour - at about 06:30 on a Sunday morning, Pacific time - somebody has disabled the Astropulse splitters, put an extra 200 GB of data online (4 'tapes'), and restarted the splitters.

Any more volunteers for the Sunday morning staff shift?
ID: 919362 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 919365 - Posted: 19 Jul 2009, 14:25:40 UTC - in response to Message 919362.  
Last modified: 19 Jul 2009, 14:32:17 UTC

Note that in the last hour - at about 06:30 on a Sunday morning, Pacific time - somebody has disabled the Astropulse splitters, put an extra 200 GB of data online (4 'tapes'), and restarted the splitters.

Any more volunteers for the Sunday morning staff shift?


I guess someone was on the way home from a party or disco and stopped by the lab for a look.. ;-)

Or they have rented ASIMO from HONDA to manage the servers? ;-D



ID: 919365 · Report as offensive
Profile # Bob Ahlers #

Joined: 30 Mar 01
Posts: 18
Credit: 10,209,954
RAC: 0
Netherlands
Message 919369 - Posted: 19 Jul 2009, 14:37:19 UTC - in response to Message 919354.  
Last modified: 19 Jul 2009, 15:25:00 UTC

I would like to turn this discussion around towards a solution.

I will transfer some money for this project and would like to ask the others to do this as well!

I know that a lot of users just want to donate their CPUs and nothing extra.
Just keep in mind that if all active users transferred only 1 dollar or euro to Seti, the server issues, including bandwidth, would be gone!

For 150,000+ dollars all 53 servers can be replaced by kickass servers, and that still leaves money for data and hosting.

Is there a way Seti can help with the marketing on the site for a special "upgrade" donation?

I will transfer 10 euros.
Who's next :-)

Uhhhh, the drinks must be pretty expensive over there:

Donations to SETI@home Since 19 Jul 2008:
Number of Donations: 1403
Total Donations: $ 117786.24
Smallest Donation: $ 10.00
Largest Donation: $ 20000.00
Average Donation: $ 83.95
Median Donation: $ 30.00
Donation Rate (per day): $ 322.48
% of Volunteers that have Donated: 0.00%
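
For anyone who wants to check these figures, the short sketch below recomputes the average donation and the daily rate from the posted total and count, and compares the yearly total with the roughly 150,000 dollars suggested above for 53 servers. The variable names are made up for illustration, and the 365.25-day year is an assumption: it happens to reproduce the posted daily rate, but how the site actually computes that rate is not stated in this thread.

```python
# Figures copied from the donation summary in this post.
total_donations = 117786.24   # dollars since 19 Jul 2008
number_of_donations = 1403

# Figures from the proposal earlier in the same post.
proposed_budget = 150000.0    # dollars
servers = 53

# Assumption: one year taken as 365.25 days (not stated by the site).
days_in_period = 365.25

average_donation = total_donations / number_of_donations
rate_per_day = total_donations / days_in_period
budget_per_server = proposed_budget / servers
shortfall = proposed_budget - total_donations

print(f"Average donation:  ${average_donation:.2f}")
print(f"Donation rate/day: ${rate_per_day:.2f}")
print(f"Budget per server: ${budget_per_server:.2f}")
print(f"Gap to 150,000:    ${shortfall:.2f}")
```

Both the average and the daily rate come out matching the posted summary, and one year of donations falls a little over 32,000 dollars short of the proposed budget.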
ID: 919369 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.