Panic Mode On (20) Server problems


[AF>HFR>RR] Les Rochelais setiseurs
Volunteer tester
Joined: 11 Oct 00
Posts: 6
Credit: 2,407,052
RAC: 0
France
Message 914643 - Posted: 6 Jul 2009, 8:12:43 UTC

My team has been unable to send any work for 3 days.
What's wrong?
We can download work, but not send the results.

Here are the messages:
06/07/2009 10:06:37 SETI@home Temporarily failed upload of 06dc08ad.24531.8252.7.8.97_1_0: connect() failed
06/07/2009 10:06:37 SETI@home Backing off 1 hr 25 min 2 sec on upload of 06dc08ad.24531.8252.7.8.97_1_0
06/07/2009 10:06:38 Internet access OK - project servers may be temporarily down.
06/07/2009 10:07:19 Project communication failed: attempting access to reference site
06/07/2009 10:07:19 SETI@home Temporarily failed upload of 06dc08ad.24531.8252.7.8.89_0_0: connect() failed
06/07/2009 10:07:19 SETI@home Backing off 39 min 27 sec on upload of 06dc08ad.24531.8252.7.8.89_0_0
06/07/2009 10:07:21 Internet access OK - project servers may be temporarily down.

In the Server status list, all seems ok !!!


Thank you
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5685
Credit: 56,118,635
RAC: 49,773
Australia
Message 914658 - Posted: 6 Jul 2009, 9:01:57 UTC - in response to Message 914643.

My team has been unable to send any work for 3 days.
What's wrong?

There has been heavy network traffic for the last few days, and it looks like it will continue for a while yet. Some of your results must have got through, otherwise you wouldn't be able to download more work.

____________
Grant
Darwin NT.

MuylaerB
Joined: 27 May 99
Posts: 3
Credit: 1,225,826
RAC: 1,108
Belgium
Message 914820 - Posted: 6 Jul 2009, 18:02:45 UTC - in response to Message 914658.

It's been going on for a number of weeks now. Often I cannot upload the computed results for several consecutive days. If it is a network overload, why isn't anything done about it?
____________

Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 914832 - Posted: 6 Jul 2009, 18:10:41 UTC - in response to Message 914820.

It's been going on for a number of weeks now. Often I cannot upload the computed results for several consecutive days. If it is a network overload, why isn't anything done about it?

Because that would cost lots of $$$ that the project just doesn't have (last estimate I saw was in the region of $100,000).

F.
____________

[AF>HFR>RR] Les Rochelais setiseurs
Volunteer tester
Joined: 11 Oct 00
Posts: 6
Credit: 2,407,052
RAC: 0
France
Message 916186 - Posted: 9 Jul 2009, 15:57:43 UTC

Really I don't understand why SETI servers still have problems !!!

what are they doing there ???

We have had plenty of crunched results waiting to upload since last Friday !!!

And it IS possible to download new ones.

The problem is with the upload, not the download.

Please fix this fast! A lot of friends are leaving SETI because of it.
____________

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916189 - Posted: 9 Jul 2009, 16:15:41 UTC - in response to Message 916186.

Really I don't understand why SETI servers still have problems !!!

what are they doing there ???

Because there are 180,000 crunchers trying to reach one upload server.

One upload server, run by a tiny staff, on a 100 megabit connection.
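To put those two numbers side by side, here is a rough back-of-the-envelope sketch. The 100 megabit link and the 180,000 hosts are the figures quoted above; the result file size is only an assumed round number for illustration, and the calculation pretends every host uploads at once over the whole link, which is a deliberately pessimistic simplification.

```python
# Figures quoted in the post above.
LINK_MBIT_PER_S = 100        # project's connection
ACTIVE_HOSTS = 180_000       # crunchers trying to reach the server

# Assumption for illustration only: size of one result file in bytes.
RESULT_SIZE_BYTES = 25_000


def fair_share_bits_per_s(link_mbit: float, hosts: int) -> float:
    """Bandwidth each host would get if all of them uploaded at the same time."""
    return link_mbit * 1_000_000 / hosts


def seconds_per_upload(size_bytes: int, bits_per_s: float) -> float:
    """Time to push one file at the given rate, ignoring protocol overhead."""
    return size_bytes * 8 / bits_per_s


share = fair_share_bits_per_s(LINK_MBIT_PER_S, ACTIVE_HOSTS)
print(f"fair share per host: {share:.0f} bit/s")
print(f"one {RESULT_SIZE_BYTES}-byte upload at that rate: "
      f"{seconds_per_upload(RESULT_SIZE_BYTES, share):.0f} s")
```

In practice hosts do not all connect at once and the same link also carries downloads, so this is only meant to show the order of magnitude, not to model the real traffic.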

They need more help, better hardware, and a faster connection.

The fix is money: they don't have money.

If you can be patient, BOINC will take care of this. If you can't, maybe you can help with the funding issue.
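The "Backing off ..." lines in the logs earlier in this thread are that patience in action: after a failed transfer the client waits a while before trying again. The sketch below shows the general idea of a randomized exponential backoff. It is not BOINC's actual code; the function name `next_backoff` and the minimum and maximum delays are assumptions chosen for illustration.

```python
import random
from typing import Optional


def next_backoff(failures: int,
                 min_s: float = 60.0,
                 max_s: float = 4 * 3600.0,
                 rng: Optional[random.Random] = None) -> float:
    """Delay in seconds before the next retry after `failures` consecutive failures.

    The ceiling doubles with every failure up to `max_s`; the actual delay is
    drawn at random from the upper half of that range so that many clients
    that failed together do not all come back at the same moment.
    """
    rng = rng or random.Random()
    # Cap the exponent so a very long failure streak cannot overflow a float.
    ceiling = min(max_s, min_s * 2 ** min(failures, 32))
    return rng.uniform(ceiling / 2, ceiling)


if __name__ == "__main__":
    rng = random.Random(2009)  # fixed seed so repeated runs print the same delays
    for failures in range(1, 9):
        delay = next_backoff(failures, rng=rng)
        print(f"after {failures} failure(s): retry in {delay / 60:.1f} min")
```

The random spread is the important design choice here: with a single overloaded server, thousands of clients retrying on the same fixed schedule would just recreate the overload at every retry.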

____________

Cameron S Moore
Joined: 5 Jun 09
Posts: 2
Credit: 128,495
RAC: 0
United States
Message 916213 - Posted: 9 Jul 2009, 17:40:07 UTC

Well it's been redoing the upload process for 18mins now and hasn't uploaded this one thing -.- It says project servers may be down. WILL it upload or no? D:

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13541
Credit: 29,291,681
RAC: 15,062
United States
Message 916224 - Posted: 9 Jul 2009, 18:23:10 UTC

Yes, it will upload eventually. SETI's internet bandwidth is just a bit overloaded at the moment.
____________

[AF>HFR>RR] Les Rochelais setiseurs
Volunteer tester
Joined: 11 Oct 00
Posts: 6
Credit: 2,407,052
RAC: 0
France
Message 916449 - Posted: 10 Jul 2009, 7:08:13 UTC

Hi everybody,

How much does it cost?

What kind of hardware do they need?

Is it possible to host another server in Europe? In France?

If someone could ask and tell me, I think we could do something to help...
____________

tullio
Joined: 9 Apr 04
Posts: 3566
Credit: 361,443
RAC: 202
Italy
Message 916463 - Posted: 10 Jul 2009, 7:35:47 UTC - in response to Message 916449.

Hi everybody,

How much does it cost?

What kind of hardware do they need?

Is it possible to host another server in Europe? In France?

If someone could ask and tell me, I think we could do something to help...

I suggest you go to the Technical News Forum where all this is hotly debated.
Tullio
____________

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916543 - Posted: 10 Jul 2009, 16:36:51 UTC - in response to Message 916449.

Hi everybody,

How much does it cost?

What kind of hardware do they need?

Is it possible to host another server in Europe? In France?

If someone could ask and tell me, I think we could do something to help...

As Tullio has said, this has been discussed at length.

SETI@Home has a 100 megabit connection to the 'net. They need to get the University to upgrade their connection, or perhaps move the servers to a building that already has fast bandwidth.

The servers are mostly hand-me-downs.

It's probably possible to put servers "out there" in the world, but it's unlikely because it is a management headache -- all the data must get out of Berkeley initially, and back to Berkeley, and adding off-site servers makes that a two-step process.
____________

tullio
Joined: 9 Apr 04
Posts: 3566
Credit: 361,443
RAC: 202
Italy
Message 916553 - Posted: 10 Jul 2009, 17:00:48 UTC

I am getting MB units, crunching them on my CPU, uploading the results and getting new units. Cannot get Astropulse. I have the optimized 5.03 version and am waiting for the optimized 5.05 installer for my Linux box.
Tullio
____________

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916566 - Posted: 10 Jul 2009, 17:47:59 UTC - in response to Message 916553.

I am getting MB units, crunching them on my CPU, uploading the results and getting new units. Cannot get Astropulse. I have the optimized 5.03 version and am waiting for the optimized 5.05 installer for my Linux box.
Tullio

If you have an app_info.xml file (you have optimized apps) and you do not have an entry for "astropulse_v505" then you will not get any 5.05 work units: you have to tell BOINC you have an application for it.
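A quick way to check this is to list the application names your app_info.xml declares. The sketch below assumes the usual layout of that file (an `<app_info>` root containing `<app><name>...</name></app>` entries); the sample content and the file name `mb_optimized.exe` are made up for illustration and are not taken from any real installer.

```python
import xml.etree.ElementTree as ET
from typing import Set

# Hypothetical app_info.xml with only a multibeam entry (names are examples).
SAMPLE_APP_INFO = """
<app_info>
  <app><name>setiathome_enhanced</name></app>
  <file_info><name>mb_optimized.exe</name><executable/></file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <file_ref><file_name>mb_optimized.exe</file_name><main_program/></file_ref>
  </app_version>
</app_info>
"""


def declared_apps(app_info_xml: str) -> Set[str]:
    """Return the application names declared in <app><name> elements."""
    root = ET.fromstring(app_info_xml.strip())
    return {el.text.strip() for el in root.findall("app/name") if el.text}


def has_app(app_info_xml: str, app_name: str) -> bool:
    """True if the file declares an application with this exact name."""
    return app_name in declared_apps(app_info_xml)


if __name__ == "__main__":
    print("declared:", sorted(declared_apps(SAMPLE_APP_INFO)))
    if not has_app(SAMPLE_APP_INFO, "astropulse_v505"):
        print("no astropulse_v505 entry, so no 5.05 work would be requested")
```

To run it against a real file, read the file's text and pass it to `has_app` instead of the sample string.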
____________

tullio
Joined: 9 Apr 04
Posts: 3566
Credit: 361,443
RAC: 202
Italy
Message 916567 - Posted: 10 Jul 2009, 17:57:33 UTC - in response to Message 916566.


If you have an app_info.xml file (you have optimized apps) and you do not have an entry for "astropulse_v505" then you will not get any 5.05 work units: you have to tell BOINC you have an application for it.

Yes, I know. But the combined installation package for both MB and astropulse should be available soon.

____________

samuel7
Volunteer tester
Joined: 2 Jan 00
Posts: 47
Credit: 2,194,240
RAC: 0
Finland
Message 917375 - Posted: 13 Jul 2009, 16:18:48 UTC

I have one completed task which won't upload. First attempt:

13.7.2009 17:38:50 SETI@home [error] Error reported by file upload server: EOF on socket read : asked for 7488, got 5309

All subsequent attempts:
13.7.2009 18:04:23 SETI@home [error] Error reported by file upload server: EOF on socket read : asked for 2179, got 0

<file_xfer_debug> output:
13.7.2009 18:45:21 SETI@home Started upload of 17oc08ac.4855.21749.16.8.163_1_0
13.7.2009 18:45:21 SETI@home [file_xfer_debug] URL: http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler
13.7.2009 18:45:22 SETI@home [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
13.7.2009 18:45:22 SETI@home [file_xfer_debug] parsing upload response: <data_server_reply> <status>0</status> <file_size>21691</file_size></data_server_reply>
13.7.2009 18:45:22 SETI@home [file_xfer_debug] parsing status: 0
13.7.2009 18:45:24 SETI@home [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
13.7.2009 18:45:24 SETI@home [error] Error reported by file upload server: EOF on socket read : asked for 2179, got 0
13.7.2009 18:45:24 SETI@home [file_xfer_debug] parsing upload response: <data_server_reply> <status>1</status> <message>EOF on socket read : asked for 2179, got 0</message></data_server_reply>
13.7.2009 18:45:24 SETI@home [file_xfer_debug] parsing status: -127
13.7.2009 18:45:24 SETI@home [file_xfer_debug] file transfer status -127
13.7.2009 18:45:24 SETI@home Temporarily failed upload of 17oc08ac.4855.21749.16.8.163_1_0: transient upload error
13.7.2009 18:45:24 SETI@home Backing off 8 min 12 sec on upload of 17oc08ac.4855.21749.16.8.163_1_0


I have restarted this BOINC 6.6.36 (32bit) on Vista64 to no avail.

The task was apparently completed during a short period when BOINC was not started as administrator. I did two CUDA/CPU reschedules with Marius' tool (v1.9) and the log wasn't saved to stdoutdae.txt between the two. Could it be that BOINC can't create or save files properly when not running as administrator?

The task was completed successfully by the way (exit status 0).

All other uploads have gone through fine.

Any ideas?
____________

Ageless
Joined: 9 Jun 99
Posts: 12258
Credit: 2,544,727
RAC: 264
Netherlands
Message 917379 - Posted: 13 Jul 2009, 16:22:39 UTC - in response to Message 917375.

13.7.2009 18:45:24 SETI@home [file_xfer_debug] file transfer status -127

That's a transient upload error. It more or less means that a file_upload_handler (FUH) handling a file has put a lock on the directory on the server, so no other FUHs can interfere and write to that same space. This can only be solved by the project.
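For anyone who wants to pick these replies out of a debug log, here is a minimal sketch that parses the two `<data_server_reply>` strings from the log quoted earlier in the thread. The function names are made up for this example; only the reply text itself comes from the log.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# The two replies exactly as they appear in the file_xfer_debug output.
REPLY_OK = ("<data_server_reply> <status>0</status> "
            "<file_size>21691</file_size></data_server_reply>")
REPLY_ERR = ("<data_server_reply> <status>1</status> "
             "<message>EOF on socket read : asked for 2179, got 0</message>"
             "</data_server_reply>")

SHORT_READ = re.compile(r"asked for (\d+), got (\d+)")


def parse_reply(reply: str) -> Tuple[int, Optional[str]]:
    """Return (status, message) from a data_server_reply string."""
    root = ET.fromstring(reply.strip())
    status = int(root.findtext("status", default="-1"))
    return status, root.findtext("message")


def short_read(message: Optional[str]) -> Optional[Tuple[int, int]]:
    """Extract (bytes asked for, bytes received) from an EOF message, if present."""
    match = SHORT_READ.search(message or "")
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    for reply in (REPLY_OK, REPLY_ERR):
        status, message = parse_reply(reply)
        print(status, message, short_read(message))
```

One detail worth noticing in the original log: the first attempt asked for 7488 bytes and got 5309, and every later attempt asks for 2179, which is exactly 7488 - 5309. That would be consistent with the server holding on to the partial file while the client tries to send only the remainder, though that reading is a guess from the numbers, not something the log states.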
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

samuel7
Volunteer tester
Joined: 2 Jan 00
Posts: 47
Credit: 2,194,240
RAC: 0
Finland
Message 917382 - Posted: 13 Jul 2009, 16:33:38 UTC - in response to Message 917379.

Thanks for a quick reply!

Well it's no biggie to me, just one task. As long as it won't cause problems on the project level it can retry for the full two weeks for all I care.
____________

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 3590
Credit: 47,338,019
RAC: 427
United States
Message 917501 - Posted: 14 Jul 2009, 2:01:07 UTC

New version....
____________

Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,790,458
RAC: 693
Malaysia
Message 917508 - Posted: 14 Jul 2009, 2:58:23 UTC

Richard Haselgrove wrote:
Only when the number of AP results 'in the field' has risen back up to the pre-recorder-failure figure of around 440,000 (vastly more than the current 137,853) will we start to see steady-state behaviour.
The steady-state behaviour is rather far away, and the last climb lost all steam at 220k.

Does that actually mean that we won't see normality (e.g. a high but steady usage on the Cricket that doesn't bork uploads whenever AP is available) for at least, say, 1-2 months?
____________

Cosmic_Ocean
Avatar
Send message
Joined: 23 Dec 00
Posts: 2234
Credit: 8,427,327
RAC: 4,129
United States
Message 917516 - Posted: 14 Jul 2009, 4:44:33 UTC

04:45 UTC: upload server is disabled. Obviously, uploads fail.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact


Copyright © 2014 University of California