CLOSED CLOSED CLOSED

tekwyzrd
Volunteer tester
Joined: 21 Nov 01
Posts: 767
Credit: 30,009
RAC: 0
United States
Message 205497 - Posted: 7 Dec 2005, 8:19:10 UTC - in response to Message 205398.  
Last modified: 7 Dec 2005, 8:20:48 UTC

I have suggested to them to decrease the keep_alive time and the close wait interval down to two minutes each. I also suggested that they increase the listen queue to 2048 or 4096 instead of the default 128. Even told them how..

I am still waiting on their answer.
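
For readers unfamiliar with the settings named in the quote above, here is a minimal sketch of what a larger listen queue and TCP keep-alive look like from an application's side. It is not the SETI@home server configuration: the function name, the loopback address and the 120-second idle value are assumptions made for illustration, and the system-wide keep-alive and close-wait intervals mentioned in the quote are operating-system tunables that are not shown here.

```python
import socket

def make_listener(host="127.0.0.1", port=0, backlog=2048, keepalive_idle_s=120):
    """Open a TCP listening socket with a large accept queue and keep-alive on.

    All parameter values are illustrative assumptions, not project settings.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Ask the kernel to probe idle connections so dead peers are noticed.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket idle time before the first probe; this option is not
    # available on every platform, so it is applied only where it exists.
    if hasattr(socket, "TCP_KEEPIDLE"):
        srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, keepalive_idle_s)
    srv.bind((host, port))
    # The backlog is only a request: the kernel may silently cap it
    # (on Linux, for example, at net.core.somaxconn).
    srv.listen(backlog)
    return srv

srv = make_listener()
print("listening on port", srv.getsockname()[1])
srv.close()
```

A backlog of 2048 or 4096 only helps if the operating system's own limit is raised as well, which is presumably the "how" that was passed along.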


Been there.
I did my best to explain why problems like the current communications problems and the recurring buildup of orphaned units will keep recurring.
I was shouted down.
I pointed out a change that occurred in the BOINC source code, when it occurred, and the fix necessary to return the linux benchmarks to their previous levels.
It was ignored.

There are certain individuals who are heeded and others who are not.
Welcome to the nots.
Nothing travels faster than the speed of light with the possible exception of bad news, which obeys its own special laws.
Douglas Adams (1952 - 2001)
ID: 205497
Rudolfensis
Joined: 20 Nov 99
Posts: 60
Credit: 427,273
RAC: 0
Papua New Guinea
Message 205546 - Posted: 7 Dec 2005, 10:41:30 UTC

I'm not too sure why this thread was started in the first place, since we already knew at that time that there were huge technical problems on the SETI side, and thus no problems with anyone's settings, either hardware-wise or software-wise.

So, if there's any message board god out there, it would surely be nice if the thread could be locked, since further postings here are futile. And no, we don't need to be told that there's a Cogent-related slowdown; we also already know that there is a feud between Level 3 and Cogent, that neither will carry the bandwidth load of the other, and that dollars are the main point of dissension between the two parties.

This means that we don't need any more copy-and-paste of traces, nor do we need error codes or hardware/software settings copied. The problem is still ongoing at SETI and, when it is solved, we'll be able to upload results again.

http://setiathome.berkeley.edu/tech_news.php
(I know, link already pasted quite a few times here).
ID: 205546
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 205548 - Posted: 7 Dec 2005, 10:46:52 UTC - in response to Message 205546.  

This thread was originally an offshoot of the "port/configure firewall" thread I started to try to help several users at the "help desk" who seemed to be experiencing the same issue. It seems that VERY FEW users are/were affected by this. I didn't want to lose them. It turned out that by lowering the MTU size a little, these users could attach. Now the question is why a user would have to do such a thing. This thread was started to find that answer. It's unfortunate that its timing coincided with the outage, when everyone is getting these errors, while only a few (fewer than 12 users) are really affected by the original issue.
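
As background on the MTU point, here is a small worked calculation. It assumes, without anything in this thread confirming it, that the affected connections cross a PPPoE link, which is one common reason a slightly lower MTU helps DSL users behind a router. The header sizes are the standard minimums for IPv4 and TCP without options.

```python
IP_HEADER = 20       # bytes, IPv4 header without options
TCP_HEADER = 20      # bytes, TCP header without options
PPPOE_OVERHEAD = 8   # bytes: 6-byte PPPoE header + 2-byte PPP protocol ID

def tcp_mss(mtu):
    """Largest TCP payload that fits in one unfragmented packet of this MTU."""
    return mtu - IP_HEADER - TCP_HEADER

ethernet_mtu = 1500
pppoe_mtu = ethernet_mtu - PPPOE_OVERHEAD   # what a PPPoE link can carry

print("Ethernet MTU", ethernet_mtu, "-> MSS", tcp_mss(ethernet_mtu))  # 1460
print("PPPoE MTU   ", pppoe_mtu, "-> MSS", tcp_mss(pppoe_mtu))        # 1452
```

If a PC believes it can send 1500-byte packets but a link on the path carries only 1492, full-size packets must be fragmented or dropped, while small packets pass. That would fit transfers that connect and then stall, but it remains a hypothesis here.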

tony
ID: 205548
Tigher
Volunteer tester
Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 205550 - Posted: 7 Dec 2005, 10:56:35 UTC - in response to Message 205546.  
Last modified: 7 Dec 2005, 10:57:26 UTC

Yes, you are right: you really do not know why it was started.

Well, let me tell you. A number of users, albeit small, are suffering from failed connections. Often they get a server error 500 back from SETI. This has been ongoing for many months (probably 12-24 months) and, although some software updates in September 05 have helped, there are still people who cannot connect. They appear to be people who can connect on a straight PC-to-modem connection but not if a router is involved. Some cases are associated with poor-quality routes. Many factors are involved, and we collect info where and when we can. A few of us have been working with these people to try to pin down the problems. Personally, I have 3 users I am helping. UCB staff have engaged fully, the devs have helped by creating diagnostic scripts, and the sys admins have recently made config changes to help network connections from these people. The latest was last night!

We will crack it for sure, but it takes time and patience. Personally, I would like to thank Ned Ludd, mmciastro and Celtic Wolf for their efforts in helping chase down these problems. Also the project director Dr Anderson, and Matt and Jeff @ UCB, for their work, and Bruce Allen for the scripting he has done this week. There are others of course, and thanks to them too. We will prevail here.

In fact, this server issue of the last few days is unrelated, so that message god you summoned should just stay away while we continue the hunt and bring a solution.

I do hope you feel better informed now. Please do not discourage our information collection, as you prejudice other people's opportunities to have their problems sorted out.

Thank you






ID: 205550
John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 205565 - Posted: 7 Dec 2005, 11:31:06 UTC - in response to Message 205550.  

From my perspective I have found this thread has been interesting for the errors - I/O, 500 and -106 - for failed uploads. Like the rest of us, I can see them on the BOINC Manager Messages tab.

The other interesting thing for me is that about 40% to 50% of my finished results seem to be getting through. My backlog of results "waiting to report" is about 50% to 60% of what I would expect to see. Also, my new WU download is only slightly lower than the numbers I expect when my normal window box/small garden requirements are met, on a daily basis.

So, despite the major problems Matt et al. at Berkeley are experiencing, and which we are seemingly getting frustrated over, things are leaking through both ways. However, it looks like being a close thing on new WU downloads lasting long enough.

I am coninuing to "go with the flow" ;-)) After all "it is only problem".


It's good to be back amongst friends and colleagues



ID: 205565
Lee Carre
Volunteer tester
Joined: 21 Apr 00
Posts: 1459
Credit: 58,485
RAC: 0
Channel Islands
Message 205570 - Posted: 7 Dec 2005, 11:41:21 UTC - in response to Message 205546.  
Last modified: 7 Dec 2005, 12:14:58 UTC

Yvan, considering you're new to BOINC, and given your comments here and in other threads, I would strongly advise that for now you observe what goes on and how things work. Just because the reasons for something aren't apparent doesn't mean it's wrong.

Ask questions rather than make accusations/assumptions.

Read the Wiki. There you will find the answers to most of your questions, and gain a greater understanding of the system as a whole and of the reasons for things being the way they are.
ID: 205570
John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 205576 - Posted: 7 Dec 2005, 11:49:35 UTC - in response to Message 205565.  


I am coninuing to "go with the flow" ;-)) After all "it is only problem".


Bugger my proof reading, before sending, is abysmal :-((

Should have finished reading "After all it is only a problem"
It's good to be back amongst friends and colleagues



ID: 205576
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 205578 - Posted: 7 Dec 2005, 11:55:40 UTC

John, you can edit your posts as long as you do so within 60 minutes of posting.

look for "edit this post" link

tony
ID: 205578
John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 205600 - Posted: 7 Dec 2005, 12:18:07 UTC - in response to Message 205578.  

Thanks, this will be useful. I've noticed several misspellings even after (what I thought was) proofing, which means my proofreading is shocking.
It's good to be back amongst friends and colleagues



ID: 205600
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 205606 - Posted: 7 Dec 2005, 12:25:28 UTC

Try working the help desk: there are so many to help that I type as fast as I can, then look back at what I sent. Oy, I use that link a lot.

tony
ID: 205606
Francesco Forti
Joined: 24 May 00
Posts: 334
Credit: 204,421,005
RAC: 15
Switzerland
Message 205608 - Posted: 7 Dec 2005, 12:30:10 UTC
Last modified: 7 Dec 2005, 12:32:33 UTC

Hi,
I've tried to attach a new host (with my account) and:
1) while installing BOINC I saw a message about "SETI project suspended" or something like that.
2) in one hour I downloaded 3 RU's, and one is now running.
3) on all my other PCs I have more problems with upload than with download, even though the downloads are bigger in size than the uploads.

So I think it's not a communication problem (the bottleneck is not in the fiber, copper, or communication devices) but in the software/DB.
I see that the upload and download server (kryten) is the same machine.
The problem could be in the lookup of the result in the internal DB or in the folder structure.
It could be an architectural problem, I think.
I suggest that:
a) fewer results in the bottleneck would be better.
b) a short timeout when looking up internal data would help, too.

bye,
Francesco

PS: sorry for my poor English.
It's not my mother tongue.
And not my father's either :-)
ID: 205608
SidWing
Joined: 4 Apr 00
Posts: 1
Credit: 5,428,130
RAC: 2
United States
Message 205617 - Posted: 7 Dec 2005, 12:42:36 UTC

1. CABLE
2. MOTOROLA
3. NO
4. Linksys (No Firewall, BOINC Box is in the DMZ)
5. YES, Provided below

tracert setiboincdata.ssl.berkeley.edu

Tracing route to setiboincdata.ssl.berkeley.edu [66.28.250.125]
over a maximum of 30 hops:

1 <1 ms <1 ms <1 ms 216.180.84.1
2 3 ms 2 ms 2 ms hsvrouter.hiwaay.net [216.180.47.234]
3 7 ms 7 ms 7 ms 12.124.64.9
4 24 ms 24 ms 23 ms tbr1-p013701.attga.ip.att.net [12.123.21.98]
5 22 ms 22 ms 22 ms 12.122.10.69
6 22 ms 21 ms 21 ms 12.122.82.225
7 27 ms 27 ms 27 ms att-gw.dc.cogent.net [192.205.32.86]
8 28 ms 29 ms 28 ms p11-0.core01.dca01.atlas.cogentco.com [154.54.2.201]
9 31 ms 32 ms 31 ms p4-0.core01.phl01.atlas.cogentco.com [66.28.4.18]
10 32 ms 31 ms 32 ms p5-0.core02.jfk02.atlas.cogentco.com [66.28.4.1]
11 55 ms 54 ms 54 ms p12-0.core01.mci01.atlas.cogentco.com [154.54.3.202]
12 91 ms 91 ms 91 ms p10-0.core02.sfo01.atlas.cogentco.com [66.28.4.209]
13 92 ms 91 ms 91 ms p15-0.core01.sfo01.atlas.cogentco.com [66.28.4.69]
14 93 ms 93 ms 93 ms UC-Berkeley.demarc.cogentco.com [66.250.4.74]
15 98 ms 98 ms 94 ms 66.28.250.125

Trace complete.
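
One way to read a trace like the one above is to look at how much round-trip time each hop adds. The sketch below is a hypothetical helper, not part of any BOINC tool: it parses Windows tracert output and reports the largest hop-to-hop increase, using hops 9 to 13 from the trace above as sample input. A jump on its own does not indicate a fault; it often just reflects distance.

```python
import re

SAMPLE = """\
 9 31 ms 32 ms 31 ms p4-0.core01.phl01.atlas.cogentco.com [66.28.4.18]
10 32 ms 31 ms 32 ms p5-0.core02.jfk02.atlas.cogentco.com [66.28.4.1]
11 55 ms 54 ms 54 ms p12-0.core01.mci01.atlas.cogentco.com [154.54.3.202]
12 91 ms 91 ms 91 ms p10-0.core02.sfo01.atlas.cogentco.com [66.28.4.209]
13 92 ms 91 ms 91 ms p15-0.core01.sfo01.atlas.cogentco.com [66.28.4.69]
"""

def parse_tracert(text):
    """Return (hop number, best RTT in ms, host) for each complete hop line."""
    hops = []
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.*)", line)
        if not m:
            continue
        rest = m.group(2)
        # "<1 ms" is counted as 1 ms.
        rtts = [int(v) for v in re.findall(r"<?(\d+) ms", rest)]
        if len(rtts) != 3:
            continue  # skip hops with timeouts ("*") and non-hop lines
        host = rest.rsplit(" ms", 1)[1].strip()
        hops.append((int(m.group(1)), min(rtts), host))
    return hops

def largest_jump(hops):
    """Return (added ms, hop number, host) for the biggest RTT increase."""
    return max((b[1] - a[1], b[0], b[2]) for a, b in zip(hops, hops[1:]))

added_ms, hop, host = largest_jump(parse_tracert(SAMPLE))
print(f"largest increase: +{added_ms} ms at hop {hop} ({host})")
```

On this sample the largest increase is 37 ms at hop 12, and latency stays flat after that, with no timeouts anywhere on the path.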

ID: 205617
LennStar
Joined: 3 Jul 04
Posts: 4
Credit: 287,628
RAC: 0
Germany
Message 205637 - Posted: 7 Dec 2005, 13:07:35 UTC
Last modified: 7 Dec 2005, 13:09:43 UTC

(have not read everything)

1. (Germany) DSL 2,2 Mbit

2+4. Router - WLAN Laptop; WLAN with USBstick

3. 2 surfbars, email, Avant browser

5. no ;)


deleted cookies yesterday after problems on one site (solved), no other changes
upload+download I/O and 500



07.12.2005 12:35:32|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
07.12.2005 12:35:32|SETI@home|Reason: To fetch work
07.12.2005 12:35:32|SETI@home|Requesting 118 seconds of new work
07.12.2005 12:35:38|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
07.12.2005 12:35:39|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 12:38:49|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: error 500
07.12.2005 12:38:49|SETI@home|Backing off 1 minutes and 0 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 12:39:50|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 12:39:53|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: error 500
07.12.2005 12:39:53|SETI@home|Backing off 1 minutes and 0 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 12:40:54|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 12:41:53||request_reschedule_cpus: process exited
07.12.2005 12:41:53|SETI@home|Computation for result 20mr05aa.3278.9986.311076.239_0 finished
07.12.2005 12:41:53|climateprediction.net|Restarting result 1sdb_400104276_0 using hadsm3 version 414
07.12.2005 12:41:55|SETI@home|Started upload of 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:41:57|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
07.12.2005 12:41:57|climateprediction.net|Reason: To send trickle-up message
07.12.2005 12:41:57|climateprediction.net|Note: not requesting new work or reporting results
07.12.2005 12:42:02|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
07.12.2005 12:44:03|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: error 500
07.12.2005 12:44:03|SETI@home|Backing off 1 minutes and 0 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 12:45:04|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 12:45:06|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: error 500
07.12.2005 12:45:06|SETI@home|Backing off 1 minutes and 0 seconds on upload of file 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:45:26||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu]
07.12.2005 12:45:26|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: system I/O
07.12.2005 12:45:26|SETI@home|Backing off 1 minutes and 0 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 12:46:06|SETI@home|Started upload of 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:46:27|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 12:49:17|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: error 500
07.12.2005 12:49:17|SETI@home|Backing off 1 minutes and 0 seconds on upload of file 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:49:38|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: error 500
07.12.2005 12:49:38|SETI@home|Backing off 1 minutes and 59 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 12:50:18|SETI@home|Started upload of 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:51:17|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: error 500
07.12.2005 12:51:17|SETI@home|Backing off 1 minutes and 0 seconds on upload of file 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:51:38|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 12:52:17|SETI@home|Started upload of 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:53:15|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: error 500
07.12.2005 12:53:15|SETI@home|Backing off 3 minutes and 51 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 12:55:29|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: error 500
07.12.2005 12:55:29|SETI@home|Backing off 1 minutes and 0 seconds on upload of file 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:56:31|SETI@home|Started upload of 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 12:57:08|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 12:59:41|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: error 500
07.12.2005 12:59:41|SETI@home|Backing off 1 minutes and 53 seconds on upload of file 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 13:00:20|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: error 500
07.12.2005 13:00:20|SETI@home|Backing off 13 minutes and 41 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 13:01:35|SETI@home|Started upload of 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 13:01:58||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu]
07.12.2005 13:01:58|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: system I/O
07.12.2005 13:01:58|SETI@home|Backing off 5 minutes and 17 seconds on upload of file 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 13:07:16|SETI@home|Started upload of 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 13:10:26|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: error 500
07.12.2005 13:10:26|SETI@home|Backing off 8 minutes and 54 seconds on upload of file 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 13:14:02|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 13:17:12|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: error 500
07.12.2005 13:17:12|SETI@home|Backing off 7 minutes and 24 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 13:19:21|SETI@home|Started upload of 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 13:22:32|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: error 500
07.12.2005 13:22:32|SETI@home|Backing off 48 minutes and 43 seconds on upload of file 20mr05aa.3278.9986.311076.239_0_0
07.12.2005 13:24:38|SETI@home|Started download of 14ap04ab.18162.7521.698572.61
07.12.2005 13:25:00||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu]
07.12.2005 13:25:00|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: system I/O
07.12.2005 13:25:00|SETI@home|Backing off 1 hours, 34 minutes, and 20 seconds on download of file 14ap04ab.18162.7521.698572.61
07.12.2005 13:45:53||request_reschedule_cpus: project op
07.12.2005 13:45:56|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
07.12.2005 13:45:56|SETI@home|Reason: Requested by user
07.12.2005 13:45:56|SETI@home|Note: not requesting new work or reporting results
07.12.2005 13:46:01|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
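
A log like the one above can be condensed into a few counts instead of being pasted in full. The sketch below is a hypothetical helper, not a BOINC feature. It assumes the `timestamp|project|message` layout and the "Temporarily failed ..." wording seen in this post, and uses five of its lines as sample input.

```python
import re
from collections import Counter

SAMPLE_LOG = """\
07.12.2005 12:35:38|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
07.12.2005 12:38:49|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: error 500
07.12.2005 12:45:06|SETI@home|Temporarily failed upload of 20mr05aa.3278.9986.311076.239_0_0: error 500
07.12.2005 12:45:26||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu]
07.12.2005 12:45:26|SETI@home|Temporarily failed download of 14ap04ab.18162.7521.698572.61: system I/O
"""

FAILED = re.compile(r"Temporarily failed (upload|download) of (\S+): (.+)$")

def summarize(log_text):
    """Count failed transfers by (direction, reason)."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split("|", 2)   # timestamp, project, message
        if len(parts) != 3:
            continue
        m = FAILED.search(parts[2])
        if m:
            counts[(m.group(1), m.group(3).strip())] += 1
    return counts

for (direction, reason), n in sorted(summarize(SAMPLE_LOG).items()):
    print(f"{direction:8s} {reason:12s} {n}")
```

A summary of this kind (direction, error type, count) carries most of what a reader needs from the full log.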

ID: 205637
juergen Kozok 2
Joined: 1 Sep 99
Posts: 1
Credit: 651,202
RAC: 0
Germany
Message 205666 - Posted: 7 Dec 2005, 13:43:18 UTC

Some additions regarding these 500 and system I/O errors:
Everything was working fine for me until a week or two ago, both download and upload. Now I get the errors. Download works OK most of the time, but no upload has gone through since the weekend.
1. Germany - DSL 2,2Mbit
2. Router
I had never seen these 500 errors before.

When an upload starts, it connects and up to 0.28 KB gets uploaded, then everything gets stuck. Some minutes later, without any progress, comes the error message and the backing off of the download. It seems that the connect is OK and the first chunk of data gets sent, but then the handshaking gets lost.

Version change to 5.2.13 did not change anything.

07.12.2005 13:42:24||Starting BOINC client version 5.2.13 for windows_intelx86
07.12.2005 13:42:24||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
07.12.2005 13:42:24||Data directory: D:\\Programme\\BOINC
07.12.2005 13:42:24||Processor: 1 AuthenticAMD AMD Athlon(tm) 64 Processor 3200+
07.12.2005 13:42:24||Memory: 1023.48 MB physical, 3.90 GB virtual
07.12.2005 13:42:24||Disk: 30.27 GB total, 24.69 GB free
07.12.2005 13:42:24||Version change detected (5.2.7 -> 5.2.13); running CPU benchmarks
07.12.2005 13:42:24|SETI@home|Computer ID: 1680656; location: home; project prefs: default
07.12.2005 13:42:24||General prefs: from SETI@home (last modified 2004-07-21 18:46:36)
07.12.2005 13:42:24||General prefs: no separate prefs for home; using your defaults
07.12.2005 13:42:25||Remote control not allowed; using loopback address
07.12.2005 13:42:27||Running CPU benchmarks
07.12.2005 13:43:26||Benchmark results:
07.12.2005 13:43:26|| Number of CPUs: 1
07.12.2005 13:43:26|| 1899 double precision MIPS (Whetstone) per CPU
07.12.2005 13:43:26|| 3563 integer MIPS (Dhrystone) per CPU
07.12.2005 13:43:26||Finished CPU benchmarks
07.12.2005 13:43:27|SETI@home|Resuming computation for result 14ap04ab.18162.7569.367328.100_3 using setiathome version 418
07.12.2005 13:43:27||Resuming computation and network activity
07.12.2005 13:43:27||request_reschedule_cpus: Resuming activities
07.12.2005 14:10:30|SETI@home|Started upload of 30oc03ab.23657.4529.523582.122_1_0
07.12.2005 14:13:41|SETI@home|Temporarily failed upload of 30oc03ab.23657.4529.523582.122_1_0: error 500
07.12.2005 14:13:41|SETI@home|Backing off 1 hours, 11 minutes, and 56 seconds on upload of file 30oc03ab.23657.4529.523582.122_1_0
07.12.2005 14:29:28|SETI@home|Started upload of 15ap04aa.22158.4305.1034646.95_3_0
07.12.2005 14:29:43|SETI@home|Started upload of 21mr05ab.11361.32624.147152.78_0_0
07.12.2005 14:29:51|SETI@home|Temporarily failed upload of 15ap04aa.22158.4305.1034646.95_3_0: error 500
07.12.2005 14:29:51|SETI@home|Backing off 2 hours, 46 minutes, and 47 seconds on upload of file 15ap04aa.22158.4305.1034646.95_3_0
07.12.2005 14:32:54|SETI@home|Temporarily failed upload of 21mr05ab.11361.32624.147152.78_0_0: error 500
07.12.2005 14:32:54|SETI@home|Backing off 2 hours, 43 minutes, and 57 seconds on upload of file 21mr05ab.11361.32624.147152.78_0_0
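
The "Backing off" intervals in logs like this grow from one minute to several hours and vary from attempt to attempt, which is what a randomized exponential back-off looks like. The sketch below is a generic illustration of that technique, not the BOINC client's actual algorithm. The 60-second floor and 4-hour cap are assumptions chosen to resemble the values seen in the logs.

```python
import random

def backoff_delay(failures, base_s=60.0, cap_s=4 * 3600.0, rng=random):
    """Seconds to wait before the next retry after `failures` earlier failures.

    The ceiling doubles with each failure up to cap_s; the actual delay is
    drawn at random below that ceiling, but never less than base_s, so that
    many clients do not all retry at the same moment.
    """
    ceiling = min(cap_s, base_s * (2 ** failures))
    return max(base_s, rng.uniform(0.0, ceiling))

rng = random.Random(2005)   # fixed seed so the demonstration is repeatable
for n in range(8):
    d = backoff_delay(n, rng=rng)
    print(f"after {n} earlier failures: wait {int(d // 60)} min {int(d % 60)} s")
```

Spreading retries out like this keeps an overloaded server from being hit by every waiting client at once, at the cost of long waits for individual transfers once the failure count is high.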

ID: 205666
Nic
Volunteer tester
Joined: 26 Jan 05
Posts: 15
Credit: 187,008
RAC: 0
United States
Message 205686 - Posted: 7 Dec 2005, 14:09:22 UTC - in response to Message 205565.  


The other interesting thing for me is that about 40% to 50% of my finished results seem to be getting through.


I wish I could say the same. Not a single WU of mine has uploaded in 3 days now. As long as they get this fixed in 10 days my WU's won't start to expire.

And if they're having a problem with the uploads, you would think they would stop giving out new units until they caught up. But no...
ID: 205686
nicol
Joined: 4 Jan 01
Posts: 2
Credit: 23,445
RAC: 0
United Kingdom
Message 205715 - Posted: 7 Dec 2005, 14:57:25 UTC - in response to Message 205686.  

I agree: they should stop pumping more WU's out, then maybe they will catch up before my three-plus days' worth expire. I have every sympathy with the problem (s**t happens) and don't mind helping the project, but I'm not keen on wasting time/electricity for nothing.
ID: 205715
Celtic Wolf
Volunteer tester
Joined: 3 Apr 99
Posts: 3278
Credit: 595,676
RAC: 0
United States
Message 205723 - Posted: 7 Dec 2005, 15:12:16 UTC

OK I think I have enough data.

Please do not post your entire BOINC message log. The network traces I was looking for were sniffer traces, though the trace routes you provided lead me away from an ISP issue.

We all know the folks at Berkeley have a problem.. They are working to correct that problem.

PLEASE BE PATIENT!!!

ID: 205723
John Youles
Joined: 10 May 00
Posts: 8
Credit: 1,365
RAC: 0
United Kingdom
Message 205732 - Posted: 7 Dec 2005, 15:23:56 UTC

I take it that work units uploaded late due to the servers not being able to cope will be accepted and credited ? I don't like the idea of contributing computing resources if the results are chucked in the bin.

Did it not occur to the management that getting hundreds of thousands of SETI Classic users to switch over to BOINC at the same time would cause server problems ?

Any extra-terrestrial intelligences must be having a good laugh at this fiasco.


ID: 205732
John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 205786 - Posted: 7 Dec 2005, 16:08:04 UTC - in response to Message 205732.  

If you visit the server status page (http://setiathome.berkeley.edu/sah_status.html), you will have seen that over the last few days the reserve of work units ready for issue has dwindled from over 2 million to the present 519K.

The latter should last another day or more before crunchers (for the science) need to live off their reserve caches. This means the system is "topping up these caches" while this "sort of" outage is occurring, despite not returning all the completed results.

I think others will confirm (Celtic Wolf?) that in the past, when an extended outage threatened to take completed work units over the 14-day deadline, the server check on date validity was turned off. This meant they accepted all the results, which were validated against the other 3 units issued at the same time. After this, credit was given.

So, I think you are wrong to assume completed WU results will be binned this time if they exceed the "reply by date".

As regards the server load as Classic users swapped to BOINC ... I am sure the people at SSL, Berkeley, were aware of the problem and were keeping an eye on the servers. However, an earlier post by Matt L indicated database size problems and application memory leaks were also issues. This occurred at the time new users were hitting the servers.

How do you anticipate, let alone plan, for this combination?

As Celtic Wolf recommends, "PLEASE BE PATIENT": nothing is wasted and continued crunching efforts are appreciated. But the SSL people have the priority of identifying and sorting out the current issues. As soon as these are sorted, things will get smoothly boring again.
It's good to be back amongst friends and colleagues



ID: 205786
Celtic Wolf
Volunteer tester
Joined: 3 Apr 99
Posts: 3278
Credit: 595,676
RAC: 0
United States
Message 205794 - Posted: 7 Dec 2005, 16:15:06 UTC - in response to Message 205786.  



It is usually Dr. Anderson's call, but I have not known SETI to throw away work units because there was a problem with the server.

2 million is an awful lot to waste. Especially if one of them picked up the WOW signal.


ID: 205794