Can't report or get new tasks

Profile dahls

Send message
Joined: 24 Oct 04
Posts: 135
Credit: 178,942,502
RAC: 217
Norway
Message 1119444 - Posted: 20 Jun 2011, 18:24:37 UTC - in response to Message 1119353.  

I know this machine runs Linux, have you tried a reboot? Works many times for Windows machines.

A project reset is also an option but you would lose the work already on board and completed. Berkeley should resend the work after the reset but you would have to crunch it again. Last but best option in my view. After 6 days of non communication you have to do something!


I have considered it. But since this is a database server, I have to find time to do so.

About the other last replies:

The problem machine and all the other machines are located on different internal LANs connected to a firewall, which in turn is connected to the Internet through an ADSL router. Rebooting the router will not help, since all other traffic seems to work quite well (both BOINC and non-BOINC communication).

If there had been anything wrong with the machine's network connection, the database traffic would also have suffered from it. Downloading new WUs did not seem to be any problem a few days ago. Uploading them seems to work from time to time, but reporting them is not possible.

Also, BOINC on this machine is currently reporting "Retrieved header from server: 500 Internal server error". This tells me (based on several years of experience with web design and programming using PHP, MySQL and the Apache web server) that the server (in this case NOT my machine, but the server at the other end) has a problem.

But since no one else seems to report this error, it could be caused by invalid data sent from my machine.
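
As a rough illustration of the reasoning above (this is not something BOINC itself runs, and the function name is made up for the sketch): the class of an HTTP status code says which side the protocol blames. 4xx codes point at the client's request, 5xx codes at the server. A 5xx can still be triggered by a request the server fails to handle gracefully, which is consistent with the suspicion about invalid data.

```python
def classify_http_status(code: int) -> str:
    """Return which side an HTTP status code points at, by status class."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"   # the request itself was rejected
    if 500 <= code < 600:
        return "server error"   # the server failed while handling the request
    return "other"              # 1xx informational, 3xx redirects, or invalid


print(classify_http_status(500))
```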

And again... thanks for all replies :)
ID: 1119444 · Report as offensive
Profile Skywalker66 @ Berlin

Send message
Joined: 31 Jan 01
Posts: 78
Credit: 27,692,349
RAC: 0
Germany
Message 1119450 - Posted: 20 Jun 2011, 18:39:46 UTC - in response to Message 1119445.  



Have you tried flushing the DNS cache on the machine in question?

Edit: you may have corrupt, or outdated DNS entries for the project.



Even after flushing, I cannot connect to the project. I am going through a proxy again, and that works :-(
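
One way to check the stale-DNS theory, sketched under the assumption that Python is available on both a working and a failing machine: resolve the project host name on each and compare the address lists. The helper name `resolve_addresses` is hypothetical, and the `resolver` parameter exists only so the function can be exercised without network access.

```python
import socket


def resolve_addresses(host, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses that `host` resolves to."""
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    infos = resolver(host, 80, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_addresses("setiathome.berkeley.edu")` on each machine and getting different lists would point at a cached or outdated entry on one of them. Identical lists would suggest the problem lies elsewhere, which matches the observation that the proxy route works.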

ID: 1119450 · Report as offensive
Profile dahls

Send message
Joined: 24 Oct 04
Posts: 135
Credit: 178,942,502
RAC: 217
Norway
Message 1119470 - Posted: 20 Jun 2011, 19:38:45 UTC - in response to Message 1119450.  

I also forgot to mention that this machine is the internal DNS server. So if this machine had an invalid DNS entry, all the other machines should suffer from the same problem, but they do not.

Anyway, I can try to restart the DNS server on the machine.
ID: 1119470 · Report as offensive
Profile dahls

Send message
Joined: 24 Oct 04
Posts: 135
Credit: 178,942,502
RAC: 217
Norway
Message 1124210 - Posted: 3 Jul 2011, 7:53:56 UTC

Another machine is showing signs of a communication problem. The number of completed WUs is increasing, and the time of last contact is days back.

Both machines are running Fedora Core 12. Can this problem be related to the OS?

The first computer that had this problem (and still has) is http://setiathome.berkeley.edu/show_host_detail.php?hostid=5256434.
The new computer that seems to have this problem is http://setiathome.berkeley.edu/show_host_detail.php?hostid=5306094.

ID: 1124210 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22241
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1124211 - Posted: 3 Jul 2011, 7:57:28 UTC

Take a look at this thread:
http://setiathome.berkeley.edu/forum_thread.php?id=64652

- it may be a problem within the Hurricane Electric network that's stopping your communications getting through.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1124211 · Report as offensive
Profile S@NL - XP_Freak

Send message
Joined: 10 Jul 99
Posts: 99
Credit: 6,248,265
RAC: 0
Netherlands
Message 1124212 - Posted: 3 Jul 2011, 8:00:34 UTC - in response to Message 1124210.  
Last modified: 3 Jul 2011, 8:01:13 UTC

ID: 1124212 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13755
Credit: 208,696,464
RAC: 304
Australia
Message 1124214 - Posted: 3 Jul 2011, 8:28:13 UTC - in response to Message 1124212.  
Last modified: 3 Jul 2011, 8:30:47 UTC

Downloads have been slow for most of the day. Then several hours ago uploads started taking several attempts to go through. It's been getting progressively worse.
And with the uploads having difficulties going through, that means problems uploading data to the Scheduler.
At the moment probably less than one in 30 of my attempts to contact the Scheduler are successful.
Most result in
SETI@home Scheduler request failed: HTTP internal server error


EDIT- and more & more of my downloads are resulting in errors.
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>29ap11ad.7260.18881.3.10.9</file_name>
<error_code>-119</error_code>
<error_message>MD5 check failed</error_message>
</file_xfer_error>
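
For readers unfamiliar with the "MD5 check failed" message: the client compares a checksum of the downloaded file with the one the server advertised, and a truncated or corrupted transfer fails the comparison. A minimal sketch of that kind of check follows. The helper names are made up for illustration, and this is not BOINC's actual code.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected_hex):
    """Return True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected_hex.strip().lower()
```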
Grant
Darwin NT
ID: 1124214 · Report as offensive
Profile Fred J. Verster
Volunteer tester
Avatar

Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1124239 - Posted: 3 Jul 2011, 11:56:04 UTC - in response to Message 1124214.  
Last modified: 3 Jul 2011, 12:11:22 UTC

Uploads are OK; downloads are still difficult, but not impossible, from my
LT (WLAN):

3-7-2011 4:44:01 | SETI@home | Started download of 29ap11aj.27207.15200.10.10.80
3-7-2011 4:44:37 | SETI@home | Finished download of 29ap11aj.27207.15200.10.10.80
3-7-2011 4:44:37 | SETI@home | Started download of 31mr11ai.10601.20108.9.10.38
3-7-2011 4:45:06 | SETI@home | Finished download of 31mr11ai.10601.20108.9.10.38
3-7-2011 4:45:06 | SETI@home | Started download of 29ap11aj.27207.15200.10.10.89
3-7-2011 4:45:10 | SETI@home | Started download of 28ap11ag.15926.22153.15.10.90
3-7-2011 4:45:11 | SETI@home | Temporarily failed download of 29ap11aj.27207.15200.10.10.89: HTTP error
3-7-2011 4:45:11 | SETI@home | Backing off 13 min 49 sec on download of 29ap11aj.27207.15200.10.10.89
3-7-2011 4:45:11 | SETI@home | Started download of 31mr11ai.10601.20108.9.10.46
3-7-2011 4:45:15 | SETI@home | Temporarily failed download of 28ap11ag.15926.22153.15.10.90: HTTP error
3-7-2011 4:45:15 | SETI@home | Backing off 12 min 8 sec on download of 28ap11ag.15926.22153.15.10.90
3-7-2011 4:45:49 | SETI@home | Finished download of 31mr11ai.10601.20108.9.10.46
3-7-2011 4:45:49 | SETI@home | Started download of 31mr11ai.10601.20108.9.10.48
3-7-2011 4:45:49 | SETI@home | Started download of 29ap11aa.10350.22562.4.10.169
3-7-2011 4:45:51 | SETI@home | Temporarily failed download of 29ap11aa.10350.22562.4.10.169: HTTP error
3-7-2011 4:45:51 | SETI@home | Backing off 11 min 11 sec on download of 29ap11aa.10350.22562.4.10.169

3-7-2011 5:58:22 | SETI@home | Started download of 29ap11aa.10350.22562.4.10.146
3-7-2011 5:58:50 | SETI@home | Sending scheduler request: To fetch work.
3-7-2011 5:58:50 | SETI@home | Reporting 4 completed tasks, requesting new tasks for CPU
3-7-2011 5:58:59 | SETI@home | Scheduler request completed: got 1 new tasks
3-7-2011 5:59:14 | SETI@home | Temporarily failed download of 29ap11aa.10350.22562.4.10.146: HTTP error
3-7-2011 5:59:14 | SETI@home | Backing off 56 min 30 sec on download of 29ap11aa.10350.22562.4.10.146
3-7-2011 5:59:30 |  | Project communication failed: attempting access to reference site
3-7-2011 5:59:32 |  | Internet access OK - project servers may be temporarily down.
3-7-2011 5:59:42 | SETI@home | Temporarily failed download of 28ap11ag.15926.22153.15.10.75: HTTP error
3-7-2011 5:59:42 | SETI@home | Backing off 29 min 9 sec on download of 28ap11ag.15926.22153.15.10.75
3-7-2011 5:59:44 |  | Project communication failed: attempting access to reference site
3-7-2011 5:59:46 |  | Internet access OK - project servers may be temporarily down.
3-7-2011 8:34:46 | SETI@home | Started upload of 26fe11ad.2360.2521.8.10.233.vlar_1_0
3-7-2011 8:34:55 | SETI@home | Finished upload of 26fe11ad.2360.2521.8.10.233.vlar_1_0
3-7-2011 9:01:19 | SETI@home | Started download of 28ap11ag.15926.22153.15.10.75
3-7-2011 9:01:19 | SETI@home | Started download of 29ap11aa.10350.22562.4.10.146
3-7-2011 9:01:24 | SETI@home | Temporarily failed download of 28ap11ag.15926.22153.15.10.75: HTTP error
3-7-2011 9:01:24 | SETI@home | Backing off 1 hr 5 min 17 sec on download of 28ap11ag.15926.22153.15.10.75
3-7-2011 9:05:01 | SETI@home | Temporarily failed download of 29ap11aa.10350.22562.4.10.146: HTTP error
3-7-2011 9:05:01 | SETI@home | Backing off 1 hr 35 min 11 sec on download of 29ap11aa.10350.22562.4.10.146
3-7-2011 9:05:14 |  | Project communication failed: attempting access to reference site 
3-7-2011 9:05:16 |  | Internet access OK - project servers may be temporarily down.
3-7-2011 10:08:38 | SETI@home | Computation for task 26fe11ad.2360.2521.8.10.227.vlar_1 finished
3-7-2011 10:08:38 | SETI@home | Starting 26fe11ad.923.2521.16.10.234.vlar_1
3-7-2011 10:08:38 | SETI@home | Starting task 26fe11ad.923.2521.16.10.234.vlar_1 using setiathome_enhanced version 603
3-7-2011 10:08:40 | SETI@home | Started upload of 26fe11ad.2360.2521.8.10.227.vlar_1_0
3-7-2011 10:08:46 | SETI@home | Finished upload of 26fe11ad.2360.2521.8.10.227.vlar_1_0 .


Eventually downloads are getting through.
But still waiting on 29ap11aa.10350.22562.4.10.169 and a few others;
BOINC (6.12.26.|x86) will retry in 5 hours, according to the download queue.
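
To see how the back-off intervals in a log like the one above grow, the "Backing off" lines can be pulled out with a small script. This is only a sketch that assumes the message format shown in this post; `parse_backoffs` is a made-up helper, not a BOINC tool.

```python
import re

# Matches e.g. "Backing off 1 hr 5 min 17 sec on download of <file>".
# The hour and minute parts are optional in the messages shown above.
BACKOFF_RE = re.compile(
    r"Backing off (?:(?P<hr>\d+) hr )?(?:(?P<min>\d+) min )?(?P<sec>\d+) sec "
    r"on (?P<kind>download|upload) of (?P<name>\S+)"
)


def parse_backoffs(lines):
    """Yield (kind, file name, back-off in seconds) for each back-off line."""
    for line in lines:
        match = BACKOFF_RE.search(line)
        if match:
            seconds = (int(match["hr"] or 0) * 3600
                       + int(match["min"] or 0) * 60
                       + int(match["sec"]))
            yield match["kind"], match["name"], seconds
```

Feeding it the saved log text line by line gives one tuple per back-off, which makes it easy to see which files keep failing and how long the client waits between retries.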
ID: 1124239 · Report as offensive
Profile dahls

Send message
Joined: 24 Oct 04
Posts: 135
Credit: 178,942,502
RAC: 217
Norway
Message 1124260 - Posted: 3 Jul 2011, 14:15:38 UTC - in response to Message 1124239.  

I have rebooted the second machine that seems to be failing (this is a web server and can be rebooted without too many issues).

After 15 minutes of uptime, BOINC has tried to upload/report WUs, but gets "500 Internal server error". And still, the last contact date is June 25th for this machine.

The other machines, which seem to be working OK, have today's date as the time of last contact.
ID: 1124260 · Report as offensive
Profile perryjay
Volunteer tester
Avatar

Send message
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1124265 - Posted: 3 Jul 2011, 14:28:53 UTC

I'm one of the lucky ones that have a clear path to SETI but my pendings are climbing and I've got a lot of wingmen timed out and work resent. It should be really interesting to see what happens when they fix this problem and all these back work units flood in.

I wonder if there is any way the guys at the lab can rig it so all those that timed out because of this still get credit for the work they couldn't send in? I guess those WUs that get resent and returned will go to cleared but that should still leave a lot of work that could get validated. If they could rig it so that those returned results don't clear but wait until all results are returned that would be great.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1124265 · Report as offensive
Profile S@NL - XP_Freak

Send message
Joined: 10 Jul 99
Posts: 99
Credit: 6,248,265
RAC: 0
Netherlands
Message 1124455 - Posted: 3 Jul 2011, 23:05:53 UTC - in response to Message 1124260.  

And still, the last contact date is June 25th for this machine.

I don't understand this at all.
I am certain that this morning it showed a WU that was uploaded at about 7:00 UTC on 3 July. I even checked the link I put in my previous post.
But now the link shows another WU. ???

Goodbye Seti Classic
ID: 1124455 · Report as offensive
Profile BilBg
Volunteer tester
Avatar

Send message
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1124472 - Posted: 4 Jul 2011, 0:25:36 UTC - in response to Message 1124260.  


Do all of your Computers:
http://setiathome.berkeley.edu/hosts_user.php?userid=7992989

(which you say "use the same ADSL line and hide behind the same IP-address")

show the same "External IP address"?
(after you click "Details" and "Show IP address")


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1124472 · Report as offensive
Profile Donald L. Johnson
Avatar

Send message
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1124591 - Posted: 4 Jul 2011, 14:38:21 UTC
Last modified: 4 Jul 2011, 14:38:44 UTC

Looks like we have another weekend crash.
Server status last updated at 4 July 2011 0230 UTC.
Upload server on bruno is disabled.
Don't know if they have tried a remote power cycle/reboot yet, but if that doesn't work, it will probably be Tuesday afternoon before they get it sorted. (Today - 4 July - is a Holiday here in the USA).
Donald
Infernal Optimist / Submariner, retired
ID: 1124591 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22241
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1124615 - Posted: 4 Jul 2011, 16:19:40 UTC

Yawn - it's Monday, and Bruno has crashed over the weekend. Again.


That's not news, that's an event, a fact of life, albeit a rather boring one until those in the lab can get to the bottom of the problem. Thoughts there are running along the lines of a CPU going marginal, or a motherboard, or memory, or, or, or.

The trouble is that each test takes a week or so to complete; they start with the cheapest part and work their way up to the most expensive. It all takes time and effort.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1124615 · Report as offensive
Profile Fred J. Verster
Volunteer tester
Avatar

Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1124628 - Posted: 4 Jul 2011, 16:49:58 UTC - in response to Message 1124615.  
Last modified: 4 Jul 2011, 17:01:39 UTC

Must have been about 06:30 according to BOINC's log:
4-7-2011 3:38:46 | SETI@home | Started upload of 27fe11ah.25769.8656.3.10.230_0_0
4-7-2011 3:38:53 | SETI@home | Finished upload of 27fe11ah.25769.8656.3.10.230_0_0
4-7-2011 4:36:15 | SETI@home | Started download of 28ap11ag.15926.22153.15.10.75
4-7-2011 4:36:15 | SETI@home | Started download of 29ap11aa.10350.22562.4.10.146
4-7-2011 4:36:17 | SETI@home | Temporarily failed download of 28ap11ag.15926.22153.15.10.75: HTTP error
4-7-2011 4:36:17 | SETI@home | Backing off 4 hr 18 min 38 sec on download of 28ap11ag.15926.22153.15.10.75
4-7-2011 4:36:17 | SETI@home | Temporarily failed download of 29ap11aa.10350.22562.4.10.146: HTTP error
4-7-2011 4:36:17 | SETI@home | Backing off 7 hr 31 min 23 sec on download of 29ap11aa.10350.22562.4.10.146
4-7-2011 6:20:40 | SETI@home | Temporarily failed upload of 27fe11ah.28116.3748.4.10.109_0_0: connect() failed
4-7-2011 6:20:40 | SETI@home | Backing off 11 min 52 sec on upload of 27fe11ah.28116.3748.4.10.109_0_0
4-7-2011 6:20:53 |  | Project communication failed: attempting access to reference site
4-7-2011 6:20:54 |  | Internet access OK - project servers may be temporarily down.
4-7-2011 6:32:34 | SETI@home | Started upload of 27fe11ah.28116.3748.4.10.109_0_0
4-7-2011 6:32:56 | SETI@home | Temporarily failed upload of 27fe11ah.28116.3748.4.10.109_0_0: connect() failed
4-7-2011 6:32:56 | SETI@home | Backing off 25 min 0 sec on upload of 27fe11ah.28116.3748.4.10.109_0_0
4-7-2011 6:33:11 |  | Project communication failed: attempting access to reference site
4-7-2011 6:33:12 |  | Internet access OK - project servers may be temporarily down.


And a new week has just started... :)
Over here, summer holidays have started for the middle of the
Netherlands.
Hope the guys at the lab will soon find the 'trouble' and fix it.
ID: 1124628 · Report as offensive
Profile Gundolf Jahn

Send message
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 1124639 - Posted: 4 Jul 2011, 17:28:41 UTC - in response to Message 1124628.  
Last modified: 4 Jul 2011, 17:29:48 UTC

Must have been about 06:30 according to BOINC's LOG...

If that's your (our:-) local time, that would be 04:30 UTC, about two hours after the Status-page hang-up. ;-)
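
The conversion can be checked with a few lines of Python. This sketch assumes the log timestamp is Central European Summer Time (UTC+2), as is the case for the Netherlands and Germany in July.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Central European Summer Time (an assumption about the log).
CEST = timezone(timedelta(hours=2), "CEST")


def local_to_utc(local_time):
    """Convert a timezone-aware datetime to UTC."""
    return local_time.astimezone(timezone.utc)


logged = datetime(2011, 7, 4, 6, 30, tzinfo=CEST)
print(local_to_utc(logged).strftime("%H:%M UTC"))  # prints "04:30 UTC"
```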

Gruß,
Gundolf
ID: 1124639 · Report as offensive
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Avatar

Send message
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 1124642 - Posted: 4 Jul 2011, 17:39:58 UTC - in response to Message 1124635.  

No doubt they'll fix it in the outrage tomorrow.


Faux pas. :^)
ID: 1124642 · Report as offensive
Profile Fred J. Verster
Volunteer tester
Avatar

Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1124662 - Posted: 4 Jul 2011, 18:33:33 UTC - in response to Message 1124642.  
Last modified: 4 Jul 2011, 18:39:11 UTC

Isn't it a holiday, Independence Day? Fourth of July.........
Still have work, so it'll be tomorrow............

And, completely off topic.......

This morning I had to reboot my i7-2600; its Intel DP67BG mobo,
now 3 months old, started and then stopped immediately after that.
And started again........................ long story short, battery dead!

Back on topic.
ID: 1124662 · Report as offensive
Dave

Send message
Joined: 29 Mar 02
Posts: 778
Credit: 25,001,396
RAC: 0
United Kingdom
Message 1124667 - Posted: 4 Jul 2011, 18:48:10 UTC

Yes back on topic...

I just shut down and restarted BOINC, and now it says it is unable to connect to the local client.
ID: 1124667 · Report as offensive
Profile Gundolf Jahn

Send message
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 1124669 - Posted: 4 Jul 2011, 18:54:10 UTC - in response to Message 1124667.  

Is the local client running? ;-)
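
A quick way to answer that question, sketched with a hypothetical helper: test whether anything accepts TCP connections on the port the BOINC Manager uses to talk to the client. The port 31416 mentioned below is the usual default for BOINC's GUI RPC, but treat it as an assumption and check your own configuration.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

If `port_is_open("127.0.0.1", 31416)` returns False, the client process is most likely not running (or is listening elsewhere), which would explain the Manager's message.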

Gruß,
Gundolf
ID: 1124669 · Report as offensive


 