Panic Mode On (48) Server problems?

S@NL - XP_Freak
Joined: 10 Jul 99
Posts: 99
Credit: 6,248,265
RAC: 0
Netherlands
Message 1118984 - Posted: 19 Jun 2011, 12:45:14 UTC

Who can we blame for jinxing it this time?

Goodbye Seti Classic
ID: 1118984
S@NL - XP_Freak
Joined: 10 Jul 99
Posts: 99
Credit: 6,248,265
RAC: 0
Netherlands
Message 1118986 - Posted: 19 Jun 2011, 12:51:01 UTC - in response to Message 1118977.  

Is there a problem with the validate servers? The last 2 work units that I have turned in have been waiting for at least an hour. The server stats page shows they are working.


In the past 13.5 hours, I've gained a total of 1 credit.
No, not credit for 1 wu, just 1 credit.
So you are not the only one. :)


Goodbye Seti Classic
ID: 1118986
Theramansi
Joined: 25 Jun 04
Posts: 97
Credit: 39,577,723
RAC: 63
United States
Message 1118987 - Posted: 19 Jun 2011, 13:02:26 UTC

Looks like validation is down as well.

http://setiathome.berkeley.edu/workunit.php?wuid=764361847
ID: 1118987
Dave C
Joined: 22 Jan 02
Posts: 364
Credit: 1,025,962
RAC: 0
United States
Message 1118988 - Posted: 19 Jun 2011, 13:09:53 UTC - in response to Message 1118986.  
Last modified: 19 Jun 2011, 13:10:28 UTC

Is there a problem with the validate servers? The last 2 work units that I have turned in have been waiting for at least an hour. The server stats page shows they are working.


In the past 13.5 hours, I've gained a total of 1 credit.
No, not credit for 1 wu, just 1 credit.
So you are not the only one. :)


Oh well, it is what it is. I guess they will get it fixed some day :-)
Avians and Myrmicats, the Octospiders, and the Humans all living in one huge cylinder in space called RAMA.
ID: 1118988
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1118994 - Posted: 19 Jun 2011, 13:37:10 UTC

Someone turned on uploads - got mine up so you can try now. :)

Still no server status or validation. No GPU tasks for d/l; not sure about CPU tasks (since blasted BOINC won't ask for them when I'm almost out). Probably not, since cricket shows little d/l activity.
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1118994
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1119001 - Posted: 19 Jun 2011, 14:08:01 UTC - in response to Message 1118995.  

Okay, maybe it was just me. Uploads were okay most of the night here, then struggled for an hour or two, then went to nothing for another hour or so. When they started going up quickly, I thought it was a fix or something. Sorry if I misled anyone (or jinxed it).
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1119001
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34354
Credit: 79,922,639
RAC: 80
Germany
Message 1119004 - Posted: 19 Jun 2011, 14:29:40 UTC


No uploads here since 3 AM last night (1 UTC).



With each crime and every kindness we birth our future.
ID: 1119004
W5DMG - Dave
Joined: 19 May 99
Posts: 155
Credit: 33,162,251
RAC: 0
United States
Message 1119005 - Posted: 19 Jun 2011, 14:30:54 UTC - in response to Message 1119001.  

Uploads were not working here either for a good while, but working now.
Also just turned in a lot of work, but got zero credit and nothing pending.
ID: 1119005
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1119044 - Posted: 19 Jun 2011, 15:56:20 UTC - in response to Message 1118916.  

I've noticed that my validated APs haven't been purged in... almost a week for some. I believe this is part of that table issue with the database that Matt mentioned. They get validated; they just don't get assimilated and then purged. I believe that until it moves to the purge stage, each one takes up 8 MB on disk plus the returned results. Empty storage has to run out eventually.

File deletion normally happens immediately after assimilation. Database entries are held for a while to let users see what happened, then purged. Those database entries are no larger for AP than MB, and even though I have one still showing which validated 10 days ago, 10 days of AP records is still much less than 1 day of MB records.

They did have 45,218 AP WUs waiting for assimilation the last time server status updated; those do reflect extra 8 MiB files, so about 380 GB of storage.
                                                                Joe
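For anyone who wants to check the storage estimate above, here is a minimal Python sketch of the arithmetic. It assumes exactly one 8 MiB input file per waiting AP workunit and ignores the (much smaller) result files; the function name `backlog_bytes` is made up for illustration.

```python
MIB = 1024 ** 2  # bytes in one mebibyte
GIB = 1024 ** 3  # bytes in one gibibyte
GB = 1000 ** 3   # bytes in one decimal gigabyte


def backlog_bytes(workunits: int, mib_per_workunit: int = 8) -> int:
    """Disk space held by input files of workunits awaiting assimilation."""
    return workunits * mib_per_workunit * MIB


total = backlog_bytes(45_218)
print(f"{total / GB:.1f} GB ({total / GIB:.1f} GiB)")  # 379.3 GB (353.3 GiB)
```

So "about 380 GB" holds in decimal gigabytes; in binary units the same backlog is roughly 353 GiB.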
ID: 1119044
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14674
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1119066 - Posted: 19 Jun 2011, 16:13:03 UTC - in response to Message 1119044.  

I've noticed that my validated APs haven't been purged in... almost a week for some. I believe this is part of that table issue with the database that Matt mentioned. They get validated; they just don't get assimilated and then purged. I believe that until it moves to the purge stage, each one takes up 8 MB on disk plus the returned results. Empty storage has to run out eventually.

File deletion normally happens immediately after assimilation. Database entries are held for a while to let users see what happened, then purged. Those database entries are no larger for AP than MB, and even though I have one still showing which validated 10 days ago, 10 days of AP records is still much less than 1 day of MB records.

They did have 45,218 AP WUs waiting for assimilation the last time server status updated; those do reflect extra 8 MiB files, so about 380 GB of storage.
                                                                Joe

Joe, why does that code keep the 8 MB input file hanging around so long? Surely there's no need for that file after validation and a canonical result choice?

As I understand it, the only file which is required for assimilation is the much, much smaller output (result) file, and in theory only one of the multiple copies of that. Is it simply that the file deleter daemon was written, simplistically, not to distinguish input and output files, perhaps not anticipating such a wide disparity in sizes as we have with AP?
ID: 1119066
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1119067 - Posted: 19 Jun 2011, 16:17:04 UTC - in response to Message 1119044.  

I've noticed that my validated APs haven't been purged in... almost a week for some. I believe this is part of that table issue with the database that Matt mentioned. They get validated; they just don't get assimilated and then purged. I believe that until it moves to the purge stage, each one takes up 8 MB on disk plus the returned results. Empty storage has to run out eventually.

File deletion normally happens immediately after assimilation. Database entries are held for a while to let users see what happened, then purged. Those database entries are no larger for AP than MB, and even though I have one still showing which validated 10 days ago, 10 days of AP records is still much less than 1 day of MB records.

Matt explained in the Technical News why the AP WUs can't be purged ATM.
ID: 1119067
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1119080 - Posted: 19 Jun 2011, 16:51:15 UTC - in response to Message 1119066.  


File deletion normally happens immediately after assimilation. Database entries are held for a while to let users see what happened, then purged. Those database entries are no larger for AP than MB, and even though I have one still showing which validated 10 days ago, 10 days of AP records is still much less than 1 day of MB records.

They did have 45,218 AP WUs waiting for assimilation the last time server status updated; those do reflect extra 8 MiB files, so about 380 GB of storage.
                                                                Joe

Joe, why does that code keep the 8 MB input file hanging around so long? Surely there's no need for that file after validation and a canonical result choice?

As I understand it, the only file which is required for assimilation is the much, much smaller output (result) file, and in theory only one of the multiple copies of that. Is it simply that the file deleter daemon was written, simplistically, not to distinguish input and output files, perhaps not anticipating such a wide disparity in sizes as we have with AP?

Assimilation is a project function, so I think BOINC is just preserving all the data to give the project as much flexibility as possible. It might make sense for some kind of project to assimilate the WU rather than, or in addition to, the result.

The S@H validators compare the signals between results but do not check whether those "signals" are possible; it's left to the assimilator code to do sanity checks. Here, if those sanity checks failed, the WU would be assumed to be garbage and discarded, but perhaps for some other project it would make sense to try the same work again, maybe with slightly adjusted processing.

I agree that for this project it would make sense to delete WU files as soon as a canonical result is chosen, but that would require changes in BOINC rather than just project code.
                                                                Joe
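To make the ordering discussed above concrete, here is a toy Python model of the stages as they are described in this thread (validation, then assimilation, then file deletion, then database purge). It is only a sketch: the names `Stage`, `next_stage` and `input_file_on_disk` are invented for illustration and are not BOINC's actual code or API.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Workunit stages in the order described in the thread (hypothetical names)."""
    VALIDATED = 1      # canonical result chosen
    ASSIMILATED = 2    # project code has taken in the result
    FILES_DELETED = 3  # input and output files removed from disk
    PURGED = 4         # database entries removed


def next_stage(stage: Stage, assimilator_running: bool = True) -> Stage:
    """Advance one step; a validated workunit stays put while assimilation is stalled."""
    if stage is Stage.VALIDATED and not assimilator_running:
        return stage
    if stage is Stage.PURGED:
        return stage
    return Stage(stage + 1)


def input_file_on_disk(stage: Stage) -> bool:
    """Files are only removed once the file-deletion stage has been reached."""
    return stage < Stage.FILES_DELETED


stuck = next_stage(Stage.VALIDATED, assimilator_running=False)
print(stuck.name, input_file_on_disk(stuck))
```

In this model a stalled assimilator leaves every validated workunit holding its input file, which matches the backlog described earlier. Deleting the input file as soon as a canonical result is chosen would mean changing `input_file_on_disk`, i.e. the framework rule rather than the project's assimilator.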
ID: 1119080
Jim_S
Joined: 23 Feb 00
Posts: 4705
Credit: 64,560,357
RAC: 31
United States
Message 1119100 - Posted: 19 Jun 2011, 17:33:11 UTC - in response to Message 1118919.  

There should not be any problem uploading at this moment.
I suspect the problem is on your side, since even your AMD Athlon(tm) II X4 630 has not contacted Berkeley for more than 3 days. Last contact was 15 Jun 02:24 UTC.

I've checked everything on my end; S@H still no joy... My other projects are working fine.
Confused!

I Desire Peace and Justice, Jim Scott (Mod-Ret.)
ID: 1119100
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1119101 - Posted: 19 Jun 2011, 17:40:17 UTC - in response to Message 1119100.  

There should not be any problem uploading at this moment.
I suspect the problem is on your side, since even your AMD Athlon(tm) II X4 630 has not contacted Berkeley for more than 3 days. Last contact was 15 Jun 02:24 UTC.

I've checked everything on my end; S@H still no joy... My other projects are working fine.
Confused!

Might try posting some of the BOINC messages shown when it tries to connect.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1119101
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1119114 - Posted: 19 Jun 2011, 18:06:41 UTC - in response to Message 1119100.  



I've checked everything on my end; S@H still no joy... My other projects are working fine.

Confused!




Don't worry too much. I had one computer do exactly the same thing. It didn't show a connection for four days while four others did connect. There didn't seem to be anything I could do to force it to even check in, much less upload and download. That made me worry.

It was still crunching work, so I walked away feeling defeated.

For absolutely no reason that I know, the day after I decided I had done all I could do except "start over", it just connected on its own and has connected reliably since.

It is almost as though Berkeley made it go sit in a corner, put it in a penalty box, gave it a time-out, ejected it for the remainder of the game, left its heart in San Francisco, detained it for later questioning...

Starting over would have meant I lost a lot of work units, both completed and not yet crunched, so that would have been bad.

Everything may not be fine at your end, but it might be.
ID: 1119114
justsomeguy
Joined: 27 May 99
Posts: 84
Credit: 6,084,595
RAC: 11
United States
Message 1119130 - Posted: 19 Jun 2011, 19:17:50 UTC

I get the following:

6/19/2011 12:15:06 PM SETI@home Starting task 05mr11af.8242.140253.11.10.120_2 using setiathome_enhanced version 603
6/19/2011 12:28:00 PM SETI@home Sending scheduler request: To report completed tasks.
6/19/2011 12:28:00 PM SETI@home Reporting 1 completed tasks, not requesting new tasks
6/19/2011 12:28:22 PM Project communication failed: attempting access to reference site
6/19/2011 12:28:22 PM SETI@home Scheduler request failed: Couldn't connect to server
6/19/2011 12:28:23 PM Internet access OK - project servers may be temporarily down.
6/19/2011 12:29:35 PM SETI@home Started upload of 04mr11ab.29587.9883.5.10.159_0_0
6/19/2011 12:29:35 PM SETI@home Started upload of 02ja11ac.20540.2930.9.10.137.vlar_1_0
6/19/2011 12:29:57 PM Project communication failed: attempting access to reference site
6/19/2011 12:29:57 PM SETI@home Temporarily failed upload of 04mr11ab.29587.9883.5.10.159_0_0: connect() failed
6/19/2011 12:29:57 PM SETI@home Backing off 1 hr 21 min 17 sec on upload of 04mr11ab.29587.9883.5.10.159_0_0
6/19/2011 12:29:57 PM SETI@home Temporarily failed upload of 02ja11ac.20540.2930.9.10.137.vlar_1_0: connect() failed
6/19/2011 12:29:57 PM SETI@home Backing off 3 hr 29 min 46 sec on upload of 02ja11ac.20540.2930.9.10.137.vlar_1_0
6/19/2011 12:29:58 PM Internet access OK - project servers may be temporarily down.
6/19/2011 1:57:11 PM SETI@home update requested by user
6/19/2011 1:57:16 PM SETI@home Sending scheduler request: Requested by user.
6/19/2011 1:57:16 PM SETI@home Reporting 1 completed tasks, not requesting new tasks
6/19/2011 1:57:38 PM Project communication failed: attempting access to reference site
6/19/2011 1:57:38 PM SETI@home Scheduler request failed: Couldn't connect to server
6/19/2011 1:57:39 PM Internet access OK - project servers may be temporarily down.


No uploads for 5 days now...
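If you want to see at a glance how long each upload has been told to wait, a log like the one above can be summarised with a short Python sketch. The regular expression is an assumption based only on the "Backing off ... on upload of ..." lines shown in this thread; other BOINC versions may word these messages differently, and `parse_backoffs` is a made-up helper name.

```python
import re

# Matches e.g. "Backing off 1 hr 21 min 17 sec on upload of <file>"
# (pattern inferred from the log lines in this thread, not from BOINC docs).
BACKOFF_RE = re.compile(
    r"Backing off (?:(?P<hr>\d+) hr )?(?:(?P<min>\d+) min )?"
    r"(?P<sec>\d+) sec on upload of (?P<name>\S+)"
)


def parse_backoffs(log_text: str) -> dict[str, int]:
    """Return {file name: back-off in seconds} for each back-off line found."""
    backoffs = {}
    for m in BACKOFF_RE.finditer(log_text):
        seconds = int(m["hr"] or 0) * 3600 + int(m["min"] or 0) * 60 + int(m["sec"])
        backoffs[m["name"]] = seconds
    return backoffs


SAMPLE = """\
6/19/2011 12:29:57 PM SETI@home Backing off 1 hr 21 min 17 sec on upload of 04mr11ab.29587.9883.5.10.159_0_0
6/19/2011 12:29:57 PM SETI@home Backing off 3 hr 29 min 46 sec on upload of 02ja11ac.20540.2930.9.10.137.vlar_1_0
"""

backoffs = parse_backoffs(SAMPLE)
for name, seconds in backoffs.items():
    print(f"{name}: {seconds} s")
```

The two sample lines work out to 4877 s and 12586 s, so manual retries or a project update are the only way anything moves sooner than that.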

"Two things are infinite: The universe and human stupidity; and I'm not sure about the universe." - Albert Einstein

ID: 1119130
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1119131 - Posted: 19 Jun 2011, 19:23:28 UTC

Hmmmmmm...
No problems with comms here. Just checked 4 of my better rigs.

Uploads and reporting working just fine. No work being issued, of course. And no validation of tasks going on right now, so everything is going into pending.

"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1119131
Wiggo
Joined: 24 Jan 00
Posts: 36396
Credit: 261,360,520
RAC: 489
Australia
Message 1119170 - Posted: 19 Jun 2011, 21:40:29 UTC - in response to Message 1119131.  

No problem here either this morning as all mine have uploaded and reported fine.

Cheers.
ID: 1119170
Jim_S
Joined: 23 Feb 00
Posts: 4705
Credit: 64,560,357
RAC: 31
United States
Message 1119185 - Posted: 19 Jun 2011, 22:33:39 UTC - in response to Message 1119101.  

There should not be any problem uploading at this moment.
I suspect the problem is on your side, since even your AMD Athlon(tm) II X4 630 has not contacted Berkeley for more than 3 days. Last contact was 15 Jun 02:24 UTC.

I've checked everything on my end; S@H still no joy... My other projects are working fine.
Confused!

Might try posting some of the BOINC messages shown when it tries to connect.

Here is some of what I got:

6/19/2011 12:49:02 PM SETI@home Started upload of 13mr11ah.3053.10292.10.10.80_0_0
6/19/2011 12:49:09 PM Project communication failed: attempting access to reference site
6/19/2011 12:49:10 PM Internet access OK - project servers may be temporarily down.
6/19/2011 12:49:24 PM Project communication failed: attempting access to reference site
6/19/2011 12:49:24 PM SETI@home Temporarily failed upload of 08ap11ab.18487.8247.5.10.123_0_0: connect() failed
6/19/2011 12:49:24 PM SETI@home Backing off 2 hr 36 min 22 sec on upload of 08ap11ab.18487.8247.5.10.123_0_0
6/19/2011 12:49:24 PM SETI@home Temporarily failed upload of 13mr11ah.3053.10292.10.10.80_0_0: connect() failed
6/19/2011 12:49:24 PM SETI@home Backing off 2 hr 33 min 59 sec on upload of 13mr11ah.3053.10292.10.10.80_0_0
6/19/2011 12:49:26 PM Internet access OK - project servers may be temporarily down.
6/19/2011 12:50:25 PM SETI@home Started upload of 05mr11af.8242.137390.11.10.10_2_0
6/19/2011 12:50:25 PM SETI@home Started upload of 18ap11aa.18789.4157.6.10.49_1_0
6/19/2011 12:50:47 PM Project communication failed: attempting access to reference site
6/19/2011 12:50:47 PM SETI@home Temporarily failed upload of 05mr11af.8242.137390.11.10.10_2_0: connect() failed
6/19/2011 12:50:47 PM SETI@home Backing off 7 min 21 sec on upload of 05mr11af.8242.137390.11.10.10_2_0
6/19/2011 12:50:47 PM SETI@home Temporarily failed upload of 18ap11aa.18789.4157.6.10.49_1_0: connect() failed
6/19/2011 12:50:47 PM SETI@home Backing off 3 hr 19 min 3 sec on upload of 18ap11aa.18789.4157.6.10.49_1_0
6/19/2011 12:50:48 PM Internet access OK - project servers may be temporarily down.
6/19/2011 12:54:25 PM SETI@home Sending scheduler request: To report completed tasks.
6/19/2011 12:54:25 PM SETI@home Reporting 32 completed tasks, not requesting new tasks
6/19/2011 12:54:47 PM Project communication failed: attempting access to reference site
6/19/2011 12:54:47 PM SETI@home Scheduler request failed: Couldn't connect to server
6/19/2011 12:54:48 PM Internet access OK - project servers may be temporarily down.
6/19/2011 12:55:01 PM Re-reading cc_config.xml
6/19/2011 12:55:01 PM Re-read config file
6/19/2011 12:55:01 PM log flags: file_xfer, sched_ops, task
6/19/2011 12:55:47 PM SETI@home Fetching scheduler list
6/19/2011 12:55:48 PM SETI@home Master file download succeeded
6/19/2011 12:55:54 PM SETI@home Sending scheduler request: To report completed tasks.
6/19/2011 12:55:54 PM SETI@home Reporting 32 completed tasks, not requesting new tasks
6/19/2011 12:56:15 PM Project communication failed: attempting access to reference site
6/19/2011 12:56:15 PM SETI@home Scheduler request failed: Couldn't connect to server
6/19/2011 12:56:16 PM Internet access OK - project servers may be temporarily down.


I Desire Peace and Justice, Jim Scott (Mod-Ret.)
ID: 1119185
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1119187 - Posted: 19 Jun 2011, 22:42:45 UTC
Last modified: 19 Jun 2011, 22:48:09 UTC

There's a spike of downloads happening. My RAC has dropped by 2k, and as I refresh my Account page my RAC is going back up, so the validators must be online again.

Claggy
ID: 1119187
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.