Tangled Web (Jun 05 2008)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 763325 - Posted: 5 Jun 2008, 21:24:59 UTC

Another mild day in server land. Lots of minor Apache issues. There was an annoying web scrape yesterday afternoon that gummed up the works for a moment. This morning I found a bug in the web log rotation script that prevented our public web server from restarting - so it's been running non-stop for weeks, during which the httpd processes bloated in size (apparently there are small/tolerable memory leaks in the PHP/Apache/BOINC code somewhere). Then later our scheduling server was suddenly unable to run the scheduler CGI. We were dropping connections, so I got alerts about this right away. I had to stop/restart Apache twice, though, to get it working again. Not sure why the first restart didn't take.

Jeff's adding more star catalog data to our database. Bob worked on another alert script to better check our current database storage allocations (and prevent another minor mishap like the one earlier this week). Eric and I swapped drives between his hydrogen server "ewen" and ptolemy (for when the latter becomes a storage server). ewen freaked out a little bit unexpectedly: we unmounted the filesystems before pulling the drives, but an XFS daemon woke up and thought that particular partition should still be around, etc. No big deal - just a lot of alert e-mails that were scary at first.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 763325
Dr. C.E.T.I.
Joined: 29 Feb 00
Posts: 16019
Credit: 794,685
RAC: 0
United States
Message 763342 - Posted: 5 Jun 2008, 22:30:04 UTC


. . . putting iT mildly - YOU guys ROCK! - keep up the superb work there @ Berkeley - iT is Sincerely Appreciated


BOINC Wiki . . .

Science Status Page . . .
ID: 763342
John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 763362 - Posted: 5 Jun 2008, 23:20:19 UTC

Let's hope all is settled down for the weekend, and the servers keep to themselves and leave the time for your own use.

Thanks Matt.
It's good to be back amongst friends and colleagues



ID: 763362
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 763437 - Posted: 6 Jun 2008, 4:36:17 UTC - in response to Message 763362.  

Let's hope all is settled down for the weekend, and the servers keep to themselves and leave the time for your own use.

Thanks Matt.



AMEN to that!
.

Hello, from Albany, CA!...
ID: 763437
Black Beard
Joined: 8 Dec 02
Posts: 3
Credit: 1,770,889
RAC: 0
United States
Message 764178 - Posted: 7 Jun 2008, 15:24:04 UTC

It looks like the servers have decided to take a nap!
I'm getting 'bad gateway' and 'http service unavailable' errors.

Anyone else seeing this?
ID: 764178
Steve Dodd
Joined: 29 May 99
Posts: 23
Credit: 8,695,373
RAC: 1
United States
Message 764181 - Posted: 7 Jun 2008, 15:27:34 UTC

Yep, same here. No communication for reporting tasks. I can upload completed work, though. Cricket graphs indicate minimal traffic.
ID: 764181
Hanford WA4LZC
Joined: 15 May 99
Posts: 38
Credit: 10,129,207
RAC: 0
United States
Message 764182 - Posted: 7 Jun 2008, 15:28:07 UTC

As of 11:13 EDT I am getting this from the BOINC client...

6/7/2008 11:11:50|SETI@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 8 completed tasks
6/7/2008 11:13:27||Project communication failed: attempting access to reference site
6/7/2008 11:13:29||Access to reference site succeeded - project servers may be temporarily down.
6/7/2008 11:13:31|SETI@home|Scheduler request failed: Server returned nothing (no headers, no data)

ID: 764182
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 764193 - Posted: 7 Jun 2008, 15:41:36 UTC - in response to Message 764182.  

As of 11:13 EDT I am getting this from the BOINC client...

6/7/2008 11:11:50|SETI@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 8 completed tasks
6/7/2008 11:13:27||Project communication failed: attempting access to reference site
6/7/2008 11:13:29||Access to reference site succeeded - project servers may be temporarily down.
6/7/2008 11:13:31|SETI@home|Scheduler request failed: Server returned nothing (no headers, no data)

.... and since the BOINC client is designed to handle this (and retry later) it isn't a reason for alarm.

Someone from the project will wander by a computer at home, check SETI, see a problem, and try to fix it. If they can't fix it remotely, then someone will go into the lab.

... and in the meantime, we're all crunching along, and we'll report work (and download more) once it is fixed.
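
To make the retry behaviour described above concrete, here is a minimal Python sketch of retrying with exponential backoff and random jitter. It is an illustration only, not the BOINC client's actual code: the function names (`backoff_delay`, `request_with_retry`), the 60-second base delay, and the 4-hour cap are all assumptions chosen for the example.

```python
import random
import time


def backoff_delay(failures, base=60.0, cap=4 * 3600.0, rng=random.random):
    """Return how many seconds to wait after `failures` consecutive failures.

    `base` and `cap` are illustrative values, not BOINC's real settings.
    """
    if failures < 1:
        return 0.0
    # Double the ceiling on each failure, but never exceed `cap`.
    exponent = min(failures - 1, 30)  # bound the exponent to avoid huge numbers
    ceiling = min(cap, base * 2 ** exponent)
    # Pick a delay between ceiling/2 and ceiling so that many clients
    # which failed together do not all retry at the same instant.
    return ceiling * (0.5 + 0.5 * rng())


def request_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns True; return the number of attempts used.

    Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if send():
            return attempt
        if attempt < max_attempts:
            sleep(backoff_delay(attempt))
    raise RuntimeError("all attempts failed; results stay queued locally")
```

The jitter matters most when a server comes back after an outage: without it, every client that failed at the same moment would also retry at the same moment.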
ID: 764193
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 764236 - Posted: 7 Jun 2008, 16:57:09 UTC - in response to Message 763362.  

Let's hope all is settled down for the weekend, and the servers keep to themselves and leave the time for your own use.

Thanks Matt.


you jinxed the servers and the admins:P
ID: 764236
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 764253 - Posted: 7 Jun 2008, 17:15:52 UTC - in response to Message 764193.  

Someone from the project will wander by a computer at home, check SETI, see a problem, and try to fix it. If they can't fix it remotely, then someone will go into the lab.


Which is exactly what happened. Same scheduler problem that happened earlier in the week. I gave it a kick remotely - should be catching up on requests pretty soon. Not sure of the *exact* problem (it's suddenly sensitive to log file sizes whereas nothing has changed on the server in months), but I think the fix/workaround will be employed sooner rather than later.
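
As a rough illustration of the kind of check that could catch this early, the Python sketch below lists log files above a size threshold. The thread does not say which log files or what size triggered the problem, so the function name `oversized_logs`, the directory, and the limit are all hypothetical.

```python
import os


def oversized_logs(log_dir, limit_bytes):
    """Return (name, size) pairs for regular files in `log_dir` that are
    larger than `limit_bytes`, biggest first."""
    found = []
    with os.scandir(log_dir) as entries:
        for entry in entries:
            # Skip directories and symlinks; only plain files count.
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                if size > limit_bytes:
                    found.append((entry.name, size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

A periodic job could call `oversized_logs("/path/to/logs", 1_000_000_000)` and send an alert whenever the list is non-empty; both arguments here are placeholders.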

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 764253
Dr. C.E.T.I.
Joined: 29 Feb 00
Posts: 16019
Credit: 794,685
RAC: 0
United States
Message 764260 - Posted: 7 Jun 2008, 17:21:14 UTC


. . . Thanks Matt (as usual - doin' a great job @ keepin' up to par ;)


BOINC Wiki . . .

Science Status Page . . .
ID: 764260

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.