Can't talk.. Debugging.. (May 15 2007)

Message boards : Technical News : Can't talk.. Debugging.. (May 15 2007)
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 568133 - Posted: 15 May 2007, 23:05:33 UTC

We had the usual outage today which was mostly a success. The database compressed and was backed up in just over an hour. Normally this takes almost twice as long but the result table has significantly shrunk over the past two weeks (wonder why?). After that we put the new thumper in the closet (we being me, Eric, Jeff, and Kevin - it's a heavy machine). We also rebooted bruno to cleanly pick up a new disk (replacing a failed disk from yesterday). And I rebooted penguin to attach koloth's old tape drive to it (so it could read the classic data tapes for splitting).

That all went well. We also updated all the BOINC-side code to bring the SETI@home project in line with the current BOINC source tree and a few things broke, namely our validators and assimilators. These aren't project critical for the time being, so we're postponing dealing with these until we deal with the real problem at hand: getting people to connect to our data servers.

I think this is the longest outage we've ever had (even though it wasn't a "complete" outage - just no work was available) and we're in a whole new network configuration since the last major outage (new OS, new servers, new ISP, new switches, new router). In short, we're being clobbered by the returning flood of work requests. The major bottleneck is somewhere in the direction of our Hurricane router or bruno. Or at least that's the way it seems right now and there's no guarantee that when we break that dam a new bottleneck won't arise. I don't have the time to spell out what is broken and what we tried and what failed and what yielded unexpected results. Just know we're working on it and we understand most connections are being dropped.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 568133 · Report as offensive
Profile SATAN
Joined: 27 Aug 06
Posts: 835
Credit: 2,129,006
RAC: 0
United Kingdom
Message 568134 - Posted: 15 May 2007, 23:08:50 UTC

Cheers guys, we know you're doing the best you can.
ID: 568134 · Report as offensive
Profile JerWA

Joined: 3 Apr 99
Posts: 13
Credit: 4,262,442
RAC: 0
United States
Message 568139 - Posted: 15 May 2007, 23:13:03 UTC

Glad to hear it's on the radar at least. Have a ton of stuck WUs, but none due for the next 13 days or so, so no rush.

ID: 568139 · Report as offensive
Profile TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 568144 - Posted: 15 May 2007, 23:26:47 UTC

Thanks for the update, Matt; good luck with everything. 8-D


TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 568144 · Report as offensive
Profile Dingo
Volunteer tester
Joined: 28 Jun 99
Posts: 104
Credit: 16,364,896
RAC: 1
Australia
Message 568148 - Posted: 15 May 2007, 23:35:43 UTC - in response to Message 568133.  


Thanks for the update. I think everyone realised that there would be a bottleneck somewhere.

ID: 568148 · Report as offensive
Profile Daniel Michel
Volunteer tester
Joined: 2 Feb 04
Posts: 14925
Credit: 1,378,607
RAC: 6
United States
Message 568188 - Posted: 16 May 2007, 0:30:23 UTC

I remember that back in the early days of SETI/BOINC there was no weekly scheduled outage. With all the new equipment coming online, does that mean backup day may become a thing of the past?

PROUD TO BE TFFE!
ID: 568188 · Report as offensive
Jose Montesinos

Joined: 15 Apr 07
Posts: 1
Credit: 422,046
RAC: 0
Chile
Message 568191 - Posted: 16 May 2007, 0:45:06 UTC

Is there a way to send the results? I don't care about the credits, but some of the results will expire tomorrow.
ID: 568191 · Report as offensive
KZ3AB

Joined: 1 Mar 00
Posts: 6
Credit: 4,084,338
RAC: 0
United States
Message 568193 - Posted: 16 May 2007, 0:53:08 UTC


Z-Z-Z-Z

Waiting.


ID: 568193 · Report as offensive
Profile Gavin Shaw
Joined: 8 Aug 00
Posts: 1116
Credit: 1,304,337
RAC: 0
Australia
Message 568198 - Posted: 16 May 2007, 0:56:37 UTC - in response to Message 568191.  

Is there a way to send the results? I don't care about the credits, but some of the results will expire tomorrow.


Same here. I've got results that haven't uploaded since this all started. Except I got some that expire today.

Never surrender and never give up. In the darkest hour there is always hope.

ID: 568198 · Report as offensive
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 568207 - Posted: 16 May 2007, 1:19:19 UTC

Totally understandable, Matt. We know it will take a week or two for things to settle down again. We also know you guys will be doing all you can to ease the situation in the meantime, but you must be fighting a very uphill battle!

I bet no one ever envisaged these levels of network traffic when they dreamed up SETI ;)


*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 568207 · Report as offensive
Profile paul
Volunteer tester
Joined: 29 Jul 01
Posts: 42
Credit: 23,126,185
RAC: 0
United States
Message 568211 - Posted: 16 May 2007, 1:33:40 UTC

Boincers massing at the southern wall, sir. ;-)

I've suspended SETI since the outage occurred, and the fleet picked up backup projects, at least until the logjam breaks. Our Team certainly ensured that backup projects were added, and gave everyone a quick lesson on how to ensure that they don't run out of work. I suspect many other BOINC projects benefited from increased resources the past week or two.

Kudos to your team for getting the project back online; the efforts are appreciated.


Team Starfire World BOINC
IRC- irc//irc.teamstarfire.net:6667/team_starfire

ID: 568211 · Report as offensive
Profile PUCE II

Joined: 12 Oct 02
Posts: 3
Credit: 175,156
RAC: 0
United States
Message 568240 - Posted: 16 May 2007, 2:18:01 UTC

Don't worry about rushing, guys. She'll be up when she's up, and we'll be here then.
ID: 568240 · Report as offensive
Profile [SETI.USA]Tank_Master
Volunteer tester
Joined: 1 Jan 01
Posts: 24
Credit: 2,194,285
RAC: 0
United States
Message 568243 - Posted: 16 May 2007, 2:22:30 UTC

Does this mean the 64-bit clients will now be supported?
ID: 568243 · Report as offensive
Profile littlegreenmanfrommars
Volunteer tester
Joined: 28 Jan 06
Posts: 1410
Credit: 934,158
RAC: 0
Australia
Message 568258 - Posted: 16 May 2007, 2:46:27 UTC

I already have WUs that are behind deadline, and they look like they were downloaded after deadline.

Of course, the outage is also affecting Beta. *sigh*

Keep up the good work lads, we appreciate it!
ID: 568258 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 568286 - Posted: 16 May 2007, 3:28:26 UTC - in response to Message 568258.  

I already have WUs that are behind deadline, and they look like they were downloaded after deadline.

Of course, the outage is also affecting Beta. *sigh*

Keep up the good work lads, we appreciate it!

As Odysseus pointed out, the validators are off.....
ID: 568286 · Report as offensive
Brian Silvers

Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 568289 - Posted: 16 May 2007, 3:36:29 UTC - in response to Message 568286.  

I already have WUs that are behind deadline, and they look like they were downloaded after deadline.

Of course, the outage is also affecting Beta. *sigh*

Keep up the good work lads, we appreciate it!

As Odysseus pointed out, the validators are off.....


Is my understanding correct though that units that exceed deadline will still be reissued, thus creating more download traffic (reissue) and more upload attempts once completed? If so, this is a snowball.
ID: 568289 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 568322 - Posted: 16 May 2007, 5:43:32 UTC - in response to Message 568289.  

Is my understanding correct though that units that exceed deadline will still be reissued, thus creating more download traffic (reissue) and more upload attempts once completed?

Nope.
It won't add to or subtract from the network traffic. It will just be one more Work Unit available to crunch amongst all the others that haven't yet been downloaded at all.

Grant
Darwin NT
ID: 568322 · Report as offensive
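
Grant's point can be illustrated with a toy model. This is only a sketch under stated assumptions, not BOINC scheduler code: it assumes the server can hand out a fixed number of work units per time step (the bottleneck Matt describes) and that there is already a backlog waiting to be downloaded. The function name `simulate` and all of its parameters are hypothetical.

```python
from collections import deque


def simulate(initial_queue, reissued, capacity_per_tick, ticks):
    """Return how many work units are sent on each tick.

    Hypothetical model: the server sends at most `capacity_per_tick`
    units per tick, and reissued units simply join the same queue.
    """
    queue = deque(range(initial_queue + reissued))
    sent_per_tick = []
    for _ in range(ticks):
        # Traffic per tick is capped by capacity, not by queue length.
        sent = min(capacity_per_tick, len(queue))
        for _ in range(sent):
            queue.popleft()
        sent_per_tick.append(sent)
    return sent_per_tick


without_reissue = simulate(initial_queue=1000, reissued=0, capacity_per_tick=50, ticks=10)
with_reissue = simulate(initial_queue=1000, reissued=100, capacity_per_tick=50, ticks=10)
print(without_reissue == with_reissue)
```

While the backlog lasts, the per-tick traffic is the same in both runs; the reissued units only make the queue take longer to drain. Once the backlog clears, that assumption no longer holds and the model says nothing about what happens next.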
Aragon Speed
Volunteer tester

Joined: 1 Apr 07
Posts: 3
Credit: 140,717
RAC: 0
United Kingdom
Message 568327 - Posted: 16 May 2007, 6:10:22 UTC

It's a shame there isn't a smilie for pulling your hair out in frustration. ;)
Aragon Speed XTM Team Member

X-Tended Mod Website
ID: 568327 · Report as offensive
HachPi
Joined: 2 Aug 99
Posts: 481
Credit: 21,807,425
RAC: 21
Belgium
Message 568332 - Posted: 16 May 2007, 6:39:11 UTC

Keep on smiling...
We will overcome some day.

Grtz HP
ID: 568332 · Report as offensive
Profile Mephist0
Volunteer tester

Joined: 4 Dec 99
Posts: 12
Credit: 1,401,540
RAC: 0
Sweden
Message 568335 - Posted: 16 May 2007, 6:48:43 UTC

Isn't it possible to turn off the "due date" of the results until the connection problems are resolved? That way a result doesn't have to be sent out to more computers than necessary. It's just a waste of computing power in my eyes...
ID: 568335 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.