Not Perfect but Better (Jun 22 2009)



Message boards : Technical News : Not Perfect but Better (Jun 22 2009)

Author Message
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1390
Credit: 74,079
RAC: 0
United States
Message 910238 - Posted: 22 Jun 2009, 20:53:56 UTC

It's fairly clear that the recent updates we made to the general mysql/state counts/splitter fold have vastly eased our recent weekend woes. There were still a couple of dips here and there, but no wild swings like before.

Except this morning one particular query - from the scheduler - was clogging the works. We figured we'd just let it push through, i.e. let nature take its course. We assumed it was an expensive lookup, but after a couple of hours of waiting I ran the same query on the replica and found there was only one (!) row in question. So what the heck is mysql doing? We killed the query and eventually the logjam cleared.
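For readers wondering what spotting and killing a stuck query can look like, here is a minimal sketch. It is not what was actually run on the project servers. It assumes the rows come from MySQL's `SHOW FULL PROCESSLIST` (columns `Id`, `Command`, `Time`, `Info`); the sample rows, the one-hour threshold, and the helper name `kill_statements` are made up for illustration.

```python
def kill_statements(processlist, max_seconds=3600):
    """Build KILL QUERY statements for queries running longer than max_seconds.

    `processlist` is a list of dicts shaped like rows of SHOW FULL PROCESSLIST.
    Idle connections (Command == "Sleep") are ignored.
    """
    statements = []
    for row in processlist:
        if row["Command"] == "Query" and row["Time"] > max_seconds:
            statements.append(f"KILL QUERY {row['Id']};")
    return statements


# Hypothetical sample rows, not taken from the real server.
rows = [
    {"Id": 101, "Command": "Query", "Time": 7200, "Info": "SELECT ... (long-running lookup)"},
    {"Id": 102, "Command": "Sleep", "Time": 9000, "Info": None},
    {"Id": 103, "Command": "Query", "Time": 2, "Info": "SELECT 1"},
]

print(kill_statements(rows))
```

The sketch only generates the statements; someone would still want to look at each `Info` value before running anything against a live database.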

I'm finally scraping up enough space to pull a lot more work up from our archives, so Astropulse will be kicking in again, at least at some low level. This should also help reduce the demand on our limited resources since those workunits take longer to process, which means a lighter load on our database/download/upload/scheduling servers.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile Jack Zhang
Volunteer tester
Joined: 2 Jul 06
Posts: 206
Credit: 6,141,531
RAC: 893
Canada
Message 910261 - Posted: 22 Jun 2009, 21:56:54 UTC - in response to Message 910238.
Last modified: 22 Jun 2009, 22:08:52 UTC

I can certainly say that the title holds true for right now.

But, I'm running out of work again... check the queries if there's anything stuck again...

Edit: Never mind, the work just took a freakishly long time to get via scheduler requests...
____________
What if Fiction was Fact and Fact was Fiction and vice versa?

Profile Gary Charpentier
Project donor
Volunteer tester
Joined: 25 Dec 00
Posts: 13004
Credit: 7,666,318
RAC: 6,097
United States
Message 910265 - Posted: 22 Jun 2009, 22:34:47 UTC - in response to Message 910238.

Ah, CUDA maxed the pipe out for MB.

Thanks for the updates.

____________

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8767
Credit: 52,716,920
RAC: 16,500
United Kingdom
Message 910266 - Posted: 22 Jun 2009, 22:42:29 UTC - in response to Message 910238.

... so Astropulse will be kicking in again, at least at some low level. ...

Thanks for the info.

Could you clarify, please, whether the data fetched back from storage (like 05mr09ad currently splitting) will be split and scheduled for the general _v5 (503) application we've been using for a while, or the new _v505 application installed for Windows only on 10 June? It would help the optimisers plan their testing and releases.

Profile Rick
Joined: 27 Mar 01
Posts: 8
Credit: 21,615,319
RAC: 0
United States
Message 910419 - Posted: 23 Jun 2009, 12:44:55 UTC

Excellent news, Matt. I know the community in general is probably glad to hear that AP work is coming, and I'm sure your pipeline/databases are going to like you more as well.

Profile Dr. C.E.T.I.
Joined: 29 Feb 00
Posts: 15993
Credit: 690,597
RAC: 0
United States
Message 910468 - Posted: 23 Jun 2009, 16:19:08 UTC



. . . Thanks for the Updates Matt

< To All @ Berkeley - the Best to Each of You Today - May All go well . . .




____________
BOINC Wiki . . .

Science Status Page . . .

Profile S@NL - Eesger - www.knoop.nl
Joined: 7 Oct 01
Posts: 384
Credit: 37,574,583
RAC: 6,330
Netherlands
Message 910688 - Posted: 24 Jun 2009, 11:42:22 UTC

FYI, the thread of 23 June is unreadable:

Database Error
Unable to handle request

No thread with id 54313. Please check the link and try again.

____________
The SETI@Home Gauntlet 2012 april 16 - 30| info / chat | STATS

RoosStar
Joined: 16 Oct 99
Posts: 48
Credit: 6,248,380
RAC: 3,725
Netherlands
Message 910703 - Posted: 24 Jun 2009, 12:38:25 UTC

Not only this thread. :-(
I have seen this error in the NC forum also.
And I have noticed that some threads say the last post was made a few hours ago, but in the thread itself the last post was made yesterday or even months ago.
See these threads, for example:

http://setiathome.berkeley.edu/forum_thread.php?id=54092
http://setiathome.berkeley.edu/forum_thread.php?id=53702
____________

Profile Robi
Joined: 24 Oct 00
Posts: 33
Credit: 295,387
RAC: 310
United States
Message 910705 - Posted: 24 Jun 2009, 12:40:29 UTC

BTW, whoever kicked the server this morning, thanks a lot :)
when I looked up server status after not being able to see my tasks (results) I saw red
muchly appreciated, muchísimas gracias, obrigado, vielen herzlichen Dank, merci beaucoup, domo arigato gosaimashita, efcharistó, mille grazie, hvala lijepo, spasibo bolshoe
____________
Robi

C
Joined: 3 Apr 99
Posts: 240
Credit: 6,719,089
RAC: 859
United States
Message 910710 - Posted: 24 Jun 2009, 12:51:17 UTC

The server may be mostly up, but there's almost no I/O going on. See http://fragment1.berkeley.edu:80/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets

C
____________

Join Team MacNN

JPP
Joined: 31 May 99
Posts: 15
Credit: 17,313,375
RAC: 6,614
France
Message 910746 - Posted: 24 Jun 2009, 15:50:22 UTC

well
something else
currently I can't download WUs (from server to my PC);
they are all pending and every copy fails with the wrong size, as below:
24/06/2009 17:46:58 SETI@home Started download of 06ap09ac.31280.4162.13.8.0
24/06/2009 17:46:59 Internet access OK - project servers may be temporarily down.
24/06/2009 17:47:20 Project communication failed: attempting access to reference site
24/06/2009 17:47:20 SETI@home Temporarily failed download of 06ap09ac.31280.4162.13.8.0: connect() failed
24/06/2009 17:47:20 SETI@home Backing off 15 min 54 sec on download of 06ap09ac.31280.4162.13.8.0
24/06/2009 17:47:20 SETI@home [error] File 06ap09ac.31280.3753.13.8.31 has wrong size: expected 375333, got 0
24/06/2009 17:47:20 SETI@home Started download of 06ap09ac.31280.3753.13.8.31
24/06/2009 17:47:21 Internet access OK - project servers may be temporarily down.
24/06/2009 17:47:38 Project communication failed: attempting access to reference site
24/06/2009 17:47:38 SETI@home Temporarily failed download of 01mr09af.22260.17659.13.8.97: HTTP error
24/06/2009 17:47:38 SETI@home Backing off 2 min 3 sec on download of 01mr09af.22260.17659.13.8.97
24/06/2009 17:47:38 SETI@home [error] File 06ap09ac.31280.3753.13.8.144 has wrong size: expected 375331, got 0
24/06/2009 17:47:38 SETI@home Started download of 06ap09ac.31280.3753.13.8.144
24/06/2009 17:47:39 Internet access OK - project servers may be temporarily down.

...
cheers
jeanpierre@jpp

____________

Radford Bunker
Joined: 12 Mar 09
Posts: 8
Credit: 3,263,354
RAC: 1,769
United States
Message 910754 - Posted: 24 Jun 2009, 16:12:17 UTC

I'm getting much the same as Jeanpierre:

Wed Jun 24 12:04:32 2009|SETI@home|Started download of 06ap09ab.914.4571.15.8.189
Wed Jun 24 12:06:42 2009|SETI@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 1 completed tasks
Wed Jun 24 12:06:48 2009|SETI@home|Scheduler request succeeded: got 0 new tasks
Wed Jun 24 12:07:00 2009|SETI@home|Started download of 01mr09af.22260.22976.13.8.195
Wed Jun 24 12:07:44 2009||Project communication failed: attempting access to reference site
Wed Jun 24 12:07:44 2009|SETI@home|Temporarily failed download of 06ap09ab.914.4571.15.8.189: HTTP error
Wed Jun 24 12:07:44 2009|SETI@home|Backing off 1 hr 9 min 40 sec on download of 06ap09ab.914.4571.15.8.189
Wed Jun 24 12:07:44 2009|SETI@home|Started download of 06ap09aa.31651.10706.15.8.205
Wed Jun 24 12:07:45 2009||Internet access OK - project servers may be temporarily down.
Wed Jun 24 12:08:15 2009||Project communication failed: attempting access to reference site
Wed Jun 24 12:08:15 2009|SETI@home|Temporarily failed download of 01mr09af.22260.22976.13.8.195: connect() failed
Wed Jun 24 12:08:15 2009|SETI@home|Backing off 1 hr 29 min 42 sec on download of 01mr09af.22260.22976.13.8.195
Wed Jun 24 12:08:15 2009|SETI@home|Started download of 06ap09ab.20174.23794.14.8.5
Wed Jun 24 12:08:16 2009||Internet access OK - project servers may be temporarily down.
Wed Jun 24 12:09:30 2009||Project communication failed: attempting access to reference site
Wed Jun 24 12:09:30 2009|SETI@home|Temporarily failed download of 06ap09ab.20174.23794.14.8.5: connect() failed
Wed Jun 24 12:09:30 2009|SETI@home|Backing off 1 hr 6 min 4 sec on download of 06ap09ab.20174.23794.14.8.5
Wed Jun 24 12:09:30 2009|SETI@home|Started download of 01mr09ad.18953.68833.16.8.126
Wed Jun 24 12:09:31 2009||Internet access OK - project servers may be temporarily down.
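For anyone who wants to see at a glance how long the client is waiting on each file, a log like the one above can be summarised with a few lines of Python. This is a rough sketch only: it assumes every backoff line has the form "Backing off [H hr] [M min] S sec on download of NAME", as in the excerpts in this thread, and the helper name `backoffs` is made up for illustration.

```python
import re

# Matches e.g. "Backing off 1 hr 9 min 40 sec on download of <file>".
# The hour and minute parts are optional; the seconds part is assumed present.
BACKOFF = re.compile(
    r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(\d+) sec on download of (\S+)"
)


def backoffs(lines):
    """Map each file name to its most recent backoff delay in seconds."""
    delays = {}
    for line in lines:
        match = BACKOFF.search(line)
        if match:
            hours, minutes, seconds, name = match.groups()
            delays[name] = int(hours or 0) * 3600 + int(minutes or 0) * 60 + int(seconds)
    return delays


# Lines copied from the log excerpt above.
log = [
    "Wed Jun 24 12:07:44 2009|SETI@home|Backing off 1 hr 9 min 40 sec on download of 06ap09ab.914.4571.15.8.189",
    "Wed Jun 24 12:08:15 2009|SETI@home|Backing off 1 hr 29 min 42 sec on download of 01mr09af.22260.22976.13.8.195",
    "Wed Jun 24 12:09:31 2009||Internet access OK - project servers may be temporarily down.",
]

for name, delay in backoffs(log).items():
    print(f"{name}: retry in {delay} s")
```

Lines that are not backoff messages are simply skipped, so the whole message log can be fed in unfiltered.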

Radford Bunker
Joined: 12 Mar 09
Posts: 8
Credit: 3,263,354
RAC: 1,769
United States
Message 910759 - Posted: 24 Jun 2009, 16:24:26 UTC

Oh, and all my tasks are "downloading": seems like one big mess.




____________

dydek
Joined: 3 Jul 01
Posts: 3
Credit: 1,421,791
RAC: 0
United States
Message 910794 - Posted: 24 Jun 2009, 18:00:07 UTC

Ditto in Chicago. Downloads start but receive zero bytes for each file, in two different locations, on Mac and Win.

Darek
____________

dydek
Joined: 3 Jul 01
Posts: 3
Credit: 1,421,791
RAC: 0
United States
Message 910795 - Posted: 24 Jun 2009, 18:01:11 UTC

BTW, the top sticky thread (Long outage or something like that) seems to be broken as well. I get an error when I click it.
____________

Danny Sosebee
Joined: 10 Jun 02
Posts: 53
Credit: 1,022,592
RAC: 0
United States
Message 910800 - Posted: 24 Jun 2009, 18:19:20 UTC - in response to Message 910795.

BTW, the top sticky thread (Long outage or something like that) seems to be broken as well. I get an error when I click it.


Yep, same thing happens when I try it. Also having problems downloading work so I suspect it's a database error. I'm sure they'll fix it as soon as they can.
____________

Profile Space Cowboy
Volunteer tester
Joined: 24 Apr 00
Posts: 43
Credit: 1,656,957
RAC: 28
United Kingdom
Message 910802 - Posted: 24 Jun 2009, 18:30:25 UTC

Has anyone noticed that the work units they managed to send back today still appear as in progress?
____________

Radford Bunker
Joined: 12 Mar 09
Posts: 8
Credit: 3,263,354
RAC: 1,769
United States
Message 910819 - Posted: 24 Jun 2009, 19:22:07 UTC - in response to Message 910800.


Dan Sosebee,

From yours:

Yep, same thing happens when I try it. Also having problems downloading work so I suspect it's a database error. I'm sure they'll fix it as soon as they can.


I'm sure they will, I just wish they'd post some information about what is going on.

Rad

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 910874 - Posted: 24 Jun 2009, 21:24:05 UTC - in response to Message 910819.

I just wish they'd post some information about what is going on.

Do you want them to stop working on the problem(s) and post, or do you want them to put all of their efforts into fixing it and post later?

____________

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13664
Credit: 31,500,985
RAC: 7,981
United States
Message 910886 - Posted: 24 Jun 2009, 22:17:38 UTC - in response to Message 910705.

BTW, whoever kicked the server this morning, thanks a lot :)
when I looked up server status after not being able to see my tasks (results) I saw red
muchly appreciated, muchísimas gracias, obrigado, vielen herzlichen Dank, merci beaucoup, domo arigato gosaimashita, efcharistó, mille grazie, hvala lijepo, spasibo bolshoe


I've never quite understood why people get so emotional (angry, seeing red) over anything at this project when 99% of the time things can be explained in a rational way when given a chance and a bit of patience.
____________


Copyright © 2014 University of California