Panic Mode On (44) Server problems

kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51583
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1078859 - Posted: 18 Feb 2011, 19:29:58 UTC

Well, the Frozen 920 has been getting a little more than enough work landed to keep going and build a little cache.

Frankly, I am a little surprised there has not been much of a chink in the Cricket graph yet.....
But I did notice whilst monitoring downloads that some of the work it has been crunching over the last 24 hours seems to be of rather short duration.

So we might be fighting a shorty storm at the same time we are trying to overcome the effects of the outage.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1078859
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51583
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1078861 - Posted: 18 Feb 2011, 19:31:44 UTC

Well, the Frozen 920 has been getting a little more than enough work landed to keep going and build a little cache.

Frankly, I am a little surprised there has not been much of a chink in the Cricket graph yet.....
But I did notice whilst monitoring downloads that some of the work it has been crunching over the last 24 hours seems to be of rather short duration.

So we might be fighting a shorty storm at the same time we are trying to overcome the effects of the outage.

Also have noticed a fair number of 'lost task' resends.
I honestly don't know where the kitties hid them.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1078861
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1078867 - Posted: 18 Feb 2011, 19:56:57 UTC - in response to Message 1078859.  

Yes, I'm seeing some shorties in the mix, mostly from the 26au10 and 28au10 recordings, but some from 31jl10aa have crept in recently.

I wouldn't call it a shorty 'storm', because there's a good mix of recordings going through the splitters at the same time. But yes - they'll certainly add to the delay before the Crickets pause for breath.

You'll be seeing 'resent lost result', where a few months ago you would have seen a Ghost WU - and we all know how much chaos they caused. Be grateful for the resends - not least because they are only possible thanks to the extra capabilities of Oscar and Carolyn.
ID: 1078867
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51583
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1078869 - Posted: 18 Feb 2011, 20:06:51 UTC - in response to Message 1078867.  
Last modified: 18 Feb 2011, 20:07:45 UTC

> Yes, I'm seeing some shorties in the mix, mostly from the 26au10 and 28au10 recordings, but some from 31jl10aa have crept in recently.

> I wouldn't call it a shorty 'storm', because there's a good mix of recordings going through the splitters at the same time. But yes - they'll certainly add to the delay before the Crickets pause for breath.

> You'll be seeing 'resent lost result', where a few months ago you would have seen a Ghost WU - and we all know how much chaos they caused. Be grateful for the resends - not least because they are only possible thanks to the extra capabilities of Oscar and Carolyn.

Oh, don't get me wrong....

I think being able to resend lost tasks is a wonderful thing.
I would rather get them back and crunch them than have them spend the extra time in the database waiting to expire before being sent out again!

I just can't get the kitties to tell me what they did with them. LOL.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1078869
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1078874 - Posted: 18 Feb 2011, 20:19:35 UTC

During my luck of the draw for getting APs the other day, I saw for the first time, "resent lost tasks." Three APs. That's the first time ever that I've had what would have been ghosts. Love that new feature, because now my consecutive valid tasks won't reset. :D
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1078874
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14013
Credit: 208,696,464
RAC: 304
Australia
Message 1078884 - Posted: 18 Feb 2011, 20:38:42 UTC - in response to Message 1078874.  


Although the network traffic is still pegged, things are certainly improving.

For the first day, "No work available" was the usual response when a Scheduler request went through, and most of the time they didn't. It often took multiple attempts before a file would start downloading, and several more retries before it finished. Even then, I was lucky to get much more than a couple of kB/s.

Now Scheduler requests rarely fail, work is allocated on most of them, and downloads are running at almost 10 kB/s and need only a couple of retries before they start.
Grant
Darwin NT
ID: 1078884
ccappel
Joined: 27 Jan 00
Posts: 362
Credit: 1,516,412
RAC: 0
United States
Message 1078885 - Posted: 18 Feb 2011, 20:42:50 UTC

Not me. My AP tasks are still getting multiple restarts. Right now one download is running at 0.32 KBps, with almost an hour of total elapsed time accumulated off and on, for about 35% progress.
"Life is a tragedy for those who feel, and a comedy for those who think."

"I never get into an argument that I cannot win."
ID: 1078885
Lint trap
Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 1078898 - Posted: 18 Feb 2011, 21:15:02 UTC - in response to Message 1078867.  
Last modified: 18 Feb 2011, 21:16:49 UTC

@Richard:

Any possibility that other download errors could be resent, too?

I had one download that resulted in an "MD5 check failed" error about two days ago. I suppose the cause was a transmission error rather than a hard-drive error here.

First time I've ever seen this one, and I don't recall any discussion in the forums, so I don't imagine resending MD5 errors would put much additional load on the servers.

Martin
ID: 1078898
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1078900 - Posted: 18 Feb 2011, 21:18:48 UTC

I've filled my cache, MBs & APs, on both PCs with only a few clicks on retry.
ID: 1078900
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34694
Credit: 79,922,639
RAC: 80
Germany
Message 1078918 - Posted: 18 Feb 2011, 22:00:20 UTC
Last modified: 18 Feb 2011, 22:00:54 UTC

Lucky you. I've been fighting for 2 days to download the 116 APs in my queue.
CPU will run dry tonight.
With each crime and every kindness we birth our future.
ID: 1078918
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1078925 - Posted: 18 Feb 2011, 22:19:45 UTC - in response to Message 1078898.  

> @Richard:

> Any possibility that other download errors could be resent, too?

> I had one download that resulted in an "MD5 check failed" error about two days ago. I suppose the cause was a transmission error rather than a hard-drive error here.

> First time I've ever seen this one, and I don't recall any discussion in the forums, so I don't imagine resending MD5 errors would put much additional load on the servers.

> Martin

Highly unlikely, I would think.

"Ghost WU", and "resent lost result" are two sides of the same coin. They both describe the same thing: the server thinks your computer has the WU (we put it in the post, honest), but your computer has never heard of it (must have got lost in the post, then).

In your case, the parcel arrived, but got damaged in transit - the precious vase is in smithereens. That needs an insurance claim, rather than sending the post-boy round the mailroom to find the original package slipped down behind the filing cabinet. Or something like that.
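
The "lost in the post" case described above can be sketched as a set comparison: anything the server lists as sent but the client does not report is a candidate for resending. This is only a minimal illustration in Python. The function name and the task names are hypothetical, and nothing here is taken from the actual scheduler code.

```python
from typing import Iterable, Set


def find_lost_results(server_in_progress: Iterable[str],
                      client_reported: Iterable[str]) -> Set[str]:
    """Return task names the server believes were sent but the client never lists."""
    return set(server_in_progress) - set(client_reported)


# Hypothetical task names, invented for this example.
server_side = {"26au10aa.1.0", "28au10ab.2.1", "31jl10aa.3.0"}
client_side = {"26au10aa.1.0", "31jl10aa.3.0"}

# Tasks that could be resent rather than left to time out as ghosts.
lost = find_lost_results(server_side, client_side)
print(sorted(lost))
```

A damaged download does not show up in this comparison at all, because the client does know about the task. That matches the point that the two failures need different handling.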
ID: 1078925
Wiggo
Joined: 24 Jan 00
Posts: 38649
Credit: 261,360,520
RAC: 489
Australia
Message 1078931 - Posted: 18 Feb 2011, 22:47:20 UTC - in response to Message 1078925.  

No problems here as all 3 of my PCs now have full caches. :D

Cheers.
ID: 1078931
Lint trap
Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 1078944 - Posted: 18 Feb 2011, 23:33:41 UTC - in response to Message 1078925.  

>> @Richard:

>> Any possibility that other download errors could be resent, too?

>> I had one download that resulted in an "MD5 check failed" error about two days ago. I suppose the cause was a transmission error rather than a hard-drive error here.

>> First time I've ever seen this one, and I don't recall any discussion in the forums, so I don't imagine resending MD5 errors would put much additional load on the servers.

>> Martin

> Highly unlikely, I would think.

Yep, probably so.

> "Ghost WU", and "resent lost result" are two sides of the same coin. They both describe the same thing: the server thinks your computer has the WU (we put it in the post, honest), but your computer has never heard of it (must have got lost in the post, then).

Is there that much difference between getting lost in the post or being trashed by the post?

It's not a problem.

Thanks.

Martin

ID: 1078944
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1078948 - Posted: 18 Feb 2011, 23:56:34 UTC

18-Feb-2011 04:33:48 [SETI@home] MD5 check failed for 24ap10aa.32197.12910.13.10.62
18-Feb-2011 04:33:48 [SETI@home] expected xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, got xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
18-Feb-2011 04:33:48 [SETI@home] Checksum or signature error for 24ap10aa.32197.12910.13.10.62


[UTC]

http://setiathome.berkeley.edu/result.php?resultid=1805283483

IIRC, never saw this before.
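
For anyone wondering what a check like the one in that log amounts to, here is a minimal Python sketch of comparing a downloaded file's MD5 digest with an expected value. The function names are made up for illustration, and this is not the client's actual code. It only shows the general idea behind an "expected ..., got ..." message.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_download(path: Path, expected_md5: str) -> bool:
    """Return True if the file's digest matches; report a mismatch otherwise."""
    actual = md5_of_file(path)
    if actual != expected_md5.lower():
        print(f"MD5 check failed for {path.name}: expected {expected_md5}, got {actual}")
        return False
    return True
```

A single flipped bit anywhere in the file changes the digest, so a mismatch says the data is damaged but not where the damage happened (in transit or on disk).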

ID: 1078948
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14013
Credit: 208,696,464
RAC: 304
Australia
Message 1078949 - Posted: 18 Feb 2011, 23:59:29 UTC - in response to Message 1078944.  

> Is there that much difference between getting lost in the post or being trashed by the post?

If it got lost, there's no record of it.
If it got trashed, the evidence of its existence is there.

Grant
Darwin NT
ID: 1078949
soft^spirit
Joined: 18 May 99
Posts: 6498
Credit: 34,134,168
RAC: 0
United States
Message 1079487 - Posted: 19 Feb 2011, 18:57:27 UTC

Well, it will clear on its own eventually.

But from what I have seen, the network throughput could be effectively increased a great deal by "throttling back" the number and/or speed of the downloads.

Whether it is done by increasing backoff or limiting the number of virtual connections does not really matter. But the saturated link does seem to be counterproductive.
Janice
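
The "increasing backoff" idea above could look something like the following Python sketch: each failed attempt doubles the ceiling on the wait time up to a cap, and a random point below the ceiling is chosen so that clients do not all retry at once. The 60-second base and one-hour cap are assumptions picked for illustration, not values from the real client or servers.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 60.0, cap: float = 3600.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    source = rng if rng is not None else random
    # Clamp the exponent so very large attempt counts cannot overflow.
    exponent = min(max(attempt, 0), 30)
    ceiling = min(cap, base * (2 ** exponent))
    # Pick a random point in [ceiling/2, ceiling] to spread retries out.
    return source.uniform(ceiling / 2, ceiling)


for attempt in range(8):
    print(attempt, round(backoff_delay(attempt), 1))
```

Longer waits mean fewer simultaneous connections competing for a saturated link, which is the effect being argued for. Limiting the number of connections directly would be another way to get there.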
ID: 1079487
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1079603 - Posted: 20 Feb 2011, 2:33:43 UTC

Looks like the AP splitters have juuuust about chewed through the tapes. Maybe we'll see the first not-maxed interval in cricket soon.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1079603
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14013
Credit: 208,696,464
RAC: 304
Australia
Message 1079642 - Posted: 20 Feb 2011, 7:56:58 UTC - in response to Message 1077417.  


The network traffic must be close to subsiding. I'm now getting downloads at more than 10 kB/s. A couple have even made it down at 30 kB/s.
Grant
Darwin NT
ID: 1079642
Dr Grey
Joined: 27 May 99
Posts: 154
Credit: 104,147,344
RAC: 21
United Kingdom
Message 1079667 - Posted: 20 Feb 2011, 10:34:56 UTC

Yes, they're coming through now. Last night my cache size doubled and I'm watching them come in at up to 35 kBps. The cricket graph is still showing a sea of green but I don't think it will be long before gaps start to show.
Maybe tonight it will be time to increase my cache size up from three days to ten.
ID: 1079667
-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1079678 - Posted: 20 Feb 2011, 12:27:52 UTC

Yup, seeing the same thing this morning: downloads coming in at 30-35kbps, some slowing to 25kbps. Now if I can get the caches stabilized on both my machines I would be happy. For some reason one machine will not pull the amount of work it used to. Irritating, but hopefully it works itself out.
Traveling through space at ~67,000mph!
ID: 1079678
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.