Panic Mode On (103) Server Problems?

Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1805555 - Posted: 29 Jul 2016, 9:29:34 UTC - in response to Message 1805551.  

And when the server processes are stopped and restarted - say for Tuesday maintenance - the splitters appear to start working on the most recently loaded batch, rather than resuming older batches.

not at all, they're working on the smallest number of blc*

Within each batch, yes. But you need to be aware of when each collection of 'tapes' started to appear on the page: the partially-split blc7 tapes have been there the longest, and the partially-split blc5 tapes have been there less long. The active blc3 tapes are the most recent arrivals. A one-time glance at the SSP isn't enough to give you the full picture.

(I must get out more...)
ID: 1805555

Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1805708 - Posted: 29 Jul 2016, 23:42:41 UTC

Interesting arrival of a bunch of small MESSIER031 tapes on the SSP. I imagine I won't have to worry about those at least until the morning.
ID: 1805708

Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1805742 - Posted: 30 Jul 2016, 1:52:54 UTC - in response to Message 1805708.  

Interesting arrival of a bunch of small MESSIER031 tapes on the SSP. I imagine I won't have to worry about those at least until the morning.

Seems to be from observations a day or two earlier than all those MESSIER031 files we got back in April. This group is all from the 57396 date, while all the ones in that first "messy" group were either 57397 or 57398. I wonder if these will be just as noisy or if they've tried cleaning them up at all before splitting them.
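If those five-digit dates are Modified Julian Dates (an assumption; nothing in the thread says so explicitly), they can be turned into calendar dates with a few lines of Python. The helper name mjd_to_date is made up for illustration.

```python
from datetime import date, timedelta

# MJD day 0 is 17 November 1858 by definition.
MJD_EPOCH = date(1858, 11, 17)


def mjd_to_date(mjd: int) -> date:
    """Convert a whole-number Modified Julian Date to a calendar date."""
    return MJD_EPOCH + timedelta(days=mjd)


# ASSUMPTION: the numbers quoted in the post are MJDs.
for mjd in (57396, 57397, 57398):
    print(mjd, mjd_to_date(mjd).isoformat())
```

Under that assumption the three dates fall on consecutive days in early January 2016, which fits "a day or two earlier".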
ID: 1805742

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1805750 - Posted: 30 Jul 2016, 2:18:20 UTC - in response to Message 1805742.  

Interesting arrival of a bunch of small MESSIER031 tapes on the SSP. I imagine I won't have to worry about those at least until the morning.

Seems to be from observations a day or two earlier than all those MESSIER031 files we got back in April. This group is all from the 57396 date, while all the ones in that first "messy" group were either 57397 or 57398. I wonder if these will be just as noisy or if they've tried cleaning them up at all before splitting them.

If they are noisy we will burn through them reasonably quickly. I am assuming by noisy you mean tasks that run fast?
ID: 1805750

Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1805751 - Posted: 30 Jul 2016, 2:35:30 UTC - in response to Message 1805750.  

If they are noisy we will burn through them reasonably quickly. I am assuming by noisy you mean tasks that run fast?

Yeah, I think about 70% of them resulted in -9 overflows. I remember posting the following at the time:
Was just checking my machines before I went to bed and watched one blow through 7 consecutive messier031 tasks in 41 seconds.

They weren't VLARs, either, so even if they weren't overflows, the NVIDIA GPUs handled them just fine, IIRC.
ID: 1805751

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1805796 - Posted: 30 Jul 2016, 9:35:55 UTC - in response to Message 1805742.  

I wonder if these will be just as noisy or if they've tried cleaning them up at all before splitting them.

This, combined with a different AR than the BLC ones, could mean the thresholds were not chosen correctly for such GBT data...
If the new bunch has excessive noise too, it's worth drawing Eric's attention to this issue.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1805796

betreger (Project Donor)
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1808545 - Posted: 11 Aug 2016, 18:06:00 UTC

Where did we go?
ID: 1808545

rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22189
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1808548 - Posted: 11 Aug 2016, 18:16:14 UTC

Third door on the left, the invisible one without a handle or name plate....
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1808548

Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1808867 - Posted: 13 Aug 2016, 13:04:19 UTC

So it looks like seti is reporting negative credits to free DC and BoincStats.

Anyone else seeing this? Is this related to the outage the other day?
ID: 1808867

Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1808870 - Posted: 13 Aug 2016, 13:39:39 UTC - in response to Message 1808867.  

So it looks like seti is reporting negative credits to free DC and BoincStats.

Anyone else seeing this? Is this related to the outage the other day?

BOINC projects don't export stats 'to' anywhere at all: they just make the data available at a public place, and say "come and get them" - so if two different sites show the same problem, it's a sure guess that there's something wrong with the exported data, rather than the stats sites' interpretation of them.

In our case, the public place is

http://setiathome.berkeley.edu/stats/

Looking at the 12 August version of those files, there's something wrong with tables.xml - there should be summary lines for users, hosts, teams and project total credit at the top, under the 'update time'.

The server status page was also squiffy for a while after the server outage, and I'm guessing you're right - they are related. Also note that the replica database is currently about 25K seconds behind the master, which can affect values too.

I'd suggest we wait and see if the files return to normal today or tomorrow, and drop a note to the staff on Monday if they're still out of kilter.
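For anyone who wants to repeat the check on tables.xml, here is a minimal Python sketch. The tag names in EXPECTED_SUMMARY_TAGS are assumptions about how the summary lines are spelled in the file (the post only describes them in words), the function name is hypothetical, and both sample documents are invented placeholders, not real project data.

```python
import xml.etree.ElementTree as ET
from typing import List

# ASSUMPTION: these tag names stand in for the "update time" line and the
# users/hosts/teams/total-credit summary lines described in the post.
EXPECTED_SUMMARY_TAGS = (
    "update_time",
    "nusers_total",
    "nhosts_total",
    "nteams_total",
    "total_credit",
)


def missing_summary_tags(tables_xml: str) -> List[str]:
    """Return the expected top-level tags that are absent or empty."""
    root = ET.fromstring(tables_xml)
    missing = []
    for tag in EXPECTED_SUMMARY_TAGS:
        element = root.find(tag)  # direct children of the root only
        if element is None or not (element.text or "").strip():
            missing.append(tag)
    return missing


# Invented sample documents with placeholder numbers.
HEALTHY = """<tables>
  <update_time>1</update_time>
  <nusers_total>2</nusers_total>
  <nhosts_total>3</nhosts_total>
  <nteams_total>4</nteams_total>
  <total_credit>5.0</total_credit>
</tables>"""

BROKEN = """<tables>
  <update_time>1</update_time>
</tables>"""

print(missing_summary_tags(HEALTHY))
print(missing_summary_tags(BROKEN))
```

An empty list means every summary line the sketch looks for is present; anything else names the lines that have gone missing from the dump.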
ID: 1808870

Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1808873 - Posted: 13 Aug 2016, 14:03:20 UTC - in response to Message 1808870.  
Last modified: 13 Aug 2016, 14:03:33 UTC

Cool, thanks Richard
ID: 1808873

WezH
Volunteer tester
Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1808887 - Posted: 13 Aug 2016, 14:51:23 UTC - in response to Message 1808870.  

I'd suggest we wait and see if the files return to normal today or tomorrow, and drop a note to the staff on Monday if they're still out of kilter.


Hopefully files do get fixed before SETI@Home Wow Event 2016 starts Monday 16.00 UTC...
ID: 1808887

Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1808889 - Posted: 13 Aug 2016, 15:00:39 UTC - in response to Message 1808887.  

I'd suggest we wait and see if the files return to normal today or tomorrow, and drop a note to the staff on Monday if they're still out of kilter.

Hopefully files do get fixed before SETI@Home Wow Event 2016 starts Monday 16.00 UTC...

The Wow stats are updated hourly, so they must use a different source from the public dump, which is only updated once every 24 hours.
ID: 1808889

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1808941 - Posted: 13 Aug 2016, 18:59:13 UTC

I'm wondering how long those 71 MBv7 results are going to be waiting for DB purging. Normally, they stick around for ~24 hours after being assimilated, but it's been a few days now.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1808941

Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1808975 - Posted: 13 Aug 2016, 21:41:56 UTC

Looks like BOINCstats is finding some new credits in today's dump, but we won't see the full picture until tomorrow's 'daily update' is complete.
ID: 1808975

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1809120 - Posted: 14 Aug 2016, 18:55:05 UTC - in response to Message 1808889.  

I'd suggest we wait and see if the files return to normal today or tomorrow, and drop a note to the staff on Monday if they're still out of kilter.

Hopefully files do get fixed before SETI@Home Wow Event 2016 starts Monday 16.00 UTC...

The Wow stats are updated hourly, so they must use a different source from the public dump, which is only updated once every 24 hours.


I think they are scanning the actual account pages for the participants.

ID: 1809120

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1809131 - Posted: 14 Aug 2016, 20:21:10 UTC

I did notice the negative credit in BOINCStats and I wonder if it's related to a situation I've been experiencing since at least about 21:00 UTC on 13th August. From my home network I have had no access to S@h - or anywhere else on berkeley.edu for that matter. But checking at my parents' place that same day (about 02:00 UTC 14th August) and also at my work office just now (20:00 UTC 14th August), I see no issues with berkeley.edu so I'm wondering if there's some routing problem somewhere between my home network and Berkeley.

Just to be clear, I have no other Internet issues on my home network and even gave the router a restart to get a fresh connection with my ISP. Am I really the only one with this issue?
Soli Deo Gloria
ID: 1809131

Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1809133 - Posted: 14 Aug 2016, 20:25:38 UTC - in response to Message 1809131.  

No problems here at my different locations. You might want to check your antivirus program and make sure it isn't blocking your access.
ID: 1809133

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1809136 - Posted: 14 Aug 2016, 20:36:14 UTC - in response to Message 1809131.  

You might want to do a traceroute to the Seti servers.

http://boinc2.ssl.berkeley.edu
http://georgem.ssl.berkeley.edu
http://vader.ssl.berkeley.edu

You might also add them to your allowed URL "whitelist" in your AV program. I was getting download errors in mine till I whitelisted the servers for Seti and Einstein.
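As a rough complement to a traceroute, the Python sketch below tries a plain TCP connection to each of the three hosts listed above. It only says whether a connection can be opened, not where along the route it fails. Port 80 is an assumption based on the http:// URLs, and the function names are made up for illustration.

```python
import socket
from urllib.parse import urlparse

SERVER_URLS = [
    "http://boinc2.ssl.berkeley.edu",
    "http://georgem.ssl.berkeley.edu",
    "http://vader.ssl.berkeley.edu",
]


def host_of(url: str) -> str:
    """Extract the bare hostname from a URL."""
    return urlparse(url).hostname


def can_connect(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


def reachability_report(urls=SERVER_URLS, port: int = 80) -> dict:
    """Map each hostname to True/False depending on whether it is reachable."""
    return {host_of(url): can_connect(host_of(url), port) for url in urls}
```

Calling reachability_report() from a machine on the affected network and again from one elsewhere should show quickly whether the block is specific to that network.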
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1809136

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1809141 - Posted: 14 Aug 2016, 20:54:32 UTC

The issue is at the network level, so I really doubt anti-virus is involved. Access is blocked to all services on berkeley.edu across all devices on my home network - Windows and Linux.

I did try a traceroute last night (at home) but I was using an on-line service (which was successful) and didn't think to try it from one of my computers. I'll try that when I get home.
Soli Deo Gloria
ID: 1809141


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.