The Server Issues / Outages Thread - Panic Mode On! (119)

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 12990
Credit: 208,696,464
RAC: 690
Australia
Message 2043937 - Posted: 10 Apr 2020, 1:12:53 UTC - in response to Message 2043930.  

I know your telecom cables were installed many decades ago. Seems more like a governmental control issue bogged you down.
The slow uptake of most things was due to cost. Broadband plans cost a lot when they first came out (hell, dialup cost a lot when it came out). I can't remember the actual pricing, but I think it was something like $75/month for a 1.5Mb/s connection with a data cap in the 100s of MB (not GB, MB). Eventually we got 8Mb/s, then up to 24Mb/s ADSL2+. Prices didn't change much but the data caps did increase.

Then the government decided that it wasn't good enough (a lot of people still couldn't even get ADSL), so they came up with the NBN (National Broadband Network). The idea was fibre to the home for everyone (or at least 95%+ of the population; some people live a long, long way from anywhere) with a 100Mb/s connection.
Partway through that the government changed & the other side got in. They didn't like the cost and decided that fibre to the node was good enough for some, that others could have satellite (which is crap at the best of times) or fixed wireless, and that 100Mb/s was way more than anyone really needed anyway (tossers).
I was one of the lucky ones to get fibre to the home before that got canned. Many others are stuck with the hodgepodge of other technologies, and guess what? Just as 5kB/s wasn't really enough, more & more streaming entertainment (Netflix etc.), and now everyone isolating at home and trying to work & do school and still watch Netflix, has shown that 100Mb/s isn't even close to being good enough if there's more than a couple of people in a house.
Short sighted thinking screws things up, yet again.
Grant
Darwin NT
ID: 2043937
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5384
Credit: 192,787,363
RAC: 1,426
Australia
Message 2043933 - Posted: 10 Apr 2020, 0:27:04 UTC - in response to Message 2043767.  

Could be. The default idle interval for project connection checkin is 60 minutes in the client. So that would reduce the checkins down to twice an hour.
A big reduction from every 5min and a few seconds, then 10min and a few more seconds.


I got one! i got one!


. . Throw an online party (since no-one can meet physically anymore). You lucky, lucky bas^&*d! (Life of Brian). I wonder when the next exciting thing will happen. :)

Stephen

< shrug >
ID: 2043933
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 603
United States
Message 2043931 - Posted: 10 Apr 2020, 0:09:13 UTC

I do remember that when I lived in Bahrain from 2006 to 2009, my internet was considered "hi-speed" at 100Kbps. It was frustrating, but workable. Now I get upset when I can't get 500Mbps down and 25Mbps up on my gig connection. :D
ID: 2043931
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 603
United States
Message 2043930 - Posted: 10 Apr 2020, 0:02:48 UTC - in response to Message 2043927.  
Last modified: 10 Apr 2020, 0:05:59 UTC

That was the case when Classic started but at the time the Boinc Seti started, the dialups were long gone.
Maybe for you, but not for the rest of the world.
Broadband here has only really taken off in the last 15 years or so. Almost 50% of internet users were still on dial up in 2006.
Some ISPs still have dialup customers, other ISPs only stopped offering dialup services in 2018.
And we still have data caps for the lower priced plans.

Mobile/fixed wireless Internet really only took off here about 8 years ago when the price of data started to drop significantly.

So hard to believe it took this long in Australia. The first time I went to Thailand with my wife in 2006, internet was only to be found in shops, mostly in the larger cities of each province, and Bangkok of course. By the time I moved there permanently in 2011, 3G was everywhere, including our house in the jungle (literally in one of the least populated areas of Chanthaburi Province near Cambodia), and was updated to 4G by 2014. I know your telecom cables were installed many decades ago. Seems more like a governmental control issue bogged you down.
ID: 2043930
Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 482
United States
Message 2043928 - Posted: 9 Apr 2020, 23:48:46 UTC - in response to Message 2043906.  

Yup, it's Mark.
Mine worked OK for quite a while too, until Seti server comms got messed up for long enough that Boinc went into that 'fetching scheduler list' routine, and that's where it got stuck.

Meow.


ah-ha! yes, that's the difference

congratulations on making One Billion
Member of the 20 Year Club



ID: 2043928
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 12990
Credit: 208,696,464
RAC: 690
Australia
Message 2043927 - Posted: 9 Apr 2020, 23:37:23 UTC - in response to Message 2043867.  
Last modified: 9 Apr 2020, 23:39:05 UTC

That was the case when Classic started but at the time the Boinc Seti started, the dialups were long gone.
Maybe for you, but not for the rest of the world.
Broadband here has only really taken off in the last 15 years or so. Almost 50% of internet users were still on dial up in 2006.
Some ISPs still have dialup customers, other ISPs only stopped offering dialup services in 2018.
And we still have data caps for the lower priced plans.

Mobile/fixed wireless Internet really only took off here about 8 years ago when the price of data started to drop significantly.
Grant
Darwin NT
ID: 2043927
JohnDK (Crowdfunding Project Donor, Special Project $250 donor)
Volunteer tester
Joined: 28 May 00
Posts: 1200
Credit: 451,243,443
RAC: 2,557
Denmark
Message 2043917 - Posted: 9 Apr 2020, 22:21:10 UTC

Just suspend internet and make a BOINC backup, then you won't lose anything if something goes wrong. That's what I did but all went OK.
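
A minimal sketch of the backup step, assuming the BOINC client has been stopped (or at least has network activity suspended) before copying. The function name and any paths are hypothetical; where the data directory actually lives depends on the platform and the install.

```python
import shutil
import time
from pathlib import Path


def backup_data_dir(data_dir, backup_root):
    """Copy data_dir into a new timestamped folder under backup_root."""
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such data directory: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree refuses to write into an existing destination, so an
    # earlier backup is never overwritten by accident.
    shutil.copytree(src, dest)
    return dest
```

Call it with the data directory and a folder on a disk with enough free space, then do the upgrade. If the upgrade goes wrong, the copy can be moved back into place while the client is stopped.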
ID: 2043917
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9764
Credit: 572,710,851
RAC: 8,616
Panama
Message 2043913 - Posted: 9 Apr 2020, 21:37:31 UTC
Last modified: 9 Apr 2020, 21:59:10 UTC

S@H will be down in a couple of weeks, so why not wait until your cache is empty to make the client version change?
IMHO risking a crash of the S@H cache is not an option in these last weeks.
my 0.02
ID: 2043913
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 50494
Credit: 1,018,363,574
RAC: 2,276
United States
Message 2043906 - Posted: 9 Apr 2020, 20:27:38 UTC - in response to Message 2043905.  

Yup, it's Mark.
Mine worked OK for quite a while too, until Seti server comms got messed up for long enough that Boinc went into that 'fetching scheduler list' routine, and that's where it got stuck.

Meow.
"Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein
"With cats." kittyman

ID: 2043906
Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 482
United States
Message 2043905 - Posted: 9 Apr 2020, 20:20:27 UTC - in response to Message 2043848.  

I had this problem and had to upgrade the version of Boinc I was running.
This does pose the risk of possibly losing your cache and completed work however.

Meow.


Thanks - Mark isn't it?
Odd that it worked okay until 3 April...
I will back up the data directory and have at it tomorrow; worst case I can move it to a working machine for upload so I don't bugger the wingpersons
Member of the 20 Year Club



ID: 2043905
Ville Saari
Joined: 30 Nov 00
Posts: 1119
Credit: 48,373,696
RAC: 74,889
Finland
Message 2043867 - Posted: 9 Apr 2020, 16:35:16 UTC

That was the case when Classic started but at the time the Boinc Seti started, the dialups were long gone. Boinc wus weren't the same size as Classic wus, as they took quite a different time when I ran them on the same machine that had been running Classic. Also, during Boinc Seti the wu size has been increased at least once. Just not enough.

Seti wu data is a signal covering a certain period of time. Increasing that length won't change the results produced. A double-size wu would just produce twice the number of detected results on average. And I believe those results exist as individual results in the science database without any wu boundaries.
ID: 2043867
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14114
Credit: 200,643,578
RAC: 1,983
United Kingdom
Message 2043864 - Posted: 9 Apr 2020, 16:17:30 UTC - in response to Message 2043859.  

Seti should have used much bigger workunits to reduce the number of database operations needed. That would also have made the database smaller.
At the time the SETI data format was established (before 1999), most people would have been on dial-up modems, and maybe not even up to 56 Kbit/sec. The typical processing time per task was several hours. The 'size' (both data volume and processing time) of the workunit was probably deliberately chosen to be the largest which could maintain the volunteers' interest, given the hardware of the time.

To make any fundamental changes to the data structure would have made the 20-year longitudinal study even more difficult than it already is. The other alternative, creating bundles of tasks as a single super-WU, probably never crossed the threshold of 'it'll be hard work, but worth it in the end'.
ID: 2043864
Ville Saari
Joined: 30 Nov 00
Posts: 1119
Credit: 48,373,696
RAC: 74,889
Finland
Message 2043862 - Posted: 9 Apr 2020, 16:10:17 UTC - in response to Message 2043861.  

I got some resends marked abandoned. I haven't seen this status before. Is it something the user did, or is Eric doing something to get some of these WUs out of limbo??
I guess it means the user detached from Seti without returning the task.
ID: 2043862
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 780
Credit: 2,361,516
RAC: 49
United States
Message 2043861 - Posted: 9 Apr 2020, 16:05:22 UTC

I got some resends marked abandoned. I haven't seen this status before. Is it something the user did, or is Eric doing something to get some of these WUs out of limbo??
ID: 2043861
Ville Saari
Joined: 30 Nov 00
Posts: 1119
Credit: 48,373,696
RAC: 74,889
Finland
Message 2043859 - Posted: 9 Apr 2020, 15:53:08 UTC - in response to Message 2043850.  
Last modified: 9 Apr 2020, 15:54:49 UTC

The kind of huge DB S@H uses needs to be constantly monitored by a DB specialist or weird things could happen. Does that sound common in S@H?
A database of tens of millions of rows isn't really a huge database. What makes the Seti Boinc database challenging is the very high rate of database operations it has to support.

Seti should have used much bigger workunits to reduce the number of database operations needed. That would also have made the database smaller.
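
To put rough numbers on this, a back-of-envelope sketch. It assumes the number of database operations per workunit stays the same regardless of workunit size, and every figure in it is a made-up illustration, not a measurement from the project.

```python
def db_ops_per_second(data_seconds_per_day, seconds_per_wu, ops_per_wu):
    """Database operations per second needed to keep up with the data.

    data_seconds_per_day: seconds of recorded signal to process per day
    seconds_per_wu:       seconds of signal packed into one workunit
    ops_per_wu:           database operations per workunit (create, send,
                          report, validate, purge), assumed constant
    """
    wus_per_day = data_seconds_per_day / seconds_per_wu
    return wus_per_day * ops_per_wu / 86_400


# Illustrative numbers only: same data volume, workunits ten times longer.
small = db_ops_per_second(10_000_000, 100, 20)
large = db_ops_per_second(10_000_000, 1_000, 20)
print(f"small wus: {small:.1f} ops/s, large wus: {large:.1f} ops/s")
```

Under that assumption the operation rate scales inversely with workunit length, and so does the number of workunit rows that have to sit in the database at any one time.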
ID: 2043859
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9764
Credit: 572,710,851
RAC: 8,616
Panama
Message 2043850 - Posted: 9 Apr 2020, 15:03:00 UTC - in response to Message 2043808.  

Also, remember that the SETI@Home staff team has been without a specialist database wrangler since the departure of Bob Bankay (bobb2). He left for a commercial post some time after his last contribution to these message boards in 2008 - I thought I remembered a valedictory, but I haven't been able to find it. Check his posting history for an idea of what we lost.

That could explain some of the constant DB problems. The kind of huge DB S@H uses needs to be constantly monitored by a DB specialist or weird things could happen. Does that sound common in S@H?
ID: 2043850
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 50494
Credit: 1,018,363,574
RAC: 2,276
United States
Message 2043848 - Posted: 9 Apr 2020, 14:51:24 UTC - in response to Message 2043846.  

I had this problem and had to upgrade the version of Boinc I was running.
This does pose the risk of possibly losing your cache and completed work however.

Meow.
"Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein
"With cats." kittyman

ID: 2043848
Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 482
United States
Message 2043846 - Posted: 9 Apr 2020, 14:46:05 UTC

Computer 7596636 has 33 tasks ready to report and continues to state that:

09-Apr-20 10:42:46 AM SETI@home update requested by user
09-Apr-20 10:42:50 AM SETI@home [sched_op_debug] Fetching master file
09-Apr-20 10:42:50 AM SETI@home Fetching scheduler list
09-Apr-20 10:42:52 AM Project communication failed: attempting access to reference site
09-Apr-20 10:42:52 AM SETI@home [sched_op_debug] Deferring communication for 1 days 0 hr 0 min 0 sec
09-Apr-20 10:42:52 AM SETI@home [sched_op_debug] Reason: 20 consecutive failures fetching scheduler list
09-Apr-20 10:42:55 AM Internet access OK - project servers may be temporarily down.
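
The stuck state in the log above can be spotted mechanically. A small sketch, assuming event-log lines in exactly the format pasted here; the function name and the regular expressions are illustrative and not part of BOINC.

```python
import re

DEFER_RE = re.compile(
    r"Deferring communication for (\d+) days (\d+) hr (\d+) min (\d+) sec"
)
REASON_RE = re.compile(
    r"Reason: (\d+) consecutive failures fetching scheduler list"
)


def find_scheduler_backoff(log_text):
    """Return (deferral_seconds, failure_count), or None if not backed off."""
    defer = DEFER_RE.search(log_text)
    reason = REASON_RE.search(log_text)
    if not (defer and reason):
        return None
    days, hours, minutes, seconds = (int(g) for g in defer.groups())
    total = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return total, int(reason.group(1))
```

Fed the lines above, it reports a one-day deferral after 20 consecutive failures, which is why pressing Update over and over changes nothing.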

I have restarted BOINC and restarted the computer, with and without "No new tasks" set.
Any suggestions?
Member of the 20 Year Club



ID: 2043846
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14114
Credit: 200,643,578
RAC: 1,983
United Kingdom
Message 2043808 - Posted: 9 Apr 2020, 9:40:39 UTC - in response to Message 2043805.  
Last modified: 9 Apr 2020, 9:43:19 UTC

Also, remember that the SETI@Home staff team has been without a specialist database wrangler since the departure of Bob Bankay (bobb2). He left for a commercial post some time after his last contribution to these message boards in 2008 - I thought I remembered a valedictory, but I haven't been able to find it. Check his posting history for an idea of what we lost.
ID: 2043808
Ville Saari
Joined: 30 Nov 00
Posts: 1119
Credit: 48,373,696
RAC: 74,889
Finland
Message 2043805 - Posted: 9 Apr 2020, 9:07:05 UTC - in response to Message 2043803.  

Remember, that for a database, deleting a row doesn't free up any space: it simply marks that row as no longer active, effectively creating a hole in the storage area.
That's just disk storage, of which I'm not aware of there being any shortage. The deleted rows don't need to be cached in ram, so ram pressure will be reduced, and that's where the problems were.
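
The quoted point about deleted rows leaving holes is easy to see on a small scale. A sketch using SQLite purely as a stand-in: the project's own database is a different engine, and the details of how freed space gets reused differ between engines. The table name and row count are arbitrary.

```python
import os
import sqlite3
import tempfile


def sizes_after_delete_and_vacuum(rows=20_000):
    """Return database file sizes (full, after_delete, after_vacuum) in bytes."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        con = sqlite3.connect(path)
        # Make sure freed pages stay in the file until explicitly compacted.
        con.execute("PRAGMA auto_vacuum = NONE")
        con.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, payload TEXT)")
        con.executemany(
            "INSERT INTO result (payload) VALUES (?)",
            (("x" * 200,) for _ in range(rows)),
        )
        con.commit()
        full = os.path.getsize(path)

        con.execute("DELETE FROM result")   # rows gone, pages only marked free
        con.commit()
        after_delete = os.path.getsize(path)

        con.execute("VACUUM")               # rebuild the file without the holes
        after_vacuum = os.path.getsize(path)
        con.close()
        return full, after_delete, after_vacuum
    finally:
        os.remove(path)
```

The file stays the same size after the delete and only shrinks once the compaction step runs, which is the disk-side effect being described. It says nothing about ram use, which is the part that mattered here.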

I'm wondering why S@H never made real use of the fact that the database has a full replica. If database compaction was the reason for the weekly downtimes, then those downtimes could have been completely avoided by compacting the replica while the master is still running, then syncing it up again and swapping the roles of the databases. The downtime from this swap would likely be less than the period between two scheduler requests, so few users would experience any downtime. Then the replica would be running the project as the new master and the old master would now be the replica that can be taken down and compacted.
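
The ordering of steps in that swap can be sketched as a toy model. This only illustrates the sequence; the class, its fields and the numbers are invented, and a real master/replica switchover involves replication tooling that is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Db:
    name: str
    holes: int              # dead rows waiting to be compacted
    serving: bool = False   # True while this copy answers scheduler requests


def rotate(master, replica):
    """Run one maintenance cycle.

    Returns (new_master, new_replica, timeline), where timeline records
    for each step whether some copy was serving requests at that point.
    """
    timeline = []
    master.serving = True

    replica.holes = 0                        # 1. compact the replica offline
    timeline.append(("compact replica", master.serving))

    # 2. the replica replays the changes it missed (not modelled)
    timeline.append(("resync replica", master.serving))

    master.serving, replica.serving = False, True    # 3. swap roles
    timeline.append(("swap roles", replica.serving))

    return replica, master, timeline
```

Running `rotate` twice in a row compacts both copies: the old master comes out of the first cycle as the replica with its holes intact, and gets compacted in the second.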
ID: 2043805


 
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.