The Server Issues / Outages Thread - Panic Mode On! (119)
| Author | Message |
|---|---|
|
Grant (SSSF) Joined: 19 Aug 99 Posts: 12990 Credit: 208,696,464 RAC: 690
|
I know your telecom cables were installed many decades ago. Seems more like a governmental control issue bogged you down. The slow uptake of most things was due to cost. It cost a lot for broadband plans when they first came out (hell, it cost a lot for dialup when it came out). I can't remember the actual pricing, but I think it was something like $75/month for a 1.5Mb/s connection with a data cap in the 100s of MB (not GB, MB). Eventually we got 8Mb/s, then up to 24Mb/s ADSL2+. Prices didn't change much but the data caps did increase. Then the government decided that it wasn't good enough (a lot of people still couldn't even get ADSL), so they came up with the idea of the NBN (National Broadband Network). The idea was fibre to the home for everyone (or at least 95%+ of the population. Some people live a long, long way from anywhere) with a 100Mb/s connection. Partway through that the government changed & the other side got in & they didn't like the cost and decided that fibre to the node was good enough for some, others could have satellite (which is crap at the best of times) and fixed wireless, and that 100Mb/s was way more than anyone really needed anyway (tossers). I was one of the lucky ones to get fibre to the home before that got canned. And many others are stuck with the hodgepodge of other technologies, and guess what? It would seem that just like 5kB/s wasn't really enough, more & more entertainment (Netflix etc), and now everyone isolating at home and trying to work & do school and still watch Netflix, has shown 100Mb/s isn't even close to being good enough if there's more than a couple of people in a house. Short-sighted thinking screws things up, yet again. Grant Darwin NT |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5384 Credit: 192,787,363 RAC: 1,426
|
Could be. The default idle interval for project connection checkin is 60 minutes in the client. So that would reduce the checkins down to twice an hour. A big reduction from every 5 min and a few seconds, then 10 min and a few more seconds. . . Throw an online party (since no-one can meet physically anymore). You lucky, lucky bas^&*d! (Life of Brian). I wonder when the next exciting thing will happen. :) Stephen < shrug > |
|
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 603
|
I do remember that living in Bahrain from 2006 to 2009, my internet was considered "hi-speed" at 100Kbps. It was frustrating, but workable. Now I get upset when I can't get 500Mbps down and 25Mbps up on my gig connection. :D |
|
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 603
|
That was the case when Classic started but at the time the Boinc Seti started, the dialups were long gone. Maybe for you, but not for the rest of the world. So hard to believe it took this long in Australia. The first time I went to Thailand with my wife in 2006, internet was only to be found in shops, mostly in the larger cities of each province, and Bangkok of course. By the time I moved there permanently in 2011, 3G was everywhere, including our house in the jungle (literally in one of the least populated areas of Chanthaburi Province near Cambodia), and was updated to 4G by 2014. I know your telecom cables were installed many decades ago. Seems more like a governmental control issue bogged you down. |
Oz Joined: 6 Jun 99 Posts: 233 Credit: 200,655,462 RAC: 482
|
Yup, it's Mark. Ah-ha! Yes, that's the difference. Congratulations on making One Billion. Member of the 20 Year Club
|
|
Grant (SSSF) Joined: 19 Aug 99 Posts: 12990 Credit: 208,696,464 RAC: 690
|
That was the case when Classic started but at the time the Boinc Seti started, the dialups were long gone. Maybe for you, but not for the rest of the world. Broadband here has only really taken off in the last 15 years or so. Almost 50% of internet users were still on dialup in 2006. Some ISPs still have dialup customers; other ISPs only stopped offering dialup services in 2018. And we still have data caps for the lower priced plans. Mobile/fixed wireless Internet really only took off here about 8 years ago when the price of data started to drop significantly. Grant Darwin NT |
JohnDK Joined: 28 May 00 Posts: 1200 Credit: 451,243,443 RAC: 2,557
|
Just suspend internet and make a BOINC backup, then you won't lose anything if something goes wrong. That's what I did but all went OK. |
juan BFP Joined: 16 Mar 07 Posts: 9764 Credit: 572,710,851 RAC: 8,616
|
S@H will be down in a couple of weeks, so why not wait until your cache is empty before changing the client version? IMHO, risking a crash of your S@H cache is not an option in these last weeks. My 0.02
|
kittyman Joined: 9 Jul 00 Posts: 50494 Credit: 1,018,363,574 RAC: 2,276
|
Yup, it's Mark. Mine worked OK for quite a while too, until Seti server comms got messed up for long enough that Boinc went into that 'fetching scheduler list' routine, and that's where it got stuck. Meow. "Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein "With cats." kittyman
|
Oz Joined: 6 Jun 99 Posts: 233 Credit: 200,655,462 RAC: 482
|
I had this problem and had to upgrade the version of Boinc I was running. Thanks - Mark, isn't it? Odd that it worked okay until 3 April... I will back up the data directory and have at it tomorrow. Worst case, I can move it to a working machine for upload so I don't bugger the wingpersons. Member of the 20 Year Club
|
|
Ville Saari Joined: 30 Nov 00 Posts: 1119 Credit: 48,373,696 RAC: 74,889
|
That was the case when Classic started but at the time the Boinc Seti started, the dialups were long gone. Boinc wus weren't the same size as classic wus, as they took quite a different time when I ran them on the same machine that had been running classic. Also, during the Boinc Seti the wu size has been increased at least once. Just not enough. Seti wu data is a signal covering a certain period of time. Increasing that time length won't change the results produced. A double-size wu would just produce twice the number of detected results on average. And I believe those results exist as individual results in the science database without any wu boundaries. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14114 Credit: 200,643,578 RAC: 1,983
|
Seti should have used much bigger workunits to reduce the number of database operations needed. That would also have made the database smaller. At the time the SETI data format was established (before 1999), most people would have been on dial-up modems, and maybe not even up to 56 Kbit/sec. The typical processing time per task was several hours. The 'size' (both data volume and processing time) of the workunit was probably deliberately chosen to be the largest which could maintain the volunteers' interest, given the hardware of the time. To make any fundamental changes to the data structure would have made the 20-year longitudinal study even more difficult than it already is. The other alternative, creating bundles of tasks as a single super-WU, probably never crossed the threshold of 'it'll be hard work, but worth it in the end'. |
|
Ville Saari Joined: 30 Nov 00 Posts: 1119 Credit: 48,373,696 RAC: 74,889
|
I got some resends marked abandoned. I haven't seen this status before. Is it something the user did, or is Eric doing something to get some of these WUs out of limbo?? I guess it means the user detached from Seti without returning the task. |
Unixchick Joined: 5 Mar 12 Posts: 780 Credit: 2,361,516 RAC: 49
|
I got some resends marked abandoned. I haven't seen this status before. Is it something the user did, or is Eric doing something to get some of these WUs out of limbo?? |
|
Ville Saari Joined: 30 Nov 00 Posts: 1119 Credit: 48,373,696 RAC: 74,889
|
The kind of huge DB S@H uses needs to be constantly monitored by a DB specialist or weird things could happen. Does that sound common in S@H? A database of tens of millions of rows isn't really a huge database. What makes the Seti Boinc database challenging is the very high rate of database operations it has to support. Seti should have used much bigger workunits to reduce the number of database operations needed. That would also have made the database smaller. |
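To put rough numbers on the workunit-size argument, here is a minimal sketch in Python. It only counts rows: one per workunit plus one per result copy sent out to a host. The function name, the recording length, the workunit lengths and the two-copies-per-workunit figure are all illustrative assumptions, not SETI@home's real values.

```python
def workunit_rows(total_seconds_of_data, seconds_per_workunit, copies_per_workunit=2):
    """Count the workunit and result rows needed to cover a recording.

    All parameters are hypothetical; copies_per_workunit stands in for
    however many hosts receive each workunit for cross-validation.
    """
    # Ceiling division: a partial chunk at the end still needs a workunit.
    workunits = -(-total_seconds_of_data // seconds_per_workunit)
    results = workunits * copies_per_workunit
    return workunits, results


# Same recording split into short workunits versus ones four times as long.
small = workunit_rows(1_000_000, 100)
large = workunit_rows(1_000_000, 400)
print("short workunits:", small)
print("long workunits: ", large)
```

In this toy model, making each workunit four times longer cuts both row counts to a quarter, and every row that never exists is also a set of create, send, report, validate and delete operations that never hits the database.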
juan BFP Joined: 16 Mar 07 Posts: 9764 Credit: 572,710,851 RAC: 8,616
|
Also, remember that the SETI@Home staff team has been without a specialist database wrangler since the departure of Bob Bankay (bobb2). He left for a commercial post some time after his last contribution to these message boards in 2008 - I thought I remembered a valedictory, but I haven't been able to find it. Check his posting history for an idea of what we lost. That could explain some of the constant DB problems. The kind of huge DB S@H uses needs to be constantly monitored by a DB specialist or weird things could happen. Does that sound common in S@H?
|
kittyman Joined: 9 Jul 00 Posts: 50494 Credit: 1,018,363,574 RAC: 2,276
|
I had this problem and had to upgrade the version of Boinc I was running. This does pose the risk of possibly losing your cache and completed work however. Meow. "Learn from yesterday. Live for today. Hope for tomorrow." Albert Einstein "With cats." kittyman
|
Oz Joined: 6 Jun 99 Posts: 233 Credit: 200,655,462 RAC: 482
|
Computer 7596636 has 33 tasks ready to report and continues to state that: 09-Apr-20 10:42:46 AM SETI@home update requested by user 09-Apr-20 10:42:50 AM SETI@home [sched_op_debug] Fetching master file 09-Apr-20 10:42:50 AM SETI@home Fetching scheduler list 09-Apr-20 10:42:52 AM Project communication failed: attempting access to reference site 09-Apr-20 10:42:52 AM SETI@home [sched_op_debug] Deferring communication for 1 days 0 hr 0 min 0 sec 09-Apr-20 10:42:52 AM SETI@home [sched_op_debug] Reason: 20 consecutive failures fetching scheduler list 09-Apr-20 10:42:55 AM Internet access OK - project servers may be temporarily down. I have restarted Boinc and restarted the computer, with and without "no new tasks" set. Any suggestions? Member of the 20 Year Club
|
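For anyone who wants to check how long their client has backed off without reading the whole event log, here is a small Python sketch that pulls the deferral out of a line like the one quoted above. The regular expression is written only against the format shown in that log (`1 days 0 hr 0 min 0 sec`); treating the days, hours and minutes fields as optional is an assumption, and `parse_deferral` is a hypothetical helper, not part of BOINC.

```python
import re
from datetime import timedelta

# Matches the deferral text seen in the quoted event log, e.g.
# "Deferring communication for 1 days 0 hr 0 min 0 sec".
# Optional leading fields are an assumption about shorter deferrals.
DEFER_RE = re.compile(
    r"Deferring communication for "
    r"(?:(?P<days>\d+) days? )?"
    r"(?:(?P<hours>\d+) hr )?"
    r"(?:(?P<minutes>\d+) min )?"
    r"(?P<seconds>\d+) sec"
)


def parse_deferral(line):
    """Return the deferral as a timedelta, or None if the line has none."""
    match = DEFER_RE.search(line)
    if match is None:
        return None
    # Keep only the fields that were present and convert them to integers.
    parts = {
        name: int(value)
        for name, value in match.groupdict().items()
        if value is not None
    }
    return timedelta(**parts)


log_line = (
    "09-Apr-20 10:42:52 AM SETI@home [sched_op_debug] "
    "Deferring communication for 1 days 0 hr 0 min 0 sec"
)
print(parse_deferral(log_line))
```

For the log in the post this gives a 24-hour deferral, which matches the "20 consecutive failures fetching scheduler list" reason on the following log line: the client has stopped retrying for a full day.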
Richard Haselgrove Joined: 4 Jul 99 Posts: 14114 Credit: 200,643,578 RAC: 1,983
|
Also, remember that the SETI@Home staff team has been without a specialist database wrangler since the departure of Bob Bankay (bobb2). He left for a commercial post some time after his last contribution to these message boards in 2008 - I thought I remembered a valedictory, but I haven't been able to find it. Check his posting history for an idea of what we lost. |
|
Ville Saari Joined: 30 Nov 00 Posts: 1119 Credit: 48,373,696 RAC: 74,889
|
Remember, that for a database, deleting a row doesn't free up any space: it simply marks that row as no longer active, effectively creating a hole in the storage area. That's just disk storage, and I'm not aware of any shortage of that. The deleted rows don't need to be cached in ram, so ram pressure will be reduced, and that's where the problems were. I'm wondering why S@H never made real use of the fact that the database has a full replica. If database compaction was the reason for the weekly downtimes, then those downtimes could have been completely avoided by compacting the replica while the master is still running, then syncing it up again and swapping the roles of the databases. The downtime from this swap would likely be less than the period between two scheduler requests, so few users would experience any downtime. Then the replica would be running the project as the new master and the old master would now be the replica that can be taken down and compacted. |
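The compact-then-swap idea in the post above can be sketched as a toy model in Python. This is not how the SETI@home servers were operated and it issues no real database commands; the `Database` class, its fields and `rolling_compaction` are hypothetical names used only to show the order of the steps.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str
    dead_rows: int = 0   # holes left behind by deleted rows
    applied: int = 0     # position in the master's change log
    online: bool = True

    def compact(self):
        """Rebuild storage so that deleted rows no longer occupy space."""
        self.dead_rows = 0


def rolling_compaction(master, replica, changes_during_compaction):
    """Compact both databases while one of them keeps serving the project.

    Returns the (new_master, new_replica) pair.
    """
    # 1. Take only the replica offline and compact it; the master keeps serving.
    replica.online = False
    replica.compact()
    master.applied += changes_during_compaction

    # 2. Bring the replica back and let it catch up with the master.
    replica.online = True
    replica.applied = master.applied

    # 3. Swap roles; the brief pause here is the only user-visible downtime.
    master, replica = replica, master

    # 4. Compact the old master, now acting as the replica, and resync it.
    replica.online = False
    replica.compact()
    replica.online = True
    replica.applied = master.applied
    return master, replica


a = Database("db-a", dead_rows=400, applied=50)
b = Database("db-b", dead_rows=400, applied=50)
new_master, new_replica = rolling_compaction(a, b, changes_during_compaction=7)
print(new_master)
print(new_replica)
```

In the model, both copies end up compacted and in sync, and the only moment neither database is serving as master is the role swap in step 3. A real deployment would also have to handle writes arriving during the swap and verify that the replica has fully caught up first, which this sketch leaves out.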
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.