The Server Issues / Outages Thread - Panic Mode On! (119)

Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2043859 - Posted: 9 Apr 2020, 15:53:08 UTC - in response to Message 2043850.  
Last modified: 9 Apr 2020, 15:54:49 UTC

The kind of huge DB S@H uses needs to be constantly monitored by a DB specialist or weird things could happen. Does that sound common in S@H?
A database of tens of millions of rows isn't really a huge database. What makes the Seti Boinc database challenging is the very high rate of database operations it has to support.

Seti should have used much bigger workunits to reduce the number of database operations needed. That would also have made the database smaller.
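
A rough sketch of the arithmetic behind this point, in Python. It assumes the amount of recorded signal to process per day is fixed and that each workunit costs a constant number of database operations regardless of its length. The function name and all numbers are hypothetical, not measurements from the project.

```python
def db_ops_per_day(data_seconds_per_day: float, wu_seconds: float, ops_per_wu: float) -> float:
    """Estimate daily database operations for a given workunit length.

    Assumption: every workunit costs the same number of operations
    (create, send, report, validate, purge, ...) whatever its size.
    """
    workunits_per_day = data_seconds_per_day / wu_seconds
    return workunits_per_day * ops_per_wu


# Hypothetical figures: 86,400 seconds of signal per day, 10 operations per workunit.
for wu_seconds in (100, 200, 400):
    print(wu_seconds, db_ops_per_day(86_400, wu_seconds, ops_per_wu=10))
```

Under these assumptions the operation count is inversely proportional to workunit length, so doubling the length halves the load. The number of rows for workunits in flight shrinks the same way.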
ID: 2043859
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 2043861 - Posted: 9 Apr 2020, 16:05:22 UTC

I got some resends marked abandoned. I haven't seen this status before. Is it something the user did, or is Eric doing something to get some of these WUs out of limbo?
ID: 2043861
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2043862 - Posted: 9 Apr 2020, 16:10:17 UTC - in response to Message 2043861.  

I got some resends marked abandoned. I haven't seen this status before. Is it something the user did, or is Eric doing something to get some of these WUs out of limbo?
I guess it means the user detached from Seti without returning the task.
ID: 2043862
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14656
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2043864 - Posted: 9 Apr 2020, 16:17:30 UTC - in response to Message 2043859.  

Seti should have used much bigger workunits to reduce the number of database operations needed. That would also have made the database smaller.
At the time the SETI data format was established (before 1999), most people would have been on dial-up modems, and maybe not even up to 56 Kbit/sec. The typical processing time per task was several hours. The 'size' (both data volume and processing time) of the workunit was probably deliberately chosen to be the largest which could maintain the volunteers' interest, given the hardware of the time.

To make any fundamental changes to the data structure would have made the 20-year longitudinal study even more difficult than it already is. The other alternative, creating bundles of tasks as a single super-WU, probably never crossed the threshold of 'it'll be hard work, but worth it in the end'.
ID: 2043864
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2043867 - Posted: 9 Apr 2020, 16:35:16 UTC

That was the case when Classic started, but by the time Boinc Seti started, dial-up was long gone. Boinc WUs weren't the same size as Classic WUs, as they took a quite different time when I ran them on the same machine that had been running Classic. Also, during Boinc Seti the WU size has been increased at least once. Just not enough.

Seti WU data is a signal covering a certain period of time. Increasing that time length won't change the results produced. A double-size WU would just produce twice the number of detected results on average. And I believe those results exist as individual results in the science database without any WU boundaries.
ID: 2043867
Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 2043905 - Posted: 9 Apr 2020, 20:20:27 UTC - in response to Message 2043848.  

I had this problem and had to upgrade the version of Boinc I was running.
This does pose the risk of possibly losing your cache and completed work however.

Meow.


Thanks - Mark isn't it?
Odd that it worked okay until 3 April...
I will back up the data directory and have at it tomorrow; worst case, I can move it to a working machine for upload so I don't bugger the wingpersons.
Member of the 20 Year Club



ID: 2043905
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51469
Credit: 1,018,363,574
RAC: 1,004
United States
Message 2043906 - Posted: 9 Apr 2020, 20:27:38 UTC - in response to Message 2043905.  

Yup, it's Mark.
Mine worked OK for quite a while too, until Seti server comms got messed up for long enough that Boinc went into that 'fetching scheduler list' routine, and that's where it got stuck.

Meow.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 2043906
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2043913 - Posted: 9 Apr 2020, 21:37:31 UTC
Last modified: 9 Apr 2020, 21:59:10 UTC

S@H will be down in a couple of weeks, so why not wait until your cache is empty before changing the client version?
IMHO, risking a crash of a S@H cache is not an option in these last weeks.
my 0.02
ID: 2043913
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 2043917 - Posted: 9 Apr 2020, 22:21:10 UTC

Just suspend internet and make a BOINC backup, then you won't lose anything if something goes wrong. That's what I did but all went OK.
ID: 2043917
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13769
Credit: 208,696,464
RAC: 304
Australia
Message 2043927 - Posted: 9 Apr 2020, 23:37:23 UTC - in response to Message 2043867.  
Last modified: 9 Apr 2020, 23:39:05 UTC

That was the case when Classic started, but by the time Boinc Seti started, dial-up was long gone.
Maybe for you, but not for the rest of the world.
Broadband here has only really taken off in the last 15 years or so. Almost 50% of internet users were still on dial up in 2006.
Some ISPs still have dialup customers, other ISPs only stopped offering dialup services in 2018.
And we still have data caps for the lower priced plans.

Mobile/fixed wireless Internet really only took off here about 8 years ago when the price of data started to drop significantly.
Grant
Darwin NT
ID: 2043927
Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 2043928 - Posted: 9 Apr 2020, 23:48:46 UTC - in response to Message 2043906.  

Yup, it's Mark.
Mine worked OK for quite a while too, until Seti server comms got messed up for long enough that Boinc went into that 'fetching scheduler list' routine, and that's where it got stuck.

Meow.


ah-ha! yes, that's the difference

congratulations on making One Billion
Member of the 20 Year Club



ID: 2043928
AllgoodGuy
Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2043930 - Posted: 10 Apr 2020, 0:02:48 UTC - in response to Message 2043927.  
Last modified: 10 Apr 2020, 0:05:59 UTC

That was the case when Classic started, but by the time Boinc Seti started, dial-up was long gone.
Maybe for you, but not for the rest of the world.
Broadband here has only really taken off in the last 15 years or so. Almost 50% of internet users were still on dial up in 2006.
Some ISPs still have dialup customers, other ISPs only stopped offering dialup services in 2018.
And we still have data caps for the lower priced plans.

Mobile/fixed wireless Internet really only took off here about 8 years ago when the price of data started to drop significantly.

So hard to believe it took this long in Australia. The first time I went to Thailand with my wife in 2006, internet was only to be found in shops, mostly in the larger cities of each province, and Bangkok of course. By the time I moved there permanently in 2011, 3G was everywhere, including our house in the jungle (literally in one of the least populated areas of Chanthaburi Province near Cambodia), and was updated to 4G by 2014. I know your telecom cables were installed many decades ago. Seems more like a governmental control issue bogged you down.
ID: 2043930
AllgoodGuy
Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2043931 - Posted: 10 Apr 2020, 0:09:13 UTC

I do remember that living in Bahrain from 2006 to 2009, my internet was considered "hi-speed" at 100Kbps. It was frustrating, but workable. Now I get upset when I can't get 500Mbps down and 25M up on my gig connection. :D
ID: 2043931
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2043933 - Posted: 10 Apr 2020, 0:27:04 UTC - in response to Message 2043767.  

Could be. The default idle interval for project connection checkin is 60 minutes in the client. So that would reduce the checkins down to twice an hour.
A big reduction from every 5min and a few seconds, then 10min and a few more seconds.


I got one! i got one!


. . Throw an online party (since no-one can meet physically anymore). You lucky, lucky bas^&*d! (Life of Brian). I wonder when the next exciting thing will happen. :)

Stephen

< shrug >
ID: 2043933
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13769
Credit: 208,696,464
RAC: 304
Australia
Message 2043937 - Posted: 10 Apr 2020, 1:12:53 UTC - in response to Message 2043930.  

I know your telecom cables were installed many decades ago. Seems more like a governmental control issue bogged you down.
The slow uptake of most things was due to cost. It cost a lot for broadband plans when they first came out (hell, it cost a lot for dialup when it came out). I can't remember the actual pricing, but I think it was something like $75/month for a 1.5Mb/s connection with a data cap in the 100s of MB (not GB, MB). Eventually we got 8Mb/s, then up to 24Mb/s ADSL2+. Prices didn't change much but the data caps did increase.

Then the government decided that it wasn't good enough (a lot of people still couldn't even get ADSL), so they came up with the NBN (National Broadband Network). The idea was fibre to the home for everyone (or at least 95%+ of the population. Some people live a long, long way from anywhere) with a 100Mb/s connection.
Partway through that the government changed & the other side got in. They didn't like the cost and decided that fibre to the node was good enough for some, that others could have satellite (which is crap at the best of times) or fixed wireless, and that 100Mb/s was way more than anyone really needed anyway (tossers).
I was one of the lucky ones to get fibre to the home before that got canned. Many others are stuck with the hodgepodge of other technologies, and guess what? Just like 5kB/s wasn't really enough, with more & more entertainment (Netflix etc.), and now with everyone isolating at home and trying to work & do school and still watch Netflix, it turns out 100Mb/s isn't even close to being good enough if there's more than a couple of people in a house.
Short sighted thinking screws things up, yet again.
Grant
Darwin NT
ID: 2043937
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2043938 - Posted: 10 Apr 2020, 1:14:20 UTC - in response to Message 2043867.  
Last modified: 10 Apr 2020, 1:22:54 UTC

That was the case when Classic started, but by the time Boinc Seti started, dial-up was long gone. Boinc WUs weren't the same size as Classic WUs, as they took a quite different time when I ran them on the same machine that had been running Classic. Also, during Boinc Seti the WU size has been increased at least once. Just not enough.

Seti WU data is a signal covering a certain period of time. Increasing that time length won't change the results produced. A double-size WU would just produce twice the number of detected results on average. And I believe those results exist as individual results in the science database without any WU boundaries.


. . The only change in WU size I am aware of was the change from 2bit to 4bit samples which doubled the size of the tasks sent out but only to increase the resolution of the process. The sample size from the raw data is exactly the same, NO CHANGE!

. . The reason for the increase in processing time was the change of versions: v7 took longer than v6, v8 took longer than v7, etc. As more computing power became the norm, the later versions took advantage of it to improve (and slow down) the analysis itself. Again, NO CHANGE in the sample size being analysed. Changing the basic sample size from the splitters would be like working on a jigsaw puzzle with pieces of two different scales and trying to make a coherent picture out of them. As Richard said, the only reasonable path would have been to create a new WU structure comprising multiples of the original sample units, so that the results would be units on the same scale as all the previous results but in manageable blocks, rather like zipped packages, to improve transmission efficiency. The biggest problem there would be the validation process, which would have to be done at the smaller component size, or whole batches would become invalid because of one glitch in one component piece. There is also the possibility, which increases with the number of fundamental units that comprise the 'new WU', that different components could have errors on the different wingmen, making a valid task much less likely and increasing the number of resends. So, as Richard said, some hard work involved.
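
The validation concern can be put in rough numbers with a small Python sketch. It assumes each component unit validates independently with the same probability, and that one failed component invalidates the whole bundle (the whole-batch case described above). The function names and the 0.99 figure are hypothetical, chosen only for illustration.

```python
def bundle_valid_probability(p_component: float, n_components: int) -> float:
    """Chance that a bundle validates when every component must validate.

    Assumption: components fail independently with the same probability.
    """
    return p_component ** n_components


def bundle_resend_fraction(p_component: float, n_components: int) -> float:
    """Fraction of bundles needing a resend under the same assumption."""
    return 1.0 - bundle_valid_probability(p_component, n_components)


# Hypothetical 99% per-component validation rate, for several bundle sizes.
for n in (1, 2, 10, 50):
    print(n, bundle_valid_probability(0.99, n), bundle_resend_fraction(0.99, n))
```

Under these assumptions the chance of a clean bundle falls geometrically with the number of components, which is why validating at the component level would be needed to keep resends down.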

. . And if you didn't notice, that would result in the same number of results at the validator/assimilator stage.

Stephen

:(
ID: 2043938
Kiska
Volunteer tester
Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 2043950 - Posted: 10 Apr 2020, 3:49:08 UTC - in response to Message 2043867.  

That was the case when Classic started, but by the time Boinc Seti started, dial-up was long gone.


I'll have you know I used dial up til 2016... My phone line couldn't support ADSL, since I was too far for it and they thought it was uneconomical to put another exchange in for me.

Of course that didn't stop me from ordering ADSL, and I would get glorious 96kbit/s ADSL. It was barely functional: I would drop out at least 100 times per day, and I couldn't keep a stable connection for more than 10 mins.

:D
ID: 2043950
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 2043983 - Posted: 10 Apr 2020, 8:30:04 UTC

Yay, I can finally say that some of my valid results have been deleted. 73 results have been deleted; I now have 2306. I know the number is quite small; however, it is progress.
ID: 2043983
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13769
Credit: 208,696,464
RAC: 304
Australia
Message 2043996 - Posted: 10 Apr 2020, 9:54:06 UTC
Last modified: 10 Apr 2020, 9:55:31 UTC

A few clumps of deadlines being hit & resends made a couple of nice dips in the In progress numbers there, along with the backlogged numbers.


Grant
Darwin NT
ID: 2043996
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13769
Credit: 208,696,464
RAC: 304
Australia
Message 2043997 - Posted: 10 Apr 2020, 9:57:37 UTC - in response to Message 2043983.  
Last modified: 10 Apr 2020, 10:13:33 UTC

Yay, I can finally say that some of my valid results have been deleted. 73 results have been deleted; I now have 2306. I know the number is quite small; however, it is progress.
A huge dent in mine, probably around 700 or so removed.


Yep- and most of them were all older Valids that had been waiting on other Tasks to be returned.
Looks like I should get rid of a few more of those really old ones in the next couple of days: resends are due on the 10th, 11th and 12th of April.
But there are still a couple waiting on resends not due till late May.

Now most of my oldest Valids are from mid Feb.
Grant
Darwin NT
ID: 2043997
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.