Panic Mode On (92) Server Problems?

Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1607457 - Posted: 30 Nov 2014, 23:40:04 UTC - in response to Message 1606852.  

...
The question of what happens to a retired host after its last task is validated and purged is an interesting one. Most of mine have RAC below 0.1, but some have 0.00 - so some other decay mechanism must be in place. I don't know what that is, but I might have a look around.

Filling in some of the blanks...

RAC is recalculated each time credit is granted. Additionally, projects are expected to run a periodic task to decay RAC when there are no credit grants. The sched/update_stats.cpp source for that indicates it does not do an update if the RAC has been changed within the last day, nor if the RAC is below 0.1.
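
For illustration, the skip rules described above can be sketched roughly as follows. This is a minimal Python sketch, not the actual sched/update_stats.cpp code: the function name `decay_rac`, the one-week half-life, and the exponential form of the decay are assumptions made for the example. Only the 0.1 floor and the one-day cutoff come from the description above.

```python
import math

SECONDS_PER_DAY = 86400
HALF_LIFE = 7 * SECONDS_PER_DAY   # assumption: RAC halves after one idle week
MIN_RAC = 0.1                     # from the description: below this, no update
MIN_AGE = SECONDS_PER_DAY         # from the description: skip if changed within a day


def decay_rac(rac, last_update, now):
    """Return (rac, last_update) after one idle-decay pass (hypothetical helper)."""
    if rac < MIN_RAC:
        # Too small to bother with: the record is left exactly as it is.
        return rac, last_update
    if now - last_update < MIN_AGE:
        # Recently changed (for example by a credit grant): skip it.
        return rac, last_update
    # Exponential decay with no new credit since last_update.
    weight = math.exp(-(now - last_update) * math.log(2) / HALF_LIFE)
    return rac * weight, now
```

Under these assumptions a retired host at 0.12 that is decayed once after a week drops to about 0.06 and is then never touched again, which would fit hosts stuck somewhere between roughly 0.05 and 0.1 but not hosts showing 0.00.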

The updates for host, user, and team records are all done sequentially. The fact that they are separate calculations means that even under normal circumstances the user RAC is not exactly equal to the sum of the RACs of the user's hosts.

Normal circumstances for this project had update_stats run once a week on Tuesday. That appears not to have happened for the last two weeks; perhaps the executable was on bruno so the cron job fails, or perhaps they are using a stripped-down config.xml missing that crontab entry. So host RAC for a system which hasn't had a recent validation has not decayed, but for those with more than one host the user RAC may have been adjusted by a validation on a different host.

The current BOINC logic doesn't account for defunct hosts with RAC below 0.05. Perhaps an older version of update_stats didn't skip cases with RAC below 0.1 or perhaps some other cleanup script was run sometime.
                                                                  Joe
ID: 1607457
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1607480 - Posted: 1 Dec 2014, 0:42:39 UTC - in response to Message 1607457.  

Thanks for the insight into some of BOINC's inner workings. It makes sense, as my AP-only systems have had an unchanged RAC since Nov. 7th, which looks to be the same time Bruno went rather sideways.
A cleanup process such as that is certainly in the non-critical column.
Looking through my older inactive teammates' systems, I see hosts with contact as recent as 2012 that have a RAC of 0.00 and hosts with last contact in 2007 with a RAC of 0.10. Perhaps there is another process as you theorized, but it only processes hosts that have communicated with the server in the past N days.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1607480
betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1607720 - Posted: 1 Dec 2014, 16:59:18 UTC

For the first time in over 15 years I am not crunching Seti; this is causing a real panic.
ID: 1607720
S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 43,822,095
RAC: 0
Netherlands
Message 1607765 - Posted: 1 Dec 2014, 18:58:35 UTC - in response to Message 1607720.  

Relax, most of us have been out of work for at least 2 weeks now... Surely you must remember Nov-Dec 2012, when Seti was down for weeks too. Patience is all we can depend on ;-)
ID: 1607765
AlienDancer
Volunteer tester
Joined: 8 Sep 99
Posts: 68
Credit: 12,473,416
RAC: 0
Message 1607808 - Posted: 1 Dec 2014, 20:15:22 UTC
Last modified: 1 Dec 2014, 20:16:30 UTC

I still have a few stragglers on a couple of machines and even picked up a few new ones yesterday. My Einstein RAC is higher than it's ever been and SETI is falling. I've even added a third project, Bitcoin, and I think I'm posting more in the forums than ever before because I keep reading them, trying to determine when SETI is coming back. I had all old and slow machines back in 2012, so I never actually ran out of work except on one machine. Crunching since 1999.
ID: 1607808
Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 3075
Credit: 5,631,463
RAC: 0
Canada
Message 1607831 - Posted: 1 Dec 2014, 21:25:33 UTC

I almost panicked. My GPU Beta cache was down to 4 WU and the Status page was all red. Then BOOM, the downloads started and my cache filled. Status page is green again. Crisis avoided. Milkyway and Einstein can stay on the back burner.

Boinc V7.2.42
Win7 i5 3.33G 4GB, GTX470
ID: 1607831
Phil Burden
Joined: 26 Oct 00
Posts: 264
Credit: 22,303,899
RAC: 0
United Kingdom
Message 1607842 - Posted: 1 Dec 2014, 21:51:49 UTC - in response to Message 1607831.  

Some folk have all the luck, haven't had anything for several days now ;-(

P.
ID: 1607842
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1607848 - Posted: 1 Dec 2014, 21:55:54 UTC - in response to Message 1607831.  

Yes, DA has been busy: he's fixed Seti Beta's Server Status page. It only took a minimum of four years, and might have been longer.

Claggy
ID: 1607848
ReiAyanami
Joined: 6 Dec 05
Posts: 116
Credit: 222,900,202
RAC: 174
Japan
Message 1608184 - Posted: 2 Dec 2014, 16:27:24 UTC
Last modified: 2 Dec 2014, 16:27:53 UTC

At that rate we're looking at another 86 hours (+24 for index building), which means Saturday morning (US Pacific time).

On the plus side, Arecibo has four full drives of data that they can send us, so there might be SETI@home work before then if we're really lucky.

Sounds like it's time to install a couple of video cards I purchased on Black Friday!!
ID: 1608184
Admiral Gloval
Joined: 31 Mar 13
Posts: 20232
Credit: 5,308,449
RAC: 0
United States
Message 1608390 - Posted: 3 Dec 2014, 1:39:39 UTC

Lucky you. I still have to get by on an onboard APU with only half a gig of memory. Trying to save up for a real GPU sometime in the near future.
Waiting for the feeding frenzy in the shark pool to start.

ID: 1608390
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1608432 - Posted: 3 Dec 2014, 3:23:19 UTC - in response to Message 1607848.  

You call that gobbledygook fixed? The program names may make sense to him and even to some other people, but most (even those savvy enough to be running Beta) will be looking at long, multiline strings of characters and saying "huh?!"
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1608432
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1608436 - Posted: 3 Dec 2014, 3:25:40 UTC - in response to Message 1607848.  

Hi Claggy,

Never mind, it's all grist to the mill..

A few gadzillion electrons being booted up the tail pipe to move through a silicon spider with copper legs:-)

Alles sal reg kom (everything will come right).

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1608436
Gary Charpentier Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30636
Credit: 53,134,872
RAC: 32
United States
Message 1608471 - Posted: 3 Dec 2014, 4:44:12 UTC - in response to Message 1607848.  

Well, http://setiweb.ssl.berkeley.edu/beta/status.php sure isn't fixed ....
and http://setiweb.ssl.berkeley.edu/beta/server_status.php isn't updating the daemon jobs as that report is from 12/1 and the task data is from 12/3!

I'd say he is working towards a repair.

Oh, as to those long names, he took the lazy man's route and printed the entire command line that started the program including option switches [which maybe should not be public for security reasons], not just the program name and instance number.
ID: 1608471
Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1608541 - Posted: 3 Dec 2014, 6:59:01 UTC

Panic mode is over; now it's just acquiescence. Only Lisa's computer has some tasks left, there are 2 tasks on Yoko's, and my main rig is dry :(
rOZZ
Music
Pictures
ID: 1608541
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1608679 - Posted: 3 Dec 2014, 13:29:45 UTC - in response to Message 1608541.  

Hi Julie,

We might want to consider the possibility that the S@H servers & db have decided that once again it's holiday season and that they are entitled to some time off:-)

This kinda reminds me of 2012; around this time of year there was a significant outage as well.

Perhaps we should be thankful that S@H only decides to holiday every 2 years or so:-)

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1608679
Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1608682 - Posted: 3 Dec 2014, 14:00:54 UTC - in response to Message 1608679.  

Truer words were never spoken, Cliff :)
rOZZ
Music
Pictures
ID: 1608682
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 1608795 - Posted: 3 Dec 2014, 20:42:05 UTC - in response to Message 1608679.  

18:00 GMT 23/12 will be my guess :-)
ID: 1608795
Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1608816 - Posted: 3 Dec 2014, 21:41:32 UTC - in response to Message 1608795.  

Your guess for what?
rOZZ
Music
Pictures
ID: 1608816
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1608833 - Posted: 3 Dec 2014, 22:23:33 UTC - in response to Message 1608679.  
Last modified: 3 Dec 2014, 22:23:45 UTC

I think the db might be working overtime with all the work going on in the background to move the data into a new table. At least I imagine it would be more than us poking it x number of times a second.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1608833
Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1608847 - Posted: 3 Dec 2014, 23:07:49 UTC - in response to Message 1608833.  

Just reported my main rig, nuttin:( *no running tasks*
rOZZ
Music
Pictures
ID: 1608847
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.