The Server Issues / Outages Thread - Panic Mode On! (118)

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2028991 - Posted: 24 Jan 2020, 15:41:25 UTC - in response to Message 2028990.  

Putting the current value of 8.0269/sec into a spreadsheet, and constraining myself to integer seconds, the most plausible values in the first 10 minutes are

3283 in 409 seconds
1790 in 223 seconds

It's one way of filling in time while waiting for new work.
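
The spreadsheet search described above can be sketched in a few lines of Python. This is only an illustration, assuming the displayed rate is a whole number of results divided by a whole number of seconds; the names `RATE`, `MAX_SECONDS` and `candidates` are made up for the example and do not come from the status page.

```python
# Search integer durations for result counts whose ratio matches the displayed rate.
RATE = 8.0269        # results/sec, as quoted in the post
MAX_SECONDS = 600    # "the first 10 minutes"

def candidates(rate, max_seconds):
    """Return (seconds, count, error) tuples, closest match to `rate` first."""
    rows = []
    for seconds in range(1, max_seconds + 1):
        count = round(rate * seconds)          # nearest whole number of results
        error = abs(count / seconds - rate)    # distance from the displayed rate
        rows.append((seconds, count, error))
    return sorted(rows, key=lambda row: row[2])

best = candidates(RATE, MAX_SECONDS)
for seconds, count, error in best[:5]:
    print(f"{count} in {seconds} seconds (off by {error:.7f}/sec)")
```

Exact multiples of a good pair (for example 3580 in 446 seconds, which is twice 1790 in 223) give the same ratio, so they rank equally and can be ignored.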
ID: 2028991 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2028992 - Posted: 24 Jan 2020, 15:48:28 UTC - in response to Message 2028960.  

Can someone show me when this 250 task limit was implemented on the feeder as AFAIK the last increase that I know of took it from 100 to 200.

That was just a guess on my part since when the servers were working well I frequently received MORE than 200 tasks at a time for a request when hosts were empty and the general populace was already full. I have seen values as high as the 236 range, so I just guessed that the limit was closer to 250. Or maybe the magic 256 number that is computer friendly.

Last night before bed I was receiving occasional dribs and drabs of work at every 10th request or so. This morning all hosts are completely empty. So either I have very bad luck in the filling queue or there is still a problem with the RTS buffer and splitting.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2028992 · Report as offensive
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2028994 - Posted: 24 Jan 2020, 15:58:46 UTC - in response to Message 2028992.  

This morning all hosts are completely empty. So either I have very bad luck in the filling queue or there is still a problem with the RTS buffer and splitting.
The splitters have been stopped for about 4 hours now. Only the Astropulse splitters are running, but they are so slow that they can satisfy only about 2% of the demand. The only work we get now is tie-breakers and some lucky Astropulses.
ID: 2028994 · Report as offensive
Profile Mr. Kevvy Crowdfunding Project Donor * Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3869
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2028995 - Posted: 24 Jan 2020, 15:58:59 UTC - in response to Message 2028992.  
Last modified: 24 Jan 2020, 16:17:36 UTC

This morning all hosts are completely empty. So either I have very bad luck in the filling queue or there is still a problem with the RTS buffer and splitting.


Same here... only my slower hosts have enough work. Every time that I have checked the result creation rate since the outrage ended it has been 8/sec. or so (and the page has certainly updated since then). Now all the GBT splitters are showing disabled, so I think that result creation is still being throttled.

There are 5M results in the field even with this, so again it seems to fall back on the three-quorum validation being in place due to the bad AMD hosts.
ID: 2028995 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2028996 - Posted: 24 Jan 2020, 16:06:58 UTC - in response to Message 2028992.  

I’m also mostly empty. So it’s not just you.

I was always under the impression that the RTS buffer was 255, 8-bit [0-255]
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2028996 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2028997 - Posted: 24 Jan 2020, 16:23:24 UTC - in response to Message 2028996.  

https://boinc.berkeley.edu/trac/wiki/ProjectOptions#Jobscheduling
'The size of the job cache. Default is 100 jobs.'
'The size of the feeder's enumeration query. Default is 200.'
ID: 2028997 · Report as offensive
Oddbjornik Crowdfunding Project Donor * Special Project $75 donor * Special Project $250 donor
Volunteer tester
Joined: 15 May 99
Posts: 220
Credit: 349,610,548
RAC: 1,728
Norway
Message 2028998 - Posted: 24 Jan 2020, 16:28:26 UTC - in response to Message 2028997.  

https://boinc.berkeley.edu/trac/wiki/ProjectOptions#Jobscheduling
'The size of the job cache. Default is 100 jobs.'
'The size of the feeder's enumeration query. Default is 200.'
Still, as Keith has also observed, sometimes you get more than 200 tasks in one scheduler call.
ID: 2028998 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2029000 - Posted: 24 Jan 2020, 16:50:38 UTC - in response to Message 2028998.  

As the page says, it's configurable in the project's config.xml file - it won't be stored or visible where anyone except a project admin can see it. We'll just have to go on speculating.
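
For anyone who does have a project's config.xml in front of them, a rough sketch of reading those two options follows. The tag names `shmem_work_items` and `feeder_query_size` are the ones I believe the ProjectOptions page uses for the two descriptions quoted above, so treat them as an assumption. The sample values (256 and 512) are invented and say nothing about what SETI@home actually runs.

```python
import xml.etree.ElementTree as ET

# Hypothetical config.xml fragment; the values are invented for illustration.
SAMPLE_CONFIG = """
<boinc>
  <config>
    <shmem_work_items>256</shmem_work_items>
    <feeder_query_size>512</feeder_query_size>
  </config>
</boinc>
"""

# Defaults as quoted from the ProjectOptions page.
DEFAULTS = {"shmem_work_items": 100, "feeder_query_size": 200}

def feeder_settings(xml_text):
    """Return the two feeder options, falling back to the documented defaults."""
    config = ET.fromstring(xml_text).find("config")
    settings = {}
    for name, default in DEFAULTS.items():
        node = config.find(name) if config is not None else None
        has_value = node is not None and node.text
        settings[name] = int(node.text) if has_value else default
    return settings

print(feeder_settings(SAMPLE_CONFIG))
```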
ID: 2029000 · Report as offensive
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2029001 - Posted: 24 Jan 2020, 17:01:33 UTC - in response to Message 2028995.  

Every time that I have checked the result creation rate since the outrage ended it has been 8/sec. or so
It has been that for several hours now but when the splitters were still running, the number was normal. Those 8/sec don't come from splitters but from tie-breaker tasks added by the validator.

Now all the GBT splitters are showing disabled, so I think that result creation is still being throttled.
It's not throttling - the splitters have been completely stopped for a long time now, not 'pulse width modulated' like when they throttled them to help the database shrink.

There are 5M results in the field even with this, so again it seems to fall back on the three-quorum validation being in place due to the bad AMD hosts.
The average replication is only 2.2, so it doesn't look that high. I guess the number of results out in the field is high because people have increased their caches due to the recent very long outages.
ID: 2029001 · Report as offensive
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2029003 - Posted: 24 Jan 2020, 17:08:00 UTC - in response to Message 2028998.  

Still, as Keith has also observed, sometimes you get more than 200 tasks in one scheduler call.
24-Jan-2020 02:31:58 [SETI@home] Scheduler request completed: got 306 new tasks
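
To check how often this happens, the client's event log can be scanned for such lines. A minimal sketch, assuming the message wording matches the line above; the second sample line is hypothetical and only there to show a non-matching count.

```python
import re

# Matches the "got N new tasks" message shown in the log line above.
GOT_TASKS = re.compile(r"Scheduler request completed: got (\d+) new tasks")

def task_counts(log_lines):
    """Extract the number of tasks granted by each scheduler reply."""
    counts = []
    for line in log_lines:
        match = GOT_TASKS.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts

sample_log = [
    "24-Jan-2020 02:31:58 [SETI@home] Scheduler request completed: got 306 new tasks",
    # Hypothetical line, added for illustration only:
    "24-Jan-2020 02:37:02 [SETI@home] Scheduler request completed: got 0 new tasks",
]

over_200 = [n for n in task_counts(sample_log) if n > 200]
print(over_200)
```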

ID: 2029003 · Report as offensive
Profile Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 2029004 - Posted: 24 Jan 2020, 17:26:07 UTC - in response to Message 2028975.  

Greetings,

The long unscheduled Sunday outages last September and October motivated me to download boinc source and hack it to report fake GPUs. I made it read the number of fake GPUs from a file each time it contacts the scheduler so that I can adjust the queue size on the fly without restarting boinc. The slower computer was running the stock boinc until last Monday when I installed my spoofed client on it too to prepare for Tuesday downtime that Eric announced beforehand to be very long.

So, those of us not getting any work are going to continue not getting any work. Great!

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2029004 · Report as offensive
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 2029005 - Posted: 24 Jan 2020, 17:42:50 UTC

Since the project came back up:

I have 2 machines
1 finally recovered and is getting a trickle of work.
The other

1/24/2020 9:37:26 AM | SETI@home | Fetching scheduler list
1/24/2020 9:37:39 AM | | Project communication failed: attempting access to reference site
1/24/2020 9:37:40 AM | | Internet access OK - project servers may be temporarily down.

Have rebooted, etc.

Cannot even get it to report.

Any ideas ?

Dave
Dave

ID: 2029005 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2029006 - Posted: 24 Jan 2020, 17:48:31 UTC - in response to Message 2029005.  


Try setting No new tasks (NNT) on the Projects tab. I was able to report all of my completed tasks this way. Then reset it to Allow new tasks after all reports are done, and just wait to get lucky and get some new work.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2029006 · Report as offensive
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 2029007 - Posted: 24 Jan 2020, 17:53:05 UTC - in response to Message 2029006.  


Have tried NNT, no JOY.
Dave

ID: 2029007 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor * Special Project $75 donor * Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2029008 - Posted: 24 Jan 2020, 17:54:03 UTC - in response to Message 2028992.  

Can someone show me when this 250 task limit was implemented on the feeder as AFAIK the last increase that I know of took it from 100 to 200.

That was just a guess on my part since when the servers were working well I frequently received MORE than 200 tasks at a time for a request when hosts were empty and the general populace was already full. I have seen values as high as the 236 range, so I just guessed that the limit was closer to 250. Or maybe the magic 256 number that is computer friendly.

Last night before bed I was receiving occasional dribs and drabs of work at every 10th request or so. This morning all hosts are completely empty. So either I have very bad luck in the filling queue or there is still a problem with the RTS buffer and splitting.


. . This machine has been empty for nearly 4 hours ... :(

Stephen

:(
ID: 2029008 · Report as offensive
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2029009 - Posted: 24 Jan 2020, 17:56:46 UTC - in response to Message 2029006.  

Then just wait to get lucky and get some new work.
If you are that lucky, then better use that luck to buy a lottery ticket or something. I have received 5 tasks in the last 6 hours.
ID: 2029009 · Report as offensive
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2029011 - Posted: 24 Jan 2020, 18:50:29 UTC

Now my gpu ran out of tasks and the cpu will follow soon...

And looks like the same is happening to everyone else too. The results out in the field dropped rapidly from millions to thousands!
ID: 2029011 · Report as offensive
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2029012 - Posted: 24 Jan 2020, 19:10:37 UTC

I guess the numbers on the server status page are wrong. 'Results waiting for db purging' is zero, but I have almost 8000 tasks in the 'Valid' state, which should be included in that count. Also, 'results out in the field' dropped suspiciously fast, and Astropulse results out is only 71 although the AP splitters have been running all day. I have 20 AP tasks here and it's hard to believe that I would have 28% of all the AP tasks in the world!
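
The 28% figure is just the ratio of the two numbers in the post. A quick check, with variable names made up for the example:

```python
my_ap_tasks = 20        # AP tasks the poster reports holding
reported_ap_out = 71    # "Astropulse results out" on the status page

share = my_ap_tasks / reported_ap_out
print(f"{share:.1%}")   # prints 28.2%
```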
ID: 2029012 · Report as offensive
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 2029013 - Posted: 24 Jan 2020, 19:19:02 UTC

This will be a cold weekend if they don't kick the servers into action.
ID: 2029013 · Report as offensive
Oddbjornik Crowdfunding Project Donor * Special Project $75 donor * Special Project $250 donor
Volunteer tester
Joined: 15 May 99
Posts: 220
Credit: 349,610,548
RAC: 1,728
Norway
Message 2029015 - Posted: 24 Jan 2020, 19:38:33 UTC - in response to Message 2029012.  

I guess the numbers on the server status page are wrong. 'Results waiting for db purging' is zero but I have almost 8000 tasks in 'Valid' state where they should be included in that zero count. Also 'results out in the field' dropped suspiciously fast and Astropulse results out is only 71 although AP splitters have been running all day. I have 20 AP tasks here and it's hard to believe that I would have 28% of all the AP tasks in the world!
I don't think the actual numbers have changed in unexpected ways. They've just moved around on the status page, so there is little or no connection between the (current) labels and the numbers.
ID: 2029015 · Report as offensive
©2026 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.