The Server Issues / Outages Thread - Panic Mode On! (119)

rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22220
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2040767 - Posted: 27 Mar 2020, 7:21:23 UTC - in response to Message 2040698.  

If you look at the date axis you will see that only three dates are displayed, 25th March, the initial data point, 26th March, the second data point, and the 27th March, "today". On "today" you will not see any significant change in data as the day hasn't finished and there's insufficient data so the total stays at the previous day's value.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2040767 · Report as offensive     Reply Quote
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22220
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2040768 - Posted: 27 Mar 2020, 7:29:59 UTC - in response to Message 2040764.  

A suggestion as to why this might be "true": the servers are struggling just now, for whatever reason, so it is taking longer to respond to any request for work. The time to respond is related to the amount of work requested, and the process is getting so slow that large requests time out and we get "server has no work" type messages in reply to our work requests.
Those that are making infrequent, large requests may be aiding the servers by not continually requesting work, but at the same time are not helping by asking for large amounts of work :-(
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2040768 · Report as offensive     Reply Quote
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2040769 - Posted: 27 Mar 2020, 7:33:28 UTC - in response to Message 2040765.  

I recall someone (Richard, perhaps) mentioning that the issue was that the "fast-burners" are naturally requesting more replacement work, and if the number of seconds of work requested exceeds the available, you get nothing rather than a lower proportion.
If this were the case, then no one would ever get any more work once his cache had dropped low enough for the remaining space to exceed the size of the scheduler work queue.

The only situation where a big request matters is when the scheduler actually had a lot of work and tried to send it to you, but this made it run out of available computing resources, causing the scheduler request to end in an error.

I have observed that in the post-downtime rushes, where we got a lot of HTTP errors when trying to refill our caches, the size of the request did matter. Setting NNT was the best way to make the request work so that you could at least report your completed tasks. And asking for a smaller amount of work had a better chance of not ending in an error than asking for a lot. But this required tedious micromanaging of your cache settings as your cache filled.
ID: 2040769 · Report as offensive     Reply Quote
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13746
Credit: 208,696,464
RAC: 304
Australia
Message 2040771 - Posted: 27 Mar 2020, 7:57:38 UTC

After about an hour and a half I've managed to pick up some more work on 2 consecutive requests (although not enough to replace what's been returned), then it was back to "Project has no tasks available" again.
Grant
Darwin NT
ID: 2040771 · Report as offensive     Reply Quote
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2040775 - Posted: 27 Mar 2020, 8:24:15 UTC

My faster host got a bunch of 61 tasks. And then the next five requests didn't even ask for anything, due to stuck downloads. The latest request, the first one after those five, got 2 more tasks.

During this same time my slower host has got absolutely nothing.
ID: 2040775 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2040788 - Posted: 27 Mar 2020, 10:13:50 UTC - in response to Message 2040769.  

I recall someone (Richard, perhaps) mentioning that the issue was that the "fast-burners" are naturally requesting more replacement work, and if the number of seconds of work requested exceeds the available, you get nothing rather than a lower proportion.
I've certainly made speculations along those lines, but not exactly that. My theory is that the very big requests take longer to process - both parsing the request file, and examining candidate tasks for suitability. Remember that there are rules like 'You can't be your own wingmate'. For every task, the scheduler has to check that the other copy hasn't been sent to another of your machines - lots of queries. That's probably another of BOINC's design inefficiencies.

But asking for less seems to help, when work is in limited supply.
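
A rough illustration of why a per-task check like that would make big requests slower. This is only a toy model, not the BOINC scheduler code: the names (`Task`, `pick_tasks`, `hosts_with_copy`) and the data layout are made up for the example, and each call to `hosts_with_copy` stands in for one database query.

```python
from dataclasses import dataclass


@dataclass
class Task:
    workunit: str
    est_seconds: float


def pick_tasks(candidates, my_hosts, requested_seconds, hosts_with_copy):
    """Toy model: fill a work request, skipping workunits the user already holds.

    hosts_with_copy(workunit) is a hypothetical lookup returning the set of
    host IDs that already have a copy; it stands in for a database query.
    Returns (tasks_to_send, number_of_lookups).
    """
    sent, granted, lookups = [], 0.0, 0
    for task in candidates:
        if granted >= requested_seconds:
            break
        lookups += 1  # one query per candidate examined
        if hosts_with_copy(task.workunit) & my_hosts:
            continue  # the other copy is on one of my machines: not eligible
        sent.append(task)
        granted += task.est_seconds
    return sent, lookups


# Hypothetical queue: 100 tasks of 56 s each; "wu3" is already on host 1.
candidates = [Task(f"wu{i}", 56.0) for i in range(100)]
copies = {"wu3": {1}}


def lookup(workunit):
    return copies.get(workunit, set())


small, small_lookups = pick_tasks(candidates, {1, 2}, 112.0, lookup)
big, big_lookups = pick_tasks(candidates, {1, 2}, 3456.0, lookup)
print(len(small), small_lookups)
print(len(big), big_lookups)
```

In this model the number of lookups grows with the number of seconds requested, so a host asking for a whole cache refill costs the server far more queries than one asking for a couple of tasks. Whether the real scheduler behaves this way is still only a theory.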
ID: 2040788 · Report as offensive     Reply Quote
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2040789 - Posted: 27 Mar 2020, 10:33:35 UTC - in response to Message 2040788.  

I've been maintaining, hit and miss, all night. Not big downloads, but a few here, a few there. It looks like almost everything is running off the primary database, but very slowly. As you'd expect if that's the case, it just might not be able to handle massive numbers right now. My cache has since dwindled back to the mid 300s again, but I'm slowing that down to take care of a lot of AP files thrown my way, so I'm not processing anything strictly SETI right now. Maybe that'll help build my cache up over the next little while.
ID: 2040789 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2040792 - Posted: 27 Mar 2020, 11:03:54 UTC

Modest amounts of newly split work are starting to emerge from the scheduler, as they often do at this time of day. One of my fast crunchers (top 100) was completely dry, so I allowed a little Einstein. Time to prime the pump....

27/03/2020 10:59:59 | SETI@home | Reporting 1 completed tasks
27/03/2020 10:59:59 | SETI@home | [sched_op] NVIDIA GPU work request: 3456.00 seconds; 0.00 devices
27/03/2020 11:00:04 | SETI@home | Scheduler request completed: got 2 new tasks
27/03/2020 11:00:04 | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 112 seconds
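
For anyone who wants to compare what was asked for with what arrived, here is a minimal Python sketch that pulls the numbers out of log lines like the ones above. The regular expressions are assumptions based only on the four lines quoted here, and `summarise` is a made-up helper name; other client versions may word these messages differently.

```python
import re

LOG = """\
27/03/2020 10:59:59 | SETI@home | Reporting 1 completed tasks
27/03/2020 10:59:59 | SETI@home | [sched_op] NVIDIA GPU work request: 3456.00 seconds; 0.00 devices
27/03/2020 11:00:04 | SETI@home | Scheduler request completed: got 2 new tasks
27/03/2020 11:00:04 | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 112 seconds
"""

# Patterns guessed from the quoted lines only.
REQUESTED = re.compile(r"work request: ([\d.]+) seconds")
GOT = re.compile(r"got (\d+) new tasks")
GRANTED = re.compile(r"estimated total .* task duration: ([\d.]+) seconds")


def summarise(log_text):
    """Return (seconds requested, tasks received, estimated seconds received)."""
    requested = granted = 0.0
    tasks = 0
    for line in log_text.splitlines():
        if m := REQUESTED.search(line):
            requested += float(m.group(1))
        elif m := GOT.search(line):
            tasks += int(m.group(1))
        elif m := GRANTED.search(line):
            granted += float(m.group(1))
    return requested, tasks, granted


requested, tasks, granted = summarise(LOG)
print(f"asked for {requested:.0f} s, got {tasks} tasks worth {granted:.0f} s "
      f"({100 * granted / requested:.1f}% of the request)")
```

For this one request the server supplied roughly 3% of the seconds asked for.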
ID: 2040792 · Report as offensive     Reply Quote
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2040793 - Posted: 27 Mar 2020, 11:10:33 UTC

Server status
SETI@home server status information is also available in XML.

[As of 27 Mar 2020, 11:00:03 UTC]

>1 hr ago.

Are you sure it is new work, not just resends?
ID: 2040793 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2040794 - Posted: 27 Mar 2020, 11:17:57 UTC - in response to Message 2040793.  

Are you sure it is new work, not just resends?
I checked the other machines. These were all new work, received on slower machines with low, but functional, caches.

27/03/2020 10:47:37 | SETI@home | Scheduler request completed: got 6 new tasks
27/03/2020 10:48:41 | SETI@home | Scheduler request completed: got 20 new tasks
27/03/2020 10:49:40 | SETI@home | Scheduler request completed: got 5 new tasks
27/03/2020 10:54:51 | SETI@home | Scheduler request completed: got 11 new tasks
ID: 2040794 · Report as offensive     Reply Quote
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2040796 - Posted: 27 Mar 2020, 11:22:08 UTC

27-Mar-2020 06:20:10 [SETI@home] Reporting 58 completed tasks
27-Mar-2020 06:20:10 [SETI@home] Requesting new tasks for CPU and NVIDIA GPU
27-Mar-2020 06:20:22 [SETI@home] Scheduler request completed: got 0 new tasks
27-Mar-2020 06:20:22 [SETI@home] Project has no tasks available
27-Mar-2020 06:20:22 [SETI@home] Project requested delay of 303 seconds


No joy here.
ID: 2040796 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2040797 - Posted: 27 Mar 2020, 11:23:40 UTC - in response to Message 2040796.  

Set <sched_op_debug>. Show us how many seconds you requested.
ID: 2040797 · Report as offensive     Reply Quote
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2040811 - Posted: 27 Mar 2020, 12:03:59 UTC - in response to Message 2040797.  
Last modified: 27 Mar 2020, 12:07:03 UTC

Set <sched_op_debug>. Show us how many seconds you requested.

I manually removed those lines to avoid misunderstandings.
You know my client works a little differently.
It asks for just 1 second of work per logical device,
instead of the huge number of seconds the regular client asks for.
But no problem at all, I can wait for the servers to get back to regular operation.
ID: 2040811 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2040814 - Posted: 27 Mar 2020, 12:31:17 UTC - in response to Message 2040811.  

Understood. Probably not the best machine to pick for testing the 'big requests are less likely to succeed' hypothesis.

I'm continuing to get sporadic allocations, but rarely:

27/03/2020 11:40:23 | SETI@home | Scheduler request completed: got 10 new tasks
Maybe one lonely splitter is still struggling along? Certainly not a full flood of splitting yet.
ID: 2040814 · Report as offensive     Reply Quote
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2040827 - Posted: 27 Mar 2020, 13:15:46 UTC - in response to Message 2040814.  

Rarely getting any updates myself. No communication blocks, just nothing being delivered. Everything still on the primary database, and still 2 or more hours behind. 20mr20ac and ad on the splitters, but nothing in the store. Tasks are the SETI@Home equivalent to butt wipe.
ID: 2040827 · Report as offensive     Reply Quote
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2040828 - Posted: 27 Mar 2020, 13:21:27 UTC - in response to Message 2040814.  
Last modified: 27 Mar 2020, 13:23:02 UTC

Maybe one lonely splitter is still struggling along? Certainly not a full flood of splitting yet.

As of the last SSP update there are only 2 Arecibo tapes, with a few MB splitters working on them, and 1 tape for AP.
Surely not enough production to feed all the hungry hosts.
IMHO the COVID-19 crisis adds a lot of trouble to just keeping the servers running in these last days before the hibernation.
I know most of the operations could be done remotely, but why stress over something that will shut down in 3 days when there are a lot more important things happening in the world right now? They could work from home, yes, but they are human and have families and relatives to care for. Better to stay with them. Not sure how CA is handling the quarantine, but here even going to a grocery store is a hard task.
ID: 2040828 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2040832 - Posted: 27 Mar 2020, 13:29:54 UTC

The key task counts on the SSP are also abnormally high, and we would expect them to be well into the 'inhibit splitters' red zone. I've had a lot of noise bombs in the last few days, both from the grotbag of old tapes and from brand new recordings. Under the present validation regime, all of those will need re-checking.
ID: 2040832 · Report as offensive     Reply Quote
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19075
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2040835 - Posted: 27 Mar 2020, 13:43:59 UTC

I also see looking through my tasks that the only tasks that have been validated since about 18:00 GMT yesterday are the _2 or higher that I have returned, even though in most cases both initial wingmen have reported in.
ID: 2040835 · Report as offensive     Reply Quote
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2040837 - Posted: 27 Mar 2020, 13:45:41 UTC - in response to Message 2040828.  

Not sure how CA is handling the quarantine, but here even going to a grocery store is a hard task.

I doubt this is in any way impacted by the virus. CA is in shelter-in-place status right now, which pretty much means stay at home as much as you can. Drive-throughs are still open, most convenience stores, etc. You can't sit in a restaurant, you have to stay 6 ft from people in line at any stores which are open, and so forth. You can even work, as long as you can stay 6 ft from everyone else. As far as quarantines are concerned, like all things American, we have a pretty good, luxurious life and complain because it isn't what we want. How are things in Panama? I haven't been there since...1993ish. Went through the canal 4 times. Beautiful country. I think Richard has hit the nail on the head. Given my earlier comments about how everything is running on the primary DB, and all else which is going on, things are just really bogged down right now, and it may remain this way until this gets shut down.

sudo shutdown now.
ID: 2040837 · Report as offensive     Reply Quote
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2040842 - Posted: 27 Mar 2020, 14:01:43 UTC - in response to Message 2040837.  
Last modified: 27 Mar 2020, 14:25:14 UTC

How are things in Panama? I haven't been there since...1993ish. Went through the canal 4 times. Beautiful country.
We are on a complete lockdown. Only supermarkets, grocery stores, pet food stores, banks and fuel stations are open, and they are prohibited from selling anything with alcohol (liquors). Restaurants and fast food places are to-go only. Everything else is totally closed, including airports and borders. The canal is still open to marine traffic, but the people on the ships cannot disembark. You can only go outside to buy food or medicine, and even that is only allowed for 2 hours per day, at a time of day determined by your ID number: effectively 1 hour for the shopping itself and the rest for going to and returning from your home. At least we have very few deaths (9) and cases (674), but as you know we are a very small country with just 4 million inhabitants. OK, it's off topic. Moderator, please forgive me.
ID: 2040842 · Report as offensive     Reply Quote


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.