The Server Issues / Outages Thread - Panic Mode On! (118)

Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2027765 - Posted: 15 Jan 2020, 22:16:18 UTC - in response to Message 2027759.  

So this explains why I have a heap of GPU work units queued up for processing, but no CPU-only WUs???


. . You have GPU WUs? I'll toss you for them ...

Stephen

:)
ID: 2027765 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 2027766 - Posted: 15 Jan 2020, 22:18:19 UTC

At present, if you have work you have been unable to report, try setting the project to "no new tasks" and try reporting again.
At least for me, this let a machine that hadn't been able to report send in ~1000 tasks without getting the HTTP error.
ID: 2027766 · Report as offensive
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2027767 - Posted: 15 Jan 2020, 22:18:46 UTC - in response to Message 2027764.  

Greetings,
I never thought I would ever see the day. My main is now running Einstein since it cannot get any WUs from SETI. My laptop has, my Pis have and I believe my other Linux PC did several hours ago. But! Not my main. So, in the interim, my Main will be doing Einstein.
Have a great day! :)
Siran

. . Since the outage ended I believe I have only received 2 WUs at all, on one machine only and that was hours ago :(

Stephen

:(
ID: 2027767 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 2027770 - Posted: 15 Jan 2020, 22:21:19 UTC

SSP would seem to indicate that everything is back up now.
No tasks available, of course ...
ID: 2027770 · Report as offensive
Profile Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 2027771 - Posted: 15 Jan 2020, 22:25:33 UTC - in response to Message 2027768.  

Greetings,

... Pis have and I believe my other Linux PC did several hours ago. But! Not my main. So, in the interim, my Main will be doing Einstein.


This reminds me, I also have a spare RPi 3B+ sitting around and collecting dust. Is there an ISO/image one can save to a microSD card, sign in and start processing? I'm a complete dummy when it comes to Linux and stuff like that. I'd like something SIMPLE to set up, on the order of Kodi or Raspbian in NOOBS. Actually I think a lot of people have Raspberry Pis hanging around that could be put to good use if a SIMPLE, bare-bones way to install it could be devised.

Hi Omega,

Go through this thread. There's a lot of information in it.

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2027771 · Report as offensive
Niteryder
Volunteer tester

Joined: 1 Mar 99
Posts: 64
Credit: 22,663,988
RAC: 18
United States
Message 2027772 - Posted: 15 Jan 2020, 22:26:51 UTC - in response to Message 2027744.  

Keith Myers, if everyone would quit fooling the system into thinking they had 44 GPUs on only 4 desktops, it would also take a load off the system.
ID: 2027772 · Report as offensive
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22816
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2027773 - Posted: 15 Jan 2020, 22:29:53 UTC

Good News - Beta is giving out work
Bad News - Main is somewhat constipated with the RTS at >1,000,000 and nothing getting out (for me)

As others have said - something has been done to the servers.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2027773 · Report as offensive
Profile Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 2027774 - Posted: 15 Jan 2020, 22:32:42 UTC - in response to Message 2027772.  

Keith Myers, if everyone would quit fooling the system into thinking they had 44 GPUs on only 4 desktops, it would also take a load off the system.

Hi Niteryder,

Keith isn't the only one "spoofing"; many others do the same. I'm kinda doing it by having 2 cards but only one doing work. The other drives my monitor. At least I have the cards to show for the number of WUs I get, though. ;)

I don't really think that "spoofing" puts that much more stress on the servers.

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2027774 · Report as offensive
Niteryder
Volunteer tester

Joined: 1 Mar 99
Posts: 64
Credit: 22,663,988
RAC: 18
United States
Message 2027775 - Posted: 15 Jan 2020, 22:34:56 UTC - in response to Message 2027774.  

Yeah, but Keith is complaining about everyone who is not "spoofing" getting more tasks because of the new limits.
ID: 2027775 · Report as offensive
Profile Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 2027777 - Posted: 15 Jan 2020, 22:57:54 UTC - in response to Message 2027775.  

Yeah, but Keith is complaining about everyone who is not "spoofing" getting more tasks because of the new limits.

Hi Niteryder,

I just re-read his post and don't see him saying anything about non-"spoofers" getting more WUs. He was talking about the "number of WUs per device" that was just increased a few weeks ago. It was 100 per device, now it's 200 CPU and I believe 300 GPU WUs per device. He was saying that the number should go back to what it was.

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2027777 · Report as offensive
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2027779 - Posted: 15 Jan 2020, 22:58:27 UTC - in response to Message 2027772.  

Keith Myers, if everyone would quit fooling the system into thinking they had 44 GPUs on only 4 desktops, it would also take a load off the system.
Spoofing won't impact the system at all as long as the spoofers are powerful systems processing their big caches before their wingmen. Spoofed 'in progress' tasks would be 'waiting for validation' without spoofing but the number of database rows stays the same.

But increasing the standard limits of all users does have a direct impact on database size.
ID: 2027779 · Report as offensive
Niteryder
Volunteer tester

Joined: 1 Mar 99
Posts: 64
Credit: 22,663,988
RAC: 18
United States
Message 2027780 - Posted: 15 Jan 2020, 23:02:03 UTC - in response to Message 2027777.  

He is complaining about someone getting 200 per device when he may be getting as much as 600 per device.
ID: 2027780 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 2027781 - Posted: 15 Jan 2020, 23:03:05 UTC

If the limits were the issue, or some of the other "load" type things folks have raised as possibilities, wouldn't the SSP reflect that in terms of excessive queries on the db (Master database queries/second)? In watching it, I don't see that number get very high. Seems there must be something else going on, and while there's no way to know, my wild guess is that ultimately it's a splitter throttle issue. Seems to me that it was when that was reflected on the SSP as a separate process that all this goofiness began, especially all the odd gyrations in scheduling server availability and response time.
All uninformed speculation, of course, but in the absence of any meaningful communication from the project folks, that's all there is. Dead horse, beaten.
ID: 2027781 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2027782 - Posted: 15 Jan 2020, 23:04:21 UTC
Last modified: 15 Jan 2020, 23:05:21 UTC

He is complaining about someone getting 200 per device when he may be getting as much as 600 per device.


There is a big difference between someone who spoofs the GPU count and manages the host well, like Keith or the others who do the same (maybe 20 users or fewer, me included), and changing the limits for the whole community.

In the case of the big crunchers, whose GPUs can crunch 1 or more WU per minute, a buffer of 100 WUs per GPU is simply not realistic. These crunchers need some way to increase the WU cache or they will constantly run out of work. But they are very few.

For the vast majority of users (maybe more than 50K), who run "set & forget" and produce maybe 20 WUs per day or fewer, a large WU cache is not needed and unnecessarily increases the size of the DB. That change is what we are talking about.
ID: 2027782 · Report as offensive
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2027784 - Posted: 15 Jan 2020, 23:08:40 UTC
Last modified: 15 Jan 2020, 23:14:12 UTC

The server status page seems to update only once every few hours, but the last time it updated it said there were over a million results ready to send. However, even when that information was fresh, both of my computers got just 'Project has no tasks available' over and over. Are the anonymous systems being discriminated against again, like during Christmas?

Edit: right after I typed that my bigger box received over a hundred tasks!
ID: 2027784 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 2027785 - Posted: 15 Jan 2020, 23:09:34 UTC - in response to Message 2027782.  

For the vast majority of users (maybe more than 50K), who run "set & forget" and produce maybe 20 WUs per day or fewer, a large WU cache is not needed and unnecessarily increases the size of the DB. That change is what we are talking about.
And the proper way to deal with that, for users of any production volume, is to set realistic cache size limits so that the process can self-regulate, rather than flailing about trying to find a sweet spot in externally imposed limits. If everyone set their caches for a max of 1-2 days, and stuck to that, actual device limits would be unneeded. Assuming, of course, that the client calculated the requirement accurately.
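
To put rough numbers on the 1-2 day cache idea above, here is a minimal sketch. It is not how the BOINC client actually estimates its work requirement; the function name `tasks_for_cache` and the per-task run times in the examples are assumptions made up for illustration.

```python
import math


def tasks_for_cache(cache_days: float, seconds_per_task: float, devices: int = 1) -> int:
    """Approximate how many tasks keep `devices` busy for `cache_days`.

    Hypothetical helper: assumes every task takes the same time and every
    device runs one task at a time, which real hosts only approximate.
    """
    if cache_days < 0 or seconds_per_task <= 0 or devices < 1:
        raise ValueError("cache_days >= 0, seconds_per_task > 0, devices >= 1 required")
    tasks_per_device = cache_days * 86_400 / seconds_per_task
    return math.ceil(tasks_per_device * devices)


# A fast GPU finishing one task per minute, 1-day cache:
print(tasks_for_cache(1, 60))
# A slow host needing 4 hours per task, 2-day cache:
print(tasks_for_cache(2, 4 * 3600))
```

Under these assumptions the same cache setting asks for very different task counts on a fast GPU and on a slow host, which is the self-regulation being argued for.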
ID: 2027785 · Report as offensive
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Send message
Joined: 7 Mar 03
Posts: 22816
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2027787 - Posted: 15 Jan 2020, 23:12:05 UTC - in response to Message 2027780.  

The 200/call (if that's what is being talked about) is a hard limit. The actual "ready for dispatch" queue is 200 work units long, so anyone needing more than that is going to need more than 1 call to fill their cache.
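
As a quick worked example of that arithmetic, here is a small sketch. The 200 figure is taken from the post above; the function name `calls_to_fill` is hypothetical, and the sketch assumes every call returns a full batch, which the reports in this thread suggest is often not the case.

```python
TASKS_PER_CALL = 200  # figure quoted in the post above, not a verified server setting


def calls_to_fill(cache_target: int, per_call: int = TASKS_PER_CALL) -> int:
    """Minimum number of scheduler calls needed to receive `cache_target` tasks.

    Assumes each call hands out up to `per_call` tasks (best case).
    """
    if cache_target < 0 or per_call < 1:
        raise ValueError("cache_target must be >= 0 and per_call >= 1")
    # Integer ceiling division avoids floating-point rounding.
    return (cache_target + per_call - 1) // per_call


print(calls_to_fill(150))  # fits in a single call
print(calls_to_fill(600))  # needs several calls
```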
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2027787 · Report as offensive
Niteryder
Volunteer tester
Joined: 1 Mar 99
Posts: 64
Credit: 22,663,988
RAC: 18
United States
Message 2027788 - Posted: 15 Jan 2020, 23:12:18 UTC

Jimbocous, I agree 100%.
ID: 2027788 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2027789 - Posted: 15 Jan 2020, 23:14:43 UTC - in response to Message 2027785.  
Last modified: 15 Jan 2020, 23:15:13 UTC

And the proper way to deal with that, for users of any production volume, is to set realistic cache size limits so that the process can self-regulate....

I agree, and there is a simple way to do that.

Each host could have a WU cache of up to the number of valid crunched WUs it returns per day. If your host produces 10, you could have a 10 WU cache; if your host produces 5000, your cache could be 5000.

Obviously some adjustment is needed for the slower devices that crunch less than 1 WU per day, but that could easily be solved by setting a minimum cache size of up to 4 WUs, for example.
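
The rule described above could be sketched roughly like this. It is only an illustration of the proposal, not anything the project's scheduler does; the names `cache_limit` and `MIN_CACHE` are made up, and treating the "4 WU" example as a fixed floor is an assumption.

```python
MIN_CACHE = 4  # example floor from the post; treated here as a fixed minimum (assumption)


def cache_limit(valid_per_day: float, min_cache: int = MIN_CACHE) -> int:
    """Per-host cache limit equal to the host's daily valid returns, with a floor.

    `valid_per_day` is the number of validated WUs the host returned per day,
    however the project would choose to measure it (not specified in the post).
    """
    if valid_per_day < 0:
        raise ValueError("valid_per_day must be >= 0")
    return max(min_cache, int(valid_per_day))


for produced in (0.5, 10, 5000):
    print(produced, "->", cache_limit(produced))
```

With this floor, a host that returns less than 1 WU per day still keeps a few tasks on hand, while a host returning 5000 per day gets a 5000 WU cache.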
ID: 2027789 · Report as offensive
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Send message
Joined: 7 Mar 03
Posts: 22816
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2027790 - Posted: 15 Jan 2020, 23:15:01 UTC - in response to Message 2027785.  

At one time there were no cache limits, and some people ended up with many thousands of tasks that they had no hope of completing in time, but they saw it as a "badge of honour" to have so many. That of course was in the days when it could take a fair chunk of a day to process a task (a bit like my RPis do today).
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2027790 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.