Panic Mode On (104) Server Problems?

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1841315 - Posted: 11 Jan 2017, 4:36:46 UTC - in response to Message 1841284.  

The kitties are wondering if the GPUs are gonna run outta work if today's outrage is another 8 hour saga.
What with all the Arecibo shorties in the crunchers' caches.

Meow?

At a guess, I would say it is just the tapes they loaded to process, not expecting that GBT data would not be splitting. I like the idea of shorter runtimes because to me it feels as if more work is getting done. I am reasonably confident that people will have different opinions on that. In no way am I wanting to start a debate.
Did your GPUs manage to hold their own?
ID: 1841315 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1841316 - Posted: 11 Jan 2017, 4:37:17 UTC - in response to Message 1841284.  

The kitties are wondering if the GPUs are gonna run outta work if today's outrage is another 8 hour saga.
What with all the Arecibo shorties in the crunchers' caches.

Meow?


. . 8 hours? I wish! It was at least 10 hours by my reckoning. But worse still, while I was expecting a flood of guppies, even worse has occurred ... a flood of "Project has no tasks!" ....

. . AAARRRGGG!

. . The Baby ran out of work nearly 6 hours before the system came back up, and now, still no work :(

Stephen

:(
ID: 1841316 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1841318 - Posted: 11 Jan 2017, 4:57:47 UTC - in response to Message 1841316.  
Last modified: 11 Jan 2017, 5:00:12 UTC

Out of GPU work, and still unable to get any.

And I notice Centurion is still down.


EDIT- should have posted this sooner. 3 seconds after posting, I got some work. Hopefully it will be enough to not run out before the next batch becomes available.

EDIT- not looking good, the first 3 WUs were noisy. And probably half the WUs downloaded are shorties.
Grant
Darwin NT
ID: 1841318 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1841324 - Posted: 11 Jan 2017, 5:18:56 UTC - in response to Message 1841318.  
Last modified: 11 Jan 2017, 5:22:28 UTC

Out of GPU work, and still unable to get any.

And I notice Centurion is still down.


EDIT- should have posted this sooner. 3 seconds after posting, I got some work. Hopefully it will be enough to not run out before the next batch becomes available.

EDIT- not looking good, the first 3 WUs were noisy. And probably half the WUs downloaded are shorties.


. . Hi Grant,

. . Like you, within a short time of posting that there were no new tasks I got about 20 or so on The Baby, but still a long way short of a full cache of 200. Hopefully there will be more before that little bit runs out.

. . Also like you I was expecting a flood of Guppies with Centurion restored during the outage but it must have a serious problem :( Still only Arecibo tasks when you can get them.

Stephen

.
ID: 1841324 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 1841329 - Posted: 11 Jan 2017, 5:26:35 UTC - in response to Message 1841324.  


. . Like you, within a short time of posting that there were no new tasks I got about 20 or so on The Baby, but still a long way short of a full cache of 200. Hopefully there will be more before that little bit runs out.
.

My fault! My caches are now full (lol).
Must have been a long outage today, my "heavy hitter" actually ran out of CPU work today, and that almost never happens ...
ID: 1841329 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1841342 - Posted: 11 Jan 2017, 6:32:33 UTC - in response to Message 1841329.  


. . Like you, within a short time of posting that there were no new tasks I got about 20 or so on The Baby, but still a long way short of a full cache of 200. Hopefully there will be more before that little bit runs out.
.

My fault! My caches are now full (lol).
Must have been a long outage today, my "heavy hitter" actually ran out of CPU work today, and that almost never happens ...

Quit stealing all my work !! ;-^ } I have only been able to get a single task per machine every hour or so since the project came back. I'm still running backup projects because of a lack of SETI work. These latest outages take a full day to build up my normal 200 task cache per machine.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1841342 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1841347 - Posted: 11 Jan 2017, 6:49:17 UTC - in response to Message 1841342.  

I've just about got a full cache again.
Luckily I got hit with a whole slab of Guppie resends, so that will help keep the system busy between eventual successful requests for work.
Grant
Darwin NT
ID: 1841347 · Report as offensive
Profile marsinph
Volunteer tester

Joined: 7 Apr 01
Posts: 172
Credit: 23,823,824
RAC: 0
Belgium
Message 1841354 - Posted: 11 Jan 2017, 7:17:54 UTC
Last modified: 11 Jan 2017, 7:22:54 UTC

Only one task received since Seti came back !!! That is to say, about 6 hours ago.
No work available !?
But on server status the creation rate is about 37WU/sec !?!? Who can explain ?
ID: 1841354 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 1841355 - Posted: 11 Jan 2017, 7:18:49 UTC - in response to Message 1841342.  

Quit stealing all my work !! ;-^ } I have only been able to get a single task per machine every hour or so since the project came back. I'm still running backup projects because of a lack of SETI work. These latest outages take a full day to build up my normal 200 task cache per machine.

SETI loves me
This I know
Cause my caches tell me so

lol

Interesting twist here is that since it's all Arecibo stuff coming out, and mostly not VLARs right now, every time GR runs it slams all the jobs over to the GPU and the CPUs get empty and have to ask for more work.
Terrible thing!
ID: 1841355 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 1841356 - Posted: 11 Jan 2017, 7:21:10 UTC - in response to Message 1841354.  
Last modified: 11 Jan 2017, 7:24:42 UTC

Only one task received since Seti came back !!!
No work available !?
But on server status the creation rate is about 37WU/sec !?!? Who can explain ?

Market economics
aka
demand vs. supply

We're all slamming the scheduler asking for work, and it's getting handed out as fast as it's created.
Won't see a surplus until all the caches get filled up on folks' machines.

edit

For a bit more background, the BL splitter server died, so no BLC stuff from Green Bank until they get it fixed. Basically, a 50-75% reduction in capacity to generate work right now, coupled with the demand built up during weekly maintenance, means it is very slow to get caught up.
ID: 1841356 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51478
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1841360 - Posted: 11 Jan 2017, 7:32:50 UTC - in response to Message 1841315.  

The kitties are wondering if the GPUs are gonna run outta work if today's outrage is another 8 hour saga.
What with all the Arecibo shorties in the crunchers' caches.

Meow?

At a guess, I would say it is just the tapes they loaded to process, not expecting that GBT data would not be splitting. I like the idea of shorter runtimes because to me it feels as if more work is getting done. I am reasonably confident that people will have different opinions on that. In no way am I wanting to start a debate.
Did your GPUs manage to hold their own?

Dunno if they ran out or not, as I was gone to work.
I am assuming they did at some point, but my RAC still went up, so it may not have been too bad.
Caches not full at present, but all rigs appear to have work.
With the extended outage and the splitter output all short Arecibo tasks, it will take some time for things to catch up, for sure.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1841360 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1841386 - Posted: 11 Jan 2017, 11:07:38 UTC
Last modified: 11 Jan 2017, 11:46:07 UTC

Don't see this often, 4 inconclusives http://setiathome.berkeley.edu/result.php?resultid=5423396047

Let's see if I can make it 5 :)

EDIT: One more http://setiathome.berkeley.edu/workunit.php?wuid=2331080925
ID: 1841386 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1841396 - Posted: 11 Jan 2017, 12:09:54 UTC - in response to Message 1841347.  

I've just about got a full cache again.
Luckily I got hit with a whole slab of Guppie resends, so that will help keep the system busy between eventual successful requests for work.


. . I never thought I would hear these words leave my fingers under such circumstances but ... lucky you! :)

Stephen

.
ID: 1841396 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1841567 - Posted: 12 Jan 2017, 4:15:12 UTC
Last modified: 12 Jan 2017, 4:22:25 UTC

Is it my imagination or are pages loading very slowly? I am in New Zealand
ID: 1841567 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1841571 - Posted: 12 Jan 2017, 4:33:11 UTC

Well, this has been interesting. My Windows 10 machine has a full cache of work. My Windows 7 machines have zero GPU work. Looks like the outrage is going to cause at least a 3-4 day lack of GPU work for my fast machines.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1841571 · Report as offensive
Profile Dr Grey

Joined: 27 May 99
Posts: 154
Credit: 104,147,344
RAC: 21
United Kingdom
Message 1841599 - Posted: 12 Jan 2017, 8:06:23 UTC - in response to Message 1841571.  

Well, this has been interesting. My Windows 10 machine has a full cache of work. My Windows 7 machines have zero GPU work. Looks like the outrage is going to cause at least a 3-4 day lack of GPU work for my fast machines.


Current result creation rate: 30.4548/sec
Results received in last hour: 110,406. 110,406 / 3600 = 30.668 / sec

So it looks like the splitters can't keep up.
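
For anyone who wants to check that comparison, here is a minimal Python sketch using only the two figures quoted above. The variable names are my own, and it assumes the server status page reports the creation rate per second and the returns as an hourly total.

```python
# Figures quoted from the server status page in the post above.
creation_rate = 30.4548        # results created per second (snapshot value)
results_last_hour = 110_406    # results received in the last hour

# Convert the hourly total into an average per-second return rate.
return_rate = results_last_hour / 3600

# Negative net rate: results are coming back faster than new ones are made.
net_rate = creation_rate - return_rate

print(f"return rate: {return_rate:.3f}/sec")
print(f"net rate:    {net_rate:+.3f}/sec")
```

A negative net rate only says the splitters were behind at the moment of that snapshot, not that they stay behind.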
ID: 1841599 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1841600 - Posted: 12 Jan 2017, 8:11:30 UTC - in response to Message 1841599.  

So it looks like the splitters can't keep up.

It's been a long standing issue with the PFB splitters; several of them end up on the one file, and that slows the splitting down significantly.
Grant
Darwin NT
ID: 1841600 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22536
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1841605 - Posted: 12 Jan 2017, 8:41:08 UTC - in response to Message 1841599.  

The splitting rate shown is an instantaneous figure, compared to the long (hour?) sample used to generate the return rate. As I type the splitting rate is over 35/sec, so should be "more than capable" of keeping up with the demand (based on the results returned), which is just over 31/sec.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1841605 · Report as offensive
Profile Dr Grey

Joined: 27 May 99
Posts: 154
Credit: 104,147,344
RAC: 21
United Kingdom
Message 1841606 - Posted: 12 Jan 2017, 9:08:56 UTC - in response to Message 1841605.  

The splitting rate shown is an instantaneous figure, compared to the long (hour?) sample used to generate the return rate. As I type the splitting rate is over 35/sec, so should be "more than capable" of keeping up with the demand (based on the results returned), which is just over 31/sec.


So that is a surplus of 4/sec, which is enough to fill a 100 WU cache every 25 seconds. That's fast. But with 162,000 active hosts, it will take a while to get ahead of the pack.
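
A rough sketch of that arithmetic in Python, for the curious. The function name is hypothetical, and the last figure rests on a deliberately pessimistic assumption that is not from the thread: that every one of the 162,000 hosts needs a full 100 WU refill at the same time.

```python
def seconds_to_fill(work_units: int, surplus_per_sec: float) -> float:
    """Seconds for a given surplus rate to produce that many work units."""
    return work_units / surplus_per_sec

splitting_rate = 35.0   # WU/sec, rounded from "over 35/sec" quoted above
return_rate = 31.0      # WU/sec, rounded from "just over 31/sec" quoted above
surplus = splitting_rate - return_rate

# One 100 WU cache at the surplus rate.
one_cache = seconds_to_fill(100, surplus)

# Worst case (assumption): all 162,000 hosts each need 100 WUs.
all_hosts_days = seconds_to_fill(100 * 162_000, surplus) / 86_400

print(f"one cache: {one_cache:.0f} seconds")
print(f"all hosts: {all_hosts_days:.1f} days")
```

Most hosts will need far less than a full refill, so the real catch-up time should be much shorter than the worst case, but it shows why a 4/sec surplus does not clear the backlog quickly.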
ID: 1841606 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1841607 - Posted: 12 Jan 2017, 9:13:15 UTC - in response to Message 1841606.  

So that is a surplus of 4/sec, which is enough to fill a 100 WU cache every 25 seconds. That's fast. But with 162,000 active hosts, it will take a while to get ahead of the pack.

As long as it continues to produce work at that rate. Sometimes it's faster, but at other times (like the most recent update) it's slower.
Only 26/sec. Nowhere near fast enough.
Grant
Darwin NT
ID: 1841607 · Report as offensive

 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.