Panic Mode On (100) Server Problems?

kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1718598 - Posted: 26 Aug 2015, 15:29:11 UTC - in response to Message 1718597.  

Well, something is wrong. Even though the creation rate shows high output, I see that there are no tasks available. I have been getting the "Project has no tasks available" message now for the last half hour on both systems.

Yeah, well.............
Results returned is also at 104k, indicating a shorty storm.
So, unless the splitters can stay at the high water mark for some time, there is going to be a shortage of work going out for a bit.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1718598 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1718625 - Posted: 26 Aug 2015, 16:32:27 UTC

RTS for both MB + AP = 0. This is cause for PANIC
ID: 1718625 · Report as offensive
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1718644 - Posted: 26 Aug 2015, 17:25:24 UTC

Re: RTS=0 - Lots of stuff hitting the science database, thus causing general indigestion. One of these is the weekly backup, which should end any minute now. Hopefully that will be enough to push things through without much additional intervention.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1718644 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1718648 - Posted: 26 Aug 2015, 17:32:03 UTC - in response to Message 1718644.  

Thanks for the info Matt!!
ID: 1718648 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1718649 - Posted: 26 Aug 2015, 17:33:54 UTC

Thanks, Matt.
We knew you were keeping on top of it....LOL.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1718649 · Report as offensive
djmotiska

Joined: 26 Jul 01
Posts: 20
Credit: 29,378,647
RAC: 105
Finland
Message 1718766 - Posted: 26 Aug 2015, 21:16:24 UTC

I wish Matt & co could enable lost tasks resend. Got 65 ghosts on Monday, oops. Current tasks will probably last until tomorrow.
ID: 1718766 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1718773 - Posted: 26 Aug 2015, 21:23:56 UTC - in response to Message 1718766.  

I wish Matt & co could enable lost tasks resend. Got 65 ghosts on Monday, oops. Current tasks will probably last until tomorrow.

I know, but it apparently takes a toll on the database server.
And it has been coping amazingly well with the WOW onslaught.
Maybe after the event is done?
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1718773 · Report as offensive
djmotiska

Joined: 26 Jul 01
Posts: 20
Credit: 29,378,647
RAC: 105
Finland
Message 1718785 - Posted: 26 Aug 2015, 21:47:11 UTC - in response to Message 1718773.  

I wish Matt & co could enable lost tasks resend. Got 65 ghosts on Monday, oops. Current tasks will probably last until tomorrow.

I know, but it apparently takes a toll on the database server.


I know that. I don't remember whether the current level of ~1500 queries/sec is normal, though I remember seeing higher figures.

And it has been coping amazingly well with the WOW onslaught.
Maybe after the event is done?


Let's hope for the best and fear the worst. Luckily all those ghost WUs were VLARs, so they are not timing out anytime soon.

I got those ghosts when I changed the CPU app from sse2 to sse3 and forgot to change the reference to cmdline.txt in app_info.xml. At about the same time I updated the OpenCL app; the new r2929 is 15-20% faster than the old r2033. I also noticed I don't have to keep a core free to feed the GPU. Good job, Raistmer!
ID: 1718785 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1718787 - Posted: 26 Aug 2015, 21:49:21 UTC - in response to Message 1718785.  

I wish Matt & co could enable lost tasks resend. Got 65 ghosts on Monday, oops. Current tasks will probably last until tomorrow.

I know, but it apparently takes a toll on the database server.


I know that. I don't remember whether the current level of ~1500 queries/sec is normal, though I remember seeing higher figures.

And it has been coping amazingly well with the WOW onslaught.
Maybe after the event is done?


Let's hope for the best and fear the worst. Luckily all those ghost WUs were VLARs, so they are not timing out anytime soon.

I got those ghosts when I changed the CPU app from sse2 to sse3 and forgot to change the reference to cmdline.txt in app_info.xml. At about the same time I updated the OpenCL app; the new r2929 is 15-20% faster than the old r2033. I also noticed I don't have to keep a core free to feed the GPU. Good job, Raistmer!

If they time out, they shall simply be resent. Hopefully the next cruncher in line has better luck with them.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1718787 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1718891 - Posted: 27 Aug 2015, 1:15:21 UTC - in response to Message 1718870.  

Yes, it seems to do that all too often.
ID: 1718891 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1718963 - Posted: 27 Aug 2015, 6:20:54 UTC - in response to Message 1718870.  
Last modified: 27 Aug 2015, 6:21:55 UTC

And again, the replica DB went poof.
Yes, it seems to do that all too often.

This may be to reduce strain on the database so it can dedicate more resources to the splitters.
ID: 1718963 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1718977 - Posted: 27 Aug 2015, 6:34:33 UTC - in response to Message 1718963.  

I see the splitters are still not up to the task.
Grant
Darwin NT
ID: 1718977 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1718979 - Posted: 27 Aug 2015, 6:36:49 UTC

Kitties are still OK, caches are keeping up.
No worries here.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1718979 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1718990 - Posted: 27 Aug 2015, 6:52:59 UTC - in response to Message 1718979.  

Kitties are still OK, caches are keeping up.

As are my caches.
However, the fact that the splitters aren't keeping up indicates an issue that should be resolved. As systems become more powerful, as they have since the beginning of Seti@home, more & more work will be done.
It would be a shame if in the future people are unable to get work, even though there is plenty available; it's just that it can't be produced fast enough.
Why wait till that occurs? Why not sort it out now?
Makes more sense to me to avoid issues, not wait for them to occur & then try to resolve them.
Grant
Darwin NT
ID: 1718990 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1719025 - Posted: 27 Aug 2015, 9:18:01 UTC - in response to Message 1718990.  

I agree with Grant, splitters are just too slow at the moment. Results received in last hour is above 100,000 all the time, so Current result creation rate should be at least 28/sec to keep it even.
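
A rough check of that 28/sec figure, as a minimal sketch. It assumes the ~100,000 results/hour return rate is sustained and that each returned result has to be replaced by one newly created result (an assumption; the server status page doesn't spell that out). The function name is just for illustration.

```python
import math

SECONDS_PER_HOUR = 3600


def break_even_rate(results_per_hour: float) -> float:
    """Creation rate (results/sec) needed to match a given hourly return rate.

    Assumes one new result must be created for every result returned.
    """
    return results_per_hour / SECONDS_PER_HOUR


rate = break_even_rate(100_000)
# Round up: anything below this and the ready-to-send queue shrinks.
print(f"{rate:.2f} results/sec, so at least {math.ceil(rate)}/sec")
```

So 100,000 per hour works out to a bit under 28 per second, and rounding up gives the figure quoted.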

I just hope that they have time and manpower to sort it out...
ID: 1719025 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1719027 - Posted: 27 Aug 2015, 9:34:13 UTC - in response to Message 1719025.  

I agree with Grant, splitters are just too slow at the moment. Results received in last hour is above 100,000 all the time, so Current result creation rate should be at least 28/sec to keep it even.

I just hope that they have time and manpower to sort it out...

If you read Matt's technical posts, all three of time, manpower, and Arecibo recording time are in short supply.

What time and manpower are available are being concentrated on working up the alternative tools, so we can search the data from other telescopes and hence other parts of the sky. Keeping every volunteer supplied with every WU they ask for isn't the highest of their priorities, and to be honest I don't think it should be.
ID: 1719027 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1719035 - Posted: 27 Aug 2015, 10:13:15 UTC - in response to Message 1719027.  

Keeping every volunteer supplied with every WU they ask for isn't the highest of their priorities, and to be honest I don't think it should be.

Every cruncher supplied with every WU they ask for shouldn't be a major priority (let alone one of the highest), I agree.
But the goal of the project is to process data, and if it's not being split, then it can't be processed. And as we've seen with the move away from MB to AP on the part of many of the more vocal crunchers, credit is an important factor in people choosing to crunch for Seti, and a lack of work has a big effect on the amount of credit people get.

So while we are aware there won't always be data available to crunch, and the project won't be up 24/7/365, it would be nice if while it was up, and there is data to crunch, it's possible to do so.
Hence among the hundreds of other things that need doing, some attention to the ongoing splitting issues would be appreciated.
Grant
Darwin NT
ID: 1719035 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1719036 - Posted: 27 Aug 2015, 10:19:57 UTC
Last modified: 27 Aug 2015, 10:20:34 UTC

We simply don't get enough tapes from Arecibo, but there is work from Green Bank.


With each crime and every kindness we birth our future.
ID: 1719036 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1719037 - Posted: 27 Aug 2015, 10:34:21 UTC - in response to Message 1719036.  

We simply don't get enough tapes from Arecibo, but there is work from Green Bank.

Having raw data is a whole other issue IMHO.
If we don't have any, we don't have any. Simple.
But if we do have it, it's not much good if we can't process it, and it has to be split in order to process it.
Splitter output has been an issue ever since the PFB splitters came on line, and as processing power increases it has become, and will become, even more of an issue.
Sure, it doesn't matter how good Seti's hardware & software is if there is no data to process. But if we have the data, it would be nice to be able to process it.
Grant
Darwin NT
ID: 1719037 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22200
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1719039 - Posted: 27 Aug 2015, 10:46:11 UTC

For those who don't bother to read Matt's post in Tech News on 21 August:

Message 1716662 - Posted: 21 Aug 2015, 22:47:12 UTC

Those panicking about a coming storm due to lack of data... The well is pretty dry but Jeff and I just uncovered a stash of tapes from 2011 that require some re-analysis, so that's why you'll see a bunch showing up in splitter queue over the weekend (hopefully before the results-to-send queue drops to zero).

In the meantime, we are still recording data at AO (not fast enough to keep our crunchers supplied), but.... this situation has really pushed us to devote more resources to finally finishing the GBT splitter, which will avail to us another reserve supply of data in case we hit another dry spell.

The network switch on Tuesday seems to have gone fairly well. We are now sending all our bits over the campus net just like the very old days.

- Matt


So, the tapes being split just now have already been split once, but are being culled for more usable data. More effort is being put into the splitter for the GBT data (coo, what a surprise - its "tapes" are not the same format as those from AO...) Having two data sources should really improve the situation somewhat.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1719039 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.