Panic Mode On (78) Server Problems? |
Message boards : Number crunching : Panic Mode On (78) Server Problems?
| Author | Message |
|---|---|
Anybody got any idea whether we're being haunted by Astropulse ghosts, to the same degree? Have not crunched any AP tasks in days, but I see 2 among the ghosts. | |
| ID: 1302176 · | |
"Anybody got any idea whether we're being haunted by Astropulse ghosts, to the same degree?" I would just stop the AP splitters. MB was chugging along just fine until the AP work entered the picture again. ____________ ****** "Ask not, what your kitty can do for you. Ask what you can do for your kitty." As it is kitten, so shall it be done. | |
| ID: 1302178 · | |
The question is - is stopping the MB splitters enough, or should I ask them to stop AP as well? Numbers for AP are not as large, but I have 8 lost AP that were not sent when scheduler gave me new work instead. "Results" out in field are pretty big for MB - 10,685,000 and 137k for AP. Suggest they try to shut off both. ____________ Another Fred Support SETI@home when you search the Web or shop online with GoodSearch and GoodShop | |
| ID: 1302179 · | |
|
One rig couldn't connect to the SETI servers and displayed the message: "don't need a network connection"?! After reinstalling BOINC several times, my account settings in SETI were altered?! And in Malaria and Rosetta, too. My BOOT drive (C:) was also shared. In too many cases, it's not a good idea to run multiple projects on 1 host. Apologies to my wingmen for a few hundred MB WUs, which could not be uploaded due to this network issue and timed out. ____________ Knight Who Says Ni N!, OUT numbered................. | |
| ID: 1302180 · | |
|
As Marcel said..."Just shoot up here amongst us, one of us has got to have some relief." | |
| ID: 1302181 · | |
"I would just stop the AP splitters." I sympathise with that assessment of the possible trigger, but I think we've got beyond that point now. The database is horrendously bloated - well over 10.5 million tasks supposedly 'out in the field', which is 50% more than usual. The biggest problem right now isn't communications - when you get resends, they come down quite smoothly - but the slow "thinking time" response of the scheduler. I don't think simply stopping AP production on its own will free up enough scheduler and database resources to stop the timeouts and the creation of new ghosts. On the other hand, I agree stopping MB production is a drastic step, and it will impact loads of volunteers with hosts like my little one-core server - which has been plinking along exactly as designed, getting new tasks as needed (and rotating through three different projects). No surplus fat to live off there - one task in progress it says, and I can see it running now. But my gut feeling is saying, quite strongly, that recovery from this problem is going to take an outage of some sort - and the sooner we start it, the shorter it will be. | |
| ID: 1302186 · | |
"I would just stop the AP splitters." Well, I would not be the one to second guess your intuition, Richard. You seldom are off base. ____________ ****** "Ask not, what your kitty can do for you. Ask what you can do for your kitty." As it is kitten, so shall it be done. | |
| ID: 1302188 · | |
|
| |
| ID: 1302198 · | |
|
Regarding AP ghosts: on my host http://setiathome.berkeley.edu/show_host_detail.php?hostid=5553346, which only does AP, there have been no ghost units at all so far. My other host, which is set up as MB only, has some, around 100. Only for this host do I manually suspend network communication (opening it once a day for 2-3 hours), because I don't want to put extra load on the scheduler because of NNT. | |
| ID: 1302206 · | |
|
What I'm seeing now. | |
| ID: 1302238 · | |
|
I've just had a note back from Eric: I've stopped the splitters and doubled the httpd timeout... The splitters are already showing red/orange on the server status page, and 'ready to send' is as near zero as makes no difference (there'll always be a few errors and timeouts to resend). So I'm going to turn off NNT and see what happens - let's see if we can help get this beast back under control. | |
| ID: 1302257 · | |
"I've just had a note back from Eric:" I have NNT turned off on three machines with empty caches and lots of ghost tasks. So far all I get is "Project has no tasks available". | |
| ID: 1302262 · | |
|
I decided to try to get some of my lost tasks since the splitters are disabled and there is no new work available. However, I didn't get any of my 467 lost tasks. I got the "no tasks available" message on two attempts. Will have to see how this plays out. I'm back to NNT for now. | |
| ID: 1302265 · | |
|
Lost tasks come back "automagically"..... | |
| ID: 1302266 · | |
|
"In too many cases, it's not a good idea to run multiple projects, | |
| ID: 1302273 · | |
"Lost tasks come back "automagically"....." Not yet, they haven't. I'm with Fred and fscheel so far - really quick turnround on requests, but always "Project has no tasks available". I'll let them keep asking, and see what happens over the next few hours. | |
| ID: 1302275 · | |
|
I had set NNT to run down my cache so that I could switch off my machine. When I looked after it had finished crunching, I had 20 or so ghosts, so I unset NNT for a while to see if I could get them. I now appear to have 329 tasks I haven't got. | |
| ID: 1302276 · | |
"I had set NNT to run down my cache so that I could switch off my machine. When I looked after it had finished crunching, I had 20 or so ghosts, so I unset NNT for a while to see if I could get them. I now appear to have 329 tasks I haven't got." You should get them as you request work over the next few days, no more than 20 tasks per request. Your computer won't be suddenly overwhelmed with work, and if you don't get through them all in time, don't worry - they'll get sent to somebody else instead. PS I haven't received any so far either. One of my requests has got a resend now, but other machines are still dry. Never mind, the journey of a thousand WUs starts with a single crunch... | |
| ID: 1302279 · | |
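To put a rough number on "no more than 20 tasks per request": the sketch below counts the minimum number of scheduler requests needed to recover a given pile of ghosts. It assumes every request is filled right up to the cap, which clearly isn't happening at the moment, so treat the result as a lower bound. The function name and its parameters are made up for illustration and are not part of BOINC.

```python
import math


def min_requests_needed(ghost_tasks: int, per_request_cap: int = 20) -> int:
    """Lower bound on scheduler requests needed to resend all ghost tasks.

    Assumes (hypothetically) that each request returns exactly
    `per_request_cap` resends until the ghosts run out.
    """
    if ghost_tasks < 0:
        raise ValueError("ghost_tasks must be non-negative")
    if per_request_cap <= 0:
        raise ValueError("per_request_cap must be positive")
    # Round up: a final, partly filled request still counts as a request.
    return math.ceil(ghost_tasks / per_request_cap)


if __name__ == "__main__":
    # The 329 ghosts mentioned in the post above, at 20 per request.
    print(min_requests_needed(329))
```

With requests that come back empty or time out, the real count will be higher, which is why this takes days rather than minutes.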
"I had set NNT to run down my cache so that I could switch off my machine. When I looked after it had finished crunching, I had 20 or so ghosts, so I unset NNT for a while to see if I could get them. I now appear to have 329 tasks I haven't got." When they start flowing again, you should get some until BOINC stops asking for work due to the cache setting. No request = no work. You may have to increase that setting, or just be patient and get them over a period of days. Edit: Richard beat me to it. ____________ Another Fred Support SETI@home when you search the Web or shop online with GoodSearch and GoodShop | |
| ID: 1302280 · | |
|
Well, there was a dip in the cricket graphs, and about then I got 20 members of my Ghost Army downloaded. However, now that the cricket graphs are back up, I'm getting scheduler timeouts again as it tries to get my ghosts and report newly done units. :( | |
| ID: 1302289 · | |
| Copyright © 2013 University of California |