Panic Mode On (69) Server problems?
Author | Message |
---|---|
Kevin Olley Joined: 3 Aug 99 Posts: 906 Credit: 261,085,289 RAC: 572 |
I may have a problem. Consecutive valid tasks on the application details page for my GPUs keeps dropping to single figures. I may be trashing WUs, though I'm not seeing anything obvious (short run times etc.) in BOINC Manager. Over the weekend I stripped and cleaned this machine and upgraded video drivers, first to 285.62 and now to 290.53. ATM I cannot do much more than keep an eye on it when I can; if it starts looking too bad I will stop processing on GPUs. It would be nice if the tasks pages could be turned on again. Kevin |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Pshaw.....it seems that they still don't think the servers can handle the DB inquiries to give us back our tasks info. Meowphhhhhht. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
cliff Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0 |
And at about 03:00hrs d/l u/l & sched servers showing disabled again. 03:24 hrs only 1 server now disabled.. These darn things are switching on and off for some reason. Can't see a human hand being responsible in the very early hours of the morning. [times are PST, not UTC] Cheers Cliff, Been there, Done that, Still no damm T shirt! |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
"And at about 03:00hrs d/l u/l & sched servers showing disabled again." When I looked an hour or so ago, only the scheduling server was showing disabled. Now it and the upload server are. Disabled means someone manually turned it off, right? But the guys usually come into the lab at 0800 PST and the last update of the server status page was at 0700. Also, what the heck is going on with the crickets? They seem to have dropped by about 20Mbps at around 1900 last night. Update: before posting, I checked the SSP again and the upload server is back up; the blue crickets aren't showing any significant problems. Ready to send has dropped quite a bit, to ~200K, which means work is continuing to be scheduled and sent. "I'm so confused..." David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
LadyL Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0 |
Since reporting is working, it's obviously some glitch in the status display. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Matt just turned the tasks page back on!!! Meowza! (Now, don't everybody go crashing the DB now. LOL.) "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Kevin Olley Joined: 3 Aug 99 Posts: 906 Credit: 261,085,289 RAC: 572 |
"I may have a problem." Found the problem. The driver update reduced the vcore voltage on my GPUs from 1.00v to 0.95v. It looks as if my cards don't like being run at too low a voltage. Thanks for turning on the tasks pages. Kevin |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
"Matt just turned the tasks page back on!!!" Meowza indeed! That's even better than a silly green star. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Dave Joined: 29 Mar 02 Posts: 778 Credit: 25,001,396 RAC: 0 |
Those "silly green stars" keep this place alive. |
SciManStev Joined: 20 Jun 99 Posts: 6651 Credit: 121,090,076 RAC: 0 |
It is so nice having the tasks pages back. I now know from work that I am filled to my limits. With it off, I had to use the rescheduler tool to see what I had on board. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65709 Credit: 55,293,173 RAC: 49 |
"Matt just turned the tasks page back on!!!" I think blue is quite an improvement... The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Dimly Lit Lightbulb 😀 Joined: 30 Aug 08 Posts: 15399 Credit: 7,423,413 RAC: 1 |
"Matt just turned the tasks page back on!!!" Woo! Member of the People Encouraging Niceness In Society club. |
Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Well, I am glad the tasks pages are back on.. but at the same time.. my spreadsheet for all the APs I've done is just completely ruined. 80% of the WUs that I had made entries for and were waiting to be crunched and returned to get the rest of the data are all now "unable to collect data." Scrapping that project after nearly two years. With that bombshell.. I present: Getting totally credit-screwed by an ATI wingmate. (since it will purge soon.. 3.31 is all it got). Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
"With that bombshell.. I present: Getting totally credit-screwed by an ATI wingmate. (since it will purge soon.. 3.31 is all it got)." Isn't it strange that when one of those gets pointed out it's suddenly no longer available to see (that's happened to me before as well). Cheers. |
Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
"With that bombshell.. I present: Getting totally credit-screwed by an ATI wingmate. (since it will purge soon.. 3.31 is all it got)." Not trying to promote/start any witch-hunts.. but it was hostid=6029917. I looked through the AP tasks that machine has listed and there is a huge variation in run time for the GPU, there was at least one with an error (is purged now), and several that had a pretty normal granted credit. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
cliff Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0 |
Hah, I have an outstanding AP task, been outstanding for quite some time.. So I had a look at the wingman.. WoW, 45 day turnaround, ~380 tasks on the rig.. When, I wonder, will the last of those tasks be completed? 2020? Or later? Still, he did contact the server today :-) I wonder why.. Oh.. silly me, it must have been day 45.. Wonder if he actually returned a completed task? Not as irritating as losing 2 years' worth of work, but still irritating. Regards, Cliff, Been there, Done that, Still no damm T shirt! |
Dave Joined: 29 Mar 02 Posts: 778 Credit: 25,001,396 RAC: 0 |
You didn't "lose 2 years of work". Think of it as an ongoing research project that reached a natural end. Everything we do is a learning exercise & we come away from it better for it. |
cliff Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0 |
Hi Dave, I'm betting that's not what Cosmic Ocean is thinking or feeling right now though.. Cheers Cliff, Been there, Done that, Still no damm T shirt! |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14649 Credit: 200,643,578 RAC: 874 |
Not today, but next week. From the front page: Monday Morning Outage |
Graham Middleton Joined: 1 Sep 00 Posts: 1517 Credit: 86,815,638 RAC: 0 |
"Not today, but next week. From the front page:" Indeed good to know, but also something to beware of. Every time there are power tests in the data centers of my employers, there seem to be two constants: 1) The testing causes further power issues that continue for some time after the scheduled duration and have a wider effect than planned. 2) A number of the systems (notably those that the business can least do without) fail to reboot after the power work. The causes are failed disks (the one remaining disk of a mirror pair fails to restart), power supplies (N out of N + 1 power supplies may be enough to keep the server running, but won't be enough, or will be wrongly configured, to enable the system to boot), or a database server that has been wrongly set to auto-boot on power-up, so it tries to open a database when some, but not all, of the required disks are available, resulting in corruption and the need for a full restore of the database from backups [and they are always usable and correct, aren't they???!!!???]. Or even some combination of these issues. In other words, Murphy's Law always applies. Of course this isn't from personal experience! :-D Happy Crunching, Graham |