Panic Mode On (93) Server Problems?

Message boards : Number crunching : Panic Mode On (93) Server Problems?
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1618450 - Posted: 24 Dec 2014, 23:03:08 UTC - in response to Message 1618446.  

Not during Chrrrrrristmas.
ID: 1618450
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1618480 - Posted: 24 Dec 2014, 23:31:22 UTC

I am a true believer in Rem Oil for anything that needs oil and STOS for anything that needs a light grease. I use both products on everything from door locks to my shotgun and have seen very little wear. Rem Oil is easy to find, but STOS needs to be mail ordered unless you are lucky enough to find a dealer.
ID: 1618480
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1618517 - Posted: 25 Dec 2014, 0:34:24 UTC

No panic.

Santana Jingo.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1618517
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1618526 - Posted: 25 Dec 2014, 0:39:09 UTC - in response to Message 1618408.  

Well the AP creation rate is the highest I've seen it in a long time.. 1.14/sec. Sure beats 0.014/sec. Hopefully it stays above 1/sec for a while.

Yeah, I have managed to snag 29 GPU AP's now,

Om nom nom nom... My 10-day cache has been filled up (I was already sitting on ~7 days anyway, but it has been topped-off now). The rest of you should be able to snag enough to start building a cache (looking through the messages log, nearly every request resulted in just one AP each time).

Also, it looks like the MB ready-to-send is starting to build a little bit, as well.

Whatever the problems were.. they seem to have passed.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1618526
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13746
Credit: 208,696,464
RAC: 304
Australia
Message 1618530 - Posted: 25 Dec 2014, 0:43:32 UTC - in response to Message 1618526.  

Whatever the problems were.. they seem to have passed.

The problems are still there, but the large number of VLARs has helped offset the low splitter output, since significantly fewer WUs are required to build up caches.
Grant
Darwin NT
ID: 1618530
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1618544 - Posted: 25 Dec 2014, 1:01:04 UTC - in response to Message 1618526.  
Last modified: 25 Dec 2014, 1:02:13 UTC

Well the AP creation rate is the highest I've seen it in a long time.. 1.14/sec. Sure beats 0.014/sec. Hopefully it stays above 1/sec for a while.

Yeah, I have managed to snag 29 GPU AP's now,

Om nom nom nom... My 10-day cache has been filled up (I was already sitting on ~7 days anyway, but it has been topped-off now). The rest of you should be able to snag enough to start building a cache (looking through the messages log, nearly every request resulted in just one AP each time).

Also, it looks like the MB ready-to-send is starting to build a little bit, as well.

Whatever the problems were.. they seem to have passed.

Those of us running GPU APs are having a different experience. I'll just post a few numbers to illustrate the problem. These are three different hosts:
AstroPulse v7 (anonymous platform, CPU): Number of tasks today = 79
AstroPulse v7 (anonymous platform, ATI GPU): Number of tasks today = 10

AstroPulse v7 (anonymous platform, CPU): Number of tasks today = 76
AstroPulse v7 (anonymous platform, ATI GPU): Number of tasks today = 8

AstroPulse v7 (anonymous platform, CPU): Number of tasks today = 92
AstroPulse v7 (anonymous platform, ATI GPU): Number of tasks today = 6

As I've stated before, if I disable "Use CPU" I do not get that many more GPU tasks. Same with the lowered cache setting, I just get fewer APs overall. At some point I do expect the numbers to switch...I hope.

Oh, Merry Christmas!
ID: 1618544
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1618717 - Posted: 25 Dec 2014, 9:13:11 UTC - in response to Message 1618425.  
Last modified: 25 Dec 2014, 9:13:29 UTC

As mentioned WD40 will get it going again, but light machine oil is best if you want to keep them running.

Waking up, I turned on the TV server and it made its racket again. So I looked around the house to see what other options I had, and went with the petroleum lubricant that came with my beard trimmer.

Fun fact: the default Intel CPU fan for socket 1156 CPUs has no rubber plug underneath the sticker.
ID: 1618717
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1618719 - Posted: 25 Dec 2014, 10:08:26 UTC - in response to Message 1618717.  
Last modified: 25 Dec 2014, 10:31:57 UTC

As mentioned WD40 will get it going again, but light machine oil is best if you want to keep them running.

Waking up, I turned on the TV server and it made its racket again. So I looked around the house to see what other options I had, and went with the petroleum lubricant that came with my beard trimmer.

I use cycle oil, the sort in a plastic bottle. I did use GT85 on my T8100's fan, but needed to disassemble the laptop again later to put something heavier on it.

Claggy
ID: 1618719
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13746
Credit: 208,696,464
RAC: 304
Australia
Message 1618720 - Posted: 25 Dec 2014, 10:09:54 UTC - in response to Message 1618717.  

Fun fact: the default Intel CPU fan for socket 1156 CPUs has no rubber plug underneath the sticker.

Probably the case for a lot of smaller computer fans these days.
The fans on my GTX 560Ti were the same- it was a case of carefully (and quite forcefully) levering the fan blade assembly out of the motor assembly.
Grant
Darwin NT
ID: 1618720
Profile Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1618732 - Posted: 25 Dec 2014, 11:48:17 UTC

The SSP shows Martin is running:)
rOZZ
Music
Pictures
ID: 1618732
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1618769 - Posted: 25 Dec 2014, 14:30:20 UTC - in response to Message 1618732.  

The SSP shows Martin is running:)

Martin the Marvian?

<runs from Julie's rolling pin>
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1618769
JohnDK (Crowdfunding Project Donor, Special Project $250 donor)
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1618818 - Posted: 25 Dec 2014, 19:29:04 UTC - in response to Message 1615937.  


So, why is my out of work machine having to wait 20 minutes between useless requests when my other 2 machines aren't having to wait?
I just checked it again, now it's up to a 40 minute interval, while the GPUs are quiet.

BOINC has an incremental delay built in for failed responses. This is to avoid DOSing the servers. It's the same no matter which BOINC version or OS is used.

Pre-v7 BOINC doesn't have the same high delay. It's just silly.
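That incremental delay is essentially exponential back-off: each failed request roughly doubles the wait, up to a cap. A toy sketch of the idea (the constants here are illustrative assumptions, not BOINC's actual values):

```shell
# Toy exponential back-off: double the wait after each failed request,
# capped at one day. Constants are illustrative, not BOINC's real ones.
delay=60        # first retry after one minute
cap=86400       # never wait more than a day
for attempt in 1 2 3 4 5; do
    echo "request $attempt failed: next retry in ${delay}s"
    delay=$(( delay * 2 ))
    if [ "$delay" -gt "$cap" ]; then
        delay=$cap
    fi
done
```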



Using Linux, I can have cron run a script every six minutes that tails the last line of stdoutdae.txt; if that line hasn't changed since the last run, the script issues the update command, which forces a contact that results in new work, or at least a statement that none is available. Perhaps Windows users can have the Task Scheduler run a batch file every so often to accomplish the same thing.
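A minimal sketch of that cron job (the log path, state-file location, and project URL are assumptions; adjust them for your own install):

```shell
#!/bin/sh
# Sketch of the cron job described above: if BOINC's log hasn't moved
# since the last run, nudge the client to contact the scheduler.
# BOINC_LOG and STATE_FILE are assumed locations; adjust as needed.
BOINC_LOG="$HOME/BOINC/stdoutdae.txt"
STATE_FILE="/tmp/boinc_last_line"

last=$(tail -n 1 "$BOINC_LOG" 2>/dev/null)
prev=$(cat "$STATE_FILE" 2>/dev/null)

if [ "$last" = "$prev" ]; then
    # No new log output since the previous run: force a scheduler contact.
    # boinccmd ships with the BOINC client; guard in case it isn't in PATH.
    command -v boinccmd >/dev/null 2>&1 && \
        boinccmd --project http://setiathome.berkeley.edu update
fi

printf '%s\n' "$last" > "$STATE_FILE"
```

A crontab entry such as `*/6 * * * * /path/to/check_boinc.sh` would run it every six minutes.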

Works fine using the Task Scheduler. Just 2 minor issues: it of course doesn't check the last update time, so it just updates every 10 mins, which is no real problem. But every 10 mins I see a flash when the command window runs, and selecting hidden doesn't seem to work. I can live with it but would prefer it gone :)
ID: 1618818
Profile Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1618819 - Posted: 25 Dec 2014, 19:31:48 UTC - in response to Message 1618769.  
Last modified: 25 Dec 2014, 19:33:06 UTC

The SSP shows Martin is running:)

Martin the Marvian?

<runs from Julie's rolling pin>


;)

... Marvin....:))))
rOZZ
Music
Pictures
ID: 1618819
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1618831 - Posted: 25 Dec 2014, 19:54:47 UTC - in response to Message 1618818.  


So, why is my out of work machine having to wait 20 minutes between useless requests when my other 2 machines aren't having to wait?
I just checked it again, now it's up to a 40 minute interval, while the GPUs are quiet.

BOINC has an incremental delay built in for failed responses. This is to avoid DOSing the servers. It's the same no matter which BOINC version or OS is used.

Pre-v7 BOINC doesn't have the same high delay. It's just silly.



Using Linux, I can have cron run a script every six minutes that tails the last line of stdoutdae.txt; if that line hasn't changed since the last run, the script issues the update command, which forces a contact that results in new work, or at least a statement that none is available. Perhaps Windows users can have the Task Scheduler run a batch file every so often to accomplish the same thing.

Works fine using the Task Scheduler. Just 2 minor issues: it of course doesn't check the last update time, so it just updates every 10 mins, which is no real problem. But every 10 mins I see a flash when the command window runs, and selecting hidden doesn't seem to work. I can live with it but would prefer it gone :)

I'm afraid I'm going to have to refile my old discrimination case. Everything was pretty equal until after the first 100 APs. Then suddenly my Mac stopped receiving APs while my 2 Windows machines continued to magically find tasks. Now my Mac is just about OUT while my Windows machines have a couple hundred apiece. Clearly a case of platform discrimination. This isn't new; it's been the norm. I suspect that after every Windows machine at SETI has been topped off, the server will stoop to sending my Mac tasks again; it seems to be standard operating procedure. Disgusting Discrimination ;-)

Just to make sure all things were equal, I let it download a few CPU MBs to mirror the Windows machines. It doesn't make a difference; the server still can't seem to find any APs when it gets to my Mac. It doesn't have that problem when it gets to the Windows machines, though.
ID: 1618831
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1618852 - Posted: 25 Dec 2014, 22:17:21 UTC - in response to Message 1618720.  

Fun fact: the default Intel CPU fan for socket 1156 CPUs has no rubber plug underneath the sticker.

Probably the case for a lot of smaller computer fans these days.
The fans on my GTX 560Ti were the same- it was a case of carefully (and quite forcefully) levering the fan blade assembly out of the motor assembly.

+1
ID: 1618852
Profile JaundicedEye
Joined: 14 Mar 12
Posts: 5375
Credit: 30,870,693
RAC: 1
United States
Message 1618918 - Posted: 26 Dec 2014, 3:49:06 UTC

Can anyone enlighten me as to the meaning of "Message from task: 0" in the BOINC Manager event log?

I get it infrequently and cannot pin it to any logged entry, e.g.:


12/25/2014 7:35:15 PM | SETI@home | Message from task: 0
12/25/2014 7:35:15 PM | SETI@home | Computation for task ap_26se14aq_B2_P0_00073_20141225_32598.wu_0 finished
12/25/2014 7:35:17 PM | SETI@home | Started upload of ap_26se14aq_B2_P0_00073_20141225_32598.wu_0_0
12/25/2014 7:35:21 PM | SETI@home | Finished upload of ap_26se14aq_B2_P0_00073_20141225_32598.wu_0_0
12/25/2014 7:35:52 PM | SETI@home | Sending scheduler request: To fetch work.
12/25/2014 7:35:52 PM | SETI@home | Reporting 1 completed tasks
12/25/2014 7:35:52 PM | SETI@home | Requesting new tasks for NVIDIA GPU
12/25/2014 7:35:53 PM | SETI@home | Message from task: 0

"Sour Grapes make a bitter Whine." <(0)>
ID: 1618918
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13746
Credit: 208,696,464
RAC: 304
Australia
Message 1618919 - Posted: 26 Dec 2014, 3:51:52 UTC - in response to Message 1618918.  
Last modified: 26 Dec 2014, 3:55:50 UTC

Never seen that message myself.


MB splitter output has choked again. Not as bad as the last time, but enough to cause the ready-to-send buffer (that had only, finally, refilled) to start draining again.


EDIT- and it's gone down the same way as last time; Server Status page shows 7 PFB splitters running, but only 5 splitters are actually working on files.
Grant
Darwin NT
ID: 1618919
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1618933 - Posted: 26 Dec 2014, 4:52:30 UTC - in response to Message 1618798.  
Last modified: 26 Dec 2014, 5:07:11 UTC

OK, AP cache still growing. As of now I have 60 GPU AP's, and 31 CPU AP's.
Looking good so far...

Things do look good on the AP front. I set my secondary GPU projects to NNT & set them to get CPU & GPU work before I left for my sister's place this morning. Looks like they finished the secondary project tasks & have been pulling down AP tasks all day. The two systems have ~170 total tasks each. I imagine they will be at their limit of 200 by Saturday.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[
ID: 1618933
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1618942 - Posted: 26 Dec 2014, 5:06:49 UTC - in response to Message 1618918.  
Last modified: 26 Dec 2014, 5:11:34 UTC

Can anyone enlighten me as to the meaning of "Message from task: 0"? In the boinc manager event log.

Maybe something similar to "task finished with an exit status of (0x0)." ? *shrug* But it looks like it was after an upload, so maybe it was just mentioning that 0 means "no problems with uploading"?


TBar, and others trying to fill GPU caches: what if, instead of reducing your cache size (which results in fewer tasks overall for the GPU anyway), you just go in and reduce the number of CPUs/cores that are allowed to be used after you get a pile of CPU tasks? Then when you do a manual update, the CPU shouldn't ask for work at all.

*shrug* Just an idea, I suppose. I'm sure you already tried that though.

[edit: I've kind of done something similar in the past. I run on 50% of the cpu cores, and got the 10-day cache filled with APs, and then--because I was greedy--I changed it to 100% and got another ~40 APs to get up to the 100 limit, giving myself about 18 days of APs once I dropped it back down to 50%. That's why I mention that might be an option...but I don't know how it would work in reverse.]
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1618942
Profile Dimly Lit Lightbulb 😀
Volunteer tester
Joined: 30 Aug 08
Posts: 15399
Credit: 7,423,413
RAC: 1
United Kingdom
Message 1618947 - Posted: 26 Dec 2014, 5:27:12 UTC

When you add someone to your ignore list, it works, but you get this:
Notice: Undefined index: forum_images_as_links in /disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php on line 110
Notice: Undefined index: forum_hide_avatars in /disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php on line 112
Notice: Undefined index: forum_hide_signatures in /disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php on line 113
Notice: Undefined index: forum_highlight_special in /disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php on line 114
Notice: Undefined index: forum_ignore_sticky_posts in /disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php on line 116
Notice: Undefined index: remove in


With each of the ignored users listed as:
/disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php on line 150 Notice: Undefined index: remove******* in


Ending with:
/disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php on line 150 Warning: Cannot modify header information - headers already sent by (output started at /disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php:150) in /disks/carolyn/b/home/boincadm/projects/sah/html/user/edit_forum_preferences_action.php on line 165


Member of the People Encouraging Niceness In Society club.

ID: 1618947


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.