Panic Mode On (89) Server Problems?

Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1565273 - Posted: 31 Aug 2014, 5:26:06 UTC - in response to Message 1565267.  

Well, at least the MB splitters haven't been throwing up errors on those files like the AP ones did.

Over 50 MultiBeam channels done & still no errors.
Looks like it was the AP splitter(s) having issues, not the data.

A large share of the AP work I am seeing from those data sets ends in 30/30 early exits.

The APs that I've done from that bunch so far have had little to no blanking at all, but my CPUs still have a lot of them to do yet.

Cheers.
ID: 1565273
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1565998 - Posted: 2 Sep 2014, 2:37:12 UTC
Last modified: 2 Sep 2014, 2:43:46 UTC

Hm. Odd. My single core machine won't get work, for whatever reason. It finished its last WUs and reported them on the 29th, and hasn't gotten anything since. The past 48 hours of the message log shows this:

2014-09-01 22:25:14 SETI@home [sched_op_debug] Starting scheduler request
2014-09-01 22:25:14 SETI@home Sending scheduler request: To fetch work. Requesting 302400 seconds of work, reporting 0 completed tasks
2014-09-01 22:25:20 SETI@home Scheduler request succeeded: got 0 new tasks
2014-09-01 22:25:20 SETI@home [sched_ops_debug] Server version 705
2014-09-01 22:25:20 SETI@home Message from server: No tasks sent
2014-09-01 22:25:20 SETI@home Message from server: No tasks are available for SETI@home Enhanced
2014-09-01 22:25:20 SETI@home Message from server: No tasks are available for SETI@home v7
2014-09-01 22:25:20 SETI@home Message from server: No tasks are available for AstroPulse v6
2014-09-01 22:25:20 SETI@home Message from server: Tasks for NVIDIA GPU are available, but your preferences are set to not accept them
2014-09-01 22:25:20 SETI@home Message from server: Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
2014-09-01 22:25:20 SETI@home Project requested delay of 303.000000 seconds
2014-09-01 22:25:20 SETI@home [sched_op_debug] Deferring communication for 2 min 45 sec
2014-09-01 22:25:20 SETI@home [sched_op_debug] Reason: no work from project
2014-09-01 22:25:20 SETI@home [sched_op_debug] Deferring communication for 5 min 3 sec
2014-09-01 22:25:20 SETI@home [sched_op_debug] Reason: requested by project


Nothing changed on the system; it just quit getting work. I'm confused.

[edit: although I did just notice that the one valid task that is still in the list is showing 0 seconds for both run time and CPU time. I reboot it probably once a month or so, and I just did a few days ago, and again just now, and that didn't fix it. Hm. Anyone have ideas?]
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1565998
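
For anyone who wants to check a long message log the same way, here is a minimal Python sketch that counts work requests against tasks actually received. The sample lines are copied from the log above; the function name `summarize` and the regular expressions are illustrative assumptions based only on the wording of those lines, not anything taken from BOINC itself.

```python
import re
from collections import Counter

# Sample lines copied from the message log posted above.
LOG = """\
2014-09-01 22:25:14 SETI@home Sending scheduler request: To fetch work. Requesting 302400 seconds of work, reporting 0 completed tasks
2014-09-01 22:25:20 SETI@home Scheduler request succeeded: got 0 new tasks
2014-09-01 22:25:20 SETI@home Message from server: No tasks sent
2014-09-01 22:25:20 SETI@home Message from server: No tasks are available for SETI@home v7
2014-09-01 22:25:20 SETI@home Project requested delay of 303.000000 seconds
"""

# Patterns are assumptions derived from the wording of the lines above.
REQUEST = re.compile(r"Sending scheduler request: To fetch work\.")
GOT = re.compile(r"Scheduler request succeeded: got (\d+) new tasks")
SERVER_MSG = re.compile(r"Message from server: (.+)$")


def summarize(log_text):
    """Count work requests, tasks received, and distinct server messages."""
    requests = 0
    tasks = 0
    messages = Counter()
    for line in log_text.splitlines():
        if REQUEST.search(line):
            requests += 1
        got = GOT.search(line)
        if got:
            tasks += int(got.group(1))
        msg = SERVER_MSG.search(line)
        if msg:
            messages[msg.group(1)] += 1
    return {"work_requests": requests, "tasks_received": tasks, "server_messages": messages}


summary = summarize(LOG)
print(summary["work_requests"], "requests,", summary["tasks_received"], "tasks received")
for text, count in summary["server_messages"].most_common():
    print(count, text)
```

Run over a full 48-hour log, a result with many requests and zero tasks would point at the server side rather than at the client failing to ask.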
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1566019 - Posted: 2 Sep 2014, 3:38:32 UTC - in response to Message 1565998.  

Hm. Odd. My single core machine won't get work, for whatever reason. It finished its last WUs and reported them on the 29th, and hasn't gotten anything since. The past 48 hours of the message log shows this:

2014-09-01 22:25:14 SETI@home [sched_op_debug] Starting scheduler request
2014-09-01 22:25:14 SETI@home Sending scheduler request: To fetch work. Requesting 302400 seconds of work, reporting 0 completed tasks
2014-09-01 22:25:20 SETI@home Scheduler request succeeded: got 0 new tasks
2014-09-01 22:25:20 SETI@home [sched_ops_debug] Server version 705
2014-09-01 22:25:20 SETI@home Message from server: No tasks sent
2014-09-01 22:25:20 SETI@home Message from server: No tasks are available for SETI@home Enhanced
2014-09-01 22:25:20 SETI@home Message from server: No tasks are available for SETI@home v7
2014-09-01 22:25:20 SETI@home Message from server: No tasks are available for AstroPulse v6
2014-09-01 22:25:20 SETI@home Message from server: Tasks for NVIDIA GPU are available, but your preferences are set to not accept them
2014-09-01 22:25:20 SETI@home Message from server: Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
2014-09-01 22:25:20 SETI@home Project requested delay of 303.000000 seconds
2014-09-01 22:25:20 SETI@home [sched_op_debug] Deferring communication for 2 min 45 sec
2014-09-01 22:25:20 SETI@home [sched_op_debug] Reason: no work from project
2014-09-01 22:25:20 SETI@home [sched_op_debug] Deferring communication for 5 min 3 sec
2014-09-01 22:25:20 SETI@home [sched_op_debug] Reason: requested by project


Nothing changed on the system; it just quit getting work. I'm confused.

[edit: although I did just notice that the one valid task that is still in the list is showing 0 seconds for both run time and CPU time. I reboot it probably once a month or so, and I just did a few days ago, and again just now, and that didn't fix it. Hm. Anyone have ideas?]

Looks like bad luck hitting an empty feeder on each request. How often does it ask for work?
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1566019
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1566029 - Posted: 2 Sep 2014, 4:39:05 UTC - in response to Message 1566019.  
Last modified: 2 Sep 2014, 4:40:59 UTC

Looks like bad luck hitting an empty feeder on each request. How often does it ask for work?

It makes it up to a 3:59:59 delay once or twice a day, but generally it asks every 5:03. It has made 131 requests in the past 48 hours and comes back empty every time.


I even just shut BOINC down, went into the projects folder, moved the app_info and EXE/DLL out of the folder, got rid of everything else, and then moved the needed stuff back. Checked the slots folder and it was empty. Checked the preferences both locally and online, and no change.

I just don't know. I guess I'll let it tend to itself for a few more days and see what happens.

[edit: and actually, I just thought about it, too: I hold a ~3.5-day cache, and the last one was reported on the 29th, so that means I haven't gotten new tasks since the 26th.]
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1566029
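
The numbers in this post can be sanity-checked with a little arithmetic. This is only a sketch: the 302400-second request and the 303-second project delay come from the logs earlier in the thread, while the time of day of the last report on the 29th is not given, so both ends of that day are assumed.

```python
from datetime import datetime, timedelta

SECONDS_PER_DAY = 86400

# "Requesting 302400 seconds of work" in the log corresponds to the cache size.
cache_days = 302400 / SECONDS_PER_DAY

# 131 requests in 48 hours gives the average gap between requests.
window_seconds = 48 * 3600
avg_gap_minutes = window_seconds / 131 / 60

# If every request had waited only the 303-second project delay,
# this many requests would have fit into the same window.
requests_at_project_delay = window_seconds // 303

# Assumption: the last task was reported at some point on 29 Aug 2014,
# so try both the start and the end of that day.
earliest_fetch = datetime(2014, 8, 29, 0, 0) - timedelta(days=cache_days)
latest_fetch = datetime(2014, 8, 29, 23, 59) - timedelta(days=cache_days)

print(f"cache: {cache_days} days")
print(f"average gap: {avg_gap_minutes:.1f} minutes")
print(f"requests possible at 303 s spacing: {requests_at_project_delay}")
print(f"last fetch between {earliest_fetch:%d %b %H:%M} and {latest_fetch:%d %b %H:%M}")
```

On these assumptions the last fetch falls on the 25th or 26th, which fits the estimate in the post. The average gap works out to roughly 22 minutes, so the longer backoffs appear to account for a fair share of the 48 hours.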
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1566081 - Posted: 2 Sep 2014, 8:20:27 UTC - in response to Message 1565270.  

Eric, Matt, and Jeff were all contacted about the situation. And at least Eric responded, so he knows about it. If the problem continues with the next AP run, I'll send the flares up again.

It will be interesting to see how the next batch of AP goes; there is still no sign of any errors in splitting the same tapes for MB work.
Grant
Darwin NT
ID: 1566081
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1566151 - Posted: 2 Sep 2014, 13:28:08 UTC - in response to Message 1566029.  

Looks like bad luck hitting an empty feeder on each request. How often does it ask for work?

It makes it up to a 3:59:59 delay once or twice a day, but generally it asks every 5:03. It has made 131 requests in the past 48 hours and comes back empty every time.


I even just shut BOINC down, went into the projects folder, moved the app_info and EXE/DLL out of the folder, got rid of everything else, and then moved the needed stuff back. Checked the slots folder and it was empty. Checked the preferences both locally and online, and no change.

I just don't know. I guess I'll let it tend to itself for a few more days and see what happens.

[edit: and actually, I just thought about it, too: I hold a ~3.5-day cache, and the last one was reported on the 29th, so that means I haven't gotten new tasks since the 26th.]

Did you try a project reset yet? I seem to recall I had issues with some clients in the past not downloading work and I had to kick them really hard to get them to resume.
Also I think that might be around the same time they were messing around with the GPU limit handling section again. Perhaps they broke something for older clients?
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1566151
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1566179 - Posted: 2 Sep 2014, 19:00:58 UTC - in response to Message 1566151.  

Did you try a project reset yet? I seem to recall I had issues with some clients in the past not downloading work and I had to kick them really hard to get them to resume.

Tried a reset, and added some more debugging flags to cc_config:

2014-09-02 14:55:26 [work_fetch_debug] Request work fetch: Project backoff ended
2014-09-02 14:55:26 [work_fetch_debug] compute_work_requests(): start
2014-09-02 14:55:26 [work_fetch_debug] compute_work_requests(): cpu_shortfall 302400.000000, overall urgency Need immediately
2014-09-02 14:55:26 SETI@home [work_fetch_debug] best project so far
2014-09-02 14:55:26 SETI@home [work_fetch_debug] compute_work_requests(): work req 302400.000000, shortfall 301471.421109, urgency Need immediately
2014-09-02 14:55:27 SETI@home [sched_op_debug] Starting scheduler request
2014-09-02 14:55:27 [work_fetch_debug] time_until_work_done(): est 0.000000 ssr 100.000000 apr 0.996929 prs 100.000000
2014-09-02 14:55:27 SETI@home Sending scheduler request: To fetch work. Requesting 302400 seconds of work, reporting 0 completed tasks
2014-09-02 14:55:33 SETI@home Scheduler request succeeded: got 0 new tasks
2014-09-02 14:55:33 SETI@home [sched_ops_debug] Server version 705
2014-09-02 14:55:33 SETI@home Message from server: No tasks sent
2014-09-02 14:55:33 SETI@home Message from server: No tasks are available for SETI@home v7
2014-09-02 14:55:33 SETI@home Message from server: No tasks are available for AstroPulse v6
2014-09-02 14:55:33 SETI@home Message from server: Tasks for NVIDIA GPU are available, but your preferences are set to not accept them
2014-09-02 14:55:33 SETI@home Message from server: Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
2014-09-02 14:55:33 SETI@home Message from server: Tasks for Intel GPU are available, but your preferences are set to not accept them
2014-09-02 14:55:33 SETI@home Project requested delay of 303.000000 seconds
2014-09-02 14:55:33 SETI@home [sched_op_debug] Deferring communication for 1 min 29 sec
2014-09-02 14:55:33 SETI@home [sched_op_debug] Reason: no work from project
2014-09-02 14:55:33 SETI@home [sched_op_debug] Deferring communication for 5 min 3 sec
2014-09-02 14:55:33 SETI@home [sched_op_debug] Reason: requested by project
2014-09-02 14:55:33 [work_fetch_debug] Request work fetch: RPC complete
2014-09-02 14:55:33 [work_fetch_debug] compute_work_requests(): start
2014-09-02 14:55:33 [work_fetch_debug] compute_work_requests(): cpu_shortfall 302400.000000, overall urgency Need immediately
2014-09-02 14:55:33 SETI@home [work_fetch_debug] work fetch: project not contactable; skipping


I'm asking for work, and I don't see anything wrong on my end; it looks like the server just isn't sending me anything. I do notice that the server version is 705 now. It had been 703 for a while until recently, it seems. So I guess the change broke something? I don't know.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1566179
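
For reference, extra logging of this kind is switched on through the `log_flags` section of `cc_config.xml`. The sketch below builds such a file in Python. The two flag names are inferred from the `[sched_op_debug]` and `[work_fetch_debug]` tags in the log above, so the exact set of flags used in the post is an assumption, and `build_cc_config` is a hypothetical helper, not part of BOINC.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


# Assumed flags, inferred from the tags that appear in the posted log.
xml_text = build_cc_config(["sched_op_debug", "work_fetch_debug"])
print(xml_text)
```

The resulting text would go into `cc_config.xml` in the BOINC data directory, and the client has to re-read its config file or be restarted before the new flags show up in the log.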
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1566211 - Posted: 2 Sep 2014, 20:17:53 UTC - in response to Message 1566179.  
Last modified: 2 Sep 2014, 20:18:14 UTC

Did you try a project reset yet? I seem to recall I had issues with some clients in the past not downloading work and I had to kick them really hard to get them to resume.

Tried a reset, and added some more debugging flags to cc_config:

I'm asking for work, and I don't see anything wrong on my end; it looks like the server just isn't sending me anything. I do notice that the server version is 705 now. It had been 703 for a while until recently, it seems. So I guess the change broke something? I don't know.


I don't see anything related on the BOINC forum Questions and problems section. Perhaps you could ask and see if anything pops up.
Also, trying to connect to another project and seeing if it will send you work might be a good test. If you hit a project with 703 and get work, that does add more evidence that they broke something, or unceremoniously pulled the rug out from under old versions of BOINC. The latter I wouldn't suspect.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1566211
William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1566227 - Posted: 2 Sep 2014, 20:47:51 UTC

I'm not really here, but on a hunch, try putting the offending host on a different venue and playing around with preferences - e.g. have you really set 'no GPU work' via the preferences?

'a server log, a server log, a kingdom for a server log'
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1566227
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1566260 - Posted: 2 Sep 2014, 22:23:36 UTC - in response to Message 1566227.  

I'm not really here, but on a hunch, try putting the offending host on a different venue and playing around with preferences - e.g. have you really set 'no GPU work' via the preferences?

'a server log, a server log, a kingdom for a server log'

And for http://setiathome.berkeley.edu/client_types.php to be updated more often than once every 18 months.
ID: 1566260
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1566317 - Posted: 3 Sep 2014, 2:16:48 UTC

In all of my venues, I have the three GPU types unchecked. Neither of my rigs has a usable GPU anyway.

I didn't change anything on my end, and the only difference I see now is that the server version changed from 703 to 705. My other host has gotten work since the change to 705.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1566317
Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1566321 - Posted: 3 Sep 2014, 2:23:32 UTC - in response to Message 1566317.  

Last option when all else fails (I hesitate to suggest it, but): try uninstalling BOINC and rebooting, then reinstall BOINC and reattach to the project. A time or two I couldn't figure out why it wouldn't connect, and I did that. I think something got corrupted somewhere, especially after a hard reboot. Reinstalling took care of it. Just my 2 cents.
ID: 1566321
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1566331 - Posted: 3 Sep 2014, 3:29:00 UTC - in response to Message 1566317.  
Last modified: 3 Sep 2014, 3:31:05 UTC

In all of my venues, I have the three GPU types unchecked. Neither of my rigs has a usable GPU anyway.

I didn't change anything on my end, and the only difference I see now is that the server version changed from 703 to 705. My other host has gotten work since the change to 705.

I did some testing on one of my machines with two old versions.
6.2.19 - Requested work & was not given any. Just as you are seeing.
6.6.33 - Requested work & was given work.

So you are not crazy, well not completely at least... The server code changes have borked you. Definitely time to bring it to the attention of the BOINC devs, or switch to a version < 6 years old.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1566331
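
The two test results above can be turned into a rough check for other hosts. This is a sketch under a loud assumption: that the breakage depends only on client version and in a monotonic way, which the thread does not confirm. The only data points are 6.2.19 (no work) and 6.6.33 (work received), and anything in between is reported as unknown. `parse_version` and `classify` are hypothetical helper names.

```python
def parse_version(text):
    """Turn a version string such as '6.2.19' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# The only two data points reported in the test above.
NO_WORK = parse_version("6.2.19")
GOT_WORK = parse_version("6.6.33")


def classify(version_text):
    """Guess whether a client version is affected (assumes monotonic behaviour)."""
    version = parse_version(version_text)
    if version <= NO_WORK:
        return "likely affected"
    if version >= GOT_WORK:
        return "likely unaffected"
    return "unknown"


for candidate in ["6.2.19", "6.4.5", "6.6.33"]:
    print(candidate, "->", classify(candidate))
```

Comparing tuples of integers avoids the string-comparison trap where "6.10.0" would sort before "6.2.19".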
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1566342 - Posted: 3 Sep 2014, 4:28:54 UTC - in response to Message 1566331.  

So you are not crazy, well not completely at least... The server code changes have borked you. Definitely time to bring it to the attention of the BOINC devs, or switch to a version < 6 years old.

Thanks for the help. I posted over there and will see what happens. Like I said there, "I know I could upgrade to a newer version to get more work, but then that wouldn't be fixing a possible problem, just applying a work-around. So.. is this occurrence the intended result, or is it a sign of a problem?" We'll see what happens.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1566342
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1566474 - Posted: 3 Sep 2014, 14:16:02 UTC - in response to Message 1566342.  

So you are not crazy, well not completely at least... The server code changes have borked you. Definitely time to bring it to the attention of the BOINC devs, or switch to a version < 6 years old.

Thanks for the help. I posted over there and will see what happens. Like I said there, "I know I could upgrade to a newer version to get more work, but then that wouldn't be fixing a possible problem, just applying a work-around. So.. is this occurrence the intended result, or is it a sign of a problem?" We'll see what happens.

I've been contacted privately by another user who has supplied some substantial evidence that some server change took place during maintenance last week, with significant detrimental effects on hosts running older clients - more than the points which have been raised in this thread so far. I'll start drafting an email for Eric and David now, passing on what I know.
ID: 1566474
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1566500 - Posted: 3 Sep 2014, 15:44:51 UTC - in response to Message 1566474.  
Last modified: 3 Sep 2014, 15:45:24 UTC


I've been contacted privately by another user who has supplied some substantial evidence that some server change took place during maintenance last week, with significant detrimental effects on hosts running older clients - more than the points which have been raised in this thread so far. I'll start drafting an email for Eric and David now, passing on what I know.

This must be putting a significant number (although perhaps not a significant percentage) of old set-and-forget hosts out to pasture at the moment, unbeknownst to their users.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1566500
Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1566522 - Posted: 3 Sep 2014, 16:45:40 UTC

I see a thread over in Q&A, "Is SETI@HOME down?", which appears to be related. The host there is also still using BOINC 6.2.19.
ID: 1566522
OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1566524 - Posted: 3 Sep 2014, 16:47:26 UTC

For what it's worth, I have noticed from my pending list that at some point between August 18th and August 26th my WU results stopped including the "CPU Time (sec)". I crunch CPU only using 6.2.14. It's a little out of date, but I subscribe to the "if it ain't broke, don't fix it" philosophy. I have not done anything at my end that I can think of to account for the change, so perhaps this is related to the server change as well.
ID: 1566524
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1566548 - Posted: 3 Sep 2014, 17:55:18 UTC

Now for a change of pace.. it looks like the candlelight vigil for the 30 and 31 tapes might come to an end soon. I wonder if that is deliberate. At least one or two of those tapes will get split up, maybe not all of them, but.. there is hope for them after all!
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1566548
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1566549 - Posted: 3 Sep 2014, 17:59:50 UTC - in response to Message 1566548.  

Now for a change of pace.. it looks like the candlelight vigil for the 30 and 31 tapes might come to an end soon. I wonder if that is deliberate. At least one or two of those tapes will get split up, maybe not all of them, but.. there is hope for them after all!

I've been watching that all morning....wondering what will start splitting next or if AP will jump in again before the splitters get there.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1566549