Panic Mode On (89) Server Problems?

Message boards : Number crunching : Panic Mode On (89) Server Problems?
Previous · 1 . . . 9 · 10 · 11 · 12 · 13 · 14 · 15 . . . 24 · Next

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1566786 - Posted: 4 Sep 2014, 3:10:54 UTC - in response to Message 1566785.  
Last modified: 4 Sep 2014, 3:49:13 UTC

Well, that's it. One of my GPUs is now Idle even though the server has sent the Host over 30 new CPU tasks that it didn't need. The Server Refuses to send my ATI cards work....almost just like on SETI Beta, http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=2182&postid=52223#52223

Not Good...Time for the Drop Kick.

!!!!!!!!!!!!!!!!!!!!!!!!

Now look at my Windows 8.1 Host, http://setiathome.berkeley.edu/results.php?hostid=6796475
BTW, it Does have an nVidia card running MBs. It has 100 MBs. Up until a few minutes ago it too was receiving only CPU APs even though the ATI card was down to just 8 GPU tasks. Now the ATI card has just 13 APs yet this is what it's reporting:
9/3/2014 23:32:07 | SETI@home | Sending scheduler request: To report completed tasks.
9/3/2014 23:32:07 | SETI@home | Reporting 1 completed tasks
9/3/2014 23:32:07 | SETI@home | Requesting new tasks for CPU and NVIDIA and ATI
9/3/2014 23:32:09 | SETI@home | Scheduler request completed: got 0 new tasks
9/3/2014 23:32:09 | SETI@home | No tasks sent
9/3/2014 23:32:09 | SETI@home | No tasks are available for AstroPulse v6
9/3/2014 23:32:09 | SETI@home | This computer has reached a limit on tasks in progress
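That last line is the interesting one. A host-wide in-progress cap that gets checked before the per-resource part of the request would produce exactly this pattern: the pile of CPU tasks keeps the host at the cap, so the GPU request bounces. That's only a guess at the mechanism; the sketch below is purely illustrative Python, and none of the names in it (grant_work, cap_per_host) come from the actual BOINC scheduler code.

```python
# Illustrative only: a per-host in-progress cap that ignores which
# resource asked for work can starve the GPU while the CPU stays fed.

def grant_work(in_progress, cap_per_host, requests):
    """in_progress: dict mapping resource -> tasks already on the host.
    requests: resources asking for work, in whatever order the
    scheduler happens to consider them."""
    granted = []
    total = sum(in_progress.values())
    for res in requests:
        if total >= cap_per_host:
            break  # "This computer has reached a limit on tasks in progress"
        in_progress[res] += 1
        total += 1
        granted.append(res)
    return granted

# CPU work already holds almost all of the cap, so the ATI request is
# refused even though that card is nearly idle.
state = {"cpu": 97, "ati": 2}
print(grant_work(state, 100, ["cpu", "ati"]))  # ['cpu']
```

If the limit were tracked per resource instead of per host, the ATI request in that example would be honoured even with the CPU queue full.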

Then 5 minutes later it says:
9/3/2014 23:37:15 | SETI@home | Sending scheduler request: To fetch work.
9/3/2014 23:37:15 | SETI@home | Requesting new tasks for CPU and NVIDIA and ATI
9/3/2014 23:37:17 | SETI@home | Scheduler request completed: got 5 new tasks...

Now it looks like it's going to download ATI GPU tasks...maybe
BTW, it's Not a Mac.
ID: 1566786
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1566819 - Posted: 4 Sep 2014, 4:35:33 UTC

Well there's a whole pile of low-numbered tapes now. So the candlelight vigil continues...
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1566819
Profile S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 43,822,095
RAC: 0
Netherlands
Message 1566888 - Posted: 4 Sep 2014, 7:16:34 UTC

Seems the server changes implemented 2 weeks ago result in feeding everyone's CPU needs first, and even though you're out of GPU tasks it will feed them later. 2 hours after reaching 100 CPU AP tasks I'm rising steadily on GPU APs, so no need to worry here... The WUs will come in sooner or later.
ID: 1566888
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1566914 - Posted: 4 Sep 2014, 8:48:32 UTC - in response to Message 1566739.  

And AP work is being split once again.

And channels are ending in error like there's a sale on errored channels.
Grant
Darwin NT
ID: 1566914
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1566952 - Posted: 4 Sep 2014, 12:27:40 UTC
Last modified: 4 Sep 2014, 12:42:47 UTC

I still don't understand how, when you split a WU, it can end in error for AP yet split normally for MB, if theoretically both have the same data.

Something must be wrong/changed in the AP splitting programs, since the problem started during last week's AP fest.
ID: 1566952
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1567007 - Posted: 4 Sep 2014, 16:30:07 UTC - in response to Message 1566952.  

I still don't understand how, when you split a WU, it can end in error for AP yet split normally for MB, if theoretically both have the same data.

Something must be wrong/changed in the AP splitting programs, since the problem started during last week's AP fest.

Yes...
The problem is with the AP splitting program.
They are aware of it and are working on it. It's not the datasets.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1567007
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1567052 - Posted: 4 Sep 2014, 17:54:24 UTC - in response to Message 1566888.  

Seems the server changes implemented 2 weeks ago result in feeding everyone's CPU needs first, and even though you're out of GPU tasks it will feed them later. 2 hours after reaching 100 CPU AP tasks I'm rising steadily on GPU APs, so no need to worry here... The WUs will come in sooner or later.

So that's what happened, I thought it was odd.
ID: 1567052
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1567179 - Posted: 4 Sep 2014, 21:03:20 UTC - in response to Message 1566755.  
Last modified: 4 Sep 2014, 21:03:31 UTC

We might have a solution:
There was a bug in the new server code that prevented it from sending jobs to pre-6.7 clients. I checked in a fix (4 lines in sched/sched_types.h). Jeff, can you deploy this in beta and public?
Thanks -- David
Could those affected keep an eye on this, and let us know what transpires - pro or anti? Thanks.

Will do. So far there's no change, but I don't know when it will be deployed.

Eric built the new scheduler and it's been deployed on here and at beta. So try again, with 6.2.19 or whichever old 5.xx/6.xx version you want.
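For anyone curious what a version gate like the one David describes can look like: the sketch below is pure speculation (the real fix was 4 lines in sched/sched_types.h, contents unknown to us), but it shows how a simple tuple comparison on the parsed client version splits clients cleanly at 6.7. Note that HAL9000's report further down that 6.6.33 kept working suggests the real check wasn't anything this clean.

```python
# Purely speculative sketch: how a scheduler might gate old clients.
# parse_version and eligible are invented names, not from BOINC source.

def parse_version(v):
    """Turn a client version string like '6.2.19' into (6, 2, 19)."""
    return tuple(int(x) for x in v.split("."))

def eligible(client_version, cutoff=(6, 7)):
    """A clean (major, minor) cutoff: everything below 6.7 is refused."""
    return parse_version(client_version)[:2] >= cutoff

for v in ("5.10.45", "6.2.19", "6.6.33", "6.10.58", "7.2.42"):
    print(v, eligible(v))
# 5.10.45, 6.2.19 and 6.6.33 all come out False; a gate this clean
# would have refused 6.6.33 as well, so the actual bug was subtler.
```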
ID: 1567179
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1567190 - Posted: 4 Sep 2014, 21:09:52 UTC - in response to Message 1567179.  

We might have a solution:
There was a bug in the new server code that prevented it from sending jobs to pre-6.7 clients. I checked in a fix (4 lines in sched/sched_types.h). Jeff, can you deploy this in beta and public?
Thanks -- David
Could those affected keep an eye on this, and let us know what transpires - pro or anti? Thanks.

Will do. So far there's no change, but I don't know when it will be deployed.

Eric built the new scheduler and it's been deployed on here and at beta. So try again, with 6.2.19 or whichever old 5.xx/6.xx version you want.

I do find it odd that the issue they found should affect versions before 6.7.x, yet 6.6.33 worked. The exact version isn't really the issue.
6.2.19 does indeed get work upon request now.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1567190
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1567191 - Posted: 4 Sep 2014, 21:15:20 UTC - in response to Message 1567190.  

We might have a solution:
There was a bug in the new server code that prevented it from sending jobs to pre-6.7 clients. I checked in a fix (4 lines in sched/sched_types.h). Jeff, can you deploy this in beta and public?
Thanks -- David
Could those affected keep an eye on this, and let us know what transpires - pro or anti? Thanks.

Will do. So far there's no change, but I don't know when it will be deployed.

Eric built the new scheduler and it's been deployed on here and at beta. So try again, with 6.2.19 or whichever old 5.xx/6.xx version you want.

I do find it odd that the issue they found should affect versions before 6.7.x, yet 6.6.33 worked. The exact version isn't really the issue.
6.2.19 does indeed get work upon request now.

A host I'm monitoring with v5.10.45 has received new work too, and it's reporting CPU time for finished work again.
ID: 1567191
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1567203 - Posted: 4 Sep 2014, 21:42:37 UTC - in response to Message 1566819.  

Well there's a whole pile of low-numbered tapes now. So the candlelight vigil continues...

Alas poor 31mr08ag. I imagine it will be there for quite a long time as well.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1567203
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1567251 - Posted: 4 Sep 2014, 22:14:59 UTC

No APs are being split, that run didn't last very long.
ID: 1567251
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1567268 - Posted: 4 Sep 2014, 22:35:50 UTC - in response to Message 1567251.  
Last modified: 4 Sep 2014, 22:54:32 UTC

No APs are being split, that run didn't last very long.

I imagine they loaded less data to see how the splitting would go. Possibly trying to work out the issue with the errors while splitting.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1567268
OTS
Volunteer tester

Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1567273 - Posted: 4 Sep 2014, 22:51:58 UTC - in response to Message 1567179.  

The good news is that not only is the "CPU Time" back after a couple of weeks' absence, but the "Run Time", which I have not seen in months, possibly years, is also back. The values of both are the same, which is a bit strange, but I am not going to quibble over that.

The bad news is that I am still not receiving any AP work for my CPU with version 6.2.14, despite many attempts since Ageless reported that a new scheduler was being deployed. It might just be the luck of the draw, since everyone is trying to obtain fresh AP units, but normally when there is a good number of WUs ready to send I am able to snag at least a few. Time will tell.
ID: 1567273
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1567367 - Posted: 5 Sep 2014, 2:56:21 UTC - in response to Message 1566767.  

TBar,

When I see that happening, I just use a profile that doesn't accept CPU WU's until I'm full up there and then switch back to my normal GPU+CPU profile... But yeah, I've noticed it happening too lately...

Chris
ID: 1567367
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1567371 - Posted: 5 Sep 2014, 3:04:17 UTC - in response to Message 1567268.  

We are out of APs; now we will have to live off of resends.
ID: 1567371
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1567439 - Posted: 5 Sep 2014, 7:40:52 UTC - in response to Message 1567179.  

Eric built the new scheduler and it's been deployed on here and at beta. So try again, with 6.2.19 or whichever old 5.xx/6.xx version you want.

Success! My 6.2.19 host has a full ~3.5-day cache again.

Glad this issue got resolved. Thanks to all who went through the channels to report it as an issue.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1567439
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1567441 - Posted: 5 Sep 2014, 7:44:15 UTC - in response to Message 1567439.  

Eric built the new scheduler and it's been deployed on here and at beta. So try again, with 6.2.19 or whichever old 5.xx/6.xx version you want.

Success! My 6.2.19 host has a full ~3.5-day cache again.

Glad this issue got resolved. Thanks to all who went through the channels to report it as an issue.

Yes, it's a good thing we have some in these forums who recognize and deal with trouble reports on behalf of the many who don't even visit the forums.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1567441
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1567446 - Posted: 5 Sep 2014, 8:00:39 UTC - in response to Message 1567439.  


Success! My 6.2.19 host has a full ~3.5-day cache again.

I am pleased you have sorted out the issue. I must say your OS choice for that machine surprised me. I am guessing you run a server of some description.
ID: 1567446
OTS
Volunteer tester

Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1567565 - Posted: 5 Sep 2014, 13:42:37 UTC - in response to Message 1567273.  

The good news is that not only is the "CPU Time" back after a couple of weeks' absence, but the "Run Time", which I have not seen in months, possibly years, is also back. The values of both are the same, which is a bit strange, but I am not going to quibble over that.

The bad news is that I am still not receiving any AP work for my CPU with version 6.2.14, despite many attempts since Ageless reported that a new scheduler was being deployed. It might just be the luck of the draw, since everyone is trying to obtain fresh AP units, but normally when there is a good number of WUs ready to send I am able to snag at least a few. Time will tell.


I never received a single WU before they ran out. Anyone else running 6.2.14? I am curious if something has changed at my end or if it is time to try a version that is known to work with the recent changes.
ID: 1567565



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.