Panic Mode On (82) Server Problems?


juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1346121 - Posted: 13 Mar 2013, 11:55:47 UTC - in response to Message 1346117.  
Last modified: 13 Mar 2013, 11:56:20 UTC


I've been having similar problems getting GPU work since the maintenance window was completed.


Do you know what is happening?

13/03/2013 06:16:24 | SETI@home | This computer has reached a limit on tasks in progress

But it doesn't have a single GPU WU already downloaded in my cache! So how has it reached the limit?


Makes no sense.
ID: 1346121 · Report as offensive
andybutt
Volunteer tester
Joined: 18 Mar 03
Posts: 262
Credit: 164,205,187
RAC: 516
United Kingdom
Message 1346151 - Posted: 13 Mar 2013, 12:59:08 UTC

same problem
ID: 1346151 · Report as offensive
Profile William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1346152 - Posted: 13 Mar 2013, 13:12:24 UTC - in response to Message 1346121.  


I've been having similar problems getting GPU work since the maintenance window was completed.


Both your machines show 200 on board?

Do you know what is happening?

13/03/2013 06:16:24 | SETI@home | This computer has reached a limit on tasks in progress

But it doesn't have a single GPU WU already downloaded in my cache! So how has it reached the limit?

Makes no sense


6269362 has been doing some GPU work in the past hours.
That timestamp is local, so without knowing your timezone it's a bit difficult to match to the UTC website times.
Your hosts are of pretty similar makeup so I struggle to think of something that only affects one of them. I may have spotted another host, but I need confirmation that it's not had its GPU disabled.
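
For anyone wanting to line up a local Event Log timestamp with the UTC times on the website, here is a minimal Python sketch. The UTC-5 offset is purely an assumption for illustration (the poster's actual timezone isn't stated), and `local_log_time_to_utc` is a made-up helper name, not anything from BOINC.

```python
from datetime import datetime, timedelta, timezone

def local_log_time_to_utc(stamp: str, utc_offset_hours: float) -> datetime:
    """Convert a 'DD/MM/YYYY HH:MM:SS' event-log timestamp to UTC."""
    # Build a fixed-offset timezone from the (assumed) local UTC offset.
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_time = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S").replace(tzinfo=local_zone)
    return local_time.astimezone(timezone.utc)

# Assumption: the host's clock is at UTC-5.
utc_time = local_log_time_to_utc("13/03/2013 06:16:24", -5)
print(utc_time.isoformat())
```

With that assumed offset, the 06:16:24 log entry would correspond to 11:16:24 UTC; a different offset shifts the result accordingly.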

If anybody else has a host that had an allocation of 100+100 tasks before maintenance but now only gets 100, please post the host ID.

And no, 26153 posts of 'works for me' and 'I'm getting 200' and 'no problems here' are pretty unhelpful, so please, refrain from stating that. It's nice when it works smoothly but it's a slap in the face of those having problems. The notable exception being when you want to report that a previous problem has subsided.
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1346152 · Report as offensive
ExchangeMan
Volunteer tester

Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1346158 - Posted: 13 Mar 2013, 13:28:29 UTC - in response to Message 1346152.  
Last modified: 13 Mar 2013, 13:30:05 UTC


I've been having similar problems getting GPU work since the maintenance window was completed.


Both your machines show 200 on board?

Do you know what is happening?

13/03/2013 06:16:24 | SETI@home | This computer has reached a limit on tasks in progress

But it doesn't have a single GPU WU already downloaded in my cache! So how has it reached the limit?

Makes no sense


6269362 has been doing some GPU work in the past hours.
That timestamp is local, so without knowing your timezone it's a bit difficult to match to the UTC website times.
Your hosts are of pretty similar makeup so I struggle to think of something that only affects one of them. I may have spotted another host, but I need confirmation that it's not had its GPU disabled.

If anybody else has a host that had an allocation of 100+100 tasks before maintenance but now only gets 100, please post the host ID.

And no, 26153 posts of 'works for me' and 'I'm getting 200' and 'no problems here' are pretty unhelpful, so please, refrain from stating that. It's nice when it works smoothly but it's a slap in the face of those having problems. The notable exception being when you want to report that a previous problem has subsided.

Yes, it looks like both machines are doing OK now. I posted that before I went to work. I can see the stats from work, but can't see the BOINC console. Maybe I'll get a remote console going so I can monitor both of my SETI machines from work.

Must have been some gremlin that got into the machines. Hopefully it's gone for good. I've seen this odd behavior before. Sometimes it goes away in a couple of hours, sometimes it can last for many hours.
ID: 1346158 · Report as offensive
Profile Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1346163 - Posted: 13 Mar 2013, 13:49:44 UTC
Last modified: 13 Mar 2013, 13:49:57 UTC

I'm not so sure. At 09:35 this morning (UTC-4), one of my machines requested work and was told it had reached its limits. Checking with BoincTasks, it had 94 CPU and 92 GPU tasks on board. At 09:35, a request was again made reporting 8 tasks and got back 8 (7 GPU & 1 CPU). Something is fishy. What I really need for this machine are AP WUs. When will we start spitting out AP tasks again? In the words of WWE's Ryback -- FEED ME MORE!!
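
As a rough sanity check on those numbers, here is a small Python sketch. It assumes the per-host limit is 100 CPU plus 100 GPU tasks in progress (the 100+100 allocation mentioned earlier in the thread); the function name and the limit values are illustrative and not taken from the server code.

```python
def resources_at_limit(in_progress: dict[str, int], limits: dict[str, int]) -> list[str]:
    """Return the names of resources whose in-progress count has reached its limit."""
    return [name for name, count in in_progress.items() if count >= limits[name]]

limits = {"CPU": 100, "GPU": 100}   # assumed per-resource limits
on_board = {"CPU": 94, "GPU": 92}   # counts quoted in the post above

at_limit = resources_at_limit(on_board, limits)
print(at_limit if at_limit else "no resource is at its limit")
```

Under that assumption neither count reaches 100, which is why the "reached a limit" reply looks odd.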


I don't buy computers, I build them!!
ID: 1346163 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1346164 - Posted: 13 Mar 2013, 13:53:54 UTC - in response to Message 1346163.  

I'm not so sure. At 09:35 this morning (UTC-4), one of my machines requested work and was told it had reached its limits. Checking with BoincTasks, it had 94 CPU and 92 GPU tasks on board. At 09:35, a request was again made reporting 8 tasks and got back 8 (7 GPU & 1 CPU). Something is fishy. What I really need for this machine are AP WUs. When will we start spitting out AP tasks again? In the words of WWE's Ryback -- FEED ME MORE!!

Well, we're going to need some more 'tapes' soon - we're down to three on the server status page, and only two MB splitters. I imagine it's on their 'to do' list for when they get to the lab in a couple of hours' time.
ID: 1346164 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1346177 - Posted: 13 Mar 2013, 14:38:07 UTC

I'm not so sure. At 09:35 this morning (UTC-4), one of my machines requested work and was told it had reached its limits. Checking with BoincTasks, it had 94 CPU and 92 GPU tasks on board.

This may sound silly, but did you include the ones "in progress" and "ready to report"?
ID: 1346177 · Report as offensive
Profile Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1346179 - Posted: 13 Mar 2013, 14:41:14 UTC - in response to Message 1346177.  

I'm not so sure. At 09:35 this morning (UTC-4), one of my machines requested work and was told it had reached its limits. Checking with BoincTasks, it had 94 CPU and 92 GPU tasks on board.

This may sound silly, but did you include the ones "in progress" and "ready to report"?


I only reported 1 task and the totals included those in progress.


I don't buy computers, I build them!!
ID: 1346179 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1346196 - Posted: 13 Mar 2013, 15:36:32 UTC

Well, man your TCPs, folks.
Looks like we have a few fresh datasets and AP is splitting again.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1346196 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1346204 - Posted: 13 Mar 2013, 15:59:28 UTC

And, yup.
AP starts back up and all of a sudden my rigs start having problems connecting to the servers to do scheduling requests.

Meowsigh.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1346204 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1346206 - Posted: 13 Mar 2013, 16:20:06 UTC

Something like this?

22595 SETI@home 13/03/2013 15:53:38 Sending scheduler request: To fetch work.
22596 SETI@home 13/03/2013 15:53:38 Requesting new tasks for NVIDIA
22597 SETI@home 13/03/2013 15:54:08 Scheduler request failed: HTTP gateway timeout
22598 SETI@home 13/03/2013 15:55:45 Sending scheduler request: To fetch work.
22599 SETI@home 13/03/2013 15:55:45 Requesting new tasks for NVIDIA
22600 13/03/2013 15:56:18 Project communication failed: attempting access to reference site
22601 SETI@home 13/03/2013 15:56:18 Scheduler request failed: Server returned nothing (no headers, no data)
22602 13/03/2013 15:56:19 Internet access OK - project servers may be temporarily down.
22603 SETI@home 13/03/2013 15:59:26 Sending scheduler request: To fetch work.
22604 SETI@home 13/03/2013 15:59:26 Requesting new tasks for NVIDIA
22605 SETI@home 13/03/2013 15:59:56 Scheduler request failed: HTTP gateway timeout
22606 SETI@home 13/03/2013 16:04:45 Sending scheduler request: To fetch work.
22607 SETI@home 13/03/2013 16:04:45 Requesting new tasks for NVIDIA
22608 SETI@home 13/03/2013 16:05:17 Scheduler request failed: HTTP gateway timeout
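
To put a number on how often requests are failing, a log excerpt like the one above can be tallied with a few lines of Python. This is only a sketch: it assumes each failure line contains the text "Scheduler request failed: " followed by the reason, as in the lines pasted here, and `count_failures` is an illustrative name.

```python
from collections import Counter

# Failure lines copied from the log excerpt above.
LOG = """\
22597 SETI@home 13/03/2013 15:54:08 Scheduler request failed: HTTP gateway timeout
22601 SETI@home 13/03/2013 15:56:18 Scheduler request failed: Server returned nothing (no headers, no data)
22605 SETI@home 13/03/2013 15:59:56 Scheduler request failed: HTTP gateway timeout
22608 SETI@home 13/03/2013 16:05:17 Scheduler request failed: HTTP gateway timeout
"""

MARKER = "Scheduler request failed: "

def count_failures(log_text: str) -> Counter:
    """Count scheduler request failures, grouped by the reason after the marker."""
    reasons = Counter()
    for line in log_text.splitlines():
        _, found, reason = line.partition(MARKER)
        if found:  # the marker was present on this line
            reasons[reason.strip()] += 1
    return reasons

for reason, count in count_failures(LOG).most_common():
    print(f"{count} x {reason}")
```

Lines without the marker (successful requests, "Requesting new tasks" and so on) are simply skipped.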

ID: 1346206 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1346208 - Posted: 13 Mar 2013, 16:25:50 UTC - in response to Message 1346206.  

Something like this?

22595 SETI@home 13/03/2013 15:53:38 Sending scheduler request: To fetch work.
22596 SETI@home 13/03/2013 15:53:38 Requesting new tasks for NVIDIA
22597 SETI@home 13/03/2013 15:54:08 Scheduler request failed: HTTP gateway timeout
22598 SETI@home 13/03/2013 15:55:45 Sending scheduler request: To fetch work.
22599 SETI@home 13/03/2013 15:55:45 Requesting new tasks for NVIDIA
22600 13/03/2013 15:56:18 Project communication failed: attempting access to reference site
22601 SETI@home 13/03/2013 15:56:18 Scheduler request failed: Server returned nothing (no headers, no data)
22602 13/03/2013 15:56:19 Internet access OK - project servers may be temporarily down.
22603 SETI@home 13/03/2013 15:59:26 Sending scheduler request: To fetch work.
22604 SETI@home 13/03/2013 15:59:26 Requesting new tasks for NVIDIA
22605 SETI@home 13/03/2013 15:59:56 Scheduler request failed: HTTP gateway timeout
22606 SETI@home 13/03/2013 16:04:45 Sending scheduler request: To fetch work.
22607 SETI@home 13/03/2013 16:04:45 Requesting new tasks for NVIDIA
22608 SETI@home 13/03/2013 16:05:17 Scheduler request failed: HTTP gateway timeout

Similar, but I never saw a 'gateway timeout'.
Usually it's 'cannot connect to server' or an internal HTTP error.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1346208 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1346209 - Posted: 13 Mar 2013, 16:30:16 UTC

And I really don't understand why maxing out the bandwidth with AP downloads vs. having it maxed out with MB downloads results in chaos.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1346209 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1346225 - Posted: 13 Mar 2013, 17:36:43 UTC
Last modified: 13 Mar 2013, 17:41:43 UTC

13/03/2013 17:34:29 | SETI@home | Sending scheduler request: To fetch work.
13/03/2013 17:34:29 | SETI@home | Reporting 42 completed tasks
13/03/2013 17:34:55 | SETI@home | Scheduler request completed: got 8 new tasks
13/03/2013 17:34:55 | SETI@home | Resent lost task 31oc12ab.1477.12337.206158430219.10.171_1

etc...

And here come the rest of them...

13/03/2013 17:40:00 | SETI@home | Requesting new tasks for NVIDIA
13/03/2013 17:40:08 | SETI@home | Scheduler request completed: got 34 new tasks
ID: 1346225 · Report as offensive
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1346227 - Posted: 13 Mar 2013, 17:49:24 UTC

I have some 40-plus timeouts again on the i7 3770.
Is it possible to code it so that the server asks whether the CPU or the GPU needs work before it just blindly sends work?

I won't lose any sleep over having timeouts, but it seems like a waste of bandwidth. Why send work twice when once would have worked?

Old James
ID: 1346227 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1346231 - Posted: 13 Mar 2013, 17:59:23 UTC - in response to Message 1346227.  


I'm getting random "Couldn't connect to server" errors on my Scheduler requests.
Add to that there are only 2 MB splitters running, so when the next batch of shorties come through, we'll run out of work rather quickly.
Grant
Darwin NT
ID: 1346231 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1346236 - Posted: 13 Mar 2013, 18:00:41 UTC - in response to Message 1346227.  

Is it possible to code it so that the server asks whether the CPU or the GPU needs work before it just blindly sends work?

The only reason it will send work is because you have requested it.
Grant
Darwin NT
ID: 1346236 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14645
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1346237 - Posted: 13 Mar 2013, 18:01:50 UTC - in response to Message 1346227.  

I have some 40-plus timeouts again on the i7 3770.
Is it possible to code it so that the server asks whether the CPU or the GPU needs work before it just blindly sends work?

I won't lose any sleep over having timeouts, but it seems like a waste of bandwidth. Why send work twice when once would have worked?

No - the server doesn't blindly send work anyway.

Your computer requests work, and it requests CPU or GPU work as needed (or both, of course)

I edited out these lines from my last post:

13/03/2013 17:34:29 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
13/03/2013 17:34:29 | SETI@home | [sched_op] NVIDIA work request: 24998.86 seconds; 0.00 devices

13/03/2013 17:40:00 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
13/03/2013 17:40:00 | SETI@home | [sched_op] NVIDIA work request: 23683.21 seconds; 0.00 devices
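
If you want to pull the requested amounts out of lines like these, a minimal Python sketch follows. The regular expression is written against the exact wording of the lines quoted above and is an assumption about the log format in general; `requested_seconds` is an illustrative name.

```python
import re

# Matches e.g. "[sched_op] NVIDIA work request: 24998.86 seconds; 0.00 devices"
SCHED_OP = re.compile(r"\[sched_op\] (\w+) work request: ([\d.]+) seconds; ([\d.]+) devices")

LINES = [
    "13/03/2013 17:34:29 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices",
    "13/03/2013 17:34:29 | SETI@home | [sched_op] NVIDIA work request: 24998.86 seconds; 0.00 devices",
]

def requested_seconds(lines: list[str]) -> dict[str, float]:
    """Map each resource name to the seconds of work it requested."""
    result = {}
    for line in lines:
        match = SCHED_OP.search(line)
        if match:
            result[match.group(1)] = float(match.group(2))
    return result

requests = requested_seconds(LINES)
for resource, seconds in requests.items():
    print(f"{resource}: {seconds / 3600:.2f} hours of work requested")
```

Here the CPU asked for nothing and the GPU asked for just under seven hours of work, so the request itself already says which resource needs feeding.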

Enable <sched_op_debug> in the client configuration file (cc_config.xml) if you want to see that extra detail.
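
For reference, that flag goes inside the <log_flags> section of cc_config.xml. Below is a small Python sketch that builds a bare-bones file with just that one flag switched on. `build_cc_config` is an illustrative helper, and the output is a fresh minimal file: if you already have a cc_config.xml, add the flag to your existing <log_flags> section rather than overwriting the file.

```python
import xml.etree.ElementTree as ET

def build_cc_config(flags: dict[str, bool]) -> str:
    """Build a minimal cc_config.xml document with the given log flags."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name, enabled in flags.items():
        ET.SubElement(log_flags, name).text = "1" if enabled else "0"
    ET.indent(root)  # pretty-print; needs Python 3.9 or later
    return ET.tostring(root, encoding="unicode")

config_xml = build_cc_config({"sched_op_debug": True})
print(config_xml)
```

The client picks up the file when it starts, or when told to re-read its config files.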
ID: 1346237 · Report as offensive
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1346244 - Posted: 13 Mar 2013, 18:26:52 UTC - in response to Message 1346237.  

I have some 40-plus timeouts again on the i7 3770.
Is it possible to code it so that the server asks whether the CPU or the GPU needs work before it just blindly sends work?

I won't lose any sleep over having timeouts, but it seems like a waste of bandwidth. Why send work twice when once would have worked?

No - the server doesn't blindly send work anyway.

Your computer requests work, and it requests CPU or GPU work as needed (or both, of course)

I edited out these lines from my last post:

13/03/2013 17:34:29 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
13/03/2013 17:34:29 | SETI@home | [sched_op] NVIDIA work request: 24998.86 seconds; 0.00 devices

13/03/2013 17:40:00 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
13/03/2013 17:40:00 | SETI@home | [sched_op] NVIDIA work request: 23683.21 seconds; 0.00 devices

Enable <sched_op_debug> in the client configuration file (cc_config.xml) if you want to see that extra detail.


I had a look in there, Richard. Reminded me of a book of Latin I once looked at :)
So I will leave the coding and programming to those in the know, and take my timeouts with good grace.

Old James
ID: 1346244 · Report as offensive
AndrewM
Volunteer tester

Send message
Joined: 5 Jan 08
Posts: 369
Credit: 34,275,196
RAC: 0
Australia
Message 1346300 - Posted: 13 Mar 2013, 20:42:27 UTC

What is a pfb_splitter6 that lando is running on the Server Status page?
ID: 1346300 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.