Panic Mode On (78) Server Problems?

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1302086 - Posted: 4 Nov 2012, 14:32:20 UTC - in response to Message 1302085.  

I was just checking the same thing. Across 7 machines, the server thinks I've got more than 1500 tasks in progress. But I've actually got 91 - and almost all of those are because two machines have started getting lost tasks resent within the last hour.
ID: 1302086 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1302088 - Posted: 4 Nov 2012, 14:35:44 UTC - in response to Message 1302073.  

Results ready to send is now 0 and 0, and scheduler contacts to Main and Beta are going through without problem,

Claggy

Not necessarily so. Take my host 5828732, which

Sorry, I should have said scheduler contacts are going through more speedily when reporting only; trying to get the resends resent is still hit and miss.

Claggy
ID: 1302088 · Report as offensive
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1302090 - Posted: 4 Nov 2012, 14:40:27 UTC - in response to Message 1302086.  

I have one computer that has just over 400 tasks on it, while the server thinks it has just short of 5,200.
It cannot connect to the server, and should I get that many tasks I will have to abort most of them because they will never finish in time. I keep trying to connect but am unable to.
I guess there is nothing I can do about this right now.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1302090 · Report as offensive
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30591
Credit: 53,134,872
RAC: 32
United States
Message 1302096 - Posted: 4 Nov 2012, 14:49:04 UTC

Shortly the server will break from the ghost storm it is creating. It will run out of result storage space. Perhaps then Eric and/or Jeff will take action. With Matt out, I hope they have notes on the last time this happened.

ID: 1302096 · Report as offensive
fscheel

Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1302100 - Posted: 4 Nov 2012, 14:58:54 UTC - in response to Message 1302053.  

The tasks-in-progress count is now showing 592, with only 1 completed task on the machine. Is there something I need to do? Or do I just sit back and wait for it to correct itself? This machine has only received 2 tasks in the last 18 hours.


Wow, within minutes of posting this the machine actually got 20 "resent lost tasks". Hoping this is a sign of things to come.



After getting the 20 resent tasks crunched and reported, I have not received anything else. Since it appears the lost tasks are all for my GPU, I set it to use only the GPU. However, instead of getting lost tasks resent, it appears to try to send new tasks, and the number of lost tasks is up to 701.
ID: 1302100 · Report as offensive
Filipe

Joined: 12 Aug 00
Posts: 218
Credit: 21,281,677
RAC: 20
Portugal
Message 1302102 - Posted: 4 Nov 2012, 15:03:36 UTC

What if some of us just suspend network contact with the SETI server for maybe 12 hours?

It could help, since the server is being hammered with 1300 queries/second.

Most of which are the same computers desperately trying again and again to report or receive new tasks?
ID: 1302102 · Report as offensive
Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1302106 - Posted: 4 Nov 2012, 15:09:35 UTC

What if some of us just suspend network contact with the SETI server for maybe 12 hours?

It could help, since the server is being hammered with 1300 queries/second.

Most of which are the same computers desperately trying again and again to report or receive new tasks?

I've gone NNT and will stay there for a day or two. I have 2.75 days of CPU work and 5 days of GPU work, so I will wait it out. After my last post, I again received a new task rather than a lost task resend, so that's when I went NNT.

If you suspend network operations, you can't report completions, and some folks crunch a second project (or more).
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1302106 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34249
Credit: 79,922,639
RAC: 80
Germany
Message 1302119 - Posted: 4 Nov 2012, 15:35:10 UTC - in response to Message 1302102.  

What if some of us just suspend network contact with the SETI server for maybe 12 hours?

It could help, since the server is being hammered with 1300 queries/second.

Most of which are the same computers desperately trying again and again to report or receive new tasks?


Maybe a good idea.

No server contact from me for the next 12 hours.



With each crime and every kindness we birth our future.
ID: 1302119 · Report as offensive
fscheel

Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1302139 - Posted: 4 Nov 2012, 16:36:47 UTC

If I reset the project will it abort the lost tasks and resend them to me or someone else?
ID: 1302139 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1302141 - Posted: 4 Nov 2012, 16:42:59 UTC - in response to Message 1302139.  

If I reset the project will it abort the lost tasks and resend them to me or someone else?

No - reset applies to your local machine only. You would lose all the work you have managed to download (without telling the server anything). Exactly the opposite of what you're trying to achieve.
ID: 1302141 · Report as offensive
fscheel

Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1302146 - Posted: 4 Nov 2012, 16:49:27 UTC - in response to Message 1302141.  

If I reset the project will it abort the lost tasks and resend them to me or someone else?

No - reset applies to your local machine only. You would lose all the work you have managed to download (without telling the server anything). Exactly the opposite of what you're trying to achieve.


Ok thanks. I didn't know what it did and therefore I asked the question.
ID: 1302146 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1302148 - Posted: 4 Nov 2012, 16:53:41 UTC - in response to Message 1302146.  

I didn't know what it did and therefore I asked the question.

Now that is exactly the right thing to do. So many click first, and ask questions afterwards...
ID: 1302148 · Report as offensive
chromespringer
Joined: 3 Dec 05
Posts: 296
Credit: 55,183,482
RAC: 0
United States
Message 1302153 - Posted: 4 Nov 2012, 16:55:52 UTC

11/4/2012 9:25:57 AM | SETI@home | update requested by user
11/4/2012 9:26:00 AM | SETI@home | Sending scheduler request: Requested by user.
11/4/2012 9:26:00 AM | SETI@home | Reporting 8 completed tasks, requesting new tasks for CPU and ATI
11/4/2012 9:31:10 AM | SETI@home | Scheduler request failed: Timeout was reached
11/4/2012 9:31:13 AM | | Project communication failed: attempting access to reference site
11/4/2012 9:31:14 AM | | Internet access OK - project servers may be temporarily down.
11/4/2012 9:39:25 AM | SETI@home | Sending scheduler request: To fetch work.
11/4/2012 9:39:25 AM | SETI@home | Reporting 8 completed tasks, requesting new tasks for CPU and ATI
11/4/2012 9:43:44 AM | SETI@home | update requested by user
11/4/2012 9:44:32 AM | SETI@home | Scheduler request failed: Timeout was reached
11/4/2012 9:44:36 AM | | Project communication failed: attempting access to reference site
11/4/2012 9:44:37 AM | | Internet access OK - project servers may be temporarily down.


well, that was a somewhat short-lived spurt LOL :)
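
A log like the one above can be summarised by pairing each "Sending scheduler request" line with the failure that follows it. This is only a rough sketch: the function names are made up for illustration, and it assumes the timestamp layout and the " | " separators shown in this excerpt, which may differ on other clients or locales.

```python
from datetime import datetime

# Four lines copied from the log excerpt above.
LOG = """\
11/4/2012 9:26:00 AM | SETI@home | Sending scheduler request: Requested by user.
11/4/2012 9:31:10 AM | SETI@home | Scheduler request failed: Timeout was reached
11/4/2012 9:39:25 AM | SETI@home | Sending scheduler request: To fetch work.
11/4/2012 9:44:32 AM | SETI@home | Scheduler request failed: Timeout was reached
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    return when, project, message


def seconds_until_failure(log_text):
    """Return, for each failed request, the seconds between send and failure."""
    waits = []
    sent_at = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, _project, message = parse_line(line)
        if message.startswith("Sending scheduler request"):
            sent_at = when
        elif message.startswith("Scheduler request failed") and sent_at is not None:
            waits.append((when - sent_at).total_seconds())
            sent_at = None
    return waits


print(seconds_until_failure(LOG))
```

On these four lines both requests fail a little over five minutes after being sent, which fits a request timing out rather than being refused outright.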
ID: 1302153 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1302155 - Posted: 4 Nov 2012, 16:59:13 UTC - in response to Message 1302086.  

I was just checking the same thing. Across 7 machines, the server thinks I've got more than 1500 tasks in progress. But I've actually got 91 - and almost all of those are because two machines have started getting lost tasks resent within the last hour.

My notional list of "work in progress" has gone up from 1,500 to 2,100 in the last two hours.

Everything is getting set to NNT and staying there, until I see zero tasks ready to send AND the splitters disabled.
ID: 1302155 · Report as offensive
WendyR
Volunteer tester
Joined: 1 Aug 05
Posts: 44
Credit: 1,962,140
RAC: 0
United States
Message 1302162 - Posted: 4 Nov 2012, 17:06:40 UTC - in response to Message 1302155.  

Will it get resends when on NNT??

My total is now closer to 300 "missing" tasks.
ID: 1302162 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1302163 - Posted: 4 Nov 2012, 17:09:51 UTC - in response to Message 1302162.  
Last modified: 4 Nov 2012, 17:11:52 UTC

Will it get resends when on NNT??

My total is now closer to 300 "missing" tasks.

No, not at this project. Some projects like Einstein (which uses older server software) do get resends when NNT is set; with newer server software, resends only happen if you ask for work.

Claggy
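
For hosts managed from the command line rather than BOINC Manager, NNT can also be toggled with `boinccmd`, using its `nomorework` and `allowmorework` project operations. The sketch below only builds the command and does not run it. The helper name is hypothetical, and the project URL is an assumption: use whatever URL your own client lists for the project.

```python
# Assumed project URL; check the one your own BOINC client shows.
PROJECT_URL = "http://setiathome.berkeley.edu/"


def nnt_command(project_url, allow_new_tasks):
    """Build the boinccmd argument list that sets or clears No New Tasks."""
    operation = "allowmorework" if allow_new_tasks else "nomorework"
    return ["boinccmd", "--project", project_url, operation]


# Set NNT now; switch back to allow_new_tasks=True when fishing for resends.
print(" ".join(nnt_command(PROJECT_URL, allow_new_tasks=False)))
```

Since resends here only arrive when the client asks for work, the second form is the one to use once it is safe to request work again.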
ID: 1302163 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1302169 - Posted: 4 Nov 2012, 17:20:03 UTC

Anybody got any idea whether we're being haunted by Astropulse ghosts, to the same degree?

I think it's getting to the point that I need to put out an urgent APB, reluctant though I am to do so on a Sunday - just to get a lid put on the situation, until they can look at it tomorrow or Tuesday.

The question is - is stopping the MB splitters enough, or should I ask them to stop AP as well?

With splitters stopped, and no new work entering the system, we could allow work fetch and fish for resends of the work we've already supposedly got.
ID: 1302169 · Report as offensive
GaryG
Joined: 17 Mar 12
Posts: 8
Credit: 2,593,273
RAC: 0
United States
Message 1302174 - Posted: 4 Nov 2012, 17:28:32 UTC - in response to Message 1302169.  

Anybody got any idea whether we're being haunted by Astropulse ghosts, to the same degree?

I think it's getting to the point that I need to put out an urgent APB, reluctant though I am to do so on a Sunday - just to get a lid put on the situation, until they can look at it tomorrow or Tuesday.

The question is - is stopping the MB splitters enough, or should I ask them to stop AP as well?

With splitters stopped, and no new work entering the system, we could allow work fetch and fish for resends of the work we've already supposedly got.


My vote would be to stop both. If most systems are like mine, with upwards of 900 tasks assigned but not received, there is plenty of work to run if we can get it.
ID: 1302174 · Report as offensive
chromespringer
Joined: 3 Dec 05
Posts: 296
Credit: 55,183,482
RAC: 0
United States
Message 1302175 - Posted: 4 Nov 2012, 17:29:31 UTC

I'm showing over 5000 "in progress" with 0 work in BOINC Manager
ID: 1302175 · Report as offensive
fscheel

Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1302176 - Posted: 4 Nov 2012, 17:30:28 UTC - in response to Message 1302169.  

Anybody got any idea whether we're being haunted by Astropulse ghosts, to the same degree?

I think it's getting to the point that I need to put out an urgent APB, reluctant though I am to do so on a Sunday - just to get a lid put on the situation, until they can look at it tomorrow or Tuesday.

The question is - is stopping the MB splitters enough, or should I ask them to stop AP as well?

With splitters stopped, and no new work entering the system, we could allow work fetch and fish for resends of the work we've already supposedly got.


Have not crunched any AP tasks in days, but I see 2 among the ghosts.
ID: 1302176 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.