Panic Mode On (107) Server Problems?

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1892729 - Posted: 30 Sep 2017, 22:29:23 UTC - in response to Message 1892688.  
Last modified: 30 Sep 2017, 22:32:37 UTC

Yep, lots and lots of Arecibo VLARs. Even my machines are having difficulty getting new tasks, and my #1 cruncher actually got down to its last GPU task, the first time I can recall that happening. I just went ahead and rescheduled all the available guppis and non-VLAR Arecibo tasks from the CPU to the GPU queue. That freed up enough CPU queue space to be able to accept a bunch of Arecibo VLARs which, in turn, seemed to let a bunch of guppis and Arecibo non-VLAR tasks loose. It looks like that machine is slowly starting to rebuild the GPU queue but who knows how long that'll last. If I have to, I'll just move Arecibo VLARs over to the GPUs. I don't think they do all that badly with the Special App.


. . On my machines they take only about 50% longer ...

Stephen

:(
ID: 1892729 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1893029 - Posted: 3 Oct 2017, 5:19:26 UTC

Ah, we're back. Web site & forums went AWOL for about 10min.
Grant
Darwin NT
ID: 1893029 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1893030 - Posted: 3 Oct 2017, 5:23:04 UTC - in response to Message 1893029.  

Yep, network communications unavailable. Looking for work.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1893030 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1893032 - Posted: 3 Oct 2017, 5:29:46 UTC

Don't know how long we'll be back up for though.
Loading a thread (or even a page) is varying between slow & almost comatose.
Grant
Darwin NT
ID: 1893032 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1893035 - Posted: 3 Oct 2017, 6:06:52 UTC - in response to Message 1893032.  

Yes, my last post took almost 3 minutes to appear.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1893035 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1893050 - Posted: 3 Oct 2017, 10:26:21 UTC - in response to Message 1893032.  

Don't know how long we'll be back up for though.
Loading a thread (or even a page) is varying between slow & almost comatose.


. . Hey .. my ISP has been down for about 12 hours, everything is empty and the outage is about to start :(

. . Bummer dude!

Stephen

:(
ID: 1893050 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1893079 - Posted: 4 Oct 2017, 0:01:33 UTC

. . I am surprised this thread is not full of messages since the outrage. All three machines "No Tasks Available". Maybe I am the only one getting this ??

Stephen

??
ID: 1893079 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 36841
Credit: 261,360,520
RAC: 489
Australia
Message 1893082 - Posted: 4 Oct 2017, 0:08:59 UTC - in response to Message 1893079.  

. . I am surprised this thread is not full of messages since the outrage. All three machines "No Tasks Available". Maybe I am the only one getting this ??

Stephen

??

You're not the only 1 this time, but at least my caches got very near full before that happened. ;-)

Cheers.
ID: 1893082 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1893085 - Posted: 4 Oct 2017, 0:35:35 UTC - in response to Message 1893082.  

. . I am surprised this thread is not full of messages since the outrage. All three machines "No Tasks Available". Maybe I am the only one getting this ??

Stephen

??

You're not the only 1 this time, but at least my caches got very near full before that happened. ;-)

Cheers.


. . Only one of my machines got much work before the famine. The other had barely enough to get through an hour.

. . But I am now convinced this thread is haunted. For over an hour "No tasks", but a few minutes after posting that message all machines got new work ... eerie :)

Stephen

<shudder> :)
ID: 1893085 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1893088 - Posted: 4 Oct 2017, 1:23:49 UTC - in response to Message 1893082.  

. . I am surprised this thread is not full of messages since the outrage. All three machines "No Tasks Available". Maybe I am the only one getting this ??

Stephen

??

You're not the only 1 this time, but at least my caches got very near full before that happened. ;-)

Cheers.

Mine went from getting the maintenance message to reporting and filling up.
10/3/2017 6:21:21 PM	SETI@home	Project is temporarily shut down for maintenance
10/3/2017 6:47:09 PM	SETI@home	Sending scheduler request: To report completed tasks.
10/3/2017 6:47:09 PM	SETI@home	Reporting 50 completed tasks
10/3/2017 6:47:09 PM	SETI@home	Requesting new tasks for CPU
10/3/2017 6:47:12 PM	SETI@home	Scheduler request completed: got 50 new tasks

Maybe having max_tasks_reported set to 50 has something to do with my hosts' high success rate?
I did have one host report 46 and only receive 38. It did get 8 tasks on the next request 5 min later, but otherwise nothing too odd for me.
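
For anyone wanting to compare reported and received counts across a longer log, here is a minimal Python sketch. It assumes tab-separated event log lines in the same format as the ones pasted above; the function name and the regular expressions are illustrative and not part of BOINC.

```python
import re

# Sample lines copied from the log above (tab-separated: time, project, message).
LOG = """\
10/3/2017 6:21:21 PM\tSETI@home\tProject is temporarily shut down for maintenance
10/3/2017 6:47:09 PM\tSETI@home\tSending scheduler request: To report completed tasks.
10/3/2017 6:47:09 PM\tSETI@home\tReporting 50 completed tasks
10/3/2017 6:47:09 PM\tSETI@home\tRequesting new tasks for CPU
10/3/2017 6:47:12 PM\tSETI@home\tScheduler request completed: got 50 new tasks
"""

# Patterns for the two message types we care about (assumed wording, taken from the log above).
REPORTED = re.compile(r"Reporting (\d+) completed tasks")
RECEIVED = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def tally_requests(log_text):
    """Return (total_reported, total_received) summed over all log lines."""
    reported = received = 0
    for line in log_text.splitlines():
        # The message is the last tab-separated field.
        message = line.split("\t")[-1]
        match = REPORTED.search(message)
        if match:
            reported += int(match.group(1))
        match = RECEIVED.search(message)
        if match:
            received += int(match.group(1))
    return reported, received


if __name__ == "__main__":
    reported, received = tally_requests(LOG)
    print(f"reported {reported}, received {received}")
```

A host that reports 46 and receives 38 would show up as a mismatch between the two totals.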
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1893088 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 36841
Credit: 261,360,520
RAC: 489
Australia
Message 1893091 - Posted: 4 Oct 2017, 1:30:56 UTC - in response to Message 1893082.  

. . I am surprised this thread is not full of messages since the outrage. All three machines "No Tasks Available". Maybe I am the only one getting this ??

Stephen

??

You're not the only 1 this time, but at least my caches got very near full before that happened. ;-)

Cheers.

2-3 further requests after that post my caches were full.

Cheers.
ID: 1893091 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1893108 - Posted: 4 Oct 2017, 2:46:13 UTC

I think I am going to have to set the max_tasks_reported tag. I updated all the crunchers after the project came back up and the log said all were successful. I then took down Numbskull for the rebuild, but now that I have checked the hosts, I see that the reported tasks for it didn't take.

I also had something strange occur again on the Linux cruncher. I've seen it one time before. If a task is unsuccessful in downloading and the log says either task * was supposed to be 720XXX bytes and it got 0 bytes or the task is missing its header, then BOINC crashes the machine and it reboots. Anybody else see that? Is this a known bug or is it something that has to be reported?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1893108 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1893111 - Posted: 4 Oct 2017, 3:00:45 UTC - in response to Message 1893108.  

My Linux boxes have yet to have a crash, so I haven't seen that happen.
ID: 1893111 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1893113 - Posted: 4 Oct 2017, 3:17:28 UTC - in response to Message 1893111.  

That is what I expected with Linux since it is a lot more bullet-proof than Windows. Been rock solid up until the first event last week where something in BOINC got corrupted and it crashes the machine after a failed task download. When I bring BOINC back up (it is not autostarted), it will run for about 10-20 seconds, and when it attempts to retrieve the previously errored download task, it immediately dumps the machine again. That to me sounds like something wrote to memory where it shouldn't have. But I know nothing about troubleshooting Linux, and don't know what tools I am supposed to use. I have all kinds of tools for Windows to figure out what went wrong and know how to use them. But with Linux, I don't have a clue where to start.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1893113 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1893114 - Posted: 4 Oct 2017, 3:18:13 UTC - in response to Message 1893108.  

I also had something strange occur again on the Linux cruncher. I've seen it one time before. If a task is unsuccessful in downloading and the log says either task * was supposed to be 720XXX bytes and it got 0 bytes or the task is missing its header, then BOINC crashes the machine and it reboots. Anybody else see that? Is this a known bug or is it something that has to be reported?
Last Tuesday evening I got a couple with "Exit status -186 (0xFFFFFF46) ERR_RESULT_DOWNLOAD" and a Stderr with output like this:

WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>30mr08aa.13309.5798.9.36.5</file_name>
  <error_code>-200 (wrong size)</error_code>
</file_xfer_error>

That was the first and only time I've had D/L issues on any of my Linux hosts. However, they didn't cause any BOINC or system crash as far as I know.
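
If you want to pull the file name and error code out of a Stderr like that one, a rough Python sketch follows. It assumes the `<file_xfer_error>` block is well-formed XML, as in the sample above; the function name is hypothetical and nothing here comes from BOINC itself.

```python
import re
import xml.etree.ElementTree as ET

# Stderr sample copied from the post above.
STDERR = """\
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>30mr08aa.13309.5798.9.36.5</file_name>
  <error_code>-200 (wrong size)</error_code>
</file_xfer_error>
"""


def transfer_errors(stderr_text):
    """Return a list of (file_name, error_code) pairs found in the text."""
    # Grab each XML block out of the surrounding plain-text output.
    blocks = re.findall(
        r"<file_xfer_error>.*?</file_xfer_error>", stderr_text, flags=re.DOTALL
    )
    results = []
    for block in blocks:
        node = ET.fromstring(block)
        name = (node.findtext("file_name") or "").strip()
        code = (node.findtext("error_code") or "").strip()
        results.append((name, code))
    return results


if __name__ == "__main__":
    for name, code in transfer_errors(STDERR):
        print(f"{name}: {code}")
```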
ID: 1893114 · Report as offensive
W3Perl Project Donor
Volunteer tester

Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1893132 - Posted: 4 Oct 2017, 7:28:16 UTC - in response to Message 1893113.  

That is what I expected with Linux since it is a lot more bullet-proof than Windows. Been rock solid up until the first event last week where something in BOINC got corrupted and it crashes the machine after a failed task download. When I bring BOINC back up (it is not autostarted), it will run for about 10-20 seconds, and when it attempts to retrieve the previously errored download task, it immediately dumps the machine again. That to me sounds like something wrote to memory where it shouldn't have. But I know nothing about troubleshooting Linux, and don't know what tools I am supposed to use. I have all kinds of tools for Windows to figure out what went wrong and know how to use them. But with Linux, I don't have a clue where to start.


First check that your disk is not full!
If it isn't, try deleting the bad WU.
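
A quick way to do the disk check from Python, as a sketch only: it uses the standard library's `shutil.disk_usage`, and the helper names and the path passed in are placeholders, so point it at whichever partition holds your BOINC data.

```python
import shutil


def percent_used(used_bytes, total_bytes):
    """Return the share of the disk in use, as a percentage."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def disk_percent_used(path):
    """Return the percentage used on the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return percent_used(usage.used, usage.total)


if __name__ == "__main__":
    # "." is a placeholder; substitute the directory you care about.
    print(f"{disk_percent_used('.'):.1f}% used")
```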
ID: 1893132 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1893133 - Posted: 4 Oct 2017, 8:25:19 UTC - in response to Message 1893132.  

No, the disk is fine. Only using 6.5GB out of 215GB. It seems to clear itself up after a few BOINC restarts, with a "no state file found for task so-and-so" message.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1893133 · Report as offensive
MarkJ Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 17 Feb 08
Posts: 1139
Credit: 80,854,192
RAC: 5
Australia
Message 1893148 - Posted: 4 Oct 2017, 12:30:47 UTC - in response to Message 1893133.  

No, the disk is fine. Only using 6.5GB out of 215GB. It seems to clear itself up after a few BOINC restarts, with a "no state file found for task so-and-so" message.

Is this the BOINC 7.8.2 machine(s)? If so, it might be the slot directories issue.
BOINC blog
ID: 1893148 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1893154 - Posted: 4 Oct 2017, 13:26:52 UTC - in response to Message 1893108.  

I think I am going to have to set the max_tasks_reported tag. I updated all the crunchers after the project came back up and the log said all were successful. I then took down Numbskull for the rebuild, but now that I have checked the hosts, I see that the reported tasks for it didn't take.

I also had something strange occur again on the Linux cruncher. I've seen it one time before. If a task is unsuccessful in downloading and the log says either task * was supposed to be 720XXX bytes and it got 0 bytes or the task is missing its header, then BOINC crashes the machine and it reboots. Anybody else see that? Is this a known bug or is it something that has to be reported?

I had to enable it for another project and never bothered to unset it for most of my hosts.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1893154 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1893205 - Posted: 4 Oct 2017, 18:09:55 UTC - in response to Message 1893154.  

I remember there was trouble with reporting tasks a few years ago and we had to set that option to get them through. Somewhere along the way cc_config.xml got set back to defaults.
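
To check whether a cc_config.xml still carries the setting, something like the Python sketch below should do. It assumes the option sits under `<options>` inside `<cc_config>`, which is my understanding of the layout; the function name and both sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two illustrative configs (assumed layout): one at defaults, one with the option set.
DEFAULTS = "<cc_config><options></options></cc_config>"
TUNED = """\
<cc_config>
  <options>
    <max_tasks_reported>50</max_tasks_reported>
  </options>
</cc_config>
"""


def max_tasks_reported(config_xml):
    """Return the max_tasks_reported value as an int, or None if it is unset."""
    root = ET.fromstring(config_xml)
    text = root.findtext("options/max_tasks_reported")
    if text is None or not text.strip():
        return None
    return int(text)


if __name__ == "__main__":
    print(max_tasks_reported(DEFAULTS))
    print(max_tasks_reported(TUNED))
```

For a real host, read the contents of cc_config.xml from your BOINC data directory and pass the text in.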
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1893205 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.