SETI Tasks Keep Restarting

Questions and Answers : Windows : SETI Tasks Keep Restarting

Toast
Joined: 3 Jan 08
Posts: 2
Credit: 52
RAC: 0
United States
Message 697388 - Posted: 4 Jan 2008, 20:29:01 UTC

I've noticed that the last few SETI tasks (jobs, applications, whatever) keep restarting after getting only a few percent complete. I'm running a ClimatePrediction task on the same machine, and it doesn't seem to have any problems. Any idea why this would happen?
ID: 697388
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 697395 - Posted: 4 Jan 2008, 20:40:54 UTC - in response to Message 697388.  

I've noticed that the last few SETI tasks (jobs, applications, whatever) keep restarting after getting only a few percent complete. I'm running a ClimatePrediction task on the same machine, and it doesn't seem to have any problems. Any idea why this would happen?


ClimatePrediction's science app doesn't report when it restarts (neither does any other science app). Since SETI@Home is the only one that actually logs when it restarted, it's the only one that looks like it's having problems.

As long as the tasks are completing, I wouldn't worry about it too much.
ID: 697395
Toast
Joined: 3 Jan 08
Posts: 2
Credit: 52
RAC: 0
United States
Message 697404 - Posted: 4 Jan 2008, 21:22:22 UTC - in response to Message 697395.  

How can they complete if they keep restarting? Or is it a case where the "Percent Complete" counter gets set back to zero but the previous progress on the job isn't lost?
ID: 697404
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 697426 - Posted: 4 Jan 2008, 23:05:22 UTC - in response to Message 697404.  

How can they complete if they keep restarting? Or is it a case where the "Percent Complete" counter gets set back to zero but the previous progress on the job isn't lost?


It's actually more of a "resume" than a "restart". The Percent Complete counter gets recalculated from zero, so the task appears to be restarting from the beginning until it picks up from its last checkpoint.
ID: 697426
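
The resume-from-checkpoint behaviour described above can be illustrated with a minimal Python sketch. It is not SETI@home or BOINC code: the function names, the JSON checkpoint file, and the step counts are all assumptions made for illustration. It shows only the general idea that a task interrupted between checkpoints picks up from the last saved point, so the work done since that point is repeated.

```python
import json
import os
import tempfile


def read_checkpoint(state_path):
    """Return the last saved step, or 0 if no checkpoint exists (hypothetical helper)."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as fh:
        return json.load(fh)["step"]


def run_task(total_steps, checkpoint_every, state_path, stop_after=None):
    """Process steps, saving progress periodically; return the last step reached."""
    step = read_checkpoint(state_path)  # resume point, not necessarily zero
    done_this_run = 0
    while step < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            break  # simulate the client being shut down mid-task
        step += 1
        done_this_run += 1
        if step % checkpoint_every == 0:
            with open(state_path, "w") as fh:
                json.dump({"step": step}, fh)
    return step


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.json")
    reached = run_task(100, 10, path, stop_after=47)
    saved = read_checkpoint(path)
    print(f"interrupted at step {reached}, last checkpoint at step {saved}")
    finished = run_task(100, 10, path)
    print(f"second run resumed from step {saved} and ended at step {finished}")
```

In this toy model the steps between the last checkpoint and the interruption are computed twice, which is why frequent interruptions can make a task take much longer than usual even when nothing is lost permanently.
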
Rept-Tile
Joined: 18 Oct 03
Posts: 20
Credit: 28,144,831
RAC: 0
Russia
Message 697453 - Posted: 5 Jan 2008, 0:49:05 UTC

Same here. After I restarted BOINC, the last task restarted too:

1/5/2008 3:38:53 AM|SETI@home|Restarting task 08ja07af.22304.21340.5.6.180_1 using setiathome_enhanced version 528

Now it shows 4.8% complete after 01:55:00 of work. Before the BOINC restart it was at about 80%.
I haven't seen this behaviour before. It seems something is really wrong with some tasks. Let's see what happens once this task has been computed again.
ID: 697453
Rept-Tile
Joined: 18 Oct 03
Posts: 20
Credit: 28,144,831
RAC: 0
Russia
Message 697510 - Posted: 5 Jan 2008, 3:48:57 UTC - in response to Message 697453.  

Update: No, this time BOINC finished the task normally, but it took almost twice as long to complete as similar tasks on this computer.
The next task (with the same deadline and estimated rsc_fpops) is progressing normally and resumes from its checkpoint after BOINC is restarted.
Strange...
ID: 697510
DrkGoss
Joined: 6 Mar 04
Posts: 4
Credit: 62,449
RAC: 0
Romania
Message 698393 - Posted: 8 Jan 2008, 13:29:45 UTC

Same strange behavior here...
After some processing time, the progress counter keeps returning to zero and the WU is processed all over again, either when I start BOINC after powering on the computer or without any warning while it is still processing.
I aborted 2 WUs that behaved like that, but the new one behaves the same way, and it doesn't look like I'll be able to finish it.
This only happens to SETI workunits; the other projects are just fine.
ID: 698393
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 698399 - Posted: 8 Jan 2008, 14:06:23 UTC - in response to Message 698393.  

Same strange behavior here...
After some processing time, the progress counter keeps returning to zero and the WU is processed all over again, either when I start BOINC after powering on the computer or without any warning while it is still processing.
I aborted 2 WUs that behaved like that, but the new one behaves the same way, and it doesn't look like I'll be able to finish it.
This only happens to SETI workunits; the other projects are just fine.


I guess you must have SETI at quite a low resource share to have no cache of tasks on a 6400 C2D. Have you tried manually suspending the other projects to see if this task can crunch through to completion? Even if you are running the stock App with no overclock it should not take more than 3 - 4 hours on such a machine.

F.
ID: 698399
DrkGoss
Joined: 6 Mar 04
Posts: 4
Credit: 62,449
RAC: 0
Romania
Message 698479 - Posted: 8 Jan 2008, 23:22:23 UTC

It was all working just fine until about 1-2 months ago, with the same resource share between projects. Indeed, a WU wouldn't take more than 4 hours to complete before all of this started.
I've suspended the other projects and will see what happens.
ID: 698479
AndrewM
Volunteer tester
Joined: 5 Jan 08
Posts: 369
Credit: 34,275,196
RAC: 0
Australia
Message 699733 - Posted: 13 Jan 2008, 10:52:36 UTC

Despite resource share at 100% and no other projects,
Task ID http://setiathome.berkeley.edu/result.php?resultid=710106221
Work unit ID http://setiathome.berkeley.edu/workunit.php?wuid=185636427

resumed/restarted multiple times (note past tense). Only found 8 msgs to copy:

13/01/2008 2:19:16 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2
13/01/2008 3:16:43 PM|SETI@home|Resuming task 01mr07ag.29460.18068.11.6.120_2
13/01/2008 3:48:56 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2
13/01/2008 4:19:18 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2
13/01/2008 6:04:53 PM|SETI@home|Resuming task 01mr07ag.29460.18068.11.6.120_2
13/01/2008 7:09:44 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2
13/01/2008 8:19:58 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2
13/01/2008 8:58:12 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2 using setiathome_enhanced version 527

The WU racked up 20 hours of CPU time before I aborted it. Is there a procedure for reporting squirrelly WUs? My wingman's (wing person's?) CPU time was closer to 7 hours.

A.
AndrewM
ID: 699733
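
The Restarting and Resuming messages quoted above can be tallied per task with a short script. This is a sketch that assumes the `timestamp|project|message` layout seen in the lines pasted in this thread; the function name `count_task_events` is hypothetical, and other BOINC versions may format their messages differently.

```python
import re
from collections import Counter

# Assumed layout: "<timestamp>|<project>|Restarting task <name> ..." (or "Resuming").
LINE_RE = re.compile(
    r"^(?P<stamp>[^|]+)\|(?P<project>[^|]+)\|"
    r"(?P<event>Restarting|Resuming) task (?P<task>\S+)"
)


def count_task_events(lines):
    """Return {task_name: Counter of 'Restarting' / 'Resuming' messages}."""
    counts = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a restart/resume message
        counts.setdefault(match["task"], Counter())[match["event"]] += 1
    return counts


# The eight messages pasted in the post above.
SAMPLE = [
    "13/01/2008 2:19:16 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2",
    "13/01/2008 3:16:43 PM|SETI@home|Resuming task 01mr07ag.29460.18068.11.6.120_2",
    "13/01/2008 3:48:56 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2",
    "13/01/2008 4:19:18 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2",
    "13/01/2008 6:04:53 PM|SETI@home|Resuming task 01mr07ag.29460.18068.11.6.120_2",
    "13/01/2008 7:09:44 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2",
    "13/01/2008 8:19:58 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2",
    "13/01/2008 8:58:12 PM|SETI@home|Restarting task 01mr07ag.29460.18068.11.6.120_2 "
    "using setiathome_enhanced version 527",
]

for task, events in count_task_events(SAMPLE).items():
    print(task, dict(events))
```

A tally like this makes it easier to report how often a given task was restarted rather than resumed when asking for help.
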
DrkGoss
Joined: 6 Mar 04
Posts: 4
Credit: 62,449
RAC: 0
Romania
Message 700084 - Posted: 14 Jan 2008, 22:04:08 UTC

No luck. I've suspended the other projects, and the SETI WUs keep coming back to 0% from time to time. Even reinstalled BOINC and all projects and still the same thing keeps happening ...
ID: 700084
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 700108 - Posted: 14 Jan 2008, 23:48:38 UTC - in response to Message 700084.  

No luck. I've suspended the other projects, and the SETI WUs keep coming back to 0% from time to time. Even reinstalled BOINC and all projects and still the same thing keeps happening ...

Hmmm. Then I have only questions about the setup:

- Are you overclocking?
- Are you monitoring temps?
- Have you tried running Memtest and/or Prime95 just to check the hardware?

F.
ID: 700108
DrkGoss
Joined: 6 Mar 04
Posts: 4
Credit: 62,449
RAC: 0
Romania
Message 700134 - Posted: 15 Jan 2008, 1:17:51 UTC - in response to Message 700108.  

No luck. I've suspended the other projects, and the SETI WUs keep coming back to 0% from time to time. Even reinstalled BOINC and all projects and still the same thing keeps happening ...

Hmmm. Then I have only questions about the setup:

- Are you overclocking?
- Are you monitoring temps?
- Have you tried running Memtest and/or Prime95 just to check the hardware?

F.



No overclocking since the problem started, and the coretemps are at about 40C. Haven't run any Memtest or Orthos lately, but back when I was experimenting with overclocking settings I used to run them and the hardware seemed just fine.
ID: 700134
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 700193 - Posted: 15 Jan 2008, 7:31:44 UTC - in response to Message 700134.  

No luck. I've suspended the other projects, and the SETI WUs keep coming back to 0% from time to time. Even reinstalled BOINC and all projects and still the same thing keeps happening ...

Hmmm. Then I have only questions about the setup:

- Are you overclocking?
- Are you monitoring temps?
- Have you tried running Memtest and/or Prime95 just to check the hardware?

F.



No overclocking since the problem started, and the coretemps are at about 40C. Haven't run any Memtest or Orthos lately, but back when I was experimenting with overclocking settings I used to run them and the hardware seemed just fine.

Possibly worth a re-run just in case the memory needs re-seating or something else physical is going on?

F.
ID: 700193
hollus veterano
Joined: 9 Oct 03
Posts: 14
Credit: 478,141
RAC: 4
Spain
Message 700329 - Posted: 16 Jan 2008, 2:22:12 UTC

I have the same problem: units run but never complete (or only very rarely); instead they restart every now and then. The predicted time-to-completion counter is also a mess.
The problem started in November, and I just found another thread on the same subject, curiously also from November. No answers there.
I have deleted and reinstalled BOINC, but the problem persists.
I am running BOINC 5.10.30 on Windows XP Home Edition with Service Pack 2.
The computer is a Dell Inspiron M1710 with a T2600 CPU.
It took me 2 months to find this help page, and I like tinkering with this kind of program; I cannot help but wonder how many more people have this problem and either do not know it or do not report it.
I guess if help were easy it would already have arrived, but I thought I would add myself to the count as one more user who cannot send finished units!
Even though I would really like to...

Cheers,

Jose Cuesta
ID: 700329
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 712157 - Posted: 13 Feb 2008, 18:32:39 UTC

OK, big question on the restarting of tasks. Does anyone with the problem use BOINC's CPU throttling?
ID: 712157
Thomas Yen
Joined: 26 Jan 07
Posts: 1
Credit: 9,291
RAC: 0
Taiwan
Message 723619 - Posted: 9 Mar 2008, 5:24:07 UTC
Last modified: 9 Mar 2008, 5:25:19 UTC

Hello,
I've had this problem for a couple of months now. Today I tried suspending the BOINC Manager's Internet connection while computing this workunit, and it finished on schedule. I had kept the Internet connection enabled since I received the workunit two months ago, and it had been restarting continually until today. Maybe you'd like to try suspending the Internet connection?
ID: 723619
Christopher
Joined: 25 Mar 08
Posts: 1
Credit: 75,929
RAC: 0
United Kingdom
Message 730612 - Posted: 26 Mar 2008, 14:26:39 UTC

Hello,
I am new to the project and am having problems completing my first task.
I keep getting messages that the tasks are being restarted.
They restart every 30 minutes or so.
The most I get up to is around 10%, then they restart.
I have reinstalled and cancelled all my preferences.
I disconnected from the project and rejoined.
I have suspended the network connection as suggested above, and still no luck.
Can anyone please help?
Many Thanks
Chris
ID: 730612
simico
Joined: 25 Jun 04
Posts: 3
Credit: 65,244
RAC: 0
Hungary
Message 730947 - Posted: 27 Mar 2008, 11:37:30 UTC

I have also had this problem for a few weeks. I installed the latest version of BOINC, 5.10.45, excluded BOINC's folder in my antivirus software, and suspended network activity, all to no avail.
Before this problem appeared I could complete a task in ~4 hours; now completing a task takes a week because it keeps being restarted. Just now the task was at 98% and got restarted, and the "To completion" time (which grows after every restart) is now 26 hours!
I'm fed up with this... I'm not going to waste CPU time on tasks that never finish. I'll stop using BOINC until this gets fixed.
ID: 730947
simico
Joined: 25 Jun 04
Posts: 3
Credit: 65,244
RAC: 0
Hungary
Message 730948 - Posted: 27 Mar 2008, 11:39:29 UTC - in response to Message 712157.  

OK, big question on the restarting of tasks. Does anyone with the problem use BOINC's CPU throttling?


I use CPU throttling because I run BOINC on a laptop (100% CPU usage means more heat and noise).
ID: 730948


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.