WU restarted. Lost checkpoint? Error hunting?

Profile Jakob Creutzfeld
Volunteer tester
Joined: 13 Oct 00
Posts: 611
Credit: 2,025,000
RAC: 1
Germany
Message 645532 - Posted: 21 Sep 2007, 6:29:32 UTC - in response to Message 645288.  

I have the same problem, except mine does not get that far. On BOINC 5.10.20 I get to maybe 12 percent or less and then it resets back to 0%.


Hi OmegaDVD,

this could be a problem with your maximum disk/memory usage allowed for BOINC.

9/20/2007 9:53:09 AM||Preferences limit memory usage when active to 153.22MB
9/20/2007 9:53:09 AM||Preferences limit memory usage when idle to 153.22MB
9/20/2007 9:53:09 AM||Preferences limit disk usage to 0.93GB


These settings seem a little tight for BOINC (on a 512 MB system, that is about 25-30%). I'm not sure whether the memory usage limit covers the memory used by all programs (and the OS) or only the memory used by BOINC; if it is the total used memory, these values probably must be raised. Looking at one of your results, I found this in stderr.out:

<core_client_version>5.10.20</core_client_version>
<![CDATA[
<message>
Maximum memory exceeded
</message>
]]>

I would recommend resetting your memory preferences back to 75% (when in use) and 90% (when idle), unless you have a reason not to. Your disk settings may also be a little tight, so it is surely worth checking them too.
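
A quick sanity check of the 25-30% estimate, as a small Python sketch. It assumes the limits in the log are measured against the 510.73 MB of physical memory that the client reports at startup; the function name and dictionary labels are made up for illustration.

```python
# Values copied from the BOINC log lines quoted in this thread.
PHYSICAL_MB = 510.73  # "Memory: 510.73 MB physical"
LIMITS_MB = {
    "active (before the preference change)": 127.68,
    "active (after the preference change)": 153.22,
    "idle": 153.22,
}


def limit_as_percent(limit_mb: float, physical_mb: float = PHYSICAL_MB) -> float:
    """Return a memory limit as a percentage of physical RAM."""
    return 100.0 * limit_mb / physical_mb


for label, mb in LIMITS_MB.items():
    print(f"{label}: {mb} MB = {limit_as_percent(mb):.1f}% of RAM")
```

Under the same assumption, the suggested 75% and 90% settings would work out to roughly 383 MB and 460 MB on this machine.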

HTH
Andy
ID: 645532
OmegaDVD

Joined: 16 Sep 07
Posts: 2
Credit: 0
RAC: 0
United States
Message 645288 - Posted: 21 Sep 2007, 0:09:39 UTC

I have the same problem, except mine does not get that far. On BOINC 5.10.20 I get to maybe 12 percent or less and then it resets back to 0%.

The ugly truth is that I have been reading a lot of posts everywhere about this problem, and they all have one thing in common: either a Hyper-Threading CPU or a dual-core CPU. This suggests that BOINC has poor or no real support for these types of CPUs, and it is something that needs to be fixed.

jay_e, I don't have the mobile version of the CPU, but it is the same family as yours [x86 Family 15 Model 2 Stepping 4], which means it is a Pentium 4 HT (Hyper-Threading) Northwood core, hence the Family 15 Model 2. The only difference is the stepping. I also have Windows XP Professional SP2, which means this problem is not OS dependent.

Anyone who has this problem, please report it to the developers of BOINC. I will do the same, as it will not be fixed until enough of us complain.

However, for those of us who have XP, try WindowsXP-KB896256-v3-x86-ENU if you can find it. While it does not solve this problem, it may help in other ways: XP was not originally written to take advantage of HT or dual-core processors, and this update fixes load-balancing issues and is very useful for some games as well.

9/20/2007 9:46:16 AM||Starting BOINC client version 5.10.20 for windows_intelx86
9/20/2007 9:46:16 AM||log flags: task, file_xfer, sched_ops
9/20/2007 9:46:16 AM||Libraries: libcurl/7.16.4 OpenSSL/0.9.8e zlib/1.2.3
9/20/2007 9:46:16 AM||Data directory: C:\Program Files\BOINC
9/20/2007 9:46:16 AM||Processor: 2 GenuineIntel Intel(R) Pentium(R) 4 CPU 3.00GHz [x86 Family 15 Model 2 Stepping 9]
9/20/2007 9:46:16 AM||Processor features: fpu tsc sse sse2 mmx
9/20/2007 9:46:16 AM||OS: Microsoft Windows XP: Professional Edition, Service Pack 2, (05.01.2600.00)
9/20/2007 9:46:16 AM||Memory: 510.73 MB physical, 1.72 GB virtual
9/20/2007 9:46:16 AM||Disk: 279.47 GB total, 57.35 GB free
9/20/2007 9:46:16 AM||Local time is UTC -5 hours
9/20/2007 9:46:17 AM|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 3807481; location: (none); project prefs: default
9/20/2007 9:46:17 AM||General prefs: from SETI@home (last modified 2007-09-16 05:14:57)
9/20/2007 9:46:17 AM||Host location: none
9/20/2007 9:46:17 AM||General prefs: using your defaults
9/20/2007 9:46:17 AM||Preferences limit memory usage when active to 127.68MB
9/20/2007 9:46:17 AM||Preferences limit memory usage when idle to 153.22MB
9/20/2007 9:46:17 AM||Preferences limit disk usage to 0.93GB
9/20/2007 9:46:48 AM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 9:49:48 AM|SETI@home|Sending scheduler request: Requested by user
9/20/2007 9:49:48 AM|SETI@home|(not requesting new work or reporting completed tasks)
9/20/2007 9:49:53 AM|SETI@home|Scheduler RPC succeeded [server version 511]
9/20/2007 9:49:53 AM|SETI@home|Deferring communication for 11 sec
9/20/2007 9:49:53 AM|SETI@home|Reason: requested by project
9/20/2007 9:53:04 AM|SETI@home|Sending scheduler request: Requested by user
9/20/2007 9:53:04 AM|SETI@home|(not requesting new work or reporting completed tasks)
9/20/2007 9:53:09 AM|SETI@home|Scheduler RPC succeeded [server version 511]
9/20/2007 9:53:09 AM||General prefs: from SETI@home (last modified 2007-09-20 09:52:07)
9/20/2007 9:53:09 AM||Host location: none
9/20/2007 9:53:09 AM||General prefs: using your defaults
9/20/2007 9:53:09 AM||Preferences limit memory usage when active to 153.22MB
9/20/2007 9:53:09 AM||Preferences limit memory usage when idle to 153.22MB
9/20/2007 9:53:09 AM||Preferences limit disk usage to 0.93GB
9/20/2007 9:53:09 AM|SETI@home|Deferring communication for 11 sec
9/20/2007 9:53:09 AM|SETI@home|Reason: requested by project
9/20/2007 9:53:09 AM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527
9/20/2007 11:35:25 AM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527
9/20/2007 11:42:47 AM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 12:18:13 PM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 12:46:30 PM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527
9/20/2007 12:56:09 PM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527
9/20/2007 1:23:55 PM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527
9/20/2007 2:13:47 PM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527
9/20/2007 2:21:13 PM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 3:05:35 PM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 4:05:36 PM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 4:16:46 PM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 4:32:26 PM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527
9/20/2007 4:39:51 PM|SETI@home|Sending scheduler request: Requested by user
9/20/2007 4:39:51 PM|SETI@home|(not requesting new work or reporting completed tasks)
9/20/2007 4:39:56 PM|SETI@home|Scheduler RPC succeeded [server version 511]
9/20/2007 4:39:56 PM|SETI@home|Deferring communication for 11 sec
9/20/2007 4:39:56 PM|SETI@home|Reason: requested by project
9/20/2007 4:48:06 PM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 5:25:40 PM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527
9/20/2007 5:49:11 PM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527
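
To see at a glance how often each task was restarted, the log above can be tallied with a few lines of Python. This is only a sketch: it assumes the pipe-delimited layout seen in this log (timestamp|project|message), and the function name is made up for illustration.

```python
from collections import Counter


def count_restarts(log_lines):
    """Count 'Restarting task' messages per task name.

    Assumes each line looks like: timestamp|project|message
    (the project field is empty for client-wide messages).
    """
    counts = Counter()
    for line in log_lines:
        parts = line.rstrip("\n").split("|", 2)
        if len(parts) != 3:
            continue  # not a log line in the expected layout
        message = parts[2]
        if message.startswith("Restarting task "):
            # Message shape: "Restarting task <name> using <app> version <n>"
            task_name = message.split()[2]
            counts[task_name] += 1
    return counts


# A few lines copied from the log above.
sample = [
    "9/20/2007 9:53:09 AM||Preferences limit disk usage to 0.93GB",
    "9/20/2007 9:53:09 AM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527",
    "9/20/2007 11:35:25 AM|SETI@home|Restarting task 29ja07ad.13825.89951.9.5.141_2 using setiathome_enhanced version 527",
    "9/20/2007 11:42:47 AM|SETI@home|Restarting task 29ja07ac.2466.17250.4.5.121_3 using setiathome_enhanced version 527",
]

for task, n in count_restarts(sample).items():
    print(task, n)
```

A high count by itself is not proof of a fault, since the client may also log this message during ordinary switching between tasks; it is only a starting point for comparing against the progress shown in the manager.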

ID: 645288
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 644536 - Posted: 19 Sep 2007, 22:59:58 UTC - in response to Message 644528.  

No New Tasks is per project. Did you set NNT for S@H, or did you assume it was for the machine as a whole?


Greetings.
I set no new tasks for every project - including SETI.
(I usually run only one or two projects at a time -
to give faster response times.)
The purpose was to see if the WU would finish without possible interference from other projects...


UPDATE:
On the second pass, the WU completed, was uploaded and ack'ed OK.
So, this is not recurring (unless it's noticed by someone else...) :-)

My thanks to the admins and crunchers reading the posts and giving their time and effort to provide answers gratis.

Thanks, again,
Jay


I hate intermittent problems.


BOINC WIKI
ID: 644536
Profile jay_e

Joined: 6 Apr 03
Posts: 62
Credit: 1,072,112
RAC: 0
United States
Message 644528 - Posted: 19 Sep 2007, 22:49:49 UTC - in response to Message 643780.  

No New Tasks is per project. Did you set NNT for S@H, or did you assume it was for the machine as a whole?


Greetings.
I set no new tasks for every project - including SETI.
(I usually run only one or two projects at a time -
to give faster response times.)
The purpose was to see if the WU would finish without possible interference from other projects...


UPDATE:
On the second pass, the WU completed, was uploaded and ack'ed OK.
So, this is not recurring (unless it's noticed by someone else...) :-)

My thanks to the admins and crunchers reading the posts and giving their time and effort to provide answers gratis.

Thanks, again,
Jay

ID: 644528
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 643780 - Posted: 18 Sep 2007, 22:12:22 UTC

No New Tasks is per project. Did you set NNT for S@H, or did you assume it was for the machine as a whole?


BOINC WIKI
ID: 643780
Profile jay_e

Joined: 6 Apr 03
Posts: 62
Credit: 1,072,112
RAC: 0
United States
Message 643639 - Posted: 18 Sep 2007, 14:17:37 UTC

Hi - Greetings - Good Morning,

I had WUs for 3 projects running.
About 8 hours ago, I noticed that SETI had progressed to over 60% complete. This morning, the same WU (I had set no new tasks) was at 14%.
(I *assume* it is the same WU: I only had one SETI WU,
and "won't get new tasks" was active.)

Maybe it lost a checkpoint?

I rounded up the usual suspects, read the logs, and here is what I found:

1) BOINC GUI message display
(aka c/Program Files/BOINC/stdoutdae.txt):
No errors. Just the usual stuff. Here is a sample:
9/18/2007 6:04:03 AM|SETI@home|Restarting task 30ja07aa.26566.17659.9.5.146_1 using setiathome_enhanced version 527
9/18/2007 7:04:04 AM|World Community Grid|Restarting task faah2328_ZINC03845141_xmd01340_01_1 using faah version 541
9/18/2007 8:11:37 AM|SETI@home|Restarting task 30ja07aa.26566.17659.9.5.146_1 using setiathome_enhanced version 527
9/18/2007 8:48:59 AM||Resuming network activity

2) c/Program Files/BOINC/stdoutgui.txt
No entries for over 24 hours.

3) client_state.xml ( just a portion)

<active_task>
<project_master_url>http://setiathome.berkeley.edu/</project_master_url>
<result_name>30ja07aa.26566.17659.9.5.146_1</result_name>
<app_version_num>527</app_version_num>
<slot>1</slot>
<checkpoint_cpu_time>32327.654886</checkpoint_cpu_time>
<fraction_done>0.146235</fraction_done>
<current_cpu_time>32327.654886</current_cpu_time>
<swap_size>42094592.000000</swap_size>
<working_set_size>28626944.000000</working_set_size>
<working_set_size_smoothed>28626944.000000</working_set_size_smoothed>
<page_fault_rate>0.000000</page_fault_rate>
</active_task>


4) client_state_prev.xml
((( wasn't early enough - still shows reduced % )))
<active_task>
<project_master_url>http://setiathome.berkeley.edu/</project_master_url>
<result_name>30ja07aa.26566.17659.9.5.146_1</result_name>
<app_version_num>527</app_version_num>
<slot>1</slot>
<checkpoint_cpu_time>32327.654886</checkpoint_cpu_time>
<fraction_done>0.146235</fraction_done>
<current_cpu_time>32327.654886</current_cpu_time>
<swap_size>42094592.000000</swap_size>
<working_set_size>28626944.000000</working_set_size>
<working_set_size_smoothed>28626944.000000</working_set_size_smoothed>
<page_fault_rate>0.000000</page_fault_rate>
</active_task>



------------
5) time stamps for the above 2 files
Sep 18 09:53 client_state.xml
Sep 18 09:51 client_state_prev.xml

6) Looking into Program Files/BOINC/slots/1
stderr.txt ((Ah-HAH!! data starts over! ))

setiathome_enhanced 5.27 DevC++/MinGW

Work Unit Info:
...............
WU true angle range is : 0.438960
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrumUnrolled2 0.00180 0.00000
sse2_ChirpData_ak 0.09318 0.00000
v_vTranspose4x8ntw 0.03165 0.00000
AK SSE folding 0.00372 0.00000
Restarted at 7.05 percent.
Restarted at 14.30 percent.
Restarted at 21.21 percent.
Restarted at 26.69 percent.
Restarted at 33.63 percent.
Restarted at 73.63 percent.
Restarted at 82.18 percent.
setiathome_enhanced 5.27 DevC++/MinGW

Work Unit Info:
...............
WU true angle range is : 0.438960
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrumUnrolled 0.00180 0.00000
sse2_ChirpData_ak 0.05181 0.00000
v_vTranspose4x8ntw 0.02681 0.00000
AK SSE folding 0.00360 0.00000
Restarted at 7.40 percent.


-------------------------
7) Time stamps for files in slot 1
Sep 11 22:38 ..
Sep 17 05:28 work_unit.sah
Sep 17 05:28 setiathome_5.27_windows_intelx86.exe
Sep 17 05:28 setiathome-5.27_README
Sep 17 05:28 setiathome-5.27_COPYRIGHT
Sep 17 05:28 setiathome-5.27_COPYING
Sep 17 05:28 setiathome-5.27_AUTHORS
Sep 17 05:28 seti_logo
Sep 17 05:28 result.sah
Sep 17 05:28 libfftw3f-3-1-1a_upx.dll
Sep 17 05:28 .
Sep 18 08:11 init_data.xml
Sep 18 08:11 wisdom.sah
Sep 18 09:11 stderr.txt
Sep 18 09:11 state.sah
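
The cross-check between items 3 and 6 above (restart points in stderr.txt versus fraction_done in client_state.xml) can be automated. The following Python sketch assumes the "Restarted at N percent." wording and the <active_task> layout shown in this post; the function names are made up for illustration, and the samples are trimmed copies of the data above.

```python
import re
import xml.etree.ElementTree as ET

RESTART_RE = re.compile(r"Restarted at ([0-9.]+) percent")


def restart_points(stderr_text):
    """Return the percentages from 'Restarted at N percent.' lines, in order."""
    return [float(m.group(1)) for m in RESTART_RE.finditer(stderr_text)]


def find_regressions(points):
    """Return (previous, current) pairs where a restart point went backwards."""
    return [(a, b) for a, b in zip(points, points[1:]) if b < a]


def fraction_done_percent(active_task_xml):
    """Read <fraction_done> from an <active_task> fragment, as a percentage."""
    root = ET.fromstring(active_task_xml)
    return 100.0 * float(root.findtext("fraction_done"))


# Trimmed copies of the stderr.txt and client_state.xml excerpts above.
stderr_sample = """\
Restarted at 73.63 percent.
Restarted at 82.18 percent.
setiathome_enhanced 5.27 DevC++/MinGW
Restarted at 7.40 percent.
"""
state_sample = """\
<active_task>
<result_name>30ja07aa.26566.17659.9.5.146_1</result_name>
<fraction_done>0.146235</fraction_done>
</active_task>
"""

points = restart_points(stderr_sample)
print("restart points:", points)
print("went backwards:", find_regressions(points))
print(f"current progress: {fraction_done_percent(state_sample):.2f}%")
```

Any pair reported by find_regressions marks a spot where the application resumed from an earlier point than the previous restart, which is the pattern seen here after the 82.18 percent restart.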

=====================
=====================
Environment info
BOINC version: 5.10.20
- what BOINC sees of the environment:
9/17/2007 4:13:43 AM||Libraries: libcurl/7.16.4 OpenSSL/0.9.8e zlib/1.2.3
9/17/2007 4:13:43 AM||Data directory: C:\Program Files\BOINC
9/17/2007 4:13:44 AM||Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 Mobile CPU 1.70GHz [x86 Family 15 Model 2 Stepping 4]
9/17/2007 4:13:44 AM||Processor features: fpu tsc sse mmx
9/17/2007 4:13:44 AM||OS: Microsoft Windows 2000: Professional Edition, Service Pack 4, (05.00.2195.00)



Whew! Probably too much data.. Sorry.
But - any ideas?


T H A N K Y O U!
Jay
(PS: I use Cygwin to get the data without locking files.)




ID: 643639


 
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.