Something wrong somwhere?


Message boards : Number crunching : Something wrong somwhere?

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 28,615,023
RAC: 42,192
Norway
Message 993309 - Posted: 1 May 2010, 19:03:59 UTC - in response to Message 992765.

It worked! :)

The only "problem" is that this particular machine now has a daily quota of 1 task per day. I hope that will change within days.
____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 28,615,023
RAC: 42,192
Norway
Message 993993 - Posted: 4 May 2010, 10:18:14 UTC - in response to Message 993311.

It hasn't increased yet. There must be other things wrong too.

I have a triple-core AMD processor running Fedora Core 12, and it has stopped working. It seems to have downloaded a lot of tasks, but it won't process them.

When running the client in a shell window, this is what I get:

$ ./run_client
04-May-2010 12:08:30 [---] Starting BOINC client version 6.10.17 for x86_64-pc-linux-gnu
04-May-2010 12:08:30 [---] log flags: file_xfer, sched_ops, task
04-May-2010 12:08:30 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
04-May-2010 12:08:30 [---] Data directory: /home/dahls/BOINC
04-May-2010 12:08:30 [---] Processor: 3 AuthenticAMD AMD Athlon(tm) II X3 435 Processor [Family 16 Model 5 Stepping 2]
04-May-2010 12:08:30 [---] Processor: 512.00 KB cache
04-May-2010 12:08:30 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy
04-May-2010 12:08:30 [---] OS: Linux: 2.6.31.5-127.fc12.x86_64
04-May-2010 12:08:30 [---] Memory: 7.78 GB physical, 9.77 GB virtual
04-May-2010 12:08:30 [---] Disk: 273.47 GB total, 253.49 GB free
04-May-2010 12:08:30 [---] Local time is UTC +2 hours
04-May-2010 12:08:30 [---] No usable GPUs found
04-May-2010 12:08:30 [---] Not using a proxy
04-May-2010 12:08:30 [SETI@home] URL http://setiathome.berkeley.edu/; Computer ID 5256434; resource share 100
04-May-2010 12:08:30 [SETI@home] General prefs: from SETI@home (last modified 11-Nov-2008 20:08:21)
04-May-2010 12:08:30 [SETI@home] Host location: none
04-May-2010 12:08:30 [SETI@home] General prefs: using your defaults
04-May-2010 12:08:30 [---] Preferences limit memory usage when active to 3985.84MB
04-May-2010 12:08:30 [---] Preferences limit memory usage when idle to 7174.51MB
04-May-2010 12:08:30 [---] Preferences limit disk usage to 100.00GB
BOINC initialization completed, beginning process execution...


In the project directory:
$ pwd
~/BOINC/projects/setiathome.berkeley.edu
$ ls
01dc06af.12808.7025.3.10.230 11dc06ag.30461.11115.12.10.18 13ja07af.27101.19918.5.10.147 28dc06aa.14595.18791.11.10.81 30dc06aa.15938.10296.4.10.241 31dc06aa.16264.11115.13.10.42 01dc06af.15276.3344.10.10.86 11dc06ag.32673.17250.5.10.16 13ja07af.27101.19918.5.10.155 28dc06aa.20809.15519.13.10.26 30dc06aa.15938.11114.4.10.44 31dc06aa.16264.29759.13.10.184 01dc06af.4151.21749.16.10.7 11dc06ag.6732.21749.13.10.235 13ja07af.27101.19918.5.10.161 28dc06aa.20809.20427.13.10.48 30dc06aa.15938.1707.4.10.1 31dc06aa.19061.28941.3.10.230 01dc06ag.18093.5389.7.10.42 11ja07ag.8471.5389.5.10.206 13ja07af.27101.19918.5.10.167 28dc06aa.24076.18382.12.10.146 30dc06aa.20838.20112.5.10.211 31dc06aa.20341.34258.5.10.11 01dc06ag.22446.72.11.10.160 11ja07ah.18477.11524.6.10.147 13ja07af.27101.22781.5.10.40 28dc06aa.24076.18382.12.10.149 30dc06aa.20838.20112.5.10.225 31dc06aa.23487.38348.4.10.27 01dc06ag.22941.47902.4.10.111 11ja07ah.18477.11524.6.10.153 13ja07af.27101.30552.5.10.129 28dc06aa.24076.18382.12.10.152 30dc06aa.20838.20112.5.10.227 31dc06aa.28373.34258.11.10.119 01dc06ag.29830.4571.8.10.91 11ja07ah.18477.11524.6.10.159 13ja07af.30754.18282.3.10.22 28dc06aa.24076.18382.12.10.168 30dc06aa.20838.20112.5.10.231 31dc06aa.28373.34258.11.10.68 01dc06ag.29830.5389.8.10.229 11ja07ah.18477.11524.6.10.212 13ja07af.30754.30143.3.10.163 28dc06aa.24076.18382.12.10.169 30dc06aa.24167.9478.3.10.112 31dc06aa.28373.5389.11.10.179 01dc06ag.30505.57718.13.10.165 11ja07ai.19083.16023.3.10.106 13ja07af.7325.2526.9.10.95 28dc06aa.24076.18382.12.10.171 30no06ae.12730.4980.14.10.160 31dc06aa.28373.5389.11.10.181 05dc06af.10057.1709.9.10.193 11ja07ai.19083.22567.3.10.68 13ja07af.7325.26053.9.10.101 28dc06aa.24076.18382.12.10.172 30no06ae.1495.11933.10.10.80 31dc06aa.28388.5389.12.10.139 05dc06af.11020.177218.15.10.126 11ja07ai.20369.2117.13.10.171 13ja07af.7325.3344.9.10.18 28dc06aa.24076.18382.12.10.175 30no06ae.1495.12342.10.10.87 
31dc06aa.28388.5389.12.10.141 05dc06af.11020.181717.15.10.176 11ja07ai.20394.13160.14.10.226 20fe07ag.18503.16452.16.10.145 28dc06aa.24076.18791.12.10.6 30no06ae.1495.21749.10.10.28 31dc06aa.28388.5389.12.10.145 05dc06af.4533.4163.8.10.243 11ja07ai.21793.24612.15.10.48 20fe07ah.6366.481.8.10.19 28dc06aa.28243.21245.16.10.178 30no06ae.1495.22158.10.10.244 31dc06aa.28388.5389.12.10.149 10fe07af.6882.89953.3.10.68 11ja07ai.24780.23385.10.10.36 20fe07ah.9772.8252.4.10.234 28dc06aa.28243.4162.16.10.127 30no06ae.18310.9479.8.10.74 31dc06aa.28518.9888.15.10.5 10fe07af.6882.91998.3.10.18 11ja07ai.27321.9888.5.10.179 24ja07ad.3787.8661.3.10.206 28dc06aa.28243.4162.16.10.50 30no06ae.28350.14796.11.10.234 31dc06aa.30435.13569.7.10.212 10fe07af.6882.91998.3.10.22 11ja07ai.28173.16841.16.10.192 24ja07af.16290.85548.13.10.39 28dc06aa.28243.4162.16.10.52 30no06ae.28350.14796.11.10.240 31dc06aa.30435.28532.7.10.58 10fe07af.6882.92407.3.10.115 11mr07ae.24796.198076.9.10.21 24ja07af.9677.11933.4.10.129 28dc06aa.28243.4162.16.10.54 30no06ae.28350.14796.11.10.246 31dc06aa.30435.33440.7.10.157 11dc06af.11543.4162.9.10.46 11mr07ae.24796.198076.9.10.28 27dc06af.1247.19490.6.10.115 28dc06aa.28243.4162.16.10.61 30no06ae.28350.9479.11.10.53 31dc06aa.30435.4571.7.10.75 11dc06af.20120.24612.4.10.189 12ja07ac.11279.9888.6.10.33 27dc06af.16907.20308.4.10.45 28dc06aa.28243.4162.16.10.66 30no06ae.8383.20113.9.10.123 31dc06aa.31301.33849.8.10.105 11dc06af.23021.18477.7.10.240 12ja07af.10123.481.15.10.1 27dc06af.16907.28897.4.10.4 28dc06aa.28243.4162.16.10.67 30no06ae.8383.20113.9.10.79 31dc06aa.31301.8252.8.10.89 11dc06af.23021.21749.7.10.102 12ja07af.14268.23794.7.10.57 27dc06af.16907.28897.4.10.55 28dc06aa.28243.4162.16.10.68 30no06af.11000.1708.4.10.137 arecibo_181.png 11dc06af.23021.21749.7.10.174 12ja07af.14268.7843.7.10.79 27dc06af.25121.4980.7.10.155 28dc06aa.28243.4162.16.10.69 30no06af.11000.1708.4.10.138 sah_40.png 11dc06af.23021.21749.7.10.87 12ja07af.24270.8252.12.10.104 
27dc06af.27652.18263.3.10.7 28dc06aa.28243.4162.16.10.71 30no06af.11000.1708.4.10.143 sah_banner_290.png 11dc06af.23021.21749.7.10.90 12ja07af.24295.18477.13.10.4 27ja07ag.14237.23385.15.10.82 28dc06aa.29509.8661.10.10.103 30no06af.11000.1708.4.10.151 sah_ss_290.png 11dc06af.23021.21749.7.10.94 12ja07af.27550.16432.8.10.166 27ja07ag.21227.16023.9.10.93 28dc06aa.29509.8661.10.10.88 30no06af.3017.20522.3.10.161 seti_528.jpg 11dc06af.23021.21749.7.10.99 12ja07af.27550.1708.8.10.146 27ja07ag.21227.17659.9.10.204 28dc06aa.29509.9479.10.10.120 30no06af.3017.58003.3.10.114 setiathome-5.28_AUTHORS 11dc06af.23021.7434.7.10.22 12ja07af.28050.20113.9.10.126 27ja07ag.21227.22158.9.10.28 28dc06aa.5555.11838.15.10.239 31dc06aa.14600.481.9.10.123 setiathome-5.28_COPYING 11dc06af.23021.7434.7.10.28 13ja07af.23583.17055.8.10.102 27ja07ag.24649.14387.3.10.107 28dc06aa.5555.13474.15.10.210 31dc06aa.14600.481.9.10.129 setiathome-5.28_COPYRIGHT 11dc06af.23021.7434.7.10.34 13ja07af.23583.17055.8.10.120 27ja07ag.24649.14387.3.10.68 28dc06aa.5555.21245.15.10.5 31dc06aa.14600.481.9.10.141 setiathome-5.28_README 11dc06af.23641.22976.8.10.103 13ja07af.23583.17055.8.10.153 27ja07ag.2476.23794.13.10.92 28dc06aa.7971.16746.8.10.212 31dc06aa.14600.481.9.10.147 setiathome-5.28.x86_64-pc-linux-gnu 11dc06af.23641.22976.8.10.111 13ja07af.23583.17055.8.10.165 27ja07ag.3797.6207.14.10.144 28dc06aa.7971.23290.8.10.96 31dc06aa.14600.481.9.10.153 slideshow_setiathome_enhanced_00 11dc06af.26961.1708.11.10.19 13ja07af.23583.17055.8.10.176 27ja07ag.4117.18886.10.10.250 29dc06aa.8964.12342.10.10.28 31dc06aa.14600.481.9.10.159 slideshow_setiathome_enhanced_01 11dc06ag.14973.20931.8.10.162 13ja07af.23583.17055.8.10.81 27ja07ag.7296.15614.4.10.212 29dc06af.6854.1299.12.10.56 31dc06aa.14600.4980.9.10.83 slideshow_setiathome_enhanced_02 11dc06ag.19076.24203.9.10.231 13ja07af.23583.17055.8.10.84 28dc06aa.14283.15519.14.10.180 29dc06af.6854.1299.12.10.59 31dc06aa.15993.30986.10.10.157 stat_icon 
11dc06ag.20225.5389.10.10.190 13ja07af.23583.17055.8.10.87 28dc06aa.14283.15519.14.10.192 29dc06af.6854.1299.12.10.71 31dc06aa.15993.30986.10.10.19 11dc06ag.20319.13160.3.10.84 13ja07af.23583.17055.8.10.90 28dc06aa.14283.15519.14.10.201 29dc06af.6854.1299.12.10.72 31dc06aa.15993.30986.10.10.228 11dc06ag.20319.7843.3.10.83 13ja07af.23583.17055.8.10.96 28dc06aa.14595.14292.11.10.102 29dc06af.6854.2117.12.10.203 31dc06aa.15993.30986.10.10.76 11dc06ag.26523.2526.4.10.208 13ja07af.23583.17055.8.10.99 28dc06aa.14595.14292.11.10.107 29mr07af.31874.24612.12.10.154 31dc06aa.16264.11115.13.10.30


I also tried a
./boinccmd --set_run_mode always


but that does not seem to help. I did this because the first machine I had problems with (the FC4 one) began to suspend BOINC while I was working on it, even though the setup says it should always run. This worked on the FC4 machine, but does not seem to work on the FC12 machine.
____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 28,615,023
RAC: 42,192
Norway
Message 993994 - Posted: 4 May 2010, 10:32:46 UTC - in response to Message 993993.

BTW, here is a result of 'boinccmd --get_state':
http://www.dahl-stamnes.net/dahls/Ymse/boinc.state.html
____________

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 359,640
RAC: 35
Germany
Message 993999 - Posted: 4 May 2010, 11:10:04 UTC - in response to Message 993994.

According to your BOINC state, you have 210 tasks ready to report. You should do a
./boinccmd --project http://setiathome.berkeley.edu/ update

What about error messages in stdoutdae?

Regards,
Gundolf
____________
Computers aren't everything in life. (Just kidding)

SETI@home classic workunits 3,758
SETI@home classic CPU time 66,520 hours

[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1151
Credit: 3,841,770
RAC: 1,691
United Kingdom
Message 994002 - Posted: 4 May 2010, 11:15:55 UTC

I got an HTTP error about an hour ago. Are there problems again with one of the two download machines (.13, I think)?
____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 28,615,023
RAC: 42,192
Norway
Message 994021 - Posted: 4 May 2010, 14:25:41 UTC - in response to Message 993999.

According to your BOINC state, you have 210 tasks ready to report. You should do a
./boinccmd --project http://setiathome.berkeley.edu/ update

What about error messages in stdoutdae?

Regards,
Gundolf


Doing the command above I get:
can't connect to local host.

If I do a run_client in a shell I get:
4-May-2010 12:49:51 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 12:49:51 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 12:55:02 [---] Project communication failed: attempting access to reference site
04-May-2010 12:55:02 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 12:55:05 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 15:46:42 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 15:46:42 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 15:51:54 [---] Project communication failed: attempting access to reference site
04-May-2010 15:51:57 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 15:51:58 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 15:52:59 [SETI@home] Fetching scheduler list
04-May-2010 15:53:04 [SETI@home] Master file download succeeded
04-May-2010 15:53:09 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 15:53:09 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 15:58:20 [---] Project communication failed: attempting access to reference site
04-May-2010 15:58:20 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 15:58:23 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 15:59:20 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 15:59:20 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 16:04:30 [---] Project communication failed: attempting access to reference site
04-May-2010 16:04:30 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 16:04:34 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 16:05:31 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 16:05:31 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 16:10:42 [---] Project communication failed: attempting access to reference site
04-May-2010 16:10:42 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 16:10:45 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 16:11:42 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 16:11:42 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 16:16:53 [---] Project communication failed: attempting access to reference site
04-May-2010 16:16:53 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 16:16:56 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 16:17:53 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 16:17:53 [SETI@home] Reporting 210 completed tasks, requesting new tasks
^C04-May-2010 16:20:35 [---] Received signal 2
04-May-2010 16:20:36 [---] Exit requested by user
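Editor's note: a log like the one above is easier to judge once the scheduler attempts and failures are counted. The following is a minimal sketch, assuming the client log keeps the "DD-Mon-YYYY HH:MM:SS [project] message" shape seen in this post; the function names parse_log and summarize are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Timestamp shape taken from the log in this thread, e.g. "04-May-2010 12:49:51".
STAMP = r"\d{1,2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}"
# Zero-width split point in front of every "<stamp> [" so that a log pasted
# as one long line can still be cut back into individual entries.
SPLIT = re.compile(rf"(?<!\d)(?={STAMP} \[)")
ENTRY = re.compile(rf"({STAMP}) \[([^\]]*)\] (.*)", re.S)


def parse_log(text):
    """Return a list of (timestamp, project, message) tuples."""
    entries = []
    for piece in SPLIT.split(text):
        match = ENTRY.match(piece.strip())
        if match:
            entries.append((match.group(1), match.group(2), match.group(3).strip()))
    return entries


def summarize(entries):
    """Count scheduler requests and scheduler failures in the parsed entries."""
    counts = Counter()
    for _stamp, _project, message in entries:
        if message.startswith("Sending scheduler request"):
            counts["requests"] += 1
        elif message.startswith("Scheduler request failed"):
            counts["failures"] += 1
    return counts
```

If every request is paired with a "Scheduler request failed: Timeout was reached" entry, the host is never getting an answer from the scheduler at all, which is a different situation from the project replying that it is down.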


____________

Ralf Haziak
Volunteer tester
Joined: 15 Nov 02
Posts: 14
Credit: 600,623
RAC: 39
Germany
Message 994086 - Posted: 4 May 2010, 23:08:02 UTC - in response to Message 994021.
Last modified: 4 May 2010, 23:10:38 UTC

Stupid question:
are you trying to send the WUs during the weekly maintenance? ;-)

Every Tuesday morning (Pacific time) we have a 3-4 hour outage for database maintenance.

____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4312
Credit: 1,085,884
RAC: 1,515
United States
Message 994092 - Posted: 4 May 2010, 23:17:55 UTC - in response to Message 994086.

Stupid question:
are you trying to send the WUs during the weekly maintenance? ;-)

That should give a "Project is down" message in reply; the indication is that the Scheduler requests aren't getting any kind of connection. There's a successful "master file" get, which ensures the host has the correct host name of the Scheduler. Possibly it's a DNS problem where the host name isn't being translated to the right IP address.
Joe
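
Editor's note: one way to test the DNS theory is to resolve the same host name on a working machine and on the failing one and compare the answers. The sketch below is only an illustration under that assumption; resolve_ipv4 and same_resolution are hypothetical helper names, and the resolver argument exists so the comparison can be exercised without network access.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.gethostbyname_ex):
    """Return the sorted IPv4 addresses the resolver reports for hostname."""
    # gethostbyname_ex returns (canonical name, alias list, address list).
    _name, _aliases, addresses = resolver(hostname)
    return sorted(addresses)


def same_resolution(hostname, expected_ips, resolver=socket.gethostbyname_ex):
    """True if hostname resolves to exactly the addresses in expected_ips."""
    return set(resolve_ipv4(hostname, resolver)) == set(expected_ips)
```

Running resolve_ipv4 with the scheduler's host name on both machines and comparing the two lists shows whether they really talk to the same address; a mismatch would support the DNS explanation, while a match points elsewhere.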

Ralf Haziak
Volunteer tester
Joined: 15 Nov 02
Posts: 14
Credit: 600,623
RAC: 39
Germany
Message 994099 - Posted: 4 May 2010, 23:40:33 UTC - in response to Message 994092.
Last modified: 4 May 2010, 23:47:54 UTC

I'm only asking because it's Tuesday, and a few hours ago I also saw
a timeout message followed by the internet connection test
in my logs.

Of course it could be chance that I got the same messages today.

By the way, are old messages stored somewhere so I could check whether it was the same timeframe?
____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 28,615,023
RAC: 42,192
Norway
Message 994160 - Posted: 5 May 2010, 6:18:36 UTC - in response to Message 994092.

Stupid question:
are you trying to send the WUs during the weekly maintenance? ;-)

That should give a "Project is down" message in reply; the indication is that the Scheduler requests aren't getting any kind of connection. There's a successful "master file" get, which ensures the host has the correct host name of the Scheduler. Possibly it's a DNS problem where the host name isn't being translated to the right IP address.
Joe


I've checked my DNS - no error messages are being reported (all machines are using the same DNS server).

I also checked my firewall. The XP machine, where things seem to be working OK (1), is contacting the same IP as the Linux machines, where things do not work properly.

The triple core machine has probably not been able to get new work sets for at least a week. And it has 210 work sets to report, but it's not able to report them.

1: Currently all machines are reporting "Project has no jobs available".
____________

stephen Goodyer
Joined: 8 Oct 06
Posts: 37
Credit: 268,148
RAC: 130
United Kingdom
Message 994171 - Posted: 5 May 2010, 7:21:26 UTC - in response to Message 994166.
Last modified: 5 May 2010, 7:54:32 UTC

On the status page it says that lando and bambi are not working. Could this be the problem? I can't get any WUs.

Oh well, it gives me more time to plant some lettuce and onions. I'm not going to make a meal out of it.

Leopoldo
Volunteer tester
Joined: 4 Aug 99
Posts: 102
Credit: 2,888,039
RAC: 91
Russia
Message 994188 - Posted: 5 May 2010, 9:31:59 UTC - in response to Message 994160.

The triple core machine has probably not been able to get new work sets for at least a week. And it has 210 work sets to report, but it's not able to report them.

Firstly you should report completed tasks. You can check availability of the upload server (208.68.240.16) by visiting the URL http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler with your favorite browser from this triple core machine.

Normal answer looks like:

<data_server_reply>
    <status>1</status>
    <message>no command</message>
</data_server_reply>

If it does, check your uploads (completed tasks are reported through the file "sched_request_setiathome.berkeley.edu.xml", so maybe this file can't be sent because of its size).

If it doesn't, replace the host name in the URL with the numeric IP address and try again.
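
Editor's note: on a headless machine the same check can be scripted. This is a minimal sketch, assuming the handler's normal reply is the small XML document quoted above with status 1 and the message "no command"; the function names fetch and is_normal_handler_reply are invented for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch(url, timeout=30):
    """Download url and return the body as text (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", "replace")


def is_normal_handler_reply(xml_text):
    """True if xml_text matches the 'normal answer' quoted in this thread."""
    try:
        root = ET.fromstring(xml_text.strip())
    except ET.ParseError:
        return False  # not XML at all, e.g. a plain-text error
    if root.tag != "data_server_reply":
        return False  # some other document, e.g. an HTML error page
    status = (root.findtext("status") or "").strip()
    message = (root.findtext("message") or "").strip()
    return status == "1" and message == "no command"
```

Calling is_normal_handler_reply(fetch(url)) with the upload handler URL from the post would then give a simple yes or no. A timeout or connection error raised by fetch is itself a useful result, since it shows the machine cannot reach the server.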

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 28,615,023
RAC: 42,192
Norway
Message 994196 - Posted: 5 May 2010, 12:28:33 UTC - in response to Message 994188.


Firstly you should report completed tasks. You can check availability of the upload server (208.68.240.16) by visiting the URL http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler with your favorite browser from this triple core machine.


That would be wget, since this machine does not have any screen, keyboard or mouse ;)

And the answer looks like you said:


<data_server_reply>
    <status>1</status>
    <message>no command</message>
</data_server_reply>

If it does, check your uploads (completed tasks are reported through the file "sched_request_setiathome.berkeley.edu.xml", so maybe this file can't be sent because of its size).

If it doesn't, replace the host name in the URL with the numeric IP address and try again.


I checked the file, but I'm not sure what to look for. I see there are a lot of messages like:
Work Unit Info:
...............
WU true angle range is : 1.388434
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrum 0.00023 0.00000
v_ChirpData 0.01180 0.00000
v_vTranspose4x16ntw 0.00435 0.00000
AK SSE folding 0.00074 0.00000
Unrecognized XML in parse_init_data_file: hostid
Skipping: 5256434
Skipping: /hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 4486.795998
Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1272251169.000000
Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time
Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: run_gpu_if_user_active
Skipping: 0
Skipping: /run_gpu_if_user_active
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct
Skipping: 100.000000
Skipping: /max_ncpus_pct
Unrecognized XML in parse_init_data_file: hostid
Skipping: 5256434
Skipping: /hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 4486.795998
Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1272251169.000000
Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time
Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: run_gpu_if_user_active
Skipping: 0
Skipping: /run_gpu_if_user_active
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct
Skipping: 100.000000
Skipping: /max_ncpus_pct
Restarted at 95.62 percent.
Flopcounter: 11763840604466.185547


____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 28,615,023
RAC: 42,192
Norway
Message 994202 - Posted: 5 May 2010, 13:40:17 UTC - in response to Message 994175.

There is only one dataset left loaded to split...so very few WUs are being generated compared to the usual number...work will be very hard to come by until some more datasets are loaded.

I just inquired if anybody knows why in the Panic Mode thread.....


Are you telling us that we are running out of work? That even if everything is working normally, there will be no more work sets to download?

____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 28,615,023
RAC: 42,192
Norway
Message 994243 - Posted: 5 May 2010, 16:42:23 UTC - in response to Message 994202.

The first FC4 machine, which I had problems with, can get new work now. It gets one work set and, when that is finished, it uploads it and gets a new one without having to wait.

At the same time, the triple-core FC12 machine is not able to report anything or get any new work sets. Many of the work sets that it wants to report are about to time out or have timed out.

In my opinion it seems like there is something wrong at the server side.

Should I delete everything on the triple core machine and then reinstall BOINC again?
____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4312
Credit: 1,085,884
RAC: 1,515
United States
Message 994250 - Posted: 5 May 2010, 17:09:44 UTC - in response to Message 994188.

The triple core machine has probably not been able to get new work sets for at least a week. And it has 210 work sets to report, but it's not able to report them.

Firstly you should report completed tasks. You can check availability of the upload server
...

The 210 tasks dahls has are "Ready to report", so they have already been uploaded successfully. Communications with the upload handler are done; it's communicating with the Scheduler that's failing.
Joe

Leopoldo
Volunteer tester
Joined: 4 Aug 99
Posts: 102
Credit: 2,888,039
RAC: 91
Russia
Message 994254 - Posted: 5 May 2010, 17:43:18 UTC - in response to Message 994250.

The 210 tasks dahls has are "Ready to report", so they have already been uploaded successfully. Communications with the upload handler are done; it's communicating with the Scheduler that's failing.

Thx, Josef. I missed this.

@dahls: Jørn, can you visit the scheduler URL http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi from the machine in question?

Normal answer looks like:

<scheduler_reply>
<scheduler_version>611</scheduler_version>
<master_url>http://setiathome.berkeley.edu/</master_url>
<request_delay>11.000000</request_delay>
<message priority="low">Error in request message: no start tag </message>
<project_name>SETI@home</project_name>
</scheduler_reply>

In my opinion it seems like there is something wrong at the server side.

To determine this, look at the server's answer after the reporting attempt (file "sched_reply_setiathome.berkeley.edu.xml").
A normal acknowledgement for a completed task looks like:

...
<result_ack>
    <name>11dc06ag.26523.16023.4.10.112_0</name>
</result_ack>
...

Sorry, I have never seen a rejection reply from the server, so I can't tell you what one looks like.
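
Editor's note: with 210 tasks waiting, checking acknowledgements by eye is tedious. The sketch below assumes the reply file contains one result_ack element per acknowledged task, in the form quoted above; acknowledged_results and unacknowledged are hypothetical names, and a regular expression is used instead of an XML parser only so that the check still works if the file is not strictly well-formed.

```python
import re

# Matches the acknowledgement form quoted in this post:
# <result_ack> <name>TASK_NAME</name> </result_ack>
ACK = re.compile(r"<result_ack>\s*<name>\s*([^<\s]+)\s*</name>\s*</result_ack>")


def acknowledged_results(reply_text):
    """Return the set of task names acknowledged in the scheduler reply."""
    return set(ACK.findall(reply_text))


def unacknowledged(reported_names, reply_text):
    """Return the reported task names that the reply did not acknowledge."""
    acked = acknowledged_results(reply_text)
    return [name for name in reported_names if name not in acked]
```

An empty reply file, or one with no result_ack entries at all, would fit the timeouts in the client log: the request never got far enough for the server to acknowledge anything.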

Should I delete everything on the triple core machine and then reinstall BOINC again?

IMHO, with reporting troubles this big, which make crunching on this machine useless, that seems like a reasonable thing to do.


Copyright © 2014 University of California