Something wrong somwhere?


dahls
Joined: 24 Oct 04
Posts: 122
Credit: 26,811,017
RAC: 39,770
Norway
Message 993309 - Posted: 1 May 2010, 19:03:59 UTC - in response to Message 992765.

It worked! :)

The only "problem" is that this particular machine now has a daily quota of 1 task. I hope that will change within a few days.
____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 26,811,017
RAC: 39,770
Norway
Message 993993 - Posted: 4 May 2010, 10:18:14 UTC - in response to Message 993311.

It hasn't increased yet. There must be other things wrong too.

I have a triple-core AMD processor running Fedora Core 12. It has stopped working. It seems to have downloaded a lot of tasks, but it won't process them.

When I run the client in a shell window, this is what I get:

$ ./run_client
04-May-2010 12:08:30 [---] Starting BOINC client version 6.10.17 for x86_64-pc-linux-gnu
04-May-2010 12:08:30 [---] log flags: file_xfer, sched_ops, task
04-May-2010 12:08:30 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
04-May-2010 12:08:30 [---] Data directory: /home/dahls/BOINC
04-May-2010 12:08:30 [---] Processor: 3 AuthenticAMD AMD Athlon(tm) II X3 435 Processor [Family 16 Model 5 Stepping 2]
04-May-2010 12:08:30 [---] Processor: 512.00 KB cache
04-May-2010 12:08:30 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy
04-May-2010 12:08:30 [---] OS: Linux: 2.6.31.5-127.fc12.x86_64
04-May-2010 12:08:30 [---] Memory: 7.78 GB physical, 9.77 GB virtual
04-May-2010 12:08:30 [---] Disk: 273.47 GB total, 253.49 GB free
04-May-2010 12:08:30 [---] Local time is UTC +2 hours
04-May-2010 12:08:30 [---] No usable GPUs found
04-May-2010 12:08:30 [---] Not using a proxy
04-May-2010 12:08:30 [SETI@home] URL http://setiathome.berkeley.edu/; Computer ID 5256434; resource share 100
04-May-2010 12:08:30 [SETI@home] General prefs: from SETI@home (last modified 11-Nov-2008 20:08:21)
04-May-2010 12:08:30 [SETI@home] Host location: none
04-May-2010 12:08:30 [SETI@home] General prefs: using your defaults
04-May-2010 12:08:30 [---] Preferences limit memory usage when active to 3985.84MB
04-May-2010 12:08:30 [---] Preferences limit memory usage when idle to 7174.51MB
04-May-2010 12:08:30 [---] Preferences limit disk usage to 100.00GB
BOINC initialization completed, beginning process execution...



In the project directory:
$ pwd
~/BOINC/projects/setiathome.berkeley.edu
$ ls
01dc06af.12808.7025.3.10.230 11dc06ag.30461.11115.12.10.18 13ja07af.27101.19918.5.10.147 28dc06aa.14595.18791.11.10.81 30dc06aa.15938.10296.4.10.241 31dc06aa.16264.11115.13.10.42
01dc06af.15276.3344.10.10.86 11dc06ag.32673.17250.5.10.16 13ja07af.27101.19918.5.10.155 28dc06aa.20809.15519.13.10.26 30dc06aa.15938.11114.4.10.44 31dc06aa.16264.29759.13.10.184
01dc06af.4151.21749.16.10.7 11dc06ag.6732.21749.13.10.235 13ja07af.27101.19918.5.10.161 28dc06aa.20809.20427.13.10.48 30dc06aa.15938.1707.4.10.1 31dc06aa.19061.28941.3.10.230
01dc06ag.18093.5389.7.10.42 11ja07ag.8471.5389.5.10.206 13ja07af.27101.19918.5.10.167 28dc06aa.24076.18382.12.10.146 30dc06aa.20838.20112.5.10.211 31dc06aa.20341.34258.5.10.11
01dc06ag.22446.72.11.10.160 11ja07ah.18477.11524.6.10.147 13ja07af.27101.22781.5.10.40 28dc06aa.24076.18382.12.10.149 30dc06aa.20838.20112.5.10.225 31dc06aa.23487.38348.4.10.27
01dc06ag.22941.47902.4.10.111 11ja07ah.18477.11524.6.10.153 13ja07af.27101.30552.5.10.129 28dc06aa.24076.18382.12.10.152 30dc06aa.20838.20112.5.10.227 31dc06aa.28373.34258.11.10.119
01dc06ag.29830.4571.8.10.91 11ja07ah.18477.11524.6.10.159 13ja07af.30754.18282.3.10.22 28dc06aa.24076.18382.12.10.168 30dc06aa.20838.20112.5.10.231 31dc06aa.28373.34258.11.10.68
01dc06ag.29830.5389.8.10.229 11ja07ah.18477.11524.6.10.212 13ja07af.30754.30143.3.10.163 28dc06aa.24076.18382.12.10.169 30dc06aa.24167.9478.3.10.112 31dc06aa.28373.5389.11.10.179
01dc06ag.30505.57718.13.10.165 11ja07ai.19083.16023.3.10.106 13ja07af.7325.2526.9.10.95 28dc06aa.24076.18382.12.10.171 30no06ae.12730.4980.14.10.160 31dc06aa.28373.5389.11.10.181
05dc06af.10057.1709.9.10.193 11ja07ai.19083.22567.3.10.68 13ja07af.7325.26053.9.10.101 28dc06aa.24076.18382.12.10.172 30no06ae.1495.11933.10.10.80 31dc06aa.28388.5389.12.10.139
05dc06af.11020.177218.15.10.126 11ja07ai.20369.2117.13.10.171 13ja07af.7325.3344.9.10.18 28dc06aa.24076.18382.12.10.175 30no06ae.1495.12342.10.10.87 31dc06aa.28388.5389.12.10.141
05dc06af.11020.181717.15.10.176 11ja07ai.20394.13160.14.10.226 20fe07ag.18503.16452.16.10.145 28dc06aa.24076.18791.12.10.6 30no06ae.1495.21749.10.10.28 31dc06aa.28388.5389.12.10.145
05dc06af.4533.4163.8.10.243 11ja07ai.21793.24612.15.10.48 20fe07ah.6366.481.8.10.19 28dc06aa.28243.21245.16.10.178 30no06ae.1495.22158.10.10.244 31dc06aa.28388.5389.12.10.149
10fe07af.6882.89953.3.10.68 11ja07ai.24780.23385.10.10.36 20fe07ah.9772.8252.4.10.234 28dc06aa.28243.4162.16.10.127 30no06ae.18310.9479.8.10.74 31dc06aa.28518.9888.15.10.5
10fe07af.6882.91998.3.10.18 11ja07ai.27321.9888.5.10.179 24ja07ad.3787.8661.3.10.206 28dc06aa.28243.4162.16.10.50 30no06ae.28350.14796.11.10.234 31dc06aa.30435.13569.7.10.212
10fe07af.6882.91998.3.10.22 11ja07ai.28173.16841.16.10.192 24ja07af.16290.85548.13.10.39 28dc06aa.28243.4162.16.10.52 30no06ae.28350.14796.11.10.240 31dc06aa.30435.28532.7.10.58
10fe07af.6882.92407.3.10.115 11mr07ae.24796.198076.9.10.21 24ja07af.9677.11933.4.10.129 28dc06aa.28243.4162.16.10.54 30no06ae.28350.14796.11.10.246 31dc06aa.30435.33440.7.10.157
11dc06af.11543.4162.9.10.46 11mr07ae.24796.198076.9.10.28 27dc06af.1247.19490.6.10.115 28dc06aa.28243.4162.16.10.61 30no06ae.28350.9479.11.10.53 31dc06aa.30435.4571.7.10.75
11dc06af.20120.24612.4.10.189 12ja07ac.11279.9888.6.10.33 27dc06af.16907.20308.4.10.45 28dc06aa.28243.4162.16.10.66 30no06ae.8383.20113.9.10.123 31dc06aa.31301.33849.8.10.105
11dc06af.23021.18477.7.10.240 12ja07af.10123.481.15.10.1 27dc06af.16907.28897.4.10.4 28dc06aa.28243.4162.16.10.67 30no06ae.8383.20113.9.10.79 31dc06aa.31301.8252.8.10.89
11dc06af.23021.21749.7.10.102 12ja07af.14268.23794.7.10.57 27dc06af.16907.28897.4.10.55 28dc06aa.28243.4162.16.10.68 30no06af.11000.1708.4.10.137 arecibo_181.png
11dc06af.23021.21749.7.10.174 12ja07af.14268.7843.7.10.79 27dc06af.25121.4980.7.10.155 28dc06aa.28243.4162.16.10.69 30no06af.11000.1708.4.10.138 sah_40.png
11dc06af.23021.21749.7.10.87 12ja07af.24270.8252.12.10.104 27dc06af.27652.18263.3.10.7 28dc06aa.28243.4162.16.10.71 30no06af.11000.1708.4.10.143 sah_banner_290.png
11dc06af.23021.21749.7.10.90 12ja07af.24295.18477.13.10.4 27ja07ag.14237.23385.15.10.82 28dc06aa.29509.8661.10.10.103 30no06af.11000.1708.4.10.151 sah_ss_290.png
11dc06af.23021.21749.7.10.94 12ja07af.27550.16432.8.10.166 27ja07ag.21227.16023.9.10.93 28dc06aa.29509.8661.10.10.88 30no06af.3017.20522.3.10.161 seti_528.jpg
11dc06af.23021.21749.7.10.99 12ja07af.27550.1708.8.10.146 27ja07ag.21227.17659.9.10.204 28dc06aa.29509.9479.10.10.120 30no06af.3017.58003.3.10.114 setiathome-5.28_AUTHORS
11dc06af.23021.7434.7.10.22 12ja07af.28050.20113.9.10.126 27ja07ag.21227.22158.9.10.28 28dc06aa.5555.11838.15.10.239 31dc06aa.14600.481.9.10.123 setiathome-5.28_COPYING
11dc06af.23021.7434.7.10.28 13ja07af.23583.17055.8.10.102 27ja07ag.24649.14387.3.10.107 28dc06aa.5555.13474.15.10.210 31dc06aa.14600.481.9.10.129 setiathome-5.28_COPYRIGHT
11dc06af.23021.7434.7.10.34 13ja07af.23583.17055.8.10.120 27ja07ag.24649.14387.3.10.68 28dc06aa.5555.21245.15.10.5 31dc06aa.14600.481.9.10.141 setiathome-5.28_README
11dc06af.23641.22976.8.10.103 13ja07af.23583.17055.8.10.153 27ja07ag.2476.23794.13.10.92 28dc06aa.7971.16746.8.10.212 31dc06aa.14600.481.9.10.147 setiathome-5.28.x86_64-pc-linux-gnu
11dc06af.23641.22976.8.10.111 13ja07af.23583.17055.8.10.165 27ja07ag.3797.6207.14.10.144 28dc06aa.7971.23290.8.10.96 31dc06aa.14600.481.9.10.153 slideshow_setiathome_enhanced_00
11dc06af.26961.1708.11.10.19 13ja07af.23583.17055.8.10.176 27ja07ag.4117.18886.10.10.250 29dc06aa.8964.12342.10.10.28 31dc06aa.14600.481.9.10.159 slideshow_setiathome_enhanced_01
11dc06ag.14973.20931.8.10.162 13ja07af.23583.17055.8.10.81 27ja07ag.7296.15614.4.10.212 29dc06af.6854.1299.12.10.56 31dc06aa.14600.4980.9.10.83 slideshow_setiathome_enhanced_02
11dc06ag.19076.24203.9.10.231 13ja07af.23583.17055.8.10.84 28dc06aa.14283.15519.14.10.180 29dc06af.6854.1299.12.10.59 31dc06aa.15993.30986.10.10.157 stat_icon
11dc06ag.20225.5389.10.10.190 13ja07af.23583.17055.8.10.87 28dc06aa.14283.15519.14.10.192 29dc06af.6854.1299.12.10.71 31dc06aa.15993.30986.10.10.19
11dc06ag.20319.13160.3.10.84 13ja07af.23583.17055.8.10.90 28dc06aa.14283.15519.14.10.201 29dc06af.6854.1299.12.10.72 31dc06aa.15993.30986.10.10.228
11dc06ag.20319.7843.3.10.83 13ja07af.23583.17055.8.10.96 28dc06aa.14595.14292.11.10.102 29dc06af.6854.2117.12.10.203 31dc06aa.15993.30986.10.10.76
11dc06ag.26523.2526.4.10.208 13ja07af.23583.17055.8.10.99 28dc06aa.14595.14292.11.10.107 29mr07af.31874.24612.12.10.154 31dc06aa.16264.11115.13.10.30



I also tried a

./boinccmd --set_run_mode always


but that does not seem to help. I did this because the first machine I had problems with (the FC4 one) began to suspend BOINC while I was working on it (even though the setup says it should always run). This worked on the FC4 machine, but does not seem to work on the FC12 machine.
____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 26,811,017
RAC: 39,770
Norway
Message 993994 - Posted: 4 May 2010, 10:32:46 UTC - in response to Message 993993.

BTW, here is the output of 'boinccmd --get_state':
http://www.dahl-stamnes.net/dahls/Ymse/boinc.state.html
____________

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 358,297
RAC: 25
Germany
Message 993999 - Posted: 4 May 2010, 11:10:04 UTC - in response to Message 993994.

According to your BOINC state, you have 210 tasks ready to report. You should do a
./boinccmd --project http://setiathome.berkeley.edu/ update
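
A minimal sketch of how the "ready to report" count could be checked from a saved `boinccmd --get_state` dump. It assumes the dump marks each finished result with a line reading `ready to report: yes`; that line format, the helper name `count_ready_to_report`, and the sample text (including the second task name's `_1` suffix) are assumptions for illustration, not something shown in this thread.

```python
import re

# Assumed line format in the get_state dump: "ready to report: yes" / "no".
READY = re.compile(r"^[ \t]*ready to report:[ \t]*yes[ \t]*$",
                   re.IGNORECASE | re.MULTILINE)

def count_ready_to_report(state_text: str) -> int:
    """Count results flagged as ready to report in a get_state dump."""
    return len(READY.findall(state_text))

# Made-up sample in the assumed format; only the first task name
# is taken from the thread.
sample_state = """\
1) -----------
   name: 11dc06ag.26523.16023.4.10.112_0
   ready to report: yes
2) -----------
   name: 01dc06af.12808.7025.3.10.230_1
   ready to report: no
"""

print(count_ready_to_report(sample_state))
```

To use it on a real machine, the output of `./boinccmd --get_state` would first be redirected to a file and that file's contents passed to the function.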

What about error messages in stdoutdae?

Regards,
Gundolf
____________
Computers aren't everything in life. (Just kidding)

SETI@home classic workunits 3,758
SETI@home classic CPU time 66,520 hours

[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1140
Credit: 3,721,419
RAC: 4,182
United Kingdom
Message 994002 - Posted: 4 May 2010, 11:15:55 UTC

I got an HTTP error about an hour ago. Are there problems again with one of the two download machines? The .13 one, I think.
____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 26,811,017
RAC: 39,770
Norway
Message 994021 - Posted: 4 May 2010, 14:25:41 UTC - in response to Message 993999.

According to your BOINC state, you have 210 tasks ready to report. You should do a
./boinccmd --project http://setiathome.berkeley.edu/ update

What about error messages in stdoutdae?

Regards,
Gundolf


Running the command above, I get:
can't connect to local host.

If I run run_client in a shell, I get:

04-May-2010 12:49:51 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 12:49:51 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 12:55:02 [---] Project communication failed: attempting access to reference site
04-May-2010 12:55:02 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 12:55:05 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 15:46:42 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 15:46:42 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 15:51:54 [---] Project communication failed: attempting access to reference site
04-May-2010 15:51:57 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 15:51:58 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 15:52:59 [SETI@home] Fetching scheduler list
04-May-2010 15:53:04 [SETI@home] Master file download succeeded
04-May-2010 15:53:09 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 15:53:09 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 15:58:20 [---] Project communication failed: attempting access to reference site
04-May-2010 15:58:20 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 15:58:23 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 15:59:20 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 15:59:20 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 16:04:30 [---] Project communication failed: attempting access to reference site
04-May-2010 16:04:30 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 16:04:34 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 16:05:31 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 16:05:31 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 16:10:42 [---] Project communication failed: attempting access to reference site
04-May-2010 16:10:42 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 16:10:45 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 16:11:42 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 16:11:42 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 16:16:53 [---] Project communication failed: attempting access to reference site
04-May-2010 16:16:53 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 16:16:56 [---] Internet access OK - project servers may be temporarily down.
04-May-2010 16:17:53 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 16:17:53 [SETI@home] Reporting 210 completed tasks, requesting new tasks
^C04-May-2010 16:20:35 [---] Received signal 2
04-May-2010 16:20:36 [---] Exit requested by user
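
The log above can be summarised with a short script. This is only a sketch: it assumes every line has the `DD-Mon-YYYY HH:MM:SS [project] message` shape shown above and that month names are the English abbreviations; the function name `scheduler_timeouts` is made up for illustration. It pairs each "Sending scheduler request" line with the next "Scheduler request failed" line and reports how many seconds the client waited.

```python
import re
from datetime import datetime

# Timestamp, [project], message: the line shape seen in the client log.
LINE = re.compile(r"^(\d{1,2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(.*?)\] (.*)$")

def scheduler_timeouts(log_text):
    """Seconds between each scheduler request and the failure that follows it."""
    waits = []
    sent_at = None
    for raw in log_text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue  # skip banners, blank lines, "^C..." lines
        stamp = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S")
        message = match.group(3)
        if message.startswith("Sending scheduler request"):
            sent_at = stamp
        elif message.startswith("Scheduler request failed") and sent_at is not None:
            waits.append((stamp - sent_at).total_seconds())
            sent_at = None
    return waits

# Lines copied from the log in this post.
sample_log = """\
04-May-2010 12:49:51 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 12:49:51 [SETI@home] Reporting 210 completed tasks, requesting new tasks
04-May-2010 12:55:02 [---] Project communication failed: attempting access to reference site
04-May-2010 12:55:02 [SETI@home] Scheduler request failed: Timeout was reached
04-May-2010 15:46:42 [SETI@home] Sending scheduler request: To report completed tasks.
04-May-2010 15:51:58 [SETI@home] Scheduler request failed: Timeout was reached
"""

print(scheduler_timeouts(sample_log))  # [311.0, 316.0]
```

Each request in the excerpt waits a little over five minutes before failing, which may suggest a connection that is never answered rather than an explicit refusal.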


____________

Ralf Haziak
Volunteer tester
Joined: 15 Nov 02
Posts: 14
Credit: 597,982
RAC: 48
Germany
Message 994086 - Posted: 4 May 2010, 23:08:02 UTC - in response to Message 994021.
Last modified: 4 May 2010, 23:10:38 UTC

Stupid question:
are you trying to send the WUs during the weekly maintenance? ;-)

Every Tuesday morning (Pacific time) we have a 3-4 hour outage for database maintenance.

____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4245
Credit: 1,047,369
RAC: 275
United States
Message 994092 - Posted: 4 May 2010, 23:17:55 UTC - in response to Message 994086.

Stupid question:
are you trying to send the WUs during the weekly maintenance? ;-)

That should give a "Project is down" message in reply. What the log indicates instead is that the Scheduler requests aren't getting any kind of connection. There's a successful "master file" get, which ensures the host has the correct host name for the Scheduler. Possibly it's a DNS problem where the host name isn't being translated to the right IP address.
Joe
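
One way to test the DNS theory is to resolve the scheduler host name on each machine and compare the results. A rough sketch, assuming Python is available on the machines; the helper names `resolved_addresses` and `same_resolution` are invented for this example, and the `resolver` parameter exists only so the lookup can be stubbed out.

```python
import socket

def resolved_addresses(host, resolver=socket.gethostbyname_ex):
    """Return the sorted IPv4 addresses `host` resolves to, or [] on failure."""
    try:
        _name, _aliases, addresses = resolver(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return []
    return sorted(addresses)

def same_resolution(results):
    """True when every machine got the same, non-empty address list.

    `results` maps a machine label to the list that
    resolved_addresses() returned on that machine.
    """
    lists = list(results.values())
    return bool(lists) and all(lst and lst == lists[0] for lst in lists)
```

The idea is to call `resolved_addresses()` with the scheduler's host name on the working XP machine and on the failing Linux machine, then pass both lists to `same_resolution()`. A mismatch or an empty list on one side would point toward name resolution; identical lists would suggest looking elsewhere.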

Ralf Haziak
Volunteer tester
Joined: 15 Nov 02
Posts: 14
Credit: 597,982
RAC: 48
Germany
Message 994099 - Posted: 4 May 2010, 23:40:33 UTC - in response to Message 994092.
Last modified: 4 May 2010, 23:47:54 UTC

I'm only asking because it's Tuesday, and a few hours ago I also saw a timeout message in my logs, followed by the internet connection test.

Of course it could be coincidence that I got the same messages today.

BTW, are old messages stored somewhere, so I could check whether it was the same time frame?
____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 26,811,017
RAC: 39,770
Norway
Message 994160 - Posted: 5 May 2010, 6:18:36 UTC - in response to Message 994092.

Stupid question:
are you trying to send the WUs during the weekly maintenance? ;-)

That should give a "Project is down" message in reply. What the log indicates instead is that the Scheduler requests aren't getting any kind of connection. There's a successful "master file" get, which ensures the host has the correct host name for the Scheduler. Possibly it's a DNS problem where the host name isn't being translated to the right IP address.
Joe


I've checked my DNS: no error messages are being reported (all machines use the same DNS server).

I also checked my firewall. The XP machine, where things seem to be working OK (1), is contacting the same IP as the Linux machines, where things do not work properly.

The triple-core machine has probably not been able to get new work sets for at least a week. It has 210 work sets to report, but it's not able to report them.

1: Currently all machines are reporting "Project has no jobs available".
____________

stephen Goodyer
Joined: 8 Oct 06
Posts: 37
Credit: 263,141
RAC: 127
United Kingdom
Message 994171 - Posted: 5 May 2010, 7:21:26 UTC - in response to Message 994166.
Last modified: 5 May 2010, 7:54:32 UTC

The status page says that lando and bambi are not working. Could this be the problem? I can't get any WUs.

Oh well, it gives me more time to plant some lettuce and onions. I'm not going to make a meal out of it.

Leopoldo
Volunteer tester
Joined: 4 Aug 99
Posts: 102
Credit: 2,883,647
RAC: 101
Russia
Message 994188 - Posted: 5 May 2010, 9:31:59 UTC - in response to Message 994160.

The triple-core machine has probably not been able to get new work sets for at least a week. It has 210 work sets to report, but it's not able to report them.

First, you should report the completed tasks. You can check the availability of the upload server (208.68.240.16) by visiting the URL http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler with your favorite browser from the triple-core machine.

A normal answer looks like:

<data_server_reply> <status>1</status> <message>no command</message> </data_server_reply>

If it does, check your uploading (completed tasks are reported through the file "sched_request_setiathome.berkeley.edu.xml", so maybe this file can't be uploaded because of its size).

If it doesn't, replace the symbolic host name in the URL with the numeric IP address and repeat the visit.
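
On a machine without a browser the same check can be scripted. A minimal sketch, assuming the reply body has already been fetched as text (for example with wget); `upload_handler_alive` is a hypothetical helper name, and it only tests for the exact "no command" reply quoted above.

```python
import xml.etree.ElementTree as ET

def upload_handler_alive(body: str) -> bool:
    """True if `body` is the 'no command' reply from the upload handler."""
    try:
        root = ET.fromstring(body.strip())
    except ET.ParseError:
        return False  # not XML at all, e.g. an error page or empty output
    if root.tag != "data_server_reply":
        return False
    status = (root.findtext("status") or "").strip()
    message = (root.findtext("message") or "").strip()
    return status == "1" and message == "no command"

# The normal answer quoted in this post.
normal_reply = ("<data_server_reply> <status>1</status> "
                "<message>no command</message> </data_server_reply>")

print(upload_handler_alive(normal_reply))  # True
```

Anything other than that reply (a timeout, an HTML error page, an empty body) makes the function return False, which is the case where retrying with the numeric address would be the next step.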

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 26,811,017
RAC: 39,770
Norway
Message 994196 - Posted: 5 May 2010, 12:28:33 UTC - in response to Message 994188.


First, you should report the completed tasks. You can check the availability of the upload server (208.68.240.16) by visiting the URL http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler with your favorite browser from the triple-core machine.


That would be wget, since this machine does not have a screen, keyboard or mouse ;)

And the answer looks just like you said:


<data_server_reply> <status>1</status> <message>no command</message> </data_server_reply>

If it does, check your uploading (completed tasks are reported through the file "sched_request_setiathome.berkeley.edu.xml", so maybe this file can't be uploaded because of its size).

If it doesn't, replace the symbolic host name in the URL with the numeric IP address and repeat the visit.


I checked the file, but I'm not sure what to look for. I see there are a lot of messages like:

Work Unit Info:
...............
WU true angle range is : 1.388434
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrum 0.00023 0.00000
v_ChirpData 0.01180 0.00000
v_vTranspose4x16ntw 0.00435 0.00000
AK SSE folding 0.00074 0.00000
Unrecognized XML in parse_init_data_file: hostid
Skipping: 5256434
Skipping: /hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 4486.795998
Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1272251169.000000
Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time
Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: run_gpu_if_user_active
Skipping: 0
Skipping: /run_gpu_if_user_active
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct
Skipping: 100.000000
Skipping: /max_ncpus_pct
Unrecognized XML in parse_init_data_file: hostid
Skipping: 5256434
Skipping: /hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 4486.795998
Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1272251169.000000
Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time
Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: run_gpu_if_user_active
Skipping: 0
Skipping: /run_gpu_if_user_active
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct
Skipping: 100.000000
Skipping: /max_ncpus_pct
Restarted at 95.62 percent.

Flopcounter: 11763840604466.185547
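
Since the suggestion was that the request file might be too large, one rough way to see what fills it is to measure how much of the captured output is repeated parser chatter. This is a sketch only: `noise_ratio` is a made-up helper, it treats lines starting with "Unrecognized XML" or "Skipping:" as noise, and whether file size is actually the cause has not been established in this thread.

```python
def noise_ratio(text: str) -> float:
    """Fraction of non-blank lines that are 'Unrecognized XML' or 'Skipping:' chatter."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    noisy = sum(ln.startswith(("Unrecognized XML", "Skipping:")) for ln in lines)
    return noisy / len(lines)

# A few lines copied from the excerpt above.
sample_output = """\
AK SSE folding 0.00074 0.00000
Unrecognized XML in parse_init_data_file: hostid
Skipping: 5256434
Skipping: /hostid
Restarted at 95.62 percent.
"""

print(noise_ratio(sample_output))  # 0.6
```

Running it over the whole request file, together with a plain size check such as `os.path.getsize()`, would show whether the 210 pending results carry a lot of this repeated text.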


____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 26,811,017
RAC: 39,770
Norway
Message 994202 - Posted: 5 May 2010, 13:40:17 UTC - in response to Message 994175.

There is only one dataset left loaded to split...so very few WUs are being generated compared to the usual number...work will be very hard to come by until some more datasets are loaded.

I just inquired if anybody knows why in the Panic Mode thread.....


Are you telling us that we are running out of work? That even if everything is working normally, there will be no more work sets to download?

____________

dahls
Joined: 24 Oct 04
Posts: 122
Credit: 26,811,017
RAC: 39,770
Norway
Message 994243 - Posted: 5 May 2010, 16:42:23 UTC - in response to Message 994202.

The first FC4 machine, the one I had problems with, can get new work now. It gets one work set and, when that is finished, uploads it and gets a new one without having to wait.

At the same time, the triple-core FC12 machine is not able to report anything or get any new work sets. Many of the work sets it wants to upload are about to time out or have already timed out.

In my opinion, it seems like there is something wrong on the server side.

Should I delete everything on the triple core machine and then reinstall BOINC again?
____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4245
Credit: 1,047,369
RAC: 275
United States
Message 994250 - Posted: 5 May 2010, 17:09:44 UTC - in response to Message 994188.

The triple-core machine has probably not been able to get new work sets for at least a week. It has 210 work sets to report, but it's not able to report them.

First, you should report the completed tasks. You can check the availability of the upload server
...

The 210 tasks dahls has are "Ready to report", so they have already been uploaded successfully. Communication with the upload handler is done; it's communication with the Scheduler that's failing.
Joe

Leopoldo
Volunteer tester
Joined: 4 Aug 99
Posts: 102
Credit: 2,883,647
RAC: 101
Russia
Message 994254 - Posted: 5 May 2010, 17:43:18 UTC - in response to Message 994250.

The 210 tasks dahls has are "Ready to report", so they have already been uploaded successfully. Communication with the upload handler is done; it's communication with the Scheduler that's failing.

Thanks, Josef. I missed this.

@dahls: Jørn, can you visit the scheduler URL http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi from the machine in question?

A normal answer looks like:

<scheduler_reply> <scheduler_version>611</scheduler_version> <master_url>http://setiathome.berkeley.edu/</master_url> <request_delay>11.000000</request_delay> <message priority="low">Error in request message: no start tag </message> <project_name>SETI@home</project_name> </scheduler_reply>

In my opinion, it seems like there is something wrong on the server side.

To determine this, look at the server's answer after the attempt to report the tasks (file "sched_reply_setiathome.berkeley.edu.xml").
A normal acknowledgement for a completed task looks like:

... <result_ack> <name>11dc06ag.26523.16023.4.10.112_0</name> </result_ack> ...

Sorry, I have never seen a rejecting server answer, so I can't tell you what one looks like.
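
Checking the reply file for acknowledgements can be sketched as below. This assumes the reply contains `<result_ack>` blocks shaped like the excerpt above; the function names and the second task name in the sample are made up for illustration. A regular expression is used instead of an XML parser so that a partial or truncated file can still be scanned.

```python
import re

# Matches blocks shaped like the acknowledgement excerpt in this post.
ACK = re.compile(r"<result_ack>\s*<name>\s*([^<\s]+)\s*</name>\s*</result_ack>")

def acknowledged(reply_text):
    """Return the set of task names the scheduler acknowledged."""
    return set(ACK.findall(reply_text))

def unacknowledged(reported, reply_text):
    """Task names that were reported but not acknowledged, sorted."""
    return sorted(set(reported) - acknowledged(reply_text))

# Fragment modelled on the excerpt; the surrounding "..." stands for
# whatever else the real reply file contains.
sample_reply = ("... <result_ack> <name>11dc06ag.26523.16023.4.10.112_0</name> "
                "</result_ack> ...")

# The second name is hypothetical, used only to show a missing ack.
reported_tasks = ["11dc06ag.26523.16023.4.10.112_0",
                  "01dc06af.12808.7025.3.10.230_0"]

print(unacknowledged(reported_tasks, sample_reply))
```

If the reply file is missing or contains no acknowledgements at all, every reported task comes back as unacknowledged, which would fit a request that never reached the scheduler.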

Should I delete everything on the triple core machine and then reinstall BOINC again?

IMHO, with reporting troubles this big, which make crunching on this machine useless, that seems like a reasonable step to me...


Copyright © 2014 University of California