Work fetch anomaly


Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8549
Credit: 50,324,109
RAC: 50,147
United Kingdom
Message 853921 - Posted: 15 Jan 2009, 20:46:19 UTC

No, not CUDA - relax: this is a nice boring 2.0GHz P4, single core, SSE2 workhorse running BOINC v5.10.13 (as it has for years).

Just spotted this sequence of messages from yesterday:

14/01/2009 13:54:23||Running CPU benchmarks
14/01/2009 13:54:23||Suspending computation - running CPU benchmarks
14/01/2009 13:54:55||Benchmark results:
14/01/2009 13:54:55|| Number of CPUs: 1
14/01/2009 13:54:55|| 1036 floating point MIPS (Whetstone) per CPU
14/01/2009 13:54:55|| 1640 integer MIPS (Dhrystone) per CPU
14/01/2009 13:54:56||Resuming computation
14/01/2009 13:54:57|Einstein@Home|Resuming task h1_1097.70_S5R4__713_S5R4a_2 using einstein_S5R4 version 610
14/01/2009 15:00:39|SETI@home|Resuming task ap_16no08ab_B5_P0_00225_20090109_17418.wu_0 using astropulse version 500
...
15/01/2009 05:55:09|SETI@home|Sending scheduler request: To fetch work
15/01/2009 05:55:09|SETI@home|Requesting 69 seconds of new work
15/01/2009 05:55:14|SETI@home|Scheduler RPC succeeded [server version 607]
15/01/2009 05:55:14|SETI@home|Deferring communication for 11 sec
15/01/2009 05:55:14|SETI@home|Reason: requested by project
15/01/2009 05:55:16|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.244
15/01/2009 05:55:16|SETI@home|[file_xfer] Started download of file ap_02no08ab_B5_P1_00017_20090114_31303.wu
15/01/2009 05:55:23|SETI@home|[file_xfer] Finished download of file 16no08af.913.2526.11.8.244
15/01/2009 05:55:23|SETI@home|[file_xfer] Throughput 70130 bytes/sec
15/01/2009 05:55:23|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P0_00191_20090114_29537.wu
15/01/2009 05:56:56|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B5_P1_00017_20090114_31303.wu
15/01/2009 05:56:56|SETI@home|[file_xfer] Throughput 86135 bytes/sec
15/01/2009 05:56:56|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P1_00099_20090114_30525.wu
15/01/2009 05:57:00|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B4_P0_00191_20090114_29537.wu
15/01/2009 05:57:00|SETI@home|[file_xfer] Throughput 87255 bytes/sec
15/01/2009 05:57:00|SETI@home|[file_xfer] Started download of file ap_02no08ab_B5_P1_00014_20090114_31303.wu
15/01/2009 05:58:29|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B4_P1_00099_20090114_30525.wu
15/01/2009 05:58:29|SETI@home|[file_xfer] Throughput 90566 bytes/sec
15/01/2009 05:58:29|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P1_00098_20090114_30525.wu
15/01/2009 05:58:37|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B5_P1_00014_20090114_31303.wu
15/01/2009 05:58:37|SETI@home|[file_xfer] Throughput 87265 bytes/sec
15/01/2009 05:58:37|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.232
15/01/2009 05:58:47|SETI@home|[file_xfer] Finished download of file 16no08af.913.2526.11.8.232
15/01/2009 05:58:47|SETI@home|[file_xfer] Throughput 40189 bytes/sec
15/01/2009 05:58:47|SETI@home|[file_xfer] Started download of file 08no08ae.17951.2526.5.8.247
15/01/2009 05:58:54|SETI@home|[file_xfer] Finished download of file 08no08ae.17951.2526.5.8.247
15/01/2009 05:58:54|SETI@home|[file_xfer] Throughput 53030 bytes/sec
15/01/2009 05:58:54|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.238
15/01/2009 05:59:05|SETI@home|[file_xfer] Finished download of file 16no08af.913.2526.11.8.238
15/01/2009 05:59:05|SETI@home|[file_xfer] Throughput 38955 bytes/sec
15/01/2009 05:59:28|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B4_P1_00098_20090114_30525.wu
15/01/2009 05:59:28|SETI@home|[file_xfer] Throughput 142266 bytes/sec

So the SETI server thinks my humble P4 can do 5 Astropulse tasks and 4 MB tasks in 69 seconds, all on a single core shared with Einstein????? Boincview has it calculated as a more realistic 25 days. And I only asked for 1 day.
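
For anyone who wants to check that tally rather than count by eye, here is a minimal sketch that parses the relevant messages. It assumes the `date time|project|message` layout seen in the log, and it assumes that file names beginning with `ap_` are Astropulse and everything else is multibeam (MB); the `summarize` helper and the classification rule are illustrative, not part of BOINC.

```python
import re

# Request line plus the nine "Started download" lines from the log above.
LOG = """\
15/01/2009 05:55:09|SETI@home|Requesting 69 seconds of new work
15/01/2009 05:55:16|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.244
15/01/2009 05:55:16|SETI@home|[file_xfer] Started download of file ap_02no08ab_B5_P1_00017_20090114_31303.wu
15/01/2009 05:55:23|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P0_00191_20090114_29537.wu
15/01/2009 05:56:56|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P1_00099_20090114_30525.wu
15/01/2009 05:57:00|SETI@home|[file_xfer] Started download of file ap_02no08ab_B5_P1_00014_20090114_31303.wu
15/01/2009 05:58:29|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P1_00098_20090114_30525.wu
15/01/2009 05:58:37|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.232
15/01/2009 05:58:47|SETI@home|[file_xfer] Started download of file 08no08ae.17951.2526.5.8.247
15/01/2009 05:58:54|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.238
"""

STARTED = re.compile(r"\[file_xfer\] Started download of file (\S+)")
REQUESTED = re.compile(r"Requesting (\d+) seconds of new work")


def summarize(log_text):
    """Return (seconds requested, downloads started per task type)."""
    requested = None
    counts = {"astropulse": 0, "multibeam": 0}
    for line in log_text.splitlines():
        parts = line.split("|", 2)  # timestamp, project, message
        if len(parts) != 3:
            continue
        message = parts[2]
        match = REQUESTED.search(message)
        if match:
            requested = int(match.group(1))
            continue
        match = STARTED.search(message)
        if match:
            # Assumption: an "ap_" prefix marks an Astropulse workunit.
            is_ap = match.group(1).startswith("ap_")
            counts["astropulse" if is_ap else "multibeam"] += 1
    return requested, counts


requested, counts = summarize(LOG)
boincview_estimate_s = 25 * 24 * 3600  # the 25-day BoincView figure
ratio = round(boincview_estimate_s / requested)
print(requested, counts, ratio)
```

On these lines the sketch gives 5 Astropulse and 4 MB downloads for a 69-second request, so the work delivered exceeds the request by a factor of roughly 31,000 if the 25-day estimate is taken at face value.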

Raistmer and Jason, help! My P4 needs you....


Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3246
Credit: 31,800,835
RAC: 3,467
Netherlands
Message 853927 - Posted: 15 Jan 2009, 21:00:47 UTC - in response to Message 853921.

Hi, yes, that is quite optimistic, and maybe one of the reasons why BOINC 6.4.5 was written.
I have now been using it for more than a month, and it is quite realistic about the amount of work to fetch. The cache setting seems to be less influential.
Although I get quite a lot of AP WUs, it has never run out of work, and it has never given me so much that I would miss the deadline or that it would go into High Priority mode.


1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 853939 - Posted: 15 Jan 2009, 21:31:26 UTC - in response to Message 853921.


Can you look into your client_state.xml and let us know what the duration correction factor is for SETI?


Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8549
Credit: 50,324,109
RAC: 50,147
United Kingdom
Message 853951 - Posted: 15 Jan 2009, 22:05:45 UTC - in response to Message 853939.


Can you look into your client_state.xml and let us know what the duration correction factor is for SETI?

<time_stats>
    <on_frac>0.994240</on_frac>
    <connected_frac>-1.000000</connected_frac>
    <active_frac>0.999917</active_frac>
    <cpu_efficiency>0.919307</cpu_efficiency>
    <last_update>1232056125.233749</last_update>
</time_stats>
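
A minimal sketch for pulling these values out programmatically, assuming the `<time_stats>` element has been copied out of client_state.xml as a standalone fragment. The `parse_time_stats` helper is hypothetical; it uses only Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# The fragment exactly as posted above.
TIME_STATS = """
<time_stats>
    <on_frac>0.994240</on_frac>
    <connected_frac>-1.000000</connected_frac>
    <active_frac>0.999917</active_frac>
    <cpu_efficiency>0.919307</cpu_efficiency>
    <last_update>1232056125.233749</last_update>
</time_stats>
"""


def parse_time_stats(xml_text):
    """Map each child tag of <time_stats> to its value as a float."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: float(child.text) for child in root}


stats = parse_time_stats(TIME_STATS)
for name, value in stats.items():
    print(f"{name}: {value}")
```

To read the live file instead, one would locate the `time_stats` element inside the parsed client_state.xml rather than parsing a pasted fragment.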

Efficiency is low because this is my BoincView logger, primary browser and email client: a heavy-use daily driver. All normal.

SETI DCF is 0.190565 (client_state), 0.1906 (BV) - absolutely normal (I tend to run one optimisation step beyond the current release, as part of the Lunatics testing program).
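
As a rough sketch of where that number lives and what it does: the snippet below reads a DCF from a cut-down, hypothetical client_state.xml fragment and scales a made-up raw runtime estimate by it. The element names `project_name` and `duration_correction_factor` are assumed here, and the 10-hour raw estimate is purely illustrative; only the 0.190565 value comes from this host.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced client_state.xml; element names are assumed.
SAMPLE = """
<client_state>
    <project>
        <project_name>SETI@home</project_name>
        <duration_correction_factor>0.190565</duration_correction_factor>
    </project>
</client_state>
"""


def read_dcf(xml_text, project_name):
    """Return the DCF for the named project, or None if it is not found."""
    root = ET.fromstring(xml_text.strip())
    for project in root.iter("project"):
        if project.findtext("project_name") == project_name:
            return float(project.findtext("duration_correction_factor"))
    return None


def corrected_estimate(raw_seconds, dcf):
    """Scale a raw runtime estimate by the duration correction factor."""
    return raw_seconds * dcf


dcf = read_dcf(SAMPLE, "SETI@home")
raw = 10 * 3600  # illustrative raw estimate of 10 hours
print(dcf, corrected_estimate(raw, dcf) / 3600)
```

With a DCF near 0.19, a task nominally estimated at 10 hours would be expected to take a little under 2 hours on this host, which is consistent with running optimised applications.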

Einstein has 16-18 hours still to run on new S5R5, but looks normal. No extra WUs cached - I think I'd better stop work fetch on both projects while this lot work through. Einstein fetched that task lunchtime yesterday (14/01/2009 12:12:56|Einstein@Home|Requesting 11 seconds of new work), and started it lunchtime today (15/01/2009 13:28:48|Einstein@Home|Starting task h1_0589.80_S5R4__701_S5R5a_1 using einstein_S5R5 version 301) - exactly in line with my 1-day cache setting.

As you might guess, I know this system very well indeed, and I've always known what to expect from BOINC running on it. This event is outside all previous experience, and it happened while I was asleep in bed - missed all the fun, darn it!

And before anyone asks, no, it isn't running in high priority (EDF) - just quietly getting on with things, exactly as BOINC should.

Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 853956 - Posted: 15 Jan 2009, 22:17:10 UTC - in response to Message 853951.

Hmmmm...

Well, the metrics look good, although I would have expected them to be a little closer to my Northie 2.66, benchmark-wise.

So I'm going to have to say that since the request was appropriate for the host, the project grossly screwed up in calculating how many aggregate tasks that represented for some reason. After all, it's the one which makes that determination. ;-)

Makes a good argument for why carrying a whopping big cache can be a bad idea though. :-D

Alinator

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8549
Credit: 50,324,109
RAC: 50,147
United Kingdom
Message 853963 - Posted: 15 Jan 2009, 22:27:05 UTC

Sorry, should have posted a link for the host: 1791152. Yep, that creation date of 26 Nov 2005 14:22:39 UTC is absolutely genuine - it was one of the first hosts I migrated from Classic, when I got the closedown circular. I actually gave it a bit of a spring-clean last weekend, upgraded the RAM from 512MB to 1GB (first hardware upgrade since I bought it), and finally got round to installing SP3 for XP. Not that that would make any difference.

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 854009 - Posted: 16 Jan 2009, 1:03:42 UTC - in response to Message 853951.


Maybe it's just me, but the "connected frac" looks really odd. Shouldn't it be between 0 and 1, not less than zero??

Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 854028 - Posted: 16 Jan 2009, 1:47:36 UTC - in response to Message 854009.
Last modified: 16 Jan 2009, 1:57:17 UTC

Simply put, connected_frac was broken through at least 5.10.38. As far as I know, it still doesn't work.

In any event, I don't see what effect it should have for a CPU-intensive project, although I would guess it would spoof a network-intensive one pretty well.

Alinator

Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 4,307,666
RAC: 5,352
Norway
Message 854147 - Posted: 16 Jan 2009, 9:44:06 UTC - in response to Message 854009.

Maybe it's just me, but the "connected frac" looks really odd. Shouldn't it be between 0 and 1, not less than zero??

connected_frac = -1 means "I don't know how often you're connected". AFAIK this depends on how you're connected to the net, and on your OS. If I remember correctly, it has only ever shown -1 for me, at least up to v5.10.45. No idea about v6...
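
That -1 convention is a sentinel rather than a real fraction, so any script reading the value should treat it as "unknown" instead of doing arithmetic with it. A minimal sketch, with a hypothetical `describe_connected_frac` helper that treats any negative value as unknown (an assumption; the thread only mentions -1):

```python
def describe_connected_frac(value):
    """Return a percentage string, or None when the fraction is unknown."""
    if value < 0:
        # Sentinel: the client does not know how often the host is connected.
        return None
    return f"{value * 100:.4f}%"


for sample in (-1.0, 1.0):
    print(sample, "->", describe_connected_frac(sample))
```

Returning `None` for the sentinel forces callers to handle the unknown case explicitly rather than silently treating -1 as "never connected".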

____________
"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8549
Credit: 50,324,109
RAC: 50,147
United Kingdom
Message 854153 - Posted: 16 Jan 2009, 9:58:06 UTC - in response to Message 854147.


Most of my machines are on routers, so have a 100% ethernet connection at least for the first hop. I do have one machine on a DSL modem on a rather poor line, which keeps dropping out.

The connected frac for that machine, with BOINC v6.2.19, is showing as 64.4344%, which is probably about right for when it's switched on (part-time cruncher).


Copyright © 2014 University of California