Work fetch anomaly

Message boards : Number crunching : Work fetch anomaly

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 853921 - Posted: 15 Jan 2009, 20:46:19 UTC

No, not CUDA - relax: this is a nice boring 2.0GHz P4, single core, SSE2 workhorse running BOINC v5.10.13 (as it has for years).

Just spotted this sequence of messages from yesterday:

14/01/2009 13:54:23||Running CPU benchmarks
14/01/2009 13:54:23||Suspending computation - running CPU benchmarks
14/01/2009 13:54:55||Benchmark results:
14/01/2009 13:54:55||   Number of CPUs: 1
14/01/2009 13:54:55||   1036 floating point MIPS (Whetstone) per CPU
14/01/2009 13:54:55||   1640 integer MIPS (Dhrystone) per CPU
14/01/2009 13:54:56||Resuming computation
14/01/2009 13:54:57|Einstein@Home|Resuming task h1_1097.70_S5R4__713_S5R4a_2 using einstein_S5R4 version 610
14/01/2009 15:00:39|SETI@home|Resuming task ap_16no08ab_B5_P0_00225_20090109_17418.wu_0 using astropulse version 500
...
15/01/2009 05:55:09|SETI@home|Sending scheduler request: To fetch work
15/01/2009 05:55:09|SETI@home|Requesting 69 seconds of new work
15/01/2009 05:55:14|SETI@home|Scheduler RPC succeeded [server version 607]
15/01/2009 05:55:14|SETI@home|Deferring communication for 11 sec
15/01/2009 05:55:14|SETI@home|Reason: requested by project
15/01/2009 05:55:16|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.244
15/01/2009 05:55:16|SETI@home|[file_xfer] Started download of file ap_02no08ab_B5_P1_00017_20090114_31303.wu
15/01/2009 05:55:23|SETI@home|[file_xfer] Finished download of file 16no08af.913.2526.11.8.244
15/01/2009 05:55:23|SETI@home|[file_xfer] Throughput 70130 bytes/sec
15/01/2009 05:55:23|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P0_00191_20090114_29537.wu
15/01/2009 05:56:56|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B5_P1_00017_20090114_31303.wu
15/01/2009 05:56:56|SETI@home|[file_xfer] Throughput 86135 bytes/sec
15/01/2009 05:56:56|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P1_00099_20090114_30525.wu
15/01/2009 05:57:00|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B4_P0_00191_20090114_29537.wu
15/01/2009 05:57:00|SETI@home|[file_xfer] Throughput 87255 bytes/sec
15/01/2009 05:57:00|SETI@home|[file_xfer] Started download of file ap_02no08ab_B5_P1_00014_20090114_31303.wu
15/01/2009 05:58:29|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B4_P1_00099_20090114_30525.wu
15/01/2009 05:58:29|SETI@home|[file_xfer] Throughput 90566 bytes/sec
15/01/2009 05:58:29|SETI@home|[file_xfer] Started download of file ap_02no08ab_B4_P1_00098_20090114_30525.wu
15/01/2009 05:58:37|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B5_P1_00014_20090114_31303.wu
15/01/2009 05:58:37|SETI@home|[file_xfer] Throughput 87265 bytes/sec
15/01/2009 05:58:37|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.232
15/01/2009 05:58:47|SETI@home|[file_xfer] Finished download of file 16no08af.913.2526.11.8.232
15/01/2009 05:58:47|SETI@home|[file_xfer] Throughput 40189 bytes/sec
15/01/2009 05:58:47|SETI@home|[file_xfer] Started download of file 08no08ae.17951.2526.5.8.247
15/01/2009 05:58:54|SETI@home|[file_xfer] Finished download of file 08no08ae.17951.2526.5.8.247
15/01/2009 05:58:54|SETI@home|[file_xfer] Throughput 53030 bytes/sec
15/01/2009 05:58:54|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.238
15/01/2009 05:59:05|SETI@home|[file_xfer] Finished download of file 16no08af.913.2526.11.8.238
15/01/2009 05:59:05|SETI@home|[file_xfer] Throughput 38955 bytes/sec
15/01/2009 05:59:28|SETI@home|[file_xfer] Finished download of file ap_02no08ab_B4_P1_00098_20090114_30525.wu
15/01/2009 05:59:28|SETI@home|[file_xfer] Throughput 142266 bytes/sec

So the SETI server thinks my humble P4 can do 5 Astropulse tasks and 4 multibeam (MB) tasks in 69 seconds, all on a single core shared with Einstein????? BoincView has it calculated as a more realistic 25 days. And I only asked for 1 day.
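
For anyone who wants to check the tally, here is a minimal Python sketch that counts the downloads in a message log like the one above. It assumes the `date time|project|message` layout seen in those lines, and it assumes (from the file names only) that Astropulse inputs are the `ap_*.wu` files and everything else is multibeam. The function names are made up for illustration, and the sample holds just a few of the lines, so paste in the full log to see the 5 + 4 split.

```python
from collections import Counter

# A few lines copied from the log above (not the full log).
SAMPLE_LOG = """\
15/01/2009 05:55:09|SETI@home|Requesting 69 seconds of new work
15/01/2009 05:55:16|SETI@home|[file_xfer] Started download of file 16no08af.913.2526.11.8.244
15/01/2009 05:55:16|SETI@home|[file_xfer] Started download of file ap_02no08ab_B5_P1_00017_20090114_31303.wu
15/01/2009 05:55:23|SETI@home|[file_xfer] Finished download of file 16no08af.913.2526.11.8.244
"""

STARTED = "[file_xfer] Started download of file "


def classify(filename):
    """Guess the task type from the file name (an assumption, see the lead-in)."""
    if filename.startswith("ap_") and filename.endswith(".wu"):
        return "astropulse"
    return "multibeam"


def count_downloads(log_text):
    """Count 'Started download' messages per task type."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split("|", 2)  # timestamp, project, message
        if len(parts) != 3:
            continue  # skip "..." and blank lines
        message = parts[2]
        if message.startswith(STARTED):
            counts[classify(message[len(STARTED):].strip())] += 1
    return counts


def overshoot_ratio(requested_seconds, delivered_days):
    """How many times more work arrived than was requested."""
    return delivered_days * 86400 / requested_seconds


if __name__ == "__main__":
    print(count_downloads(SAMPLE_LOG))
    print(overshoot_ratio(69, 25))
```

The second helper just puts the mismatch in numbers: 25 days of estimated work against a 69-second request.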

Raistmer and Jason, help! My P4 needs you....


ID: 853921
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 853927 - Posted: 15 Jan 2009, 21:00:47 UTC - in response to Message 853921.  

Hi, yes, that is quite optimistic, and maybe one of the reasons why BOINC 6.4.5 was written.
I have been using it for more than a month now, and it is quite realistic about the amount of work to fetch. The cache setting seems to be less influential.
Although I get quite a lot of AP WUs, it has never run out of work, never given me so much that I would miss the deadline, and never gone into High Priority mode.

ID: 853927
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 853939 - Posted: 15 Jan 2009, 21:31:26 UTC - in response to Message 853921.  


So the SETI server thinks my humble P4 can do 5 Astropulse tasks and 4 multibeam (MB) tasks in 69 seconds, all on a single core shared with Einstein????? BoincView has it calculated as a more realistic 25 days. And I only asked for 1 day.

Raistmer and Jason, help! My P4 needs you....

Can you look into your client_state.xml and let us know what the duration correction factor is for SETI?

ID: 853939
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 853951 - Posted: 15 Jan 2009, 22:05:45 UTC - in response to Message 853939.  


So the SETI server thinks my humble P4 can do 5 Astropulse tasks and 4 multibeam (MB) tasks in 69 seconds, all on a single core shared with Einstein????? BoincView has it calculated as a more realistic 25 days. And I only asked for 1 day.

Raistmer and Jason, help! My P4 needs you....

Can you look into your client_state.xml and let us know what the duration correction factor is for SETI?

<time_stats>
    <on_frac>0.994240</on_frac>
    <connected_frac>-1.000000</connected_frac>
    <active_frac>0.999917</active_frac>
    <cpu_efficiency>0.919307</cpu_efficiency>
    <last_update>1232056125.233749</last_update>
</time_stats>

cpu_efficiency is low because this machine is my BoincView logger, primary browser, and email client: a heavy-use daily driver. All normal.

SETI DCF is 0.190565 (client_state), 0.1906 (BV) - absolutely normal (I tend to run one optimisation step beyond the current release, as part of the Lunatics testing program).

Einstein has 16-18 hours still to run on new S5R5, but looks normal. No extra WUs cached - I think I'd better stop work fetch on both projects while this lot work through. Einstein fetched that task lunchtime yesterday (14/01/2009 12:12:56|Einstein@Home|Requesting 11 seconds of new work), and started it lunchtime today (15/01/2009 13:28:48|Einstein@Home|Starting task h1_0589.80_S5R4__701_S5R5a_1 using einstein_S5R5 version 301) - exactly in line with my 1-day cache setting.

As you might guess, I know this system very well indeed, and I've always known what to expect from BOINC running on it. This event is outside all previous experience, and it happened while I was asleep in bed - missed all the fun, darn it!

And before anyone asks, no, it isn't running in high priority (EDF) - just quietly getting on with things, exactly as BOINC should.
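
As a rough illustration of why these figures matter, here is a sketch of how a runtime estimate could be built from them. It assumes the commonly described model: estimated CPU time is the task's floating-point estimate divided by the host's Whetstone benchmark and scaled by the DCF, and wall-clock time divides that by the fraction of time the host is actually crunching. The real client and scheduler code may well differ, resource share with Einstein is ignored, and `HYPOTHETICAL_FPOPS_EST` is an invented figure, since the real `rsc_fpops_est` of these tasks is not shown in this thread.

```python
# Figures quoted in this thread.
WHETSTONE_MIPS = 1036      # from the benchmark messages
DCF = 0.190565             # SETI duration correction factor
ON_FRAC = 0.994240
ACTIVE_FRAC = 0.999917
CPU_EFFICIENCY = 0.919307

# Invented for illustration only; NOT the real estimate for any task here.
HYPOTHETICAL_FPOPS_EST = 1.0e14


def estimated_cpu_seconds(fpops_est, whetstone_mips, dcf):
    """CPU time = estimated operations / benchmark speed, scaled by DCF."""
    flops = whetstone_mips * 1e6
    return fpops_est / flops * dcf


def estimated_wall_seconds(cpu_seconds, on_frac, active_frac, cpu_efficiency):
    """Stretch CPU time by the fraction of wall time the host spends crunching."""
    return cpu_seconds / (on_frac * active_frac * cpu_efficiency)


if __name__ == "__main__":
    cpu = estimated_cpu_seconds(HYPOTHETICAL_FPOPS_EST, WHETSTONE_MIPS, DCF)
    wall = estimated_wall_seconds(cpu, ON_FRAC, ACTIVE_FRAC, CPU_EFFICIENCY)
    print(f"CPU estimate:  {cpu / 3600:.1f} h")
    print(f"Wall estimate: {wall / 3600:.1f} h")
```

Under this model a sane DCF and sane benchmarks give a sane per-task estimate, which is why a 69-second request turning into roughly 25 days of work points at something other than the host's own numbers.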
ID: 853951
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 853956 - Posted: 15 Jan 2009, 22:17:10 UTC - in response to Message 853951.  

Hmmmm...

Well, the metrics look good, although I would have expected them to be a little closer to my Northie 2.66 benchmark-wise.

So I'm going to have to say that since the request was appropriate for the host, the project for some reason grossly screwed up in calculating how many tasks, in aggregate, that request represented. After all, it's the one which makes that determination. ;-)

Makes a good argument for why carrying a whopping big cache can be a bad idea though. :-D

Alinator
ID: 853956
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 853963 - Posted: 15 Jan 2009, 22:27:05 UTC

Sorry, should have posted a link for the host: 1791152. Yep, that creation date of 26 Nov 2005 14:22:39 UTC is absolutely genuine - it was one of the first hosts I migrated from Classic, when I got the closedown circular. I actually gave it a bit of a spring-clean last weekend, upgraded the RAM from 512MB to 1GB (first hardware upgrade since I bought it), and finally got round to installing SP3 for XP. Not that that would make any difference.
ID: 853963
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 854009 - Posted: 16 Jan 2009, 1:03:42 UTC - in response to Message 853951.  


Maybe it's just me, but the "connected frac" looks really odd. Shouldn't it be between 0 and 1, not less than zero??
ID: 854009
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 854028 - Posted: 16 Jan 2009, 1:47:36 UTC - in response to Message 854009.  
Last modified: 16 Jan 2009, 1:57:17 UTC

Simply put, connected_frac was broken through at least 5.10.38. As far as I know, it still doesn't work.

In any event, I don't see what effect it should have for a CPU-intensive project, although I would guess it would spoof a network-intensive one pretty well.

Alinator
ID: 854028
Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 854147 - Posted: 16 Jan 2009, 9:44:06 UTC - in response to Message 854009.  

Maybe it's just me, but the "connected frac" looks really odd. Shouldn't it be between 0 and 1, not less than zero??

connected_frac = -1 means "I don't know how often you're connected". This AFAIK depends on how you're connected to the net, and on your OS. If I'm not misremembering, it has always shown only -1 for me, at least up to v5.10.45. No idea on v6...
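
A small sketch of how a script reading `client_state.xml` might handle that: it parses the `<time_stats>` block posted earlier in the thread and treats a negative `connected_frac` as "unknown" rather than as a real fraction. The function name and the choice of `None` for "unknown" are assumptions made for this example, not anything BOINC itself defines.

```python
import xml.etree.ElementTree as ET

# The <time_stats> block as posted earlier in this thread.
TIME_STATS_XML = """<time_stats>
    <on_frac>0.994240</on_frac>
    <connected_frac>-1.000000</connected_frac>
    <active_frac>0.999917</active_frac>
    <cpu_efficiency>0.919307</cpu_efficiency>
    <last_update>1232056125.233749</last_update>
</time_stats>"""


def read_time_stats(xml_text):
    """Return the time_stats values as floats, with unknown connected_frac as None."""
    root = ET.fromstring(xml_text)
    stats = {child.tag: float(child.text) for child in root}
    # A negative value is a sentinel for "don't know", not a fraction below zero.
    if stats.get("connected_frac", -1.0) < 0:
        stats["connected_frac"] = None
    return stats


if __name__ == "__main__":
    for name, value in read_time_stats(TIME_STATS_XML).items():
        print(name, "unknown" if value is None else value)
```

Handled this way, the -1 never ends up in arithmetic as if the host were connected a negative fraction of the time.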

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 854147
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 854153 - Posted: 16 Jan 2009, 9:58:06 UTC - in response to Message 854147.  

Maybe it's just me, but the "connected frac" looks really odd. Shouldn't it be between 0 and 1, not less than zero??

connected_frac = -1 means "I don't know how often you're connected". This AFAIK depends on how you're connected to the net, and on your OS. If I'm not misremembering, it has always shown only -1 for me, at least up to v5.10.45. No idea on v6...

Most of my machines are on routers, so have a 100% ethernet connection at least for the first hop. I do have one machine on a DSL modem on a rather poor line, which keeps dropping out.

The connected frac for that machine, with BOINC v6.2.19, is showing as 64.4344%, which is probably about right for when it's switched on (part-time cruncher).
ID: 854153
