Old (BOINC?) bug maybe?

Cavalary

Joined: 15 Jul 99
Posts: 104
Credit: 7,507,548
RAC: 38
Romania
Message 1703315 - Posted: 20 Jul 2015, 13:48:52 UTC

I had no net access for hours today because of a problem past the first node of my ISP: I was connected and got no instant errors, but everything, starting with DNS requests, timed out. That let me identify the cause of an odd behavior I have noticed before, I assume in similar situations, without ever being sure why it happened. I'm sure it's well known, and I assume it has existed for years, if not all along. While BOINC is trying to connect and the DNS request is timing out (or maybe all the time, but with instant no-connection errors it's hardly noticeable), it seems to block other procedures while waiting for a response. The most noticeable side effect is that task timers freeze though processing continues, so you end up with CPU time higher than elapsed time. (The server apparently sorts it out by listing elapsed = CPU when the task is reported.) I also noticed that I can't view the log during that time: if I try to open it while BOINC is waiting for a response, it's just a blank box for that period.
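To make the suspected mechanism concrete, here is a toy model in Python. It is not BOINC code, and it rests on an assumption: that the client advances a task's elapsed-time counter from its own single-threaded loop, while the science application keeps working independently. The names `simulate` and `science_app` are made up for illustration, and the `time.sleep(block_seconds)` call merely stands in for a synchronous lookup that has to time out.

```python
import threading
import time


def simulate(block_seconds, tick=0.01, ticks_before=5, ticks_after=5):
    """Toy model: a client loop that adds one tick to 'elapsed' per pass,
    while the task itself keeps running in a separate thread."""
    stop = threading.Event()
    work = {"seconds": 0.0}

    def science_app():
        # Stand-in for the science application: it keeps working
        # no matter what the client loop is doing.
        start = time.monotonic()
        while not stop.is_set():
            time.sleep(tick / 2)
            work["seconds"] = time.monotonic() - start

    worker = threading.Thread(target=science_app)
    worker.start()

    elapsed = 0.0
    for _ in range(ticks_before):
        time.sleep(tick)
        elapsed += tick          # the timer only advances when the loop runs
    time.sleep(block_seconds)    # hypothetical blocking call: no ticks happen here
    for _ in range(ticks_after):
        time.sleep(tick)
        elapsed += tick

    stop.set()
    worker.join()
    return elapsed, work["seconds"]


if __name__ == "__main__":
    elapsed, worked = simulate(block_seconds=0.5)
    print(f"timer shows {elapsed:.2f}s, task actually ran {worked:.2f}s")
```

If the real client behaves anything like this, the gap between the two numbers would be roughly the length of the stall, which would match CPU time ending up higher than elapsed time.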

Minor issue I guess, and old, and with the recent funding announcements I doubt much will be done even about bigger ones, but just reminding people of it.

(Also, got another blank stderr after full processing last night. Been a while. Wingman doesn't help, being an autocorr overflow from a host with many (or at least many inconclusives).)
Richard Haselgrove
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1703373 - Posted: 20 Jul 2015, 16:01:12 UTC - in response to Message 1703315.  

I had no net access for hours today because of a problem past the first node of my ISP: I was connected and got no instant errors, but everything, starting with DNS requests, timed out. That let me identify the cause of an odd behavior I have noticed before, I assume in similar situations, without ever being sure why it happened. I'm sure it's well known, and I assume it has existed for years, if not all along. While BOINC is trying to connect and the DNS request is timing out (or maybe all the time, but with instant no-connection errors it's hardly noticeable), it seems to block other procedures while waiting for a response. The most noticeable side effect is that task timers freeze though processing continues, so you end up with CPU time higher than elapsed time. (The server apparently sorts it out by listing elapsed = CPU when the task is reported.) I also noticed that I can't view the log during that time: if I try to open it while BOINC is waiting for a response, it's just a blank box for that period.

From what I've seen of BOINC over the years, that does sound plausible - but it's going to be mighty hard to reproduce it under controlled conditions. Anybody got the kind of ISP line / gateway / router where they could introduce an arbitrary delay in the connection? I think mine just goes to 'immediate failure' with no wasted time.
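One possible way to get a timeout rather than an immediate failure, without special hardware, might be a local "black hole": a UDP socket that accepts datagrams and never answers. The Python sketch below only demonstrates that idea on the loopback interface. The function names are invented for illustration, and actually pointing the operating system's resolver at such a socket (port 53, which usually needs elevated privileges) is an untested assumption here, not a known recipe for reproducing the BOINC behavior.

```python
import socket
import threading


def start_blackhole(host="127.0.0.1", port=0):
    """Bind a UDP socket that reads datagrams and never replies."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))      # port 0 lets the OS pick a free port
    received = []

    def swallow():
        while True:
            try:
                data, _addr = sock.recvfrom(4096)
            except OSError:
                break            # socket was closed
            received.append(data)  # record the datagram, send nothing back

    threading.Thread(target=swallow, daemon=True).start()
    return sock, received


def query_times_out(server_addr, payload=b"\x00", timeout=0.5):
    """Send one datagram and report whether no reply arrived in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        client.sendto(payload, server_addr)
        try:
            client.recvfrom(4096)
        except socket.timeout:
            return True          # silence: the caller had to wait it out
        return False


if __name__ == "__main__":
    sock, received = start_blackhole()
    print("timed out:", query_times_out(sock.getsockname()))
```

The point of the design is that the sender gets no error at all, only silence, which is the situation described in the first post: connected, no instant failure, but every request waits for its full timeout.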

Minor issue I guess, and old, and with the recent funding announcements I doubt much will be done even about bigger ones, but just reminding people of it.

Actually, just at the moment, they're doing quite a lot of tidying up of a new version for imminent release - in about another week, if testing goes well. This would be a good time to get in another bugfix, if the details can be firmed up in time. I managed to get them to cure one a couple of months ago where the screen updates were frozen for 12-13 seconds at a time (turned out they were trying to rename an optional file which didn't exist...)

(Also, got another blank stderr after full processing last night. Been a while. Wingman doesn't help, being an autocorr overflow from a host with many (or at least many inconclusives).)

You might find that the blank stderr problem (actually, an incomplete stderr report to the server) is eased by using BOINC v7.6.6, which was made available for public testing last week. That cured the same problem at Milkyway@Home.

http://boinc.berkeley.edu/download_all.php
Cavalary

Joined: 15 Jul 99
Posts: 104
Credit: 7,507,548
RAC: 38
Romania
Message 1703951 - Posted: 22 Jul 2015, 11:55:34 UTC

Off the thread's main topic, but interesting: that task with the (apparently?) blank stderr validated. I'm guessing the result contained the actual data, just not the publicly readable part?
rob smith
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1703979 - Posted: 22 Jul 2015, 14:38:16 UTC

Correct - the stderr file is a very brief summary of the actual results file.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
