V7 & cuda

Message boards : Number crunching : V7 & cuda

Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1415959 - Posted: 15 Sep 2013, 14:15:54 UTC - in response to Message 1415955.  



On still another matter: those of you who consider those of us who prefer to primarily process AP tasks as "AP HOGS" are, I contend, wrong, and you do not understand the nature of the beast; especially this beast. I would rather spend 13 hrs. working on one type of task than a couple of hours working on x of another type. It's not how many tasks you process in any 24-hour period, unless you are a "credit king/queen"; it's how much science you do in the same period.


No worries, mate.
It's not like you are cherry picking amongst work sent to you and aborting what you don't like, as some have done in the past. You are simply choosing what you wish to process on your computers with your donated resources.
As do I, and many others on the project.
It's all good.

I know what runs best on my computers......believe me, I do...LOL.
And if I or you or anybody else wishes to set things up to get the most work that runs best on their computers.....go for it! When AP work is available, it's available to all. When it runs out, you can either choose to idle or do some MB work in the interim. No shame or foul in either choice.

Hope you get some answers to your other question.

+1

Cheers.
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1416119 - Posted: 15 Sep 2013, 19:25:14 UTC - in response to Message 1415937.  

Switched over to the new app_info this morning and received 64 new cuda50 tasks as expected with no v7 CPU units. Now if only a way can be found to pre-split AP tasks at their origin so that when they get to Berkeley they are on their own tapes and we can have a better supply of them.

On another matter, after the switch I recycled the machine in my normal manner and one of the openCL tasks (3154423812) aborted 194 (0xc2) EXIT_ABORTED_BY_CLIENT. The first couple of lines in the stderr.txt is

<core_client_version>7.2.11</core_client_version>
<![CDATA[
<message>
finish file present too long
</message>
<stderr_txt>

Is there anything I can do so this doesn't happen again, or is it the client?
...

It seems to be the BOINC client combined with Murphy's Law. One of the last things an app does before exiting is write an empty "finish file", and that error indicates the app didn't exit within ten seconds after writing that file. I think the error was likely caused by shutting BOINC down after the file was written but before the app had gotten to its usual exit, although there's only a very brief time period between. IOW, you'd probably never see it happen again even if you tried to duplicate the conditions.
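That finish-file handshake can be sketched roughly as follows. This is a hypothetical illustration, not the actual BOINC client code; the `Task` class, `poll` method, and `FINISH_FILE_TIMEOUT` name are invented for this sketch, based only on the behaviour described above (app writes an empty finish file, then the client expects the process to exit within about ten seconds):

```python
FINISH_FILE_TIMEOUT = 10  # seconds, per the behaviour described above


class Task:
    """Toy model of the client-side finish-file check for one task."""

    def __init__(self):
        self.finish_file_time = None  # when the client first saw the file

    def poll(self, finish_file_exists, process_exited, now):
        """One scheduler pass: return 'finished', 'error', or 'ok'.

        If the app process has exited, the task completed normally.
        If the finish file has been present longer than the timeout
        while the process is still alive, the client gives up -- the
        "finish file present too long" / EXIT_ABORTED_BY_CLIENT case.
        """
        if process_exited:
            return 'finished'
        if finish_file_exists:
            if self.finish_file_time is None:
                self.finish_file_time = now  # start the timeout clock
            elif now - self.finish_file_time > FINISH_FILE_TIMEOUT:
                return 'error'
        return 'ok'
```

In this model, a client shutdown that freezes the app between writing the finish file and exiting would leave the file present with the process still alive, so the next polls after restart run out the timeout and produce the error, which matches Joe's reading of the narrow window involved.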

The stderr for that task does show that BOINC actually restarted it after it had already finished, so I'm unsure of that analysis, but IMO that also indicates the problem is in the BOINC client.
                                                                  Joe
Profile jason_gee
Volunteer developer
Volunteer tester

Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1416127 - Posted: 15 Sep 2013, 20:13:18 UTC - in response to Message 1416119.  

On another matter, after the switch I recycled the machine in my normal manner and one of the openCL tasks (3154423812) aborted 194 (0xc2) EXIT_ABORTED_BY_CLIENT. ... Is there anything I can do so this doesn't happen again, or is it the client?

It seems to be the BOINC client combined with Murphy's Law. One of the last things an app does before exiting is write an empty "finish file", and that error indicates the app didn't exit within ten seconds after writing that file. ...

The stderr for that task does show that BOINC actually restarted it after it had already finished, so I'm unsure of that analysis, but IMO that also indicates the problem is in the BOINC client.
Joe


I've delved quite deeply into this kind of strange symptom over time; there are a large number of different odd behaviours originating from a similar set of root causes.

It's taking me longer than I hoped to document those causes for (hopefully) wider benefit, but the general gist is that some assumptions made about process and IO management, which may have applied in the distant past, no longer hold true (if they ever did) under operating systems and runtimes that optimise for desktop performance. Where low priorities and high system contention mix, such as at system shutdown, IO completion, normal process termination, and garbage collection can be postponed from the usual fractions of a second to on the order of minutes, even on otherwise healthy systems.

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.


©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.