Strange result, how is this possible?

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1061441 - Posted: 30 Dec 2010, 14:36:12 UTC - in response to Message 1061423.  

See: wuid=669862967, and you'll find two super trashing machines validating each other's -9s, making the likely right result invalid, and entering false data into the database and the science.

Edit: And soon it will be gone from view too, when the system purges old data. It will sneak into the database as a valid result, and nobody will react or do anything about this joke of science.

Specifically, Computer 5472266 and Computer 5293938, both i7 CPUs with Fermi cards, and both running Raistmer's incompatible v12 autokill application.

This is the sort of thing which, unfortunately, gives optimisation a bad name.

To recap: the early NVidia stock application was incompatible with Fermis, too - and nobody, not even NVidia (who wrote the app), knew that incompatible hardware would be developed 18 months later. So I have no criticism of Raistmer or his application, which was fit, proper and useful in its time.

The problem is the inappropriate use of optimised applications in general. When the Fermi incompatibility was discovered, a new stock application could be (and was) rolled out, so that all 'set and forget' hosts got the new application automatically and stopped trashing WUs.

But all third-party applications, whether optimised or not, bypass the automatic updating process - so these unattended machines will go on trashing work until forcibly stopped, either by their owners or by the project blocking them.

That is why all contributors to this message board have a heavy responsibility when suggesting the use of third-party applications. You must, must, must stress the importance of users monitoring both the health of their machines, and software developments which may require a manual upgrade, as it does in this case.

The two users here have, in the first case, never posted on these message boards, and in the second case, not posted for almost 4 years. What on earth are people like that doing running optimised applications? Who told them the applications existed, or where to get them? I sincerely hope nobody here would do such a thing (without including the caution on updating), and I also hope there aren't any hardware or team forums out there giving bad advice, as well.
ID: 1061441
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1061442 - Posted: 30 Dec 2010, 14:44:04 UTC - in response to Message 1061434.  

Or how about the SETI project tries to contact these host owners to correct the problem, and if that doesn't help for whatever reason, block these hosts? Can they block hosts if they want?

No way does the project (or specifically, its - overworked, underpaid, and desperately outnumbered - staff) have time to do that.

In other projects, the volunteer moderators - not having much forum moderating to do - have taken on this sort of role. But I don't see that happening here. And what happens if people have email notification turned off, or have stopped using the email registered when they joined the project?

No, the simplest and most robust solution will be for the project to block the use of third-party applications entirely. Maybe that would bring some people to their senses.
ID: 1061442
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1061445 - Posted: 30 Dec 2010, 14:58:06 UTC - in response to Message 1061442.  

I didn't necessarily mean that the underpaid staff should do it themselves, but rather volunteers of some sort. They could try to contact the host owners in question, and if that doesn't help, for example if e-mail doesn't work, they could put the host on a server block list.

Blocking optimised apps is a bad solution, I think.
ID: 1061445
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1061448 - Posted: 30 Dec 2010, 15:04:36 UTC - in response to Message 1061442.  
Last modified: 30 Dec 2010, 15:06:35 UTC

I am using a BOINC client 6.2.19 by Dotsch and a 6.03 SETI app on my virtual machine running Solaris Express on a Linux host and, although slower than the Lunatics app I am using on my Linux host, they never crash. No GPU.
Tullio
ID: 1061448
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1061462 - Posted: 30 Dec 2010, 15:42:36 UTC - in response to Message 1061445.  

I think in reality the project would want to avoid blocking anonymous applications except as the very, very last resort - but if the science is damaged by these false validations, and the reputation of the Lunatics applications is reduced as a consequence, then they might be forced to consider it - even if perfectly blameless participants like Tullio suffer collateral damage.

But I don't see how volunteers could solve the problem. We don't have direct access to users' email addresses, only indirectly via PMs; and we certainly don't have (and will never be given) the power to block individual hosts. The most we could ask is that a member of staff develop a 'block and notify' script (such as was used under similar circumstances at CPDN), and agree to run it against hosts identified and notified by responsible volunteers.
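
To make the 'block and notify' idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `HostReport` record, the `block_and_notify` function, the email addresses and the message wording are all hypothetical, and it is not the CPDN script or anything the project actually runs. It assumes volunteers hand staff a list of reported hosts, and that staff can see whether an owner has a usable email address on file.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HostReport:
    """One host reported by a volunteer (hypothetical record layout)."""
    host_id: int
    owner_email: Optional[str]  # None if no usable address is on file
    still_faulty: bool          # False if the owner has already fixed the host


def block_and_notify(
    reports: List[HostReport],
) -> Tuple[List[int], List[Tuple[str, str]]]:
    """Return (host IDs to block, (email, message) pairs to send)."""
    blocked: List[int] = []
    notices: List[Tuple[str, str]] = []
    for report in reports:
        if not report.still_faulty:
            continue  # e.g. the host has gone back to the stock application
        blocked.append(report.host_id)
        if report.owner_email is not None:
            message = (
                f"Host {report.host_id} has been blocked because it keeps "
                "returning bad results. Please update its application."
            )
            notices.append((report.owner_email, message))
    return blocked, notices


# Host IDs are from this thread; the addresses and flags are invented.
reports = [
    HostReport(5472266, None, True),
    HostReport(5293938, "owner@example.com", True),
    HostReport(5393593, "other@example.com", False),
]
blocked, notices = block_and_notify(reports)
print(blocked)       # host IDs staff would add to the block list
print(len(notices))  # number of owners who can actually be emailed
```

The point of the sketch is that a host still gets blocked even when its owner cannot be reached, which covers the case of people who no longer read the registered email address.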
ID: 1061462
Profile ReiAyanami
Joined: 6 Dec 05
Posts: 116
Credit: 222,900,202
RAC: 174
Japan
Message 1061474 - Posted: 30 Dec 2010, 16:10:20 UTC

Hi, I just added a new cruncher three weeks ago: an i7-950 with two EVGA GTX-470s, running BOINC apps and NO 3rd party apps on Windows 7 Pro with the newest nVidia drivers.
I occasionally just stare at the BOINC tasks screen (as probably all of you do sometimes) and noticed that once in a while CUDA calculations finish in a couple of minutes despite a projected time of over 10 min, while all other calculations of the same WU series take over 10 min.

Is this what you are talking about?
If so, what could be causing this on my PC?
All stock apps and newest drivers, it's not even overclocked.
ID: 1061474
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1061478 - Posted: 30 Dec 2010, 16:14:51 UTC - in response to Message 1061462.  

I think you have an idea there with the script to block. And I really hope that the guys in the lab read this thread.

Old James
ID: 1061478
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1061485 - Posted: 30 Dec 2010, 16:25:05 UTC - in response to Message 1061474.  

If it only happens 'once in a while', that's normal, and nothing to worry about.

What we're talking about are computers which do that all day, every day, for months on end - and whose owners obviously don't look at them for at least as long. Yours is running fine.

PS - well, not quite fine: Error tasks for computer 5636753. Maybe running a bit hot? But nothing like the problems in this thread.
ID: 1061485
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1061489 - Posted: 30 Dec 2010, 16:30:09 UTC - in response to Message 1061478.  

James Sotherden wrote:
I think you have an idea there with the script to block. And I really hope that the guys in the lab read this thread.

Here's the CPDN reference in case they do: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=6482
ID: 1061489
Profile ReiAyanami
Joined: 6 Dec 05
Posts: 116
Credit: 222,900,202
RAC: 174
Japan
Message 1061491 - Posted: 30 Dec 2010, 16:34:16 UTC

All your Cuda machines seems to work as they should, even the one you mentioned above. There's a few errors but nothing out of the ordinary. As far as I can judge you've done a good job in setting up your systems, and can relax, sit back and enjoy the ride.

Very well done.


Thank you, but the following is what I found in my WU.
http://setiathome.berkeley.edu/workunit.php?wuid=670260959
This work HAS -9 result_overflow

I went back and looked at some pending results and found a few of these.
ID: 1061491
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1061509 - Posted: 30 Dec 2010, 16:59:27 UTC - in response to Message 1061491.  

And welcome to the boards. Hopefully you have a program to monitor temps. Also it helps to clean the dust bunnies out every so often. And as I use this i7 as my daily driver I reboot about every 3 or 4 days.

Old James
ID: 1061509
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 1061516 - Posted: 30 Dec 2010, 17:17:48 UTC

Previously I mentioned "old" installations should be candidates for cut off also. Look at this one:

http://setiathome.berkeley.edu/show_host_detail.php?hostid=3198906

How can a machine with a RAC of 100 be issued 1100+ work units (and why would you want that)?


Dave

ID: 1061516
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 1061527 - Posted: 30 Dec 2010, 17:38:02 UTC

He is running BOINC 4.19.

Did I read in another thread that the 4.x versions do not calculate credit correctly any longer?

I am paired with him on an AP WU.

http://setiathome.berkeley.edu/workunit.php?wuid=674824439

I guess my AP, instead of ~700 credits, will receive nothing or close to it?



Dave

ID: 1061527
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1061533 - Posted: 30 Dec 2010, 17:47:02 UTC

I have a cache of 0.25 days on 6 BOINC projects and I get a new WU to crunch on my Linux box only when the preceding one has been uploaded. And I am never out of work.
Tullio
ID: 1061533
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1061602 - Posted: 30 Dec 2010, 21:20:45 UTC - in response to Message 1061527.  

Einstein@Home has a minimum BOINC version that you need to run. I can't fathom why SETI@home doesn't do the same. It seems that would whack some of the garbage being returned.

Old James
ID: 1061602
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1061620 - Posted: 30 Dec 2010, 22:03:20 UTC - in response to Message 1061602.  

James Sotherden wrote:
Einstein@Home has a minimum BOINC version that you need to run. I can't fathom why SETI@home doesn't do the same. It seems that would whack some of the garbage being returned.

The trouble is that the replies from BOINC v4.19 aren't garbage.

The primary output, the result file, aka 'science', is OK.

It's only the secondary output, the credit, which is distorted.
ID: 1061620
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1061771 - Posted: 31 Dec 2010, 3:28:22 UTC - in response to Message 1061527.  

He is running 4.19 Boinc.

Did I read in another thread that the 4 versions do not calculate credit correctly any longer?
...

You may have read that, but it isn't an accurate statement. Any BOINC version prior to 5.2.7 doesn't report the fpops_cumulative we were using for work based credit, but that no longer matters for the new credit system.

The 4.x versions on Mac systems definitely have a problem establishing shared memory communication, and if that fails the application does the work in standalone mode and no CPU time is reported, leading to zero credit claims even with the new credit system. On other platforms the 4.x versions may also have a weakness in that same area, but the pending tasks for that host have reasonable CPU times. Not having elapsed time reporting (which started someplace in the 6.6 series) also doesn't make much difference.
                                                                Joe
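
For anyone who wants to check a client version against the 5.2.7 threshold mentioned above, a small Python sketch follows. It assumes plain dotted version strings such as "4.19" or "6.2.19"; the helper names `parse_version` and `reports_fpops_cumulative` are made up for this example and are not part of BOINC.

```python
from typing import Tuple

# Threshold taken from the post above: clients older than 5.2.7 do not
# report fpops_cumulative.
FPOPS_MIN_VERSION = (5, 2, 7)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '6.2.19' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def reports_fpops_cumulative(client_version: str) -> bool:
    """True if this client version is 5.2.7 or later (hypothetical helper)."""
    return parse_version(client_version) >= FPOPS_MIN_VERSION


for version in ("4.19", "5.2.6", "5.2.7", "6.2.19"):
    print(version, reports_fpops_cumulative(version))
```

Comparing tuples of integers rather than raw strings matters here, because as text "4.19" and "5.2.7" do not sort reliably once minor numbers reach two digits.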
ID: 1061771
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1061788 - Posted: 31 Dec 2010, 4:31:56 UTC - in response to Message 1061473.  

Sten-Arne wrote:
But number one, though, is to find an automated way to find and identify those hosts that put false results into the database. There's no way in h*** that it's going to be possible for volunteers to manually search through the system to find those hosts. It has to be done in some automatic database search fashion.

4202271
5257703
5293938
5354400
5393593
5396192
5445713
5467571
5472266
5478875
                                                              Joe
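
As a rough illustration of what an "automatic database search" for such hosts could look like, here is a short Python sketch. It is not how the list above was produced, and it does not use the real BOINC database schema: it assumes someone has already exported (host ID, was the result a -9 overflow) pairs, and the function name and both thresholds are guesses for illustration only.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def flag_overflow_hosts(
    results: Iterable[Tuple[int, bool]],
    min_tasks: int = 50,
    min_overflow_fraction: float = 0.9,
) -> List[int]:
    """Return host IDs whose results are almost all -9 overflows.

    `results` yields (host_id, is_overflow) pairs. Both thresholds are
    illustrative guesses, not project policy.
    """
    totals: Counter = Counter()
    overflows: Counter = Counter()
    for host_id, is_overflow in results:
        totals[host_id] += 1
        if is_overflow:
            overflows[host_id] += 1
    return sorted(
        host_id
        for host_id, total in totals.items()
        if total >= min_tasks
        and overflows[host_id] / total >= min_overflow_fraction
    )


# Made-up sample: host 1 overflows on every task, host 2 only occasionally.
sample = [(1, True)] * 100 + [(2, True)] * 3 + [(2, False)] * 97
print(flag_overflow_hosts(sample))
```

Requiring both a minimum task count and a high overflow fraction is meant to separate hosts that overflow all day, every day, from healthy hosts that see the odd genuine overflow.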
ID: 1061788
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1061853 - Posted: 31 Dec 2010, 9:54:08 UTC - in response to Message 1061788.  

Looking at that list, I don't understand why hosts with "Consecutive valid tasks = 0" show a "Number of tasks today" of several hundred. Is this correct, and if so, why?

Bernie.
ID: 1061853
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1061900 - Posted: 31 Dec 2010, 12:33:18 UTC - in response to Message 1061899.  

Sten-Arne wrote:
But number one, though, is to find an automated way to find and identify those hosts that put false results into the database. There's no way in h*** that it's going to be possible for volunteers to manually search through the system to find those hosts. It has to be done in some automatic database search fashion.

4202271
5257703
5293938
5354400
5393593
5396192
5445713
5467571
5472266
5478875
                                                              Joe


Yes, you have found a few of them. Was that done by manually searching the system the way we all can, or do you have access to the database through other means?


hostid=5393593 has reverted back to Stock since the list was published.

Claggy
ID: 1061900


 