Strange result, how is this possible?

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1061903 - Posted: 31 Dec 2010, 12:39:03 UTC - in response to Message 1061900.  
Last modified: 31 Dec 2010, 12:45:16 UTC

hostid=5393593 has reverted back to Stock since the list was published.

Claggy

No evidence of participation in these forums, but a member of a non-English-language team. I hope that's a good sign that the message might spread outwards through other means.

Edit - yes, they are discussing this problem on the SETI.Germany forums. Thank you.
ID: 1061903 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1062038 - Posted: 31 Dec 2010, 19:13:21 UTC - in response to Message 1061899.  

Sten-Arne wrote:
But number one, though, is to find an automated way to find and identify those hosts that produce false results into the database. There's no way in h*** that it's going to be possible for volunteers to manually search through the system to find those hosts. It has to be done in some automatic database search fashion.

4202271
5257703
5293938
5354400
5393593
5396192
5445713
5467571
5472266
5478875
                                                              Joe


Yes, you have found a few of them. Was that made through manually searching the system the way we all can, or do you have access to the database through other means?

By manually searching, mostly last August. By August 25 I was no longer finding new host IDs by checking wingmates confirming Fermi false overflows so I judged the list of 22 hosts was complete for all practical purposes. Rechecking that list yesterday found 8 which were still actively using V12 on Fermi GPUs, and without being thorough I also picked up 2 additional IDs.

Whether 10 or 20 is "few" depends on point of view, and I didn't keep track of how many Fermi GPUs each host has. The false overflows only take about 20 seconds, and those VHAR WUs which the combination crunches correctly don't take much more time. So those GPUs will be wanting more than the base quota of 800 tasks per day. The frequent invalid cases will ensure that the quota doesn't often go much above that base, while the fact that the false overflows are "Success" as far as the report goes ensures the quota will not stay below that base. So the quota system is at least helping somewhat to limit the problem, but it would be better if there were some way it could distinguish between a continuing problem and a temporary accident.
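
To put rough numbers on the "wanting more than the base quota" point, here is a back-of-envelope sketch. It assumes a single GPU doing nothing but 20-second false overflows, one task at a time, around the clock, so it is an upper bound rather than a measurement. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_tasks_per_day(seconds_per_task: int) -> int:
    """Upper bound on tasks one GPU can finish in a day, one task at a time."""
    return SECONDS_PER_DAY // seconds_per_task


false_overflow_rate = max_tasks_per_day(20)   # "about 20 seconds" per false overflow
base_gpu_quota = 800                          # base GPU quota quoted in the post

print(false_overflow_rate)                    # 4320
print(false_overflow_rate > base_gpu_quota)   # True
```

Under those assumptions such a GPU could burn through several times the base quota in a day, which is why the quota is the limiting factor here.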

================================

Hosts which are running the correct software but still produce many false overflows are essentially an insoluble problem. BOINC advertises that simply installing the infrastructure and selecting projects to run is all that's required, and the majority of users essentially do so. Given that the crunching is done with consumer grade hardware, the projects must be prepared to deal with anomalous results. While those of us who regularly read and contribute to this forum would prefer it otherwise, some marginal hosts will continue to produce results which have similar effects to additional RFI. Luckily, CUDA false positives when running the correct software very seldom match, so they mainly impact efficiency.
                                                             Joe
ID: 1062038 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1062048 - Posted: 31 Dec 2010, 19:36:17 UTC - in response to Message 1062038.  

I chased one of my inconclusives down to a wingman running a 295. Half of it was turning in -9s, the other half was turning in good work. He may never know something is wrong if he just looks at his credits still rising and his RAC still holding.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1062048 · Report as offensive
Profile skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1062388 - Posted: 1 Jan 2011, 15:25:09 UTC - in response to Message 1062048.  

Here's another guy with invalid problems. Seems he gets 1 WU correct for every 5 he messes up. 5444296

In fact, among his current valid WUs he does have invalid work getting credit.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1062388 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1062391 - Posted: 1 Jan 2011, 15:32:25 UTC - in response to Message 1062388.  

Yep, and here's why
V12 modification by Raistmer


We need to really start harping on the fact this old app won't work with the new Fermi cards!!


PROUD MEMBER OF Team Starfire World BOINC
ID: 1062391 · Report as offensive
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1062414 - Posted: 1 Jan 2011, 17:28:13 UTC - in response to Message 1062388.  

Here's another guy with invalid problems. Seems he gets 1 WU correct for every 5 he messes up. 5444296

In fact, among his current valid WUs he does have invalid work getting credit.

As someone else mentioned, how can he have a max tasks per day of 101 if he only gets 1 OK task for every 5 bad ones?

SETI@home Enhanced (anonymous platform, nvidia GPU)
Number of tasks completed 1657
Max tasks per day 101
Number of tasks today 585
Consecutive valid tasks 0
ID: 1062414 · Report as offensive
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1062458 - Posted: 1 Jan 2011, 19:36:39 UTC

Another Fermi with the wrong opt app: 5701024
ID: 1062458 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1062462 - Posted: 1 Jan 2011, 19:42:52 UTC - in response to Message 1062458.  

Another Fermi with the wrong opt app: 5701024

Seems he gets 1 WU correct for every 5 he messes up. 5444296

Both of those are members of team SETI.Germany, like the one who cleaned up his host earlier in this thread. I wonder whether we might have a distribution pattern here?
ID: 1062462 · Report as offensive
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1062469 - Posted: 1 Jan 2011, 19:50:27 UTC - in response to Message 1062462.  

Another Fermi with the wrong opt app: 5701024

Seems he gets 1 WU correct for every 5 he messes up. 5444296

Both of those are members of team SETI.Germany, like the one who cleaned up his host earlier in this thread. I wonder whether we might have a distribution pattern here?


I did some digging, and his wingman is turning in invalids and errors. Worst of all, for machine 5701024 a lot of the work looks like it validated for 4.0 credits. His wingman is also running a GPU.

Old James
ID: 1062469 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9960
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1062504 - Posted: 1 Jan 2011, 21:11:24 UTC

I am obviously missing something here, but:

Number of tasks completed 1663
Max tasks per day 100
Number of tasks today 585
Consecutive valid tasks 0

Really doesn't make any sense.

Bernie
ID: 1062504 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1062517 - Posted: 1 Jan 2011, 21:51:11 UTC

As of now, this is as close as I can come to a complete list of hosts running the V12 CUDA application on Fermi GPUs:

3099502, 4202271, 5029073, 5149058
5231715, 5257703, 5293938, 5305178
5354400, 5357884, 5396192, 5444296
5445713, 5448502, 5467571, 5472266
5478875, 5541031, 5564376, 5701024

Not all of those hosts have results showing which were reported within the last few hours, so some could possibly have been fixed recently.

There are 24 Fermi GPUs in those 20 hosts.
                                                               Joe
ID: 1062517 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1062521 - Posted: 1 Jan 2011, 22:27:10 UTC - in response to Message 1062504.  

I am obviously missing something here, but:

Number of tasks completed 1663
Max tasks per day 100
Number of tasks today 585
Consecutive valid tasks 0

Really doesn't make any sense.

Bernie

Number of tasks completed is a count of valid tasks done with the indicated application since the new credit system was started. "Number of tasks which have been granted credit" would be more accurate.

Max tasks per day is an accurate label only for a single core CPU. It's multiplied by 8 for GPUs at this project, and of course by the number of compute resources available for the application. The 100 value can be exceeded if Consecutive valid grows, and can be reduced by errors and invalid work. However, when it's below 100 each result reported by the host as a success tries to double it; any temporary value of 50 or more jumps to 100. If the application is throwing false overflows they are reported as successes just like a real overflow caused by the data.
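
A rough sketch of the rules described above, in Python, for anyone who wants to play with the numbers. This is not the actual BOINC scheduler code. The function names are made up, the single-GPU default and the starting quota of 1 are assumptions for illustration, and neither the size of the penalty for errors and invalid work nor the growth above 100 is modelled, since neither is spelled out here.

```python
BASE_QUOTA = 100     # base "Max tasks per day" value, from the post
GPU_MULTIPLIER = 8   # multiplier applied for GPUs at this project, from the post


def effective_daily_limit(max_tasks_per_day: int, gpu_count: int = 1) -> int:
    """Daily task limit for a GPU application: shown value x 8 x number of GPUs."""
    return max_tasks_per_day * GPU_MULTIPLIER * gpu_count


def after_reported_success(quota: int) -> int:
    """Quota after one result reported as a success.

    Below the base, each success tries to double the value, and anything
    that reaches 50 or more jumps to the base. Growth above the base
    (driven by consecutive valid tasks) is not modelled here.
    """
    if quota >= BASE_QUOTA:
        return quota
    return min(quota * 2, BASE_QUOTA)


def successes_to_recover(quota: int) -> int:
    """Count reported successes needed to climb from `quota` back to the base."""
    if quota < 1:
        raise ValueError("quota must be at least 1")
    count = 0
    while quota < BASE_QUOTA:
        quota = after_reported_success(quota)
        count += 1
    return count


print(effective_daily_limit(100))   # 800, so 585 tasks in a day is within the limit
print(successes_to_recover(1))      # 7 (1, 2, 4, 8, 16, 32, 64, 100)
```

Even from a hypothetical low of 1, seven results reported as successes are enough to get back to 100, and a false overflow counts as one of those whether or not it later validates.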
                                                               Joe
ID: 1062521 · Report as offensive
Profile Todd Hebert
Volunteer tester
Joined: 16 Jun 00
Posts: 648
Credit: 228,292,957
RAC: 0
United States
Message 1062543 - Posted: 1 Jan 2011, 23:11:45 UTC
Last modified: 1 Jan 2011, 23:12:21 UTC

Just glad that the PCs causing the issues aren't mine or any of my team members'. Looking at the resources of the machines, it is too bad that they aren't being well used. Has anyone PM'd the users in question? I believe that there is a proper optimization guide, published by one of my team members, AndrewKing01, floating around here somewhere. I will dig it up if it can't be found.

The members of GPU Users group are always available to lend a hand to any user that is experiencing problems getting their machine to run as they should.

Todd
Team GPU Users Group Admin - World RAC Leader
ID: 1062543 · Report as offensive
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1062576 - Posted: 2 Jan 2011, 0:22:11 UTC

Do you think it would do any good to post on the SETI@home home page? Just a short blurb that some CUDA hosts are running the wrong apps and all users should check their cards?

Or maybe a mass e-mailing to all users.

Or maybe we are worried over nothing. It would be interesting to hear what the lab's take on this is.

Old James
ID: 1062576 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1062614 - Posted: 2 Jan 2011, 2:41:11 UTC - in response to Message 1062543.  

Just glad that the PCs causing the issues aren't mine or any of my team members'. Looking at the resources of the machines, it is too bad that they aren't being well used. Has anyone PM'd the users in question?


I've been trying to PM the ones I catch as my wingmen but I'm not sure it will do much good. If they haven't paid attention to the forums they probably have email notification turned off.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1062614 · Report as offensive
Profile skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1062630 - Posted: 2 Jan 2011, 3:13:19 UTC - in response to Message 1062576.  

Do you think it would do any good to post on the SETI@home home page? Just a short blurb that some CUDA hosts are running the wrong apps and all users should check their cards?

Or maybe a mass e-mailing to all users.

Or maybe we are worried over nothing. It would be interesting to hear what the lab's take on this is.

Nope, since the apps are not technically SETI's, and because these are obviously people who set it and forget it. They don't check in here, they don't post here, and they don't monitor their machines, despite all the times people are told that they need to monitor their machines and check in to be sure some update hasn't occurred.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1062630 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1062654 - Posted: 2 Jan 2011, 6:03:45 UTC - in response to Message 1062517.  

As of now, this is as close as I can come to a complete list of hosts running the V12 CUDA application on Fermi GPUs:

3099502, 4202271, 5029073, 5149058
5231715, 5257703, 5293938, 5305178
5354400, 5357884, 5396192, 5444296
5445713, 5448502, 5467571, 5472266
5478875, 5541031, 5564376, 5701024

Not all of those hosts have results showing which were reported within the last few hours, so some could possibly have been fixed recently.

There are 24 Fermi GPUs in those 20 hosts.
                                                               Joe

One less now, 5541031 has updated.
                                                                Joe
ID: 1062654 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34684
Credit: 79,922,639
RAC: 80
Germany
Message 1062661 - Posted: 2 Jan 2011, 8:13:51 UTC


I'm wondering how this can happen.

http://setiathome.berkeley.edu/workunit.php?wuid=671914757

I can't find the credit multiplier in the Fermi app.


With each crime and every kindness we birth our future.
ID: 1062661 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1062818 - Posted: 2 Jan 2011, 19:10:22 UTC - in response to Message 1062661.  


I'm wondering how this can happen.

http://setiathome.berkeley.edu/workunit.php?wuid=671914757

I can't find the credit multiplier in the Fermi app.

The new credit system doesn't use anything from the app; the elapsed time reported by the BOINC core client (or CPU time from older versions of BOINC) is the basis of credits. That's combined with server averaging which produces what amounts to a benchmark for each application on the host.

The WU you linked has of course been purged from the database. With a purge delay of only about 6 hours, it's best to quote pertinent details here.
                                                               Joe
ID: 1062818 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1062822 - Posted: 2 Jan 2011, 19:17:41 UTC

From my latest run looking for anonymous platform hosts using outdated CUDA apps on Fermi cards, 5029073 has been updated and is no longer in the list:

Running Lunatics V12... 3099502 4202271 5149058 5231715 5257703 5293938 5305178 5346869 5354400 5357884 5396192 5444296 5445713 5448502 5467571 5472266 5478875 5564376 5701024
Running Crunch3r's 64 bit Linux r12... 5618971
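
For anyone tracking which hosts drop off or appear between runs, a plain set difference does the job. The sketch below uses the V12 host IDs from the 1 Jan list and from this run; the variable names are just for illustration.

```python
# V12-on-Fermi host IDs from the 1 Jan 2011 list (20 hosts)
previous_v12 = {
    3099502, 4202271, 5029073, 5149058, 5231715, 5257703, 5293938,
    5305178, 5354400, 5357884, 5396192, 5444296, 5445713, 5448502,
    5467571, 5472266, 5478875, 5541031, 5564376, 5701024,
}

# "Running Lunatics V12" host IDs from the 2 Jan 2011 run (19 hosts)
current_v12 = {
    3099502, 4202271, 5149058, 5231715, 5257703, 5293938, 5305178,
    5346869, 5354400, 5357884, 5396192, 5444296, 5445713, 5448502,
    5467571, 5472266, 5478875, 5564376, 5701024,
}

no_longer_listed = sorted(previous_v12 - current_v12)
newly_listed = sorted(current_v12 - previous_v12)

print(no_longer_listed)   # [5029073, 5541031]
print(newly_listed)       # [5346869]
```

The two hosts that dropped out are the ones already noted as updated, and 5346869 is new to the V12 list in this run.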
                                                               Joe
ID: 1062822 · Report as offensive
©2026 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.