Anonymous host throwing only errors, 3223 right now


Wiggo
Joined: 24 Jan 00
Posts: 7120
Credit: 95,290,383
RAC: 74,160
Australia
Message 1242758 - Posted: 7 Jun 2012, 13:28:29 UTC - in response to Message 1242725.

Host 6469701 has a lot of overflows or -9 errors.
Yet another anonymous host with a GTX 560 and a GTS8800!

I listed that machine earlier, as it's one of a few that are making a mess of things in my pendings. ;)

Cheers.
____________

Yanivicious
Joined: 29 Mar 12
Posts: 157
Credit: 12,448,985
RAC: 7,188
United States
Message 1242772 - Posted: 7 Jun 2012, 14:07:00 UTC - in response to Message 1242758.

I just looked at my own tasks, and I have 4 of these -12 errors on my laptop (GTX 8800M). I remember these errors happening, and all of them happened when my GPU was overloaded: I have SETI crunching enabled 24/7 on both CPUs and the GPU, and if I load up a video on YouTube or in a media player, etc., without suspending BOINC first, that happens every now and then.

LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1242788 - Posted: 7 Jun 2012, 14:50:40 UTC - in response to Message 1242772.
Last modified: 7 Jun 2012, 14:58:29 UTC

That is probably coincidence. [aka superstitious pigeon*]

There was a condition where not enough memory for tasks could lead to a false -12 instead of triggering the 'not enough mem' functions [CPU fallback for stock and up to x38g, task wait for x41g and later], but that got ironed out somewhere between x41g and x41x.

In general, -12 errors are 'too many triplets found' or something like two triplets too close together, IIRC. High occurrence in stock, low occurrence in x41g, and eliminated in x41x.

* The event is random, but you notice those instances most when it happens in conjunction with something else. The belief in a connection gets reinforced; the instances when it's not connected are either overlooked or seen as exceptions to the rule. Fascinating thing, psychology.
____________
I'm not the Pope. I don't speak Ex Cathedra!

Swordfish
Joined: 5 Aug 06
Posts: 72
Credit: 3,012,670
RAC: 0
United Kingdom
Message 1242793 - Posted: 7 Jun 2012, 14:54:25 UTC

Further to my post below I found these as well

          Host 6112618   Host 4202271   Host 6257665
Pending   3257           5041           65
Valid     1              102            8
Invalid   397            353            3
Error     44             12             24

All of these are in my pending list as inconclusives, with another wingman having to redo the work.

Why don't people check their results periodically to see if they are throwing out crap results, and take action if they are?

I agree with the comments below, and would go further and blacklist those who are consistently throwing out erroneous results.

We all have the occasional bum result, but the majority of us do not have the large numbers quoted above and in my previous post.
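
To put the counts from the table in this post on a common scale, here is a minimal sketch that works out what share of each host's finished tasks came back invalid or errored. The helper name `bad_fraction` is made up for illustration, and pending tasks are left out of the denominator because their outcome is not known yet. With these numbers the three hosts come out at roughly 99.8%, 78% and 77% bad.

```python
# Counts copied from the table in the post: (pending, valid, invalid, error).
hosts = {
    6112618: (3257, 1, 397, 44),
    4202271: (5041, 102, 353, 12),
    6257665: (65, 8, 3, 24),
}


def bad_fraction(valid, invalid, error):
    """Share of finished tasks (pending excluded) that were invalid or errored."""
    finished = valid + invalid + error
    return (invalid + error) / finished if finished else 0.0


for host_id, (pending, valid, invalid, error) in hosts.items():
    share = bad_fraction(valid, invalid, error)
    print(f"Host {host_id}: {share:.1%} bad, {pending} still pending")
```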

Alwana
Joined: 5 Oct 01
Posts: 1
Credit: 729,867
RAC: 775
Malaysia
Message 1244391 - Posted: 10 Jun 2012, 16:11:45 UTC - in response to Message 1242793.

I don't understand this error. Any advice on how to solve it? The error occurs near 90% completion.

Message from the system:
Stderr output
<core_client_version>7.0.25</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
setiathome_enhanced 6.02 DevC++/MinGW
libboinc: 6.3.6

Work Unit Info:
...............
WU true angle range is : 0.011385
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrumUnrolled2 0.00057 0.00000
sse3_ChirpData_ak 0.01955 0.00000
v_vTranspose4np 0.01383 0.00000
BH SSE folding 0.00453 0.00000
Restarted at 0.21 percent.

</stderr_txt>
]]>

Stick
Project donor
Volunteer tester
Joined: 26 Feb 00
Posts: 84
Credit: 1,704,171
RAC: 1,093
United States
Message 1244409 - Posted: 10 Jun 2012, 17:19:12 UTC
Last modified: 10 Jun 2012, 17:28:17 UTC

Here's another one: anonymous host 3378825. It has a small number of valid tasks, quite a few invalid tasks, plus a boatload of pending inconclusives. And it's getting bunches of new work.
____________

Yanivicious
Joined: 29 Mar 12
Posts: 157
Credit: 12,448,985
RAC: 7,188
United States
Message 1244522 - Posted: 10 Jun 2012, 22:52:24 UTC - in response to Message 1244409.

I just spent some time looking through a lot of the tasks that my computers have completed and seeing how the various wingmen are faring, and I was surprised to see how many of the wingmen's systems are throwing out hundreds of invalid and errored tasks. I sent private messages to all of the non-anonymous wingmen to give them a heads-up about their systems. I suggest other people do the same. But we still need a permanent solution for alerting or blocking all of those anonymous crunchers out there.

Ex
Volunteer moderator
Volunteer tester
Joined: 12 Mar 12
Posts: 2895
Credit: 1,767,981
RAC: 1,178
United States
Message 1244537 - Posted: 10 Jun 2012, 23:31:38 UTC

I still think it's time for server-side intervention.


Others have mentioned the same thing. If someone out there had their heart set on screwing with our DC project, they could set up two or more PCs in the same way to throw errors on both and reinforce tons of bad validated results.

This is not cool. There needs to be an error quota. If you screw up over 50% of any given type of task, you should not get any more of that task type without getting your stuff fixed and contacting us here to get your account reactivated.
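
As a rough illustration of the error quota idea, here is a minimal sketch of a per-task-type check. It is not how the BOINC scheduler works: `TaskCounts`, `may_fetch` and `MIN_SAMPLE` are hypothetical names, the 50% limit is the figure from this post, and the minimum sample size is an added assumption so that a host is not judged on a handful of tasks.

```python
from dataclasses import dataclass


@dataclass
class TaskCounts:
    """Finished-task tallies for one host and one task type (hypothetical)."""
    valid: int = 0
    invalid: int = 0
    error: int = 0


BAD_SHARE_LIMIT = 0.5  # the "over 50%" figure from the post
MIN_SAMPLE = 20        # assumption: too few finished tasks means no verdict yet


def may_fetch(counts: TaskCounts) -> bool:
    """Return False once more than half of the finished tasks were bad."""
    finished = counts.valid + counts.invalid + counts.error
    if finished < MIN_SAMPLE:
        return True
    bad_share = (counts.invalid + counts.error) / finished
    return bad_share <= BAD_SHARE_LIMIT


# Example tallies; the task-type labels are placeholders.
per_type = {"gpu": TaskCounts(1, 397, 44), "cpu": TaskCounts(90, 5, 5)}
for task_type, counts in per_type.items():
    print(task_type, "ok" if may_fetch(counts) else "blocked")
```

Keeping the tally per task type means a host with a broken GPU setup could still receive CPU work, which matches the "any more of that task type" wording.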
____________
-Dave #2

3.2.0-33

Yanivicious
Joined: 29 Mar 12
Posts: 157
Credit: 12,448,985
RAC: 7,188
United States
Message 1244539 - Posted: 10 Jun 2012, 23:36:44 UTC - in response to Message 1244537.

Exactly. I'm relatively new around here so I don't know the answer, but is there somebody this can be addressed to? Has anybody done that recently? Some of us spend a lot of time and money on this project (and other BOINC projects), really believe in the work we are doing here, and want to see that the integrity of this project isn't in jeopardy!

Wiggo
Joined: 24 Jan 00
Posts: 7120
Credit: 95,290,383
RAC: 74,160
Australia
Message 1244544 - Posted: 11 Jun 2012, 0:24:32 UTC - in response to Message 1244539.

Something should be done soon, or a lot of my work will likely be lost to being wrongly errored out by these rogue machines; plus, my inconclusive pendings are still increasing at a bad rate.

Cheers.


____________

Wiggo
Joined: 24 Jan 00
Posts: 7120
Credit: 95,290,383
RAC: 74,160
Australia
Message 1244923 - Posted: 12 Jun 2012, 4:59:00 UTC - in response to Message 1244544.

I did a bit of homework today to find out how many machines are mucking my end of things up (and many others' too, I bet), so here they are:

Computer 1334363
Computer 2699082
Computer 3378825
Computer 3440890
Computer 5007852
Computer 5236541
Computer 5292089
Computer 5345364
Computer 5461131
Computer 5485240
Computer 5542882
Computer 5762247
Computer 5874840
Computer 5889577
Computer 5927348
Computer 5935169
Computer 5967897
Computer 6126443
Computer 6148459
Computer 6229518
Computer 6247292
Computer 6249179
Computer 6253461
Computer 6256705
Computer 6271384
Computer 6283429
Computer 6318737
Computer 6365602
Computer 6401198
Computer 6441935
Computer 6469701
Computer 6568123
Computer 6586696
Computer 6589662
Computer 6633670
Computer 6640112
Computer 6643594
Computer 6643620
Computer 6650172
Computer 6650230
Computer 6651362

I bet there are many others besides them, but the bigger question is: what are these doing to the load on the databases?

Cheers.
____________

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4242
Credit: 116,005,338
RAC: 142,869
United States
Message 1244931 - Posted: 12 Jun 2012, 5:14:06 UTC - in response to Message 1244923.

Even though they are throwing out mostly errors, they are still chewing through fewer tasks per day than several of the top machines. I imagine the global count of errored or invalid tasks would be something like 50,000-100,000 per day. However, with around 2 million being returned a day, it isn't that much. Drops in a bucket, as they say.
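
For scale, the estimate above works out to a few percent of daily returns. A quick sketch of the arithmetic, using only the rough figures from this post (both are guesses, not measured values; `bad_share` is just an illustrative helper name):

```python
def bad_share(bad_per_day, returned_per_day):
    """Fraction of returned tasks that are errored or invalid."""
    return bad_per_day / returned_per_day


RETURNED_PER_DAY = 2_000_000  # "around 2 million being returned a day"

for bad in (50_000, 100_000):  # the low and high ends of the guess
    share = bad_share(bad, RETURNED_PER_DAY)
    print(f"{bad:,} bad of {RETURNED_PER_DAY:,} returned = {share:.1%}")
```

That puts the guess at between 2.5% and 5% of everything returned.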
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

betreger
Project donor
Joined: 29 Jun 99
Posts: 2391
Credit: 5,039,870
RAC: 10,668
United States
Message 1244934 - Posted: 12 Jun 2012, 5:24:09 UTC - in response to Message 1244931.
Last modified: 12 Jun 2012, 5:25:18 UTC

Hal, you speak like our fearless leader, Dr. D.A.
____________

Ex
Volunteer moderator
Volunteer tester
Joined: 12 Mar 12
Posts: 2895
Credit: 1,767,981
RAC: 1,178
United States
Message 1245072 - Posted: 12 Jun 2012, 15:01:19 UTC

Two machines erroring in the same fashion will produce valids if they wing each other...

So I guess it's a good thing these rogue machines give us grief as opposed to working silently as a team..
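
The point about two bad hosts validating each other can be shown with a toy quorum check. This is only a sketch under heavy assumptions: each result is reduced to a single number, and `validate` is a made-up function, not the project's validator, which compares the reported signals in far more detail.

```python
def validate(results, tolerance=1e-6):
    """Toy quorum check: the workunit validates as soon as two results agree.

    Returns the agreed value, or None when no two results match
    (the inconclusive case that needs another wingman).
    """
    for i, first in enumerate(results):
        for second in results[i + 1:]:
            if abs(first - second) <= tolerance:
                return first
    return None


GOOD = 3.7  # stand-in for a correct result
BAD = 0.0   # stand-in for the same wrong answer from a broken host

print(validate([BAD, GOOD]))        # no agreement, so inconclusive
print(validate([BAD, GOOD, GOOD]))  # a third wingman settles it
print(validate([BAD, BAD]))         # two broken hosts agree with each other
```

In the last case the check has no way to tell that both results are wrong, which is the scenario described above.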
____________
-Dave #2

3.2.0-33

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4252
Credit: 1,050,582
RAC: 248
United States
Message 1245080 - Posted: 12 Jun 2012, 15:11:14 UTC

The overall effect seems to be that on average there are about 2.2 results for each MB WU:

Workunits waiting for db purging    620,658      0    0m
Results waiting for db purging      1,345,459    37   0m

The extra results of course extend the duration of the database and storage impact, so it's more than the indicated 10%. When there are stats for AP, that ratio is often higher.
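
The ratio can be checked directly from the two purge-queue figures quoted in this post. A minimal sketch, assuming a minimum quorum of two results per workunit (the assumption behind reading 2.2 as about 10% extra):

```python
workunits = 620_658   # workunits waiting for db purging
results = 1_345_459   # results waiting for db purging
MIN_QUORUM = 2        # assumption: each workunit needs at least two results

results_per_wu = results / workunits
extra = results_per_wu / MIN_QUORUM - 1

print(f"{results_per_wu:.2f} results per workunit")
print(f"{extra:.1%} more results than the minimum of {MIN_QUORUM} per workunit")
```

With these exact figures the ratio is about 2.17, so the overhead is a little over 8%, in line with the rounded 2.2 and 10%.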
Joe

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3247
Credit: 31,806,008
RAC: 3,419
Netherlands
Message 1245289 - Posted: 13 Jun 2012, 10:14:33 UTC - in response to Message 1245080.

This results in extra network traffic and database strain; this >10% is 10% too much, and it is probably even higher, as Joe stated.

I see more and more results requiring a 3rd wingman.
Is the quota system unable to limit the amount of work sent to those hosts?


____________

JohnDK
Project donor
Volunteer tester
Joined: 28 May 00
Posts: 844
Credit: 45,179,666
RAC: 71,614
Denmark
Message 1245448 - Posted: 13 Jun 2012, 17:39:51 UTC

Another one with a bunch of invalids

http://setiathome.berkeley.edu/results.php?hostid=6247549&offset=0&show_names=0&state=4&appid=

Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1245500 - Posted: 13 Jun 2012, 19:04:36 UTC - in response to Message 1244923.
Last modified: 13 Jun 2012, 19:10:39 UTC

...
Computer 5345364
...
That's a nasty one.
I just noticed that my pending list keeps growing and I found a few pages of inconclusives, all with this host as wingman.

Why is it fetching workunits for cuda/cuda23 when there's no GPU shown on its host details page?
(stderr says GTX 280 and GTX 295)
____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4252
Credit: 1,050,582
RAC: 248
United States
Message 1245529 - Posted: 13 Jun 2012, 20:29:32 UTC - in response to Message 1245500.

Its last work fetch was about 31 hours ago; I judge the user has recognized the problem and either told BOINC not to use it or removed it completely.
Joe

Area 51
Joined: 31 Jan 04
Posts: 965
Credit: 42,193,520
RAC: 0
United Kingdom
Message 1245541 - Posted: 13 Jun 2012, 20:51:31 UTC

Sad though this is (to the extent that it makes a complete mockery of our efforts to produce good results), and given that the project has the capacity to blacklist hosts should it want to, I can only assume that the impact of these hosts is considered largely insignificant. Either that, or the project just doesn't want to get into the business of micromanaging individual hosts...

The trouble with the quota system as it is, is that it doesn't really protect against invalids, as they don't materialise until quorum cannot be reached (by which time a rogue host has downloaded another 1,000 tasks). Fixing that would alleviate a lot of the issues.

____________


Copyright © 2014 University of California