Trainwreck waiting to happen or valid science ??

Author	Message
Dave Stegner Volunteer tester Joined: 20 Oct 04 Posts: 540 Credit: 65,583,328 RAC: 27	Message 1530342 - Posted: 20 Jun 2014, 20:04:55 UTC Last modified: 20 Jun 2014, 20:06:55 UTC I just started up a new machine and am keeping a close eye on it. http://setiathome.berkeley.edu/results.php?hostid=7317240 Naturally, when I saw a validation inconclusive, I checked it out. http://setiathome.berkeley.edu/workunit.php?wuid=1526127745 I looked a little further at the wingman's cruncher. http://setiathome.berkeley.edu/show_host_detail.php?hostid=7295144 While looking at a few of his VALID tasks, I found this. http://setiathome.berkeley.edu/workunit.php?wuid=1526259021 My question is: are invalid results validating against other invalid results? Whenever I have checked my inconclusives before, a wingman's result with a few seconds of GPU work against my 5 hours of CPU work would always be marked invalid after the workunit was run by a third host. Now 2 or 3 seconds of work is validated against another machine with the same type of GPU and a couple of seconds of work. GPUs, properly configured and watched, are probably worth having. But how many that do not fall into that category do we need before someone does something? Since May the 27th, 1698 inconclusive and 796 invalid, and still counting. I for one am here to do valid science, but when I see things like this I begin to wonder. Maybe it is just me being paranoid, but it sure looks ugly. Dave ID: 1530342

HAL9000 Volunteer tester Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1530347 - Posted: 20 Jun 2014, 20:21:07 UTC Last modified: 20 Jun 2014, 20:23:13 UTC The workunit you linked to is 100% blanked. I'm not certain, but I think those get validated like MBs that come in with -30 overflow errors. All of the valid tasks on that machine have wingmates with no invalid or processing errors. Even a broken clock is right twice a day. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] ID: 1530347

Josef W. Segur Volunteer developer Volunteer tester Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 1530400 - Posted: 21 Jun 2014, 0:09:46 UTC - in response to Message 1530347. Last modified: 21 Jun 2014, 0:12:40 UTC [Quote] The workunit you linked to is 100% blanked. I'm not certain, but I think those get validated like MBs that come in with -30 overflow errors. All of the valid tasks on that machine have wingmates with no invalid or processing errors. Even a broken clock is right twice a day. [/Quote] And that workunit is a B3_P1. That channel has had an intermittent data problem for years which can cause all the AP tasks from that channel to fail because there's no usable data. The problem tends to make all tasks from that channel recorded over a month or more fail that way; then it mysteriously disappears, only to appear again a few months later. The validation is as Hal has noted: the apps indicate a "Success" exit and the validator considers zero signals from both hosts as a match. Joe ID: 1530400
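
Joe's point can be illustrated with a minimal sketch. This is not the project's validator code: the `Result` class, its fields, and `results_match` are hypothetical names, and a real validator would compare signals within tolerances rather than by exact equality. The sketch only shows why two results that both exit with "Success" and both report zero signals compare as a match.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """Hypothetical stand-in for one host's returned task."""
    host_id: int
    exit_ok: bool                      # True if the app reported a "Success" exit
    signals: list = field(default_factory=list)  # reported signal values (assumed floats)


def results_match(a: Result, b: Result) -> bool:
    """Simplified comparison: both must exit OK and report the same signals."""
    if not (a.exit_ok and b.exit_ok):
        return False
    # Two empty signal lists are equal, so "zero signals" matches "zero signals".
    return sorted(a.signals) == sorted(b.signals)


# Placeholder host IDs; both hosts found nothing in a fully blanked workunit.
cpu_result = Result(host_id=1, exit_ok=True)
gpu_result = Result(host_id=2, exit_ok=True)
print(results_match(cpu_result, gpu_result))
```

Under these assumptions, a host that returns an empty result in a few seconds validates against any other host that also found nothing, regardless of how long either one ran.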

Dave Stegner Volunteer tester Joined: 20 Oct 04 Posts: 540 Credit: 65,583,328 RAC: 27	Message 1530405 - Posted: 21 Jun 2014, 0:28:14 UTC Apparently I picked on the wrong work unit as an example, or maybe I should not have picked one at all. But what about the other 2500 "issues" he has created in 2 weeks' time? Dave ID: 1530405

Wiggo Joined: 24 Jan 00 Posts: 34760 Credit: 261,360,520 RAC: 489	Message 1530409 - Posted: 21 Jun 2014, 0:38:23 UTC - in response to Message 1530405. [Quote] Apparently I picked on the wrong work unit as an example, or maybe I should not have picked one at all. But what about the other 2500 "issues" he has created in 2 weeks' time? [/Quote] There are a lot more rigs like that one around here, and most of their owners won't answer PMs while others insist that there's nothing wrong with them. :-( You only have to check out the Invalid Host Messaging thread pinned near the top of this forum section to see a lot more. Cheers. ID: 1530409

betreger Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1530416 - Posted: 21 Jun 2014, 0:55:00 UTC - in response to Message 1530409. Yes, Seti has a lot, but a question I ask is why don't I see them on my other project, E@H? As an aside, I crunch a lot more there. ID: 1530416

Dave Stegner Volunteer tester Joined: 20 Oct 04 Posts: 540 Credit: 65,583,328 RAC: 27	Message 1530422 - Posted: 21 Jun 2014, 1:14:33 UTC As per Josef's explanation (posted in the other thread, quoted below): why can something not be done to shut these producers of scrap off, period? If I understand correctly, the allocation for a gpu starts at 800. If you trash those 800, you should be cut off. [Quote] How can his number of tasks today be so high, when his max tasks per day is only 33? For GPUs, max tasks per day is multiplied by the project's gpu_multiplier setting, which is 8 here. So the base quota is 264 per GPU. And the quota check is done early: you either get the message saying quota has been exceeded or the servers try to fulfill the work request. That often overshoots the quota (particularly on GPUs) because the project's max_wus_to_send is also multiplied by gpu_multiplier. See http://boinc.berkeley.edu/trac/wiki/ProjectOptions#Joblimits Wiggo has suggested reducing the base quota, and it could probably be trimmed to 25 or so without making recovery from a temporary problem too slow. But perhaps it would make sense to reduce the gpu_multiplier setting too. Joe [/Quote] Dave ID: 1530422
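
The quota arithmetic in Joe's quoted explanation can be checked with a short sketch. The function name `gpu_daily_quota` is made up for illustration; the values 33, 25 and 8 come from the quote, while the reduced multiplier of 4 is purely a hypothetical example of "reducing the gpu_multiplier setting".

```python
def gpu_daily_quota(max_tasks_per_day: int, gpu_multiplier: int) -> int:
    """Per-GPU daily quota as described in the quote: base quota times multiplier."""
    return max_tasks_per_day * gpu_multiplier


GPU_MULTIPLIER = 8  # project setting quoted by Joe

print(gpu_daily_quota(33, GPU_MULTIPLIER))  # 264, the figure Joe gives
print(gpu_daily_quota(25, GPU_MULTIPLIER))  # 200, with the base quota trimmed to 25
print(gpu_daily_quota(25, 4))               # 100, with a hypothetical multiplier of 4
```

This only covers the quota itself. As the quote notes, the number of tasks actually sent can overshoot it, because the per-request limit is scaled by the same multiplier.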

HAL9000 Volunteer tester Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1530445 - Posted: 21 Jun 2014, 2:45:02 UTC - in response to Message 1530405. [Quote] Apparently I picked on the wrong work unit as an example, or maybe I should not have picked one at all. But what about the other 2500 "issues" he has created in 2 weeks' time? [/Quote] All of that machine's "valid" tasks I looked at seem to be OK to me. It is like they are returning 0/1 & the other host is returning 0/5. Both of which equal 0. All of the tasks they are trashing just cause another task to be sent out for that workunit. Then, when there are two other good results, their task is marked invalid, and they have a large portion of those right now. The work throttling mechanism could use some work, I think. I had a machine at work go on a killing spree a few weeks ago. It trashed 2000-2500 before it was limited to only 33 at a time to trash. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] ID: 1530445
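
The sequence HAL9000 describes (a disagreement triggers another task, and the odd result out is marked invalid once two others agree) can be sketched as below. This is a simplified illustration, not the server's actual logic: the function `validate`, the quorum of 2, and the idea of comparing only a reported signal count are all assumptions made for the example.

```python
from collections import Counter


def validate(results: dict, quorum: int = 2) -> dict:
    """Map each host to 'valid', 'invalid' or 'inconclusive'.

    `results` maps a host name to the signal count it reported
    (a hypothetical, simplified stand-in for a full result file).
    """
    if not results:
        return {}
    # Find the most commonly reported value and how many hosts agree on it.
    value, agreeing = Counter(results.values()).most_common(1)[0]
    if agreeing < quorum:
        # No agreement yet: another task would have to be sent out.
        return {host: "inconclusive" for host in results}
    return {host: "valid" if v == value else "invalid"
            for host, v in results.items()}


# Two hosts disagree, so the workunit stays inconclusive.
print(validate({"good_host": 3, "bad_gpu": 0}))
# A third host agrees with the first, so the odd one out becomes invalid.
print(validate({"good_host": 3, "bad_gpu": 0, "third_host": 3}))
```

Note that in this sketch two hosts that both report nothing also reach quorum, which is the situation discussed earlier in the thread.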
