invalid resends

Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1628237 - Posted: 16 Jan 2015, 2:34:58 UTC

Hi,
Can anyone enlighten me as to why resends are sent out 10 times, only to result in a "cannot validate", "too many attempts" or similar?

Frankly it seems to me that tasks that have been resent more than 4 times are a waste of resources. None seem to validate.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1628237
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1628296 - Posted: 16 Jan 2015, 7:19:12 UTC - in response to Message 1628237.  

Cliff, all of these invalids are from the same tape described in this thread. The tasks are misconfigured for the incorrect application. All will end in invalid. The best thing to do is to abort the tasks immediately upon seeing them in your system. It is a waste of time and resources to process them. Unfortunately there is no way to remove them from the splitter queue. They will slowly purge from the system once they have been issued ten times and then will be retired. If you look at your invalids and click on the column header to show task names, you will see that they all start with 19ap11ad.xxxxx. This is the thread describing the problem:

http://setiathome.berkeley.edu/forum_thread.php?id=76479

Cheers, Keith
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1628296
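As a rough sketch of the check Keith describes (looking for task names that start with 19ap11ad), the snippet below filters a list of task names by prefix. The helper names and the sample task names are made up for illustration; only the 19ap11ad prefix comes from the post above, and how you obtain the list of names from your client is left out.

```python
# Hypothetical helper: flag tasks whose names start with a known-bad tape prefix.
# Only the "19ap11ad." prefix is taken from the thread; everything else is illustrative.
BAD_PREFIXES = ("19ap11ad.",)


def is_suspect(task_name, bad_prefixes=BAD_PREFIXES):
    """Return True if the task name starts with any known-bad prefix."""
    # str.startswith accepts a tuple of candidate prefixes.
    return task_name.startswith(tuple(bad_prefixes))


def suspect_tasks(task_names, bad_prefixes=BAD_PREFIXES):
    """Return the subset of task names that match a known-bad prefix."""
    return [name for name in task_names if is_suspect(name, bad_prefixes)]


if __name__ == "__main__":
    # Placeholder names, not real workunits.
    sample = [
        "19ap11ad.00000.00000.0.0.0_3",
        "05mr10ab.00000.00000.0.0.0_1",
    ]
    for name in suspect_tasks(sample):
        print("suspect:", name)
```

If further bad tapes are reported, their prefixes can be appended to BAD_PREFIXES.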
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1628405 - Posted: 16 Jan 2015, 13:19:11 UTC - in response to Message 1628296.  

Hi Keith,

Thanks for the info, WU deletion about to commence:-)

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1628405
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1630026 - Posted: 19 Jan 2015, 22:13:21 UTC

Hi Folks,

Well, it's happening again, different WUs this time: 01jl's with several resends. I have 2 right now with _7 and _4; it seems to me those will get a label of too many results like the other lot :-(

I also got 2 x AP v7 WUs sent, when my prefs are set to NO APs, either v6 or v7.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1630026
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1630040 - Posted: 19 Jan 2015, 22:34:06 UTC - in response to Message 1630026.  

Yes, we've identified the 01jl12ad.5031.xxxxx.140733193388035 channel in the other thread already, too. Just the one bad set there, so far as anyone has reported so far.
ID: 1630040
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1630044 - Posted: 19 Jan 2015, 22:40:36 UTC - in response to Message 1630040.  

Hi Richard,

Yes, we've identified the 01jl12ad.5031.xxxxx.140733193388035 channel in the other thread already, too. Just the one bad set there, so far as anyone has reported so far.


What's the consensus? Abort those tasks as the last lot were?

It seems possible to me that there are 2 problems occurring now: not just the splitter, but also whichever server allocates tasks seems to be misreading user prefs as well.

When there were problems yesterday getting MB tasks I set AP v7 to yes. A few minutes later I got a load of MB tasks, and immediately reset my prefs to NO for AP v7. But apparently it's still being seen as YES.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1630044
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1630064 - Posted: 20 Jan 2015, 0:18:10 UTC
Last modified: 20 Jan 2015, 0:18:56 UTC

I just let them run on my 2 rigs, Cliff.

They're only done on my GPUs and they only take 5 mins to do. :-)

Plus there really aren't all that many of them, and I think it's far better just to work them out of the system instead of aborting them. Aborts have way more chance of lasting so very much longer in the system. ;-)

BTW I just noticed another batch, 19ap11ad.23016.13537.140733193388045.12.221_9. :-(

Cheers.
ID: 1630064
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1630069 - Posted: 20 Jan 2015, 0:40:40 UTC - in response to Message 1630064.  

Hi Wiggo,

Well, some take up to 25 mins on my 970s; not the 1st batch, but the new ones.
I don't mind a 4 sec sprinter, but over 5 mins and I'm less than amused :-/

Wasting up to 25 mins on a known invalid is a waste of resources.

Even a 5 min task is cutting throughput over time; get 10 or more of them and you lose a couple of 'good' WU completions.

I've not aborted any as yet though, waiting to see how things pan out.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1630069
Profile Uli
Volunteer tester
Joined: 6 Feb 00
Posts: 10923
Credit: 5,996,015
RAC: 1
Germany
Message 1630107 - Posted: 20 Jan 2015, 3:12:51 UTC

I am with Wiggo, I just let them run. I'd rather have an invalid on my account than an error.
Pluto will always be a planet to me.

Seti Ambassador
Not too late to order an Anni Shirt
ID: 1630107
bluestar

Joined: 5 Sep 12
Posts: 7031
Credit: 2,084,789
RAC: 3
Message 1630113 - Posted: 20 Jan 2015, 3:52:40 UTC

Oh, supposedly one is funnier than another.
ID: 1630113
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1630122 - Posted: 20 Jan 2015, 4:26:08 UTC

What's kind of funny (or at least mildly amusing) to me, are those who go to the effort of running (and often fine tuning) optimized apps to squeeze as much productivity as they can out of their rigs, then turn around and think nothing of running tasks that have been shown to achieve zero productivity. Not much optimization can do to help out there! ;^)

Anyway, I'm with the "abort them and move along" faction. So far I've killed 154 (including 25 that were waiting for me just this morning), with no ill effects other than some temporary dampening of my "max tasks per day" threshold (which has never gotten low enough to keep my queue from filling). Even if each of those tasks ran on a GPU for only 5-10 minutes, they would have resulted in quite a few non-productive hours.

Aborting them helps speed the WUs to the 10 total tasks needed for their ultimate demise, which seems to be a constant requirement regardless of whether tasks are aborted or invalidated. And, frankly, I don't much care whether a task shows up on my task list as an Invalid or as an Error. I hate seeing either one, but it's unavoidable with this crop of WUs.
ID: 1630122
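To put a number on "quite a few non-productive hours", here is a small back-of-the-envelope calculation using the figures from Jeff's post (154 aborted tasks, 5 to 10 GPU minutes each). The function name is made up for illustration, and the result assumes one task per GPU at a time.

```python
def wasted_gpu_hours(task_count, minutes_per_task):
    """Total GPU time in hours spent on tasks that cannot validate."""
    return task_count * minutes_per_task / 60


# Figures quoted in the post: 154 tasks at 5 to 10 minutes each.
low = wasted_gpu_hours(154, 5)
high = wasted_gpu_hours(154, 10)

if __name__ == "__main__":
    print(f"{low:.1f} to {high:.1f} GPU-hours")  # prints "12.8 to 25.7 GPU-hours"
```

So letting all 154 run would have cost somewhere between roughly half a day and a full day of GPU time.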
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1630126 - Posted: 20 Jan 2015, 4:42:26 UTC - in response to Message 1630122.  

To each his own I say

I choose to let them run.

What do I care if they only take 5 minutes to run? Actually it's a lot less than that, since my machines run them that much faster than a "stock" computer.

Saves me the trouble of having to go sorting them out and aborting them.

Probably would take me longer to go hunting for them than to let them run.

Since this started I've had 259 reported as invalid and another 156 that will also be considered invalid.

As it is, it doesn't affect my total allotment of work units for the day and has minimal effect on my RAC.

Abort, run...doesn't matter in the end. Only thing that matters is that the errors aren't being caused by my machines and I'm not the one causing others to have to waste time and energy recrunching the data.

Hopefully they will figure this out and we can get back to just doing what we do...


Zalster
ID: 1630126
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1630315 - Posted: 20 Jan 2015, 16:11:05 UTC

I am just letting them run as well.
Besides the fact that finding and nuking them amongst the 3100 cached tasks on 9 rigs would be quite a chore, they are taking very small bites of time to process, and if this helps to clear them from the system a bit faster, I am OK with that.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1630315
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1630331 - Posted: 20 Jan 2015, 16:51:34 UTC - in response to Message 1630122.  

...
Aborting them helps speed the WUs to the 10 total tasks needed for their ultimate demise, which seems to be a constant requirement regardless of whether tasks are aborted or invalidated.
...

As a practical matter that's probably true. But "Aborted by user" is an error condition, so 6 of those would kill a WU with "too many errors".

The invalid case is of course tasks which the client sent back as "success", but there's a specific exclusion such that they are only counted against the total. See http://boinc.berkeley.edu/trac/ticket/1029.
Joe
ID: 1630331
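The difference Joe points out can be sketched as a toy model: aborts and other errors count against both an error limit and the total limit, while invalid results count only against the total. The limits of 6 errors and 10 total results are the numbers mentioned in this thread; the function name, the outcome labels, and the exact comparisons are assumptions for illustration and are not taken from the BOINC server code.

```python
def retire_reason(outcomes, error_limit=6, total_limit=10):
    """Toy model of when a workunit is retired.

    outcomes: results in the order they are returned, each one of
    "aborted", "error" or "invalid" (labels are illustrative).
    Returns (reason, results_seen); reason is None if the WU is still alive.
    """
    errors = 0
    total = 0
    for outcome in outcomes:
        total += 1
        # Aborts are an error condition; invalid results are not.
        if outcome in ("aborted", "error"):
            errors += 1
        if errors >= error_limit:
            return ("too many errors", total)
        if total >= total_limit:
            return ("too many total results", total)
    return (None, total)


if __name__ == "__main__":
    print(retire_reason(["aborted"] * 10))  # retired after 6 results
    print(retire_reason(["invalid"] * 10))  # retired after 10 results
```

Under these assumptions a workunit that everyone aborts is retired after 6 tasks, while one that everyone runs to an invalid result needs all 10. A mix of the two usually ends up at 10 anyway, which matches the "as a practical matter" remark.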



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.