To Whomever this concerns: Backoffs and error -6
Author | Message |
---|---|
zoom3+1=4 Joined: 30 Nov 03 Posts: 65839 Credit: 55,293,173 RAC: 49 |
Please discontinue the backoffs on error -6. A certain number of VLAR WUs are currently being assigned to the GPU by mistake and should be reassigned to the CPU. Since neither the SETI apps nor BOINC itself currently has a mechanism to do this (although I've read that one has been suggested), the only alternative is, yep, error -6, so that the WU can be reassigned to some other cruncher who hopefully doesn't have a GPU (an Nvidia video card, or maybe ATI video cards later on). If you can't do this, then return the $20 donation to me, as I had one WU on the GPU here take over 5 hours. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Blurf Joined: 2 Sep 06 Posts: 8962 Credit: 12,678,685 RAC: 0 |
SJ, I have sent your concerns directly to Eric and Matt for a response from them on this issue. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65839 Credit: 55,293,173 RAC: 49 |
SJ, Thanks Blurf. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Sorry, but if you choose to use an unofficial App that kills VLAR's with an error condition, then, as far as I am concerned, that is your choice. Don't blame the project for the consequences. F. |
Andy Williams Joined: 11 May 01 Posts: 187 Credit: 112,464,820 RAC: 0 |
"I had one WU on the gpu here take over 5 hours." I just aborted three. One was over 11 1/2 hours, two were over ten hours. Nowhere near done. -- Classic 82353 WU / 400979 h |
skildude Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
First, you are using a development version of BOINC. Second, running SETI and making donations are voluntary. I guess that's 2 and 3. Fourth, the way it works is the way it works. The SETI apps don't look at the angle range before assigning WUs to a CPU or GPU. They just assign them. If you have hanging WUs on your GPU you should try Raistmer's v.11 app. This app is supposed to eliminate the hanging GPU WUs. As I recall BOINC 6.6.20 already has the VLAR killer in it. I'm not sure how anyone would work around the -6 error, but VLARs are still a problem for a few folks. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
Andy Williams Joined: 11 May 01 Posts: 187 Credit: 112,464,820 RAC: 0 |
"First, you are using a development version of BOINC." I'm not sure who this is addressed to, but I personally am running a development version on one non-CUDA machine. 6.6.20 on all others. If the VLAR killer is in 6.6.20 (first time I've heard that), it clearly is buggy. I am already using Raistmer's apps on all 6.6.20 machines. -- Classic 82353 WU / 400979 h |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
"As I recall BOINC 6.6.20 already has the VLAR killer in it. I'm not sure how anyone would work around the -6 error but VLARs are still a problem for a few folks." Nope. The VLAR killer is only in Raistmer's apps. 6.6.20 will feed you VLARs if they are next in the queue when you ask. F. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65839 Credit: 55,293,173 RAC: 49 |
"First, you are using a development version of BOINC." Actually, it's in the apps Raistmer makes (but not all of them); it's not in any version of BOINC. Don't bother, Fred W, yer blocked. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
He does have a point. The application is intentionally throwing errors, and what BOINC does on an error was known beforehand. The effort to "fix" this could/should go into the application itself to find out why the CUDA card does these slowly and address it there, not by returning work because it doesn't pay well enough. I understand Nvidia did the port, and they'd likely be the best ones to fix it. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65839 Credit: 55,293,173 RAC: 49 |
Pay, schmay. It just takes too damn long. Sure, it should be done in the app, but so far no one knows how to fix this, I've read. Not a clue. So reassign the dumb things and allow the WU to be aborted by the app in question. SETI has already had a lot of users depart for various reasons. Do they need more to depart forever before something positive is done on the subject? The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Andy Williams Joined: 11 May 01 Posts: 187 Credit: 112,464,820 RAC: 0 |
"The effort to "fix" this could/should go into the application itself to find out why the CUDA card does these slowly and address it there, not by returning work because it doesn't pay well enough." The point is not the payment or credits. It's about using the proper tool for the job. Wasting 24 hours, or worse, hanging the video card completely, on a VLAR is an enormous waste of computational resources. -- Classic 82353 WU / 400979 h |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65839 Credit: 55,293,173 RAC: 49 |
"The effort to "fix" this could/should go into the application itself to find out why the CUDA card does these slowly and address it there, not by returning work because it doesn't pay well enough." I hear ya. Mine hung yesterday and I had to reboot the PC. Another nail. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65839 Credit: 55,293,173 RAC: 49 |
Backoffs as in "Communication deferred" on the Projects tab in BOINC, and it can be as much as 24 hours. You want to punish somebody? What's next, banishment for using the VLAR killer? Until the project can come up with a better idea that works, stop punishing the users for this. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Blurf Joined: 2 Sep 06 Posts: 8962 Credit: 12,678,685 RAC: 0 |
"Backoffs as in Communication Deferred on the Projects Tab in Boinc and It can be as much as 24 hours, You want to punish somebody? What's next Banishment for using the VLAR killer? Until the project can come up with a better idea that works, Stop punishing the users for this." SJ--please relax. Your concern has been escalated to the Admins and I am sure they will address it when they can. |
Moabiter Joined: 9 Dec 02 Posts: 79 Credit: 215,029 RAC: 0 |
Hi there! I'm not sure how many VLARs are around, but won't this problem fade away once they are all crunched? How long will it take to deal with 2 months (Dec/Jan) of problem WUs? In my opinion, yes, it is a waste of resources if the WU could be done 5 times faster. Non-optimized SSEx Astropulse apps are 10 times slower, which is also a waste of resources (for today's CPUs). I think that if the coders are working on future apps, they should not waste time 'optimizing' 'old' apps. You (mostly) need a new mainboard if you want more CPU power. Being able to contribute to SETI more effectively as a small cruncher would be a big motivation boost. Btw, OT: Can someone point me to the Raistmer app + .xml and any needed .dlls to run optimized AP + S@Home Enhanced without killing the VLARs, please? Browsing the forums, I only come across VLAR killers. Thanks =) While browsing Berkeley's homepage I came across this statement that makes me say: "ME!, ME!, ME!, wants to know, what patch was used and is it available for windows?" |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14655 Credit: 200,643,578 RAC: 874 |
"He does have a point. The application is intentionally throwing errors, and what BOINC does on an error was known beforehand." I agree with you and Fred. The VLAR kill is merely a workround - and you might say a clumsy workround - for a problem which is proving surprisingly hard to fix. If you use a workround, you should be aware of the consequences, and be prepared to live with them. As it happens, I had an email today from the nVidia developer who did a major part of the port. He said: "I am aware of the VLAR sluggishness and have spent some time trying different ways to resolve it given the current pulse detection algorithm, with not much luck. The bottom line is that when using CUDA on the same GPU that is also used as the graphical display, it is difficult to prioritize one type of GPU client over another on the current HW architecture. SETI@home is not alone; this issue can affect any CUDA application that demands high amounts of time from CUDA." And BTW, BOINC does not impose a communications backoff as a result of a single reported error. If you report 100 in a row, with no valid work at all, then quota limits will kick in to protect the project servers (which heaven knows are in a fragile enough state). But if you have valid work to return, backoff can always be overcome. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65839 Credit: 55,293,173 RAC: 49 |
"Backoffs as in Communication Deferred on the Projects Tab in Boinc and It can be as much as 24 hours, You want to punish somebody? What's next Banishment for using the VLAR killer? Until the project can come up with a better idea that works, Stop punishing the users for this." I was only elaborating on what I meant by backoffs, Blurf (so relax), as there are other backoffs due to traffic problems, which are a different subject for some other thread. :D The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
The other point that occurs to me is "Has SJ actually run out of work because of the 24 hour back-off?" (He obviously can't answer this himself since I am blacklisted). As Richard says, if the VLAR killer returns 100+ consecutive VLARs then the back-off will be 24 hours, but only until a non-error is returned, and there should be plenty of those well within that 24 hour period. So all it needs is a click on the "Update" button to report them, isn't it? F. |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
I've got an AP work unit which has 266 hours on it so far. I think it'll finish sometime tomorrow. It's not a problem, it is the perception of a problem. My question is: is there a problem, or are people leaving because of the constant complaining? |
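
For anyone trying to follow the backoff argument, here is a rough toy model of the behaviour Richard Haselgrove and Fred W describe in this thread: a single error does nothing, a long unbroken run of errors triggers a deferral, and one valid result clears it. This is only a sketch. The threshold of 100 and the 24-hour figure are taken from the posts above, not from the BOINC source, and `HostState`, `report` and `simulate` are hypothetical names made up for illustration. The real scheduler's quota logic may well differ.

```python
from dataclasses import dataclass

# Assumed values, taken from the posts in this thread, NOT from BOINC source code.
ERROR_STREAK_LIMIT = 100   # "100 in a row, with no valid work at all"
MAX_DEFERRAL_HOURS = 24    # "it can be as much as 24 hours"


@dataclass
class HostState:
    """Hypothetical per-host bookkeeping; BOINC's real scheduler differs."""
    consecutive_errors: int = 0
    deferral_hours: int = 0

    def report(self, result_ok: bool) -> None:
        """Record one reported result and update the deferral."""
        if result_ok:
            # A valid result clears the error streak and lifts the deferral.
            self.consecutive_errors = 0
            self.deferral_hours = 0
            return
        self.consecutive_errors += 1
        if self.consecutive_errors >= ERROR_STREAK_LIMIT:
            # Only a long unbroken run of errors triggers the deferral.
            self.deferral_hours = MAX_DEFERRAL_HOURS


def simulate(results):
    """Feed a sequence of True (valid) / False (error) reports to a fresh host
    and return the deferral, in hours, left at the end."""
    host = HostState()
    for ok in results:
        host.report(ok)
    return host.deferral_hours
```

Under these assumptions, `simulate([False] * 100)` leaves the host deferred, while `simulate([False] * 100 + [True])` does not, which is the point Fred W makes about reporting a finished non-VLAR task with the "Update" button.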
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.