OpenCL NV MultiBeam v8 SoG edition for Windows
Author | Message |
---|---|
Harri Liljeroos Send message Joined: 29 May 99 Posts: 4892 Credit: 85,281,665 RAC: 126 |
*If "1" never shown what about app's device capabilities listing? Does it show sometime other GPU selected?* Yes, sometimes it has used device 0 and shown it correctly on both lines of stderr. Unfortunately I had to revert to the CUDA applications: too many driver and computer crashes while running SoG. I may try again after tuning becomes easier. Maybe these cards (GTX970 and GTX650 Ti) are too different to run SoG smoothly on the same computer. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
*Maybe these cards (GTX970 and GTX650 Ti) are too different to run SoG smoothly on the same computer.* There is a special treatment available that was developed specifically for the case of very different GPUs from the same vendor in a single host. Look in the ReadMe for
|
Harri Liljeroos Send message Joined: 29 May 99 Posts: 4892 Credit: 85,281,665 RAC: 126 |
Thank you for the information. I'll keep it in mind for the next time. For now I don't have time to experiment more. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Hi Raistmer, this is probably nothing; it is the only error so far in probably over 500 WUs, but here it is: http://setiathome.berkeley.edu/result.php?resultid=4974924416 There is very little information in the output, but I thought it might be better to add it to the database :) |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
http://www.ghacks.net/2015/10/16/fixing-the-application-was-unable-to-start-correctly-0xc0000018-in-windows/ SETI apps news We're not gonna fight them. We're gonna transcend them. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874 |
0xC0000018 STATUS_CONFLICTING_ADDRESSES {Conflicting Address Range} The specified address range conflicts with the address space. |
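For readers unfamiliar with these codes, here is a minimal sketch of how a 32-bit NTSTATUS value such as 0xC0000018 splits into its bit fields (2 bits of severity, a customer flag, a reserved bit, 12 bits of facility, 16 bits of code). The function name `decode_ntstatus` is made up for illustration and is not part of any SETI@home or BOINC code.

```python
def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    severity_names = {0: "success", 1: "informational", 2: "warning", 3: "error"}
    return {
        "severity": severity_names[value >> 30],   # bits 31-30
        "customer": bool((value >> 29) & 1),       # bit 29: set for non-Microsoft codes
        "facility": (value >> 16) & 0x0FFF,        # bits 27-16
        "code": value & 0xFFFF,                    # bits 15-0
    }


status = decode_ntstatus(0xC0000018)
print(status)
```

The top two bits being set is what marks the value as an error rather than a warning; the low 16 bits (0x18 here) identify the specific condition.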
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
*0xC0000018* If it is not repeatable, then it is probably a genuine bit flip (e.g. from cosmic rays or radioactive carbon in the processor/RAM). Workstation-grade components with ECC memory reduce the probability of that. We've been referring to that as "Eddies in the spacetime continuum", "Eddie? Who's Eddie?", "No, not who's Eddie, what's Eddie?", "What? What's Eddie doing in the spacetime continuum?" "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Just got two bad work units, with missing header information it looks like. Exit status: 4294967290 (0xfffffffa) Unknown exit code. These work units: Task 4975612198, Task 4975612087. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
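A side note on the number itself: 4294967290 (0xfffffffa) is what a small negative 32-bit exit code looks like when it is displayed as an unsigned value. A minimal sketch of the conversion follows; the helper name `to_signed32` is hypothetical, and nothing here says what the application means by the resulting code.

```python
def to_signed32(exit_code: int) -> int:
    """Reinterpret an unsigned 32-bit exit code as a signed 32-bit integer."""
    exit_code &= 0xFFFFFFFF  # keep only the low 32 bits
    # If the sign bit is set, subtract 2**32 to get the negative value.
    return exit_code - 0x100000000 if exit_code & 0x80000000 else exit_code


signed_code = to_signed32(4294967290)
print(signed_code)
```

Read this way, the "unknown" exit status is simply -6 as returned by the application.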
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874 |
*Just got two bad work units with missing header information it looks like.* Both wingmates completed successfully, which suggests that the raw datafile had headers intact. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
So does that mean that computer mangled the work units just when it grabbed them for processing? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
*So does that mean that computer mangled the work units just when it grabbed them for processing?* There are many possible layers between the server and the client CPU where that could happen, from download through reading from disk. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
William Send message Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0 |
*So does that mean that computer mangled the work units just when it grabbed them for processing?* There's an MD5 check after download, isn't there? And/or a size check? A person who won't read has no advantage over one who can't read. (Mark Twain) |
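For illustration, the kind of post-download check being asked about can be sketched as below. This is not BOINC's actual code; the function `file_matches`, its parameters, and the demo payload are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def file_matches(path, expected_size, expected_md5):
    """Return True if the file at `path` has the expected size and MD5 digest."""
    # Cheap check first: a truncated download fails here without hashing.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Hash in chunks so large data files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()


# Demo with a throwaway file standing in for a downloaded work unit.
payload = b"<workunit_header>...</workunit_header>"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

good_md5 = hashlib.md5(payload).hexdigest()
intact = file_matches(demo_path, len(payload), good_md5)
truncated = file_matches(demo_path, len(payload) + 1, good_md5)
os.remove(demo_path)
print(intact, truncated)
```

A check like this only covers the moment it runs. A file that passes right after download can still be damaged later on disk or misread at task start.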
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874 |
*So does that mean that computer mangled the work units just when it grabbed them for processing?* And we have seen that error message before, in other applications including CUDA, with no conclusive evidence that the data file has suffered any corruption at all. It seemed (IIRC) to be more prevalent on task restarts than initial runs. I think that the code generating that error message dates from the original Berkeley CPU code: checking that for trigger points might give us a better handle on what's really happening under the hood. |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
*So does that mean that computer mangled the work units just when it grabbed them for processing?* Anything more frequent than about once every 3 months on a given host would indicate a configuration, system, client or application issue. Anything less frequent than that on sub-workstation-grade componentry indicates noise (radiation). "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I actually think it was the BOINC shutdown that froze on exit and then blue-screened the computer that did it. Strange thing is that I always wait till a quiescent period in BOINC activity before I initiate a shutdown. That means no work units are close to finishing, all recently completed work units have successfully uploaded and BOINC is not close to asking for network communication. Only when all those conditions are met do I shut down the Manager and close the client. I can only conclude that BOINC was reading those tasks when the computer blue-screened. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
*I actually think it was the BOINC shutdown that froze on exit and then blue-screened the computer that did it. Strange thing is that I always wait till a quiescent period in BOINC activity before I initiate a shutdown. That means no work units are close to finishing, all recently completed work units have successfully uploaded and BOINC is not close to asking for network communication. Only when all those conditions are met do I shut down the Manager and close the client. I can only conclude that BOINC was reading those tasks when the computer blue-screened.* I'd class that as possibly reproducible (rather than Eddy/Eddie). Can you try that? (It could take substantial hammering :) ) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874 |
At least the error message gives us a file name and line number: `!swi.data_type || !found || !swi.nsamples` File: `..\seti_header.cpp` Line: 216 `// Allow old style headers to be parsed correctly. // jeffc - need this? //swi.fft_len=2048; //swi.ifft_len=8; do { fgets(buf, 256, f); } while (!feof(f) && !xml_match_tag(buf,"<workunit_header")) ;` Looks like we've dropped through to some legacy code - perhaps we should have branched to a more modern path higher up? |
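The quoted loop keeps reading lines until it either sees the `<workunit_header` tag or runs out of file, so an empty, truncated or unreadable file simply falls out of the loop without a match. A rough Python analogue of that scan is sketched below. The function `find_header_line` is hypothetical, and a plain substring test stands in for `xml_match_tag`, whose exact behaviour is not shown in the thread.

```python
import io


def find_header_line(stream, tag="<workunit_header"):
    """Return the first line containing `tag`, or None if the stream ends first."""
    for line in stream:
        # Assumption: a substring test approximates xml_match_tag().
        if tag in line:
            return line
    # Reached end of file without seeing the tag.
    return None


good = io.StringIO("preamble\n<workunit_header>\n  <name>demo</name>\n")
empty = io.StringIO("")
found_good = find_header_line(good)
found_empty = find_header_line(empty)
print(repr(found_good), repr(found_empty))
```

If the real code behaves similarly, a header error would not by itself prove the data file was damaged: a file that could not be read at that moment would presumably fall through the same path.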
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874 |
Actually, the error message is on line 232 of the current file. Are we using an outdated version of seti_header.cpp? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I believe that type of failure and exit status is the first I've experienced. I am running the latest beta BOINC Manager 7.6.29(x64) which I believe has had some code changed recently to fix Manager exits compared to the last stable release 7.6.22(x64). Richard probably could say just what the code jockeys played with in the latest beta. [Edit] Looks like my copy of the beta is not the latest now. We're up to 7.6.33(x64) Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
*I believe that type of failure and exit status is the first I've experienced. I am running the latest beta BOINC Manager 7.6.29(x64) which I believe has had some code changed recently to fix Manager exits compared to the last stable release 7.6.22(x64). Richard probably could say just what the code jockeys played with in the latest beta.* Yeah, not a 'normal' situation IMO. Would need to be reproducible on demand to localise better. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |