app_info for AP500, AP503, MB603 and MB608
Author | Message |
---|---|
Joined: 16 Aug 99 Posts: 114 Credit: 6,352,198 RAC: 0 |
Try to check if the files have the correct permissions. They should be the same as the folder. |
Joined: 23 Feb 00 Posts: 20 Credit: 6,932,001 RAC: 0 |
What permissions? Bark Loud! Bite Hard! |
Joined: 16 Aug 99 Posts: 114 Credit: 6,352,198 RAC: 0 |
Right-click the file, choose Properties, then look under the Security tab. |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Well, the first thing I notice in your app_info.xml file is that you are trying to use the wrong app for Astropulse v5. The AK_V8 will not process Astropulse; you need ap_5.03r112_SSE3.exe for that. F. |
EPG Joined: 3 Apr 99 Posts: 110 Credit: 10,416,543 RAC: 0 |
"Having some problems... Can I get some help?" There is a line after the preferences limits: "file projects\setiathome.berkeley.edu\ap_5.00r103_SSE3.exe not found". Check that. |
Joined: 23 Feb 00 Posts: 20 Credit: 6,932,001 RAC: 0 |
Yup, I added the AK to the AP. My bonehead move... Changed it to the correct AP 503 and all is well; now I can see what I can do with this new i7! Thanks again... Bark Loud! Bite Hard! |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
"Having some problems... Can I get some help?" FWIW, 4 of those AP WUs had a data problem, not your fault at all. Specifically, your tasks 1216982434, 1216982874, 1216983056, and 1216984097 quit without producing an output file, with: "Error in ap_remove_radar.cpp: generate_envelope: num_ffts_performed < 100. Blanking too much RFI?" They were all from the B3_P1 channel of 'tapes' from ap_13mr09 or ap_20mr09, which matches what others are seeing. See some recent posts in the Astropulse Errors II-Optimized version 5.03! thread for more details. Joe |
Terror Australis Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44 |
Just a tip that some may find useful. I found that altering the lines <avg_ncpus>0.127970</avg_ncpus> <max_ncpus>0.127970</max_ncpus> back to the "stock" 0.05 gave smoother running on my CUDA box. This stopped a tendency to stall for periods of up to a minute on both GPU and CPU units and helped with VLARs in particular. Brodo |
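
For anyone who would rather script this change than hand-edit app_info.xml, here is a minimal Python sketch. It assumes the usual app_info.xml layout, in which each <app_version> element carries its own <avg_ncpus> and <max_ncpus> children. The sample XML and the set_ncpus helper are made up for illustration and are not taken from anyone's actual file.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: a cut-down app_info.xml with one CUDA app version.
SAMPLE = """<app_info>
  <app><name>setiathome_enhanced</name></app>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <plan_class>cuda</plan_class>
    <avg_ncpus>0.127970</avg_ncpus>
    <max_ncpus>0.127970</max_ncpus>
  </app_version>
</app_info>"""


def set_ncpus(xml_text, value, plan_class="cuda"):
    """Set avg_ncpus and max_ncpus to the same value for matching app versions."""
    root = ET.fromstring(xml_text)
    for version in root.iter("app_version"):
        # Leave app versions of other plan classes (e.g. CPU-only) untouched.
        if version.findtext("plan_class", default="") != plan_class:
            continue
        for tag in ("avg_ncpus", "max_ncpus"):
            node = version.find(tag)
            if node is None:
                node = ET.SubElement(version, tag)
            node.text = f"{value:.6f}"
    return ET.tostring(root, encoding="unicode")


updated = set_ncpus(SAMPLE, 0.05)
print(updated)
```

BOINC reads app_info.xml at start-up, so the client has to be restarted before an edited file takes effect.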
Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
This thread keeps getting buried. Just bringing it back to the front page. PROUD MEMBER OF Team Starfire World BOINC |
Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
Following Brodo's comments about reducing avg/max_ncpus :- I also had my settings as :- <avg_ncpus>0.127970</avg_ncpus> <max_ncpus>0.127970</max_ncpus> I have been having issues with Computation Errors (exit code -5 = no result file) in the following circumstances with CUDA tasks - although I don't see any stalling going on. 1. When suspending a task, a new task starts and ends after about 6 seconds with error -5. A second task then starts OK. 2. When new tasks are downloaded at the top of the queue, as the status changes to READY TO RUN it fails as above. The same continues to happen for all the following tasks as they arrive. If I'm lucky and happen to spot a bunch of tasks like this before they get reported, I stop BOINC and change their states back to READY TO RUN and they then run OK. On one PC I have now tried reducing the settings until I reached 0.099000 - problems at 0.100000. I haven't carried out exhaustive tests but was able to suspend tasks without new tasks erroring at this setting. I haven't yet seen if this reduction is throttling the GPU throughput. I have been running AK V8 MB and stock CUDA app on versions of BOINC from 6.6.20 up to 6.6.28 on various PCs with GTS250 and GT8600 and GT9600 GPUs with various recent graphic driver versions. (GPUs not overclocked) Has anyone else encountered similar problems? GPU Users Group |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
"Following Brodo's comments about reducing avg/max_ncpus :-" My experience is similar, but with significant variants. I have assumed that it is a "feature" of BOINC 6.6.23 so have not raised it as an issue - but since you ask... I have <avg_ncpus> and <max_ncpus> both set to 0.15 and that has served me well so far. If, for any reason, my rig goes into EDF mode (e.g. watching live-feed football on the web the other night so the GTX295 had other things to do for a change) whatever is being crunched at the time goes to "waiting to run" and the 2 nearest scheduled WUs start up (this is ALL on the CUDA). When the first of those 2 finishes and uploads, the second goes into "waiting to run" and 2 new ones start up at the same time. And so it goes on - for every 2 that start, one reports and the second goes to "waiting to run" so when it comes out of EDF mode, I have a whole list waiting to run when their turn comes round. Now the similarity - when those WUs finally do run, they error with the same exit code -5 = no result file. So, as you can see, different circumstances - same symptom; perhaps there is a linkage somewhere? F. |
Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
I had mine at 0.127970 and was having no problems but saw it suggested that 0.040000 would help it run a little bit smoother. I am trying it now but I don't see much change. Then again, I've never had the -5 problem either. (knock on wood) PROUD MEMBER OF Team Starfire World BOINC |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"Now the similarity - when those WU's finally do run, they error with the same exit code -5 = no result file." I got my first error -5 today, while doing some manual suspend/resume work to get a Beta test run to start before its time. The full message is: <core_client_version>6.6.28</core_client_version> <![CDATA[ <message> - exit code -5 (0xfffffffb) </message> <stderr_txt> SETI@home error -5 Can't open file (work_unit.sah) in read_wu_state() errno=2 File: ..\worker.cpp Line: 123 </stderr_txt> ]]> - so nothing to do with result files. [BOINC's message about a missing file is always a consequence of the crash, and says nothing whatsoever about the cause of the crash] The behaviour of BOINC when suspending/resuming CUDA tasks does still seem to have some problems - I'll try to write it up for boinc_alpha. Knowing the usual response - Fred, would you be willing to catch some debug logs and screen shots (of BOINC, not the footy) next time there's a good match on? |
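
As a side note on reading that message: 0xfffffffb is simply -5 written as an unsigned 32-bit value, and errno=2 is the C library's "no such file" code. A small Python sketch of the conversion follows; the to_signed32 helper is a name made up for this example.

```python
import errno
import os


def to_signed32(value):
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


exit_code = to_signed32(0xFFFFFFFB)
print(exit_code)           # -5
print(errno.errorcode[2])  # ENOENT
print(os.strerror(2))      # platform's wording for "no such file or directory"
```

So the application is reporting that work_unit.sah was not there when it tried to open it.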
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"I had mine at 0.127970 and was having no problems but saw it suggested that 0.040000 Would help it run a little bit smoother. I am trying it now but I don't see much change. Then again, I've never had the -5 problem either.(knock on wood)" There's nothing magical about 0.127970 either. The figure was what BOINC worked out for my Q6600/9800GT combination when I was first experimenting with the new form of app_info files - see Beta message 36887. I think it probably comes from the SETI app needing about 2m 30s of CPU time for a task that runs 20 minutes on the CUDA card - the ratio feels about right. So far as I know, the figure doesn't (shouldn't?) affect processing in any way: like the flops adjustment, it's supposed to be used just to smooth the work-fetch balance between CPU and CUDA tasks. If you feel inclined to fiddle, I think you should try the following. If you have a mega-fast CUDA card on a comparatively slow motherboard/CPU host, expect to use a higher <avg_ncpus> & <max_ncpus> figure. If you have a fast CPU, but only a comparatively weak CUDA card, go the other way - use a lower <avg_ncpus> & <max_ncpus> figure. But always keep both figures the same as each other. (Unless you can work out something else that we don't know yet!) |
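
The "ratio feels about right" remark can be checked with a couple of lines of Python. This is only a sketch of the arithmetic using the rough figures quoted above, not BOINC's actual calculation; cpu_fraction is a hypothetical helper name.

```python
def cpu_fraction(cpu_seconds, elapsed_seconds):
    """Fraction of one CPU core that a GPU task keeps busy while it runs."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return cpu_seconds / elapsed_seconds


# Figures quoted above: about 2m 30s of CPU time for a 20-minute CUDA task.
estimate = cpu_fraction(2 * 60 + 30, 20 * 60)
print(f"{estimate:.6f}")  # 0.125000
```

That lands within a few thousandths of 0.127970. A host that spends more CPU time per CUDA task (fast card, slow CPU) would come out higher by the same arithmetic, and one with a slower card would come out lower.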
Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
Unless of course you have a case like mine where everything is slow. :) PROUD MEMBER OF Team Starfire World BOINC |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
"Now the similarity - when those WU's finally do run, they error with the same exit code -5 = no result file." Re-looking at my results, mine matches your result exactly, Richard. Anything I can do to help, of course. I could even watch something other than soccer ;). What debug flags / screen-shots are needed? F. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"Re-looking at my results, mine matches your result exactly, Richard." Jorden usually says "Run at least once with these flags on: <work_fetch_debug>, <sched_op_debug> and one run with <debt_debug>", but I think he's away this weekend, so we can make our own rules ;-) Don't worry about work_fetch or debt_debug for this one: sched_op hardly seems to do anything. I think we're going to need <cpu_sched> and <cpu_sched_debug>. For screenshots, I was thinking of those "every 2 that start, one reports and the second goes to 'waiting to run'" - something showing those 'waiting to run' on the tasks tab, perhaps with a matching later one showing the errors. |
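
Those logging flags are switched on in cc_config.xml in the BOINC data directory. As a rough sketch, assuming the standard <cc_config>/<log_flags> layout and using only the two flag names asked for above, the file contents could be generated like this. The build_cc_config helper is hypothetical and exists only for this example.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


config_text = build_cc_config(["cpu_sched", "cpu_sched_debug"])
print(config_text)
```

The printed text would be saved as cc_config.xml. Setting a flag back to 0, or removing it, turns that logging off again.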
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
"Re-looking at my results, mine matches your result exactly, Richard." OK. I will give that a whirl tomorrow (should be watching the Spanish Grand Prix anyway). I've now upgraded to CUDA 2.2 (and had to increase the GPU fan speed considerably) and BOINC 6.6.28 so, if it still happens, it will prove that neither the driver nor 6.6.23 is causing it. F. |
Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
Well, I've gone back and given this another go and am getting the same results as you are :- <![CDATA[ <message> - exit code -5 (0xfffffffb) </message> <stderr_txt> SETI@home error -5 Can't open file (work_unit.sah) in read_wu_state() errno=2 File: ..\worker.cpp Line: 123 </stderr_txt> I also get <status>-161</status> in the file_info section of client_state.xml, which I use to track the errors and do a fix-up. Apologies for the "no result file" red herring. I think I had this problem mixed up with another issue I had. I've also tried running with a few avg/max_ncpus values ranging from 0.15 down to 0.04 on a Core 2 6600 / GT9600T (driver 18250) on Windows XP (not exhaustively). I hadn't previously tried the lower settings, but on my setup they seem to produce the same level of -5 errors when suspending as the higher settings. 0.099, which seemed to be the most reliable, now does still give some -5 errors, so it is hard to say what the full influences are. Another thing I have noticed on just this machine is that some CUDA jobs occasionally hang: the elapsed and completed times continue to clock up into hours but progress doesn't move. I don't think they are VLAR tasks (I have been doing a fairly coarse rebranding of those since I started getting flooded by them) and stop/starting BOINC kicks them into life again. I also notice that some versions of app_info.xml in these threads give avg/max_ncpus = 1.000000 settings for the non-CUDA apps. My files do not have this entry. Is this significant, or is it just for completeness and 1.000000 is the default setting anyway? As I seem to be able to recreate the error -5s at will, I am happy to produce some debug output if it is of use. John. GPU Users Group |
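
Spotting the affected entries by status code can also be scripted. The following is a minimal, read-only Python sketch. It assumes that file_info entries in client_state.xml carry <name> and <status> children, as the post above implies, and that the file parses as ordinary XML. The sample data and the files_with_status helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; a real client_state.xml holds far more fields.
SAMPLE = """<client_state>
  <file_info>
    <name>example_wu_0</name>
    <status>1</status>
  </file_info>
  <file_info>
    <name>example_wu_1</name>
    <status>-161</status>
  </file_info>
</client_state>"""


def files_with_status(xml_text, status):
    """Return names of file_info entries whose status equals the given code."""
    root = ET.fromstring(xml_text)
    return [
        info.findtext("name", default="?")
        for info in root.iter("file_info")
        if info.findtext("status", default="").strip() == str(status)
    ]


print(files_with_status(SAMPLE, -161))  # ['example_wu_1']
```

The sketch only lists the entries. Any actual change to client_state.xml should be made with BOINC stopped, as described earlier in the thread.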
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
I think I've worked out what's happening here, and I've submitted a bug report. Fred, you have mail (ISP permitting). It seems likely to happen whenever a running CUDA task is pre-empted, and a new CUDA task started in its place. I think it will happen with every version of BOINC from v6.6.23 onwards, because of a bug in new code added then (see "Changes for 6.6.23"). If the job start is delayed to allow the old one to tidy up after itself, the next attempt is treated as a re-start, instead of a new start: that's presumably why the work_unit.sah file isn't copied into the slot directory ready for use. |