Running SETI@home on an nVidia Fermi GPU
Author | Message |
---|---|
SciManStev Joined: 20 Jun 99 Posts: 6657 Credit: 121,090,076 RAC: 0 |
Good morning! Based on careful reading of this thread, I tried to get my app_info file straightened out with the latest file names. I got 3 GPU units, but they errored out instantly, so I realize I need help. This is what I have so far. The CPU and AP portions work perfectly, as they are the result of the Lunatics installer, but clearly the Fermi portions are flawed somehow. <app_info> <app> <name>setiathome_enhanced</name> </app> <file_info> <name>AK_v8b_win_x64_SSSE3x.exe</name> <executable/> </file_info> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>603</version_num> <platform>windows_intelx86</platform> <file_ref> <file_name>AK_v8b_win_x64_SSSE3x.exe</file_name> <main_program/> </file_ref> </app_version> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>603</version_num> <platform>windows_x86_64</platform> <file_ref> <file_name>AK_v8b_win_x64_SSSE3x.exe</file_name> <main_program/> </file_ref> </app_version> <app> <name>astropulse_v505</name> </app> <file_info> <name>ap_5.05r409_SSE.exe</name> <executable/> </file_info> <app_version> <app_name>astropulse_v505</app_name> <version_num>505</version_num> <platform>windows_intelx86</platform> <file_ref> <file_name>ap_5.05r409_SSE.exe</file_name> <main_program/> </file_ref> </app_version> <app_version> <app_name>astropulse_v505</app_name> <version_num>505</version_num> <platform>windows_x86_64</platform> <file_ref> <file_name>ap_5.05r409_SSE.exe</file_name> <main_program/> </file_ref> </app_version> <app> <name>setiathome_enhanced</name> </app> <file_info> <name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</name> <executable/> </file_info> <file_info> <name>cudart32_30_14.dll</name> <executable/> </file_info> <file_info> <name>cufft32_30_14.dll</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-1-1a_upx.dll</name> <executable/> </file_info> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>610</version_num> <avg_ncpus>0.100000</avg_ncpus> 
<max_ncpus>0.100000</max_ncpus> <platform>windows_intelx86_64</platform> <plan_class>cuda_fermi</plan_class> <file_ref> <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>cudart32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>cufft32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>libfftw3f-3-1-1a_upx.dll</file_name> </file_ref> <coproc> <type>CUDA</type> <count>1</count> </coproc> </app_version> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>610</version_num> <avg_ncpus>0.100000</avg_ncpus> <max_ncpus>0.100000</max_ncpus> <platform>windows_x86_64</platform> <plan_class>cuda_fermi</plan_class> <file_ref> <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>cudart32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>cufft32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>libfftw3f-3-1-1a_upx.dll</file_name> </file_ref> <coproc> <type>CUDA</type> <count>1</count> </coproc> </app_version> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>610</version_num> <avg_ncpus>0.100000</avg_ncpus> <max_ncpus>0.100000</max_ncpus> <platform>windows_intelx86_64</platform> <plan_class>cuda_fermi</plan_class> <file_ref> <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>cudart32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>cufft32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>libfftw3f-3-1-1a_upx.dll</file_name> </file_ref> <coproc> <type>CUDA</type> <count>1</count> </coproc> </app_version> Thank you for any help. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
SciManStev wrote: "Good morning! Based on careful reading of this thread, I tried to get my app_info file straightened out with the latest file names. I got 3 GPU units, but they errored out instantly, so I realize I need help. This is what I have so far. The CPU and AP portions work perfectly, as they are the result of the Lunatics installer, but clearly the Fermi portions are flawed somehow." It would be better if you posted a representative subset of the error messages, so we know what we're looking for. |
SciManStev Joined: 20 Jun 99 Posts: 6657 Credit: 121,090,076 RAC: 0 |
SciManStev wrote: "Good morning! Based on careful reading of this thread, I tried to get my app_info file straightened out with the latest file names. I got 3 GPU units, but they errored out instantly, so I realize I need help. This is what I have so far. The CPU and AP portions work perfectly, as they are the result of the Lunatics installer, but clearly the Fermi portions are flawed somehow." At the time, BOINC hadn't reported yet, and all it said was "computation error". Here is a link to one of the failed units: http://setiathome.berkeley.edu/result.php?resultid=1638383755 Thank you! I really feel bad causing even one error. Steve |
Questor Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
"Speedy, the rescheduler won't work with the Fermi. It only recognizes 6.08 and 6.09; it cannot do the 6.10 Fermi plan_class." Found your MadMaC conversation now. It's hard keeping up sometimes - I don't know how you do it! So the rebranded tasks just get processed with the extra "cuda" section of app_info, and all original unbranded GPU tasks are processed with the original "cuda_fermi" section. GPU Users Group |
Questor Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
Looking back at this 'Speedy' post, are you still getting 0 tasks showing up when you run the Reschedule tool? If you actually have CPU/GPU tasks but it shows a 0 count, you may be suffering from a problem I had, where a slight (very difficult to spot) corruption in client_state.xml causes Reschedule to show 0 tasks even though BOINC works perfectly OK. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
SciManStev wrote: "Good morning! Based on careful reading of this thread, I tried to get my app_info file straightened out with the latest file names. I got 3 GPU units, but they errored out instantly, so I realize I need help. This is what I have so far. The CPU and AP portions work perfectly, as they are the result of the Lunatics installer, but clearly the Fermi portions are flawed somehow." IIRC, "Exit status -185 (0xffffffffffffff47)" may refer to not having the correct DLL files either linked via app_info or present in the project directory. But I'm 100 miles away from the nearest CUDA card this weekend, so it's hard to check. Or: is anyone else reading Steve's app_info as having three identical Fermi sections, all with <version_num>610</version_num> <platform>windows_intelx86_64</platform> <plan_class>cuda_fermi</plan_class>? Read back over my conversations with MadMaC, but I think I'd try that with at least one each of: <platform> windows_intelx86 <plan_class> cuda_fermi and <platform> windows_intelx86 <plan_class> cuda (in the original format, of course: I've just shown it like that to emphasise the changes). The DLL references look OK - just check the files themselves are still there. |
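Written out in the file's normal tag format, the two variants Richard suggests would differ only in their plan_class. The fragment below is a sketch of just the identifying lines, assuming the 610 version number from Steve's file; the rest of each <app_version> block (app name, file references, coproc) is left out.

```xml
<!-- Fragment only: identifying lines of two separate app_version sections. -->

<!-- Section A: matches tasks issued for the Fermi plan class -->
<version_num>610</version_num>
<platform>windows_intelx86</platform>
<plan_class>cuda_fermi</plan_class>

<!-- Section B: same platform, but the plain cuda plan class -->
<version_num>610</version_num>
<platform>windows_intelx86</platform>
<plan_class>cuda</plan_class>
```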
SciManStev Joined: 20 Jun 99 Posts: 6657 Credit: 121,090,076 RAC: 0 |
Thank you! I got 5 more GPU tasks, but I suspended the project until I had made the changes. It was a bit confusing, because my previous app_info had 3 identical sections. I had guessed it was because Todd had three GPUs in his machine. In your conversation with MadMaC, it seemed like the first and third would be identical. Once I get this right, I will back it up as a guide. OK, I'll give it a go and see what happens. Steve |
SciManStev Joined: 20 Jun 99 Posts: 6657 Credit: 121,090,076 RAC: 0 |
I'm stopping here for a moment. I did the edits with the project suspended, then closed and restarted BOINC, but the 5 GPU units disappeared. The project is still suspended until I get this figured out. Steve |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
SciManStev wrote: "Thank you! I got 5 more GPU tasks, but I suspended the project until I had made the changes. It was a bit confusing, because my previous app_info had 3 identical sections. I had guessed it was because Todd had three GPUs in his machine. In your conversation with MadMaC, it seemed like the first and third would be identical. Once I get this right, I will back it up as a guide. OK, I'll give it a go and see what happens." He may have done that, but there's no need: you don't have four CPU sections for the four cores in your quad, do you? There's never any need for any exact duplicates, except while you're pasting a template to work from. If it's been done properly, there may be near-duplicates, but they will have subtle (but important) differences. When you (or anyone else) look back through a thread like this, make sure you refer to an app_info that works, not a broken one that somebody has posted with a plea for help! |
SciManStev Joined: 20 Jun 99 Posts: 6657 Credit: 121,090,076 RAC: 0 |
I appreciate that bit of knowledge. I want very much to come up to speed on how all this works. Steve |
TheFreshPrince a.k.a. BlueTooth76 Joined: 4 Jun 99 Posts: 210 Credit: 10,315,944 RAC: 0 |
SciManStev wrote: "I'm stopping here for a moment. I did the edits, with the project suspended, closed and restarted BOINC, but the 5 GPU units disappeared. The project is still suspended until I get this figured out." I have one GTX470 and it works perfectly :) At the moment I'm running 3 tasks at once on the GPU, because several tests with the same WUs showed that this gives the highest output. When running 4 tasks it started to slow down, but I'm not sure whether that is caused by the GPU or by the CPU, which may be too slow to handle it. I now have 90-97% GPU usage. When running 2 tasks it was only 80-85%, and running 1 task it was 60-65%. Another advantage is that the card keeps crunching other WUs pretty fast while a VLAR is being processed by the GPU. My watt-meter also shows the highest power consumption when running 3 tasks on the GPU. You can use "GPU-Z" to watch the GPU usage. <app_info> <app> <name>setiathome_enhanced</name> </app> <file_info> <name>AK_v8b_win_x64_SSSE3x.exe</name> <executable/> </file_info> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>603</version_num> <platform>windows_intelx86</platform> <file_ref> <file_name>AK_v8b_win_x64_SSSE3x.exe</file_name> <main_program/> </file_ref> </app_version> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>603</version_num> <platform>windows_x86_64</platform> <file_ref> <file_name>AK_v8b_win_x64_SSSE3x.exe</file_name> <main_program/> </file_ref> </app_version> <app> <name>setiathome_enhanced</name> </app> <file_info> <name>libfftw3f-3-1-1a_upx.dll</name> <executable/> </file_info> <file_info> <name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</name> <executable/> </file_info> <file_info> <name>cudart32_30_14.dll</name> <executable/> </file_info> <file_info> <name>cufft32_30_14.dll</name> <executable/> </file_info> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>610</version_num> <avg_ncpus>0.200000</avg_ncpus> <max_ncpus>0.200000</max_ncpus> <flops>57462450464</flops> 
<plan_class>cuda_fermi</plan_class> <file_ref> <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>cudart32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>cufft32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>libfftw3f-3-1-1a_upx.dll</file_name> </file_ref> <coproc> <type>CUDA</type> <count>0.33</count> </coproc> </app_version> </app_info> |
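The <count> value inside <coproc> in the file above is what sets how many tasks share the card: each task claims that fraction of one GPU, so 0.33 lets three run at once. The sketch below shows the same element with different values. Only the 1 and 0.33 settings actually appear in this thread; the 0.5 value for the 2-task case is an assumption extrapolated from them.

```xml
<!-- One task per GPU (the value in Steve's file) -->
<coproc>
    <type>CUDA</type>
    <count>1</count>
</coproc>

<!-- Two tasks per GPU (assumed value, not posted in this thread) -->
<coproc>
    <type>CUDA</type>
    <count>0.5</count>
</coproc>

<!-- Three tasks per GPU (the value in TheFreshPrince's file) -->
<coproc>
    <type>CUDA</type>
    <count>0.33</count>
</coproc>
```

Only one of these belongs in any given <app_version> section; which one runs fastest depends on the card and the CPU feeding it, as the usage figures above suggest.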
Richard Haselgrove Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
SciManStev wrote: "I appreciate that bit of knowledge. I want very much to come up to speed on how all this works." You have read "anonymous platform", haven't you? It doesn't help much with the practical nitty-gritty you're wrestling with here, but it should give you a feel for the shape of how it's supposed to work. |
Questor Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
SciManStev wrote: "I appreciate that bit of knowledge. I want very much to come up to speed on how all this works." Steve, the app_info you posted is missing the trailing </app_info> tag at the end. BOINC might be forgiving and not mind, but it should be there for completeness (or perhaps you just didn't copy the whole text?). I assume you've gone from a working to a non-working setup. What steps happened in between? Did you just edit the app_info file? Are you using the same Fermi app and DLLs, or did you download new ones? Have they been corrupted (try again), or are the permissions not correct to allow execution? Do you still have an original app_info that worked OK for comparison? John. |
SciManStev Joined: 20 Jun 99 Posts: 6657 Credit: 121,090,076 RAC: 0 |
Yes, it was working before the issues came up. Through a series of mistakes on my part, I had detached and reattached a couple of times. Since I wanted things to be done as correctly as possible, and I have a genuine desire to learn how all of this actually works, I did my best to follow this thread to set things up. Right now, the system correctly won't let me have any more GPU units, so I may have to wait until tomorrow to see if things are working. I am glad that it won't just keep throwing GPU units my way only to see them error out. The actual app_info file I was using did have the closing tag, but it may not have been selected when I copied it. Once I get the mechanics straightened out, it should be easy for me to duplicate things in the future for testing purposes. This is really fascinating stuff, and I love learning it. My deepest respect to all the developers and their knowledge. Steve |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Everything seems to be working OK. Thank you for asking, Questor. I'm unsure what to edit in the client state file to shift VLAR tasks to the GPU, so I'm going to leave it alone. |
Gundolf Jahn Joined: 19 Sep 00 Posts: 3184 Credit: 446,358 RAC: 0 |
Speedy wrote: "I'm unsure on what to edit in the client state file for shifting vlar tasks to gpu so I'm going to leave it alone." Not client_state.xml; it's app_info.xml where changes are to be made. Regards, Gundolf Computers aren't everything in life. (Just a little joke) SETI@home classic workunits 3,758 SETI@home classic CPU time 66,520 hours |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Thanks, I'll have a look in app_info.xml. Does this mean I can use an optimized CPU app and the standard 6.10 GPU app? |
Questor Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
Speedy wrote: "I'm unsure on what to edit in the client state file for shifting vlar tasks to gpu so I'm going to leave it alone." Three sides of the same coin. Reschedule gets its task count from client_state and modifies client_state when moving tasks from CPU <-> GPU. Richard's and MadMaC's workaround for Reschedule being unable to cope with plan_classes other than cuda is to add extra entries to app_info for when Reschedule incorrectly brands the CPU-to-GPU tasks with cuda rather than cuda_23 or cuda_fermi (and thus mismatches the app entries in app_info). The other option was to manually edit client_state after running Reschedule, changing the incorrectly branded cuda entries to either cuda_23 or cuda_fermi as required. The file corruption I was referring to also occurs in client_state: Reschedule is unable to parse the file correctly (even though BOINC continues to work OK) and therefore reports zero tasks available or requiring rebranding, i.e. "User testing for a reschedule CPU tasks: 0 (0 VLAR, 0 VHAR) GPU tasks: 0 (0 VLAR, 0 VHAR) No reschedule needed", even though you have many tasks. |
Questor Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157 |
If you post your app_info file we can advise on the actual entries you need for your setup. But you will have a 603 entry for CPU tasks, which can be either the stock or an optimized app as you choose. You will need 2 entries for the 610 Fermi app: one with a plan_class of cuda_fermi and an identical copy but with plan_class cuda. You can then use Reschedule to move tasks from CPU <-> GPU. Normal unbranded GPU tasks will be handled by the first entry, but when Reschedule moves tasks to the GPU it will create entries with plan_class cuda, which will be dealt with by the second entry (the workaround). When using anonymous platform (i.e. having an app_info file) BOINC will not download any required apps, so before modifying the file you will need to obtain copies of the required exes and DLLs if you are going to use different ones from those you have at present. Edit: Apologies, I now see from your machine list that the PC with your GTX470 in it is not using an app_info file at present, so rescheduling without one is going to be the fiddly manual process I described in Message 1006100 (if not using app_info then you will have only cuda_fermi entries and cannot use the Richard/MadMaC workaround), so it is probably best to get your app_info sorted out before trying to Reschedule. Anyway, "time for bed", said Zebedee. |
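The two-entry workaround Questor describes could look roughly like the sketch below. The file names and the avg_ncpus/max_ncpus/count values are borrowed from the app_info files posted earlier in this thread and are assumptions for any other machine; the matching <file_info> entries and the rest of the file are omitted.

```xml
<!-- Entry 1: handles normal GPU tasks, which arrive with plan_class cuda_fermi -->
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <avg_ncpus>0.200000</avg_ncpus>
    <max_ncpus>0.200000</max_ncpus>
    <plan_class>cuda_fermi</plan_class>
    <file_ref>
        <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_30_14.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_30_14.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>libfftw3f-3-1-1a_upx.dll</file_name>
    </file_ref>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
</app_version>

<!-- Entry 2: identical copy except for plan_class cuda, so that tasks
     rebranded by Reschedule still match an entry -->
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <avg_ncpus>0.200000</avg_ncpus>
    <max_ncpus>0.200000</max_ncpus>
    <plan_class>cuda</plan_class>
    <file_ref>
        <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_30_14.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_30_14.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>libfftw3f-3-1-1a_upx.dll</file_name>
    </file_ref>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
</app_version>
```

Both entries point at the same Fermi executable; only the plan_class line differs, which is the "subtle (but important)" kind of near-duplicate Richard mentions earlier in the thread.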
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Here's the app_info file after I installed Lunatics_Win64v0.36_(SSE3+)_AP505r409_AKv8bx64_CudaV12 using default settings. The file is from the project > Seti folder. Is this the right app_info file? <app_info> <app> <name>setiathome_enhanced</name> </app> <file_info> <name>AK_v8b_win_x64_SSE3.exe</name> <executable/> </file_info> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>603</version_num> <platform>windows_intelx86</platform> <file_ref> <file_name>AK_v8b_win_x64_SSE3.exe</file_name> <main_program/> </file_ref> </app_version> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>603</version_num> <platform>windows_x86_64</platform> <file_ref> <file_name>AK_v8b_win_x64_SSE3.exe</file_name> <main_program/> </file_ref> </app_version> <app> <name>astropulse_v505</name> </app> <file_info> <name>ap_5.05r409_SSE.exe</name> <executable/> </file_info> <app_version> <app_name>astropulse_v505</app_name> <version_num>505</version_num> <platform>windows_intelx86</platform> <file_ref> <file_name>ap_5.05r409_SSE.exe</file_name> <main_program/> </file_ref> </app_version> <app_version> <app_name>astropulse_v505</app_name> <version_num>505</version_num> <platform>windows_x86_64</platform> <file_ref> <file_name>ap_5.05r409_SSE.exe</file_name> <main_program/> </file_ref> </app_version> </app_info> |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.