Running SETI@home on an nVidia Fermi GPU
Author | Message |
---|---|
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14676 Credit: 200,643,578 RAC: 874 |
Which is 6.10 currently, On Fermi I think 6.08 and 6.09 are not Fermi compatible either. Correct. 6.08 produces computation errors, 6.09 produces the -9 overflow on most tasks. |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
My understanding of a stock application is that you do not need to change anything. I thought post #2 applied if you are using a non stock app. I'm using stock cpu app. |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
No, you're not running the stock CPU app; you're running an app_info with the AK_V8 SSSE3x CPU app and the V12 CUDA app. Stock means you're running the bog-standard apps from the project, with or without an app_info. Claggy |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14676 Credit: 200,643,578 RAC: 874 |
No, that version doesn't work on Fermi - see the eponymous thread. You need the stock application, as described in post #2 of this thread. At the time it was written, there was no stock Fermi app here on the main SETI project. In order to run a true Fermi app, you had to use an app_info.xml file, and you had to add that section. Subsequently, it became apparent that people were installing Fermi cards regardless, and the project code (as then written) was wrongly sending them the cuda23 app - which doesn't work on a Fermi. That's what led to all the kerfuffle of the last ten days. But neither of those paragraphs applies to you. The task you linked, saying that you thought it was your first good result, was crunched using Raistmer's v12 application. That isn't, and never was, a stock application. There now is a stock application, and you're free to use that, either using BOINC's automatic distribution system, or by manually installing it into app_info. But please don't use v12. |
John Send message Joined: 21 May 99 Posts: 51 Credit: 5,667,907 RAC: 0 |
How do I tell the difference between fermi credits pending and Cpu Credits pending? Host ID 5012909 |
RottenMutt Send message Joined: 15 Mar 01 Posts: 1011 Credit: 230,314,058 RAC: 0 |
I accidentally dumped "rationed" MP work units by using Reschedule to move work from the CPU to the GPU; BOINC then dumped the work since there wasn't an application. Can't I put the Fermi app down as 608 too, or would that bork the science? |
zoom3+1=4 Send message Joined: 30 Nov 03 Posts: 66277 Credit: 55,293,173 RAC: 49 |
I accidentally dumped "rationed" MP work units by using Reschedule to move work from the CPU to the GPU; BOINC then dumped the work since there wasn't an application. Can't I put the Fermi app down as 608 too, or would that bork the science? Only 6.10 is safe for Fermi. As to renaming the 6.10 app and the name in the app_info file, I don't see any harm in that, unless one's a Muppet. :D Savoir-Faire is everywhere! The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST |
MadMaC Send message Joined: 4 Apr 01 Posts: 201 Credit: 47,158,217 RAC: 0 |
Only 6.10 is safe for Fermi. As to renaming the 6.10 app and the name in the app_info file, I don't see any harm in that, unless one's a Muppet. :D Can you clarify this please? I'm a muppet!!! I'm presuming that I rename the 6.10 app to something else and then edit the app_info to reflect the new name? I'm guessing it has to be the 6.08 version? But I don't want to screw up anything else... The Fermi part of the app_info is a couple of posts below, edited by Claggy. |
MadMaC Send message Joined: 4 Apr 01 Posts: 201 Credit: 47,158,217 RAC: 0 |
Can you clarify this please? I'm a muppet!!! I'm guessing that I want to replace any mention of the 6.10 app in my app_info with MB_608_CUDA_V12_VLARkill_FPLim2048.exe, and then ReScheduler would work? The Fermi part of the app_info is a couple of posts below, edited by Claggy; I have taken this and had a go. Have I done this right? If this app_info is correct, I will rename the exe to MB_608_CUDA_V12_VLARkill_FPLim2048.exe to reflect the changes I have made. <app> <name>setiathome_enhanced</name> </app> <file_info> <name>MB_608_CUDA_V12_VLARkill_FPLim2048.exe</name> <executable/> </file_info> <file_info> <name>cudart32_30_14.dll</name> <executable/> </file_info> <file_info> <name>cufft32_30_14.dll</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-1-1a_upx.dll</name> <executable/> </file_info> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>608</version_num> <avg_ncpus>0.200000</avg_ncpus> <max_ncpus>0.200000</max_ncpus> <platform>windows_intelx86</platform> <flops>57462450464</flops> <plan_class>cuda_fermi</plan_class> <file_ref> <file_name>MB_608_CUDA_V12_VLARkill_FPLim2048.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>cudart32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>cufft32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>libfftw3f-3-1-1a_upx.dll</file_name> </file_ref> <coproc> <type>CUDA</type> <count>1</count> </coproc> </app_version> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>608</version_num> <avg_ncpus>0.200000</avg_ncpus> <max_ncpus>0.200000</max_ncpus> <platform>windows_x86_64</platform> <flops>57462450464</flops> <plan_class>cuda_fermi</plan_class> <file_ref> <file_name>MB_608_CUDA_V12_VLARkill_FPLim2048.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>cudart32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>cufft32_30_14.dll</file_name> </file_ref> <file_ref> <file_name>libfftw3f-3-1-1a_upx.dll</file_name> </file_ref> <coproc> <type>CUDA</type> <count>1</count> </coproc> </app_version> </app_info> |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14676 Credit: 200,643,578 RAC: 874 |
STOP!!! Please don't guess. Please take the time and trouble to get it right. I'm only just catching up after last night. First, please identify which host you're talking about, while I catch up on the previous discussion. Then we can work out the proper way to proceed. But renaming executables isn't it. |
MadMaC Send message Joined: 4 Apr 01 Posts: 201 Credit: 47,158,217 RAC: 0 |
Damn, that red STOP is scary... I haven't changed a thing yet, I was waiting until those more knowledgeable than me had replied to see if I was right.. My host ID is 5424775 |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14676 Credit: 200,643,578 RAC: 874 |
Two general observations, prompted by MadMaC's post but applicable to everyone. 1) Please don't rename executable files. They would still work, but the name has no impact on the sort of issues we're discussing here. What it would do is cause major confusion if you ever had to ask for help here in the future. If you post a fragment of app_info saying that you're using "this_app", but you're actually using "that_app" under another name, we'll all end up getting into even more bother than we are already. 2) The ReScheduler tool was written by Marius. He posted it in a discussion thread at Lunatics, but that's the limit of Lunatics' involvement. Unfortunately, shortly after he wrote and posted it, he became very busy at work and couldn't stay around to provide support. It's a brilliant tool, and many people are, rightly, grateful to Marius for writing it. But no-one except Marius really knows, in detail, exactly how it works. It's a black box, with no source code. So adapting it to work in new situations, which Marius didn't plan for (and there's no reason why he should have), requires experimentation and observation. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14676 Credit: 200,643,578 RAC: 874 |
Damn, that red STOP is scary... OK, that's your GTX 470. So, for the time being, the application you have to use is setiathome_6.10_windows_intelx86__cuda_fermi.exe. If you've ever renamed any files, I suggest you download a fresh copy so you can be sure it does exactly what it says on the tin.... Now, to ReScheduler. We had a conversation about this a week ago, in this very thread: You: message 1001516 Me: message 1001523 You: message 1001578 You had problems, but you didn't actually post the results of the experiment I suggested. Can you remember what they were? |
MadMaC Send message Joined: 4 Apr 01 Posts: 201 Credit: 47,158,217 RAC: 0 |
Well, I can remember trying to add a block into my app_info, starting up BOINC, and it trashed all my tasks (or discarded them, to be more accurate)... Just tried it again and the same has happened. This seems a bit strange, as I only moved 11% of my tasks to the GPU, so there should have been some left. I'd post the entire client_state.xml, but it was 847K; now it is 7K and all my tasks have gone. BOINC was shut down after I suspended the GPU. I think the problem is down to me either not identifying the tasks correctly in client_state.xml or making an error somewhere else. Either way, I will have to wait now until I can get some more work and then try again. Damn, I will have to post an apology to my wingmen... Off to a meeting now, but I will check back in when I get back to my desk... |
ToxicTBag Send message Joined: 5 Feb 10 Posts: 101 Credit: 57,197,902 RAC: 0 |
First, a big thank you to all who are contributing to the thread and aiming for a greater understanding of how to get their own and others' configurations to work with SETI. My setup is as follows: machine 1 has 3 GTX 470s, machine 2 has 1 GTX 480, and machine 3 has 2 GTX 295s and 1 GTX 260. Please correct me if I am wrong. Machines 1 and 2 were running 6.10.56 with opt Lunatics .36 and had not downloaded work in 3 days, along with the 100 quota message. Machine 3 was running 6.10.56 with opt .36 and had work, but was not receiving new work, along with the 100 message and an app_info error message (though it carried on working). When the upload servers began working yesterday, machines 1 and 2 uploaded all their work (not a lot). Machine 3 had over 1500 completed WUs and began uploading; after it had uploaded around 1000 of the tasks, I got a message saying my password was not valid. I tried a restart more than once, and when it did connect it said I was not attached to any projects. I checked my computers tab in BOINC, and it had uploaded a lot of the work but showed a lot as client detached. That is my situation at present. I would like to ask the following: on machines 1, 2 and 3, which have now downloaded a small amount of work, should I just install the 6.10.56 app with no opt .36, use ReScheduler to catch VLARs etc., and wait and see what happens once the present cache is cleared? Apologies for the long-winded question, but I have read a lot of threads and have been given a lot of advice, and I do not seem to be getting anywhere. I am not very good with editing the app_info and would appreciate any advice the thread has to offer. Thank you in advance. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14676 Credit: 200,643,578 RAC: 874 |
The process was meant to be: 1) Stop BOINC 2) ReSchedule tasks 3) Examine client_state 4) Modify app_info 5) Start BOINC You will lose work if you allow #5 without successfully completing #3 and #4. In general, I would urge people not to post app_info in a sticky 'information' thread like this - it just confuses other readers in the future, who may have difficulty picking out the "working" advice from the "faulty" requests for help. Start another thread, which can fade off the first page when the problem is solved, or at a pinch send someone a PM. And never post the whole of client_state! Apart from the size, it's got security information in it. In this case, all we need is a matched <workunit> / <result> pair for a rescheduled job. |
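Steps 3 and 4 above amount to checking, before BOINC is restarted, that every task recorded in client_state has an application entry in app_info that can run it. A rough Python sketch of that check follows. It is an illustration only, not part of ReScheduler or BOINC. It assumes that each task appears in client_state.xml as a `<result>` carrying `<version_num>` and (for GPU work) `<plan_class>`, and that app_info.xml lists matching `<app_version>` blocks; the function name and the trimmed sample XML are made up for the example.

```python
import xml.etree.ElementTree as ET


def uncovered_results(client_state_xml, app_info_xml):
    """Return names of results whose (version_num, plan_class) has no app_version."""
    app_info = ET.fromstring(app_info_xml)
    # Collect every (version, plan class) combination that app_info can run.
    available = {
        (av.findtext("version_num", "").strip(),
         av.findtext("plan_class", "").strip())
        for av in app_info.iter("app_version")
    }
    state = ET.fromstring(client_state_xml)
    missing = []
    for result in state.iter("result"):
        key = (result.findtext("version_num", "").strip(),
               result.findtext("plan_class", "").strip())
        if key not in available:
            missing.append(result.findtext("name", "").strip())
    return missing


# Hypothetical, heavily trimmed sample data; real files contain far more.
SAMPLE_APP_INFO = """<app_info>
  <app_version><app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num></app_version>
  <app_version><app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num><plan_class>cuda_fermi</plan_class></app_version>
</app_info>"""

SAMPLE_STATE = """<client_state>
  <result><name>task_a_0</name><version_num>603</version_num></result>
  <result><name>task_b_1</name><version_num>610</version_num>
    <plan_class>cuda_fermi</plan_class></result>
  <result><name>task_c_0</name><version_num>608</version_num>
    <plan_class>cuda</plan_class></result>
</client_state>"""

if __name__ == "__main__":
    # Any name listed here is a task that would probably be discarded on restart.
    print(uncovered_results(SAMPLE_STATE, SAMPLE_APP_INFO))
```

If the list comes back non-empty, that is the cue to fix app_info first and leave BOINC stopped. Work on copies of the real files rather than the originals.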
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14676 Credit: 200,643,578 RAC: 874 |
On machines 1, 2 and 3 which have downloaded a small amount of work now should i just install 6.10.56 app with no opt .36 and use rescheduler to catch vlars etc and wait and see what happens once present cache is cleared? Much of your question relates to the general server problems, which are better covered in the other threads. But since machines 1 and 2 have fermi cards: Make sure you have already downloaded both Fermi (v6.10) work and CPU (v6.03) work, so you have the necessary equipment to handle rescheduled tasks. Then, use ReScheduler only to move VLAR work from GPU, to CPU. Don't even attempt to move work in the opposite direction until we get MadMaC's problem sorted out, and we can draw up more general rules for other people to follow. I can't work it out without MadMaC's co-operation, because many of his problems are related to his 64-bit operating system, and I don't have one here to practice on. |
MadMaC Send message Joined: 4 Apr 01 Posts: 201 Credit: 47,158,217 RAC: 0 |
In this case, all we need is a matched <workunit> / <result> pair for a rescheduled job. Is there an easy way to match the ID of a rescheduled WU? One of the reasons I started up BOINC in suspended mode was to see if I could get a workunit ID to search for in the client_state.xml file. Once I have downloaded some work I will post back here; it might not be until tomorrow, the way the servers are going. Edit: I have some CPU work, so I will suspend BOINC, shut it down, and go through the XML file to see what I can find. I will not start BOINC up at all. Interesting ReScheduler logs: --------------------------- Reschedule version 1.9 Time: 15-06-2010 14:20:10 User testing for a reschedule CPU tasks: 128 (0 VLAR, 66 VHAR) GPU tasks: 0 (0 VLAR, 0 VHAR) Reschedule needed because not enough GPU units --------------------------- Reschedule version 1.9 Time: 15-06-2010 14:20:17 User forced a reschedule Boinc applications setiathome_enhanced 603 windows_intelx86 No SETI cuda application found And it doesn't move any units. I'm going to look, but I know my Fermi app is there... |
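One possible way to pull out a matched `<workunit>` / `<result>` pair is sketched below in Python. It rests on an assumption about the file layout: that each `<result>` names its parent through a `<wu_name>` child, and that each `<workunit>` carries the same string in its `<name>`. The function name and the sample data are hypothetical. Run it against a copy of client_state.xml with BOINC stopped, and post only the returned fragment, never the whole file.

```python
import xml.etree.ElementTree as ET


def matched_pair(client_state_xml, result_name):
    """Return (workunit_xml, result_xml) for the named result, or None."""
    root = ET.fromstring(client_state_xml)
    for result in root.iter("result"):
        if result.findtext("name", "").strip() != result_name:
            continue
        # The result points back at its workunit by name.
        wu_name = result.findtext("wu_name", "").strip()
        for wu in root.iter("workunit"):
            if wu.findtext("name", "").strip() == wu_name:
                return (ET.tostring(wu, encoding="unicode"),
                        ET.tostring(result, encoding="unicode"))
    return None


# Hypothetical, heavily trimmed sample; real entries have many more fields.
SAMPLE_STATE = """<client_state>
  <workunit><name>wu_demo</name><app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num></workunit>
  <workunit><name>wu_other</name><app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num></workunit>
  <result><name>wu_demo_0</name><wu_name>wu_demo</wu_name>
    <version_num>603</version_num></result>
</client_state>"""

if __name__ == "__main__":
    pair = matched_pair(SAMPLE_STATE, "wu_demo_0")
    if pair is not None:
        workunit_xml, result_xml = pair
        print(workunit_xml)
        print(result_xml)
```

The result name to search for is the task name shown in BOINC Manager's task list.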
ToxicTBag Send message Joined: 5 Feb 10 Posts: 101 Credit: 57,197,902 RAC: 0 |
But since machines 1 and 2 have fermi cards: Make sure you have already downloaded both Fermi (v6.10) work and CPU (v6.03) work, so you have the necessary equipment to handle rescheduled tasks. Then, use ReScheduler only to move VLAR work from GPU, to CPU. Don't even attempt to move work in the opposite direction until we get MadMaC's problem sorted out, and we can draw up more general rules for other people to follow. I can't work it out without MadMaC's co-operation, because many of his problems are related to his 64-bit operating system, and I don't have one here to practice on. Thank you for taking the time to reply, Richard; it is very much appreciated. I will continue to watch carefully for progress on these problems. TTBag |
zoom3+1=4 Send message Joined: 30 Nov 03 Posts: 66277 Credit: 55,293,173 RAC: 49 |
Only 6.10 is safe for Fermi. As to renaming the 6.10 app and the name in the app_info file, I don't see any harm in that, unless one's a Muppet. :D OK, you've got it mostly right. You can rename the program and its entries within the app_info file, as long as the new name doesn't match an existing 6.08 or 6.09 filename. For example, you could call it MissPiggy_6085.exe. OK, cookie monster? ;) Oh, and the app_info would look sort of like so: <app_info> <app> <name>setiathome_enhanced</name> </app> <file_info> <name>MissPiggy_6085.exe</name> <executable/> </file_info> <app_version> <app_name>setiathome_enhanced</app_name> <version_num>6085</version_num> <file_ref> <file_name>MissPiggy_6085.exe</file_name> <main_program/> </file_ref> </app_version> </app_info> Now, I just redid this app_info, as I used my normal app_info as a template. The only lines that are important to alter are as follows: <name>MissPiggy_6085.exe</name> <version_num>6085</version_num> <file_name>MissPiggy_6085.exe</file_name> After all, you do want it to run. The PC doesn't care one bit what you name it; it could be named Cookie Monster and the PC would accept it. This is on the PC end. I don't know whether the server end has a fixed set of names that it recognizes; if it does, then don't do this. Savoir-Faire is everywhere! The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST |
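Whatever the executable is called, the name has to be spelled identically in its `<file_info>` entry and in every `<file_ref>` that points at it. A small Python sketch of a cross-check for a hand-edited app_info is given below. It is an illustration under the assumption that the file follows the `<file_info><name>` / `<file_ref><file_name>` layout shown in this thread; the function name is made up, and the sample deliberately contains a mismatched name.

```python
import xml.etree.ElementTree as ET


def unresolved_file_refs(app_info_xml):
    """Return file names used in <file_ref> that no <file_info> declares."""
    root = ET.fromstring(app_info_xml)
    declared = {fi.findtext("name", "").strip() for fi in root.iter("file_info")}
    referenced = [fr.findtext("file_name", "").strip()
                  for fr in root.iter("file_ref")]
    return [name for name in referenced if name not in declared]


# Sample with a deliberate typo: file_info says _608, file_ref says _6085.
SAMPLE_BAD = """<app_info>
  <app><name>setiathome_enhanced</name></app>
  <file_info><name>MissPiggy_608.exe</name><executable/></file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>6085</version_num>
    <file_ref><file_name>MissPiggy_6085.exe</file_name><main_program/></file_ref>
  </app_version>
</app_info>"""

# The same fragment with the typo corrected.
SAMPLE_GOOD = SAMPLE_BAD.replace("MissPiggy_608.exe", "MissPiggy_6085.exe")

if __name__ == "__main__":
    print(unresolved_file_refs(SAMPLE_BAD))
    print(unresolved_file_refs(SAMPLE_GOOD))
```

This only checks that the names agree inside app_info. It says nothing about whether the renamed file exists in the project directory or whether the application is suitable for a Fermi card.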