Running SETI@home on an nVidia Fermi GPU

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004195 - Posted: 14 Jun 2010, 21:48:44 UTC - in response to Message 1004190.  

Which is 6.10 currently, On Fermi I think 6.08 and 6.09 are not Fermi compatible either.

Correct. 6.08 produces computation errors, 6.09 produces the -9 overflow on most tasks.

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1004220 - Posted: 14 Jun 2010, 22:49:36 UTC - in response to Message 1004185.  


No, that version doesn't work on Fermi - see the eponymous thread. You need the stock application, as described in post #2 of this thread.

My understanding of a stock application is that you do not need to change anything. I thought post #2 applied if you are using a non stock app. I'm using stock cpu app.

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1004239 - Posted: 14 Jun 2010, 23:16:55 UTC - in response to Message 1004220.  
Last modified: 14 Jun 2010, 23:18:13 UTC


No, that version doesn't work on Fermi - see the eponymous thread. You need the stock application, as described in post #2 of this thread.

My understanding of a stock application is that you do not need to change anything. I thought post #2 applied if you are using a non stock app. I'm using stock cpu app.

No, you're not running the stock CPU app; you're running an app_info with the AK_V8 SSSE3x CPU app and the V12 CUDA app.

Stock means you're running the bog-standard apps from the project, with or without an app_info.

Claggy

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004241 - Posted: 14 Jun 2010, 23:20:04 UTC - in response to Message 1004220.  

No, that version doesn't work on Fermi - see the eponymous thread. You need the stock application, as described in post #2 of this thread.

My understanding of a stock application is that you do not need to change anything. I thought post #2 applied if you are using a non stock app. I'm using stock cpu app.

At the time it was written, there was no stock Fermi app here on the main SETI project. In order to run a true Fermi app, you had to use an app_info.xml file, and you had to add that section.

Subsequently, it became apparent that people were installing Fermi cards regardless, and the project code (as then written) was wrongly sending them the cuda23 app - which doesn't work on a Fermi. That's what led to all the kerfuffle of the last ten days.

But neither of those paragraphs applies to you. The task you linked, saying that you thought it was your first good result, was crunched using Raistmer's v12 application. That isn't, and never was, a stock application.

There now is a stock application, and you're free to use that, either using BOINC's automatic distribution system, or by manually installing it into app_info. But please don't use v12.

John
Joined: 21 May 99
Posts: 51
Credit: 5,667,907
RAC: 0
United States
Message 1004286 - Posted: 15 Jun 2010, 1:06:36 UTC

How do I tell the difference between Fermi credits pending and CPU credits pending?


Host ID 5012909

RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1004340 - Posted: 15 Jun 2010, 5:38:24 UTC

I accidentally dumped "rationed" MP work units by using Reschedule to move work from the CPU to the GPU; BOINC then dumped the work since there wasn't an application. Can't I put the Fermi app down as 608 too, or would that bork the science?

zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 66199
Credit: 55,293,173
RAC: 49
United States
Message 1004345 - Posted: 15 Jun 2010, 6:11:25 UTC - in response to Message 1004340.  

I accidentally dumped "rationed" MP work units by using Reschedule to move work from the CPU to the GPU; BOINC then dumped the work since there wasn't an application. Can't I put the Fermi app down as 608 too, or would that bork the science?

Only 6.10 is safe for Fermi. As to renaming the 6.10 app and its name in the app_info file, I don't see any harm in that, unless one's a Muppet. :D
Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1004366 - Posted: 15 Jun 2010, 7:45:08 UTC - in response to Message 1004345.  


Only 6.10 is safe for Fermi. As to renaming the 6.10 app and its name in the app_info file, I don't see any harm in that, unless one's a Muppet. :D

Can you clarify this please? I'm a muppet!!!

I'm presuming that I rename the 6.10 app to something else and then edit the app_info to reflect the new name? I'm guessing it has to be the 6.08 version? But I don't want to screw up anything else...
The Fermi part of the app_info is a couple of posts below, edited by Claggy.

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1004381 - Posted: 15 Jun 2010, 8:39:36 UTC - in response to Message 1004366.  

Only 6.10 is safe for Fermi. As to renaming the 6.10 app and its name in the app_info file, I don't see any harm in that, unless one's a Muppet. :D

Can you clarify this please? I'm a muppet!!!

I'm guessing that I want to replace any mention of the 6.10 app in my app_info with MB_608_CUDA_V12_VLARkill_FPLim2048.exe, and then ReScheduler would work?
The Fermi part of the app_info is a couple of posts below, edited by Claggy. I have taken this and had a go...

Have I done this right? If this app_info is correct, I will rename the exe to MB_608_CUDA_V12_VLARkill_FPLim2048.exe to reflect the changes I have made.



<app_info>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>MB_608_CUDA_V12_VLARkill_FPLim2048.exe</name>
<executable/>
</file_info>
<file_info>
<name>cudart32_30_14.dll</name>
<executable/>
</file_info>
<file_info>
<name>cufft32_30_14.dll</name>
<executable/>
</file_info>
<file_info>
<name>libfftw3f-3-1-1a_upx.dll</name>
<executable/>
</file_info>

<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<avg_ncpus>0.200000</avg_ncpus>
<max_ncpus>0.200000</max_ncpus>
<platform>windows_intelx86</platform>
<flops>57462450464</flops>
<plan_class>cuda_fermi</plan_class>
<file_ref>
<file_name>MB_608_CUDA_V12_VLARkill_FPLim2048.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
</app_version>

<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>608</version_num>
<avg_ncpus>0.200000</avg_ncpus>
<max_ncpus>0.200000</max_ncpus>
<platform>windows_x86_64</platform>
<flops>57462450464</flops>
<plan_class>cuda_fermi</plan_class>
<file_ref>
<file_name>MB_608_CUDA_V12_VLARkill_FPLim2048.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
</app_version>

</app_info>
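
One mechanical check that can be run on a file like this before BOINC ever sees it is whether every `<file_name>` referenced from an `<app_version>` has also been declared in a `<file_info>` block. The sketch below is only an illustration of that check, using Python's standard XML parser. The function name and the file names in the sample (`example_app.exe`, `example_runtime.dll`) are made up for the example, and it says nothing about whether a given executable is actually safe to run on a Fermi card.

```python
import xml.etree.ElementTree as ET


def undeclared_file_refs(app_info_xml):
    """Return file names used in <file_ref> but never declared in <file_info>."""
    # fromstring() raises ET.ParseError if the file is not well-formed,
    # for example if the opening <app_info> tag is missing.
    root = ET.fromstring(app_info_xml)
    declared = {fi.findtext("name") for fi in root.iter("file_info")}
    missing = []
    for ref in root.iter("file_ref"):
        name = ref.findtext("file_name")
        if name not in declared and name not in missing:
            missing.append(name)
    return missing


# Hypothetical sample: the DLL is referenced but has no <file_info> entry.
SAMPLE = """<app_info>
<app><name>setiathome_enhanced</name></app>
<file_info><name>example_app.exe</name><executable/></file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>610</version_num>
<plan_class>cuda_fermi</plan_class>
<file_ref><file_name>example_app.exe</file_name><main_program/></file_ref>
<file_ref><file_name>example_runtime.dll</file_name></file_ref>
</app_version>
</app_info>"""

print(undeclared_file_refs(SAMPLE))  # ['example_runtime.dll']
```

An empty list only means the file is internally consistent; it does not mean the chosen application or version number is the right one.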

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004391 - Posted: 15 Jun 2010, 9:32:38 UTC - in response to Message 1004381.  

Only 6.10 is safe for Fermi. As to renaming the 6.10 app and its name in the app_info file, I don't see any harm in that, unless one's a Muppet. :D

Can you clarify this please? I'm a muppet!!!

I'm guessing that I want to replace any mention of the 6.10 app in my app_info with MB_608_CUDA_V12_VLARkill_FPLim2048.exe, and then ReScheduler would work?
The Fermi part of the app_info is a couple of posts below, edited by Claggy. I have taken this and had a go...

Have I done this right? If this app_info is correct, I will rename the exe to MB_608_CUDA_V12_VLARkill_FPLim2048.exe to reflect the changes I have made.

STOP!!!

Please don't guess. Please take the time and trouble to get it right.

I'm only just catching up after last night. First, please identify which host you're talking about, while I catch up on the previous discussion. Then we can work out the proper way to proceed. But renaming executables isn't it.

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1004397 - Posted: 15 Jun 2010, 9:57:46 UTC - in response to Message 1004391.  

Damn, that red STOP is scary...

I haven't changed a thing yet, I was waiting until those more knowledgeable than me had replied to see if I was right..

My host ID is 5424775

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004402 - Posted: 15 Jun 2010, 10:17:49 UTC

Two general observations, prompted by MadMaC's post but applicable to everyone.

1) Please don't rename executable files. They would still work, but the name has no impact on the sort of issues we're discussing here. What it would do is to cause major confusion if you ever had to ask for help here in the future. If you post a fragment of app_info saying that you're using "this_app", but you're actually using "that_app" under another name, we'll all end up getting into even more bother than we are already.

2) The ReScheduler tool was written by Marius. He posted it in a discussion thread at Lunatics, but that's the limit of Lunatics' involvement. Unfortunately, shortly after he wrote and posted it, he became very busy at work and couldn't stay around to provide support.

It's a brilliant tool, and many people are - rightly - grateful to Marius for writing it. But no-one except Marius really knows, in detail, exactly how it works. It's a black box, with no source code. So adapting it to work in new situations, which Marius didn't plan for (and no reason why he should) requires experimentation and observation.

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004405 - Posted: 15 Jun 2010, 10:30:01 UTC - in response to Message 1004397.  

Damn, that red STOP is scary...

I haven't changed a thing yet, I was waiting until those more knowledgeable than me had replied to see if I was right..

My host ID is 5424775

OK, that's your GTX 470. So, for the time being, the application you have to use is setiathome_6.10_windows_intelx86__cuda_fermi.exe. If you've ever renamed any files, I suggest you download a fresh copy so you can be sure it does exactly what it says on the tin....

Now, to ReScheduler. We had a conversation about this a week ago, in this very thread:

You: message 1001516
Me: message 1001523
You: message 1001578

You had problems, but you didn't actually post the results of the experiment I suggested. Can you remember what they were?

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1004415 - Posted: 15 Jun 2010, 11:11:46 UTC
Last modified: 15 Jun 2010, 11:14:27 UTC

Well, I can remember trying to add a block into my app_info, starting up BOINC, and it trashed all my tasks (or discarded them, to be more accurate)...
Just tried it again and the same has happened. This seems a bit strange, as I only moved 11% of my tasks to the GPU, so there should have been some left.
I'd post the entire client_state.xml, but it was 847K; now it is 7K and all my tasks have gone. BOINC was shut down after I suspended the GPU.
I think the problem is down to me either not identifying the tasks correctly in client_state.xml or making an error somewhere else. Either way, I will have to wait until I can get some more work and then try again.
Damn - will have to post an apology to my wingmen...

Off to a meeting now, but will check back in when I get back to my desk...

ToxicTBag
Joined: 5 Feb 10
Posts: 101
Credit: 57,197,902
RAC: 0
United Kingdom
Message 1004421 - Posted: 15 Jun 2010, 11:21:44 UTC

First, a big thank you to all who are contributing to the thread and aiming for a greater understanding of how to get their own and others' configurations to work with SETI.
My setup is as follows:
Machine 1: 3x GTX 470
Machine 2: 1x GTX 480
Machine 3: 2x GTX 295 and 1x GTX 260
Please correct me if I am wrong. The state of the machines is as follows:
Machines 1 and 2 were running 6.10.56 with opt Lunatics .36 and had not downloaded work in 3 days, along with the 100 quota message.
Machine 3 was running 6.10.56 with opt .36 and had work, but was not receiving new work, along with the 100 message and an app_info error message (though it carried on working).
When the upload servers began working yesterday, machines 1 and 2 uploaded all their work (not a lot). Machine 3 had over 1500 completed WUs and began uploading; after it had uploaded around 1000 of the tasks, I got a message that my password was not valid. I tried a restart more than once, and when it did connect it said I was not attached to any projects. I checked my computers tab in BOINC: it had uploaded a lot of the work but showed a lot as client detached.
That is my situation at present. I would like to ask the following:
On machines 1, 2 and 3, which have now downloaded a small amount of work, should I just install 6.10.56 app with no opt .36, use ReScheduler to catch VLARs etc., and wait and see what happens once the present cache is cleared?
Apologies for the long-winded question, but I have read a lot of threads and have been given a lot of advice, and do not seem to be getting anywhere. I am not very good with editing the app_info and would appreciate any advice the thread has to offer.
Thank you in advance.

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004423 - Posted: 15 Jun 2010, 11:30:45 UTC - in response to Message 1004415.  

The process was meant to be:

1) Stop BOINC
2) ReSchedule tasks
3) Examine client_state
4) Modify app_info
5) Start BOINC

You will lose work if you allow #5 without successfully completing #3 and #4.
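
As a rough illustration of what "successfully completing #3 and #4" could mean, here is a sketch of a pre-flight check to run while BOINC is still stopped. It rests on assumptions that are not confirmed anywhere in this thread: that each `<result>` in client_state.xml carries `<name>`, `<wu_name>`, `<version_num>` and `<plan_class>`, that each `<workunit>` carries `<name>` and `<app_name>`, and that a task survives only if app_info.xml has an `<app_version>` with the same app name, version number and plan class. The function names and the sample data (`wu_a`, `wu_b`) are invented for the example.

```python
import xml.etree.ElementTree as ET


def version_keys(app_info_xml):
    """Collect (app_name, version_num, plan_class) for every <app_version>."""
    root = ET.fromstring(app_info_xml)
    return {
        (av.findtext("app_name"),
         av.findtext("version_num"),
         av.findtext("plan_class", default=""))
        for av in root.iter("app_version")
    }


def orphaned_results(client_state_xml, app_info_xml):
    """Names of results with no matching <app_version> (assumed matching rule)."""
    state = ET.fromstring(client_state_xml)
    # Map each workunit name to its application name.
    wu_app = {wu.findtext("name"): wu.findtext("app_name")
              for wu in state.iter("workunit")}
    known = version_keys(app_info_xml)
    orphans = []
    for res in state.iter("result"):
        key = (wu_app.get(res.findtext("wu_name")),
               res.findtext("version_num"),
               res.findtext("plan_class", default=""))
        if key not in known:
            orphans.append(res.findtext("name"))
    return orphans


# Hypothetical miniature files, not real SETI@home data.
APP_INFO = """<app_info>
<app_version><app_name>setiathome_enhanced</app_name><version_num>603</version_num></app_version>
<app_version><app_name>setiathome_enhanced</app_name><version_num>610</version_num><plan_class>cuda_fermi</plan_class></app_version>
</app_info>"""

CLIENT_STATE = """<client_state>
<workunit><name>wu_a</name><app_name>setiathome_enhanced</app_name></workunit>
<workunit><name>wu_b</name><app_name>setiathome_enhanced</app_name></workunit>
<result><name>wu_a_0</name><wu_name>wu_a</wu_name><version_num>610</version_num><plan_class>cuda_fermi</plan_class></result>
<result><name>wu_b_1</name><wu_name>wu_b</wu_name><version_num>608</version_num><plan_class>cuda</plan_class></result>
</client_state>"""

print(orphaned_results(CLIENT_STATE, APP_INFO))  # ['wu_b_1']
```

If the list is not empty, the cautious reading is to leave BOINC stopped and go back to step 4. Work on a copy of client_state.xml rather than the live file.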

In general, I would urge people not to post app_info in a sticky 'information' thread like this - it just confuses other readers in the future, who may have difficulty picking out the "working" advice from the "faulty" requests for help. Start another thread, which can fade off the first page when the problem is solved, or at a pinch send someone a PM.

And never post the whole of client_state! Apart from the size, it's got security information in it. In this case, all we need is a matched <workunit> / <result> pair for a rescheduled job.
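
For anyone unsure how to pull out just one matched pair without pasting the whole file, a minimal sketch follows. It assumes, as a guess at the file layout rather than a documented fact, that a `<result>` points at its work unit through a `<wu_name>` element that equals the `<name>` of the corresponding `<workunit>`. The function name and the sample names are hypothetical.

```python
import xml.etree.ElementTree as ET


def matched_pair(client_state_xml, result_name):
    """Return (workunit_xml, result_xml) for one result, or None if not found."""
    state = ET.fromstring(client_state_xml)
    for res in state.iter("result"):
        if res.findtext("name") != result_name:
            continue
        wu_name = res.findtext("wu_name")
        for wu in state.iter("workunit"):
            if wu.findtext("name") == wu_name:
                # Serialise only these two elements, nothing else from the file.
                return (ET.tostring(wu, encoding="unicode"),
                        ET.tostring(res, encoding="unicode"))
    return None


# Hypothetical miniature client_state, not real data.
STATE = """<client_state>
<workunit><name>wu_a</name><app_name>setiathome_enhanced</app_name><version_num>603</version_num></workunit>
<result><name>wu_a_0</name><wu_name>wu_a</wu_name><version_num>603</version_num></result>
</client_state>"""

pair = matched_pair(STATE, "wu_a_0")
if pair is not None:
    print(pair[0])
    print(pair[1])
```

Because only the two selected elements are serialised, account keys and other sensitive entries elsewhere in the file stay out of the output.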

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004425 - Posted: 15 Jun 2010, 11:42:12 UTC - in response to Message 1004421.  

On machines 1, 2 and 3, which have now downloaded a small amount of work, should I just install 6.10.56 app with no opt .36, use ReScheduler to catch VLARs etc., and wait and see what happens once the present cache is cleared?

Much of your question relates to the general server problems, which are better covered in the other threads.

But since machines 1 and 2 have fermi cards:

Make sure you have already downloaded both Fermi (v6.10) work and CPU (v6.03) work, so you have the necessary equipment to handle rescheduled tasks. Then, use ReScheduler only to move VLAR work from GPU, to CPU. Don't even attempt to move work in the opposite direction until we get MadMaC's problem sorted out, and we can draw up more general rules for other people to follow. I can't work it out without MadMaC's co-operation, because many of his problems are related to his 64-bit operating system, and I don't have one here to practice on.

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1004438 - Posted: 15 Jun 2010, 13:05:11 UTC - in response to Message 1004423.  
Last modified: 15 Jun 2010, 13:28:00 UTC

In this case, all we need is a matched <workunit> / <result> pair for a rescheduled job.


Is there an easy way to match the ID of a rescheduled WU? One of the reasons I started up BOINC in suspended mode was to see if I could get a workunit ID to search for in the client_state.xml file.

Once I have downloaded some work I will post back here. It might not be until tomorrow, the way the servers are going, though.

Edit: I have some CPU work, so I will suspend BOINC, shut it down, and try to go through the XML file and see what I can find.
I will not start BOINC up at all.

Interesting

Re-scheduler logs

---------------------------
Reschedule version 1.9
Time: 15-06-2010 14:20:10
User testing for a reschedule
CPU tasks: 128 (0 VLAR, 66 VHAR)
GPU tasks: 0 (0 VLAR, 0 VHAR)
Reschedule needed because not enough GPU units
---------------------------
Reschedule version 1.9
Time: 15-06-2010 14:20:17
User forced a reschedule
Boinc applications
setiathome_enhanced 603 windows_intelx86
No SETI cuda application found

And it doesn't move any units. I'm going to look, but I know my Fermi app is there...

ToxicTBag
Joined: 5 Feb 10
Posts: 101
Credit: 57,197,902
RAC: 0
United Kingdom
Message 1004458 - Posted: 15 Jun 2010, 14:48:17 UTC - in response to Message 1004425.  

But since machines 1 and 2 have fermi cards:

Make sure you have already downloaded both Fermi (v6.10) work and CPU (v6.03) work, so you have the necessary equipment to handle rescheduled tasks. Then, use ReScheduler only to move VLAR work from GPU, to CPU. Don't even attempt to move work in the opposite direction until we get MadMaC's problem sorted out, and we can draw up more general rules for other people to follow. I can't work it out without MadMaC's co-operation, because many of his problems are related to his 64-bit operating system, and I don't have one here to practice on.

Thank you for taking the time to reply, Richard; very much appreciated. I will continue to watch carefully for progress on these problems.
TTBag

zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 66199
Credit: 55,293,173
RAC: 49
United States
Message 1004474 - Posted: 15 Jun 2010, 15:55:08 UTC - in response to Message 1004366.  
Last modified: 15 Jun 2010, 16:01:12 UTC

Only 6.10 is safe for Fermi. As to renaming the 6.10 app and its name in the app_info file, I don't see any harm in that, unless one's a Muppet. :D

Can you clarify this please? I'm a muppet!!!

I'm presuming that I rename the 6.10 app to something else and then edit the app_info to reflect the new name? I'm guessing it has to be the 6.08 version? But I don't want to screw up anything else...
The Fermi part of the app_info is a couple of posts below, edited by Claggy.

OK, you've got it mostly right. You can rename the program and its entries within the app_info file, as long as it doesn't match an existing 6.08 or 6.09 filename. For example, you could call it:

MissPiggy_6085.exe

Ok cookie monster? ;)

Oh and the app_info would look sort of like so:

<app_info>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>MissPiggy_6085.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>6085</version_num>
<file_ref>
<file_name>MissPiggy_6085.exe</file_name>
<main_program/>
</file_ref>
</app_version>
</app_info>


Now, I just redid this app_info, as I used my normal app_info as a template. The only lines that are important to alter are as follows:

<name>MissPiggy_6085.exe</name>
<version_num>6085</version_num>
<file_name>MissPiggy_6085.exe</file_name>

After all, you do want it to run. The PC doesn't care what you name it one bit; it could be named Cookie Monster and the PC would accept it.

This is on the PC end. I don't know if the server end has a fixed set of names that it recognizes or not; if it does, then don't do this.
Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.