Running SETI@home on an nVidia Fermi GPU

Message boards : Number crunching : Running SETI@home on an nVidia Fermi GPU

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1004559 - Posted: 16 Jun 2010, 7:59:50 UTC - in response to Message 1004474.  
Last modified: 16 Jun 2010, 8:06:34 UTC

Thanks for the clarification on this VW - will keep this as plan B

I'm going to work with Richard to see if we can get ReScheduler working with Fermi, so that we can come up with a process for doing this. I know I'm not the only one who needs it working, as the normal scheduler does not work that well for Fermi yet.

I have shut down BOINC and used ReScheduler to move 6 tasks across, but I'm having trouble locating those 6 tasks in the client_state.xml file.
What field am I looking for?
I can see 4 active tasks and lots of results, and I'm guessing that I want to look at the workunit field and version number?

I can see an app_version field which gives the relevant info on the file name:

<file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>

Then a section below which is a mirror of my app_info...

Then I get to the workunit section - sample below:

<workunit>
<name>05no09ag.3059.6207.13.10.168</name>
<app_name>setiathome_enhanced</app_name> I'm guessing that this is where the error is
<version_num>610</version_num>
<rsc_fpops_est>157714230303466.000000</rsc_fpops_est>
<rsc_fpops_bound>1577142303034660.000000</rsc_fpops_bound>
<rsc_memory_bound>33554432.000000</rsc_memory_bound>
<rsc_disk_bound>33554432.000000</rsc_disk_bound>
<file_ref>
<file_name>05no09ag.3059.6207.13.10.168</file_name>
<open_name>work_unit.sah</open_name>
</file_ref>
</workunit>

Then I get down to the result field

<result>
<name>05no09ag.19879.25021.12.10.248_0</name>
<final_cpu_time>0.000000</final_cpu_time>
<final_elapsed_time>0.000000</final_elapsed_time>
<exit_status>0</exit_status>
<state>2</state>
<platform>windows_intelx86</platform>
<version_num>610</version_num>
<plan_class>cuda</plan_class>
<wu_name>05no09ag.19879.25021.12.10.248</wu_name>
<report_deadline>1280546919.000000</report_deadline>
<received_time>1276603771.589111</received_time>
<file_ref>
<file_name>05no09ag.19879.25021.12.10.248_0_0</file_name>
<open_name>result.sah</open_name>
</file_ref>
</result>

That's all I can find.

Can anyone help with anything else I should be looking for?
ID: 1004559

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004575 - Posted: 16 Jun 2010, 8:45:43 UTC - in response to Message 1004474.  

I was just in the middle of typing an answer to this when maintenance hit, 30 minutes before I was expecting it. They were in a hurry to get started! (And I was never so relieved to see maintenance start - went straight out for a long walk in the sunshine. I needed that!)

Now I just redid this app_info. As I used my normal app_info as a template, the only lines that are important to alter are as follows:

<name>MissPiggy_608.exe</name>
<version_num>6085</version_num>
<file_name>MissPiggy_6085.exe</file_name>

After all, you do want it to run. The PC doesn't care one bit what you name it: it could be named Cookie Monster and the PC would accept it.

This is on the PC end. I don't know if the server end has a fixed set of names that it recognizes or not. If it does, then don't do this.

That's absolutely right - the name doesn't matter. But I'd still strongly urge people not to rename MissPiggy_608.exe to KermitTheFrog_610.exe - that would cause endless confusion for us (and them!) if you/they ever need help/counselling....

I was going on to talk about the significance of <plan_class>, but I see MadMaC's post answers my next question, so I'll move on to that one.
ID: 1004575

Leopoldo
Volunteer tester
Joined: 4 Aug 99
Posts: 102
Credit: 3,051,091
RAC: 0
Russia
Message 1004577 - Posted: 16 Jun 2010, 8:57:01 UTC - in response to Message 1004559.  

MadMaC,

<app_name>setiathome_enhanced</app_name> I'm guessing that this is where the error is
No, this line of the <workunit> section is correct. It's a multibeam workunit.

<version_num>610</version_num>
<plan_class>cuda</plan_class>
Here, for the Fermi-oriented app 6.10, it must be

<plan_class>cuda_fermi</plan_class>

ReScheduler 1.9 doesn't know about the new version 6.10 or this new plan_class - when it was compiled, these didn't exist yet.
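
For anyone who would rather not hunt through the file by eye, here is a minimal sketch in Python of how the affected tasks could be listed. It is not part of ReScheduler or BOINC: the function names and the cut-down sample are made up for illustration, and it assumes the <result> blocks look like the one MadMaC posted. It reports results that carry version 610 but do not have the cuda_fermi plan_class.

```python
import re

# Matches one <result> ... </result> block, laid out as in the post above.
RESULT_BLOCK = re.compile(r"<result>.*?</result>", re.DOTALL)


def tag_value(block, tag):
    """Return the text inside <tag>...</tag> within block, or None if absent."""
    pattern = r"<{0}>(.*?)</{0}>".format(re.escape(tag))
    match = re.search(pattern, block, re.DOTALL)
    return match.group(1).strip() if match else None


def find_mismatched_results(state_text, version="610", wanted_class="cuda_fermi"):
    """List names of results with the given version_num but another plan_class.

    A result with no <plan_class> tag at all also counts as a mismatch.
    """
    mismatched = []
    for block in RESULT_BLOCK.findall(state_text):
        if tag_value(block, "version_num") != version:
            continue
        if tag_value(block, "plan_class") != wanted_class:
            mismatched.append(tag_value(block, "name"))
    return mismatched


# Hypothetical, cut-down sample; a real client_state.xml has many more fields.
SAMPLE = """
<result>
<name>05no09ag.19879.25021.12.10.248_0</name>
<version_num>610</version_num>
<plan_class>cuda</plan_class>
</result>
<result>
<name>example_cpu_task_0</name>
<version_num>603</version_num>
</result>
"""

print(find_mismatched_results(SAMPLE))
```

The sketch only reads text and changes nothing, so it can be run against a copy of client_state.xml.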
ID: 1004577

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004584 - Posted: 16 Jun 2010, 9:13:03 UTC - in response to Message 1004559.

<app_name>setiathome_enhanced</app_name> I'm guessing that this is where the error is

No, that's not it - leave that one well alone. Every current workunit will be either "setiathome_enhanced" or "astropulse_v505" - no other choice.

The problem is actually in the "result" section:

<result>
<name>05no09ag.19879.25021.12.10.248_0</name>
<final_cpu_time>0.000000</final_cpu_time>
<final_elapsed_time>0.000000</final_elapsed_time>
<exit_status>0</exit_status>
<state>2</state>
<platform>windows_intelx86</platform>
<version_num>610</version_num>
<plan_class>cuda</plan_class>
<wu_name>05no09ag.19879.25021.12.10.248</wu_name>
<report_deadline>1280546919.000000</report_deadline>
<received_time>1276603771.589111</received_time>
<file_ref>
<file_name>05no09ag.19879.25021.12.10.248_0_0</file_name>
<open_name>result.sah</open_name>
</file_ref>
</result>

That's the one I was going to talk to Victor about - the extra important line, apart from <version_num>, involved in re-branding.

You have obviously set up ReScheduler to use version 610 for the re-branded (new) CUDA tasks - which is right. But (so far as I know) Marius hasn't provided any mechanism for modifying the <plan_class>.

One work-round would be to do a global search-and-replace on client_state.xml - or at least the SETI@home part of it (don't change any sections belonging to other projects!). Usual warnings: take backups, have BOINC fully shut down, use Notepad or another plain-text editor only.

The change would be: Replace

<plan_class>cuda</plan_class>

with

<plan_class>cuda_fermi</plan_class>

But I don't recommend it: you would have to repeat the change every time you re-scheduled work from CPU to GPU.

Far better to modify your app_info.

Look back to post #2. Copy the whole first section (<app_version> ... </app_version>) again, and paste it as a whole duplicate section into app_info (keep it inside the <app_info> ... </app_info> tags, but clear of everything else).

Then, in one of the duplicate app_version blocks, change <plan_class>cuda_fermi</plan_class> to <plan_class>cuda</plan_class>. That should give you the necessary tool for processing the results modified by ReScheduler. Let us know how you get on.
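
For completeness, the search-and-replace work-round described above could be sketched in Python roughly as follows. This is an illustration under assumptions, not a tested tool: the function names are invented, it only touches <result> blocks that carry <version_num>610</version_num>, and it does not check which project a result belongs to, so it is only safe if no other project on the machine uses the same version and plan_class pair. BOINC must be fully shut down before the file is rewritten.

```python
import re
import shutil
from pathlib import Path

OLD = "<plan_class>cuda</plan_class>"
NEW = "<plan_class>cuda_fermi</plan_class>"
RESULT_BLOCK = re.compile(r"<result>.*?</result>", re.DOTALL)


def rebrand_text(state_text, version="610"):
    """Swap the plan_class, but only inside <result> blocks with this version_num."""
    version_tag = "<version_num>{}</version_num>".format(version)

    def fix_block(match):
        block = match.group(0)
        if version_tag in block:
            return block.replace(OLD, NEW)
        return block  # leave every other result untouched

    return RESULT_BLOCK.sub(fix_block, state_text)


def rebrand_file(path):
    """Back up the file, then rewrite it in place. Run only with BOINC shut down."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))  # backup first
    original = path.read_text()
    path.write_text(rebrand_text(original))


# Hypothetical two-result sample to show the effect on text only.
DEMO = (
    "<result>\n<version_num>610</version_num>\n" + OLD + "\n</result>\n"
    "<result>\n<version_num>608</version_num>\n" + OLD + "\n</result>\n"
)
print(rebrand_text(DEMO))
```

As noted above, this would have to be repeated after every re-schedule, which is why modifying app_info is the better route.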
ID: 1004584

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1004586 - Posted: 16 Jun 2010, 9:34:13 UTC
Last modified: 16 Jun 2010, 9:40:22 UTC

OK - this failed, as BOINC reported errors:

16/06/2010 10:26:56 SETI@home [error] No application found for task: windows_intelx86 610 cuda; discarding
16/06/2010 10:26:56 SETI@home [error] No application found for task: windows_intelx86 610 cuda; discarding
16/06/2010 10:26:56 SETI@home [error] No application found for task: windows_intelx86 610 cuda; discarding
16/06/2010 10:26:56 SETI@home [error] No application found for task: windows_intelx86 610 cuda; discarding
16/06/2010 10:26:56 SETI@home [error] No application found for task: windows_intelx86 610 cuda; discarding
16/06/2010 10:26:56 SETI@home [error] No application found for task: windows_intelx86 610 cuda; discarding
16/06/2010 10:26:56 SETI@home URL http://setiathome.berkeley.edu/; Computer ID 5424775; resource share 100
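
When a log contains many of these lines, it can help to summarise which platform, version and plan_class combinations are being discarded. The following Python sketch does that. It assumes the message wording is exactly as in the lines above; the function name and the sample text are illustrative only.

```python
import re
from collections import Counter

# Pattern for the message text as it appears in the log lines above.
NO_APP = re.compile(r"No application found for task: (\S+) (\d+) (\S*); discarding")


def missing_apps(log_text):
    """Count discarded tasks per (platform, version_num, plan_class) triple."""
    counts = Counter()
    for line in log_text.splitlines():
        match = NO_APP.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


# Two of the lines from the log above, plus one unrelated line.
LOG = """\
16/06/2010 10:26:56 SETI@home [error] No application found for task: windows_intelx86 610 cuda; discarding
16/06/2010 10:26:56 SETI@home [error] No application found for task: windows_intelx86 610 cuda; discarding
16/06/2010 10:26:56 SETI@home URL http://setiathome.berkeley.edu/; Computer ID 5424775; resource share 100
"""

for triple, count in missing_apps(LOG).items():
    print(count, "task(s) with no matching app_version:", triple)
```

Each triple printed is a combination that app_info needs an <app_version> block for.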


The last section is the duplicated one that I added. I'm guessing that I did it wrong.
Everything was inside the app_info, and I have checked to make sure that setiathome_6.10_windows_intelx86__cuda_fermi.exe is where it should be.

<app_info>
<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>AK_v8b_win_SSE3_AMD.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>603</version_num>
<platform>windows_intelx86</platform>
<file_ref>
<file_name>AK_v8b_win_SSE3_AMD.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>603</version_num>
<platform>windows_x86_64</platform>
<file_ref>
<file_name>AK_v8b_win_SSE3_AMD.exe</file_name>
<main_program/>
</file_ref>
</app_version>

<app>
<name>astropulse_v505</name>
</app>
<file_info>
<name>ap_5.05r409_SSE.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>astropulse_v505</app_name>
<version_num>505</version_num>
<platform>windows_intelx86</platform>
<file_ref>
<file_name>ap_5.05r409_SSE.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app_version>
<app_name>astropulse_v505</app_name>
<version_num>505</version_num>
<platform>windows_x86_64</platform>
<file_ref>
<file_name>ap_5.05r409_SSE.exe</file_name>
<main_program/>
</file_ref>
</app_version>

<app>
<name>setiathome_enhanced</name>
</app>
<file_info>
<name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</name>
<executable/>
</file_info>
<file_info>
<name>cudart32_30_14.dll</name>
<executable/>
</file_info>
<file_info>
<name>cufft32_30_14.dll</name>
<executable/>
</file_info>
<file_info>
<name>libfftw3f-3-1-1a_upx.dll</name>
<executable/>
</file_info>

<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>610</version_num>
<avg_ncpus>0.200000</avg_ncpus>
<max_ncpus>0.200000</max_ncpus>
<platform>windows_intelx86</platform>
<flops>57462450464</flops>
<plan_class>cuda_fermi</plan_class>
<file_ref>
<file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
</app_version>

<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>610</version_num>
<avg_ncpus>0.200000</avg_ncpus>
<max_ncpus>0.200000</max_ncpus>
<platform>windows_x86_64</platform>
<flops>57462450464</flops>
<plan_class>cuda_fermi</plan_class>
<file_ref>
<file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
</app_version>

<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>610</version_num>
<avg_ncpus>0.200000</avg_ncpus>
<max_ncpus>0.200000</max_ncpus>
<plan_class>cuda</plan_class>
<file_ref>
<file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>cufft32_30_14.dll</file_name>
</file_ref>
<file_ref>
<file_name>libfftw3f-3-1-1a_upx.dll</file_name>
</file_ref>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
</app_version>

</app_info>
ID: 1004586

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004588 - Posted: 16 Jun 2010, 9:43:22 UTC - in response to Message 1004586.  

OK - think I have done this right...

The last section is the duplicated one...

Ah - I'd forgotten you'd already modified the file with those <platform> tags for 64-bit Windows.

Look at the app_info you've posted, and try to get used to seeing the 'shape' of the file: separate sections, each with a <tag> ... </tag> enclosing it.

You've now got three <app_version> sections. You can probably get away with that, as the sample (rebranded) <result> you posted had the vital clue:

<platform>windows_intelx86</platform>

I suggest you add that to the new (third) <app_version> block, so it matches the first one. And then check again for any other differences I've missed, before you run it!
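
The "check again for any other differences" step can be sketched in Python as a comparison of the three values from the error message against each <app_version> block. This is a simplified model with invented function names: it treats a missing tag as an empty string and demands an exact match, which may not be exactly how the BOINC client resolves a missing <platform>. The sample is a cut-down, hypothetical app_info with the same shape as the one posted.

```python
import xml.etree.ElementTree as ET


def app_version_keys(app_info_text):
    """Collect (platform, version_num, plan_class) for each <app_version>.

    Missing tags are recorded as an empty string.
    """
    root = ET.fromstring(app_info_text.strip())
    keys = []
    for app_version in root.iter("app_version"):
        keys.append((
            app_version.findtext("platform", default="").strip(),
            app_version.findtext("version_num", default="").strip(),
            app_version.findtext("plan_class", default="").strip(),
        ))
    return keys


def has_match(app_info_text, platform, version_num, plan_class):
    """True if some <app_version> carries exactly this triple."""
    return (platform, version_num, plan_class) in app_version_keys(app_info_text)


# Cut-down sample: the second block has no <platform>, like the added one.
SAMPLE = """
<app_info>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>610</version_num>
<platform>windows_intelx86</platform>
<plan_class>cuda_fermi</plan_class>
</app_version>
<app_version>
<app_name>setiathome_enhanced</app_name>
<version_num>610</version_num>
<plan_class>cuda</plan_class>
</app_version>
</app_info>
"""

print(app_version_keys(SAMPLE))
print(has_match(SAMPLE, "windows_intelx86", "610", "cuda"))
```

Under this strict model the rebranded tasks (windows_intelx86, 610, cuda) find no match until the <platform> line is added to the block with plan_class cuda.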
ID: 1004588

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1004591 - Posted: 16 Jun 2010, 9:54:14 UTC - in response to Message 1004588.  

OK, thanks - that seems to have worked.
I will post my app_info in a separate thread once I have tidied it up based on your recommendations.

Thank you so much for your help..
ID: 1004591

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004597 - Posted: 16 Jun 2010, 10:36:04 UTC - in response to Message 1004591.  

Sounding good. I recommend you take it one step at a time. Assuming you're crunching good work (sensible run times), sit back and take a breather. Wait for the servers to come back up, so you can report the work and - crucially - test that you can still download new work.

If that's OK, you can just put back the <flops> correction, and the job's finished.
ID: 1004597

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1004683 - Posted: 16 Jun 2010, 15:46:34 UTC

Sorry for being off-topic.

I know the ReSchedule V1.9 tool doesn't work with SETI@home Beta Test.

But is there a way to get it running there as well, with some mods?
Maybe two separate instances of the ReSchedule tool, with different installation paths?
And mods of the .xml files?
Yes, maybe a lot of manual work, but worth doing.

ID: 1004683

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004695 - Posted: 16 Jun 2010, 16:20:32 UTC - in response to Message 1004683.  

Please read the second paragraph of message 1004402, further down this thread.
ID: 1004695

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1004728 - Posted: 16 Jun 2010, 17:57:55 UTC - in response to Message 1004695.  

I thought maybe you had tested it in the past.

Since my account was deleted over at the Lunatics forum, I have no chance to get in contact with Marius.

Do you know his SETI@home account number?
SETI@home has a lot of Mariuses.

ID: 1004728

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004737 - Posted: 16 Jun 2010, 18:09:03 UTC - in response to Message 1004728.  

As it happens, I have never even downloaded Marius's program, but I can read the posts about it.

If Marius has time to revise the program, then I am sure he will have time to read here as well, and will see your suggestion.
ID: 1004737

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1004740 - Posted: 16 Jun 2010, 18:21:11 UTC - in response to Message 1004737.  

How do you protect your GPUs from VLAR WUs?

ID: 1004740

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004749 - Posted: 16 Jun 2010, 18:41:18 UTC - in response to Message 1004740.  

How do you protect your GPUs from VLAR WUs?

I use a vbs script which Fred W originally developed, and which I have subsequently tweaked and enhanced to my own satisfaction. Because it's a plain-text file, I have full control over it, and I can customise it to suit changing circumstances - such as the arrival of Fermi.
ID: 1004749

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1004774 - Posted: 16 Jun 2010, 19:49:37 UTC - in response to Message 1004749.  

And could you make a tool out of it (like the ReSchedule tool), for dummies like me, which can rewrite MAIN and BETA WUs? :-D

ID: 1004774

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1004779 - Posted: 16 Jun 2010, 19:57:04 UTC - in response to Message 1004774.  

And could you make a tool out of it (like the ReSchedule tool), for dummies like me, which can rewrite MAIN and BETA WUs? :-D

No, I'm not going to do that, because from my life outside SETI and BOINC, I know that if you develop a tool and supply it to others, you have a duty of care to maintain and support it - and if you don't, people keep pestering you with an expectation that you will.

To me, SETI / BOINC is a hobby and a pleasure. The moment it becomes a duty, the fun goes out of it. I made that mistake once, when I agreed (under pressure) to become a Mod. That experience was so unpleasant I nearly left BOINC altogether: I was coaxed back by friendly emails from other projects, not this one.

I'm not going to make that mistake again.
ID: 1004779

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1639
Credit: 12,921,799
RAC: 89
New Zealand
Message 1005168 - Posted: 17 Jun 2010, 9:25:24 UTC

I understand that Richard doesn't want to write a ReSchedule tool. Does anyone know a way to shift Fermi 6.10 tasks over to the CPU?
Thank you Richard for all you have done for Seti/Boinc
ID: 1005168

MadMaC
Volunteer tester
Joined: 4 Apr 01
Posts: 201
Credit: 47,158,217
RAC: 0
United Kingdom
Message 1005227 - Posted: 17 Jun 2010, 12:45:45 UTC

If you read through the top half of this thread, you will see the process that Richard guided me through to get mine working..
ID: 1005227

TheFreshPrince a.k.a. BlueTooth76
Joined: 4 Jun 99
Posts: 210
Credit: 10,315,944
RAC: 0
Netherlands
Message 1005338 - Posted: 17 Jun 2010, 17:45:24 UTC

Is there anyone who got work for his Fermi?
And (if yes) what platform do you use?

Windows x64, GTX 470, Optimized apps and 610 executable, no work...
ID: 1005338

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1005341 - Posted: 17 Jun 2010, 18:01:41 UTC - in response to Message 1005338.  

All tasks for computer 4292666

Automatic download while I was asleep. 32-bit Windows XP, GTX 470, optimised apps + stock v6.10 exe, no platform tag in app_info.

Unfortunately, they're all VLAR like task 1635955802, and my rebranding script didn't pick them up. Where's that support guy when you need him........

........ah, I see. My script works on the distinctive <rsc_fpops_est> signature of VLARs, to save the overhead of opening and reading each actual data file - which means it works during downloading, unlike ReScheduler.

Er, worked. The new server-side DCF adjuster fiddles with <rsc_fpops_est> ...... WAAAAAAARRRGGGHHHH! Berkeley Broke My Script! Support!
ID: 1005341