GUPPI Rescheduler for Linux and Windows - Move GUPPI work to CPU and non-GUPPI to GPU

Message boards : Number crunching : GUPPI Rescheduler for Linux and Windows - Move GUPPI work to CPU and non-GUPPI to GPU

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1809708 - Posted: 16 Aug 2016, 21:41:46 UTC - in response to Message 1809682.  

You just have to go back in and update your <count> value as usual ...

Or, as frequently recommended, put your <gpu_usage> values into an app_config.xml file, and it's all preserved for you.

... and change over to the new command line file. Pretty painless. No ghosts produced across all three machines.

This should be a one-time only change.
ID: 1809708
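For anyone who hasn't made that switch yet: a minimal app_config.xml of the sort Richard recommends looks like the sketch below (illustrative values; the file goes in the project's folder under the BOINC data directory, and a gpu_usage of 0.5 means two tasks per GPU):

```xml
<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <!-- 0.5 = each task claims half a GPU, so two run per card -->
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

BOINC re-reads this on "Read config files" or restart, so the setting survives an app reinstall, which is exactly why it is preferable to editing app_info.xml each time.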
Stephen "Heretic" Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1809728 - Posted: 16 Aug 2016, 22:44:05 UTC - in response to Message 1809657.  

. . I was worried about trying to manually install r3486 or r3500 as I have not done this before. But now that Lunatics Beta4 is out with r3450 I will be keen to try it out, but maybe not in the middle of the WOW event :)

It's been updated.  See Message 1809279.


. . Yes I know, that last version number was a typo, it was meant to say r3500, and I have it running now. I got swamped with Guppies and decided I may as well dive in :).

. . First impressions with only defaults running and one WU at a time are:

o . . Non-VLAR WUs take about 8 to 9 mins (compared to about 7 mins/WU with r3430 running foursies)

o . . Guppies take about 16 to 20 mins (compared to the above settings, where they took about 12 to 15 mins). These are favourable comparisons, and if I remember rightly, r3430 took slightly longer than this in default mode.

o . . There is a definite improvement in GPU utilisation; running singles still uses the GPU fairly well. Non-VLARs use approx. 80%-90% of the GPU for the first half, then drop to about 40%-50% for the remaining runtime. Guppies show more consistent GPU use overall but run at 100% CPU.

o . . This is on a Pentium-D930 rig with 2 GTX970 cards.

o . . This rig takes a long time to load each new task, so there is a 1 to 2 min lag at the start of each WU. When you allow for this, the runtimes are even better and look like about 7/8 mins non-VLAR and 14/18 mins Guppies. Seti was down for maintenance so I could not review the suggested command line options, but I will be implementing them today.

o . . So far it is looking fairly good, but not quite a paradigm shift. It will be interesting when I find the right settings.
ID: 1809728
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1809729 - Posted: 16 Aug 2016, 22:45:32 UTC - in response to Message 1809708.  

You just have to go back in and update your <count> value as usual ...

Or, as frequently recommended, put your <gpu_usage> values into an app_config.xml file, and it's all preserved for you.

... and change over to the new command line file. Pretty painless. No ghosts produced across all three machines.

This should be a one-time only change.

I have had the gpu_usage value in my app_config for a while now. However, somewhere along the way, maybe 0.44 or Beta 3, the GPUs fell back to 1 task per card with the default app_info. Had me very confused until I went back to the old method of modifying <count> to get my normal two tasks per card running correctly. I don't know what has changed. App_config used to work with the default app_info.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1809729
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1809732 - Posted: 16 Aug 2016, 22:47:52 UTC - in response to Message 1809705.  

I've taken to just placing the commandline into my app_config.xml that Richard mentioned some time back.

Use it on all my machines and haven't had any problems

I wouldn't mind trying that myself ..... if I could get app_config working correctly again as it used to.
ID: 1809732
Stephen "Heretic" Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1809733 - Posted: 16 Aug 2016, 22:49:15 UTC - in response to Message 1809682.  


. . I was worried about trying to manually install r3486 or r3500 as I have not done this before. But now that Lunatics Beta4 is out with r3450 I will be keen to try it out, but maybe not in the middle of the WOW event :)

I updated to Lunatics Beta4 yesterday, in the middle of the WOW contest. No problems seen, EXCEPT, you have to manually copy your mb_cmdline_win_x86_SSE3_OpenCL_NV.txt parameter settings into the newly named mb_cmdline_win_x86_SSE3_OpenCL_NV_SoG.txt file. Raistmer changed the text file name in the R3500 aistub file which builds the new app_info. You just have to go back in and update your <count> value as usual and change over to the new command line file. Pretty painless. No ghosts produced across all three machines.


. . I did too. I got swamped with a zillion BLC5 guppies, so I knew the output would be down anyway, and I dove in. What is this "count" value? I just left it at the new defaults.
ID: 1809733
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1809734 - Posted: 16 Aug 2016, 22:49:35 UTC - in response to Message 1809729.  

Post your app_config.xml so we can look at it and see if we can spot the problem.
ID: 1809734
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1809736 - Posted: 16 Aug 2016, 22:52:01 UTC - in response to Message 1809734.  

Post your app_config.xml so we can look at it and see if we can spot the problem.

<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>.50</gpu_usage>
      <cpu_usage>.20</cpu_usage>
    </gpu_versions>
  </app>

  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

ID: 1809736
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1809737 - Posted: 16 Aug 2016, 22:53:42 UTC - in response to Message 1809734.  

Keith,

Have a look at mine

<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>replace with your values</cmdline>
  </app_version>
   <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>0.35</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp</cmdline>
  </app_version>
<project_max_concurrent>12</project_max_concurrent>
</app_config>



Zalster
ID: 1809737
Stephen "Heretic" Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1809738 - Posted: 16 Aug 2016, 22:57:00 UTC - in response to Message 1809734.  

Post your app_config.xml so we can look at it and see if we can spot the problem.


. . I am surprised he isn't getting warning notices if his app_config.xml is not strictly correct.
ID: 1809738
Stephen "Heretic" Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1809741 - Posted: 16 Aug 2016, 22:59:26 UTC - in response to Message 1809736.  

Post your app_config.xml so we can look at it and see if we can spot the problem.

<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>.50</gpu_usage>
      <cpu_usage>.20</cpu_usage>
    </gpu_versions>
  </app>

  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>


. . I think your .50 should be 0.5, and likewise the .20 should be 0.2, as in your AP section.
ID: 1809741
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1809743 - Posted: 16 Aug 2016, 23:07:14 UTC - in response to Message 1809738.  

Post your app_config.xml so we can look at it and see if we can spot the problem.


. . I am surprised he isn't getting warning notices if his app_config.xml is not strictly correct.

Nope, never have got warning notices on startup. So it must be within the boundaries of what is normally expected.
ID: 1809743
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1809745 - Posted: 16 Aug 2016, 23:12:53 UTC - in response to Message 1809741.  

. . i think your .50 should be 0.5 and same for the .20 should be 0.2 as with your AP section.

The BOINC XML parser is pretty good at interpreting all numeric formats these days. I think it was given a thorough workout when there was a fashion for populating the <flops> line, and people tried some wild variations of scientific (exponential) notation. They were all interpreted correctly.
ID: 1809745
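Richard's point is easy to check. A quick sketch (assuming BOINC's C++ parser follows strtod-style conversion, which Python's float also follows): all the spellings in question parse to the same number.

```python
# Spellings a user might put in app_config.xml; leading-zero style and
# exponential notation all produce the same value under strtod-style parsing.
forms = [".50", "0.50", "0.5", "5e-1"]
parsed = [float(s) for s in forms]
print(parsed)  # four identical values
```

So whether you write .50 or 0.5 should make no functional difference to the client.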
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1809746 - Posted: 16 Aug 2016, 23:14:58 UTC - in response to Message 1809743.  

Post your app_config.xml so we can look at it and see if we can spot the problem.


. . I am surprised he isn't getting warning notices if his app_config.xml is not strictly correct.

Nope, never have got warning notices on startup. So it must be within the boundaries of what is normally expected.

So, what exactly is the problem you're having? Is it functional (the wrong number of tasks running), or presentational (the wrong fractional count being displayed in BOINC Manager)?
ID: 1809746
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1809748 - Posted: 16 Aug 2016, 23:23:29 UTC - in response to Message 1809651.  

Raistmer,
I hope some of the ideas they have been floating around help you in your work.
(It is a shame Windows doesn't have the same high frequency timers that some other o/s have)

Unfortunately, the supposedly 100 ns-granularity method (https://gist.github.com/Youka/4153f12cf2e17a77314c) gives only the same 1 ms resolution on my netbook.
So, still nothing better than Sleep(1) with the high-precision timer enabled.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1809748
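Raistmer's 1 ms floor is easy to measure from any language; this sketch (Python purely for illustration, the apps themselves being C++/OpenCL) times a batch of nominal 1 ms sleeps to estimate the granularity a platform actually delivers:

```python
import time

# Average a batch of nominal 1 ms sleeps; on most desktop OSes the result
# lands at or somewhat above 1 ms, never below the requested duration.
samples = []
for _ in range(50):
    t0 = time.perf_counter()
    time.sleep(0.001)
    samples.append(time.perf_counter() - t0)

avg_ms = 1000 * sum(samples) / len(samples)
print(f"nominal 1 ms sleep averaged {avg_ms:.3f} ms")
```

If the average comes back well above 1 ms, the platform's scheduler tick, not the sleep API, is the limiting factor, which matches what Raistmer sees with the gist's waitable-timer approach.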
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1809750 - Posted: 16 Aug 2016, 23:27:47 UTC - in response to Message 1809746.  

Post your app_config.xml so we can look at it and see if we can spot the problem.


. . I am surprised he isn't getting warning notices if his app_config.xml is not strictly correct.

Nope, never have got warning notices on startup. So it must be within the boundaries of what is normally expected.

So, what exactly is the problem you're having? Is it functional (the wrong number of tasks running), or presentational (the wrong fractional count being displayed in BOINC Manager)?

It is functional. If I don't modify the <count>1</count> default line in app_info that the latest Lunatics installers write, I end up only running one GPU task per card. I have the gpu_usage set for 0.5 in app_config. It just started to ignore that value somewhere along 0.44 or Beta3. I never ran the earlier betas, just jumped from stock 0.44 to Beta3.
ID: 1809750
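For readers following the <count> discussion: in the app_info.xml that the Lunatics installer writes, the value lives in each GPU <app_version>'s <coproc> block. The fragment below is illustrative only (other elements elided; the coprocessor type name may differ between CUDA and OpenCL builds):

```xml
<app_version>
    <app_name>setiathome_v8</app_name>
    ...
    <coproc>
        <type>NVIDIA</type>
        <!-- 0.5 = each task reserves half a GPU, giving two tasks per card;
             the installer's default is 1 -->
        <count>0.5</count>
    </coproc>
</app_version>
```

This is the edit that has to be redone after every reinstall, which is what the app_config.xml approach is meant to avoid.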
Profile Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1809752 - Posted: 16 Aug 2016, 23:33:44 UTC - in response to Message 1809748.  

Raistmer,
I hope some of the ideas they have been floating around help you in your work.
(It is a shame Windows doesn't have the same high frequency timers that some other o/s have)

Unfortunately, supposed to be 100ns-granularity method (https://gist.github.com/Youka/4153f12cf2e17a77314c) gives same 1ms only resolution on my netbook.
So, still nothing better than Sleep(1) with high-prec timer enabled.


Late to the party but have you tried SwitchToThread()? We use a combination of _mm_pause and STT for some of the more violent synchronization stuff.
ID: 1809752
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1809756 - Posted: 16 Aug 2016, 23:53:03 UTC - in response to Message 1809750.  

Post your app_config.xml so we can look at it and see if we can spot the problem.


. . I am surprised he isn't getting warning notices if his app_config.xml is not strictly correct.

Nope, never have got warning notices on startup. So it must be within the boundaries of what is normally expected.

So, what exactly is the problem you're having? Is it functional (the wrong number of tasks running), or presentational (the wrong fractional count being displayed in BOINC Manager)?

It is functional. If I don't modify the <count>1</count> default line in app_info that the latest Lunatics installers write, I end up only running one GPU task per card. I have the gpu_usage set for 0.5 in app_config. It just started to ignore that value somewhere along 0.44 or Beta3. I never ran the earlier betas, just jumped from stock 0.44 to Beta3.

Hmmm. I'm going to have to sleep on that one, and maybe squint at some <cpu_sched_debug> logs in the morning. I was planning to build and test David's latest bugfix, anyway.
ID: 1809756
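For anyone wanting to watch the same thing Richard will, the scheduler trace comes from a log flag in cc_config.xml in the BOINC data directory (load it via Options → Read config files, or restart the client):

```xml
<cc_config>
  <log_flags>
    <!-- logs each scheduler pass: which tasks run, which wait, and why -->
    <cpu_sched_debug>1</cpu_sched_debug>
  </log_flags>
</cc_config>
```

The resulting Event Log entries show whether the client is honouring the 0.5 GPU usage or falling back to whole-card scheduling.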
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1809766 - Posted: 17 Aug 2016, 0:34:06 UTC - in response to Message 1809737.  

Keith,

Have a look at mine

<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>replace with your values</cmdline>
  </app_version>
   <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>0.35</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp</cmdline>
  </app_version>
<project_max_concurrent>12</project_max_concurrent>
</app_config>



Zalster
Hi Zalster, I got brave and tried your app_config with my parameters. The good thing is I am back to running two GPU tasks per card. The bad thing is that I am no longer running CPU tasks. I looked at the app_config documentation and it all seems to apply to GPU configuration. I am not sure how my original app_config worked for both CPU and GPU tasks without it specifically calling out CPU work. I am obviously missing something.

Can anyone pick out why my new app_config doesn't run my CPU tasks anymore?
<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>0.5</ngpus>
    <cmdline>-sbs 512 -instance_per_device 2 -period_iterations_num 5 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -tt 60</cmdline>
  </app_version>
   <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>0.35</avg_ncpus>
    <ngpus>0.5</ngpus>
    <cmdline>-unroll 18 -oclFFT_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 8 1 -tune 2 64 8 1</cmdline>
  </app_version>
<project_max_concurrent>12</project_max_concurrent>
</app_config>

ID: 1809766
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1809769 - Posted: 17 Aug 2016, 0:44:50 UTC - in response to Message 1809766.  

Keith,

I'm assuming you aren't running more than 2 GPUs per machine, correct?

Try removing the <project_max_concurrent>12</project_max_concurrent> line, then have BOINC re-read the config files and see if the CPUs pick back up.

It really shouldn't affect your CPU work units, as it's tailored to the GPU, but it will affect the total overall number running at any one time on your machine. In this case that should be 4 GPU work units and maybe 4 CPU, unless you specified a different value in your preferences for how many cores to use.
ID: 1809769
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1809770 - Posted: 17 Aug 2016, 0:52:23 UTC - in response to Message 1809769.  

Keith,

I'm assuming you aren't running more than 2 GPUs per machine correct.

Try removing the <project_max_concurrent>12</project_max_concurrent> and then have boinc reread the config files and see if the CPUs pick back up

It's really shouldn't affect your CPU work units as it's tailored to the GPU but it will affect the total overall number at any one time on your machine, in this case, it should be 4 GPU work units and maybe 4 CPU unless you specified in your preferences a different value of how many cores to use

No, just 2 GPUs per machine. I have my preferences set to use only 49% of my CPU cores, to preserve enough cores to feed each GPU task with one CPU core. That puts 7 cores to work, with one core reserved for general desktop maintenance: 3 cores doing a CPU task each and 4 cores feeding the 4 GPU tasks.
ID: 1809770


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.