Anything relating to AstroPulse tasks

Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1700921 - Posted: 12 Jul 2015, 23:33:20 UTC - in response to Message 1700919.  

One advantage of having a slow rig and then a very slow one is that I get a full 10 day cache on each.

I could only wish.

Cheers.
ID: 1700921
Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1700940 - Posted: 13 Jul 2015, 2:05:20 UTC - in response to Message 1700921.  

Once the APs finish splitting, my cache lasts about a day and a half, then it's back to MBs.
ID: 1700940
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1700958 - Posted: 13 Jul 2015, 4:12:43 UTC

Hi Folks,
Just recently I had a problem with a misconfigured app_info file :-(
As a result, BOINC promptly deleted my entire cache of WUs, including 1 x AP :-/
Now the other side of this misery is that when BOINC decides to bin an entire cache, it doesn't bother to tell the servers. So I guess that anyone who was also working on those WUs is going to wait quite a time before the S@h servers decide to reissue them.

Maybe it would be an idea for the devs to consider having the cache trasher [BOINC] actually report to the servers that it has deleted WUs, and which ones it has binned.
Anyone consider this a reasonable idea? or not? if not why not?

Over to you folks:-)

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1700958
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1700967 - Posted: 13 Jul 2015, 4:48:40 UTC - in response to Message 1700958.  

I am sorry to hear that happened. The feature you described would be rather handy. If you want to release your tasks back to the grid before the return date, the easiest thing I find to do in this situation (when resending of lost tasks is turned off) is: set No New Tasks, empty the cache of work, then detach from the project and reattach. If you are using the Lunatics apps you will need to reinstall them afterwards.
ID: 1700967
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1700968 - Posted: 13 Jul 2015, 4:49:46 UTC - in response to Message 1700940.  

Once the APs finish splitting, my cache lasts about a day and a half, then it's back to MBs.

So long as my CPU cores are kept happy with APs for several days, I only have to worry about keeping my GPUs fed with MBs, or letting them have at it with a backup project.

With the ratio of APs produced compared to MBs, it only seems fitting to me to limit my CPU cores to APs and just let my GPUs deal with the MBs. ;-)

Keeping my main rig's GPUs fed during a slightly extended outrage is another story, which usually ends in another project being done for a day (my other rig has a 50% longer lasting GPU cache with its extra GPU). :-(

Cheers.
ID: 1700968
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1701031 - Posted: 13 Jul 2015, 12:09:47 UTC

Right now my machine has a full load of AP tasks. Early this morning the machine went south because it got excited and overheated. When I started it back up I noticed that there were no CPU tasks running, so I have 100 AP CPU tasks sitting around gathering dust. I can find nothing in the event log to suggest anything is wrong. I'm running a max of 7 CPUs, 2 cores for GTX750TIs @ x2 each, leaving 3 cores for CPU tasks and had no problems with the AP 7.03 (sse2) until now. Running NVIDIA 347.88, BOINC 7.6.3 (x64) and Lunatics. Any help will be greatly appreciated.


I don't buy computers, I build them!!
ID: 1701031
Rasputin42
Volunteer tester
Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1701039 - Posted: 13 Jul 2015, 13:42:10 UTC

How many GPU tasks are you running at the same time?
ID: 1701039
WezH
Volunteer tester
Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1701055 - Posted: 13 Jul 2015, 14:18:53 UTC - in response to Message 1701039.  

As he said, two:

I'm running a max of 7 CPUs, 2 cores for GTX750TIs @ x2 each, leaving 3 cores for CPU tasks
ID: 1701055
Rasputin42
Volunteer tester
Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1701058 - Posted: 13 Jul 2015, 14:29:02 UTC - in response to Message 1701055.  
Last modified: 13 Jul 2015, 14:32:27 UTC

He could be running more than two tasks per GPU.

If there is any restriction on cores in BOINC, or more than 2 tasks per GPU, that could explain it, as each AP OpenCL task uses 1 core unless -use_sleep is in effect (when running Lunatics).
ID: 1701058
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1701088 - Posted: 13 Jul 2015, 16:08:56 UTC
Last modified: 13 Jul 2015, 16:22:28 UTC

To clear things up, I'm running 2 AP tasks on each of twin GTX750TIs. Each task is allocated 0.5 CPU core on a hyper-threaded 4770K machine. Before anyone shouts that I must have 1 core free for each AP task: I've been running this way for several years, ever since my first 1st-gen i7 (930). With 80% CPU utilization set in local prefs (6 cores) I've usually been able to run 4~5 CPU AP tasks at the same time. If I suspend all of the GPU tasks, I get 6 CPU tasks running, and as soon as I resumed some of the GPU tasks, they started running alongside 4 CPU tasks after 2 CPU tasks were placed in "waiting to run" status.
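
For readers following the numbers, here is a rough sketch of where the 6 cores and the 4 CPU tasks could come from. It is only a simplified model of the CPU budget, not BOINC's actual scheduler: it assumes the usable CPU count is the logical CPU count times the "Use at most % of CPUs" setting, rounded down, and that every running GPU task reserves its configured CPU fraction. The function names are made up for illustration.

```python
import math


def usable_cpus(logical_cpus: int, max_cpu_percent: float) -> int:
    """CPUs the client may use under 'Use at most N% of CPUs'.

    Assumption: the product is rounded down, with a minimum of 1.
    """
    return max(1, math.floor(logical_cpus * max_cpu_percent / 100))


def cpu_task_slots(logical_cpus, max_cpu_percent, gpu_tasks, cpu_per_gpu_task):
    """CPU-only tasks that fit once the GPU tasks' CPU reservations are subtracted."""
    reserved = gpu_tasks * cpu_per_gpu_task
    return max(0, math.floor(usable_cpus(logical_cpus, max_cpu_percent) - reserved))


if __name__ == "__main__":
    # 4770K with hyper-threading: 8 logical CPUs; 2 GPUs x 2 tasks, 0.5 CPU each.
    print(usable_cpus(8, 80))             # 6
    print(cpu_task_slots(8, 80, 4, 0.5))  # 4
    print(cpu_task_slots(8, 80, 0, 0.5))  # 6, i.e. all GPU tasks suspended
```

Under these assumptions the model reproduces the behaviour described in this post: 6 CPU tasks with the GPUs suspended, 4 once the four GPU tasks resume.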

13-Jul-2015 11:08:53 [SETI@home] [cpu_sched] Preempting ap_05fe15ad_B0_P0_00211_20150711_12329.wu_0 (removed from memory)
13-Jul-2015 11:08:53 [SETI@home] [cpu_sched] Preempting ap_04jn15ac_B6_P0_00320_20150711_09600.wu_0 (removed from memory)
13-Jul-2015 11:08:53 [SETI@home] Starting task ap_05fe15ac_B0_P1_00386_20150711_24137.wu_0
13-Jul-2015 11:08:53 [SETI@home] [cpu_sched] Starting task ap_05fe15ac_B0_P1_00386_20150711_24137.wu_0 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 0
13-Jul-2015 11:08:53 [SETI@home] Starting task ap_04jn15ab_B4_P1_00359_20150711_08549.wu_1
13-Jul-2015 11:08:53 [SETI@home] [cpu_sched] Starting task ap_04jn15ab_B4_P1_00359_20150711_08549.wu_1 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 2
13-Jul-2015 11:08:53 [SETI@home] Starting task ap_04jn15ab_B5_P0_00321_20150711_08794.wu_0
13-Jul-2015 11:08:53 [SETI@home] [cpu_sched] Starting task ap_04jn15ab_B5_P0_00321_20150711_08794.wu_0 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 3
13-Jul-2015 11:08:53 [SETI@home] Starting task ap_05fe15ad_B0_P0_00292_20150711_12329.wu_0
13-Jul-2015 11:08:53 [SETI@home] [cpu_sched] Starting task ap_05fe15ad_B0_P0_00292_20150711_12329.wu_0 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 8

I then resumed the rest of the suspended tasks. As soon as the running GPU tasks completed and new ones started, the 4 running CPU tasks went into "waiting to run" status.

Here is an example from 5 July running a max of 5 cores. You'll notice there are 2 MB and 1 AP CPU tasks running.

05-Jul-2015 07:53:55 [---] [dcf] scaling all duration correction factors by 1.024786
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task ap_30no14ab_B2_P0_00331_20150701_18156.wu_0 using astropulse_v7 version 703 (sse2) in slot 9
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task 04fe15ab.14621.2627.438086664202.12.69.vlar_1 using setiathome_v7 version 700 in slot 7
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task 04fe15ab.14621.3854.438086664202.12.228.vlar_0 using setiathome_v7 version 700 in slot 6
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task ap_04fe15ab_B5_P0_00246_20150624_04101.wu_1 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 8
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task ap_07ja15aa_B0_P0_00089_20150624_22554.wu_0 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 11
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task ap_04fe15ab_B6_P1_00245_20150624_24572.wu_0 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 12
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task ap_06fe15aa_B4_P0_00386_20150624_06136.wu_1 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 13
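
Counting CPU and GPU starts by eye gets tedious in longer logs. The sketch below tallies them from `[cpu_sched]` lines like the ones above. The regular expression is fitted only to the line format shown in this post, and treating any plan class containing "opencl" or "cuda" as a GPU app is an assumption; the helper names are hypothetical.

```python
import re
from collections import Counter

# Matches "[cpu_sched] Starting/Restarting task ..." lines as shown above.
# The plan class in parentheses is optional (some CPU apps log none).
TASK_LINE = re.compile(
    r"\[cpu_sched\] (?:Starting|Restarting) task (?P<task>\S+) "
    r"using (?P<app>\S+) version (?P<version>\d+)"
    r"(?: \((?P<plan_class>[^)]+)\))? in slot (?P<slot>\d+)"
)


def count_started(log_text: str) -> Counter:
    """Count started/restarted tasks as 'gpu' or 'cpu'.

    Assumption: a plan class containing 'opencl' or 'cuda' means a GPU app.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = TASK_LINE.search(line)
        if not match:
            continue
        plan = (match.group("plan_class") or "").lower()
        kind = "gpu" if ("opencl" in plan or "cuda" in plan) else "cpu"
        counts[kind] += 1
    return counts


# Three lines copied from the 5 July excerpt above.
SAMPLE = """\
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task ap_30no14ab_B2_P0_00331_20150701_18156.wu_0 using astropulse_v7 version 703 (sse2) in slot 9
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task 04fe15ab.14621.2627.438086664202.12.69.vlar_1 using setiathome_v7 version 700 in slot 7
05-Jul-2015 07:53:56 [SETI@home] [cpu_sched] Restarting task ap_04fe15ab_B5_P0_00246_20150624_04101.wu_1 using astropulse_v7 version 705 (opencl_nvidia_100) in slot 8
"""

if __name__ == "__main__":
    print(count_started(SAMPLE))
```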

Regardless of what I set "Use at most % of CPUs" to, CPU tasks will not run concurrently with GPU tasks. I'm thinking of reverting to 7.4.42 to see if that will make a difference.

[edit]It seems that my ap_cmdline_7.05_windows_intelx86_opencl_nvidia_100.txt is not working. I might have to reinstall Lunatics. MANY THANKS WehZ for asking my settings.[/edit]

-use_sleep -unroll 10 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 8 1 -tune 2 128 8 1 -ocIFFT_plan 256 16 512 -hp


I don't buy computers, I build them!!
ID: 1701088
WezH
Volunteer tester
Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1701096 - Posted: 13 Jul 2015, 16:38:53 UTC - in response to Message 1701088.  

[edit]It seems that my ap_cmdline_7.05_windows_intelx86_opencl_nvidia_100.txt is not working. I might have to reinstall Lunatics. MANY THANKS WehZ for asking my settings.[/edit]

-use_sleep -unroll 10 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 8 1 -tune 2 128 8 1 -ocIFFT_plan 256 16 512 -hp


Well, the file I have to edit is named ap_cmdline_win_x86_SSE2_OpenCL_NV.txt

I'm using Lunatics Win64 v0.43a. Maybe the difference is that I'm using an AMD CPU and you are using an Intel CPU? Don't know for sure...

And btw, it's WezH, not WehZ. And don't worry, you are not the first one to mess it up, and you will not be the last one :)
ID: 1701096
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1701118 - Posted: 13 Jul 2015, 17:53:23 UTC - in response to Message 1701096.  

Both files have the same settings, but it has had no effect on the CPU (SSE2) tasks. When I looked at my valid AP GPU tasks, I noticed that the CPU time and total run time were basically the same; there should have been a marked difference between the two. I reinstalled Lunatics and will wait for new GPU tasks to see if it took.


I don't buy computers, I build them!!
ID: 1701118
Rasputin42
Volunteer tester
Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1701151 - Posted: 13 Jul 2015, 19:33:44 UTC

With 80% CPU utilization set in local prefs (6 cores) I've usually been able to run 4~5 CPU AP tasks at the same time.


Set it to 100% (no restriction) and watch.
ID: 1701151
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1701173 - Posted: 13 Jul 2015, 20:55:00 UTC

As I stated in one of the previous posts -

Regardless of what I set "Use at most % of CPUs" to, CPU tasks will not run concurrently with GPU tasks.


That also included no restrictions.

I've since uninstalled 7.6.3 and done a clean install of 7.4.42. Since this is my daily machine, I set "Use at most 70% of CPUs" (5 cores: 2 GPU, 3 CPU), which allows 4 GPU and 3 CPU tasks to run concurrently. Looking at the properties for the GPU tasks, it seems that ap_cmdline_7.05_windows_intelx86_opencl_nvidia_100.txt was not functioning. I then installed 7.6.1 over 7.4.42, then 7.6.2 over 7.6.1, which I'm currently running, with the same results as 7.4.42. I'll keep this version for now until I find out why the ap_cmdline is not functioning. Anyone running Lunatics have any suggestions?

Unfortunately I can't go back further than 8 July to see when the ap_cmdline started failing. The CPU run time should be a very small fraction of the total run time.
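
The check described here, CPU time as a fraction of total run time, can be written down as a small sketch. The 0.9 threshold and the sample numbers are arbitrary assumptions for illustration, not values taken from any real task, and the function names are hypothetical.

```python
def cpu_fraction(cpu_time_s: float, run_time_s: float) -> float:
    """Share of a task's wall-clock run time that was spent on the CPU."""
    if run_time_s <= 0:
        raise ValueError("run_time_s must be positive")
    return cpu_time_s / run_time_s


def sleep_looks_ignored(cpu_time_s: float, run_time_s: float, threshold: float = 0.9) -> bool:
    """True when CPU time is nearly equal to run time.

    The threshold is an arbitrary cut-off chosen for this sketch.
    """
    return cpu_fraction(cpu_time_s, run_time_s) >= threshold


if __name__ == "__main__":
    # Made-up numbers, in seconds.
    print(sleep_looks_ignored(1780.0, 1800.0))  # True: CPU time close to run time
    print(sleep_looks_ignored(200.0, 1800.0))   # False: CPU mostly idle
```

A result of True for most recent GPU tasks would be consistent with -use_sleep not being picked up, though it does not prove it.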


I don't buy computers, I build them!!
ID: 1701173
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1701184 - Posted: 13 Jul 2015, 21:13:14 UTC - in response to Message 1701173.  

If I'm not mistaken ... But I could be completely wrong ...

ap_cmdline_7.05_windows_intelx86_opencl_nvidia_100.txt
Is for Stock apps

ap_cmdline_win_x86_SSE2_OpenCL_NV.txt
Is for Lunatics
ID: 1701184
Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1701198 - Posted: 13 Jul 2015, 21:35:27 UTC - in response to Message 1701173.  
Last modified: 13 Jul 2015, 21:36:49 UTC

Hey Cliff,

Sorry I haven't gone back and re-read all your posts (I'll do that soon). One thing that sticks in my mind is that you are setting the % utilization at 70%.

I don't know if you saw one of my posts where I commented that I don't think that is the right approach for reserving cores.

Telling the computer that it can only use 70% of the cores, then thinking that it's going to countermand that order and use part of the other 30% for feeding the cores, doesn't make sense. To me, it's more likely going to steal some time from that 70% of cores.

I went a different route. This only works if you are only doing 1 project.

I use the <project_max_concurrent> in the app_config.

Keeping the number of work units on the GPU for APs and MBs the same (i.e. 2 per card in this case),
I figure out how many total (for both CPU and GPU, in your case 7) and list that in the <project_max_concurrent>.

Then I allow the computer 100% use of all cores. This way it will only allow that total max to run at any point (as long as you have work) and there isn't any straggling or fighting for CPU cores. When I did that, my times improved and the system was more responsive.
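
As an illustration of the approach above, this sketch writes out an app_config.xml with a project-wide cap of 7 (4 GPU tasks plus 3 CPU tasks). The app names are the ones used elsewhere in this thread; the 0.5 gpu_usage and cpu_usage values and the helper name `build_app_config` are assumptions for the example, so adjust them to your own setup and check that your BOINC version supports <project_max_concurrent>.

```python
import xml.etree.ElementTree as ET


def build_app_config(project_max_concurrent: int, gpu_apps: dict) -> str:
    """Return app_config.xml text with a project-wide cap on running tasks.

    gpu_apps maps an app name to (gpu_usage, cpu_usage) per GPU task.
    """
    root = ET.Element("app_config")
    for name, (gpu_usage, cpu_usage) in gpu_apps.items():
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        versions = ET.SubElement(app, "gpu_versions")
        ET.SubElement(versions, "gpu_usage").text = str(gpu_usage)
        ET.SubElement(versions, "cpu_usage").text = str(cpu_usage)
    # The cap counts CPU and GPU tasks of the project together.
    ET.SubElement(root, "project_max_concurrent").text = str(project_max_concurrent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 2 tasks per card for both apps; 4 GPU + 3 CPU tasks = 7 in total.
    print(build_app_config(7, {
        "setiathome_v7": (0.5, 0.5),
        "astropulse_v7": (0.5, 0.5),
    }))
```

The printed text would be saved as app_config.xml in the project's folder, with "Use at most % of CPUs" then left at 100% as described.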

Something to think about: looking at the CPU tasks I have running, they use 99.9% of each core. So there isn't much left for the GPU, depending on what you have their CPU usage set at.

My 2 cents....

Zalster

Edit..

Cliff, if you suspend GPU crunching, do the CPU tasks start up?
ID: 1701198
Todderbert
Joined: 17 Jun 99
Posts: 221
Credit: 53,153,779
RAC: 0
United States
Message 1701238 - Posted: 13 Jul 2015, 23:25:40 UTC
Last modified: 13 Jul 2015, 23:26:30 UTC

I followed Zalster's advice a few weeks ago and tweaked it some, just using the max_concurrent variable to free more CPU cores for these APs. When the APs dry up I will increase my max_concurrent variable to allow for more CPU WUs, knowing that the MBs don't need as much CPU support. His advice is sound and did the trick for my multiple GPU setups.
ID: 1701238
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1701243 - Posted: 14 Jul 2015, 0:06:14 UTC

This is the current version of app_config.xml:

<app_config>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>.33</gpu_usage>
      <cpu_usage>.5</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v7</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

The only recent change was to change cpu_usage from .5 -> 1 to see if the kernel thrashing is reduced. That the thrashing has not been reduced is more proof that the ap_cmdline is not functioning. With -use_sleep in use, thrashing should have been eliminated.


I don't buy computers, I build them!!
ID: 1701243
Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1701248 - Posted: 14 Jul 2015, 0:30:26 UTC - in response to Message 1701243.  

Not sure you need that <max_concurrent> in the app_config.xml.

Have you tried shutting down BOINC, removing that <max_concurrent>, saving, then restarting BOINC to see if it makes any difference?

Just curious to see what it does.
ID: 1701248
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1701250 - Posted: 14 Jul 2015, 0:39:40 UTC - in response to Message 1701248.  

Made no difference.


I don't buy computers, I build them!!
ID: 1701250

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.