Open Beta test: SoG for NVidia, Lunatics v0.45 - Beta6 (RC again)

Message boards : Number crunching : Open Beta test: SoG for NVidia, Lunatics v0.45 - Beta6 (RC again)

Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1794820 - Posted: 9 Jun 2016, 21:51:28 UTC - in response to Message 1794816.  

It's just OK, with ~51 min of elapsed time and ~2 min of CPU time (not bad either).


I am not sure if it is the one. You still have to remember that it only finished because I eventually set it to run by itself.
ID: 1794820 · Report as offensive
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1794822 - Posted: 9 Jun 2016, 21:57:08 UTC - in response to Message 1794820.  

It's just OK, with ~51 min of elapsed time and ~2 min of CPU time (not bad either).


I am not sure if it is the one. You still have to remember that it only finished because I eventually set it to run by itself.


Well, try to leave BOINC untouched for a while. Then, if some issue shows up again, file a new report (with checkable data in the report body, of course).
ID: 1794822 · Report as offensive
Stephen "Heretic" (Crowdfunding Project Donor*, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1794876 - Posted: 10 Jun 2016, 0:32:39 UTC - in response to Message 1794757.  

You can always decrease the -sbs to 256 if you want and see if that works better



. . I think with his card 256 might be better, even 384 might be too much for running triples.
ID: 1794876 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65962
Credit: 55,293,173
RAC: 49
United States
Message 1794885 - Posted: 10 Jun 2016, 1:10:06 UTC - in response to Message 1794876.  

You can always decrease the -sbs to 256 if you want and see if that works better



. . I think with his card 256 might be better, even 384 might be too much for running triples.

I tried 256 and 192, as the notes said. SoG would run WUs, then put them in waiting, run new ones, then make those wait too. I had one go out to about 5000 days. I think my card and SoG just don't like each other; it's a GTX 580, a Fermi. I can't even replace the seat in my car, not for 2 months, so I give up.

Two more threads and I'm gone...
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794885 · Report as offensive
Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1794887 - Posted: 10 Jun 2016, 1:13:45 UTC - in response to Message 1794885.  

Zoom,

Did you also make an app_config.xml and specify 2 work units in it for both SoG and Cuda 42?
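
For anyone following along, here is a minimal sketch of what such a file could contain, written as a small Python script that generates it. The `<app_config>`, `<app>`, `<gpu_versions>`, `<gpu_usage>` and `<cpu_usage>` elements are the standard BOINC app_config.xml ones. The app name `setiathome_v8` and the helper name `build_app_config` are assumptions made for illustration, so check the app name against your own client first. A gpu_usage of 0.5 is what asks the client to run 2 work units per GPU.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float = 1.0) -> str:
    """Return app_config.xml text asking for `tasks_per_gpu` tasks on each GPU.

    `app_name` must match the app name your client reports; the value used
    below is an assumption, not something confirmed in this thread.
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")

    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    gpu = ET.SubElement(app, "gpu_versions")
    # Each task claims this fraction of a GPU: 0.5 means two tasks per GPU.
    ET.SubElement(gpu, "gpu_usage").text = f"{1.0 / tasks_per_gpu:.3g}"
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_per_task:.3g}"

    if hasattr(ET, "indent"):  # pretty-printing helper, Python 3.9 and later
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(build_app_config("setiathome_v8", tasks_per_gpu=2))
```

The output would be saved as app_config.xml in the project folder, and the client then has to re-read its config files before the change takes effect.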
ID: 1794887 · Report as offensive
Stephen "Heretic" (Crowdfunding Project Donor*, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1794946 - Posted: 10 Jun 2016, 4:49:03 UTC - in response to Message 1793908.  

Beta2 now available, in both 64-bit and 32-bit versions.


Selected and ran 64bit installer.

Win10 64bit, re-selected CPU AVX application, selected NVidia GPU SoG application (was running CUDA50).
Installed, restarted BOINC.
No problems.
Crunching resumed, cache intact.


. . When you updated to 0.45 I take it your cached WUs remained as CUDA50, is that correct?
ID: 1794946 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13789
Credit: 208,696,464
RAC: 304
Australia
Message 1794947 - Posted: 10 Jun 2016, 4:52:37 UTC - in response to Message 1794946.  
Last modified: 10 Jun 2016, 4:53:04 UTC

Beta2 now available, in both 64-bit and 32-bit versions.


Selected and ran 64bit installer.

Win10 64bit, re-selected CPU AVX application, selected NVidia GPU SoG application (was running CUDA50).
Installed, restarted BOINC.
No problems.
Crunching resumed, cache intact.


. . When you updated to 0.45 I take it your cached WUs remained as CUDA50, is that correct?

Yep.
New work downloaded labelled as SoG.

When I went back to CUDA likewise no issues.
Current work labeled SoG ran OK, all new work labeled as CUDA.
Grant
Darwin NT
ID: 1794947 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65962
Credit: 55,293,173
RAC: 49
United States
Message 1794949 - Posted: 10 Jun 2016, 4:56:45 UTC - in response to Message 1794887.  

Zoom,

Did you also make an app_config.xml and specify 2 work units in it for both SoG and Cuda 42?

Lunatics does that; I just modify the count, and only the count.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794949 · Report as offensive
Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1794954 - Posted: 10 Jun 2016, 5:21:23 UTC - in response to Message 1794949.  

Ok,

I can't figure out why it's doing the waiting to run part. It should run consecutively. Have you tried decreasing the cache size? Maybe it thinks it won't complete the number of tasks before the deadline.

That is all I can figure for now.

Sorry can't be any more help. I'll keep thinking about it

Z
ID: 1794954 · Report as offensive
woohoo
Volunteer tester

Joined: 30 Oct 13
Posts: 973
Credit: 165,671,404
RAC: 5
United States
Message 1794957 - Posted: 10 Jun 2016, 5:32:29 UTC

Could it be that the remaining time becomes too long, so BOINC decides to move on to the next task that it thinks it can complete before the deadline?
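
To make that guess concrete, here is a rough Python sketch of the idea: if running the queue in its current order looks like it will miss a deadline, switch to running the earliest deadline first. This is only an illustration of the hypothesis, not BOINC's actual scheduler, which is more involved. The `Task` fields, the function names and the sample deadlines are all hypothetical; the 5000-day estimate is borrowed from the report earlier in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    remaining_hours: float  # client's current estimate, possibly badly inflated
    deadline_hours: float   # hours from now until the report deadline


def misses_deadline(queue: List[Task]) -> bool:
    """True if running the queue in its current order would miss any deadline."""
    elapsed = 0.0
    for task in queue:
        elapsed += task.remaining_hours
        if elapsed > task.deadline_hours:
            return True
    return False


def run_order(queue: List[Task]) -> List[Task]:
    """Keep the existing order when deadlines look safe, else earliest deadline first."""
    if misses_deadline(queue):
        return sorted(queue, key=lambda t: t.deadline_hours)
    return list(queue)


# Sane estimates: nothing is at risk, so the order is left alone.
sane = [Task("A", 1.0, 30 * 24), Task("B", 1.0, 20 * 24)]
# Task A's estimate has blown out to 5000 days, so B jumps ahead of it.
inflated = [Task("A", 5000 * 24, 30 * 24), Task("B", 1.0, 20 * 24)]

print([t.name for t in run_order(sane)])      # ['A', 'B']
print([t.name for t in run_order(inflated)])  # ['B', 'A']
```

If something like this is going on, one runaway time estimate would be enough to keep pushing tasks into waiting while others start.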
ID: 1794957 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65962
Credit: 55,293,173
RAC: 49
United States
Message 1794959 - Posted: 10 Jun 2016, 5:37:29 UTC - in response to Message 1794954.  

Ok,

I can't figure out why it's doing the waiting to run part. It should run consecutively. Have you tried decreasing the cache size? Maybe it thinks it won't complete the number of tasks before the deadline.

That is all I can figure for now.

Sorry can't be any more help. I'll keep thinking about it

Z

No. Is this what you mean by cache? Since the notes for SoG make no mention of cache, yeah I looked.

Disk
Use no more than	100 GB
Leave at least		0.001 GB free
Use no more than	100% of total

The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794959 · Report as offensive
Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1794962 - Posted: 10 Jun 2016, 5:48:04 UTC - in response to Message 1794959.  
Last modified: 10 Jun 2016, 5:48:21 UTC

No, I mean how much work do you request in your settings.

Under the preferences it says store at least (blank) days

and additional (blank) days

I used to use 10 days and 0.1 days for my settings, that filled my cache to 400 work units

However, for you I would think it would try to give you 100 work units

As such, if it thinks it will not complete all those work units in time, it might be switching to the ones closest to timing out and run those.

You could change your preferences to say 0.5 day and 0.1 day and see if that decreases the total amount you get and then see if the work proceeds in order rather than going to the ones with the shortest deadline.

This is just an idea, you don't have to try this if you don't want.
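
As a back-of-the-envelope check, the arithmetic can be sketched in Python. This is an assumption-laden estimate, not how the client or server really computes work requests: the function name, the 1-hour runtime per task and the 2 tasks at a time are made-up example values, and the cap of 100 simply reflects the 100 work units mentioned above.

```python
from typing import Optional


def estimated_tasks_buffered(
    store_days: float,
    extra_days: float,
    est_task_hours: float,
    concurrent_tasks: int,
    server_limit: Optional[int] = None,
) -> int:
    """Rough count of tasks needed to fill the requested work buffer."""
    if est_task_hours <= 0:
        raise ValueError("est_task_hours must be positive")
    buffer_hours = (store_days + extra_days) * 24.0
    wanted = int(buffer_hours / est_task_hours * concurrent_tasks)
    # A per-host cap, if there is one, wins over the preference setting.
    return wanted if server_limit is None else min(wanted, server_limit)


# 10 days + 0.1 days asks for far more than the cap allows.
print(estimated_tasks_buffered(10, 0.1, 1.0, 2, server_limit=100))   # 100
# 0.5 days + 0.1 days stays well under the cap.
print(estimated_tasks_buffered(0.5, 0.1, 1.0, 2, server_limit=100))  # 28
```

With the smaller setting the queue is short enough that deadline pressure should be much less likely to reorder the tasks.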
ID: 1794962 · Report as offensive
Brent Norman (Crowdfunding Project Donor*, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1794963 - Posted: 10 Jun 2016, 5:51:26 UTC - in response to Message 1794954.  

Zoom, things like this I find quite strange,

Restarted at 17.38 percent.
Used GPU device parameters are:
Number of compute units: 16
Single buffer allocation size: 384MB
Total device global memory: 1536MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: yes
LowPerformanceGPU path: no
period_iterations_num=50
13:54:12 (55616): Can't acquire lockfile (32) - waiting 35s
13:54:47 (55616): Can't acquire lockfile (32) - exiting
13:54:47 (55616): Error: The process cannot access the file because it is being used by another process.

(0x20)
13:55:44 (43484): BOINC client no longer exists - exiting
13:55:44 (43484): timer handler: client dead, exiting
13:55:54 (43484): BOINC client no longer exists - exiting
13:55:54 (43484): timer handler: client dead, exiting
13:56:03 (53804): Can't acquire lockfile (32) - waiting 35s
13:56:04 (43484): BOINC client no longer exists - exiting
13:56:04 (43484): timer handler: client dead, exiting
13:56:14 (43484): BOINC client no longer exists - exiting
13:56:14 (43484): timer handler: client dead, exiting
13:56:24 (43484): BOINC client no longer exists - exiting
13:56:24 (43484): timer handler: client dead, exiting
13:56:34 (43484): BOINC client no longer exists - exiting
13:56:34 (43484): timer handler: client dead, exiting
13:56:38 (53804): Can't acquire lockfile (32) - exiting
13:56:38 (53804): Error: The process cannot access the file because it is being used by another process.
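
One way to read a log like this is to group the messages by the process ID in brackets. The Python sketch below does that for a few lines copied from the output above; the `HH:MM:SS (pid): message` pattern is inferred from those lines only, and the function name is hypothetical.

```python
import re
from collections import Counter, defaultdict
from typing import Dict

# Matches lines such as "13:54:12 (55616): Can't acquire lockfile (32) - waiting 35s".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$")

SAMPLE = """\
13:54:12 (55616): Can't acquire lockfile (32) - waiting 35s
13:54:47 (55616): Can't acquire lockfile (32) - exiting
13:55:44 (43484): BOINC client no longer exists - exiting
13:55:44 (43484): timer handler: client dead, exiting
13:56:03 (53804): Can't acquire lockfile (32) - waiting 35s
13:56:38 (53804): Can't acquire lockfile (32) - exiting
"""


def summarize(log_text: str) -> Dict[int, Counter]:
    """Count how often each message appears, grouped by process ID."""
    per_pid: Dict[int, Counter] = defaultdict(Counter)
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines without a timestamp and PID
        _timestamp, pid, message = match.groups()
        per_pid[int(pid)][message] += 1
    return per_pid


for pid, messages in summarize(SAMPLE).items():
    print(pid, dict(messages))
```

Grouped this way, one process (43484) keeps reporting that the client is gone, while two others (55616 and 53804) give up after failing to get the lockfile.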

ID: 1794963 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65962
Credit: 55,293,173
RAC: 49
United States
Message 1794964 - Posted: 10 Jun 2016, 6:00:58 UTC - in response to Message 1794962.  

No, I mean how much work do you request in your settings.

Under the preferences it says store at least (blank) days

and additional (blank) days

I used to use 10 days and 0.1 days for my settings, that filled my cache to 400 work units

However, for you I would think it would try to give you 100 work units

As such, if it thinks it will not complete all those work units in time, it might be switching to the ones closest to timing out and run those.

You could change your preferences to say 0.5 day and 0.1 day and see if that decreases the total amount you get and then see if the work proceeds in order rather than going to the ones with the shortest deadline.

This is just an idea, you don't have to try this if you don't want.

I changed that to 10 and 2. The 10 was never higher; the 2 was at 8, and it had been like that for a long time.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794964 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65962
Credit: 55,293,173
RAC: 49
United States
Message 1794966 - Posted: 10 Jun 2016, 6:06:07 UTC - in response to Message 1794963.  
Last modified: 10 Jun 2016, 6:09:15 UTC

Zoom, things like this I find quite strange,

Restarted at 17.38 percent.
Used GPU device parameters are:
Number of compute units: 16
Single buffer allocation size: 384MB
Total device global memory: 1536MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: yes
LowPerformanceGPU path: no
period_iterations_num=50
13:54:12 (55616): Can't acquire lockfile (32) - waiting 35s
13:54:47 (55616): Can't acquire lockfile (32) - exiting
13:54:47 (55616): Error: The process cannot access the file because it is being used by another process.

(0x20)
13:55:44 (43484): BOINC client no longer exists - exiting
13:55:44 (43484): timer handler: client dead, exiting
13:55:54 (43484): BOINC client no longer exists - exiting
13:55:54 (43484): timer handler: client dead, exiting
13:56:03 (53804): Can't acquire lockfile (32) - waiting 35s
13:56:04 (43484): BOINC client no longer exists - exiting
13:56:04 (43484): timer handler: client dead, exiting
13:56:14 (43484): BOINC client no longer exists - exiting
13:56:14 (43484): timer handler: client dead, exiting
13:56:24 (43484): BOINC client no longer exists - exiting
13:56:24 (43484): timer handler: client dead, exiting
13:56:34 (43484): BOINC client no longer exists - exiting
13:56:34 (43484): timer handler: client dead, exiting
13:56:38 (53804): Can't acquire lockfile (32) - exiting
13:56:38 (53804): Error: The process cannot access the file because it is being used by another process.


0.45 has the unfortunate habit of leaving WUs running without BoincTasks (1.69), so right now I'm on 0.44. I had 3 files that would cycle: they'd work up to a point, then go back to the last checkpoint. They did that several times before I aborted them, since they would have kept doing it over and over all night long if I hadn't done something. It's like they had the hiccups.

When 0.44 was installed, SoG went away, and so did the 16 SoG WUs, which were replaced with 18 guppis; the 16 were also guppis.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1794966 · Report as offensive
Stephen "Heretic" (Crowdfunding Project Donor*, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1795046 - Posted: 10 Jun 2016, 13:32:02 UTC - in response to Message 1794947.  
Last modified: 10 Jun 2016, 14:17:56 UTC



. . When you updated to 0.45 I take it your cached WUs remained as CUDA50, is that correct?

Yep.
New work downloaded labelled as SoG.

When I went back to CUDA likewise no issues.
Current work labeled SoG ran OK, all new work labeled as CUDA.



. . OK.

. . Mine kept processing but runtimes doubled, and I didn't know if that was normal or an issue. So I went back to CUDA50 and, like you, the return process was seamless, with one exception. When it restored the previous settings it found a version of app_config.xml set up for running doubles, so until the cache is down I will be running twosies :)

. . Then I will reload 0.45 and deal with the tweaking necessary to get it running as smoothly as the 950, well at least as close to it as I can get.

. . The low_performance path options may be a little less co-operative and not so simple to resolve.

[Edit] I only just noticed, because of a message to Zoom, that on the change back the SoG WU's have disappeared. Or at best have been re-identified as CUDA50.

.
ID: 1795046 · Report as offensive
Stephen "Heretic" (Crowdfunding Project Donor*, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1795047 - Posted: 10 Jun 2016, 13:33:58 UTC - in response to Message 1794959.  
Last modified: 10 Jun 2016, 13:43:37 UTC

Ok,

I can't figure out why it's doing the waiting to run part. It should run consecutively. Have you tried decreasing the cache size? Maybe it thinks it won't complete the number of tasks before the deadline.

That is all I can figure for now.

Sorry can't be any more help. I'll keep thinking about it

Z

No. Is this what you mean by cache? Since the notes for SoG make no mention of cache, yeah I looked.

Disk
Use no more than	100 GB
Leave at least		0.001 GB free
Use no more than	100% of total


. . No he means the WU files cached on your HDD, or as some people call it, your work queue. A little joke there.

. . In BOINC Manager, under Options/Computing Preferences/Computing tab, there is a box at the bottom where it says "Store at least xx days of work". Reduce that to a small number to reduce the number of WU files stored in your cache. It will take time, as the cache only shrinks as tasks are completed, and it will not replenish until the estimated runtime of the remaining WUs is less than the value you have set as above.

. . The part you cited is about disk storage I believe.
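
. . As a rough illustration of that refill rule, here is a small Python sketch. It is a simplification made for this post, not the real client logic: it assumes tasks run one at a time, the function names are hypothetical, and the 30 cached tasks of 1 hour each are made-up numbers.

```python
from typing import List


def should_request_work(remaining_hours: List[float], store_at_least_days: float) -> bool:
    """True once the estimated runtime left in the cache is below the setting."""
    return sum(remaining_hours) < store_at_least_days * 24.0


def tasks_done_before_refill(remaining_hours: List[float], store_at_least_days: float) -> int:
    """Count how many cached tasks must finish before new work is requested."""
    queue = list(remaining_hours)
    finished = 0
    while queue and not should_request_work(queue, store_at_least_days):
        queue.pop(0)  # the oldest task completes
        finished += 1
    return finished


cache = [1.0] * 30  # 30 cached tasks, each estimated at 1 hour

print(should_request_work(cache, 10))        # True: 30 h is far below 10 days
print(tasks_done_before_refill(cache, 0.5))  # 19: refill starts once under 12 h
```

. . So after lowering the setting from 10 days to 0.5, nothing new would be fetched until most of the existing cache has been worked off.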
ID: 1795047 · Report as offensive
Stephen "Heretic" (Crowdfunding Project Donor*, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1795053 - Posted: 10 Jun 2016, 13:48:05 UTC - in response to Message 1794964.  
Last modified: 10 Jun 2016, 13:49:57 UTC


I changed that to 10 and 2. The 10 was never higher; the 2 was at 8, and it had been like that for a long time.


. . I believe 10 is the maximum allowed. But he means a much lower value as he stated of something like 0.5, though you can leave the second line at "0".

. . Just remember it will not reduce immediately.

. . BTW mine is set at 1.5 and zero and is AOK, but I do not have the issues you are talking about.
ID: 1795053 · Report as offensive
Stephen "Heretic" (Crowdfunding Project Donor*, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1795067 - Posted: 10 Jun 2016, 14:15:10 UTC - in response to Message 1794966.  


0.45 has the unfortunate habit of leaving WUs running without BoincTasks (1.69), so right now I'm on 0.44. I had 3 files that would cycle: they'd work up to a point, then go back to the last checkpoint. They did that several times before I aborted them, since they would have kept doing it over and over all night long if I hadn't done something. It's like they had the hiccups.

When 0.44 was installed, SoG went away, and so did the 16 SoG WUs, which were replaced with 18 guppis; the 16 were also guppis.


. . I think you needed to reset the option in Boinc Tasks to shut down the BOINC client when exiting. They run independently and unless you use that option in Boinc Tasks the BOINC client will continue to run in the background as it is designed to do.

. . There is definitely an issue there with the change-back process. When I changed my "little" machine to 0.45 the cache remained intact, no files were lost and the WU process designations remained unchanged. New WUs were labelled SoG. The pre-existing cached CUDA50 files were still CUDA50. But when I changed back, the SoG WUs were also gone from BOINC Manager. I cannot tell if the WU files are still in my local cache, as there are too many files to check each one manually to see if it identifies as SoG, and the Seti list only shows the files as Anonymous Nvidia GPU, not as SoG or CUDA50. Maybe Richard has some insight into that.
ID: 1795067 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65962
Credit: 55,293,173
RAC: 49
United States
Message 1795093 - Posted: 10 Jun 2016, 15:31:21 UTC - in response to Message 1795067.  


0.45 has the unfortunate habit of leaving WUs running without BoincTasks (1.69), so right now I'm on 0.44. I had 3 files that would cycle: they'd work up to a point, then go back to the last checkpoint. They did that several times before I aborted them, since they would have kept doing it over and over all night long if I hadn't done something. It's like they had the hiccups.

When 0.44 was installed, SoG went away, and so did the 16 SoG WUs, which were replaced with 18 guppis; the 16 were also guppis.


. . I think you needed to reset the option in Boinc Tasks to shut down the BOINC client when exiting. They run independently and unless you use that option in Boinc Tasks the BOINC client will continue to run in the background as it is designed to do.

. . There is definitely an issue there with the change-back process. When I changed my "little" machine to 0.45 the cache remained intact, no files were lost and the WU process designations remained unchanged. New WUs were labelled SoG. The pre-existing cached CUDA50 files were still CUDA50. But when I changed back, the SoG WUs were also gone from BOINC Manager. I cannot tell if the WU files are still in my local cache, as there are too many files to check each one manually to see if it identifies as SoG, and the Seti list only shows the files as Anonymous Nvidia GPU, not as SoG or CUDA50. Maybe Richard has some insight into that.

I already do have BoincTasks set to close Boinc down, since I use that instead of the front end that Boinc uses.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1795093 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.