Is it possible to swap a guppi assigned to GPU with an Arecibo assigned to CPU?

Message boards : Number crunching : Is it possible to swap a guppi assigned to GPU with an Arecibo assigned to CPU?
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1799700 - Posted: 30 Jun 2016, 17:15:59 UTC - in response to Message 1799697.  

... early in testing I found it was losing work when client_state was out of synch with sched_request and sched_reply but that may have been another cause then. Thank you.

'Request' should certainly be irrelevant, but the client needs to read the 'reply' file and merge the contents into the next written copy of client_state.xml

If the rescheduling routine is only run after the BOINC client has been shut down, isn't that 'reply' info already merged into the client_state.xml that's already on disk? I haven't had any problems with my own routine just manipulating the client_state.xml file.
ID: 1799700 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1799710 - Posted: 30 Jun 2016, 17:56:45 UTC - in response to Message 1799700.  

... early in testing I found it was losing work when client_state was out of synch with sched_request and sched_reply but that may have been another cause then. Thank you.

'Request' should certainly be irrelevant, but the client needs to read the 'reply' file and merge the contents into the next written copy of client_state.xml

If the rescheduling routine is only run after the BOINC client has been shut down, isn't that 'reply' info already merged into the client_state.xml that's already on disk? I haven't had any problems with my own routine just manipulating the client_state.xml file.

Yes, exactly. You could conceivably get into a race condition if a reply was received from the server at the precise millisecond when you (or the rescheduler program) issued the shutdown command - I don't know how completely or robustly BOINC deals with that.
ID: 1799710 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1799712 - Posted: 30 Jun 2016, 18:12:14 UTC - in response to Message 1799710.  

Yes, exactly. You could conceivably get into a race condition if a reply was received from the server at the precise millisecond when you (or the rescheduler program) issued the shutdown command - I don't know how completely or robustly BOINC deals with that.

I suppose in that situation, the downloads wouldn't complete anyway and might crap out on BOINC restart with one of those "Timed out - no response" errors. I wonder if BOINC reads the latest scheduler reply file on startup to ensure that the client_state.xml is in sync with it. Doesn't seem likely, but I suppose it's possible.
ID: 1799712 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1799736 - Posted: 30 Jun 2016, 20:25:02 UTC - in response to Message 1799712.  

I wonder if BOINC reads the latest scheduler reply file on startup to ensure that the client_state.xml is in sync with it.

I think I just answered my own question. While I was eating lunch, it occurred to me that I could use Process Monitor during a BOINC startup to see what files were actually being read by the BOINC client.

So, here's a list of all the files that the client reads from the BOINC data directory ("C:\ProgramData\BOINC\" on my daily driver) between the time the client is launched (in my case, by BOINC Manager) and the time the active tasks are up and running. I've only listed the first ReadFile for each (although cc_config.xml appears to be read twice).

12:55:12.2982954 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\cc_config.xml	SUCCESS	Offset: 0, Length: 4,096, Priority: Normal
12:55:12.5438900 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\daily_xfer_history.xml	SUCCESS	Offset: 0, Length: 4,096, Priority: Normal
12:55:12.7054181 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\account_setiathome.berkeley.edu.xml	SUCCESS	Offset: 0, Length: 2,625, Priority: Normal
12:55:12.7242364 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\statistics_setiathome.berkeley.edu.xml	SUCCESS	Offset: 0, Length: 4,096, Priority: Normal
12:55:13.0120202 PM	boinc.exe	6060	ReadFile	C:\ProgramData\BOINC\cc_config.xml	SUCCESS	Offset: 0, Length: 4,096, Priority: Normal
12:55:13.8964257 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\coproc_info.xml	SUCCESS	Offset: 0, Length: 2,666, Priority: Normal
12:55:13.9640905 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\app_info.xml	SUCCESS	Offset: 0, Length: 4,096, Priority: Normal
12:55:13.9875153 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\client_state.xml	SUCCESS	Offset: 0, Length: 4,096, Priority: Normal
12:55:14.0896113 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\app_config.xml	SUCCESS	Offset: 0, Length: 378, Priority: Normal
12:55:14.2975966 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\global_prefs_override.xml	SUCCESS	Offset: 0, Length: 1,480, Priority: Normal
12:55:14.3439112 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\global_prefs.xml	SUCCESS	Offset: 0, Length: 1,407, Priority: Normal
12:55:14.3635428 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\slots\2\boinc_task_state.xml	SUCCESS	Offset: 0, Length: 539, Priority: Normal
12:55:14.3641406 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\slots\1\boinc_task_state.xml	SUCCESS	Offset: 0, Length: 538, Priority: Normal
12:55:14.3646406 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\slots\0\boinc_task_state.xml	SUCCESS	Offset: 0, Length: 500, Priority: Normal
12:55:14.3677519 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\gui_rpc_auth.cfg	SUCCESS	Offset: 0, Length: 32, Priority: Normal
12:55:14.6029155 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\notices\feeds_setiathome.berkeley.edu.xml	SUCCESS	Offset: 0, Length: 274, Priority: Normal
12:55:14.6378302 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\notices\archive_setiathome.berkeley.edu_notices.php.xml	SUCCESS	Offset: 0, Length: 1,673, Priority: Normal
12:55:15.0231213 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\MB8_win_x86_SSE3_VS2008_r3330.exe	SUCCESS	Offset: 735,232, Length: 1,024, Priority: Normal
12:55:15.0655795 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\MB8_win_x86_SSE3_VS2008_r3330.exe	SUCCESS	Offset: 735,232, Length: 1,024, Priority: Normal
12:55:15.1006801 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\Lunatics_x41zi_win32_cuda50.exe	SUCCESS	Offset: 6,853,632, Length: 1,024, Priority: Normal
12:55:16.5113042 PM	boinc.exe	8072	ReadFile	C:\ProgramData\BOINC\notices\setiathome.berkeley.edu_notices.php.xml	SUCCESS	Offset: 0, Length: 1,899, Priority: Normal

There's nothing indicating that any scheduler file is read, either before or after the client_state.xml is read.
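
The same check can be scripted. What follows is a minimal sketch, assuming the Process Monitor capture was saved as tab-separated text with the columns in the order shown above (time, process, PID, operation, path, result, detail). The function name `files_read` and the `sched_` filename prefix test are illustrative assumptions, not anything provided by Process Monitor or BOINC.

```python
import ntpath  # Windows path handling that also works on other platforms


def files_read(log_text):
    """Return the distinct file names seen in ReadFile rows, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        cols = line.split("\t")
        # Assumed columns: time, process, PID, operation, path, result, detail
        if len(cols) < 5 or cols[3] != "ReadFile":
            continue
        name = ntpath.basename(cols[4])
        if name not in seen:
            seen.append(name)
    return seen


DETAIL = "Offset: 0, Length: 4,096, Priority: Normal"
sample = "\n".join([
    "\t".join(["12:55:12.2982954 PM", "boinc.exe", "8072", "ReadFile",
               r"C:\ProgramData\BOINC\cc_config.xml", "SUCCESS", DETAIL]),
    "\t".join(["12:55:13.0120202 PM", "boinc.exe", "6060", "ReadFile",
               r"C:\ProgramData\BOINC\cc_config.xml", "SUCCESS", DETAIL]),
    "\t".join(["12:55:13.9875153 PM", "boinc.exe", "8072", "ReadFile",
               r"C:\ProgramData\BOINC\client_state.xml", "SUCCESS", DETAIL]),
])

names = files_read(sample)
# Assumption: scheduler request/reply files have names starting with "sched_".
scheduler_files = [n for n in names if n.startswith("sched_")]
print(names)
print("scheduler files read:", scheduler_files or "none")
```

Feeding the full capture through `files_read` instead of the three sample rows would give the complete list; an empty `scheduler_files` corresponds to the observation above.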
ID: 1799736 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1800081 - Posted: 2 Jul 2016, 7:02:19 UTC - in response to Message 1799736.  

I'm loving this exchange of neuron data!
So much so that I tried a suggestion in a PM on reassigning tasks to the GPU or CPU by making minor changes in client_state.xml (while the BOINC client is shut down).
It works, but I made a few mistakes, so I just let the cache nearly empty itself before aborting the last few tasks and resetting the project in order to start fresh.

The trick I've come up with is to:
1. do 2 batches: 1 for sending from GPU to CPU, and the other from CPU to GPU
2. for each batch, I "Suspend" the tasks to reassign in BOINC Manager (or BoincTasks in my case) before I shut it down.
3. using Find & Replace, I can remove the line <plan_class>cuda50</plan_class> above the <suspended_via_gui/> (GPU to CPU),
or replace <suspended_via_gui/> with <plan_class>cuda50</plan_class> (CPU to GPU)
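
The two Find & Replace operations in step 3 could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names are made up, the sample `<result>` blocks (including the `<name>` lines) are hypothetical, and it relies on the batching from step 1 so that only one kind of task is suspended at a time. It should only ever be applied to a copy of the file contents while the BOINC client is shut down.

```python
import re

GPU_LINE = "<plan_class>cuda50</plan_class>"
SUSPENDED = "<suspended_via_gui/>"

# Matches a whole cuda50 plan_class line, but only when the very next
# line holds the suspended marker.
_GPU_ABOVE_SUSPENDED = re.compile(
    r"^[ \t]*" + re.escape(GPU_LINE) + r"[ \t]*\r?\n"
    r"(?=[ \t]*" + re.escape(SUSPENDED) + r")",
    re.MULTILINE,
)


def gpu_to_cpu(text):
    """Remove the cuda50 plan_class line sitting directly above a suspended marker."""
    return _GPU_ABOVE_SUSPENDED.sub("", text)


def cpu_to_gpu(text):
    """Replace every suspended marker with the cuda50 plan_class line."""
    return text.replace(SUSPENDED, GPU_LINE)


# Hypothetical fragments, not taken from a real client_state.xml.
gpu_task = (
    "<result>\n"
    "    <name>task_a</name>\n"
    "    <plan_class>cuda50</plan_class>\n"
    "    <suspended_via_gui/>\n"
    "</result>\n"
)
cpu_task = (
    "<result>\n"
    "    <name>task_b</name>\n"
    "    <suspended_via_gui/>\n"
    "</result>\n"
)

print(gpu_to_cpu(gpu_task))
print(cpu_to_gpu(cpu_task))
```

As in the manual version, `gpu_to_cpu` leaves the `<suspended_via_gui/>` line in place, so those tasks would still need to be resumed after the client restarts.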

The one abnormality I've come across is that a few reassigned-to-GPU tasks take much longer!
I haven't looked into why yet (by doing smaller batch transfers with a much smaller cache); I just thought I'd share that now in case others have come across it.

Cheers,
Rob :-D
ID: 1800081 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1800142 - Posted: 2 Jul 2016, 16:53:49 UTC - in response to Message 1800081.  

The one abnormality I've come across is that a few reassigned-to-GPU tasks take much longer!

It looks like those may be Arecibo VLARs that you moved from CPU to GPU. VLARs of any kind just don't do well on NVIDIA GPUs. I treat both GBT and Arecibo VLARs the same when it comes to rescheduling.
ID: 1800142 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1800143 - Posted: 2 Jul 2016, 16:58:58 UTC - in response to Message 1799694.  

@Mr Kevvy:
is there an ETA for a beta test of your Windows script?
...cuz my Pavlov dog saliva is starting to run dry! ;-)
ID: 1800143 · Report as offensive
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Avatar

Send message
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 1800230 - Posted: 3 Jul 2016, 1:00:49 UTC - in response to Message 1800143.  

@Mr Kevvy:
is there an ETA for a beta test of your Windows script?
...cuz my Pavlov dog saliva is starting to run dry! ;-)


Well, given you have the same little red and white icon on your profile that I do, you know what weekend this is. Yes, it's a weekend of rest, relaxation, good food preferably around a BBQ, and a celebration of what it is to be Canadian...

...for other people. For me, it's time to be put to work for a hellish three days of dust, dirt, sweat and moving heavy objects, most of which try to crush my fingers and toes.

Hoping to have it ready sometime next week. :^p
ID: 1800230 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1800233 - Posted: 3 Jul 2016, 1:08:53 UTC - in response to Message 1800230.  



Well, given you have the same little red and white icon on your profile that I do, you know what weekend this is. Yes, it's a weekend of rest, relaxation, good food preferably around a BBQ, and a celebration of what it is to be Canadian...

...for other people. For me, it's time to be put to work for a hellish three days of dust, dirt, sweat and moving heavy objects, most of which try to crush my fingers and toes.



To quote my favorite music station

It's the 1st of July Holiday weekend....

We see you Canada ;)
ID: 1800233 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1800724 - Posted: 4 Jul 2016, 16:54:13 UTC - in response to Message 1800230.  

For me, it's time to be put to work for a hellish three days of dust, dirt, sweat and moving heavy objects, most of which try to crush my fingers and toes.
Sounds like you were helping one of the 70,000 households moving in Montreal on July 1st!
I hope your fingers and toes are intact.

While I wait impatiently ;-) for your script,
is there something else I should be manually changing other than 1 or 2 lines within each <result>...</result> section of the file: C:\ProgramData\BOINC\client_state.xml ?
ID: 1800724 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1800799 - Posted: 4 Jul 2016, 22:12:56 UTC - in response to Message 1800724.  

On this rig 8010413, I am running Cuda50 with 2 tasks at a time on the GPU. I was using Notepad++ to do a Find&Replace on the tasks I had suspended. It's fairly easy when it only involves deleting or replacing 1 line.

On my other rig 7996377, I am running SoG (installed with Lunatics v0.45 beta3) with 1 task at a time running on the GPU. It wasn't easy with Notepad++ since the version number had to be changed from 800 to 812 (or vice-versa) in addition to the same line needing to be modified as described above for the Cuda50.

I then tried other text editors (that I had used in the past to prep host.gz to import into MS Access) and luckily, "Sublime Text" has a multi-line Find&Replace interface that doesn't even use a pop-up window (unlike Notepad++). In addition, it allows you to do a: Replace All!

All this to say: if you'd like to send tasks assigned to CPU or GPU to the other, I recommend using "Sublime Text".
For CPU to GPU, all you need to do is replace:
<version_num>800</version_num>
    <suspended_via_gui/>
with
<version_num>812</version_num>
    <plan_class>opencl_nvidia_SoG</plan_class>
...but don't forget to make a copy of client_state.xml before!
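
For anyone who would rather script it than use an editor, here is a rough sketch of the same backup-then-replace step. The function names, the `.bak` suffix and the UTF-8 encoding are my assumptions, the demo file is a made-up fragment rather than a real client_state.xml, and it should only be run while the BOINC client is shut down.

```python
import re
import shutil
import tempfile
from pathlib import Path

# Two-line pattern: version_num 800 followed directly by the suspended marker.
PATTERN = re.compile(
    r"(?P<indent>[ \t]*)<version_num>800</version_num>(?P<nl>\r?\n)"
    r"[ \t]*<suspended_via_gui/>"
)


def cpu_to_sog(text):
    """Apply the CPU-to-GPU replacement to every suspended version-800 task."""
    def swap(m):
        return (
            f"{m['indent']}<version_num>812</version_num>{m['nl']}"
            f"{m['indent']}<plan_class>opencl_nvidia_SoG</plan_class>"
        )
    return PATTERN.sub(swap, text)


def rewrite_with_backup(state_path):
    """Copy the file to <name>.bak first, then rewrite the original in place."""
    state_path = Path(state_path)
    backup = state_path.with_name(state_path.name + ".bak")
    shutil.copy2(state_path, backup)
    # newline="" keeps the file's existing line endings untouched.
    with open(state_path, encoding="utf-8", newline="") as f:
        original = f.read()
    with open(state_path, "w", encoding="utf-8", newline="") as f:
        f.write(cpu_to_sog(original))
    return backup


# Demo on a throwaway file, never on the live data directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "client_state.xml"
    demo.write_text(
        "<result>\n"
        "    <version_num>800</version_num>\n"
        "    <suspended_via_gui/>\n"
        "</result>\n",
        encoding="utf-8",
    )
    backup = rewrite_with_backup(demo)
    rewritten = demo.read_text(encoding="utf-8")
    backup_text = backup.read_text(encoding="utf-8")

print(rewritten)
```

Going the other way (GPU to CPU) would mean swapping 812 back to 800 and dealing with the plan_class line, which this sketch does not attempt.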

Cheers,
Rob ;-)
ID: 1800799 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1801038 - Posted: 6 Jul 2016, 6:18:14 UTC - in response to Message 1800799.  


...but don't forget to make a copy of client_state.xml before!


. . I don't have a file by that name in my setiathome directory.

. . Should it be there?

Stephen
ID: 1801038 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1801040 - Posted: 6 Jul 2016, 6:31:51 UTC - in response to Message 1801038.  

. . I don't have a file by that name in my setiathome directory.

It's in:
C:\ProgramData\BOINC

ID: 1801040 · Report as offensive
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Avatar

Send message
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 1801355 - Posted: 7 Jul 2016, 22:22:01 UTC

Some updates: the work-in-progress detection is fixed now. The app works fine on Linux and our Windows 64-bit box, but a couple of the testers didn't have good results on Windows, possibly 32-bit. One of them indicated that the file(s?) have a different format in Win32, which I had no idea of.

I was in a conundrum until I remembered I have a spare machine and a small PCIe-powered card to go in it. So I will be imaging that with Win7 32-bit this weekend starting Friday evening, and then I can finally test it properly with stock and Lunatics. Let's hope I can finally get it out there this weekend.
ID: 1801355 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1801361 - Posted: 7 Jul 2016, 22:37:41 UTC - in response to Message 1801355.  

One of them indicated that the file(s?) have a different format in Win32 which I had no idea of.

I have one 64-bit and four 32-bit machines on which I've been doing VLAR rescheduling for the last couple weeks, and I haven't noticed any format differences, at least not for the client_state.xml, which is the only file that I'm touching.
ID: 1801361 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1802251 - Posted: 12 Jul 2016, 22:47:22 UTC - in response to Message 1801355.  
Last modified: 12 Jul 2016, 22:49:57 UTC

Some updates: the work-in-progress detection is fixed now. The app works fine on Linux and our Windows 64-bit box, but a couple of the testers didn't have good results on Windows, possibly 32-bit. One of them indicated that the file(s?) have a different format in Win32, which I had no idea of.

I was in a conundrum until I remembered I have a spare machine and a small PCIe-powered card to go in it. So I will be imaging that with Win7 32-bit this weekend starting Friday evening, and then I can finally test it properly with stock and Lunatics. Let's hope I can finally get it out there this weekend.



. . Your efforts are much appreciated and I look forward to trying it out. I have a little Windows 10 64-bit box (Core 2 Duo) with a GT 730 card that is itching for it :)
ID: 1802251 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1802254 - Posted: 12 Jul 2016, 22:55:17 UTC - in response to Message 1801040.  

. . I don't have a file by that name in my setiathome directory.

It's in:
C:\ProgramData\BOINC


. . OK,

. . I have tried the batch file with no luck; it crashes badly on my Core2 Duo. I cannot tell exactly what is failing because it scrolls the window before I can read anything and then goes to a red screen saying I need to rerun it.

. . One message I have seen says it cannot find BoincTasks. Apparently it is not compatible with BOINC Manager at all.

. . Also I think it would help to have a single instructions file outlining the process and then taking you through the steps from start to finish. But I cannot even begin to create one when it fails almost right away. :(

ID: 1802254 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1802759 - Posted: 15 Jul 2016, 19:11:37 UTC

For those working on developing a VLAR rescheduler (and I know there are at least 2 of you), I want to mention a situation that I ran into on one of my own boxes last evening, which probably should be taken into consideration in a rescheduler.

I'm currently running my own little home-grown VLAR rescheduler on each of my crunch-only machines as a scheduled task at user logon. After the rescheduling is completed, the routine then launches BOINC Manager (instead of having BM launch itself at startup). Last evening, the rescheduler's log on my WinVISTA machine showed that 7 VLARs had been moved to the CPU. However, BOINC Manager was showing me that those 7 VLARs were still scheduled to run on the GPU.

I didn't have time to look into it then, but this morning I reviewed the BOINC Event Log and found a curious line:

7/14/2016 9:00:57 PM |  | Using state file client_state_next.xml

It seems that when the system shut down yesterday afternoon (a normal weekday occurrence), BOINC hadn't finished cleanly writing and renaming its assorted client_state files, probably because that machine experienced one of those "restarting tasks during shutdown" episodes that has been discussed in several threads here. No tasks actually failed, but apparently the OS ultimately terminated BOINC "with prejudice" while it was still busy with client_state activities.

So, it seems that during BOINC startup, if it finds a client_state_next.xml file, it uses it to the exclusion of any client_state.xml file that exists. I think that makes sense since, assuming the client_state_next file is complete (and was cleanly closed before the previous BOINC shutdown), it would contain the most up-to-date client_state data.

Ultimately, that probably means that any alterations made to a client_state.xml file (while BOINC is shut down, of course), whether for rescheduling or any other purpose, should only be performed when a client_state_next.xml file doesn't currently exist. Food for thought! :^)
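
A rescheduler could guard against this with a check along the following lines. This is a minimal sketch, assuming both files live directly in the BOINC data directory; the function name `safe_to_edit` is hypothetical, and the demo uses a temporary directory rather than a real BOINC installation.

```python
import tempfile
from pathlib import Path


def safe_to_edit(data_dir):
    """True only if client_state.xml exists and no client_state_next.xml is left over."""
    data_dir = Path(data_dir)
    has_state = (data_dir / "client_state.xml").is_file()
    has_next = (data_dir / "client_state_next.xml").exists()
    return has_state and not has_next


# Demo in a throwaway directory standing in for the BOINC data directory.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "client_state.xml").write_text("<client_state/>\n")
    clean = safe_to_edit(d)      # only client_state.xml present
    (d / "client_state_next.xml").write_text("<client_state/>\n")
    leftover = safe_to_edit(d)   # leftover _next file present

print(clean, leftover)
```

What to do when the check fails is a separate design choice: the simplest option is to skip rescheduling for that run and let BOINC sort out its state files on the next clean start.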
ID: 1802759 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1802789 - Posted: 15 Jul 2016, 23:16:34 UTC
Last modified: 15 Jul 2016, 23:27:41 UTC

146 points in under 4-6 minutes. I wouldn't swap. Just wait -- it is coming to you all NV people.

See this for an example. And my inconclusives are falling at the same rate my valids are increasing. An error or two with (tens of) thousands a day is a ...

 Next 20
State: All (6472) · In progress (500) · Validation pending (3092) · Validation inconclusive (1023) · Valid (1854) · Invalid (0) · Error (3) 


Please, do not take this too seriously. The improvements are coming. I'm going -- to sleep.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1802789 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13722
Credit: 208,696,464
RAC: 304
Australia
Message 1802793 - Posted: 15 Jul 2016, 23:29:18 UTC - in response to Message 1802789.  

146 points in under 4-6 minutes. I wouldn't swap. Just wait -- it is coming to you all NV people.

Soon?
Please say "very soon."
Grant
Darwin NT
ID: 1802793 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.