GUPPI Rescheduler for Linux and Windows - Move GUPPI work to CPU and non-GUPPI to GPU

Message boards : Number crunching : GUPPI Rescheduler for Linux and Windows - Move GUPPI work to CPU and non-GUPPI to GPU

Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806695 - Posted: 2 Aug 2016, 23:55:27 UTC - in response to Message 1806646.  

I'd appreciate it if you could try out my front-end script (*.cmd) to automate Mr. Kevvy's script (.exe). The link is the last one in Mr Kevvy's original post at the top...and here it is again: front-end .CMD batch file for Windows

. . With Dropbox it helps if you make your uploaded files publicly accessible, otherwise I cannot get them.

Hey Stephen,
I had no problem DLing it when logged out of my Dropbox account, and neither did Mr Kevvy or Keith.
I'll email it to you in case it's an issue from the other side of the globe! ;-}
If you could UL it to your dropbox Australia account and post the link for Aussies and Kiwis, I would greatly appreciate it.
Cheers,
Rob :-)
ID: 1806695
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1806717 - Posted: 3 Aug 2016, 3:23:58 UTC - in response to Message 1806695.  


Hey Stephen,
I had no problem DLing it when logged out of my Dropbox account, and neither did Mr Kevvy or Keith.
I'll email it to you in case it's an issue from the other side of the globe! ;-}
If you could UL it to your dropbox Australia account and post the link for Aussies and Kiwis, I would greatly appreciate it.
Cheers,
Rob :-)


. . Now that sounds like a plan. It would be good for lots of Aussie contributors who may be interested. I hope Lionel is reading this and maybe there will be a few dozen thousand fewer ghosted WUs :)
ID: 1806717
Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806722 - Posted: 3 Aug 2016, 3:45:30 UTC - in response to Message 1806717.  

I don't think he was aware his 4 hosts were mismanaging ~47,000 tasks in total.
At least it gives a counter argument to those who think that rescheduling is "nefarious" and putting the servers at an increased risk of crashing!

I can't believe this is almost a taboo subject!
If the cache/queue gets a turnover within a few days, I don't see an issue.
It's not even likely that a thousand participants will be using the GUPPIrescheduler for the WoW event...unless "Bunkering" is much more popular than I think it is.

Oh well, I'm keeping an open mind in case there's a variable I didn't consider.
ID: 1806722
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1806724 - Posted: 3 Aug 2016, 4:27:20 UTC - in response to Message 1806722.  

I don't think he was aware his 4 hosts were mismanaging ~47,000 tasks in total.
At least it gives a counter argument to those who think that rescheduling is "nefarious" and putting the servers at an increased risk of crashing!

I can't believe this is almost a taboo subject!
If the cache/queue gets a turnover within a few days, I don't see an issue.
It's not even likely that a thousand participants will be using the GUPPIrescheduler for the WoW event...unless "Bunkering" is much more popular than I think it is.

Oh well, I'm keeping an open mind in case there's a variable I didn't consider.

"Bunkering"? We can only hope!!!! Last year, before the start of the contest, a slew of AstroPulse work was made available and we got a head start on the RAC bump the contest encourages. I don't think we will be so fortunate this year with the dearth of Arecibo work and the prevalence of Guppis. At least we can make use of the great app and script that Mr. Kevvy and Stubbles69 provide for re-scheduling work to the optimum platform. I'm already signed up with the great Pisces team, as I did last year. The power bill suffers, but I get a huge psychological "attaboy" for my efforts. Hoping to better my finishing position from last year.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1806724
Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806729 - Posted: 3 Aug 2016, 4:55:58 UTC - in response to Message 1806724.  

Looking at the calendar, the WoW event starts on a Monday.
That will make the Tuesday maintenance downtime an even bigger reason to cache over 100 tasks per device for power-rigs.
But I don't think it is an easy thing to do with Mr Kevvy's script on High-End GPUs.
I know that with my GTX 750 Ti and Mr Kevvy's script, my cache on GPU is getting a bit too big to be able to justify :-S
...if I didn't have the hope of getting my hands on a GTX 1060 before the WoW event.
ID: 1806729
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1806748 - Posted: 3 Aug 2016, 7:01:05 UTC - in response to Message 1806729.  

I wonder if the difference between our two systems is simply the difference in hardware. I only reschedule once a day and my task count is not growing out of bounds. I've only seen maybe an increase of 50 or so tasks over my nominal 300 in progress. Anything more that shows up is simply my "ghosted" tasks from my stupidity in editing app_info.xml.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1806748
Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806751 - Posted: 3 Aug 2016, 7:27:53 UTC - in response to Message 1806748.  

In my NV_SoG rig, the sole GTX 750 Ti is only processing ~2 nonVLARs for each guppi-on-CPU.
Your cards perform better than mine and you have 2 in each rig.
I'm guessing you process a few guppis on your GPUs...while I process all on CPU.

As for the Cuda50 rig with also just 1 GTX 750 Ti, that's a whole different thread!
ID: 1806751
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1806947 - Posted: 4 Aug 2016, 3:47:55 UTC - in response to Message 1806729.  

Looking at the calendar, the WoW event starts on a Monday.
That will make the Tuesday maintenance downtime an even bigger reason to cache over 100 tasks per device for power-rigs.
But I don't think it is an easy thing to do with Mr Kevvy's script on High-End GPUs.
I know that with my GTX 750 Ti and Mr Kevvy's script, my cache on GPU is getting a bit too big to be able to justify :-S
...if I didn't have the hope of getting my hands on a GTX 1060 before the WoW event.


. . There is a 1060 now? I must have a peek.

. . And what is a number too great to justify? 3 days work maybe? :)
ID: 1806947
Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806977 - Posted: 4 Aug 2016, 7:54:13 UTC - in response to Message 1806947.  

Looking at the calendar, the WoW event starts on a Monday.
That will make the Tuesday maintenance downtime an even bigger reason to cache over 100 tasks per device for power-rigs.
But I don't think it is an easy thing to do with Mr Kevvy's script on High-End GPUs.
I know that with my GTX 750 Ti and Mr Kevvy's script, my cache on GPU is getting a bit too big to be able to justify :-S
...if I didn't have the hope of getting my hands on a GTX 1060 before the WoW event.

. . There is a 1060 now? I must have a peek.
. . And what is a number too great to justify? 3 days work maybe? :)

Be careful who you tell about exceeding 100 tasks on GPU or CPU.
see: https://setiathome.berkeley.edu/forum_thread.php?id=80031&postid=1806958
I certainly didn't enjoy having to write a response to that.

I really don't understand why it is such a taboo subject :-S
Especially since no one with knowledge of the old Boinc client is helping Lionel with his 47,000 task issues.
ID: 1806977
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1806981 - Posted: 4 Aug 2016, 8:18:45 UTC



Especially since no one with knowledge of the old Boinc client is helping Lionel with his 47,000 task issues.


Lionel doesn't have an issue; he's just creating another one on purpose. He worked out that aborting all of his Guppies reduced his work quota, so now he's purposely turning them into ghosts instead. ;-)

Once you've known him for several years you'll know what I mean: all that matters to Lionel is keeping his RAC up, no matter what he has to do to keep it going, and this is just his latest foolish trick to achieve his goal. :-(

Cheers.
ID: 1806981
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13722
Credit: 208,696,464
RAC: 304
Australia
Message 1806982 - Posted: 4 Aug 2016, 8:21:35 UTC - in response to Message 1806977.  

Especially since no one with knowledge of the old Boinc client is helping Lionel with his 47,000 task issues.

What issue?
Lionel doesn't want to process Guppies & is actively trashing them, but in a way that doesn't count as a straight-up error or abort.
Grant
Darwin NT
ID: 1806982
Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1807005 - Posted: 4 Aug 2016, 9:41:51 UTC - in response to Message 1806981.  

OMGoodness, I had no clue! lol
I'm guessing he might not have figured out that guppis run faster on CPU than nonVLARs do.
What a crazy world we live in.

I hope most are not viewing my efforts to optimize and understand how things work in such a negative light :-S
...cuz I'm certainly feeling quite a bit of resistance on 2 threads right now.
I certainly didn't see all that coming! WoW

Anyways, happy pimping your rigs!
Rob :-}
ID: 1807005
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1807037 - Posted: 4 Aug 2016, 15:11:49 UTC - in response to Message 1806977.  

Looking at the calendar, the WoW event starts on a Monday.
That will make the Tuesday maintenance downtime an even bigger reason to cache over 100 tasks per device for power-rigs.
But I don't think it is an easy thing to do with Mr Kevvy's script on High-End GPUs.
I know that with my GTX 750 Ti and Mr Kevvy's script, my cache on GPU is getting a bit too big to be able to justify :-S
...if I didn't have the hope of getting my hands on a GTX 1060 before the WoW event.

. . There is a 1060 now? I must have a peek.
. . And what is a number too great to justify? 3 days work maybe? :)

Be careful who you tell about exceeding 100 tasks on GPU or CPU.
see: https://setiathome.berkeley.edu/forum_thread.php?id=80031&postid=1806958
I certainly didn't enjoy having to write a response to that.

I really don't understand why it is such a taboo subject :-S
Especially since no one with knowledge of the old Boinc client is helping Lionel with his 47,000 task issues.



. . It seems there are some contributors with a quasi-religious attitude to the idea of aborting WUs or having more than 100 jobs cached. This is despite the fact that aborted WUs are rebirthed within a few hours, and that many of those with 100+ WUs cached will probably have the results returned within a few days, which to me seems to be the logical objective of a project such as this. Yet, as you discovered in that exchange, they are fine with the idea of mismanaged hosts caching more work than they will do in 2 months, causing them to eventually time out and be rebirthed after 2 months of unnecessary delay. I would have no problem writing a reply to such a note, even if I were very unpopular afterwards. If a contributor is as Rob described, then there is no reason for them to behave in so ill-considered a fashion. If a user has a low-power rig and/or only participates for very short periods during the day/week, then there is NO logical reason for them to cache 200 jobs. When I started out with the C2D I only cached about 20 WUs at a time, which was slightly less than a day's work (before I added the GT730), and it worked very nicely. That is such a ridiculous double standard ... it makes me wonder about some people.

. . Mind you, I have no desire to have many hundreds of WUs cached either; enough for a day or so is sufficient, as long as I can get the numbers of WUs in the right caches. Grant is another contributor with that same attitude as Rob. After a similar discussion to yours with Rob I am definitely not one of his favourite contributors <shrug>.
ID: 1807037
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1807421 - Posted: 6 Aug 2016, 0:29:47 UTC - in response to Message 1803959.  

Warning: This machine appears to be attached to SETI@Home Beta.
If there are active Beta work units, this program may reassign them improperly,
as they look like SETI@Home work. OK to proceed (Y to continue)?Y

I think this "problem" with SETI@Home Beta can be avoided,
I'll give some hints to be discussed
(my knowledge of C is from long ago and I really don't know the ++ part of C++)


   Now the C++ code searches the entire client_state.xml like:
// Now scan client_state for the GPU platform and version_num
tempstring = "<plan_class>" + plan_class;
currentposition = 0;
while (currentposition >= 0) {
	nextposition = client_state.find(tempstring, currentposition);


   Instead, it could perhaps be limited to the range between two positions p1 and p2 marking the start and end of the SETI@home section, something like:
// Find start of SETI@home section
tempstring = "<master_url>http://setiathome.berkeley.edu/</master_url>";
p1 = client_state.find(tempstring, 0);
if ( p1 == string::npos )  {
	cout << "Error: could not find SETI@home in client_state.xml. Nothing changed.\n";
	return 0;
}

// Find end of SETI@home section (start of the next project, or end of file)
p2 = client_state.find("<project>", p1);
if ( p2 == string::npos )  {
	p2 = client_state.find("</client_state>", p1);
}

// Now scan only the SETI@home section for the GPU platform and version_num
tempstring = "<plan_class>" + plan_class;
currentposition = p1;
while (currentposition >= p1 && currentposition < p2) {
	nextposition = client_state.find(tempstring, currentposition);

.........
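
   For discussion, here is a minimal, self-contained sketch of that section-bounding idea. It is not Mr Kevvy's actual code: the helper name find_project_section, the sample XML and the beta URL are illustrative placeholders, and it assumes client_state.xml has already been read into a std::string.

// Sketch only: bound all searches to the SETI@home block of client_state.xml.
#include <iostream>
#include <string>

// Returns the [begin, end) character range of the section belonging to the given
// project, or begin == npos if that project is not present in the file.
static void find_project_section(const std::string &client_state,
                                 const std::string &master_url,
                                 std::string::size_type &begin,
                                 std::string::size_type &end)
{
    begin = client_state.find("<master_url>" + master_url + "</master_url>");
    if (begin == std::string::npos) {
        end = std::string::npos;
        return;
    }
    // The section ends where the next project starts, or at the end of the file.
    end = client_state.find("<project>", begin);
    if (end == std::string::npos)
        end = client_state.length();
}

int main()
{
    // Illustrative stand-in for the real client_state.xml contents.
    const std::string client_state =
        "<project><master_url>http://setiathome.berkeley.edu/</master_url></project>"
        "<app_version><plan_class>cuda50</plan_class></app_version>"
        "<project><master_url>http://beta.example.org/</master_url></project>"
        "<app_version><plan_class>cuda50</plan_class></app_version>";

    std::string::size_type p1, p2;
    find_project_section(client_state, "http://setiathome.berkeley.edu/", p1, p2);
    if (p1 == std::string::npos) {
        std::cout << "Error: could not find SETI@home in client_state.xml. Nothing changed.\n";
        return 0;
    }

    // Only hits between p1 and p2 belong to the main project; anything past p2
    // (e.g. SETI@home Beta entries) is never touched.
    const std::string tag = "<plan_class>cuda50";
    std::string::size_type pos = client_state.find(tag, p1);
    while (pos != std::string::npos && pos < p2) {
        std::cout << "Match at offset " << pos << " is inside the SETI@home section\n";
        pos = client_state.find(tag, pos + tag.length());
    }
    return 0;
}

   Bounding the scan like this would mean Beta entries later in the file are simply never seen, so the Beta warning at start-up could probably be dropped.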
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1807421
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1808779 - Posted: 12 Aug 2016, 21:21:19 UTC

Just had a failure in using the front-end script and app on this machine. Has worked flawlessly up till now for weeks.

Reading configuration files...
Error: could not determine CPU version_num from client_state. Nothing changed.


I wonder if something has changed in how client_state.xml is written by the project.

Worked like it should on my other two crunchers a while ago. Anybody else seeing this new issue??
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1808779
Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1808796 - Posted: 12 Aug 2016, 23:06:29 UTC - in response to Message 1808779.  
Last modified: 12 Aug 2016, 23:09:03 UTC

Worked like it should on my other two crunchers a while ago. Anybody else seeing this new issue??

Hey Keith,
Following your PM and now reading your post, I just remembered two things that could help Mr K in debugging the error you are reporting.
They are two very minor, interrelated issues I've had, and both are outside the scope of normal use of Mr. Kevvy's script (>500 tasks).

If the client_state.xml has 900-1000 tasks in it (done with another script I built but haven't released) and I run Mr. Kevvy's .exe, sometimes (but only rarely) it:
1. reports that it reassigned nothing ( 0 nonVLARs to GPU + 0 guppi to CPU ); or
2. terminates abnormally and closes the .cmd file.

I usually then reassign a few tasks with my other script, rerun Mr K's and it has then worked every single time as expected.

I know this doesn't help you directly, but it might give you some ideas for troubleshooting techniques you haven't tried or thought of.
Cheers,
Rob :-)

[edit]fixed typos above[/edit]
ID: 1808796
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1808799 - Posted: 12 Aug 2016, 23:22:37 UTC - in response to Message 1808779.  

Just had a failure in using the front-end script and app on this machine. Has worked flawlessly up till now for weeks.

Reading configuration files...
Error: could not determine CPU version_num from client_state. Nothing changed.


I wonder if something has changed in how client_state.xml is written by the project.

Worked like it should on my other two crunchers a while ago. Anybody else seeing this new issue??



Hi Keith,

. . This error has been reported before by Zalster ...

Message 1804557

. . And Mr Kevvy made a cogent response. Were there any nonVLAR tasks in your CPU queue on that machine at the time? If the app does not find any nonVLAR tasks in the CPU cache then it will terminate without taking any action.
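
. . For what it's worth, a tiny sketch of that kind of guard is below. It is not taken from Mr Kevvy's source; the variable names and the way the candidate tasks are collected are assumptions, and only the bail-out behaviour (no suitable task to inspect, so print the error and change nothing) mirrors what is being described.

#include <iostream>
#include <string>
#include <vector>

int main()
{
    // Illustrative only: stand-in for the tasks of the type the app needs to inspect.
    // In the real program this list would be built while parsing client_state.xml.
    std::vector<std::string> candidate_cpu_tasks;   // empty: nothing to learn from

    if (candidate_cpu_tasks.empty()) {
        // With no task of the needed type available, the version_num cannot be
        // determined from an example, so exit without touching client_state.xml.
        std::cout << "Error: could not determine CPU version_num from client_state."
                     " Nothing changed.\n";
        return 0;
    }

    // ...otherwise the rescheduler would go on to rewrite the tasks' entries...
    return 0;
}

. . If that is indeed the cause, waiting until the queue holds a mix of both task types and rerunning (as Mr Kevvy advises in the message quoted further down the thread) should clear it.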
ID: 1808799
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1808816 - Posted: 13 Aug 2016, 4:49:19 UTC - in response to Message 1808799.  

Stephen, I can't find that message 1804557 and thus can't read Zalster's response. Can you provide the link, please?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1808816
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1808817 - Posted: 13 Aug 2016, 4:51:42 UTC - in response to Message 1808796.  

Only Pipsqueek is over 500 tasks. The other two are only slightly over 400.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1808817
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13722
Credit: 208,696,464
RAC: 304
Australia
Message 1808818 - Posted: 13 Aug 2016, 4:55:12 UTC - in response to Message 1808817.  
Last modified: 13 Aug 2016, 4:55:47 UTC

Message 1804557 -
Error: could not determine CPU version_num from client_state.


I have not run any stock version. I am running CPU tasks on Lunatics and SoG NVIDIA.

Message 1804679 - Posted: 24 Jul 2016, 15:47:27 UTC

This can happen if there aren't enough of a certain work unit type in your queue for it to make a determination. Give it a few hours until the queue is different and it should clear up; if not, please advise.

Grant
Darwin NT
ID: 1808818