New rescheduler

Message boards : Number crunching : New rescheduler

Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1015332 - Posted: 13 Jul 2010, 13:30:37 UTC

Anyone know what I screwed up?

13 July 2010 - 06:22:25 BoincRescheduler V: 0.7


13 July 2010 - 06:27:15 Found: CPU: 705, VLAR: 705, VHAR: 0
13 July 2010 - 06:27:15 Found: GPU: 3519, VLAR: 0, VHAR: 0
13 July 2010 - 06:27:15 No rescheduling needed

13 July 2010 - 06:27:53 Shutting down BOINC client
13 July 2010 - 06:27:57 Shutdown of BOINC client completed
13 July 2010 - 06:28:41 Invalid rsc_fpops_bound < 500000000000000000.000000, total: 4222
13 July 2010 - 06:28:41 Found: CPU: 705, VLAR: 705, VHAR: 0
13 July 2010 - 06:28:41 Found: GPU: 3517, VLAR: 0, VHAR: 0
13 July 2010 - 06:28:41 Rescheduling CPU version: 603 ,Gpu version: 608 planclass: cuda
13 July 2010 - 06:29:32 Copied rescheduled client_state.xml
13 July 2010 - 06:29:32 ERROR: The BOINC client is running, but it shouldn't, so something is wrong
13 July 2010 - 06:29:32 ERROR: Move completed with an error.

Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1015332 · Report as offensive
Profile S@NL - eFMer - efmer.com/boinc
Volunteer tester
Joined: 7 Jun 99
Posts: 512
Credit: 148,746,305
RAC: 0
United States
Message 1015338 - Posted: 13 Jul 2010, 13:52:47 UTC - in response to Message 1015332.  

Anyone know what I screwed up?

13 July 2010 - 06:22:25 BoincRescheduler V: 0.7


13 July 2010 - 06:27:15 Found: CPU: 705, VLAR: 705, VHAR: 0
13 July 2010 - 06:27:15 Found: GPU: 3519, VLAR: 0, VHAR: 0
13 July 2010 - 06:27:15 No rescheduling needed

13 July 2010 - 06:27:53 Shutting down BOINC client
13 July 2010 - 06:27:57 Shutdown of BOINC client completed
13 July 2010 - 06:28:41 Invalid rsc_fpops_bound < 500000000000000000.000000, total: 4222
13 July 2010 - 06:28:41 Found: CPU: 705, VLAR: 705, VHAR: 0
13 July 2010 - 06:28:41 Found: GPU: 3517, VLAR: 0, VHAR: 0
13 July 2010 - 06:28:41 Rescheduling CPU version: 603 ,Gpu version: 608 planclass: cuda
13 July 2010 - 06:29:32 Copied rescheduled client_state.xml
13 July 2010 - 06:29:32 ERROR: The BOINC client is running, but it shouldn't, so something is wrong
13 July 2010 - 06:29:32 ERROR: Move completed with an error.

That's one of those checks I added and never expected to see triggered.

The client is running after it reported it had stopped.
Maybe the client did not shut down properly and restarted itself.
Maybe the client is a bit sluggish because the number of tasks you have is above the safe limit of BOINC.
But the state file has been copied, so at least shut down the client and restart it to see if it reports any problems.
TThrottle Control your temperatures. BoincTasks The best way to view BOINC. Anza Borrego Desert hiking.
ID: 1015338 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1015341 - Posted: 13 Jul 2010, 13:56:27 UTC

I restarted BOINC and the manager didn't show any error messages. I hope it made the flop change... So now there are 3 reschedulers...
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1015341 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1015347 - Posted: 13 Jul 2010, 14:13:17 UTC - in response to Message 1015338.  
Last modified: 13 Jul 2010, 14:14:07 UTC

That's one of those checks I added and never expected to see triggered.

The client is running after it reported it had stopped.
Maybe the client did not shut down properly and restarted itself.
Maybe the client is a bit sluggish because the number of tasks you have is above the safe limit of BOINC.
But the state file has been copied, so at least shut down the client and restart it to see if it reports any problems.

My client is fine as far as sluggishness goes; I'm not sure what you mean by "safe limit of BOINC". My manager works fine with way more work than this. Something changed around February and my client hasn't had sluggishness problems since, and from what I hear none of the top hosts have this problem anymore. People have a whole lot more work than I have. The funny thing is that most of my work downloaded last night, and it was almost all on the CPU until I rescheduled this morning. I moved 2000 to GPU.
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1015347 · Report as offensive
Profile S@NL - eFMer - efmer.com/boinc
Volunteer tester
Joined: 7 Jun 99
Posts: 512
Credit: 148,746,305
RAC: 0
United States
Message 1015353 - Posted: 13 Jul 2010, 14:29:25 UTC - in response to Message 1015347.  


My client is fine as far as sluggishness goes; I'm not sure what you mean by "safe limit of BOINC". My manager works fine with way more work than this. Something changed around February and my client hasn't had sluggishness problems since, and from what I hear none of the top hosts have this problem anymore. People have a whole lot more work than I have. The funny thing is that most of my work downloaded last night, and it was almost all on the CPU until I rescheduled this morning. I moved 2000 to GPU.

The more WUs you have on your system, the more overhead. You have a bit more than you need for 3 days.

TThrottle Control your temperatures. BoincTasks The best way to view BOINC. Anza Borrego Desert hiking.
ID: 1015353 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1015358 - Posted: 13 Jul 2010, 14:37:58 UTC - in response to Message 1015353.  


My client is fine as far as sluggishness goes; I'm not sure what you mean by "safe limit of BOINC". My manager works fine with way more work than this. Something changed around February and my client hasn't had sluggishness problems since, and from what I hear none of the top hosts have this problem anymore. People have a whole lot more work than I have. The funny thing is that most of my work downloaded last night, and it was almost all on the CPU until I rescheduled this morning. I moved 2000 to GPU.

The more WUs you have on your system, the more overhead. You have a bit more than you need for 3 days.

I am about to go back to lurking mode until this is straightened out. I can't keep up with all the suggestions in so many different threads. Too bad someone can't make a good sticky that others can't add to and that would be edited by only one person. The stickies get way too sidetracked... As long as I don't have hundreds of downloads stuck, my manager can handle twice this amount of work without any communication problems.
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1015358 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1015383 - Posted: 13 Jul 2010, 15:45:26 UTC

I am still in Eternal High Priority Mode but have some of those Aug 4 units waiting to run. Will see if they send me back to 100 hours when another goes through. It happened last night, just like I thought it would.
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1015383 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1015386 - Posted: 13 Jul 2010, 15:58:04 UTC - in response to Message 1015383.  

Bet it won't get there.
7/13/2010 11:56:12 AM SETI@home Scheduler request failed: Couldn't connect to server
Looks like we are down for the count until Friday.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1015386 · Report as offensive
Profile S@NL - eFMer - efmer.com/boinc
Volunteer tester
Joined: 7 Jun 99
Posts: 512
Credit: 148,746,305
RAC: 0
United States
Message 1015534 - Posted: 14 Jul 2010, 7:28:36 UTC - in response to Message 1015386.  

V 0.8

Add: Settings tab: Option, start at login.
Change: When you try to start the program a second time, it shows the dialog of the copy that is already running.
Fixed: Expert setting "Limit rsc_fpops_bound" sometimes failed to update a workunit.

TThrottle Control your temperatures. BoincTasks The best way to view BOINC. Anza Borrego Desert hiking.
ID: 1015534 · Report as offensive
Profile [DPC] hansR Project Donor
Volunteer tester
Joined: 14 Jul 00
Posts: 47
Credit: 235,829,569
RAC: 8
Netherlands
Message 1015536 - Posted: 14 Jul 2010, 8:40:35 UTC - in response to Message 1015534.  

Tested the "Limit rsc_fpops_bound" setting. Now works OK. Thanks!
ID: 1015536 · Report as offensive
Profile S@NL - eFMer - efmer.com/boinc
Volunteer tester
Joined: 7 Jun 99
Posts: 512
Credit: 148,746,305
RAC: 0
United States
Message 1016211 - Posted: 16 Jul 2010, 14:19:12 UTC - in response to Message 1015536.  
Last modified: 16 Jul 2010, 14:19:38 UTC

V 0.9

Changed: Now updates rsc_fpops_bound on running tasks.
Changed: When moving from CPU <-> GPU, the est and bound values will be the calculated ones not the server values.
Add: Move work from GPU to CPU.
Add: Expert tab: Read config button, will read the config.xml again.
Add: Config.xml SETI settings: <est_ratio_cpu_min> <est_ratio_cpu_max> <est_ratio_gpu_min> <est_ratio_gpu_max> These can be used to limit the range of the CPU and GPU ratio on server and calculated work.

This is an example of the config.xml file I now use.

<config>
<seti>
<dcf_min>0.8</dcf_min>
<dcf_max>1.2</dcf_max>
<est_ratio_cpu_min>4</est_ratio_cpu_min>
<est_ratio_cpu_max>4</est_ratio_cpu_max>
<est_ratio_gpu_min>5</est_ratio_gpu_min>
<est_ratio_gpu_max>5</est_ratio_gpu_max>
</seti>
</config>

I use this to get the DCF as close as possible to 1.0, instead of the 0.1 I had before. Now: Duration correction factor 0.9964010000
The DCF is allowed to go from 0.8 to 1.2.
A higher ratio value means shorter runtimes, i.e. a faster computer.
The min and max values are the same, which in effect fixes the ratio at that value.
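
To illustrate how min/max pairs like these can pin a value, here is a minimal Python sketch. It is not the rescheduler's own code: the helper names `load_seti_limits` and `clamp` are hypothetical, and the idea that the tool simply clamps a calculated value into the configured range is an assumption based on the description above.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# The example config.xml from the post, embedded so the sketch is self-contained.
EXAMPLE_CONFIG = """
<config>
<seti>
<dcf_min>0.8</dcf_min>
<dcf_max>1.2</dcf_max>
<est_ratio_cpu_min>4</est_ratio_cpu_min>
<est_ratio_cpu_max>4</est_ratio_cpu_max>
<est_ratio_gpu_min>5</est_ratio_gpu_min>
<est_ratio_gpu_max>5</est_ratio_gpu_max>
</seti>
</config>
"""


def load_seti_limits(xml_text: str) -> Dict[str, float]:
    """Return every numeric setting under <seti> as {tag: value} (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    seti = root.find("seti")
    if seti is None:
        return {}
    return {child.tag: float(child.text) for child in seti if child.text}


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


limits = load_seti_limits(EXAMPLE_CONFIG)

# Assumption: an out-of-range DCF (0.1 was the old value) is pulled up to dcf_min.
limited_dcf = clamp(0.1, limits["dcf_min"], limits["dcf_max"])

# With min == max, any calculated ratio collapses to the configured value.
cpu_ratio = clamp(2.5, limits["est_ratio_cpu_min"], limits["est_ratio_cpu_max"])

print(limited_dcf, cpu_ratio)
```

Setting min and max to the same number is what turns a range limit into a fixed setting: whatever ratio is calculated, the clamp returns the configured value.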
TThrottle Control your temperatures. BoincTasks The best way to view BOINC. Anza Borrego Desert hiking.
ID: 1016211 · Report as offensive
Profile [DPC] hansR Project Donor
Volunteer tester
Joined: 14 Jul 00
Posts: 47
Credit: 235,829,569
RAC: 8
Netherlands
Message 1017516 - Posted: 19 Jul 2010, 7:14:56 UTC - in response to Message 1016211.  

Problem on Windows XP running as a service: after the restart, all CPU jobs are erroring out, while CUDA just starts running. No info in the BOINC log (stdoutdae.txt/old)?
ID: 1017516 · Report as offensive
Profile S@NL - eFMer - efmer.com/boinc
Volunteer tester
Joined: 7 Jun 99
Posts: 512
Credit: 148,746,305
RAC: 0
United States
Message 1017518 - Posted: 19 Jul 2010, 7:31:09 UTC - in response to Message 1017516.  

Running as a service?
That has not been the preferred mode with a GPU for some time.
And I don't have any computer running BOINC as a service, so I can't test it.

You don't, by any chance, have a copy of the BOINC folder from before this happened that you could send me?
What CPU and GPU versions are you running?

Errors of this kind should show up in the regular logging.



TThrottle Control your temperatures. BoincTasks The best way to view BOINC. Anza Borrego Desert hiking.
ID: 1017518 · Report as offensive
Profile [DPC] hansR Project Donor
Volunteer tester
Joined: 14 Jul 00
Posts: 47
Credit: 235,829,569
RAC: 8
Netherlands
Message 1017519 - Posted: 19 Jul 2010, 7:43:15 UTC - in response to Message 1017518.  
Last modified: 19 Jul 2010, 7:58:40 UTC

I know, but running as a service under XP is my preferred configuration for all my Windows XP systems. Only 1 system has a CUDA GPU and this works perfectly. The problem exists only with Vista and W7.

No copy of the BOINC folder, because it just happened unexpectedly.

Really nothing in the logging. Maybe BOINC was still running as a service and there was no real restart of BOINC, or a second copy was started? BOINC Manager logged something in its message window about missing DLLs or so, but this didn't show up in the log files.

Have no more info available. Lost all my CPU jobs, and not getting new jobs :-(

Example of one of the jobs: http://setiathome.berkeley.edu/result.php?resultid=1661259358
ID: 1017519 · Report as offensive
Profile [DPC] hansR Project Donor
Volunteer tester
Joined: 14 Jul 00
Posts: 47
Credit: 235,829,569
RAC: 8
Netherlands
Message 1017526 - Posted: 19 Jul 2010, 8:37:03 UTC - in response to Message 1017519.  

Have changed the BOINC installation to not run as a service. The outage is coming soon and this system needs a lot of jobs ;-)
ID: 1017526 · Report as offensive
Profile S@NL - eFMer - efmer.com/boinc
Volunteer tester
Joined: 7 Jun 99
Posts: 512
Credit: 148,746,305
RAC: 0
United States
Message 1017529 - Posted: 19 Jul 2010, 9:47:48 UTC - in response to Message 1017526.  

Have changed the BOINC installation to not run as a service. The outage is coming soon and this system needs a lot of jobs ;-)

You could try moving work over from the GPU to the CPU.
Start with a small number; this should get you GPU work as well.

TThrottle Control your temperatures. BoincTasks The best way to view BOINC. Anza Borrego Desert hiking.
ID: 1017529 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1017541 - Posted: 19 Jul 2010, 11:42:09 UTC

Looks as though the reschedulers are still needed; I have over 100 VLARs on the GPUs so far.
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1017541 · Report as offensive
Profile Hellsheep
Volunteer tester
Joined: 12 Sep 08
Posts: 428
Credit: 784,780
RAC: 0
Australia
Message 1017543 - Posted: 19 Jul 2010, 11:46:41 UTC
Last modified: 19 Jul 2010, 11:47:25 UTC

I believe the reason running as a service under Win 7 or Vista doesn't work for GPU crunching is the way the drivers run under those OSes, something to do with WDDM. I think Joe posted a bit about this; I'll have a look and see if I can find the thread.

Also, the VLAR change has only taken effect on Beta as far as I know. If all goes well at Beta, it will be released here soon.
- Jarryd
ID: 1017543 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 1017545 - Posted: 19 Jul 2010, 11:48:46 UTC - in response to Message 1017543.  

I believe the reason running as a service under Win 7 or Vista doesn't work for GPU crunching is the way the drivers run under those OSes, something to do with WDDM. I think Joe posted a bit about this; I'll have a look and see if I can find the thread.

Also, the VLAR change has only taken effect on Beta as far as I know. If all goes well at Beta, it will be released here soon.

Cool, I wasn't sure. Downloads are working this AM. Had to set to no new work... YAY
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 1017545 · Report as offensive
Profile S@NL - eFMer - efmer.com/boinc
Volunteer tester
Joined: 7 Jun 99
Posts: 512
Credit: 148,746,305
RAC: 0
United States
Message 1017547 - Posted: 19 Jul 2010, 11:58:04 UTC - in response to Message 1017545.  

V 1.0

Add: A balloon text when the BOINC client has not been running for 3 minutes.
Changed: Don't use the mutex to check if the BOINC client is running.
Changed: Check if the BOINC client is running just before the copy of the new client state file.
Fixed: The est_ratio_xpu min and max values were reversed.
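
The "check just before the copy" change can be sketched as follows. This is a minimal Python illustration, not the rescheduler's actual implementation: `install_state_file` is a hypothetical name, the backup step is an added precaution that the changelog does not mention, and the way the running client is detected is deliberately left to a caller-supplied function because the post only says the mutex is no longer used.

```python
import shutil
from pathlib import Path
from typing import Callable


def install_state_file(
    new_state: Path,
    live_state: Path,
    client_is_running: Callable[[], bool],
) -> bool:
    """Copy new_state over live_state only if the client is stopped.

    Returns True when the copy was made, False when it was skipped.
    client_is_running is a caller-supplied check (assumption: the real
    tool uses some process-level test instead of the mutex).
    """
    # Check as late as possible, right before touching the live file,
    # so a client that restarted in the meantime is still detected.
    if client_is_running():
        return False

    # Keep a backup of the current state file (an extra safety step
    # assumed here, not something the changelog describes).
    if live_state.exists():
        backup = live_state.with_name(live_state.name + ".bak")
        shutil.copy2(live_state, backup)

    shutil.copy2(new_state, live_state)
    return True
```

Doing the check immediately before the copy narrows the window in which a restarted client could be overwritten, which is the situation the error in the first post of this page reported only after the file had already been copied.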


TThrottle Control your temperatures. BoincTasks The best way to view BOINC. Anza Borrego Desert hiking.
ID: 1017547 · Report as offensive