BOINC 7 - problems, solutions and tips

Message boards : Number crunching : BOINC 7 - problems, solutions and tips
Profile Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1188531 - Posted: 26 Jan 2012, 13:35:19 UTC

On 23 January, I successfully upgraded from 6.12.34 -> 7.0.8. Since that time the scheduler has not requested any new tasks. On 1/24 it automatically reported 204 tasks, 1/25 it automatically reported 211 tasks. Today I manually requested an update and reported 150 completed tasks, expecting to receive a number of tasks, but the BM requested none. I am down to approx. 4 days of work left on this machine. Prior to the upgrade, the system normally kept at least 10 days in the queue. There are three projects on the machine (SETI, Milkyway & Einstein), with only SETI allowed to get tasks.

Is there something on the server side that is severely restricting machines from getting a full allotment of work when they run 7.xx.xx versions? The machine, A-SYS, is running W7/64-bit, 4 GB RAM, Lunatics v.039, (1) EVGA GTX460 SE, (1) EVGA GTS250.




I don't buy computers, I build them!!
ID: 1188531 · Report as offensive
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1188533 - Posted: 26 Jan 2012, 13:43:20 UTC

Please post here if you are running BOINC 7 (Alpha!) and are having issues.
The scheduling in BOINC 7 has been revamped, so you may be seeing the effects of that.
Also, if you encounter bugs, they need to be fed back to the devs.
If you are running BOINC alpha clients you should be reading the mailing list!
ID: 1188533 · Report as offensive
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1188537 - Posted: 26 Jan 2012, 13:51:58 UTC

Quoting from a different thread:

Work fetch is initiated when the calculated remaining work drops below the 'connect every' setting; the client then asks for as much work as specified in the 'additional work' setting. It asks the project whose debt entry is the smallest (calculated as work done recently compared to resource share). If no work is available from that project, it asks the project with the next larger debt, and so on until it gets work.
With SETI this has the usual problem that it is a matter of luck whether a request gets work, so, as with BOINC 6, you end up getting more work from the other projects. SETI eventually rises in priority until you get lucky.

With BOINC 7, the cache needs to be set quite differently from previous versions to get the same effect.
Because it waits until the 'connect every' minimum level is reached before it asks for work (instead of the frequent top-ups of previous versions), if you want to be sure of always having 3 days of work on the machine, you need to set 'connect every' to 3. If you want it to ask for work often (to get that elusive SETI task), you should set 'additional' to a small value.
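
The work fetch rule described above can be sketched roughly as follows. This is only an illustration of the description in this post, not the BOINC client's actual code: the `Project` class, the `has_work` flag and both function names are made up for the example, and the request size simply follows the statement that the client asks for the 'additional work' amount.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Project:
    name: str
    debt: float      # smaller value = asked first, per the description above
    has_work: bool   # hypothetical stand-in for "the server sent tasks"


def work_request_days(buffered_days: float, connect_every_days: float,
                      additional_days: float) -> float:
    """Return the days of work to ask for, or 0.0 if no fetch is triggered."""
    # No request while the buffer is still at or above the 'connect every' level.
    if buffered_days >= connect_every_days:
        return 0.0
    # Once below that level, ask for the 'additional work' amount.
    return additional_days


def fetch_work(buffered_days: float, connect_every_days: float,
               additional_days: float,
               projects: List[Project]) -> Optional[Tuple[str, float]]:
    """Return (project name, days requested), or None if nothing is fetched."""
    request = work_request_days(buffered_days, connect_every_days,
                                additional_days)
    if request <= 0.0:
        return None
    # Try the project with the smallest debt first, then the next larger.
    for project in sorted(projects, key=lambda p: p.debt):
        if project.has_work:
            return project.name, request
    return None
```

With about 4 days of work buffered and a hypothetical 'connect every' of 3 days, `fetch_work` returns `None`, which would match a client that reports tasks but requests nothing.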


@Cliff If you can only fetch from SETI, you probably have more cache left (over all projects) than whatever your 'connect every' setting is.
ID: 1188537 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1188602 - Posted: 26 Jan 2012, 18:24:47 UTC

Hi, I think the scheduler is great except when running too many projects...
It seems like some projects get behind when I ran more than 4 projects on 1 ATI GPU. The fifth project never started... Maybe I should have waited more than 1 week.

I now run some SETI Beta on my ATI cards (5850 + 4850) and I haven't set the flops in app_info.xml, so the tasks have very long runtimes when they start, and I only have 2 tasks in the queue.
But scheduling works very nicely, and when 1 task only has a few hours left it gets a new task. Great!!!

A not strictly necessary wish would be to let more projects run at the same time (alternating and running).

When it works now it "feels" very good for me as a user.

No more suspend and NNT in boincmanager.
TRuEQ & TuVaLu
ID: 1188602 · Report as offensive
Profile Piotr Kunkel
Volunteer tester

Joined: 7 Apr 00
Posts: 18
Credit: 19,385,083
RAC: 0
Poland
Message 1188816 - Posted: 27 Jan 2012, 9:48:15 UTC

Cliff Harding - try 7.02, it seems to be good.
For me, it's the only 7.xx which requests and downloads new tasks.
" Then they came for me and there was no one left to speak out for me."

Martin Niemöller
ID: 1188816 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1188837 - Posted: 27 Jan 2012, 12:47:04 UTC
Last modified: 27 Jan 2012, 12:47:43 UTC

Is there any chance that the message button (event log) can be put back next to the disk option in BoincManager?
TRuEQ & TuVaLu
ID: 1188837 · Report as offensive
Profile iwazaru
Volunteer tester
Joined: 31 Oct 99
Posts: 173
Credit: 509,430
RAC: 0
Greece
Message 1188845 - Posted: 27 Jan 2012, 13:37:53 UTC

Any idea why this WU resulted in an error? Apologies in advance if not a Boinc 7(.08) problem...

http://setiathome.berkeley.edu/workunit.php?wuid=916898344
ID: 1188845 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1188853 - Posted: 27 Jan 2012, 14:13:34 UTC - in response to Message 1188845.  

Any idea why this WU resulted in an error? Apologies in advance if not a Boinc 7(.08) problem...

http://setiathome.berkeley.edu/workunit.php?wuid=916898344

Well, the task details report

<message>
upload failure: <file_xfer_error>
  <file_name>18oc11aa.365.12101.9.10.62.vlar_0_0</file_name>
  <error_code>-131</error_code>
</file_xfer_error>

</message>

In other words, the task finished normally, but there was a communications glitch afterwards when it tried to send the outcome back to Berkeley.

It's possible that is related to BOINC v7.0.8, but unlikely - the preceding task uploaded properly, and that was with v7.0.8, too. How long have you been running that particular (experimental) version - in other words, how many tasks have you completed since the upgrade? If it's been OK for a while, and this is the first error, I'd be inclined to put it down to a random gremlin, and move on.

However, v7.0.8 is a bit old to still be testing - as the thread title now says, testing is now up to v7.0.12. That seems to be running well here, so you might feel inclined to upgrade and help us find a whole new set of bugs ;-)
ID: 1188853 · Report as offensive
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1188855 - Posted: 27 Jan 2012, 14:30:30 UTC - in response to Message 1188853.  

<error_code>-131</error_code>

In other words, the task finished normally, but there was a communications glitch afterwards when it tried to send the outcome back to Berkeley.

Error -131 means that the size of one of the output files was bigger than the maximum set by the project for upload. BOINC will not try to upload this file.

Perhaps something to do with those mega MBs?
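
For anyone who wants to pull the error code out of a task report like the one quoted above, a minimal sketch follows. It assumes the `<message>` block is well-formed XML as posted; the function name and the `KNOWN_CODES` table are made up for the example, and the only meaning listed for -131 is the one given in this post.

```python
import xml.etree.ElementTree as ET

# Meaning taken from the explanation in this thread; other codes are unmapped.
KNOWN_CODES = {-131: "output file larger than the project's upload limit"}

# The message block from the task details quoted earlier in the thread.
SAMPLE = """<message>
upload failure: <file_xfer_error>
  <file_name>18oc11aa.365.12101.9.10.62.vlar_0_0</file_name>
  <error_code>-131</error_code>
</file_xfer_error>

</message>"""


def describe_upload_failure(message_xml):
    """Return (file name, error code, description) for an upload failure."""
    root = ET.fromstring(message_xml)
    err = root.find("file_xfer_error")
    if err is None:
        raise ValueError("no <file_xfer_error> element found")
    name = (err.findtext("file_name") or "").strip()
    code_text = err.findtext("error_code")
    if code_text is None:
        raise ValueError("no <error_code> element found")
    code = int(code_text.strip())
    return name, code, KNOWN_CODES.get(code, "unknown")


print(describe_upload_failure(SAMPLE))
```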
ID: 1188855 · Report as offensive
Profile iwazaru
Volunteer tester
Joined: 31 Oct 99
Posts: 173
Credit: 509,430
RAC: 0
Greece
Message 1188857 - Posted: 27 Jan 2012, 14:38:45 UTC - in response to Message 1188853.  
Last modified: 27 Jan 2012, 14:50:23 UTC

...testing is now up to v7.0.12. That seems to be running well here, so you might feel inclined to upgrade and help us find a whole new set of bugs ;-)


Sure! Where does 7.0.12 live?:-) My Seti knowledge only takes me this far:
http://boinc.berkeley.edu/download_all.php

PS I've retired this 7 year old 73W Celeron for energy/efficiency reasons, but it never gave me any trouble, so it should be OK (keeping an eye on it, nevertheless). Just wanted to see what Boinc 7 looked like...

Edit: OK, I seem to have found the source of my confusion. The second post in the thread looks like it should have been the first. I've never seen that in the forums before... So what/where is Boinc Alpha?
ID: 1188857 · Report as offensive
Wembley
Volunteer tester
Joined: 16 Sep 09
Posts: 429
Credit: 1,844,293
RAC: 0
United States
Message 1188865 - Posted: 27 Jan 2012, 15:07:33 UTC - in response to Message 1188857.  


Sure! Where does 7.0.12 live?:-) My Seti knowledge only takes me this far:
http://boinc.berkeley.edu/download_all.php


Try here: http://boinc.berkeley.edu/dl/?C=M;O=D
ID: 1188865 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1188867 - Posted: 27 Jan 2012, 15:15:40 UTC - in response to Message 1188855.  

<error_code>-131</error_code>

In other words, the task finished normally, but there was a communications glitch afterwards when it tried to send the outcome back to Berkeley.

Error -131 means that the size of one of the output files was bigger than the maximum set by the project for upload. BOINC will not try to upload this file.

Perhaps something to do with those mega MBs?

It's something to watch out for when we crunch them, but I doubt it.

I've got several mega MBs now, but every single one of them is called 21jn11ac.5207... - the tasks produced by other splitter instances working on the same 21jn11ac 'tape' are normal size.

iwazaru's file is called 18oc11aa.365..., and has been successfully completed by the other user, so I think it must be a different issue.
ID: 1188867 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1188912 - Posted: 27 Jan 2012, 18:54:39 UTC

I gave up on getting my fifth project, Albert, to reach the same scheduler prio as the other 4 projects, which worked together as they should.

Now I am running 3 projects: SETI Beta, SETI and Albert.
The problem is that Albert is showing the same difference in scheduling prio as it did when I ran more projects.
Could this have to do with Albert in any way??
TRuEQ & TuVaLu
ID: 1188912 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1189357 - Posted: 28 Jan 2012, 10:06:39 UTC

Ok, now Albert is alternating with Milkyway, SETI and SETI Beta.

SETI is set to use only AP tasks, which are not available at the moment.
SETI Beta is set to NNT; it completed its last three tasks and then had 0 tasks in the queue.
Milkyway is running, and then Albert's prio for some reason got closer to the others. Maybe it just needed to get going, with its big difference in prio, to start functioning as it should.

SETI is still the project that should be running according to its prio, but Albert and Milkyway are in fact the ones that are alternating and running as they should, since SETI AP tasks are not available.

It was like the prio calculation for Albert got stuck at a difference from the other projects. It did go up and down, but not enough to make Albert run.
TRuEQ & TuVaLu
ID: 1189357 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Avatar

Send message
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1189363 - Posted: 28 Jan 2012, 10:21:11 UTC

A tip.

POEM has like 0 or 1 task available from time to time for their GPU app.
My BM (7.0.11) doesn't seem to "click on update" when the "communication deferred" time has come to 0.
So, I don't get any tasks from time to time.
When I click on update a couple of times while it's not deferred, I usually get 1 or 2 tasks. Couldn't the BM do this??

Note:
POEM has a very short "communication deferred" time.
TRuEQ & TuVaLu
ID: 1189363 · Report as offensive
MarkJ Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 17 Feb 08
Posts: 1139
Credit: 80,854,192
RAC: 5
Australia
Message 1189368 - Posted: 28 Jan 2012, 10:38:05 UTC

There are issues around gpu work fetch with the current alpha versions (including 7.0.12). The BOINC alpha mailing list is where we discuss these things.
ID: 1189368 · Report as offensive
Profile TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1189387 - Posted: 28 Jan 2012, 12:07:26 UTC - in response to Message 1189368.  

There are issues around gpu work fetch with the current alpha versions (including 7.0.12). The BOINC alpha mailing list is where we discuss these things.


Ok, then I know you guys are working on it.

TRuEQ & TuVaLu
ID: 1189387 · Report as offensive



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.