Don't know where it should go? Stick it here!

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 2043219 - Posted: 5 Apr 2020, 21:57:25 UTC - in response to Message 2043210.  

50% of the load hasn't been removed.

Each time I asked in the past how to measure the load on the DB, someone told me: queries/second.

It was in the range of 1200-1500 and is now at 508, so by my math the number is down by even more than 50%.
Sorry, I thought you were referring to the number of Tasks.
Yes, the number of queries per second is down, but much of that load would have been from the splitting of new work and the high return rate of completed work. The fact is the database is still bloated, and even though there are more CPU resources, the I/O bottleneck remains, so the Assimilation backlog remains, so everything remains backlogged.
But slowly, ever so very slowly, the backlog is clearing.
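For anyone wanting to check the queries/second arithmetic quoted above, here is a minimal Python sketch. The function name is purely illustrative, and it only measures the drop in query rate; it says nothing about the I/O bottleneck.

```python
def load_reduction(before_qps: float, after_qps: float) -> float:
    """Return the fractional drop in queries/second (0.5 means 50% lower)."""
    return (before_qps - after_qps) / before_qps


# Figures quoted in the thread: 1200-1500 queries/second before, 508 now.
for before in (1200, 1500):
    drop = load_reduction(before, 508)
    print(f"{before} -> 508 queries/s: {drop:.1%} lower")
```

Either end of the quoted range gives a drop of more than half in query rate, which is consistent with both points made above: the query load is well down, but that alone does not clear the backlog.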
Grant
Darwin NT
ID: 2043219
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2043251 - Posted: 6 Apr 2020, 2:21:53 UTC
Last modified: 6 Apr 2020, 2:25:52 UTC

since the load on the servers is much less now, the assimilators can actually make some progress. that's why you see the numbers finally on a steady decline for once.

it will go slow for a while, and speed up as more and more resources are freed up. the sink is finally draining faster than it is being filled (mainly because the faucet was reduced from full open to just a steady trickle)



note how the validation (results) queue is decreasing at a faster rate than the assimilation (workunits) queue. for every entry removed from the assimilation queue, ~2.2 entries are removed from the validation queue. like we've been saying all along.
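a rough Python sketch of the sink-and-faucet picture, for illustration only. the ~2.2 ratio is the one quoted above; the function names and any rates you feed in are hypothetical, not measured server figures.

```python
RESULTS_PER_WORKUNIT = 2.2  # approximate ratio quoted in the post


def results_cleared(workunits_assimilated: int) -> float:
    """Estimate validation-queue entries removed for a given number of workunits."""
    return workunits_assimilated * RESULTS_PER_WORKUNIT


def hours_to_drain(backlog: float, inflow_per_hour: float, outflow_per_hour: float) -> float:
    """Hours until the backlog is empty; infinite if the sink fills as fast as it drains."""
    net = outflow_per_hour - inflow_per_hour
    if net <= 0:
        return float("inf")
    return backlog / net


# Hypothetical numbers, only to show the shape of the calculation.
print(results_cleared(1000))
print(hours_to_drain(backlog=1_000_000, inflow_per_hour=5_000, outflow_per_hour=25_000))
```

the point of the second function is that the backlog only shrinks while outflow exceeds inflow, and it shrinks faster as the gap widens.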
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2043251
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 2043259 - Posted: 6 Apr 2020, 3:53:20 UTC
Last modified: 6 Apr 2020, 3:54:56 UTC

It may have taken a PhD to write BOINC, but it shouldn't require one to configure it.
This thing load-shares like three teenagers and a dog in the back seat of the car.
Just sayin' ...
ID: 2043259
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 2043262 - Posted: 6 Apr 2020, 4:22:01 UTC - in response to Message 2043259.  

It may have taken a PhD to write BOINC, but it shouldn't require one to configure it.
This thing load-shares like three teenagers and a dog in the back seat of the car.
Just sayin' ...
I suspect that when developing it they had no idea of the ways in which people would try to get things to run. Many of the options can conflict with other options. Combine that with Anonymous Platform support, the way Resource sharing is determined, variable processing times, hugely varying deadlines and hardware requirements, caches ranging from incredibly small to ridiculously huge, and people using ncpus & other such settings in ways the developers never imagined in their wildest dreams (worst nightmares?), and the result is things not happening in quite the way people might expect them to, and certainly not within their expected time frame, which generally seems to be immediately, if not retrospectively.
Grant
Darwin NT
ID: 2043262
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 2043278 - Posted: 6 Apr 2020, 6:21:21 UTC - in response to Message 2043262.  

It may have taken a PhD to write BOINC, but it shouldn't require one to configure it.
This thing load-shares like three teenagers and a dog in the back seat of the car.
Just sayin' ...
I suspect that when developing it they had no idea of the ways in which people would try to get things to run. Many of the options can conflict with other options. Combine that with Anonymous Platform support, the way Resource sharing is determined, variable processing times, hugely varying deadlines and hardware requirements, caches ranging from incredibly small to ridiculously huge, and people using ncpus & other such settings in ways the developers never imagined in their wildest dreams (worst nightmares?), and the result is things not happening in quite the way people might expect them to, and certainly not within their expected time frame, which generally seems to be immediately, if not retrospectively.

As I said, like 3 teenagers and a dog. :)
ID: 2043278
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 2043428 - Posted: 7 Apr 2020, 0:55:44 UTC

It looks like I got
11 MB's on April 2.
19 MB's on the 3rd.
1 on the 4th
1 on the 5th
3 on the 6th (so far).

Tom
A proud member of the OFA (Old Farts Association).
ID: 2043428
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 2043731 - Posted: 8 Apr 2020, 22:54:57 UTC

BOINC v7.16.5/.6 has been released, but looking at the release notes I can't see any mention of a "Finish file present too long" fix.

Changes in 7.16
If output file is missing on startup, flag task as error.
Let project specify directories in logical file names.
Fix security vulnerability involving logical file names.
Make "reread config files" work for ncpus.
Support fetch of files over GUI RPC; allow projects to supply their own web-based GUI.
FreeBSD: check for AVX
Support GUI RPCs as HTTP Post requests.
Register user consent to terms of use.
Enable "Other options" in simple view if no client connected.
Clear "vm_extensions_disabled" flag on startup.
Fix work fetch bug when max_concurrent used.
Unsuspend jobs before telling them to quit.
Sanity-check job runtime limits.
Fix overflow in OpenCL GPU FLOPS calculation.
Windows: show processor group info at startup
Fix stall if --skip_cpu_benchmarks
Fix crash in RSS feed fetch
Windows: fix GUI RPC password generation when running in a VM
Windows: make --dir work

Grant
Darwin NT
ID: 2043731
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2043736 - Posted: 8 Apr 2020, 23:46:42 UTC - in response to Message 2043731.  

https://github.com/BOINC/boinc/pull/3019/commits
JuhaSointusalo merged 1 commit into master from dpa_finish_file on Mar 30, 2019.
Since BOINC 7.16.5 was built from master, the fix was included.
master (#3019) server_release/1.2/1.2.1 ... client_release/7.16/7.16.1
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2043736
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 2043738 - Posted: 8 Apr 2020, 23:58:00 UTC - in response to Message 2043736.  

https://github.com/BOINC/boinc/pull/3019/commits
JuhaSointusalo merged 1 commit into master from dpa_finish_file on Mar 30, 2019.
Since BOINC 7.16.5 was built from master, the fix was included.
master (#3019) server_release/1.2/1.2.1 ... client_release/7.16/7.16.1
So it got rolled into the "If output file is missing on startup, flag task as error." fix?
Grant
Darwin NT
ID: 2043738
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 2043742 - Posted: 9 Apr 2020, 0:25:49 UTC

Has the server backoff after a request gone up from 10min to 30min now?
It's 10min since my last request, and the Project Status column is showing 20min until the next one.
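The inference here is just elapsed time plus remaining time. A minimal Python sketch of that sum, with a made-up function name and only the figures from this post:

```python
from datetime import timedelta


def inferred_backoff(since_last_request: timedelta, until_next_request: timedelta) -> timedelta:
    """Total backoff implied by the time already waited plus the time still shown."""
    return since_last_request + until_next_request


# 10 minutes already elapsed, 20 minutes still showing in the Project Status column.
print(inferred_backoff(timedelta(minutes=10), timedelta(minutes=20)))  # 0:30:00
```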
Grant
Darwin NT
ID: 2043742
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51527
Credit: 1,018,363,574
RAC: 1,004
United States
Message 2043744 - Posted: 9 Apr 2020, 0:31:41 UTC - in response to Message 2043742.  

Just did one, and got the 10 minute server backoff.

Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 2043744
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 2043746 - Posted: 9 Apr 2020, 0:36:58 UTC - in response to Message 2043744.  

Just did one, and got the 10 minute server backoff.

Meow.

Oh well, 7min to go, we'll see what happens next.
Grant
Darwin NT
ID: 2043746
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 2043749 - Posted: 9 Apr 2020, 0:44:59 UTC - in response to Message 2043746.  
Last modified: 9 Apr 2020, 0:45:20 UTC

Just did one, and got the 10 minute server backoff.

Meow.
Oh well, 7min to go, we'll see what happens next.
Another 30min delay after the next request.
Better than the 1hr+ delays the Manager starts to do.
Grant
Darwin NT
ID: 2043749
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51527
Credit: 1,018,363,574
RAC: 1,004
United States
Message 2043753 - Posted: 9 Apr 2020, 1:11:35 UTC - in response to Message 2043749.  

There must have been a change made serverside.
I am getting a 30 minute backoff now too.

Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 2043753
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2043762 - Posted: 9 Apr 2020, 1:31:12 UTC - in response to Message 2043738.  

https://github.com/BOINC/boinc/pull/3019/commits
JuhaSointusalo merged 1 commit into master from dpa_finish_file on Mar 30, 2019.
Since BOINC 7.16.5 was built from master, the fix was included.
master (#3019) server_release/1.2/1.2.1 ... client_release/7.16/7.16.1
So it got rolled into the "If output file is missing on startup, flag task as error." fix?


I guess so. That is what actually happens with a "finish file present too long" error. They just dumbed down the description instead of tying it directly to the error description. This is DA's description of the error and the fix in the pull request.


When an app finishes, it writes a "finish file",
which ensures the client that the app really finished.

If the app process is still there N seconds after the finish file appears,
the client assumes that something went wrong, and it aborts the job.

Previously N was 10.
This was too small during periods of heavy paging.
I increased it to 300.

It has been pointed out that if the app creates the finish file,
and its output files are present,
it should be treated as successful regardless of whether it exits.
This is probably true, but right now we don't have a mechanism
for killing a job and marking it as success.
The longer timeout makes this moot.
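
The rule described above can be sketched in a few lines of Python. This is only an illustration of the description in the pull request, not the BOINC client's actual code; the function and constant names are made up, and only the values 10 and 300 come from the text.

```python
OLD_TIMEOUT_S = 10   # previous value of N, per the description above
NEW_TIMEOUT_S = 300  # value of N after the change


def should_abort(finish_file_age_s: float, process_alive: bool,
                 timeout_s: float = NEW_TIMEOUT_S) -> bool:
    """True if the app is still running more than timeout_s after its finish file appeared."""
    return process_alive and finish_file_age_s > timeout_s


# Hypothetical case: under heavy paging the app needs 45 s to exit after writing the file.
print(should_abort(45, True, OLD_TIMEOUT_S))  # True: aborted under the old limit
print(should_abort(45, True, NEW_TIMEOUT_S))  # False: tolerated under the new limit
```

With the longer timeout, a slow exit no longer trips the check, which is why the "finish file present too long" aborts should become rare even though the job is still not marked successful until the process exits.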

Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2043762
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 2043765 - Posted: 9 Apr 2020, 1:51:44 UTC - in response to Message 2043762.  
Last modified: 9 Apr 2020, 1:53:16 UTC

I guess so. That is what actually happens with a "finish file present too long" error. They just dumbed down the description instead of tying it directly to the error description. This is DA's description of the error and the fix in the pull request.

When an app finishes, it writes a "finish file",
which ensures the client that the app really finished.

If the app process is still there N seconds after the finish file appears,
the client assumes that something went wrong, and it aborts the job.
Yeah, I saw his description.
And what he describes (which is the "Finish file present too long" problem) is different to "If output file is missing on startup, flag task as error."
The second one is "if a file is missing, then an error has occurred"; the first one is "if a process is still going a certain time after the file is written, the Task is aborted". To me they are two different issues: in one a file is missing, in the other the file is there. One is the result of an error, the other results in an error.


Will just have to see if any "Finish file present too long" errors show up or not on systems with the new Manager.
Grant
Darwin NT
ID: 2043765
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 2043779 - Posted: 9 Apr 2020, 2:27:29 UTC

Just got notified of a Boinc Manager update for my Windows box. Downloaded the VM version. Maybe I will explore running Cosmology@Home which is a VM only project on it.

Tom M
A proud member of the OFA (Old Farts Association).
ID: 2043779
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 2043791 - Posted: 9 Apr 2020, 5:50:53 UTC - in response to Message 2043779.  

Just got notified of a Boinc Manager update for my Windows box. Downloaded the VM version. Maybe I will explore running Cosmology@Home which is a VM only project on it.

Tom M

Just FYI, if it's 7.16.5 x64 you've updated to, be advised I've been having nothing but trouble with that on my Win box with Einstein jobs, and after 2 tries I just fell back to 7.14.2. It seems CPU jobs just halt. It also apparently trashed the history, to the point where it's saying I reached my max of 480 tasks per day, and it threw me into a 24 hour project back-off.
ID: 2043791
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2043801 - Posted: 9 Apr 2020, 8:32:24 UTC - in response to Message 2043765.  

Will just have to see if any "Finish file present too long" errors show up or not on systems with the new Manager.

I've been running the 7.16 branch clients for a long time now. Haven't seen a finish file present error since.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2043801
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 2043814 - Posted: 9 Apr 2020, 10:28:41 UTC - in response to Message 2043791.  

Just got notified of a Boinc Manager update for my Windows box. Downloaded the VM version. Maybe I will explore running Cosmology@Home which is a VM only project on it.

Tom M

Just FYI, if it's 7.16.5 x64 you've updated to, be advised I've been having nothing but trouble with that on my Win box with Einstein jobs, and after 2 tries I just fell back to 7.14.2. It seems CPU jobs just halt. It also apparently trashed the history, to the point where it's saying I reached my max of 480 tasks per day, and it threw me into a 24 hour project back-off.


My Windows box is mostly not running, and except for S@H it's not set to allow any new tasks, so I can't offer any opinion. Maybe over the Easter weekend I will "play" with it.
I am interested to see whether, with the VM installed, using a VM-based project becomes "turn key".

Tom M
A proud member of the OFA (Old Farts Association).
ID: 2043814


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.