Panic Mode On (109) Server Problems?

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1912397 - Posted: 11 Jan 2018, 21:07:39 UTC - in response to Message 1912344.  

My RAC was working its way up after a week of forced downtime. And now this....
Before long, I'll lose my gold badge on this mediocre machine, too.
A general crunch strike comes to mind... haha..

...Ghia...

Everyone else's RAC will be going down too, so you should keep your gold badge.


. . This is no time for being logical ... :)

Stephen

:)
ID: 1912397 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1912398 - Posted: 11 Jan 2018, 21:11:09 UTC - in response to Message 1912388.  

2 minor thoughts ...

And with only one type of work now, it will not take that long before we get negative credit/task.

1) Back before credit-new (a long time ago) I once got negative credit. It was "-10,675,007.77", so things could be worse.

2) If credit is assigned per flop ... they have to count flops ... slowing things down so we can be assigned a meaningless number ...

Ed F


. . They are counting FLOPs. How else do they determine the APR?

Stephen

??
ID: 1912398 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1912399 - Posted: 11 Jan 2018, 21:17:09 UTC - in response to Message 1912394.  

Having issues getting work on the Linux cruncher for the past half hour. Triple Update isn't working. Toggling preferences isn't working. If it goes on much longer I will try the ghost recovery protocol.


. . As usual, my two slower rigs are keeping topped up. It is only the "Big hitter" rig that is being punished with constant "project has no tasks available" responses. I guess that is just a symptom these days of hitting the servers so often. The server end needs an upgrade, methinks! So much for needing/wanting more volunteer crunching ... it would only result in greater and more frequent work starvation.

Stephen

:(
ID: 1912399 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22754
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1912401 - Posted: 11 Jan 2018, 21:25:26 UTC - in response to Message 1912398.  

APR is an "artifact" that is only required by CreditScrew - if one goes to a data-based credit system it becomes redundant.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1912401 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1912410 - Posted: 11 Jan 2018, 22:22:08 UTC - in response to Message 1912401.  

APR is an "artifact" that is only required by CreditScrew - if one goes to a data-based credit system it becomes redundant.
I was under the impression that APR was a determining factor in the task allotment calculation, in order to determine how many tasks it takes to fill the "Store at least nn days of work" requirement. (At least for those who aren't already maxing out the 100 + 100 limits.) Is that not correct?
ID: 1912410 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1912420 - Posted: 11 Jan 2018, 23:32:11 UTC - in response to Message 1912410.  

APR is an "artifact" that is only required by CreditScrew - if one goes to a data-based credit system it becomes redundant.
I was under the impression that APR was a determining factor in the task allotment calculation, in order to determine how many tasks it takes to fill the "Store at least nn days of work" requirement. (At least for those who aren't already maxing out the 100 + 100 limits.) Is that not correct?
Certainly. When work is needed, the client requests it as a given number of seconds. It's a pretty fundamental requirement that the server and the client work to the same conversion factor (i.e. speed) when calculating how many tasks to send to fill the number of seconds requested.

The client estimate is controlled by APR (directly, for stock applications; indirectly, by manipulating the task size, when running anonymous platform). And the server does the same, and it does work.

The one fly in the ointment is for those semi-upgraded projects which still allow their hosts to use DCF (though I haven't actually noticed one of those for a while). The servers don't consider DCF at all any more: client DCF is much more responsive to short-term changes in workunit performance than the replacement server code, which can lead to "interesting" runtime estimates and work fetch decisions.

Edit: pre-post correction: GPUGrid still allows DCF, but has APR as well.
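
To make the seconds-to-tasks conversion above concrete, here is a minimal Python sketch. It is an illustration only, not the actual BOINC scheduler code: the function names, the per-task FLOP estimate, and the sample numbers are all assumptions, and APR is treated here simply as a speed in GFLOPS.

```python
import math

# Hypothetical helpers for illustration; this is NOT BOINC's real scheduler logic.

def estimated_runtime_seconds(task_flops: float, apr_gflops: float) -> float:
    """Estimated runtime of one task from its FLOP estimate and an assumed APR."""
    return task_flops / (apr_gflops * 1e9)

def tasks_to_fill(request_seconds: float, task_flops: float, apr_gflops: float) -> int:
    """Number of tasks needed to cover a work request expressed in seconds."""
    per_task = estimated_runtime_seconds(task_flops, apr_gflops)
    return math.ceil(request_seconds / per_task)

# Assumed sample values: a one-day request and a 30 TFLOP task estimate.
request = 86_400
task_flops = 30e12

# Server sizes the reply using a speed of 100 GFLOPS.
server_count = tasks_to_fill(request, task_flops, apr_gflops=100)

# If the client believed the host ran at only 50 GFLOPS, the same tasks
# would look like this many seconds of work on the client side.
client_view = server_count * estimated_runtime_seconds(task_flops, apr_gflops=50)

print(server_count, client_view)
```

With these assumed numbers, both sides agree on one day of work only when they use the same speed; with the client's speed set to half the server's, the same batch looks like two days of work, which is the kind of mismatch that produces odd work fetch behaviour.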
ID: 1912420 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1912430 - Posted: 12 Jan 2018, 0:15:44 UTC - in response to Message 1912410.  

APR is an "artifact" that is only required by CreditScrew - if one goes to a data-based credit system it becomes redundant.
I was under the impression that APR was a determining factor in the task allotment calculation, in order to determine how many tasks it takes to fill the "Store at least nn days of work" requirement. (At least for those who aren't already maxing out the 100 + 100 limits.) Is that not correct?


. . That is my understanding too.

Stephen

??
ID: 1912430 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1912431 - Posted: 12 Jan 2018, 0:17:37 UTC - in response to Message 1912420.  

Thanks for that confirmation, Richard. I was pretty sure it wasn't just the credit calculation that needed the APR, but would have been hard-pressed to verify that assumption myself. ;^)
ID: 1912431 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1912458 - Posted: 12 Jan 2018, 4:19:34 UTC - in response to Message 1912420.  


The one fly in the ointment is for those semi-upgraded projects which still allow their hosts to use DCF (though I haven't actually noticed one of those for a while). The servers don't consider DCF at all any more: client DCF is much more responsive to short-term changes in workunit performance than the replacement server code, which can lead to "interesting" runtime estimates and work fetch decisions.

Edit: pre-post correction: GPUGrid still allows DCF, but has APR as well.


The worst offender is Einstein which still uses DCF and DOESN'T use APR.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1912458 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1912461 - Posted: 12 Jan 2018, 4:26:56 UTC

I still am unable to keep my cache filled on the Linux machine. Down a hundred GPU tasks. Nothing but "no work is available" messages on work requests.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1912461 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11451
Credit: 29,581,041
RAC: 66
United States
Message 1912462 - Posted: 12 Jan 2018, 4:31:23 UTC - in response to Message 1912458.  

The worst offender is Einstein which still uses DCF and DOESN'T use APR.

Yeah, but Einstein works well.
ID: 1912462 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1912463 - Posted: 12 Jan 2018, 4:37:11 UTC - in response to Message 1912462.  

The worst offender is Einstein which still uses DCF and DOESN'T use APR.

Yeah, but Einstein works well.

I would not agree with that statement at all ..... not in a month of Sundays. Einstein sends WAY TOO MUCH WORK when requested, since its estimate of how much work was requested is completely out to lunch. I have to abort 2/3 of the tasks it sends me at every request.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1912463 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11451
Credit: 29,581,041
RAC: 66
United States
Message 1912464 - Posted: 12 Jan 2018, 4:46:51 UTC - in response to Message 1912463.  

I would not agree with that statement at all ..... not in a month of Sundays. Einstein sends WAY TOO MUCH WORK when requested, since its estimate of how much work was requested is completely out to lunch. I have to abort 2/3 of the tasks it sends me at every request.

That's BOINC averaging CPU run times with GPU run times. Since Einstein ran out of gravity waves a couple of weeks ago and I now run only SETI on the CPUs, there is no problem.
ID: 1912464 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1912467 - Posted: 12 Jan 2018, 4:54:04 UTC - in response to Message 1912464.  

I've never run CPU tasks at either Milkyway or Einstein. Just Seti ..... just recently, as an experiment, I ran some CPU tasks at GPUGrid.net.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1912467 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 1912472 - Posted: 12 Jan 2018, 5:10:53 UTC - in response to Message 1912463.  

The worst offender is Einstein which still uses DCF and DOESN'T use APR.

Yeah, but Einstein works well.

I would not agree with that statement at all ..... not in a month of Sundays. Einstein sends WAY TOO MUCH WORK when requested, since its estimate of how much work was requested is completely out to lunch. I have to abort 2/3 of the tasks it sends me at every request.

I would agree ...
ID: 1912472 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11451
Credit: 29,581,041
RAC: 66
United States
Message 1912477 - Posted: 12 Jan 2018, 5:31:04 UTC
Last modified: 12 Jan 2018, 5:31:33 UTC

I don't know what you guys are doing; the time estimates on both of my boxes with GTX1060s are very accurate. I run 2 at a time and they process at the rate of 4 an hour, so they estimate 30 min a task.
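
The arithmetic behind that 30-minute figure can be sketched in a couple of lines of Python. The function name is made up for illustration, and it assumes the concurrent tasks all take about the same time.

```python
def elapsed_minutes_per_task(concurrent_tasks: int, tasks_per_hour: float) -> float:
    """Wall-clock minutes each task runs when several share one GPU.

    Assumes all concurrent tasks take roughly the same time (illustrative only).
    """
    return concurrent_tasks * 60 / tasks_per_hour

# 2 tasks at a time, 4 completed per hour
per_task = elapsed_minutes_per_task(concurrent_tasks=2, tasks_per_hour=4)
print(per_task)
```

Each task sits on the GPU for 30 minutes of wall-clock time even though one finishes every 15 minutes on average, so a per-task estimate of 30 minutes matches what the host actually sees.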
ID: 1912477 · Report as offensive
Profile JaundicedEye
Joined: 14 Mar 12
Posts: 5375
Credit: 30,870,693
RAC: 1
United States
Message 1912479 - Posted: 12 Jan 2018, 5:49:01 UTC

I test-ran some Einstein work and it bogged my rig down so far it became totally unusable for anything else. That was during the great SETI outage of '16 and I haven't tried it since.

"Sour Grapes make a bitter Whine." <(0)>
ID: 1912479 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 1912482 - Posted: 12 Jan 2018, 5:55:38 UTC - in response to Message 1912431.  

I was pretty sure it wasn't just the credit calculation that needed the APR

And running more than 1 WU at a time results in a significantly reduced APR value, even though you might be doing more work per hour. And rescheduling does all sorts of things to APR values.
Grant
Darwin NT
ID: 1912482 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13918
Credit: 208,696,464
RAC: 304
Australia
Message 1912484 - Posted: 12 Jan 2018, 6:00:04 UTC - in response to Message 1912477.  

I don't know what you guys are doing; the time estimates on both of my boxes with GTX1060s are very accurate. I run 2 at a time and they process at the rate of 4 an hour, so they estimate 30 min a task.

But you get deadline issues on the CPU tasks?
Grant
Darwin NT
ID: 1912484 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1912493 - Posted: 12 Jan 2018, 6:31:10 UTC - in response to Message 1912461.  

I still am unable to keep my cache filled on the Linux machine. Down a hundred GPU tasks. Nothing but "no work is available" messages on work requests.


. . Have you "kicked" the server?

Stephen

??
ID: 1912493 · Report as offensive


 