Panic Mode On (113) Server Problems?

Message boards : Number crunching : Panic Mode On (113) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13761
Credit: 208,696,464
RAC: 304
Australia
Message 1963104 - Posted: 3 Nov 2018, 7:50:20 UTC - in response to Message 1963096.  

I thought it was on disk as well

It is, and that's the problem. Until it is read into memory, any access is extremely slow. If it doesn't all fit into memory, then it has to be read from (and written to) disk instead, which is orders of magnitude slower than memory, particularly for random reads and writes.
Grant
Darwin NT
ID: 1963104
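
A rough way to see why this hurts: the sketch below estimates the average cost of a random access when only part of the database is held in RAM. The latency figures are illustrative assumptions (about 100 ns for memory, about 10 ms for a spinning disk), not measurements from the project's servers, and the function names are made up for the example.

```python
# Assumed order-of-magnitude latencies, NOT measured on the SETI@home servers.
RAM_ACCESS_S = 100e-9   # ~100 ns per random memory access (assumption)
DISK_ACCESS_S = 10e-3   # ~10 ms per random access on a spinning disk (assumption)


def average_access_time(fraction_in_ram: float) -> float:
    """Average seconds per random access when only a fraction of the data is in RAM."""
    if not 0.0 <= fraction_in_ram <= 1.0:
        raise ValueError("fraction_in_ram must be between 0 and 1")
    return fraction_in_ram * RAM_ACCESS_S + (1.0 - fraction_in_ram) * DISK_ACCESS_S


def slowdown_vs_ram(fraction_in_ram: float) -> float:
    """How many times slower the average access is than a pure in-memory access."""
    return average_access_time(fraction_in_ram) / RAM_ACCESS_S


if __name__ == "__main__":
    for fraction in (1.0, 0.99, 0.9, 0.5, 0.0):
        print(f"{fraction:.0%} in RAM: about {slowdown_vs_ram(fraction):,.0f}x slower than RAM")
```

With these assumed numbers, even a 99% hit rate leaves the average access roughly a thousand times slower than working entirely from memory.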
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1963109 - Posted: 3 Nov 2018, 9:17:41 UTC - in response to Message 1963097.  

My oldest valid is from 18 July WU 3056429782
Apples and Oranges, I think...
Likewise, your oldest valid has only been valid for 6 hours, as that's when the wingman reported that would compare against two previous results to resolve an inconclusive.
Agreed. That one should be purged at 4 Nov 2018, 01:32 UTC - 24 hours after it was validated. I'll be asleep in bed, but if anyone wants to check, they can work out how long the backlog is at that time.
ID: 1963109
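
The purge arithmetic can be sketched as follows. The 24-hour delay is the figure quoted in the post; the validation time of 3 Nov 2018, 01:32 UTC is inferred from the stated purge time, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

PURGE_DELAY = timedelta(hours=24)  # delay between validation and purge, as quoted in the thread


def purge_due(validated_at: datetime) -> datetime:
    """Earliest time a validated task should disappear from the task list."""
    return validated_at + PURGE_DELAY


def purge_backlog(validated_at: datetime, now: datetime) -> timedelta:
    """How far behind the purger is if the task is still listed at `now`."""
    return max(now - purge_due(validated_at), timedelta(0))


if __name__ == "__main__":
    # Validation time inferred from the post (24 hours before the stated purge time).
    validated = datetime(2018, 11, 3, 1, 32, tzinfo=timezone.utc)
    print("Purge due:", purge_due(validated).isoformat())
    # Hypothetical check time: the task is still visible six hours after it was due.
    check = datetime(2018, 11, 4, 7, 32, tzinfo=timezone.utc)
    print("Backlog:", purge_backlog(validated, check))
```

If the task is still listed at, say, 07:32 UTC on 4 Nov, the purger would be about six hours behind.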
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963115 - Posted: 3 Nov 2018, 11:02:35 UTC - in response to Message 1963084.  
Last modified: 3 Nov 2018, 11:03:31 UTC

edit : ah my misunderstanding. thought you guys were talking about valids, not pendings

My oldest pending was returned 19 July. WU 3058299862

My oldest valid is from 18 July WU 3056429782


. . Sorry Keith but that valid is from the 3rd November 2018. Prior to that it was an inconclusive. Before that it was a pending because you had two dud wingmen that timed out. And back to my pet gripe .... funny how it keeps turning up.

Stephen

:(
ID: 1963115
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963117 - Posted: 3 Nov 2018, 11:08:39 UTC - in response to Message 1963096.  


If it can't fit into RAM where does it go?

. . Umm, it is on disk. That is what makes access to it slow ...
Stephen

I thought it was on disk as well


. . If it cannot fit into RAM it is ONLY on disk, ... hence slow ...

Stephen

. .
ID: 1963117
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963119 - Posted: 3 Nov 2018, 11:23:30 UTC - in response to Message 1963073.  
Last modified: 3 Nov 2018, 11:26:12 UTC

LOL. You only have 16 tasks outstanding in the database. I have 55,000 outstanding in the database.


Hey... be nice... my mac mini is proud of every one of the tasks it does. It may be old and slow but it has done a bunch of WUs over many years.

But I'm still not sure why you see more than 24 hours of valids? Maybe your machine spits them out so fast that you have a long list of pending... just waiting around on the partner machine (probably something slow like mine).

It wouldn't cost the db much to show me more than 24 hours of history.

edit : ah my misunderstanding. thought you guys were talking about valids, not pendings


. . In that case it is not a matter of purging the database as they will remain until the last lame duck wingman completes the task or times out and someone else does. And that could be years ... :( {as in where one dud wingman times out on a task only to have it pass to another delinquent host and then another ... etc}. Which brings me back to an old bugbear of mine about deadlines ... :)

Stephen

. . Message from earlier got stuck when my PC crashed (again) due to temp problem.

:(
ID: 1963119
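
To put a number on "that could be years", here is a minimal sketch of the worst case. It assumes every replacement task is issued the moment the previous wingman times out and carries the same deadline; the 8-week deadline is only an illustrative value in line with the "2 months" mentioned elsewhere in this thread, and the function name is hypothetical.

```python
from datetime import timedelta


def worst_case_wait(deadline: timedelta, timed_out_wingmen: int) -> timedelta:
    """Upper bound on the wait for a result when several wingmen in a row time out.

    Each timed-out wingman costs one full deadline, and the host that finally
    returns the task may itself take up to one full deadline.
    """
    if timed_out_wingmen < 0:
        raise ValueError("timed_out_wingmen cannot be negative")
    return deadline * (timed_out_wingmen + 1)


if __name__ == "__main__":
    deadline = timedelta(weeks=8)  # assumed deadline, for illustration only
    for duds in range(0, 7):
        wait = worst_case_wait(deadline, duds)
        print(f"{duds} dud wingmen: up to {wait.days // 7} weeks")
```

Under these assumptions six dud wingmen in a row would already push the worst case past a year.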
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1963138 - Posted: 3 Nov 2018, 14:46:03 UTC

Stephen has said it in a way I get now. It (a task and its associated WUs) is purged 24 hours after the slowest valid on the task. It can't be any sooner than that or I'd never see any stats. I'm usually the slow one. Of course when the system has a hiccup, db purge is usually the last thing done, so that is a part of it too.

I've often wondered how they pick due dates for the WUs, but that might be a discussion for another thread. I'm sure it is a delicate balance between managing the size of the database, and making sure the slower machines get a chance to be a part of the seti project.

Is your outstanding count usually this high, or is it the result of finishing more WUs than usual due to the recent noise bombs?
ID: 1963138
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1963151 - Posted: 3 Nov 2018, 16:11:09 UTC - in response to Message 1963138.  

OK, poor choice in examples. Not relevant to the original discussion. I do have a tendency to accumulate outstanding pendings that DON'T CLEAR after the last validating task is validated and past the deadline. And the database misses it. Then I have to wait for the database to pick it up again after the original deadline on the third or fourth wingman.

I would say my task load is maybe 20-30% higher due to all the noise bombs recently. The point I was trying to get across is when the purgers and validators are still trying to clear the backlog after the outage, the website is unusable for me to view tasks because of the long waits and timeouts. As I said, the slope in the purge line finally turned downward yesterday which made my attempt to find my oldest tasks even possible.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1963151
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1963168 - Posted: 3 Nov 2018, 17:37:54 UTC - in response to Message 1963138.  

@ Unixchick:
I've often wondered how they pick due dates for the WUs, but that might be a discussion for another thread. I'm sure it is a delicate balance between managing the size of the database, and making sure the slower machines get a chance to be a part of the seti project.
Yes, the deadlines are tuned in accordance with the estimated runtimes of the tasks, and (for SETI data, not Astropulse) with a very strong presumption that even the slowest devices should be able to complete tasks and contribute to the search.

I don't think we've exhumed it since you started contributing to the boards, but there's a 10-year-old thread - Estimates and Deadlines revisited - which might give you some insight into the thought processes involved.
ID: 1963168
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963208 - Posted: 3 Nov 2018, 22:35:13 UTC - in response to Message 1963151.  

OK, poor choice in examples. Not relevant to the original discussion. I do have a tendency to accumulate outstanding pendings that DON'T CLEAR after the last validating task is validated and past the deadline. And the database misses it. Then I have to wait for the database to pick it up again after the original deadline on the third or fourth wingman.

I would say my task load is maybe 20-30% higher due to all the noise bombs recently. The point I was trying to get across is when the purgers and validators are still trying to clear the backlog after the outage, the website is unusable for me to view tasks because of the long waits and timeouts. As I said, the slope in the purge line finally turned downward yesterday which made my attempt to find my oldest tasks even possible.


. . In that respect you have my sympathy. For the last few months I have been monitoring my inconclusive rate looking for any pattern, reviewing page after page of valid tasks (twice per day) to identify the oldest one listed so I know which inconclusive tasks to count, then reviewing page after page of inconclusives (looking at each task individually) sorting out the current ones from many that are months old due to dud wingmen. On the slow machines it isn't too hard, but on my "good" machine it can take forever with only 2000 or so results. With your numbers I wouldn't even try, I'd go nuts.

Stephen
ID: 1963208
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1963306 - Posted: 4 Nov 2018, 19:34:50 UTC

The results received in the last hour has hit 150K and all is running fine, and there only appears to be some minor lag in db purging. I think we will have a mild panic Monday morning as the files to split will get low as we seem to be going through them quicker. (1800-2000/day as a rough guess)

and a throw back to another topic:
Thank you Richard for the link to the thread about WU deadline dates. Some good work on that thread. Enjoyable reading.
ID: 1963306
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963340 - Posted: 4 Nov 2018, 23:57:16 UTC - in response to Message 1963306.  

and a throw back to another topic:
Thank you Richard for the link to the thread about WU deadline dates. Some good work on that thread. Enjoyable reading.


. . But looking at the data Joe Segur posted over 10 years ago with hardware much less powerful than that currently being used, deadlines should be no more than one month (about 4 weeks), yet today they are still set to 2 months or even longer. As you say, a lot of good work, but is anybody reading it (apart from Richard)?

Stephen

? ?
ID: 1963340
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963395 - Posted: 5 Nov 2018, 8:07:54 UTC - in response to Message 1963306.  

The results received in the last hour has hit 150K and all is running fine, and there only appears to be some minor lag in db purging. I think we will have a mild panic Monday morning as the files to split will get low as we seem to be going through them quicker. (1800-2000/day as a rough guess)


. . Less than 1800 channels left so I hope they have another dataset to load when the guys get to work :)

. . Or we will be running on empty by the end of day ...

Stephen

<fingers crossed>
ID: 1963395
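
The "running on empty by the end of day" estimate is simple division, sketched below. It assumes the 1800-2000/day guess quoted above is a steady consumption rate in channels per day and that no new data is loaded; the function name is hypothetical.

```python
def hours_until_empty(channels_left: float, channels_per_day: float) -> float:
    """Hours until the splitters run out, at a constant consumption rate."""
    if channels_per_day <= 0:
        raise ValueError("channels_per_day must be positive")
    return 24.0 * channels_left / channels_per_day


if __name__ == "__main__":
    remaining = 1800  # "less than 1800 channels left", so this is an upper bound
    for rate in (1800, 2000):
        print(f"At {rate}/day: about {hours_until_empty(remaining, rate):.1f} hours of work left")
```

At those rates the remaining channels last somewhere between about 21.6 and 24 hours, which matches the worry in the post.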
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1963397 - Posted: 5 Nov 2018, 8:25:54 UTC - in response to Message 1963340.  

. . But looking at the data Joe Segur posted over 10 years ago with hardware much less powerful than that currently being used, deadlines should be no more than one month (about 4 weeks), yet today they are still set to 2 months or even longer. As you say, a lot of good work, but is anybody reading it (apart from Richard)?
Ten years ago, we didn't support Android devices, either.
ID: 1963397
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963415 - Posted: 5 Nov 2018, 12:17:15 UTC - in response to Message 1963397.  

Ten years ago, we didn't support Android devices, either.


. . And how is that working out? :)

Stephen

:)
ID: 1963415
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1963426 - Posted: 5 Nov 2018, 13:56:08 UTC - in response to Message 1963415.  

About this well :-)

Android (ARM processor)            8.00 (armv6-neon)             192 GigaFLOPS
Android (ARM processor)            8.00 (armv6-neon-nopie)        22 GigaFLOPS
Android (ARM processor)            8.00 (armv6-vfp)               64 GigaFLOPS
Android (ARM processor)            8.00 (armv6-vfp-nopie)         15 GigaFLOPS
Android (ARM processor)            8.00 (armv7-neon)             204 GigaFLOPS
Android (ARM processor)            8.00 (armv7-neon-nopie)        31 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv3)            207 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv3-nopie)       29 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv3d16)         192 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv3d16-nopie)    28 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv4)            195 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv4-nopie)       21 GigaFLOPS
Android (Intel/AMD x86 processor)  8.00 (nopie)                    3 GigaFLOPS
Android (Intel/AMD x86 processor)  8.00 (pie)                     80 GigaFLOPS
Android (ARM64 processor)          8.00 (arm64-neon)             133 GigaFLOPS
Android (ARM64 processor)          8.00 (arm64-vfpv4)            118 GigaFLOPS
Android (ARM64 processor)          8.01                          716 GigaFLOPS

Total                                                           2250 GigaFLOPS
ID: 1963426
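
For anyone who wants to check the total, the figures in the table above can be summed with a few lines of Python. The row labels are shortened by hand for this sketch, and splitting the rows into "nopie" and everything else is just one illustrative way to slice the data.

```python
# (plan class, GigaFLOPS) pairs copied from the table in the post above.
ANDROID_GFLOPS = [
    ("armv6-neon", 192),
    ("armv6-neon-nopie", 22),
    ("armv6-vfp", 64),
    ("armv6-vfp-nopie", 15),
    ("armv7-neon", 204),
    ("armv7-neon-nopie", 31),
    ("armv7-vfpv3", 207),
    ("armv7-vfpv3-nopie", 29),
    ("armv7-vfpv3d16", 192),
    ("armv7-vfpv3d16-nopie", 28),
    ("armv7-vfpv4", 195),
    ("armv7-vfpv4-nopie", 21),
    ("x86 nopie", 3),
    ("x86 pie", 80),
    ("arm64-neon", 133),
    ("arm64-vfpv4", 118),
    ("arm64 8.01", 716),
]


def total_gflops(rows):
    """Sum of the GigaFLOPS column."""
    return sum(gflops for _, gflops in rows)


def nopie_gflops(rows):
    """Sum of the rows whose plan class ends in 'nopie'."""
    return sum(gflops for name, gflops in rows if name.endswith("nopie"))


if __name__ == "__main__":
    total = total_gflops(ANDROID_GFLOPS)
    nopie = nopie_gflops(ANDROID_GFLOPS)
    print(f"Total: {total} GigaFLOPS")
    print(f"nopie builds: {nopie} GigaFLOPS ({nopie / total:.1%} of the total)")
```

The rows do add up to the 2250 GigaFLOPS shown, and the "nopie" builds account for 149 of them.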
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1963485 - Posted: 5 Nov 2018, 19:01:02 UTC

See we have new BLC04 work posted now. Should slow down the returns and slow the drop in RAC from the fast, low-paying BLC01.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1963485
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963500 - Posted: 5 Nov 2018, 21:26:53 UTC - in response to Message 1963426.  

About this well :-)

Android (ARM processor)            8.00 (armv6-neon)             192 GigaFLOPS
Android (ARM processor)            8.00 (armv6-neon-nopie)        22 GigaFLOPS
Android (ARM processor)            8.00 (armv6-vfp)               64 GigaFLOPS
Android (ARM processor)            8.00 (armv6-vfp-nopie)         15 GigaFLOPS
Android (ARM processor)            8.00 (armv7-neon)             204 GigaFLOPS
Android (ARM processor)            8.00 (armv7-neon-nopie)        31 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv3)            207 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv3-nopie)       29 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv3d16)         192 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv3d16-nopie)    28 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv4)            195 GigaFLOPS
Android (ARM processor)            8.00 (armv7-vfpv4-nopie)       21 GigaFLOPS
Android (Intel/AMD x86 processor)  8.00 (nopie)                    3 GigaFLOPS
Android (Intel/AMD x86 processor)  8.00 (pie)                     80 GigaFLOPS
Android (ARM64 processor)          8.00 (arm64-neon)             133 GigaFLOPS
Android (ARM64 processor)          8.00 (arm64-vfpv4)            118 GigaFLOPS
Android (ARM64 processor)          8.01                          716 GigaFLOPS

Total                                                           2250 GigaFLOPS


. . Those numbers seem good(ish) but apart from highlighting the need for 'pie' (whatever that is) how does that compare to say an i3 or an AMD8350?

Stephen

?
ID: 1963500
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1963556 - Posted: 6 Nov 2018, 1:12:03 UTC - in response to Message 1963485.  

See we have new BLC04 work posted now. Should slow down the returns and slow the drop in RAC from the fast, low-paying BLC01.


. . Well, good news (for me) but bad news for you I think: those BLC04 tapes are also GBT-New Format. Run times are only a few seconds longer than the BLC01 tapes.

Stephen

:)
ID: 1963556
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22256
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1963582 - Posted: 6 Nov 2018, 6:06:40 UTC

Since all data split for more than a year has been in the "new" format, cancel your thought about anything we are seeing today being a "new format", because it isn't.
It's the data contained within the tape that is having an impact on the run-time: sometimes we get data that is much faster to process than others.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1963582
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13761
Credit: 208,696,464
RAC: 304
Australia
Message 1963594 - Posted: 6 Nov 2018, 7:37:34 UTC - in response to Message 1963500.  

. . Those numbers seem good(ish) but apart from highlighting the need for 'pie' (whatever that is) how does that compare to say an i3 or an AMD8350?

ARM CPUs are computationally roughly on par with P4 CPUs from many years ago, but use way, way, waaay less power.
Grant
Darwin NT
ID: 1963594


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.