Panic Mode On (113) Server Problems?

Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1959020 - Posted: 7 Oct 2018, 5:42:16 UTC - in response to Message 1959019.  

Only one gbt splitter is running. Time for Panic?


definitely :-)
ID: 1959020
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1959023 - Posted: 7 Oct 2018, 5:58:05 UTC - in response to Message 1959017.  

the sah assimilators are not running so we have a little less than 5 hours of WUs left.

The Assimilators clean up after WUs have been processed & the results returned, but they have nothing to do with the production or distribution of work to be processed.



Thank you. I misunderstood. It sounds like it is the result of the processed WU that is assimilated into the science db, so the bit labeled "Workunits waiting for assimilation" is just the WU waiting for its result to be put into the science db; then the WU moves to the "Workunit files waiting for deletion" stage.

So the data file is split into WUs. The WU is validated, and then put into the RTS. Then it comes back with a result that waits for validation. Once validated, the result is assimilated. Once the result is assimilated, the original WU can be deleted. ???
ID: 1959023
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13861
Credit: 208,696,464
RAC: 304
Australia
Message 1959027 - Posted: 7 Oct 2018, 6:18:20 UTC - in response to Message 1959023.  
Last modified: 7 Oct 2018, 6:20:24 UTC

So the data file is split into WUs.

Yep.

The WU is validated

Nope.
Once it's split, it goes straight to the Ready-to-send buffer. From there it gets sent out to 2 systems to be processed.

Validation happens after the WU has been processed and the result is returned to the server. That result is compared with the result returned from the other system for that WU. If they match, they are Validated; if they don't match, another copy is sent out to be processed, and that result is compared with the other(s). The validated result is then Assimilated into the science database, then the WU & results are eventually deleted, then purged from the working database.
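
As a rough illustration of the sequence described above, here is a minimal Python sketch that traces one workunit through the stages. The function names, the stage labels, and the exact-match comparison are assumptions made up for this example; they are not taken from the project's server code, and the real validator is more involved than a simple equality check.

```python
from collections import Counter


def find_agreement(results):
    """Return a result value reported by at least two hosts, or None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= 2 else None


def process_workunit(returned_results, initial_copies=2):
    """Trace one workunit through the stages, given results in arrival order.

    Returns (stage_log, copies_sent). Stage names are hypothetical labels.
    """
    log = ["split", "ready_to_send", "sent"]
    copies_sent = initial_copies
    received = []
    incoming = iter(returned_results)

    while True:
        # Wait until every copy sent out so far has reported back.
        while len(received) < copies_sent:
            try:
                received.append(next(incoming))
            except StopIteration:
                log.append("waiting_for_results")
                return log, copies_sent
        if find_agreement(received) is not None:
            break
        # No two results match yet: send out one more copy.
        copies_sent += 1
        log.append("resent")

    log += ["validated", "assimilated", "deleted", "purged"]
    return log, copies_sent
```

Calling process_workunit(["a", "a"]) walks straight through to "purged" with two copies sent, while process_workunit(["a", "b", "a"]) records one "resent" step and three copies before validation.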
Grant
Darwin NT
ID: 1959027
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13861
Credit: 208,696,464
RAC: 304
Australia
Message 1959028 - Posted: 7 Oct 2018, 6:22:34 UTC - in response to Message 1959019.  

Only one gbt splitter is running. Time for Panic?

Oh yeah- there is a whole sea of red on the Server status page now.
At least the web site & forums are staying up (for now).
Grant
Darwin NT
ID: 1959028
Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1854
Credit: 2,258,721
RAC: 0
Australia
Message 1959029 - Posted: 7 Oct 2018, 6:25:49 UTC

Passes a "Xanax" to everyone :)
ID: 1959029
Stephen "Heretic" Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1959048 - Posted: 7 Oct 2018, 10:19:02 UTC - in response to Message 1958981.  

If you set your checkpoint interval longer than the crunching time of the task, then the task just starts over from zero when you restart BOINC. No harm, no foul.

What checkpoint time do you use? And where do I put that?


. . It is in the BOINC preferences page where you find the No of CPUs to use etc, it is 'Minimum time to checkpoint'.

. . Use whatever your average run time is plus 10% to 15%. On my machine where the average run time is 5 mins I use 6 minutes. (I know that is plus 20% but I like round numbers)
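
The rule of thumb above can be written as a small calculation. This is only a sketch: the function name, the default 15% margin, and rounding up to a whole minute are assumptions chosen to mirror the "round numbers" habit described in the post.

```python
import math


def suggested_checkpoint_seconds(avg_run_seconds, margin=0.15, round_to=60):
    """Average run time plus a margin, rounded up to a multiple of round_to seconds."""
    padded = avg_run_seconds * (1 + margin)
    return math.ceil(padded / round_to) * round_to


# A 5-minute average run time with a 15% margin rounds up to 6 minutes.
print(suggested_checkpoint_seconds(300))  # 360
```

With a 10% margin the 5-minute case also rounds up to 360 seconds, which matches the 6 minutes quoted above.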

Stephen

:)
ID: 1959048
Stephen "Heretic" Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1959050 - Posted: 7 Oct 2018, 10:22:17 UTC - in response to Message 1959003.  

Didn't we all have this trouble this time last year, and also in April this year? Wonder if it's due to Daylight Saving starting and ending around the world?


. . I guess it could be but since SETI runs on UTC I would think the only conflict would be local 'Day Light Saving' changes at Berkeley.

Stephen

:)
ID: 1959050
RickToTheMax
Joined: 22 May 99
Posts: 105
Credit: 7,958,297
RAC: 0
Canada
Message 1959051 - Posted: 7 Oct 2018, 10:41:37 UTC

And we're just a few minutes away from running dry...
Quite convenient that I was fiddling with the rescheduler script last night and managed to fill up my cache before this unexpected downtime!
ID: 1959051
Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1854
Credit: 2,258,721
RAC: 0
Australia
Message 1959054 - Posted: 7 Oct 2018, 10:45:48 UTC

I just remembered having this discussion 6 months back, and then a further 6 months before that, about the time I rejoined.
ID: 1959054
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14680
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1959056 - Posted: 7 Oct 2018, 10:51:40 UTC - in response to Message 1959048.  

What checkpoint time do you use? And where do I put that?

. . It is in the BOINC preferences page where you find the No of CPUs to use etc, it is 'Minimum time to checkpoint'.

Usual caveat: there are two different places to set preferences.

If you have ever viewed your settings in BOINC Manager, and exited by clicking 'OK' (old versions) or 'Save' (very recent versions only) you will have created a 'global preferences override' file, and BOINC will exclusively use the settings contained in that file. Changing the web setting after that event will have no effect at all.

If you want to change the checkpoint interval, look first at the Options --> Computing preferences dialog in BOINC Manager. Read the top panel. If it says 'Using web preferences', back away very carefully by clicking the 'Cancel' button (or pressing 'Esc'), and go change it on the web as described. If it says 'Using local preferences', then change it there: Computing tab, last box.
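
If you would rather check from a script than from the Manager dialog, something along these lines might do it. This is a hedged sketch, not an official tool: it assumes the override file is named global_prefs_override.xml, that it sits in the BOINC data directory you pass in, and that the checkpoint interval is stored in a disk_interval element. Verify all three against your own install before relying on it.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def checkpoint_source(data_dir):
    """Report which preferences apply and the local checkpoint interval, if set.

    Returns ("web", None) when no override file exists, otherwise
    ("local", seconds) where seconds is None if the element is absent.
    """
    # Assumed file name; adjust if your client uses a different one.
    override = Path(data_dir) / "global_prefs_override.xml"
    if not override.is_file():
        return ("web", None)

    root = ET.parse(str(override)).getroot()
    node = root.find("disk_interval")  # assumed element name
    interval = float(node.text) if node is not None and node.text else None
    return ("local", interval)
```

For a service install on Linux the data directory would typically be passed as something like /var/lib/boinc-client; a result of ("web", None) means changing the setting on the web site should take effect.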
ID: 1959056
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51484
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1959057 - Posted: 7 Oct 2018, 10:59:10 UTC

Ahhh...just lovely.
I see the splitters are on the lam again and RTS has disappeared.
I sent the word out, but of course it's 4AM in Berkeley and Sunday.
So I don't expect to see any kicking done real soon.
Meowsigh.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1959057
juan BFP Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1959069 - Posted: 7 Oct 2018, 15:02:03 UTC

If nothing changes in the next hour, the big crunchers will start to suffer from WU starvation very soon.

Why do these things normally happen at dawn on a weekend day?

I know, it's Murphy's law in action.
ID: 1959069
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1959073 - Posted: 7 Oct 2018, 15:23:02 UTC

How about that "gbt splitter 3"? Hanging tough and splitting long after all the others have given up.
ID: 1959073
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 1959075 - Posted: 7 Oct 2018, 15:39:17 UTC

Someone needs to give all the splitters a kick (i.e. a reboot, or a boot in the bum...)

Hello, from Albany, CA!...
ID: 1959075
betreger Project Donor
Joined: 29 Jun 99
Posts: 11419
Credit: 29,581,041
RAC: 66
United States
Message 1959082 - Posted: 7 Oct 2018, 16:09:22 UTC

Splitters are still borked.
ID: 1959082
juan BFP Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1959086 - Posted: 7 Oct 2018, 16:31:04 UTC - in response to Message 1959082.  

Splitters are still borked.

It's Sunday, very early in CA.
ID: 1959086
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1959088 - Posted: 7 Oct 2018, 16:57:41 UTC

Weeellll, it's 10am, I suppose it could be considered early, depending on the activities of the previous evening..? ;-)

ID: 1959088
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1959094 - Posted: 7 Oct 2018, 17:40:48 UTC

Woke up this morning to no tasks but I expected that outcome given the hiccups and yo-yo'ing project yesterday. Servers are borked and will probably need a complete reboot of the project in sequence.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1959094
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1959103 - Posted: 7 Oct 2018, 18:24:43 UTC

still no tasks here
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1959103
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1959111 - Posted: 7 Oct 2018, 19:46:46 UTC - in response to Message 1958615.  
Last modified: 7 Oct 2018, 19:54:18 UTC

That's why BOINC 7.4.44 was developed. Required software for your Linux mini-super computer. Not only does 7.4.44 allow your Investment to store enough work to make it through a Tuesday Outrage, it doesn't have a Back-Off 'Feature'. 7.4.44 will bang on the Server relentlessly every 5 minutes requesting work. Considering a mini-super computer returns around 10 tasks every 5 minutes, it makes sense to request work every 5 minutes. 7.4.44 is available in the Docs folder of the Linux BOINC All-In-One package.


Since the project has run out of work, this seemed like an opportune time to try to switch to this version.

I installed the dependencies listed in your post on CA.
I copied all files from my /var/lib/boinc-client/projects/seti directory to the BOINC/projects/seti directory in my Home folder.
I moved the client_state file over also.

But I think maybe something is not right. It's saying my BOINC version is 7.8.3, not 7.4.44.

EDIT: Never mind, I got it. I missed that this package was not 7.4.44 natively and you had to extract the 7.4.44 components separately. Got it now.

Also, do I have to do anything special to remove my old service install of BOINC 7.9.3? When installing the dependencies and copying the BOINC folder to my home folder, I noticed my service install blanked out and removed my project.

Running sudo apt-get remove boinc-client boinc-manager reported that neither was installed, which seemed odd to me. Is it gone because of something I did? Can I just delete the boinc-client and boinc folders from my /var/lib/ directory?

Will this new version auto-start at system boot like the service install does?
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1959111

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.