Panic Mode On (113) Server Problems?

RickToTheMax

Joined: 22 May 99
Posts: 105
Credit: 7,958,297
RAC: 0
Canada
Message 1958636 - Posted: 5 Oct 2018, 20:07:37 UTC
Last modified: 5 Oct 2018, 20:31:31 UTC

Does switching to 7.4.44 create a new computer ID, or can you switch back and forth and stay on the same computer ID?
And what is the rescheduler?

EDIT: never mind, I found info about the rescheduler. That is the cpu2gpu script, right?
ID: 1958636 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1958641 - Posted: 5 Oct 2018, 20:53:33 UTC - in response to Message 1958625.  

That's why BOINC 7.4.44 was developed.

Just out of curiosity, does it still have the 1000 WU limit, or was that changed to 3000?

Are you asking about 7.4.44 or 7.8.3? 7.4.44 has the 3000 limit and 7.8.3 still has the 1000 limit.


. . While we are getting just 3 to 4 hour outages that is a moot point :) ... touch wood ... (puts hand on head)

Stephen

:)
ID: 1958641 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1958642 - Posted: 5 Oct 2018, 20:56:53 UTC - in response to Message 1958636.  

Does switching to 7.4.44 create a new computer ID, or can you switch back and forth and stay on the same computer ID?
And what is the rescheduler?

EDIT: never mind, I found info about the rescheduler. That is the cpu2gpu script, right?


. . That is the version by Laurent, it works well for me ... There are others too, such as Jeff Buck's very nice effort.

Stephen

. .
ID: 1958642 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1958645 - Posted: 5 Oct 2018, 21:05:48 UTC - in response to Message 1958636.  

No, installing any version of BOINC never creates a new host ID. In the case of TBar's BOINC versions, it is as simple as copying five files into the BOINC directory to change versions and then restarting BOINC.

That is one of the reschedulers. There have been several over time from Mr. Kevvy's GUPPIRescheduler to W3Perl's cpu2gpu to Jeff Buck's GUI rescheduler.
I have made all of the reschedulers available in my Dropbox with this Seti Reschedulers link.

I think Jeff's is particularly nice since there are both Windows and Linux versions, and the fact that they are GUIs makes them simple to use.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1958645 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1958653 - Posted: 5 Oct 2018, 21:36:38 UTC - in response to Message 1958625.  

That's why BOINC 7.4.44 was developed.

Just out of curiosity, does it still have the 1000 WU limit, or was that changed to 3000?

Are you asking about 7.4.44 or 7.8.3? 7.4.44 has the 3000 limit and 7.8.3 still has the 1000 limit.


i know this is kind of mixing threads a bit. but i was looking at wildcard's 13-GPU host. i was wondering if he'd hit the 1000 limit (he's on 7.9.3) since he has 13 GPUs (x100 per = 1300)

interestingly enough it shows over 3000 in progress. https://setiathome.berkeley.edu/results.php?hostid=8568434

how can this be?

i thought under normal circumstances you'd only get your [# of GPUs] x 100 + 100 for CPU, and even with rescheduling you couldn't get over the BOINC limit (3000 on 7.4.44, and 1000 on the newer 7.8.3 and up versions). with some ghosts, sure, he could maybe get over the limit, but he's more than 2000 tasks over his limit. what do you think?
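
The arithmetic in that question can be written out as a rough sketch. The allowance of 100 tasks per GPU plus 100 for the CPU, and the client-side caps of 1000 and 3000, are taken from the posts in this thread; the function and constant names are made up for illustration, and this is not how the scheduler is actually implemented.

```python
# Rough sketch of the task-limit arithmetic discussed in this thread.
# Assumptions (from the posts, not from BOINC source): 100 tasks per GPU,
# 100 tasks for the CPU, and a client-side cap of 1000 (stock builds)
# or 3000 (the specially compiled 7.4.44).

PER_GPU_LIMIT = 100
CPU_LIMIT = 100


def expected_task_cap(num_gpus: int, client_cap: int) -> int:
    """Return the most tasks a host would be expected to hold."""
    server_allowance = num_gpus * PER_GPU_LIMIT + CPU_LIMIT
    return min(server_allowance, client_cap)


if __name__ == "__main__":
    # The 13-GPU host on a stock client: the allowance works out to 1400,
    # so the client cap of 1000 is the binding limit.
    cap = expected_task_cap(num_gpus=13, client_cap=1000)
    print(cap)
    # With roughly 3000 tasks shown in progress, the excess is about:
    print(3000 - cap)
```

Under these assumptions the host should top out at 1000 tasks, which is why a count above 3000 looks like ghosts rather than a different limit.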
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1958653 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1958656 - Posted: 5 Oct 2018, 21:50:25 UTC - in response to Message 1958653.  

I would bet on ghosts or possibly leftovers if he converted a previous Windows machine to Linux and kept the host ID.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1958656 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1958679 - Posted: 5 Oct 2018, 22:48:19 UTC
Last modified: 5 Oct 2018, 22:51:47 UTC

He runs BOINC 7.9.3. Does anybody know what the WU limit is on this build?

From today's Free-DC stats:
8568434	wildcardcorp.com 628,565
7475713	petri33 381,247

ID: 1958679 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1958682 - Posted: 5 Oct 2018, 22:58:58 UTC - in response to Message 1958679.  
Last modified: 5 Oct 2018, 23:00:36 UTC

All builds are set to 1000; it's in the code.
To find ghosts, all you have to do is go back far enough in the task list to find unfinished tasks listed before finished ones, for example here: https://setiathome.berkeley.edu/results.php?hostid=8568434&offset=16360
He has State: All (19541), with ghosts back at 16020.
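
As a sketch of that check, assume you have already copied the host's task list into a list of records. An in-progress task that was sent before tasks which have since been returned is then a ghost candidate. The record fields and sample data below are hypothetical, and the heuristic is only approximate, since hosts do not always return work in the order it was sent.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Task:
    name: str            # hypothetical task name
    sent: datetime       # when the server sent the task
    in_progress: bool    # True if the server still lists it as unfinished


def ghost_candidates(tasks: list[Task]) -> list[Task]:
    """Return in-progress tasks that were sent before the newest finished task."""
    finished_times = [t.sent for t in tasks if not t.in_progress]
    if not finished_times:
        return []
    newest_finished = max(finished_times)
    return [t for t in tasks if t.in_progress and t.sent < newest_finished]


if __name__ == "__main__":
    sample = [
        # Old task that was never returned: a likely ghost.
        Task("wu_a", datetime(2018, 9, 1, 12, 0), in_progress=True),
        # Task that has already been returned.
        Task("wu_b", datetime(2018, 10, 4, 8, 0), in_progress=False),
        # Recent task that is probably just waiting in the queue.
        Task("wu_c", datetime(2018, 10, 5, 9, 0), in_progress=True),
    ]
    for task in ghost_candidates(sample):
        print(task.name)
```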
ID: 1958682 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1958687 - Posted: 5 Oct 2018, 23:20:24 UTC - in response to Message 1958679.  

All BOINC versions past 7.0.2 have the 1000 task limit. Only the specially compiled TBar 7.4.44 has the 3000 task limit.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1958687 · Report as offensive
Sleepy
Volunteer tester
Joined: 21 May 99
Posts: 219
Credit: 98,947,784
RAC: 28,360
Italy
Message 1958851 - Posted: 6 Oct 2018, 16:09:11 UTC - in response to Message 1958645.  

I have made all of the reschedulers available in my Dropbox with this Seti Reschedulers link.
Dear Keith, I was very curious about the reschedulers you made available (thank you), in particular Jeff's, which you are particularly fond of.
But I can't seem to extract the .7z files (not only Jeff's) by any means I know of (such as the obvious Ark and others I tested for the occasion).

What am I doing wrong (or what is the "trick")?

Thank you!

Sleepy
ID: 1958851 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1958854 - Posted: 6 Oct 2018, 16:26:29 UTC - in response to Message 1958851.  

Install a 7-Zip application; it is similar to WinZip.
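
On Linux, one way to do this from a script is to call the `7z` command-line tool (packaged as p7zip on many distributions). This is a minimal sketch: the archive and output names are placeholders, the helper names are invented, and it assumes `7z` is on your PATH.

```python
import shutil
import subprocess


def build_extract_command(archive: str, out_dir: str) -> list[str]:
    """Build the 7z command line that extracts an archive with full paths."""
    # "x" extracts with the directory structure intact; "-o<dir>" sets the
    # output directory (no space between -o and the path).
    return ["7z", "x", archive, f"-o{out_dir}"]


def extract(archive: str, out_dir: str) -> None:
    """Run 7z if it is installed; raise a clear error otherwise."""
    if shutil.which("7z") is None:
        raise RuntimeError("7z not found; install a 7-Zip package first")
    subprocess.run(build_extract_command(archive, out_dir), check=True)


if __name__ == "__main__":
    # Placeholder file name, not one of the actual archives from the thread.
    print(" ".join(build_extract_command("rescheduler.7z", "rescheduler")))
```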
ID: 1958854 · Report as offensive
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1958857 - Posted: 6 Oct 2018, 16:40:30 UTC - in response to Message 1958851.  

I have made all of the reschedulers available in my Dropbox with this Seti Reschedulers link.

I'm getting "Too many requests". Is the link banned due to traffic?
ID: 1958857 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1958863 - Posted: 6 Oct 2018, 17:13:25 UTC - in response to Message 1958857.  

Same here in Canada on that link.
ID: 1958863 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1958864 - Posted: 6 Oct 2018, 17:32:27 UTC
Last modified: 6 Oct 2018, 17:58:36 UTC

I'm hoping that the SETI team has figured out the problem the system has been having and will address it during Tuesday's downtime.

Right now the system should be running well, as there isn't heavy demand and there are no noise bombs in the data, but I still see some bad signs.
The db isn't being purged. It is at 3.6 million right now and can get a lot higher before crashing, so this could be a false alert or an early warning sign.
(edit : false alarm, my list of valids is back to 24 hours - so db is being purged)

27se18ab splitting seems to have stopped or be stuck.

The good news is if the system can stay up, then we have enough datafiles available to get us through the weekend.
ID: 1958864 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1958865 - Posted: 6 Oct 2018, 17:40:46 UTC
Last modified: 6 Oct 2018, 18:39:39 UTC

Just a warning when you use some reschedulers:

If you run the Linux special CUDA apps, be aware that the builds do not like having a WU suspended in the middle of the crunching process.

Sometimes weird things happen, like the WU process crashing or, even worse, the WU being sent to the CPU to be crunched, which stops GPU crunching until that process ends (a big loss of GPU processing time while waiting).

The docs supplied with the builds warn you to be careful not to suspend a WU in the middle of processing.

From the README file:

6) The App may give Incorrect results on a restarted task. One way to avoid restarted tasks is to set the checkpoint higher than the task's estimated run-time, and also avoid suspending a task.

ID: 1958865 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1958872 - Posted: 6 Oct 2018, 18:23:31 UTC - in response to Message 1958864.  

db_purge: Deletes result/workunit records when no longer needed (i.e. after the associated files have been deleted). Depending on server load, we usually wait 24 hours between the files have been removed before purging the corresponding results so people can still view these results online for a day after they've been processed/assimilated. This program keeps our BOINC database as small as possible so that it fits in RAM (and therefore operates much faster).
Since the Assimilator just went through a pile of files less than 24h ago, the purge numbers will be high.
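
The rule described in that quote can be sketched roughly as follows. This only illustrates the stated policy (files deleted first, then a wait of about 24 hours); it is not the actual db_purge code, and the function and parameter names are invented.

```python
from datetime import datetime, timedelta
from typing import Optional

PURGE_DELAY = timedelta(hours=24)  # waiting period from the quoted description


def can_purge(files_deleted_at: Optional[datetime], now: datetime) -> bool:
    """Return True if a result record is old enough to be purged.

    files_deleted_at is None while the associated files still exist.
    """
    if files_deleted_at is None:
        return False
    return now - files_deleted_at >= PURGE_DELAY


if __name__ == "__main__":
    now = datetime(2018, 10, 6, 18, 0)
    print(can_purge(datetime(2018, 10, 5, 12, 0), now))  # files deleted 30 hours ago
    print(can_purge(datetime(2018, 10, 6, 6, 0), now))   # files deleted 12 hours ago
    print(can_purge(None, now))                          # files still present
```

Anything assimilated within the last day stays in the database under this rule, which matches the high purge backlog described above.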

And yeah, I hope they sort things out with the DB. Regarding the outage last Sunday/Monday, Eric mentioned to me:
Database issue. Logical log backups were waiting on a keypress. It should be working now, but will take some time to build a work queue.
I don't know if that was on the Master or Science database. My thought is the latter. I don't know what happened yesterday, but expect it was the same.
ID: 1958872 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1958883 - Posted: 6 Oct 2018, 19:25:53 UTC - in response to Message 1958872.  

And yeah, I hope they sort things out with the DB. Regarding the outage last Sunday/Monday, Eric mentioned to me:
Database issue. Logical log backups were waiting on a keypress. It should be working now, but will take some time to build a work queue.
I don't know if that was on the Master or Science database. My thought is the latter. I don't know what happened yesterday, but expect it was the same.
I think we should all club together and buy Eric one of these:

Feely finger phone

The MobiLimb finger can crawl across the desk, waggle for attention when messages arrive and be used as an interface to control apps and games.
ID: 1958883 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1958888 - Posted: 6 Oct 2018, 19:58:44 UTC - in response to Message 1958857.  

I have made all of the reschedulers available in my Dropbox with this Seti Reschedulers link.

I'm getting "Too many requests". Is the link banned due to traffic?

That is the first time I've heard of this issue. I had to investigate what the limits for a free Dropbox account were. This is from the Help community answer:
Dropbox Basic (free) accounts:

The total amount of traffic that all of your links and file requests together can generate without getting banned is 20 GB per day.
The total number of downloads that all of your links together can generate is 100,000 downloads per day.


I find it hard to believe any downloads added up to 20 GB or over 100,000 requests. None of the Rescheduler files is more than a meg in size.
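
A quick back-of-the-envelope check of those limits, assuming each download is about 1 MB (the upper bound mentioned above) and taking 1 GB as 1000 MB. The names are made up for the sketch.

```python
# Dropbox Basic limits as quoted above.
DAILY_TRAFFIC_LIMIT_MB = 20 * 1000   # 20 GB per day, taking 1 GB = 1000 MB
DAILY_DOWNLOAD_LIMIT = 100_000       # downloads per day


def downloads_before_ban(file_size_mb: float) -> int:
    """Return how many downloads of one file fit within both daily limits."""
    by_traffic = int(DAILY_TRAFFIC_LIMIT_MB // file_size_mb)
    return min(by_traffic, DAILY_DOWNLOAD_LIMIT)


if __name__ == "__main__":
    # Assumed file size: 1 MB, the upper bound for the rescheduler archives.
    print(downloads_before_ban(1.0))
```

With files of that size the traffic limit is the one that would be hit first, and only after something on the order of 20,000 downloads in a single day.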
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1958888 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1958891 - Posted: 6 Oct 2018, 20:09:55 UTC - in response to Message 1958865.  

Just a warning when you use some reschedulers:

If you run the Linux special CUDA apps, be aware that the builds do not like having a WU suspended in the middle of the crunching process.

Sometimes weird things happen, like the WU process crashing or, even worse, the WU being sent to the CPU to be crunched, which stops GPU crunching until that process ends (a big loss of GPU processing time while waiting).

The docs supplied with the builds warn you to be careful not to suspend a WU in the middle of processing.

From the README file:

6) The App may give Incorrect results on a restarted task. One way to avoid restarted tasks is to set the checkpoint higher than the task's estimated run-time, and also avoid suspending a task.

No Rescheduler will move a task that is in the process of being crunched. So this warning is not valid.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1958891 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1958893 - Posted: 6 Oct 2018, 20:16:08 UTC

I guess we will have to wait for the ban to end. I deleted the link so it doesn't generate any more download requests. Once the account allows downloads again, I can make a new link to a rescheduler file on personal request if whoever is interested gives me an email address to share it with.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1958893 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.