Panic Mode On (98) Server Problems?

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1697410 - Posted: 1 Jul 2015, 18:32:02 UTC - in response to Message 1697402.  

BTW, I noticed the replica DB is offline. Hope that doesn't foreshadow any coming difficulties.


This server crashed last night (taking the web site with it for a couple hours). Garden variety crash, unfortunately coincidentally timed with the leap second, but I'm not 100% sure that was the cause. I'm rebuilding the db now.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1697410 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1697430 - Posted: 1 Jul 2015, 19:24:10 UTC - in response to Message 1697394.  

All my AP-only hosts are now as full as they can be. (Except one: it has done no CPU APs before, so with over 150h estimated remaining it gets no new CPU tasks.) Is there any way to manually edit the estimated time, or do I have to wait until 11 tasks have validated?


I'm in the same boat with the CPU on my new computer. It hasn't finished enough tasks to get the estimates right. Right now the estimate is over 4 days, while a task actually takes less than 4 hours.

Also, 11 tasks is not the full truth either (shown as "Number of tasks completed"). It has to be 11 tasks with less than a certain percentage blanked (I do not remember the exact percentage): I have so far 18 "Consecutive valid tasks", but as can be seen, only 9 of those are under the limits of blanking. Two more to go, and my estimates should be OK, or at least much better.

It can take a long time to reach that goal, especially as, right now, lots of the APs I have for my CPU are heavily blanked.

The criteria to qualify for "number of tasks completed" are:

1. Less than 10.00% blanked (9.99 or less)
2. Did not 30/30 early exit
3. Did not exit with an exit status of anything non-zero.

All three of those have to be met for it to count. Depending on the mixture of WUs, it can be done in as little as 11 tasks, but more than likely, will take 20-30 tasks to get there.
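
As a rough illustration of the three criteria above, here is a minimal Python sketch. The `Task` fields and the function names are made up for this example and are not BOINC's actual data model; only the 10% blanking threshold, the 30/30 early exit rule, the zero exit status rule and the target of 11 come from the posts above.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical record of one finished AP task (field names are assumptions).
    blanked_percent: float   # percentage of the work unit that was blanked
    early_exit_30_30: bool   # True if the task hit the 30/30 early exit
    exit_status: int         # 0 means a clean exit


def counts_as_completed(task: Task) -> bool:
    """Apply the three criteria listed above; all of them must hold."""
    return (
        task.blanked_percent < 10.0
        and not task.early_exit_30_30
        and task.exit_status == 0
    )


def qualifying_count(tasks: list[Task]) -> int:
    """How many tasks would count toward 'number of tasks completed'."""
    return sum(counts_as_completed(t) for t in tasks)


if __name__ == "__main__":
    history = [
        Task(2.5, False, 0),    # meets all three criteria
        Task(10.0, False, 0),   # exactly 10.00% blanked, so it does not count
        Task(0.0, True, 0),     # 30/30 early exit, so it does not count
        Task(1.0, False, 1),    # non-zero exit status, so it does not count
    ]
    done = qualifying_count(history)
    print(f"{done} of {len(history)} tasks count; {max(0, 11 - done)} more needed")
```

This also shows why a host with a lot of heavily blanked work can need 20-30 tasks before the estimates settle: most of its results simply never enter the count.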
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1697430 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1697435 - Posted: 1 Jul 2015, 19:43:11 UTC - in response to Message 1697410.  

BTW, I noticed the replica DB is offline. Hope that doesn't foreshadow any coming difficulties.


This server crashed last night (taking the web site with it for a couple hours). Garden variety crash, unfortunately coincidentally timed with the leap second, but I'm not 100% sure that was the cause. I'm rebuilding the db now.

- Matt



We were all speculating if that leap second was the cause. Many thanks for getting back up and running. Sorry about having to rebuild the Database.
ID: 1697435 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1697633 - Posted: 2 Jul 2015, 6:28:14 UTC

I have 58 MB tasks and 10 AP tasks assigned to me; however, I have lost them due to a driver crash and an error in adapting the app_info file when removing the version 6 data. Any idea when "resend lost tasks" will be turned on?
ID: 1697633 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22444
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1697641 - Posted: 2 Jul 2015, 6:59:18 UTC

Don't worry about lost tasks. They will either be sent back to you or to another user, so nothing will be lost apart from your time due to the driver crash and other problems.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1697641 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1697644 - Posted: 2 Jul 2015, 7:17:15 UTC - in response to Message 1697410.  

BTW, I noticed the replica DB is offline. Hope that doesn't foreshadow any coming difficulties.


This server crashed last night (taking the web site with it for a couple hours). Garden variety crash, unfortunately coincidentally timed with the leap second, but I'm not 100% sure that was the cause. I'm rebuilding the db now.

- Matt

Thanks much for that feedback, Matt.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1697644 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1697687 - Posted: 2 Jul 2015, 10:37:56 UTC - in response to Message 1697633.  

I have 58 MB tasks and 10 AP tasks assigned to me; however, I have lost them due to a driver crash and an error in adapting the app_info file when removing the version 6 data. Any idea when "resend lost tasks" will be turned on?


That's exactly why I'm not touching my app_info until I run my AP tasks through!
ID: 1697687 · Report as offensive
Profile ReiAyanami
Joined: 6 Dec 05
Posts: 116
Credit: 222,900,202
RAC: 174
Japan
Message 1697717 - Posted: 2 Jul 2015, 13:26:31 UTC
Last modified: 2 Jul 2015, 13:27:30 UTC

I wonder why I'm not getting any AP tasks while many of you seem to be enjoying them...
ID: 1697717 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1697722 - Posted: 2 Jul 2015, 13:36:54 UTC - in response to Message 1697717.  

There are a number of people that change their profile to "AP only" when they see the splitters fire up.

Also (and I'm not sure about this), if you set your profile to AP only and allow other work, the system MAY ask for AP before MB tasks.
ID: 1697722 · Report as offensive
Profile ReiAyanami
Joined: 6 Dec 05
Posts: 116
Credit: 222,900,202
RAC: 174
Japan
Message 1697727 - Posted: 2 Jul 2015, 13:48:15 UTC
Last modified: 2 Jul 2015, 13:53:45 UTC

Mine are set as AP only and allow other work.
Apparently it's not working, because I see 'AP work ready to send' accumulating and I'm not getting any.
ID: 1697727 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1697730 - Posted: 2 Jul 2015, 13:52:15 UTC - in response to Message 1697727.  

Then don't allow other work: use "AP only".

When your cache starts declining, allow other work again.
ID: 1697730 · Report as offensive
Profile ReiAyanami
Joined: 6 Dec 05
Posts: 116
Credit: 222,900,202
RAC: 174
Japan
Message 1697731 - Posted: 2 Jul 2015, 13:54:32 UTC - in response to Message 1697730.  

Thank you, Brent.
I will try AP only when I get home :)
ID: 1697731 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1697735 - Posted: 2 Jul 2015, 14:05:49 UTC - in response to Message 1697731.  

You don't have to be at home; you can change your profile settings on the web page. Your home computer will pick it up.
ID: 1697735 · Report as offensive
Cruncher-American Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1697736 - Posted: 2 Jul 2015, 14:11:56 UTC

FWIW: I have both my machines set as AP7 yes MB7 no, accept other apps yes and it certainly seems to be working - I am getting scads of APs now that they are available again.

I only do AP on GPU, MB on CPU and GPU.
ID: 1697736 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1697740 - Posted: 2 Jul 2015, 14:45:12 UTC - in response to Message 1697731.  
Last modified: 2 Jul 2015, 14:59:49 UTC

Thank you, Brent.
I will try AP only when I get home :)


ReiAyanami..

You have the latest driver on both machines. Have you either

1) edited the kernel .cl file to rename all the bool2 as Petri suggested, or

2) added Raistmer's fix for the issue with OpenCL 1.2 and the .cl file?

If you haven't done either, APs won't run on those machines.

The last option, if you don't do either of those, is to roll back the driver to NV 347.88.
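
For option 1, the edit amounts to a whole-word rename inside the .cl source. A minimal Python sketch of that idea follows; it is only an illustration, not Petri's or Raistmer's actual fix. The replacement name `ap_bool2` and the sample kernel text are placeholders invented here, so check the original suggestion for the name to use, and back up the file before touching it.

```python
import re


def rename_identifier(source: str, old: str, new: str) -> str:
    """Replace whole-word occurrences of `old` with `new` in source text."""
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    return pattern.sub(new, source)


if __name__ == "__main__":
    # Made-up snippet standing in for the kernel source; the real file differs.
    sample = "bool2 flags;\nint bool20;\nflags = (bool2)(0, 1);\n"
    # "ap_bool2" is a hypothetical replacement name chosen for this example.
    print(rename_identifier(sample, "bool2", "ap_bool2"))
```

The word-boundary match matters: a plain search and replace would also rewrite longer names that merely contain `bool2`.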

Zalster
ID: 1697740 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1697741 - Posted: 2 Jul 2015, 14:50:14 UTC - in response to Message 1697740.  
Last modified: 2 Jul 2015, 14:58:39 UTC

Yeah, I noticed he had new drivers, but it should still download, right? Maybe it would run into CL build errors, but still download.

He had no tasks at all.

EDIT: Actually NO. With driver 350.12 my machine stopped downloading (for some unknown reason) until I installed 347.88 and reinstalled Lunatics.
ID: 1697741 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1697746 - Posted: 2 Jul 2015, 15:01:27 UTC - in response to Message 1697741.  

I don't know if it will download APs if the computer fails to build a BIN after it reads the kernel.

I wouldn't think it would, but others would know better than I.
ID: 1697746 · Report as offensive
Profile ReiAyanami
Joined: 6 Dec 05
Posts: 116
Credit: 222,900,202
RAC: 174
Japan
Message 1697789 - Posted: 2 Jul 2015, 17:50:34 UTC - in response to Message 1697746.  

I reverted back to 347.88 and still no AP.
Then I found a message under the Notices tab that says 'Your app_info.xml file doesn't have a usable version of AstroPulse v7'.
I don't know how this happened, but at least I know why no AP.
All I did was install Lunatics_Win64_v0.42 and split GPU to 3.
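
For anyone hitting the same notice, one way to see what an app_info.xml actually declares is to list its `<app_version>` entries. The sketch below is a rough check, not an official tool. It assumes the usual anonymous-platform layout with `<app_name>` and `<version_num>` inside each `<app_version>`; the short name `astropulse_v7` and the sample XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def declared_app_versions(app_info_xml: str) -> dict[str, list[str]]:
    """Map each app_name found in <app_version> blocks to its version numbers."""
    root = ET.fromstring(app_info_xml)
    versions: dict[str, list[str]] = {}
    for app_version in root.iter("app_version"):
        name = (app_version.findtext("app_name") or "").strip()
        number = (app_version.findtext("version_num") or "").strip()
        versions.setdefault(name, []).append(number)
    return versions


if __name__ == "__main__":
    # Made-up sample with no AstroPulse entry; a real file is much longer.
    sample = """
    <app_info>
      <app><name>setiathome_v7</name></app>
      <app_version>
        <app_name>setiathome_v7</app_name>
        <version_num>700</version_num>
      </app_version>
    </app_info>
    """
    found = declared_app_versions(sample)
    wanted = "astropulse_v7"  # assumed short name, verify against your own file
    if wanted in found:
        print(f"{wanted} declared with versions {found[wanted]}")
    else:
        print(f"no <app_version> for {wanted}; declared apps: {sorted(found)}")
```

To run it against a real installation, read the file's contents into a string and pass that in. If the AP entry is missing, rerunning the installer with the AP options selected is the simpler fix than editing the XML by hand.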
ID: 1697789 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1697792 - Posted: 2 Jul 2015, 17:54:20 UTC - in response to Message 1697789.  

I reverted back to 347.88 and still no AP.
Then I found a message under the Notices tab that says 'Your app_info.xml file doesn't have a usable version of AstroPulse v7'.
I don't know how this happened, but at least I know why no AP.
All I did was install Lunatics_Win64_v0.42 and split GPU to 3.

You might try rerunning the installer and making sure you make all the correct selections during the install. It sounds like perhaps you did not select AP properly.

Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1697792 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22444
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1697796 - Posted: 2 Jul 2015, 18:01:19 UTC

A couple of things. First, your caches are full, so until there is a bit of space there you will not get any.
Second, have you made sure that you have "AstroPulse v7" checked on your options page? This can take a bit of time to take effect.
Third, APs are generally in quite short supply, so there can be times when you don't get any despite everything appearing to be OK.

Ah, I re-read your note: if the message tab is saying that you haven't got a valid version of AP7, then you will not get any APs.
So, what options did you select in the Lunatics installer? (There is quite a choice, so it is fairly easy to choose the wrong one for your processor.)

Finally (for now), don't try three APs per GPU until you know that your system is stable and producing valid, error-free results with one per GPU.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1697796 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.