Panic Mode On (84) Server Problems?

Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1374760 - Posted: 1 Jun 2013, 14:19:43 UTC - in response to Message 1374751.  
Last modified: 1 Jun 2013, 14:23:09 UTC

The data is really flowing again, but some hosts are still receiving the wrong CUDA version. Did anyone else have the same problem?

Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)

You're welcome to look at my Kepler on Beta (Tests of new scheduler features)

I've updated the thread since you dropped by.


Only raises more questions for me, as per my response there


Yep different numbers than when I looked (which is certainly a factor in trying to work out if the mechanism is remotely working).

With nearly 1000 consecutive valid on Cuda5, are you suggesting the average processing rate for that is more or less accurate than the ~100 & ~300 for the lower Cuda revisions ?

With a given more or less random mix of tasks in a large enough population, which APR is correct ? The measured one or one concocted from a synthetic benchmark?

I'm not attempting to answer those questions myself, other than to suggest perhaps 100-300 tasks for a given app version isn't enough AR spread to dial in.

How many would make the averages relatively stable, and what would happen if you upgraded to 2 x Classified 780s, water cooled, or downgraded to an 8400GS? And should the system handle that without a reset of some sort?
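
As a rough illustration of the "how many tasks" question, here is a toy simulation. It is only a sketch: the uniform 150-250 rate range, the trial count and the function name are made up, and this is not how the BOINC scheduler computes APR. It shows that the spread of an averaged rate shrinks as more tasks are averaged:

```python
import random
import statistics


def simulated_apr_spread(n_tasks, trials=200, seed=0):
    """Std dev of the mean rate across repeated runs of n_tasks tasks each.

    Hypothetical model: each task's rate is drawn uniformly from 150..250,
    standing in for a random mix of angle ranges. Not BOINC's real formula.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        rates = [rng.uniform(150.0, 250.0) for _ in range(n_tasks)]
        means.append(statistics.fmean(rates))
    return statistics.pstdev(means)


# Compare how much the averaged rate wobbles for different sample sizes.
for n in (100, 300, 1000):
    print(n, round(simulated_apr_spread(n), 2))
```

In this toy model the wobble falls roughly with the square root of the task count, so going from 100 to 1000 tasks narrows it by about a factor of three. A hardware swap would change the underlying distribution entirely, which the sketch does not model.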

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1374760 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1374778 - Posted: 1 Jun 2013, 15:02:45 UTC - in response to Message 1374728.  
Last modified: 1 Jun 2013, 15:11:02 UTC

The data is really flowing again, but some hosts are still receiving the wrong CUDA version. Did anyone else have the same problem?


Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)

Sorry, I was sleeping. Here are some examples (all have run V7 from the beginning):

This host is a 590+2x580: http://setiathome.berkeley.edu/host_app_versions.php?hostid=5264653

It receives only cuda50; I expected cuda42, since all the GPUs are Fermis.

This is a 690+670 host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=6690764

It still receives cuda32 (a few WUs, something like 1 in 10; the rest are cuda50). I expected cuda50, since all the GPUs are Keplers.

All GPUs are running at stock speeds, and the Keplers are EVGA Classified/FTW.

I will check my other hosts for any abnormalities.

Hope that helps to clarify what I am trying to show. I still think the old way, where we chose the best apps ourselves, is better.
ID: 1374778 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1374788 - Posted: 1 Jun 2013, 15:13:59 UTC - in response to Message 1374778.  
Last modified: 1 Jun 2013, 15:15:26 UTC

Thanks, having a look at the first example first. Will start a new thread.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1374788 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1375137 - Posted: 2 Jun 2013, 6:40:01 UTC
Last modified: 2 Jun 2013, 6:44:24 UTC

That's interesting... the tasks page and the wuID page show credit as pending, but the taskID page shows [claimed...?] credit with a validate state of 'initial'.

Unless my wingmate reports before anyone reads this..

task 3011463763
workunit 1251615664

I see that same scenario for one other task, too. What seemed like probably about 10-12 hours ago, credit on the taskID was showing 0.00 if it was pending. What changed since then?


edit: or how about this one? workunit 1250118735
_0 had an error while computing but was given credit, _1 (me) is pending, and now I'm waiting for _2 to return it. *scratches head*
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1375137 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64898
Credit: 55,293,173
RAC: 49
United States
Message 1376079 - Posted: 3 Jun 2013, 18:19:37 UTC

Say, is anyone getting any CPU WUs?

I don't have any errors in the app info file, state or otherwise, and I've got CPU and both v6 and v7 checked, yet no CPU WUs. I've got plenty of GPU WUs, of course.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1376079 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13350
Credit: 208,696,464
RAC: 304
Australia
Message 1376092 - Posted: 3 Jun 2013, 18:47:18 UTC - in response to Message 1376079.  

Say, is anyone getting any CPU WUs?

Just installed Lunatics optimised, so I won't be getting any work till the time to completion settles back down.
Grant
Darwin NT
ID: 1376092 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64898
Credit: 55,293,173
RAC: 49
United States
Message 1376105 - Posted: 3 Jun 2013, 19:03:22 UTC - in response to Message 1376092.  

Say, is anyone getting any CPU WUs?

Just installed Lunatics optimised, so I won't be getting any work till the time to completion settles back down.

I finally got 11 CPU WUs, after a bit.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1376105 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13350
Credit: 208,696,464
RAC: 304
Australia
Message 1376586 - Posted: 4 Jun 2013, 17:56:26 UTC - in response to Message 1376105.  


Lower credit, over-the-top time estimates since installing the new optimised application, and now I'm getting sticky downloads. Multiple retries and they're still there.
Grant
Darwin NT
ID: 1376586 · Report as offensive
Thomas
Volunteer tester

Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1376589 - Posted: 4 Jun 2013, 17:58:58 UTC - in response to Message 1376586.  

Same here Grant & for my team too...
Before 575KC / On stage 476KC
ID: 1376589 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13350
Credit: 208,696,464
RAC: 304
Australia
Message 1376606 - Posted: 4 Jun 2013, 18:21:01 UTC - in response to Message 1376589.  


Don't know if they did some server tweaking during the outage. If they did, I hope they undo it ASAP.
All downloads sit there for several minutes before they start to download. When they do start, they time out partway through. It's taking multiple retries to download each WU, and the download rates have gone from 120kB/s down to less than 10kB/s.
Grant
Darwin NT
ID: 1376606 · Report as offensive
Bob Giel
Volunteer tester

Joined: 11 Jan 04
Posts: 76
Credit: 5,419,128
RAC: 0
United States
Message 1376630 - Posted: 4 Jun 2013, 19:27:40 UTC

Looks like we're back to the dreaded "transient HTTP error". Guess something on the backend side went "ka-put".


6/4/2013 2:19:14 PM | SETI@home | Started download of 21fe09ac.5884.5810.11.12.250
6/4/2013 2:19:14 PM | SETI@home | Started download of 21fe09ac.5884.5810.11.12.253
6/4/2013 2:19:16 PM | | Internet access OK - project servers may be temporarily down.
6/4/2013 2:19:58 PM | | Project communication failed: attempting access to reference site
6/4/2013 2:19:58 PM | SETI@home | Temporarily failed download of 21fe09ac.5884.5810.11.12.250: transient HTTP error
6/4/2013 2:19:58 PM | SETI@home | Backing off 40 min 35 sec on download of 21fe09ac.5884.5810.11.12.250
6/4/2013 2:19:59 PM | | Internet access OK - project servers may be temporarily down.
6/4/2013 2:21:19 PM | | Project communication failed: attempting access to reference site
6/4/2013 2:21:19 PM | SETI@home | Temporarily failed download of 21fe09ac.5884.5810.11.12.253: transient HTTP error
6/4/2013 2:21:19 PM | SETI@home | Backing off 52 min 9 sec on download of 21fe09ac.5884.5810.11.12.253
6/4/2013 2:21:20 PM | | Internet access OK - project servers may be temporarily down.
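
For anyone who wants to tally the back-offs from a log like the one above, here is a minimal sketch. It assumes the message wording is exactly as pasted (minutes and seconds only), and the function name is just for illustration:

```python
import re

# Two of the lines from the log above, copied verbatim.
LOG = """\
6/4/2013 2:19:58 PM | SETI@home | Backing off 40 min 35 sec on download of 21fe09ac.5884.5810.11.12.250
6/4/2013 2:21:19 PM | SETI@home | Backing off 52 min 9 sec on download of 21fe09ac.5884.5810.11.12.253
"""

# Assumes the "N min N sec" wording seen above; other formats are not handled.
BACKOFF = re.compile(r"Backing off (\d+) min (\d+) sec on download of (\S+)")


def backoff_seconds(log_text):
    """Map each file name to its back-off delay in seconds."""
    result = {}
    for line in log_text.splitlines():
        match = BACKOFF.search(line)
        if match:
            minutes, seconds = int(match.group(1)), int(match.group(2))
            result[match.group(3)] = minutes * 60 + seconds
    return result


print(backoff_seconds(LOG))
```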

ID: 1376630 · Report as offensive
Profile Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1377149 - Posted: 5 Jun 2013, 20:57:33 UTC
Last modified: 5 Jun 2013, 21:02:58 UTC

There must be something the servers didn't like about me installing MB v7. My laptop is down to 0 tasks per day for v7 after just receiving its first v7 WU, and my desktop is down to 33 per day. Neither computer has returned any results for v7... have I done something wrong? Or have the servers done something wrong?

I mean, 33 tasks per day is not an issue, but 0 per day actually means that this machine should never get any task for this application... that would be somewhat disappointing...
ID: 1377149 · Report as offensive
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1377229 - Posted: 6 Jun 2013, 0:19:45 UTC - in response to Message 1377149.  


I gather the laptop is the one with the (forgive the expression) "m processor".

Did you load v7 standard or Lunatics 0.41?
ID: 1377229 · Report as offensive
Profile Akio
Joined: 18 May 11
Posts: 375
Credit: 32,129,242
RAC: 0
United States
Message 1377242 - Posted: 6 Jun 2013, 1:30:21 UTC

Just for clarification: is it just me, or does it seem like these v7 WUs take longer to crunch?
ID: 1377242 · Report as offensive
Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1377246 - Posted: 6 Jun 2013, 1:45:16 UTC - in response to Message 1377242.  

Just for clarification: is it just me, or does it seem like these v7 WUs take longer to crunch?


They do take longer to crunch as they have additional items to search for.

ID: 1377246 · Report as offensive
Thomas
Volunteer tester

Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1377306 - Posted: 6 Jun 2013, 5:30:33 UTC - in response to Message 1377242.  

Same here samten
ID: 1377306 · Report as offensive
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1377331 - Posted: 6 Jun 2013, 6:26:41 UTC - in response to Message 1377306.  


Neither of you is imagining it. As per Arkayn, v7 WUs take longer to crunch, so you will do fewer WUs per day compared to v6 (albeit with more processing per WU in v7 than in v6).

There will also be an effect on your daily total.
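
A back-of-the-envelope way to see the effect on task counts. The runtimes below are made-up placeholders, not measured v6/v7 figures, and the function name is only for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tasks_per_day(runtime_seconds, concurrent_tasks=1):
    """Tasks finished per day if every task takes runtime_seconds."""
    return concurrent_tasks * SECONDS_PER_DAY / runtime_seconds


# Hypothetical runtimes: 2 hours per task before, 3 hours per task after.
before = tasks_per_day(2 * 3600)
after = tasks_per_day(3 * 3600)
print(before, after)
```

With those placeholder numbers a single core drops from 12 tasks a day to 8, even though it is busy the whole time in both cases.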





ID: 1377331 · Report as offensive
Profile Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1377347 - Posted: 6 Jun 2013, 7:27:47 UTC - in response to Message 1377229.  


I gather the laptop is the one with the (forgive the expression) "m processor".

Did you load v7 standard or Lunatics 0.41?

I installed the v7 app manually. The X2 has already started its first WU and runs OK so far. Neither of those machines has ever returned any MB v7 result, so I don't understand why their quotas are that low; for all other applications they started at 100.
ID: 1377347 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13350
Credit: 208,696,464
RAC: 304
Australia
Message 1377349 - Posted: 6 Jun 2013, 7:38:21 UTC - in response to Message 1377149.  

There must be something the servers didn't like about me installing MB v7. My laptop is down to 0 tasks per day for v7 after just receiving its first v7 WU, and my desktop is down to 33 per day. Neither computer has returned any results for v7... have I done something wrong? Or have the servers done something wrong?

Those stats are reset at midnight Berkeley time each day.
With the incredibly long runtimes of the v7 stock application I was only returning about 7 per day on my slow machine.
The important numbers are the "Max tasks per day" and "Consecutive valid tasks".

As of a couple of minutes ago, for my slow system:
Number of tasks completed 415
Max tasks per day 544
Number of tasks today 0
Consecutive valid tasks 444
Average processing rate 193.67956428601
Average turnaround time 0.87 days

Grant
Darwin NT
ID: 1377349 · Report as offensive
Profile Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1377366 - Posted: 6 Jun 2013, 9:27:49 UTC - in response to Message 1377349.  

There must be something the servers didn't like about me installing MB v7. My laptop is down to 0 tasks per day for v7 after just receiving its first v7 WU, and my desktop is down to 33 per day. Neither computer has returned any results for v7... have I done something wrong? Or have the servers done something wrong?

(...)
The important numbers are the "Max tasks per day" and "Consecutive valid tasks".

I'm talking about "Max tasks per day". That's 0 for my laptop and (now, after returning 1 valid result) 34 for my desktop.
ID: 1377366 · Report as offensive
©2022 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.