Panic Mode On (84) Server Problems?


Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 23803
Credit: 32,628,641
RAC: 23,626
Germany
Message 1374708 - Posted: 1 Jun 2013, 11:08:58 UTC

Yep, just got 52 tasks for GPU.

____________

juan BFB (Project donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 5229
Credit: 285,135,221
RAC: 455,084
Brazil
Message 1374715 - Posted: 1 Jun 2013, 11:22:46 UTC

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?
____________

jason_gee (Project donor)
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 4964
Credit: 73,106,249
RAC: 15,247
Australia
Message 1374728 - Posted: 1 Jun 2013, 11:49:51 UTC - in response to Message 1374715.
Last modified: 1 Jun 2013, 11:50:23 UTC

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?


Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 8465
Credit: 48,955,052
RAC: 75,445
United Kingdom
Message 1374751 - Posted: 1 Jun 2013, 13:28:33 UTC - in response to Message 1374728.

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?

Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)

You're welcome to look at my Kepler on Beta (Tests of new scheduler features)

I've updated the thread since you dropped by.

jason_gee (Project donor)
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 4964
Credit: 73,106,249
RAC: 15,247
Australia
Message 1374760 - Posted: 1 Jun 2013, 14:19:43 UTC - in response to Message 1374751.
Last modified: 1 Jun 2013, 14:23:09 UTC

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?

Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)

You're welcome to look at my Kepler on Beta (Tests of new scheduler features)

I've updated the thread since you dropped by.


Only raises more questions for me, as per my response there


Yep different numbers than when I looked (which is certainly a factor in trying to work out if the mechanism is remotely working).

With nearly 1000 consecutive valid on Cuda5, are you suggesting the average processing rate for that is more or less accurate than the ~100 & ~300 for the lower Cuda revisions ?

With a given more or less random mix of tasks in a large enough population, which APR is correct ? The measured one or one concocted from a synthetic benchmark?

I'm not attempting to answer those questions myself, other than to suggest perhaps 100-300 tasks for a given app version isn't enough AR spread to dial in.

How many would make the averages relatively stable, and what would happen if you upgraded to 2 x water-cooled Classified 780s, or downgraded to an 8400GS? And should the system handle that without a reset of some sort?
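
To put a rough number on "how many tasks is enough", here is a minimal sketch. It assumes a made-up host whose per-task processing rate is spread uniformly within plus or minus 50% of a true value of 100; the names `simulate_rates` and `average_with_error` are hypothetical, and none of this is the scheduler's actual code. It only shows how the uncertainty of a plain average shrinks as more tasks are included.

```python
import math
import random
import statistics


def simulate_rates(n_tasks=1000, true_rate=100.0, spread=0.5, seed=1):
    """Generate hypothetical per-task processing rates (assumed model, not real data)."""
    rng = random.Random(seed)
    # Each task's rate is the true rate scaled by a uniform factor,
    # standing in for the effect of a random mix of task types.
    return [true_rate * rng.uniform(1.0 - spread, 1.0 + spread)
            for _ in range(n_tasks)]


def average_with_error(rates, n):
    """Return the mean of the first n rates and its standard error."""
    sample = rates[:n]
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / math.sqrt(n)
    return mean, stderr


if __name__ == "__main__":
    rates = simulate_rates()
    for n in (10, 100, 300, 1000):
        mean, stderr = average_with_error(rates, n)
        print(f"{n:5d} tasks: mean {mean:7.2f} +/- {stderr:.2f}")
```

Under this assumed model the standard error falls roughly as 1/sqrt(n), so going from 100 to 1000 tasks narrows the estimate by about a factor of three. A real task mix is unlikely to be uniform, so treat the numbers as illustrative only.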

____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

juan BFB (Project donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 5229
Credit: 285,135,221
RAC: 455,084
Brazil
Message 1374778 - Posted: 1 Jun 2013, 15:02:45 UTC - in response to Message 1374728.
Last modified: 1 Jun 2013, 15:11:02 UTC

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?


Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)

Sorry, I was sleeping. Here are some examples (all have run V7 from the beginning):

This host is a 590 + 2x580: http://setiathome.berkeley.edu/host_app_versions.php?hostid=5264653

It receives only cuda50; I expected cuda42, since all the GPUs are Fermis.

This is a 690 + 670 host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=6690764

It still receives cuda32 (a few WUs, something like 1 in 10; the rest are cuda50). I expected cuda50, since all the GPUs are Keplers.

All GPUs are running at stock speeds, and the Keplers are EVGA Classified/FTW.

I will check my other hosts for any abnormalities.

Hope that helps to clarify what I'm trying to show. I still think the old way, where we chose the best apps ourselves, is better.
____________

jason_gee (Project donor)
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 4964
Credit: 73,106,249
RAC: 15,247
Australia
Message 1374788 - Posted: 1 Jun 2013, 15:13:59 UTC - in response to Message 1374778.
Last modified: 1 Jun 2013, 15:15:26 UTC

Thanks, having a look at the first example first. Will start a new thread.
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2247
Credit: 8,597,427
RAC: 4,259
United States
Message 1375137 - Posted: 2 Jun 2013, 6:40:01 UTC
Last modified: 2 Jun 2013, 6:44:24 UTC

That's interesting... the tasks page and wuID page show credit as pending, but the taskID page shows [claimed...?] credit with a validate state of 'initial'.

Unless my wingmate reports before anyone reads this..

task 3011463763
workunit 1251615664

I see that same scenario for one other task, too. What seemed like probably about 10-12 hours ago, credit on the taskID was showing 0.00 if it was pending. What changed since then?


edit: or how about this one? workunit 1250118735
_0 had an error while computing but was given credit, _1 (me) is pending, and now I'm waiting for _2 to return it. *scratches head*
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3406
Credit: 19,620,932
RAC: 18,351
Sweden
Message 1375197 - Posted: 2 Jun 2013, 7:49:06 UTC - in response to Message 1375137.

That's interesting... the tasks page and wuID page show credit as pending, but the taskID page shows [claimed...?] credit with a validate state of 'initial'.

Unless my wingmate reports before anyone reads this..

task 3011463763
workunit 1251615664

I see that same scenario for one other task, too. What seemed like probably about 10-12 hours ago, credit on the taskID was showing 0.00 if it was pending. What changed since then?


edit: or how about this one? workunit 1250118735
_0 had an error while computing but was given credit, _1 (me) is pending, and now I'm waiting for _2 to return it. *scratches head*



Eric did run his credit granting script, and reset some data yesterday, because too low credit was given to some APs. That's probably why you see what you see now.

http://setiathome.berkeley.edu/forum_thread.php?id=71827&postid=1375011
____________

zoom314 (Project donor)
Joined: 30 Nov 03
Posts: 46122
Credit: 36,593,924
RAC: 5,358
Message 1376079 - Posted: 3 Jun 2013, 18:19:37 UTC

Say, is anyone getting any CPU WUs?

I don't have any errors in the app info file, state or otherwise, and I've got CPU and both v6 and v7 checked, yet I get no CPU WUs. I've got plenty of GPU WUs, of course.
____________
My Facebook, War Commander, 2015

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5791
Credit: 58,038,641
RAC: 48,271
Australia
Message 1376092 - Posted: 3 Jun 2013, 18:47:18 UTC - in response to Message 1376079.

Say, is anyone getting any CPU WUs?

Just installed Lunatics optimised, so I won't be getting any work till the time to completion settles back down.
____________
Grant
Darwin NT.

zoom314 (Project donor)
Joined: 30 Nov 03
Posts: 46122
Credit: 36,593,924
RAC: 5,358
Message 1376105 - Posted: 3 Jun 2013, 19:03:22 UTC - in response to Message 1376092.

Say, is anyone getting any CPU WUs?

Just installed Lunatics optimised, so I won't be getting any work till the time to completion settles back down.

I finally got 11 CPU WUs, after a bit.
____________
My Facebook, War Commander, 2015

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5791
Credit: 58,038,641
RAC: 48,271
Australia
Message 1376586 - Posted: 4 Jun 2013, 17:56:26 UTC - in response to Message 1376105.


Lower credit, over-the-top time estimates since installing the new optimised application, and now I'm getting sticky downloads. Multiple retries and they're still there.
____________
Grant
Darwin NT.

{BDC} Thomas Dupont (Project donor)
Volunteer tester
Joined: 9 Dec 11
Posts: 3726
Credit: 1,310,582
RAC: 780
France
Message 1376589 - Posted: 4 Jun 2013, 17:58:58 UTC - in response to Message 1376586.

Same here Grant & for my team too...
Before 575KC / On stage 476KC
____________
Team Founder BRIGADE DU COSMOS




BRIGADE DU COSMOS is proudly sponsored by Zenovia Digital Exchange

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5791
Credit: 58,038,641
RAC: 48,271
Australia
Message 1376606 - Posted: 4 Jun 2013, 18:21:01 UTC - in response to Message 1376589.


Don't know if they did some server tweaking during the outage. If they did, I hope they undo it ASAP.
All downloads sit there for several minutes before they start, and when they do start they time out partway through. It's taking multiple retries to download each WU, and the download rates have gone from 120kB/s down to less than 10kB/s.
____________
Grant
Darwin NT.

Bob Giel
Volunteer tester
Joined: 11 Jan 04
Posts: 55
Credit: 5,028,706
RAC: 165
United States
Message 1376630 - Posted: 4 Jun 2013, 19:27:40 UTC

Looks like we're back to the dreaded "transient HTTP error". Guess something on the backend side went "ka-put".


6/4/2013 2:19:14 PM | SETI@home | Started download of 21fe09ac.5884.5810.11.12.250
6/4/2013 2:19:14 PM | SETI@home | Started download of 21fe09ac.5884.5810.11.12.253
6/4/2013 2:19:16 PM | | Internet access OK - project servers may be temporarily down.
6/4/2013 2:19:58 PM | | Project communication failed: attempting access to reference site
6/4/2013 2:19:58 PM | SETI@home | Temporarily failed download of 21fe09ac.5884.5810.11.12.250: transient HTTP error
6/4/2013 2:19:58 PM | SETI@home | Backing off 40 min 35 sec on download of 21fe09ac.5884.5810.11.12.250
6/4/2013 2:19:59 PM | | Internet access OK - project servers may be temporarily down.
6/4/2013 2:21:19 PM | | Project communication failed: attempting access to reference site
6/4/2013 2:21:19 PM | SETI@home | Temporarily failed download of 21fe09ac.5884.5810.11.12.253: transient HTTP error
6/4/2013 2:21:19 PM | SETI@home | Backing off 52 min 9 sec on download of 21fe09ac.5884.5810.11.12.253
6/4/2013 2:21:20 PM | | Internet access OK - project servers may be temporarily down.
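
For anyone who wants to total up the back-off times from a longer log, here is a minimal sketch. It assumes the messages look exactly like the lines above ("Backing off N min N sec on download of NAME"); other message formats are not handled, and the function name `download_backoffs` is made up for this example.

```python
import re

# Matches only the message format shown in the log excerpt above.
BACKOFF = re.compile(
    r"Backing off (?:(\d+) min )?(\d+) sec on download of (\S+)"
)


def download_backoffs(log_text):
    """Map each file name to its most recent back-off, in seconds."""
    result = {}
    for line in log_text.splitlines():
        match = BACKOFF.search(line)
        if match:
            minutes = int(match.group(1) or 0)
            seconds = int(match.group(2))
            result[match.group(3)] = minutes * 60 + seconds
    return result


if __name__ == "__main__":
    sample = (
        "6/4/2013 2:19:58 PM | SETI@home | Backing off 40 min 35 sec "
        "on download of 21fe09ac.5884.5810.11.12.250\n"
        "6/4/2013 2:21:19 PM | SETI@home | Backing off 52 min 9 sec "
        "on download of 21fe09ac.5884.5810.11.12.253\n"
    )
    for name, secs in download_backoffs(sample).items():
        print(f"{name}: {secs} s")
```

Run against the two back-off lines above, this gives 2435 s and 3129 s.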

____________

Link
Joined: 18 Sep 03
Posts: 828
Credit: 1,559,292
RAC: 410
Germany
Message 1377149 - Posted: 5 Jun 2013, 20:57:33 UTC
Last modified: 5 Jun 2013, 21:02:58 UTC

There must be something the servers didn't like about me installing MB v7. My laptop is down to 0 tasks per day for v7 after receiving just its first v7 WU, and my desktop is down to 33 per day. Neither computer has returned any results for v7 yet... Have I done something wrong? Or have the servers done something wrong?

I mean, 33 tasks per day is not an issue, but 0 per day actually means that this machine should never get any tasks for this application... That would be somewhat disappointing...
____________
.

Lionel
Joined: 25 Mar 00
Posts: 544
Credit: 224,337,830
RAC: 215,673
Australia
Message 1377229 - Posted: 6 Jun 2013, 0:19:45 UTC - in response to Message 1377149.


I gather the laptop is the one with the (forgive the expression) "m processor".

Did you load v7 standard or Lunatics 0.41?
____________

Sliver (Project donor)
Joined: 18 May 11
Posts: 281
Credit: 7,058,946
RAC: 915
United States
Message 1377242 - Posted: 6 Jun 2013, 1:30:21 UTC

Just for clarification - Is it just me, or does it seem like these v7 WU's take longer to crunch?
____________

arkayn (Project donor)
Volunteer tester
Joined: 14 May 99
Posts: 3623
Credit: 48,548,246
RAC: 29,622
United States
Message 1377246 - Posted: 6 Jun 2013, 1:45:16 UTC - in response to Message 1377242.

Just for clarification - Is it just me, or does it seem like these v7 WU's take longer to crunch?


They do take longer to crunch as they have additional items to search for.
____________

Copyright © 2014 University of California