Panic Mode On (84) Server Problems?

Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 29561
Credit: 49,020,728
RAC: 16,630
Germany
Message 1374708 - Posted: 1 Jun 2013, 11:08:58 UTC

Yep, just got 52 tasks for GPU.


With each crime and every kindness we birth our future.

ID: 1374708
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 5847
Credit: 330,516,198
RAC: 7,684
Panama
Message 1374715 - Posted: 1 Jun 2013, 11:22:46 UTC

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?


ID: 1374715
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7229
Credit: 87,235,769
RAC: 7,350
Australia
Message 1374728 - Posted: 1 Jun 2013, 11:49:51 UTC - in response to Message 1374715.
Last modified: 1 Jun 2013, 11:50:23 UTC

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?


Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

ID: 1374728
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11136
Credit: 83,520,456
RAC: 41,232
United Kingdom
Message 1374751 - Posted: 1 Jun 2013, 13:28:33 UTC - in response to Message 1374728.

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?

Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)

You're welcome to look at my Kepler on Beta (Tests of new scheduler features)

I've updated the thread since you dropped by.

ID: 1374751
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7229
Credit: 87,235,769
RAC: 7,350
Australia
Message 1374760 - Posted: 1 Jun 2013, 14:19:43 UTC - in response to Message 1374751.
Last modified: 1 Jun 2013, 14:23:09 UTC

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?

Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)

You're welcome to look at my Kepler on Beta (Tests of new scheduler features)

I've updated the thread since you dropped by.


That only raises more questions for me, as per my response there:


Yep, different numbers than when I looked (which is certainly a factor in trying to work out if the mechanism is remotely working).

With nearly 1000 consecutive valid on Cuda5, are you suggesting the average processing rate (APR) for that is more or less accurate than the ~100 & ~300 for the lower Cuda revisions?

With a given more or less random mix of tasks in a large enough population, which APR is correct? The measured one, or one concocted from a synthetic benchmark?

I'm not attempting to answer those questions myself, other than to suggest that perhaps 100-300 tasks for a given app version isn't enough AR spread to dial in.

How many would make averages relatively stable, and what would happen if you upgraded to 2 x water-cooled Classified 780s, or downgraded to an 8400GS? And should the system handle that without a reset of some sort?
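
On the "how many tasks" question, here is a rough sketch of the general statistical point only. It is not how the BOINC scheduler actually computes APR. The per-task rates are made up (drawn uniformly between 50 and 150 in arbitrary units), and `relative_spread_of_mean` is a hypothetical helper name. It just shows how the scatter of an averaged rate shrinks as more tasks go into the average.

```python
import random
import statistics

def relative_spread_of_mean(n_tasks, trials=200, seed=0):
    """Return stdev/mean of the averaged rate across many simulated hosts.

    Each simulated host averages n_tasks per-task rates. The rates are a
    made-up uniform spread (50..150, arbitrary units), NOT real APR data.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.uniform(50.0, 150.0) for _ in range(n_tasks)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means) / statistics.fmean(means)

if __name__ == "__main__":
    # The scatter of the average falls roughly as 1/sqrt(n_tasks).
    for n in (10, 100, 1000):
        print(n, round(relative_spread_of_mean(n), 4))
```

Under these assumptions, going from about 100 to about 1000 tasks only tightens the average by a factor of roughly three, and none of this covers a hardware swap, where the old average is simply describing a different machine.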

"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

ID: 1374760
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 5847
Credit: 330,516,198
RAC: 7,684
Panama
Message 1374778 - Posted: 1 Jun 2013, 15:02:45 UTC - in response to Message 1374728.
Last modified: 1 Jun 2013, 15:11:02 UTC

The data is really flowing again, but I'm still receiving the wrong CUDA version on some hosts. Does anyone else have the same problem?


Let's analyse a specific host, Juan. Which host(s)? (Maybe start a thread for it.)

Sorry, I was sleeping... Here are some examples (all have run V7 from the beginning):

This host is a 590+2x580: http://setiathome.berkeley.edu/host_app_versions.php?hostid=5264653

It receives only cuda50; I expected cuda42, since all the GPUs are Fermis.

This is a 690+670 host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=6690764

It still receives cuda32 (a few WUs, something like 1 in 10; the rest are cuda50); I expected cuda50, since all the GPUs are Keplers.

All GPUs are running at stock speeds and the Keplers are EVGA Classified/FTW.

I will check my other hosts for any abnormalities.

Hope that helps to clarify what I'm trying to show. I still think the old way, where we choose the best apps, is better.

ID: 1374778
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7229
Credit: 87,235,769
RAC: 7,350
Australia
Message 1374788 - Posted: 1 Jun 2013, 15:13:59 UTC - in response to Message 1374778.
Last modified: 1 Jun 2013, 15:15:26 UTC

Thanks, having a look at the first example first. Will start a new thread.


"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

ID: 1374788
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2871
Credit: 10,620,139
RAC: 305
United States
Message 1375137 - Posted: 2 Jun 2013, 6:40:01 UTC
Last modified: 2 Jun 2013, 6:44:24 UTC

That's interesting... the tasks page and wuID page show credit as pending, but the taskID page shows [claimed...?] credit with a validate state of 'initial'.

Unless my wingmate reports before anyone reads this..

task 3011463763
workunit 1251615664

I see that same scenario for one other task, too. About 10-12 hours ago, I think, credit on the taskID page was showing 0.00 if it was pending. What changed since then?


edit: or how about this one? workunit 1250118735
_0 had an error while computing but was given credit, _1 (me) is pending, and now I'm waiting for _2 to return it. *scratches head*


Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)

ID: 1375137
Tutankhamon "Communist"
Volunteer tester
Joined: 1 Nov 08
Posts: 6081
Credit: 37,609,327
RAC: 15,361
Sweden
Message 1375197 - Posted: 2 Jun 2013, 7:49:06 UTC - in response to Message 1375137.

That's interesting... the tasks page and wuID page show credit as pending, but the taskID page shows [claimed...?] credit with a validate state of 'initial'.

Unless my wingmate reports before anyone reads this..

task 3011463763
workunit 1251615664

I see that same scenario for one other task, too. About 10-12 hours ago, I think, credit on the taskID page was showing 0.00 if it was pending. What changed since then?


edit: or how about this one? workunit 1250118735
_0 had an error while computing but was given credit, _1 (me) is pending, and now I'm waiting for _2 to return it. *scratches head*



Eric ran his credit-granting script and reset some data yesterday, because too little credit had been given to some APs. That's probably why you see what you see now.

http://setiathome.berkeley.edu/forum_thread.php?id=71827&postid=1375011
This is a test of the Emergency Moron System. Had there been a real moron in the room, there would've been a small mushroom cloud in the place where the idiot had been standing.

ID: 1375197
zoom314
Volunteer tester
Joined: 30 Nov 03
Posts: 56731
Credit: 40,722,803
RAC: 4,812
United States
Message 1376079 - Posted: 3 Jun 2013, 18:19:37 UTC

Say, is anyone getting any CPU WUs?

I don't have any errors in the app info file, state or otherwise. I've got CPU and both v6 and v7 checked, yet no CPU WUs. I've got plenty of GPU WUs, of course.


Pluto is still a planet.

Beep! Beep!

ID: 1376079
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 7475
Credit: 90,886,558
RAC: 45,297
Australia
Message 1376092 - Posted: 3 Jun 2013, 18:47:18 UTC - in response to Message 1376079.

Say, is anyone getting any CPU WUs?

Just installed the Lunatics optimised apps, so I won't be getting any work till the time to completion settles back down.
Grant
Darwin NT

ID: 1376092
zoom314
Volunteer tester
Joined: 30 Nov 03
Posts: 56731
Credit: 40,722,803
RAC: 4,812
United States
Message 1376105 - Posted: 3 Jun 2013, 19:03:22 UTC - in response to Message 1376092.

Say, is anyone getting any CPU WUs?

Just installed the Lunatics optimised apps, so I won't be getting any work till the time to completion settles back down.

I finally got 11 CPU WUs, after a bit.
Pluto is still a planet.

Beep! Beep!

ID: 1376105
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 7475
Credit: 90,886,558
RAC: 45,297
Australia
Message 1376586 - Posted: 4 Jun 2013, 17:56:26 UTC - in response to Message 1376105.


Lower credit, over-the-top time estimates since installing the new optimised application, and now I'm getting sticky downloads. Multiple retries and they're still there.


Grant
Darwin NT

ID: 1376586
Thomas
Volunteer tester
Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1376589 - Posted: 4 Jun 2013, 17:58:58 UTC - in response to Message 1376586.

Same here, Grant, and for my team too...
Before 575KC / On stage 476KC


ID: 1376589
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 7475
Credit: 90,886,558
RAC: 45,297
Australia
Message 1376606 - Posted: 4 Jun 2013, 18:21:01 UTC - in response to Message 1376589.


Don't know if they did some server tweaking during the outage. If they did, I hope they undo it ASAP.
All downloads sit there for several minutes before they start to download. When they do start, they time out partway through. It's taking multiple retries to download each WU, and the download rates have gone from 120kB/s down to less than 10kB/s.


Grant
Darwin NT

ID: 1376606
Bob Giel
Volunteer tester
Joined: 11 Jan 04
Posts: 74
Credit: 5,281,610
RAC: 267
United States
Message 1376630 - Posted: 4 Jun 2013, 19:27:40 UTC

Looks like we're back to the dreaded "transient HTTP error". Guess something on the backend side went "ka-put".


6/4/2013 2:19:14 PM | SETI@home | Started download of 21fe09ac.5884.5810.11.12.250
6/4/2013 2:19:14 PM | SETI@home | Started download of 21fe09ac.5884.5810.11.12.253
6/4/2013 2:19:16 PM | | Internet access OK - project servers may be temporarily down.
6/4/2013 2:19:58 PM | | Project communication failed: attempting access to reference site
6/4/2013 2:19:58 PM | SETI@home | Temporarily failed download of 21fe09ac.5884.5810.11.12.250: transient HTTP error
6/4/2013 2:19:58 PM | SETI@home | Backing off 40 min 35 sec on download of 21fe09ac.5884.5810.11.12.250
6/4/2013 2:19:59 PM | | Internet access OK - project servers may be temporarily down.
6/4/2013 2:21:19 PM | | Project communication failed: attempting access to reference site
6/4/2013 2:21:19 PM | SETI@home | Temporarily failed download of 21fe09ac.5884.5810.11.12.253: transient HTTP error
6/4/2013 2:21:19 PM | SETI@home | Backing off 52 min 9 sec on download of 21fe09ac.5884.5810.11.12.253
6/4/2013 2:21:20 PM | | Internet access OK - project servers may be temporarily down.
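
For anyone wanting to tally how long their downloads are being held off, here is a minimal sketch that pulls the back-off durations out of lines like the ones above. The line format is assumed purely from this excerpt ("Backing off N min M sec on download of FILE"); other client versions or longer back-offs may word it differently. `backoff_seconds` is a hypothetical helper, not part of BOINC.

```python
import re

# Pattern assumed from the log excerpt above; the "N min" part is optional.
BACKOFF_RE = re.compile(
    r"Backing off (?:(\d+) min )?(\d+) sec on download of (\S+)"
)

def backoff_seconds(log_text):
    """Map each file name to its most recent back-off, in seconds."""
    result = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            minutes = int(match.group(1) or 0)
            seconds = int(match.group(2))
            result[match.group(3)] = minutes * 60 + seconds
    return result

# Two lines copied from the log above.
SAMPLE_LOG = """\
6/4/2013 2:19:58 PM | SETI@home | Backing off 40 min 35 sec on download of 21fe09ac.5884.5810.11.12.250
6/4/2013 2:21:19 PM | SETI@home | Backing off 52 min 9 sec on download of 21fe09ac.5884.5810.11.12.253
"""

if __name__ == "__main__":
    for name, secs in backoff_seconds(SAMPLE_LOG).items():
        print(f"{name}: {secs} s")
```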


ID: 1376630
Link
Joined: 18 Sep 03
Posts: 805
Credit: 1,678,562
RAC: 44
Germany
Message 1377149 - Posted: 5 Jun 2013, 20:57:33 UTC
Last modified: 5 Jun 2013, 21:02:58 UTC

There must be something the servers didn't like about me installing MB v7. My laptop is down to 0 tasks per day for v7 after just receiving its first v7 WU, and my desktop is down to 33 per day. Neither computer has returned any results for v7 yet... have I done something wrong? Or have the servers done something wrong?

I mean, 33 tasks per day is not an issue, but 0 per day actually means that this machine should never get any tasks for this application... that would be somewhat disappointing...


.

ID: 1377149
Lionel
Joined: 25 Mar 00
Posts: 664
Credit: 350,361,769
RAC: 131,118
Australia
Message 1377229 - Posted: 6 Jun 2013, 0:19:45 UTC - in response to Message 1377149.


I gather the laptop is the one with the (forgive the expression) "m processor".

Did you load standard v7 or Lunatics 0.41?


ID: 1377229
Akio
Project Donor
Joined: 18 May 11
Posts: 373
Credit: 26,369,849
RAC: 2,321
United States
Message 1377242 - Posted: 6 Jun 2013, 1:30:21 UTC

Just for clarification: is it just me, or does it seem like these v7 WUs take longer to crunch?


ID: 1377242
arkayn
Project Donor
Volunteer tester
Joined: 14 May 99
Posts: 4097
Credit: 51,576,090
RAC: 1,593
United States
Message 1377246 - Posted: 6 Jun 2013, 1:45:16 UTC - in response to Message 1377242.

Just for clarification: is it just me, or does it seem like these v7 WUs take longer to crunch?


They do take longer to crunch as they have additional items to search for.

ID: 1377246


 
©2016 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.