Panic Mode On (23) Server problems

Michele Di Masi (KE6GRN)
Joined: 26 Aug 09
Posts: 3
Credit: 40,244
RAC: 0
United States
Message 930391 - Posted: 2 Sep 2009, 16:12:46 UTC

Hi All,

I have not gotten any assignments for almost two days. Is there a problem with SETI? Has anybody else been having problems getting assignments? Are the SETI servers down? If so, does anybody know when they will come back on?

Thanks,
Michele Di Masi
ID: 930391
Patrick Pearson
Joined: 19 May 99
Posts: 6
Credit: 12,206,167
RAC: 15
United Kingdom
Message 930392 - Posted: 2 Sep 2009, 16:22:48 UTC

Part of the problem of lack of work may be the number of work units that users are hogging on their systems. I have just checked back on a WU one of my computers finished in July and found that the person who is also crunching that unit has over 1200 work units waiting to be processed. There is no way he is going to get them done in time. How do users manage to download and hog so many units and so leave little or no work for many of us?

Is there not a ceiling on the number of units any one computer can download at any one time? Surely it should be linked in some way to the processing capacity of that particular machine or it just wastes everyone's time.
ID: 930392
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 930393 - Posted: 2 Sep 2009, 16:34:27 UTC - in response to Message 930361.  

Can someone please explain why the technical folks have not given any indication of what is currently wrong? It's been this way for quite a while now. Certainly long enough for a comment or explanation. I understand that they are overworked, BUT... it is just common courtesy.

Matt is the one who generally writes, so if we aren't getting updates, it probably means he's on vacation.

Eric has posted in a couple of threads, but as you point out, they're short on staff, and if Matt is on vacation then they're down by 1/3rd.

Besides the project is doing what they've always promised. They've always promised that there will be times when there isn't enough to go around.
ID: 930393
David @ TPS
Joined: 30 Sep 04
Posts: 70
Credit: 11,323,275
RAC: 0
United States
Message 930396 - Posted: 2 Sep 2009, 16:42:12 UTC - in response to Message 930392.  

Patrick:

Not being sure where you got that data, I have probably that many "pending" on my user page, but they are spread over probably 15 boxes and 50 cores. My cache is set for 10 days. I could suspend SETI for a week or 2 and still (probably) finish them in time, but I won't. I am just increasing the shares of my backup projects. BOINC will throw SETI into High Priority if it feels it needs to.

If you look at individual computers, I have probably 50 or less on each box.

Dave
ID: 930396
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 930400 - Posted: 2 Sep 2009, 16:48:57 UTC - in response to Message 930396.  

Patrick:

Not being sure where you got that data, I have probably that many "pending" on my user page, but they are spread over probably 15 boxes and 50 cores. My cache is set for 10 days. I could suspend SETI for a week or 2 and still (probably) finish them in time, but I won't. I am just increasing the shares of my backup projects. BOINC will throw SETI into High Priority if it feels it needs to.

If you look at individual computers, I have probably 50 or less on each box.

Dave


I have seen several users posting "I just downloaded 800, 1500, or 1200 tasks" after an outage, once the servers come back up. Some of these are users with just 1 or 2 machines.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 930400
Salvador Lopez
Joined: 20 May 99
Posts: 12
Credit: 2,183,995
RAC: 0
Mexico
Message 930402 - Posted: 2 Sep 2009, 16:58:35 UTC - in response to Message 930391.  

I've been in exactly the same position since yesterday. I receive:
02/09/2009 09:53:41 a.m. SETI@home Message from server: (Project has no jobs available)

ID: 930402
David @ TPS
Joined: 30 Sep 04
Posts: 70
Credit: 11,323,275
RAC: 0
United States
Message 930403 - Posted: 2 Sep 2009, 17:03:20 UTC - in response to Message 930400.  

WOW! Guess my machines aren't that greedy!
ID: 930403
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 930408 - Posted: 2 Sep 2009, 17:19:10 UTC - in response to Message 930402.  

This tells you what we are dealing with currently.

Results ready to send	1	5	0m
Current result creation rate	0.1940/sec	0.5151/sec	5m


Currently there is no work being created so there is nothing to be sent out.

Keep an eye on this link if you see those messages.
http://setiathome.berkeley.edu/sah_status.html

ID: 930408
Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 930409 - Posted: 2 Sep 2009, 17:20:54 UTC - in response to Message 930392.  

...There is no way he is going to get them done in time. How do users manage to download and hog so many units and so leave little or no work for many of us?

Why not? On some CUDA GPUs, a WU takes about 5 minutes to finish (some more, some less). That makes 288 per day per GPU. Some hosts have 4 GPUs; that makes 1152 tasks per day. So the host in question probably has a cache of one or two days (for bigger caches, BOINC Manager seems to get unstable).
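
For anyone who wants to redo that arithmetic with their own numbers, here is a minimal sketch. It assumes every GPU runs one task at a time, back to back, with no idle gaps; the function names are made up for illustration and are not part of BOINC.

```python
MINUTES_PER_DAY = 24 * 60


def tasks_per_day(minutes_per_task: float, gpus: int = 1) -> float:
    """Tasks a host finishes per day if each GPU runs tasks back to back."""
    return MINUTES_PER_DAY / minutes_per_task * gpus


def cache_days(queued_tasks: int, minutes_per_task: float, gpus: int) -> float:
    """How many days of work a queue of the given size represents."""
    return queued_tasks / tasks_per_day(minutes_per_task, gpus)


print(tasks_per_day(5))                   # 288.0 (one GPU, 5 minutes per WU)
print(tasks_per_day(5, gpus=4))           # 1152.0 (four GPUs)
print(round(cache_days(1200, 5, 4), 2))   # 1.04 (days of work in a 1200-task queue)
```

On those assumptions, a queue of 1200 tasks on a 4-GPU host is roughly one day of work, not a hoard.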

Regards,
Gundolf
Computers aren't everything in life. (Just kidding)

SETI@home classic workunits 3,758
SETI@home classic CPU time 66,520 hours
ID: 930409
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 930411 - Posted: 2 Sep 2009, 17:36:24 UTC - in response to Message 930393.  
Last modified: 2 Sep 2009, 17:38:14 UTC

Can someone please explain why the technical folks have not given any indication of what is currently wrong? It's been this way for quite a while now. Certainly long enough for a comment or explanation. I understand that they are overworked, BUT... it is just common courtesy.

Matt is the one who generally writes, so if we aren't getting updates, it probably means he's on vacation.

Eric has posted in a couple of threads, but as you point out, they're short on staff, and if Matt is on vacation then they're down by 1/3rd.

Besides the project is doing what they've always promised. They've always promised that there will be times when there isn't enough to go around.


LOL...

Especially when they are doing things like bringing a new Master BOINC Database Server online, rolling out ATi GPU support, getting started on a new school year, and probably a dozen other items I'm not thinking of right offhand! ;-)

The part I'm wondering about is what's the story with Jocelyn being disabled? This means that all queries have to go through Mork, which isn't going to help its performance if past history is any indication. I suppose it could be they are trying to see how much grunt Mork has. The alternative is something we probably really don't want to think about (went in-Sidious on us)! :-D

Alinator
ID: 930411
Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 930412 - Posted: 2 Sep 2009, 17:41:38 UTC - in response to Message 930411.  
Last modified: 2 Sep 2009, 17:44:10 UTC

...(went in-Sidious on us)! :-D

rotflmao

[edit]When was Sidious taken out of service again?[/edit]
ID: 930412
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 930413 - Posted: 2 Sep 2009, 17:47:22 UTC - in response to Message 930412.  

A week or ten days ago, IIRC.

Alinator
ID: 930413
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 930415 - Posted: 2 Sep 2009, 18:24:01 UTC

Or more simply:

1.) Yes

2.) Yes

3.) No, but currently as useful as pockets in underwear.

4.) No, but we could always start a pool on when the jam will clear!

:-D

Alinator
ID: 930415
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 930419 - Posted: 2 Sep 2009, 18:29:00 UTC - in response to Message 930408.  

This tells you what we are dealing with currently.

Results ready to send	1	5	0m
Current result creation rate	0.1940/sec	0.5151/sec	5m


Currently there is no work being created so there is nothing to be sent out.

If the creation rate is not zero, then work is being created. The numbers are pretty small, so there is likely more demand than there is work, and many requests will go unfilled.
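
To put "pretty small" into perspective, a rough sketch: it converts the two per-second creation rates quoted above into results per day, then compares one of them with a single fast host. The thread does not say which application each rate column belongs to, so they are treated here simply as two sample rates. The 1152 tasks/day figure is the 4-GPU estimate given earlier in this thread; the helper names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def results_per_day(rate_per_sec: float) -> float:
    """Convert a creation rate in results/second to results/day."""
    return rate_per_sec * SECONDS_PER_DAY


def hosts_supplied(rate_per_sec: float, host_tasks_per_day: float) -> float:
    """Number of hosts with the given daily appetite that this rate could keep busy."""
    return results_per_day(rate_per_sec) / host_tasks_per_day


# The two rates quoted from the server status page.
for rate in (0.1940, 0.5151):
    print(f"{rate:.4f}/sec is about {results_per_day(rate):,.0f} results/day")

# Assumption: a host that chews through 1152 tasks per day (4 GPUs, 5 min per WU).
print(f"{hosts_supplied(0.1940, 1152):.1f} such hosts kept busy")
```

So a non-zero rate does mean work is being split, but at the lower rate only about a dozen or so hosts of that class could be kept fed, which fits with many requests going unfilled.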

ID: 930419
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 930422 - Posted: 2 Sep 2009, 18:32:18 UTC - in response to Message 930411.  


LOL...

Especially when they are doing things like bringing a new Master BOINC Database Server online, rolling out ATi GPU support, getting started on a new school year, and probably a dozen other items I'm not thinking of right offhand! ;-)

The part I'm wondering about is what's the story with Jocelyn being disabled? This means that all queries have to go through Mork, which isn't going to help its performance if past history is any indication. I suppose it could be they are trying to see how much grunt Mork has. The alternative is something we probably really don't want to think about (went in-Sidious on us)! :-D

Alinator

Do we know that SETI is working on ATI support? We know that BOINC is working on that, but first BOINC has to know how to detect and schedule another GPU type, and then someone has to develop the app.

I don't know how Mork performs, and I'm not going to second-guess the reason for the change. Someone will probably tell us eventually.
ID: 930422
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 930427 - Posted: 2 Sep 2009, 18:42:36 UTC - in response to Message 930422.  

I don't know about application support for MB and AP, but we do know we are being used for testing the BOINC framework support for ATi GPUs, which was what I was referring to in this case.

So as far as that goes, I don't remember a case where something new got added to BOINC which didn't break three other things or cause all kinds of unexpected problems. ;-)

In any event, since I run multiple projects the net impact on my hosts overall is zero regardless of what happens here! :-D

Alinator
ID: 930427
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 930430 - Posted: 2 Sep 2009, 18:44:54 UTC - in response to Message 930422.  


LOL...

Especially when they are doing things like bringing a new Master BOINC Database Server online, rolling out ATi GPU support, getting started on a new school year, and probably a dozen other items I'm not thinking of right offhand! ;-)

The part I'm wondering about is what's the story with Jocelyn being disabled? This means that all queries have to go through Mork, which isn't going to help its performance if past history is any indication. I suppose it could be they are trying to see how much grunt Mork has. The alternative is something we probably really don't want to think about (went in-Sidious on us)! :-D

Alinator

Do we know that SETI is working on ATI support? We know that BOINC is working on that, but first BOINC has to know how to detect and schedule another GPU type, and then someone has to develop the app.

I don't know how Mork performs, and I'm not going to second-guess the reason for the change. Someone will probably tell us eventually.


Also there is the chance of ATI & Nvidia cards being mixed in one machine. I'm sure someone will have that configuration. I would for testing at least. BOINC then has to track x # of CPUs, y # of Nvidia GPUs, & z # of ATI GPUs. Might get messy in there.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 930430
52 Aces
Joined: 7 Jan 02
Posts: 497
Credit: 14,261,068
RAC: 67
United States
Message 930431 - Posted: 2 Sep 2009, 18:47:46 UTC - in response to Message 930409.  

2 newbie cents about being a Wingman Hostage: My machine (a NetBook w/Atom and 3G -- yes, mobile all day long) is lucky to crank out 1 WU a day, but despite that, it seems I'm finding myself lately waiting up to 6+ days for a Wingman (Anvilman) with some heaven sent dream machine to pipe in with their Concur so that my tiny Acct can be credited. And I agree, I too see Wingman queues with hundreds of In-Progress items, and WORSE, they're not always being tackled in DATE order (and sometimes even have in-progress for deadline expired items). I'm glad that what takes me 30,000 seconds only takes them 120, but I've no accurate idea of when that 120 might begin.

I know the queuing system is complex and evolved to optimize the complex needs of the project and the conflicting needs of participants. And having a 30+ year predictive scope makes SETI the wrong project for people gratified by real-time measurable progress, especially as a quorum is apparently required to keep the TI honest in their search for ETI, but perhaps the rules can be changed up a bit so those massive contributing heavy lifters (and I fully accept their contribution far outstrips mine thus they're entitled to a bigger slice of the pie -- even though others call this hogging or land grabbing) don't hold hostage us multitude of little guys (the larger community of public support which is equally beneficial to the project from an awareness & political perspective).

#1. Perhaps try and pair wing partners a bit closer on Avg turn around time. Don't pair a 1-dayer with 6-dayer. Keep it within 100%.

#2. Or perhaps change to CREDIT the work performed assumptive that the Concur will come thru (which I bet it does 99.998% of the time), and debit it later if that assumption proves false (ya know, just like a bank balance shows checks that have not yet cleared). You can govern this to historical daily WU's for that individual.

#3. Or perhaps slightly skew the points awarded to FAVOR the faster turn-around guy (still dependent on across the board results Concur). Maybe swipe 10% of the delaying partner's eventual credit for the WU for each day of wait time since initial result submission (gets a bit tricky if quorums of 3 or 4 are required -- but what a great Game Theory exercise).

... or maybe it stays the way it is, and that's just part of the price of participation.
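
A purely illustrative sketch of how #1 and #3 could be computed. Nothing here reflects how the SETI@home or BOINC servers actually pair hosts or grant credit. It assumes "within 100%" means the slower host's average turnaround is at most double the faster one's, and that the 10% per day is taken linearly from the base credit with a floor at zero. All names are hypothetical.

```python
def turnaround_compatible(days_a: float, days_b: float) -> bool:
    """Suggestion #1: pair hosts only if the slower one is at most 100% slower."""
    faster, slower = sorted((days_a, days_b))
    return slower <= 2 * faster


def delayed_credit(base_credit: float, wait_days: float,
                   penalty_per_day: float = 0.10) -> float:
    """Suggestion #3: dock the late wingman a fixed share of credit per day waited.

    The penalty is linear in wait_days and never pushes credit below zero.
    """
    remaining = max(0.0, 1.0 - penalty_per_day * wait_days)
    return base_credit * remaining


print(turnaround_compatible(1, 6))   # the 1-dayer / 6-dayer pairing from #1
print(turnaround_compatible(1, 2))
print(delayed_credit(100.0, 5))      # wingman reports 5 days after the first result
```

With a linear penalty, a wingman who is 10 or more days behind would get nothing, which is probably harsher than intended; a compounding or capped penalty would be a gentler variant.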

Cheers !
ID: 930431
jrusling
Joined: 8 Sep 02
Posts: 37
Credit: 4,764,889
RAC: 0
United States
Message 930435 - Posted: 2 Sep 2009, 18:55:46 UTC - in response to Message 930419.  

Looks like the splitters are back online and splitting. It is probably going to be a while before we can get caught back up though.

ID: 930435
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 930438 - Posted: 2 Sep 2009, 18:59:54 UTC - in response to Message 930431.  

2 newbie cents about being a Wingman Hostage: My machine (a NetBook w/Atom and 3G -- yes, mobile all day long) is lucky to crank out 1 WU a day, but despite that, it seems I'm finding myself lately waiting up to 6+ days for a Wingman (Anvilman) with some heaven sent dream machine to pipe in with their Concur so that my tiny Acct can be credited. And I agree, I too see Wingman queues with hundreds of In-Progress items, and WORSE, they're not always being tackled in DATE order (and sometimes even have in-progress for deadline expired items). I'm glad that what takes me 30,000 seconds only takes them 120, but I've no accurate idea of when that 120 might begin.

I know the queuing system is complex and evolved to optimize the complex needs of the project and the conflicting needs of participants. And having a 30+ year predictive scope makes SETI the wrong project for people gratified by real-time measurable progress, especially as a quorum is apparently required to keep the TI honest in their search for ETI, but perhaps the rules can be changed up a bit so those massive contributing heavy lifters (and I fully accept their contribution far outstrips mine thus they're entitled to a bigger slice of the pie -- even though others call this hogging or land grabbing) don't hold hostage us multitude of little guys (the larger community of public support which is equally beneficial to the project from an awareness & political perspective).

#1. Perhaps try and pair wing partners a bit closer on Avg turn around time. Don't pair a 1-dayer with 6-dayer. Keep it within 100%.

#2. Or perhaps change to CREDIT the work performed assumptive that the Concur will come thru (which I bet it does 99.998% of the time), and debit it later if that assumption proves false (ya know, just like a bank balance shows checks that have not yet cleared). You can govern this to historical daily WU's for that individual.

#3. Or perhaps slightly skew the points awarded to FAVOR the faster turn-around guy (still dependent on across the board results Concur). Maybe swipe 10% of the delaying partner's eventual credit for the WU for each day of wait time since initial result submission (gets a bit tricky if quorums of 3 or 4 are required -- but what a great Game Theory exercise).

... or maybe it stays the way it is, and that's just part of the price of participation.

Cheers !


I've always thought recent average credit was stupid myself. If I am trying to judge the performance of two machines, I will really only know if those two machines are performing completed tasks.

I think an option you would want would be something along the lines of "only send me tasks that need validation", which I imagine would be rather complex on the server end & doesn't really benefit the project in any way.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 930438
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.