Panic Mode On (86) Server Problems?

Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1480820 - Posted: 23 Feb 2014, 2:00:07 UTC

Work for AP is still a no go.
Very frustrating to see that channels are being done but no work is being sent.

So much for this run

Michael
ID: 1480820
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1480835 - Posted: 23 Feb 2014, 3:42:24 UTC

13 AP + 13 MB channels in progress. Oops, we still have a problem.
ID: 1480835
Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1480837 - Posted: 23 Feb 2014, 4:08:15 UTC - in response to Message 1480835.  
Last modified: 23 Feb 2014, 4:10:39 UTC

Every channel is ending in error

84 AP so far
Nuts
ID: 1480837
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 1480839 - Posted: 23 Feb 2014, 4:42:36 UTC - in response to Message 1480837.  

Every channel is ending in error

84 AP so far
Nuts



+1
Dave

ID: 1480839
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 1480844 - Posted: 23 Feb 2014, 5:36:19 UTC - in response to Message 1480837.  

Every channel is ending in error

84 AP so far
Nuts



Now over 100
Dave

ID: 1480844
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1480848 - Posted: 23 Feb 2014, 6:25:29 UTC - in response to Message 1480790.  
Last modified: 23 Feb 2014, 6:28:15 UTC

Watching ssp for a while, it appears that all AP channels are ending in error.

If wu cannot be sent because of db off line, why split to error?

AFAIK,
the AstroPulse database can be offline while the splitters are running.
The database just saves the results from the members' PCs.
The AstroPulse channel errors come from reception errors at the telescope (interference).
ID: 1480848
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1480850 - Posted: 23 Feb 2014, 6:35:26 UTC - in response to Message 1480808.  

The query to the database was changed to order the results by sent time rather than ID. That will keep all tasks assigned for a single request together, but how MySQL should order them within a group sent at the same time is not specified. If it turns out to be a problem I expect Dr. Anderson would add an additional term to the query.
                                                                   Joe
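
As a rough illustration of the "additional term" Joe mentions, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The table name `result`, its columns, and the helper `fetch_results` are hypothetical and are not the actual BOINC schema or query. The point is only that rows sharing the same sent time have no guaranteed order unless a second ORDER BY term (here the ID) breaks the tie.

```python
import sqlite3


def fetch_results(conn, tie_break=True):
    """Return (id, name, sent_time) rows ordered by sent time.

    With tie_break=True, the ID is added as a second sort key so that
    rows sent at the same moment always come back in the same order.
    """
    order = "sent_time, id" if tie_break else "sent_time"
    query = f"SELECT id, name, sent_time FROM result ORDER BY {order}"
    return conn.execute(query).fetchall()


# Hypothetical data: three tasks sent in one request (same timestamp),
# inserted out of ID order, plus one older task.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result (id INTEGER PRIMARY KEY, name TEXT, sent_time INTEGER)"
)
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [(30, "wu_c", 1000), (10, "wu_a", 1000), (20, "wu_b", 1000), (5, "wu_old", 900)],
)

rows = fetch_results(conn)
ordered_ids = [row[0] for row in rows]
print(ordered_ids)  # [5, 10, 20, 30]
```

Without the second term, the three tasks sent at time 1000 could legitimately come back in any order, which is the unspecified behaviour described above.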

That change was made after a discussion that I started with one or two others with an interest in task results. The divergence between a list of results sorted by ID and a list sorted by date is much greater at other BOINC projects - Einstein was the example in question.

It would be ideal (IMHO) if we could choose our own display order - name, ID, date, etc. - independently of the format of the display (name or ID). It's the sort of thing which BOINCstats makes look very easy.

David indicated that he wouldn't have time to make more complicated changes himself, but would be happy to accept code contributions. Any PHP/MySQL coders out there who could make some queries and page controls?


That's a disturbing situation, indicating that California might have slipped below the internationally accepted extreme poverty line of one US dollar (US$1) per day. "Please send patches, hungry"
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1480850
Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1481053 - Posted: 23 Feb 2014, 21:18:27 UTC
Last modified: 23 Feb 2014, 21:23:19 UTC

I'm not sure why, but I did get 1 AP task at 2:00 am and 2 AP tasks at 9:30 am
BC Canada time

The Astropulse Science DB was still disabled at the time of the downloads.

Thank god I got 120 tasks Friday.
As it is I can crunch AP for 2 days max then totally dry. About 40 hours
Right now I have 74 total and 29 of those are 601 CPU tasks

Please turn on the Science database so we can get our cache filled if that is the only problem.
Are the tapes that were loaded any good?
Seems like an awful waste of data


Michael Miles
ID: 1481053
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1481055 - Posted: 23 Feb 2014, 21:22:22 UTC - in response to Message 1481053.  

I'm not sure why, but I did get 1 AP task at 2:00 am and 2 AP tasks at 9:30 am
BC Canada time

The Astropulse Science DB was still disabled at the time of the downloads.

Thank god I got 120 tasks Friday.
As it is I can crunch AP for 2 days max then totally dry. About 40 hours
Right now I have 74 total and 29 of those are 601 CPU tasks

Please turn on the Science database so we can get our cache filled if that is the only problem.
Are the tapes that were loaded any good?
Or are they just being allowed to split to error on purpose? Seems like an awful waste of data if that is the case.


Michael Miles

Resends most likely I guess.

Cheers.
ID: 1481055
Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1481058 - Posted: 23 Feb 2014, 21:24:06 UTC - in response to Message 1481055.  

That is what I guess too. It was exciting though. LOL
ID: 1481058
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1481061 - Posted: 23 Feb 2014, 21:52:04 UTC - in response to Message 1481053.  

Please turn on the Science database so we can get our cache filled if that is the only problem.

That the science database is down doesn't really matter to us crunchers, it just means there is a backlog of completed AP Workunits waiting for assimilation, and it'll grow as more results are validated.

Claggy
ID: 1481061
Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1481066 - Posted: 23 Feb 2014, 22:11:46 UTC - in response to Message 1481061.  

That the science database is down doesn't really matter to us crunchers, it just means there is a backlog of completed AP Workunits waiting for assimilation, and it'll grow as more results are validated.

Claggy

Just curious as to why no work is reaching us when so many are being split and channels ending in error?
ID: 1481066
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1481087 - Posted: 23 Feb 2014, 23:17:39 UTC

Something going on with the SSP. Has someone entered the building?
Progress                 Multibeam   Astropulse
total channels to do:    462         442
ID: 1481087
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1481088 - Posted: 23 Feb 2014, 23:21:42 UTC - in response to Message 1481066.  

Just curious as to why no work is reaching us when so many are being split and channels ending in error?

The majority of the work going out is going to be Seti v7, that's what's got the high result creation rate.

Claggy
ID: 1481088
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1481143 - Posted: 24 Feb 2014, 6:25:25 UTC - in response to Message 1481088.  

For whatever reason, splitters still struggling to meet demand, let alone build up a ready-to-send buffer.
Grant
Darwin NT
ID: 1481143
S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 43,822,095
RAC: 0
Netherlands
Message 1481152 - Posted: 24 Feb 2014, 6:52:57 UTC

Still limbo in server land, I see this morning: I actually got 2 tasks in the last 48 hours...
ID: 1481152
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1481164 - Posted: 24 Feb 2014, 8:45:31 UTC

The SSP indicates the MB splitters have 13 channels in progress (creation rate 25/sec), and so do the AP splitters (creation rate close to 0). I have never seen anything like that before (normally only a few MB channels, maybe 6 or 7, and a few AP channels are in progress). Could this indicate a problem? It started after the last JBOD crash.
ID: 1481164
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1481166 - Posted: 24 Feb 2014, 9:00:40 UTC - in response to Message 1481164.  

The SSP indicates the MB splitters have 13 channels in progress (creation rate 25/sec), and so do the AP splitters (creation rate close to 0). I have never seen anything like that before (normally only a few MB channels, maybe 6 or 7, and a few AP channels are in progress). Could this indicate a problem? It started after the last JBOD crash.

I'd take those "13 channels in progress" with a pinch of salt.

My working guess (but without actual knowledge) is that when there is a crash/restart, the splitters don't necessarily go back to working on the same tapes that were 'in progress' before the crash. So, the new tapes are genuinely 'running', but the old tapes are maybe 'waiting to run', as we see in our BOINC Managers for a task which has been started but then paused for some reason. Unfortunately, the SSP doesn't give us that fine level of detail, and it doesn't change quickly enough to make it obvious which tapes are moving and which are waiting.
ID: 1481166
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1481168 - Posted: 24 Feb 2014, 9:05:33 UTC - in response to Message 1481166.  

Let's see if your guess is right and what we get in the next few days. Thanks.

Could an AP tape be stuck? The AP creation rate is close to zero.
ID: 1481168
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1481169 - Posted: 24 Feb 2014, 9:13:27 UTC - in response to Message 1481168.  

Let's see if your guess is right and what we get in the next few days. Thanks.

Could an AP tape be stuck? The AP creation rate is close to zero.

No AP splitters are running (likely due to the AP database being disabled) and what you see in the creation rate for AP's is just resends going out.

Cheers.
ID: 1481169