Panic Mode On (89) Server Problems?

kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1562771 - Posted: 27 Aug 2014, 0:49:07 UTC - in response to Message 1562754.  

Well, it looks like those files at the bottom of the to-be-split list, which have been there for 2-3 months now, will be there even longer, seeing as new files are already being added over the top of them again (those new files could've waited until tomorrow). The 1st new file has 3 AP splitters tied up again, and that file only seems to be producing errors as well. :-(

Cheers.

Given the choice I would take error channels over hung/stuck ones.

Not a good day for AP, it appears.
Looks like the second dataset being split is producing errors as well.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1562771 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1562819 - Posted: 27 Aug 2014, 2:45:42 UTC

Crap....
A third and fourth AP dataset have launched, and both are showing errors as well.
And they are not from the same timeframe.
I hope the WUs that are going out are not corrupted.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1562819 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1563043 - Posted: 27 Aug 2014, 13:13:59 UTC
Last modified: 27 Aug 2014, 13:18:22 UTC

Did most of the channels for the 8 data sets have mostly errors? It looks like they stopped loading more data to be split for now.
Let's hope they noticed the errors and are looking into it, and that the MB channel splitters don't get hung up on those data sets.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1563043 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1563045 - Posted: 27 Aug 2014, 13:16:11 UTC - in response to Message 1563043.  

Did most of the channels for the 8 data sets have mostly errors? It looks like they stopped loading more data to be split for now.
Let's hope they noticed the errors and are looking into it, and that the MB channel splitters don't get hung up on those data sets.

I didn't see any without errors, but then I may not have seen all that were loaded today.

Cheers.
ID: 1563045 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1563055 - Posted: 27 Aug 2014, 13:36:50 UTC
Last modified: 27 Aug 2014, 13:37:39 UTC

Please forgive my "stupid question", but I'm very curious.

If splitting is a simple process of reading a large amount of data (a 50 GB tape), dividing it, and writing it out in smaller parts (8 MB in the case of AP), then how is it possible to get a "splitting error", since all you do is read the data and rewrite it in another place?

Obviously setting aside hardware errors (a disk media error, for example). Are you suggesting the error is in the stream of data, not the media, or when you talk about splitter errors are you really talking about a disk media error?
ID: 1563055 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1563064 - Posted: 27 Aug 2014, 13:51:03 UTC - in response to Message 1563055.  

Please forgive my "stupid question", but I'm very curious.

If splitting is a simple process of reading a large amount of data (a 50 GB tape), dividing it, and writing it out in smaller parts (8 MB in the case of AP), then how is it possible to get a "splitting error", since all you do is read the data and rewrite it in another place?

Obviously setting aside hardware errors (a disk media error, for example). Are you suggesting the error is in the stream of data, not the media, or when you talk about splitter errors are you really talking about a disk media error?

The server status page marks a channel with a gray block when there is an error with that channel, instead of the bright green block for completed. It may not mean that all data in that channel had errors.
We don't know what error occurred, but I can imagine several types:
- Splitter process has a memory issue
- Data read error
- Data write error
- Corrupt data in data set
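
For anyone following along, here is a rough sketch of how those four failure types could each show up in a chunked splitter. This is not the actual SETI@home splitter code, which we can't see. `split_stream`, `SplitError`, and the `looks_corrupt` check are made-up names for illustration, and the 8 MB chunk size is just the AP figure quoted in the question.

```python
import io

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MB, the AP part size quoted in the question


class SplitError(Exception):
    """Carries a label saying which stage of splitting failed."""

    def __init__(self, kind, detail):
        super().__init__(f"{kind}: {detail}")
        self.kind = kind


def looks_corrupt(chunk):
    # Hypothetical sanity check (an assumption): treat an all-zero chunk as bad data.
    return len(chunk) > 0 and chunk.count(0) == len(chunk)


def split_stream(source, make_sink, chunk_size=CHUNK_SIZE):
    """Read `source` in fixed-size chunks and write each chunk to its own sink.

    `make_sink(n)` must return a writable context manager for chunk number n.
    Returns the number of chunks written, or raises SplitError labelled
    "memory", "read", "write", or "corrupt-data".
    """
    count = 0
    while True:
        try:
            chunk = source.read(chunk_size)
        except MemoryError as exc:
            raise SplitError("memory", exc) from exc
        except OSError as exc:
            raise SplitError("read", exc) from exc
        if not chunk:
            return count  # end of input
        if looks_corrupt(chunk):
            raise SplitError("corrupt-data", f"chunk {count} failed the sanity check")
        try:
            with make_sink(count) as sink:
                sink.write(chunk)
        except OSError as exc:
            raise SplitError("write", exc) from exc
        count += 1
```

So even a plain "read it and write it somewhere else" loop has at least four places to fail, and only one of them is about the contents of the data itself.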
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1563064 · Report as offensive
Profile bill
Joined: 27 Apr 12
Posts: 171
Credit: 2,167,701
RAC: 0
United Kingdom
Message 1563080 - Posted: 27 Aug 2014, 14:28:24 UTC

Can SETI not send out the data without splitting, or would 50 GB be too much for a host's PC?
ID: 1563080 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1563089 - Posted: 27 Aug 2014, 14:47:31 UTC - in response to Message 1563080.  

Can SETI not send out the data without splitting, or would 50 GB be too much for a host's PC?

Besides a lot of other problems, even if you could download 50 GB of data (very few have the capacity to do that safely, and it could take days), processing this huge amount of data could take about a year even on high-end GPUs. So it is not practical.
ID: 1563089 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1563091 - Posted: 27 Aug 2014, 14:48:35 UTC - in response to Message 1563080.  
Last modified: 27 Aug 2014, 14:52:57 UTC

Can SETI not send out the data without splitting, or would 50 GB be too much for a host's PC?

With a 1.5 Mb/s connection it would take about three days to download that much data. With an even slower upload rate it would take much, much longer to upload the results.
There are some people who still use dial-up as well.

Could they? Sure.
Is it practical? No.
Would it limit the number of people that could participate? Yes.

Never mind the fact that a new app would have to be written, or whether a normal desktop could even handle that much work to chew on.

One 50GB data set generates something like 64,000 MB & 6,250 AP tasks.
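
A quick back-of-the-envelope check of those figures, assuming decimal units (1 GB = 10^9 bytes, 1 Mb/s = 10^6 bits per second) and the 8 MB AP size mentioned earlier in the thread. The function names are only for illustration.

```python
DATASET_BYTES = 50e9        # one 50 GB data set
LINK_BITS_PER_SEC = 1.5e6   # a 1.5 Mb/s download link
AP_TASK_BYTES = 8e6         # 8 MB per AP task


def download_hours(size_bytes, bits_per_sec):
    """Hours needed to move size_bytes over a link of the given speed."""
    return size_bytes * 8 / bits_per_sec / 3600


def task_count(size_bytes, task_bytes):
    """How many whole tasks of task_bytes fit into size_bytes."""
    return int(size_bytes // task_bytes)


hours = download_hours(DATASET_BYTES, LINK_BITS_PER_SEC)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days")   # 74 hours, about 3.1 days
print(task_count(DATASET_BYTES, AP_TASK_BYTES), "AP tasks")  # 6250 AP tasks
```

The 6,250 AP figure falls straight out of 50 GB divided by 8 MB, and the download alone is on the order of three days before any crunching starts.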
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1563091 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1563101 - Posted: 27 Aug 2014, 15:03:41 UTC - in response to Message 1563091.  

One 50GB data set generates something like 64,000 MB & 6,250 AP tasks.


I think I just heard one of my computers salivate, if that is EVEN possible...lol
ID: 1563101 · Report as offensive
Profile bill
Joined: 27 Apr 12
Posts: 171
Credit: 2,167,701
RAC: 0
United Kingdom
Message 1563103 - Posted: 27 Aug 2014, 15:12:17 UTC

Yes, I now see the problems with a big download; dial-up would be on forever. I am learning, thank you a lot. I first used a PC a few years ago, just before I came to SETI.
ID: 1563103 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1563124 - Posted: 27 Aug 2014, 16:24:28 UTC - in response to Message 1563043.  

Did most of the channels for the 8 data sets have mostly errors? It looks like they stopped loading more data to be split for now.
Let's hope they noticed the errors and are looking into it, and that the MB channel splitters don't get hung up on those data sets.

And all the APs have gone out. Big problems with AP splitting? Only about 30k units were split...
ID: 1563124 · Report as offensive
Profile Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1563179 - Posted: 27 Aug 2014, 17:58:02 UTC

Still have some APs left, I'm good :)
rOZZ
Music
Pictures
ID: 1563179 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1563195 - Posted: 27 Aug 2014, 18:25:55 UTC

I'd like to point out one problem with AstroPulse units in SETI@home:

How is it still possible for ONE computer to have almost 10% of the total AstroPulse units in progress?

SSP as of 27 Aug 2014, 18:20:09 UTC: Results out in the field 120,599
As of 27 Aug 2014, 18:24:00 UTC: In progress AstroPulse v6 tasks for computer 6016862 In progress (1189)

I hope you get my point, and that somebody on the S@H staff does too.
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1563195 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1563204 - Posted: 27 Aug 2014, 18:33:23 UTC - in response to Message 1563195.  
Last modified: 27 Aug 2014, 18:34:02 UTC

I'd like to point out one problem with AstroPulse units in SETI@home:

How is it still possible for ONE computer to have almost 10% of the total AstroPulse units in progress?

SSP as of 27 Aug 2014, 18:20:09 UTC: Results out in the field 120,599
As of 27 Aug 2014, 18:24:00 UTC: In progress AstroPulse v6 tasks for computer 6016862 In progress (1189)

I hope you get my point, and that somebody on the S@H staff does too.

1189/120599 = 0.009859, or about 0.99%
12,000 would be 10% of 120,000
Some older versions of BOINC don't work with the limit system, but I don't think their version is one of them. There is also rescheduling to hoard more work, which was discussed some time ago.
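
The same arithmetic as a few lines of Python, using only the two numbers quoted from the server status page and the host's task list. The variable names are just for illustration.

```python
in_progress_on_host = 1189   # AstroPulse v6 tasks in progress on computer 6016862
results_in_field = 120599    # "Results out in the field" from the server status page

share = in_progress_on_host / results_in_field
ten_percent = results_in_field * 0.10

print(f"{share:.2%}")                                     # 0.99%
print(f"{ten_percent:,.0f} tasks would be a 10% share")   # 12,060 tasks would be a 10% share
```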
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1563204 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1563213 - Posted: 27 Aug 2014, 18:39:33 UTC - in response to Message 1563204.  
Last modified: 27 Aug 2014, 18:45:18 UTC

I'd like to point out one problem with AstroPulse units in SETI@home:

How is it still possible for ONE computer to have almost 10% of the total AstroPulse units in progress?

SSP as of 27 Aug 2014, 18:20:09 UTC: Results out in the field 120,599
As of 27 Aug 2014, 18:24:00 UTC: In progress AstroPulse v6 tasks for computer 6016862 In progress (1189)

I hope you get my point, and that somebody on the S@H staff does too.

1189/120599 = 0.009859, or about 0.99%
12,000 would be 10% of 120,000
Some older versions of BOINC don't work with the limit system, but I don't think their version is one of them. There is also rescheduling to hoard more work, which was discussed some time ago.

I'm on 6.10.58 on all my rigs, that rig is on 6.10.60...
My rigs use the limits. I suspect rescheduling.

And I believe that a small bit of code tweaking could probably put an end to that nonsense if it looked at the total amount of work a computer has downloaded and compared that to what it is entitled to.

It would appear that right now, the servers only look at it on a per-device basis, and if any device is not up to the 100 limit, they send some more. They do not take into account, apparently, the fact that another device on the host has way more work than the limits should allow it to have received. The host in question should have no more than 500 WUs at a time.
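
To make the difference concrete, here is a small sketch of the two checks as described in this post. It is only a guess at the behaviour from the outside, not the actual BOINC scheduler code. The function names and the way the 1189 tasks are spread across five devices are made up for illustration; five devices is simply what the 500 figure implies at 100 per device.

```python
PER_DEVICE_LIMIT = 100  # the per-device limit mentioned in this post


def wants_more_per_device(in_progress_by_device, limit=PER_DEVICE_LIMIT):
    """Per-device rule: send work if any single device is under its own limit."""
    return any(n < limit for n in in_progress_by_device.values())


def wants_more_host_total(in_progress_by_device, limit=PER_DEVICE_LIMIT):
    """Host-total rule: send work only if the whole host is under devices * limit."""
    cap = limit * len(in_progress_by_device)
    return sum(in_progress_by_device.values()) < cap


# Hypothetical breakdown: rescheduling has piled most of the work onto one device.
host = {"cpu": 1100, "gpu0": 40, "gpu1": 20, "gpu2": 19, "gpu3": 10}

print(sum(host.values()))            # 1189
print(wants_more_per_device(host))   # True: the GPUs look starved, so more work is sent
print(wants_more_host_total(host))   # False: 1189 is already over the 500 cap
```

Under the first rule a host can keep topping up for as long as work is moved off the devices being counted; the second rule stops that as soon as the host total passes the cap.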
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1563213 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1563221 - Posted: 27 Aug 2014, 18:44:49 UTC - in response to Message 1563204.  

I'd like to point out one problem with AstroPulse units in SETI@home:

How is it still possible for ONE computer to have almost 10% of the total AstroPulse units in progress?

SSP as of 27 Aug 2014, 18:20:09 UTC: Results out in the field 120,599
As of 27 Aug 2014, 18:24:00 UTC: In progress AstroPulse v6 tasks for computer 6016862 In progress (1189)

I hope you get my point, and that somebody on the S@H staff does too.

1189/120599 = 0.009859, or about 0.99%
12,000 would be 10% of 120,000
Some older versions of BOINC don't work with the limit system, but I don't think their version is one of them. There is also rescheduling to hoard more work, which was discussed some time ago.


My mistake in the calculations, I am sorry. (Note to self: do not do calculations with a high fever.)

Anyway, this computer has been mentioned several times in rescheduling threads.

Time to go to bed, sorry again for this fever "rage" message...
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1563221 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1563228 - Posted: 27 Aug 2014, 18:51:27 UTC - in response to Message 1563221.  

I'd like to point out one problem with AstroPulse units in SETI@home:

How is it still possible for ONE computer to have almost 10% of the total AstroPulse units in progress?

SSP as of 27 Aug 2014, 18:20:09 UTC: Results out in the field 120,599
As of 27 Aug 2014, 18:24:00 UTC: In progress AstroPulse v6 tasks for computer 6016862 In progress (1189)

I hope you get my point, and that somebody on the S@H staff does too.

1189/120599 = 0.009859, or about 0.99%
12,000 would be 10% of 120,000
Some older versions of BOINC don't work with the limit system, but I don't think their version is one of them. There is also rescheduling to hoard more work, which was discussed some time ago.


My mistake in the calculations, I am sorry. (Note to self: do not do calculations with a high fever.)

Anyway, this computer has been mentioned several times in rescheduling threads.

Time to go to bed, sorry again for this fever "rage" message...

I always suggest plenty of alcoholic drinks to kill the bugs that cause fever.

I'm not sure the SETI@home staff has the free time to verify that people are rescheduling to hoard tasks, and then to create a method to stop it from occurring.
The few people who wish to do this will spend much more effort working around it than the people running the project can spare to keep up.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1563228 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1563236 - Posted: 27 Aug 2014, 18:59:42 UTC - in response to Message 1563228.  



I'm not sure the SETI@home staff has the free time to verify that people are rescheduling to hoard tasks, and then to create a method to stop it from occurring.
The few people who wish to do this will spend much more effort working around it than the people running the project can spare to keep up.

I proposed what I think might be a possible solution in my post below.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1563236 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1563262 - Posted: 27 Aug 2014, 19:32:21 UTC - in response to Message 1563236.  



I'm not sure the SETI@home staff has the free time to verify that people are rescheduling to hoard tasks, and then to create a method to stop it from occurring.
The few people who wish to do this will spend much more effort working around it than the people running the project can spare to keep up.

I proposed what I think might be a possible solution in my post below.

They have the ability to set limits per host or per device. Per host is a fixed value instead of one that varies with hardware.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1563262 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.