The Snap App hung again. We will be down for the night.

STE\/E
Volunteer tester

Send message
Joined: 29 Mar 03
Posts: 1137
Credit: 5,334,063
RAC: 0
United States
Message 21680 - Posted: 3 Sep 2004, 2:06:31 UTC

D'oh...The Snap Appliance went Snap Crackle & POP once again...Is anybody really surprised by this in the least bit... :/
ID: 21680 · Report as offensive
Petit Soleil
Avatar

Send message
Joined: 17 Feb 03
Posts: 1497
Credit: 70,934
RAC: 0
Canada
Message 21683 - Posted: 3 Sep 2004, 2:12:43 UTC
Last modified: 3 Sep 2004, 2:13:02 UTC

What is actually stored on the SNAP?
The WUs, our profiles, credits, everything?
What would be lost if this thing went
completely dead? Thanks
ID: 21683 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 21684 - Posted: 3 Sep 2004, 2:24:31 UTC - in response to Message 21680.  

I'm guessing the folks at SNAP are a bit unhappy. It's their product.

... and since this is a great showcase for their high-performance storage device, I'm sure it impresses the hell out of those of us who need high performance storage.

> D'oh...The Snap Appliance went Snap Crackle & POP once again...Is anybody
> really surprised by this in the least bit... :/

ID: 21684 · Report as offensive
Redshift
Avatar

Send message
Joined: 3 Apr 99
Posts: 122
Credit: 1,244,536
RAC: 0
United States
Message 21688 - Posted: 3 Sep 2004, 2:31:42 UTC - in response to Message 21684.  

ID: 21688 · Report as offensive
STE\/E
Volunteer tester

Send message
Joined: 29 Mar 03
Posts: 1137
Credit: 5,334,063
RAC: 0
United States
Message 21689 - Posted: 3 Sep 2004, 2:35:01 UTC
Last modified: 3 Sep 2004, 2:35:33 UTC

What is actually stored on the SNAP?
The WUs, our profiles, credits, everything?
What would be lost if this thing went
completely dead? Thanks
==========

Well I don't know what is really stored in the Snap Appliance but don't be surprised if one of the next Messages is "Ummmmm, Errrrrr, Well we are Sorry for the Inconvenience Folks but we seem to have misplaced some more WU's & will need to send them back out" Really... :/



ID: 21689 · Report as offensive
Profile Atangel

Send message
Joined: 14 May 99
Posts: 61
Credit: 1,024,161
RAC: 0
United States
Message 21691 - Posted: 3 Sep 2004, 2:37:20 UTC - in response to Message 21689.  

> What is actually stored on the SNAP?
> The WUs, our profiles, credits, everything?
> What would be lost if this thing went
> completely dead? Thanks
> ==========
>
> Well I don't know what is really stored in the Snap Appliance but don't be
> surprised if one of the next Messages is "Ummmmm, Errrrrr, Well we are Sorry
> for the Inconvenience Folks but we seem to have misplaced some more WU's &
> will need to send them back out" Really... :/
>

Might be time to lower my SETI resource share... make the WUs last....
ID: 21691 · Report as offensive
STE\/E
Volunteer tester

Send message
Joined: 29 Mar 03
Posts: 1137
Credit: 5,334,063
RAC: 0
United States
Message 21693 - Posted: 3 Sep 2004, 2:41:49 UTC

Might be time to lower my SETI resource share... make the WUs last....
=========

If something really serious happened and they lost data, it won't do you any good to do that, because they may not have the data anymore and may have to send the WU's out again...I don't know what happened because, as usual, we are kept so well informed by that message...I'm just painting a worst-case scenario.

ID: 21693 · Report as offensive
EclipseHA

Send message
Joined: 28 Jul 99
Posts: 1018
Credit: 530,719
RAC: 0
United States
Message 21694 - Posted: 3 Sep 2004, 2:42:29 UTC - in response to Message 21680.  
Last modified: 3 Sep 2004, 2:51:20 UTC

> D'oh...The Snap Appliance went Snap Crackle & POP once again...Is anybody
> really surprised by this in the least bit... :/
>
>

I wonder if SNAP (Adaptec) understands how much "press" they are getting here! Might be worth sending someone up the bay to Berkeley to fight problems first hand. People can be much more responsive if "you can go home when it works" is included.

I'm not being negative, but proposing a solution. Heck, I did troubleshooting for stuff like this with Honeywell and IBM, and when things got flaky, being onsite at a "visible site" could solve a problem in hours that could take days over the phone....


(SNAP is in SJ, right? And UCB is just the other end of the metro area, right?)
ID: 21694 · Report as offensive
Petit Soleil
Avatar

Send message
Joined: 17 Feb 03
Posts: 1497
Credit: 70,934
RAC: 0
Canada
Message 21695 - Posted: 3 Sep 2004, 2:44:36 UTC

I don't know what to say. I have posted a long and positive thinking
message in this thread so I can't start complaining again.

http://setiweb.ssl.berkeley.edu/sah/forum_thread.php?id=3489

Although it's tempting... No, I feel sorry for them. They must be
really pissed now. I just hope it won't result in data loss.
Fingers crossed!
ID: 21695 · Report as offensive
EclipseHA

Send message
Joined: 28 Jul 99
Posts: 1018
Credit: 530,719
RAC: 0
United States
Message 21697 - Posted: 3 Sep 2004, 2:58:31 UTC
Last modified: 3 Sep 2004, 2:59:57 UTC

One thing I find very weird is that in the time that Seti has been down, my linux client has not been "pre-empting" but only running CP. It's been about 4 hours with CP only, and the last I heard from Seti was an attempt to contact the seti scheduler. I have seti WU's in the queue, but they have not been touched since the "RTR" had an error.... Prior to the seti schedulers going down, things were timeslicing quite nicely!

I have Seti at 100, and CP at 50, so 2/3 of the time should be seti, if I understand the timeslicing logic....
ID: 21697 · Report as offensive
STE\/E
Volunteer tester

Send message
Joined: 29 Mar 03
Posts: 1137
Credit: 5,334,063
RAC: 0
United States
Message 21698 - Posted: 3 Sep 2004, 3:03:59 UTC
Last modified: 3 Sep 2004, 3:09:49 UTC

I have Seti at 100, and CP at 50, so 2/3 of the time should be seti, if I understand the timeslicing logic....
==========

I believe the share will be 50/50 with those settings Woody but I could be wrong...The 50 at CPDN will override the 100 at Seti and make it a 50/50 deal...If you look at your Projects Tab In the BOINC GUI it should tell you what they are running at...But with the server down it might not be correct because you can't update it...
ID: 21698 · Report as offensive
EclipseHA

Send message
Joined: 28 Jul 99
Posts: 1018
Credit: 530,719
RAC: 0
United States
Message 21700 - Posted: 3 Sep 2004, 3:15:44 UTC - in response to Message 21698.  

> I have Seti at 100, and CP at 50, so 2/3 of the time should be seti, if I
> understand the timeslicing logic....
> ==========
>
> I believe the share will be 50/50 with those settings Woody but I could be
> wrong...The 50 at CPDN will override the 100 at Seti and make it a 50/50
> deal...If you look at your Projects Tab In the BOINC GUI it should tell you
> what they are running at...

And Poorboy, this is on Linux, so there is no gui to look at! (It's all CLI!)

I'm only mixing projects on Linux boxes!
ID: 21700 · Report as offensive
Ingleside
Volunteer developer

Send message
Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 21701 - Posted: 3 Sep 2004, 3:18:06 UTC - in response to Message 21698.  

>
> I believe the share will be 50/50 with those settings Woody but I could be
> wrong...The 50 at CPDN will override the 100 at Seti and make it a 50/50
> deal...If you look at your Projects Tab In the BOINC GUI it should tell you
> what they are running at...But with the server down it might not be correct
> because you can't update it...
>
>

No, 50 CPDN & 100 seti gives:
CPDN: 50 / (50 + 100) = 1/3
seti: 100 / (50 + 100) = 2/3
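
The same arithmetic as a minimal Python sketch. The function name `share_fractions` is made up for illustration and is not part of BOINC; it only assumes that each project's fraction is its resource share divided by the sum of all shares, as in the two lines above.

```python
from fractions import Fraction


def share_fractions(shares):
    """Map each project name to its fraction of the total resource share.

    `shares` is a dict of project name -> integer resource share.
    This is an illustration of the arithmetic only, not BOINC code.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total resource share must be positive")
    return {name: Fraction(value, total) for name, value in shares.items()}


result = share_fractions({"CPDN": 50, "seti": 100})
for name, frac in result.items():
    print(f"{name}: {frac}")
```

With any other mix of projects the same rule applies: add up all the shares and divide each one by that total.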

No idea why seti has been pre-empted for some hours, but the client seems to make some "interesting" choices sometimes...

As for the Snap, the last time something was posted about it, it contained the upload & download directories. So most likely the "hang" only needs someone to hit the reset switch...
ID: 21701 · Report as offensive
STE\/E
Volunteer tester

Send message
Joined: 29 Mar 03
Posts: 1137
Credit: 5,334,063
RAC: 0
United States
Message 21702 - Posted: 3 Sep 2004, 3:18:14 UTC

Eeerrrmmmmm...ok

ID: 21702 · Report as offensive
EclipseHA

Send message
Joined: 28 Jul 99
Posts: 1018
Credit: 530,719
RAC: 0
United States
Message 21703 - Posted: 3 Sep 2004, 3:30:35 UTC - in response to Message 21701.  

> >
> > I believe the share will be 50/50 with those settings Woody but I could be
> > wrong...The 50 at CPDN will override the 100 at Seti and make it a 50/50
> > deal...If you look at your Projects Tab In the BOINC GUI it should tell you
> > what they are running at...But with the server down it might not be correct
> > because you can't update it...
> >
> >
>
> No, 50 CPDN & 100 seti gives:
> CPDN: 50 / (50 + 100) = 1/3
> seti: 100 / (50 + 100) = 2/3
>
> No idea why seti has been pre-empted for some hours, but the client seems to
> make some "interesting" choices sometimes...
>
> As for the Snap, the last time something was posted about it, it contained the
> upload & download directories. So most likely the "hang" only needs someone
> to hit the reset switch...
>

Even if the 50/50 was correct, it should be switching (preempting), as it did before the schedulers went down, but it's not! No seti has run for 6 hours now! It seems a failure in "RTR", on Linux at least, means no preemption for that project!


ID: 21703 · Report as offensive
Profile Qui-Gon
Volunteer tester
Avatar

Send message
Joined: 15 May 99
Posts: 2940
Credit: 19,199,902
RAC: 11
United States
Message 21704 - Posted: 3 Sep 2004, 3:34:05 UTC

I know this is off subject, and you guys will think I'm a big dork, but I enjoy the tone of this thread. Complaints, but no venom. (And besides, I'm learning something.)
ID: 21704 · Report as offensive
grumpy

Send message
Joined: 2 Jun 99
Posts: 209
Credit: 152,987
RAC: 0
Canada
Message 21708 - Posted: 3 Sep 2004, 3:48:33 UTC

I have LHC and seti projects.
Seti has not run for about 6 hours here either. Could be because their server is down.

this is from the boinc web site.

"Where to edit your preferences
Your preferences are stored on BOINC servers. When your hosts communicate with a server they get the latest preferences, and they pass along these preferences to other servers. Thus, when you change your preferences on one project's web site, these changes will quickly spread to all your hosts, and to the web sites of all the other projects in which you participate.

If you change your preferences first at one project and then at another, the second changes will overwrite the first. To avoid this, do all your edits at one project.

Some projects may provide a web interface for editing their project-specific preferences. In this case it may be necessary to edit preferences at different sites. To avoid overwriting edits, wait until previous edits have propagated to a site before editing preferences there."

So my guess is that if the preferences dump (inbound or outbound) has not occurred at the seti server, the
time split won't work. It's all connected.



ID: 21708 · Report as offensive
EclipseHA

Send message
Joined: 28 Jul 99
Posts: 1018
Credit: 530,719
RAC: 0
United States
Message 21712 - Posted: 3 Sep 2004, 3:59:20 UTC - in response to Message 21708.  

> I have LHC and seti projects.
> Seti has not run for about 6 hours also. Could be because their server is
> down.


But, isn't part of the idea of boinc that they can run if the project is down? They shouldn't be stalled by the server(s) unless there is nothing to do!

While Seti is down (right now), shouldn't my Seti WU's be crunched per the preferences that my computer had prior to seti going down? Maybe that's the problem... My seti-specific prefs on my local machine were lost as the seti scheduler wasn't responding, and therefore, all the time goes to CP!

(kind of like where credits get set to 0 during the time that the scheduler has no work....)
ID: 21712 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 21713 - Posted: 3 Sep 2004, 4:05:15 UTC - in response to Message 21712.  

> > I have LHC and seti projects.
> > Seti has not run for about 6 hours also. Could be because their server is
> > down.
>
>
> But, isn't part of the idea of boinc that they can run if the project is down?
> They shouldn't be stalled by the server(s) unless there is nothing to do!
>
> While Seti is down (right now), shouldn't my Seti WU's be crunched per the
> preferences that my computer had prior to seti going down? Maybe that's the
> problem... My seti-specific prefs on my local machine were lost as the seti
> scheduler wasn't responding, and therefore, all the time goes to CP!
>
> (kind of like where credits get set to 0 during the time that the scheduler
> has no work....)
>
I have a feeling I know what the bug is. Could you check your client_state.xml to see what the debt for S@H is? If it is 0, stop BOINC, change the value to something else, and restart BOINC. I believe that what may be happening is that BOINC attempts to attach to S@H, finds the project down, and does not check to see if there is a WU running, and turns off crunching for that project. A few more data points to see if this is the problem would be nice.
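
For anyone who would rather not read the file by eye, here is a rough Python sketch that lists the debt per project. The element names `<project>`, `<project_name>`, `<master_url>` and `<debt>` are assumptions about how client_state.xml is laid out, and the `SAMPLE` text is invented for illustration, so check your own file before relying on it. The function name `project_debts` is made up for this example.

```python
import xml.etree.ElementTree as ET


def project_debts(xml_text):
    """Return {project name: debt} for every <project> that has a <debt> child.

    The tag names used here are assumptions about the client_state.xml
    layout; adjust them to match what your own file actually contains.
    """
    root = ET.fromstring(xml_text)
    debts = {}
    for project in root.iter("project"):
        name = (
            project.findtext("project_name")
            or project.findtext("master_url")
            or "(unnamed)"
        )
        debt_text = project.findtext("debt")
        if debt_text is not None:
            debts[name.strip()] = float(debt_text)
    return debts


# Invented sample data, only to show the shape the function expects.
SAMPLE = """<client_state>
  <project>
    <project_name>SETI@home</project_name>
    <debt>0.000000</debt>
  </project>
  <project>
    <project_name>climateprediction.net</project_name>
    <debt>1234.500000</debt>
  </project>
</client_state>"""

for name, debt in project_debts(SAMPLE).items():
    flag = "  <-- zero debt" if debt == 0 else ""
    print(f"{name}: {debt}{flag}")
```

To try it on a real file, read the file's text and pass it to `project_debts`. The sketch only reads; any edit to the value should still be done with BOINC stopped, as described above.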
ID: 21713 · Report as offensive
grumpy

Send message
Joined: 2 Jun 99
Posts: 209
Credit: 152,987
RAC: 0
Canada
Message 21714 - Posted: 3 Sep 2004, 4:11:25 UTC

@ AZ Woody

I have not yet figured out how the time split works on my computer.
This is from the boinc site:
Time-slicing
Starting with version 4.00, the BOINC core client does time-slicing. This means that the core client may switch back and forth between results of different projects. This is done in a way that allocates CPU time according to the 'resource shares' you have assigned to each project.

For example, suppose you participate in SETI@home with resource share 100 and Predictor@home with resource share 200. A single-processor machine might be scheduled as follows:

1:00 - 2:00: SETI@home
2:00 - 3:00: Predictor@home
3:00 - 4:00: Predictor@home
4:00 - 5:00: SETI@home
5:00 - 6:00: Predictor@home
6:00 - 7:00: Predictor@home
...

A two-processor machine might be scheduled as follows:
                 CPU 0             CPU 1
1:00 - 2:00:     Predictor@home    SETI@home
2:00 - 3:00:     Predictor@home    SETI@home
3:00 - 4:00:     Predictor@home    Predictor@home
4:00 - 5:00:     Predictor@home    SETI@home
5:00 - 6:00:     Predictor@home    SETI@home
6:00 - 7:00:     Predictor@home    Predictor@home

In every 3-hour period, the two-processor machine spends 4 CPU-hours on Predictor@home and 2 CPU-hours on SETI@home, which is the desired ratio.
This feature is necessary to handle projects like Climateprediction.net, whose work units take a long time (1 or 2 months) to complete on a typical computer. Without time-slicing, your computer would have to finish an entire work unit before it could start working on a different project.
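
One way to picture how hourly slots can end up in proportion to resource shares is a simple debt-based chooser for a single processor, sketched below in Python. This is not the actual BOINC scheduler; the function name `schedule` and the debt rule (every project earns its share fraction each slot, and the project that runs pays back one slot) are assumptions made for illustration. The order of slots it produces may differ from the example above; only the overall ratio is meant to match.

```python
from collections import Counter
from fractions import Fraction


def schedule(shares, slots):
    """Pick one project per time slot so that slot counts follow the shares.

    Each slot, every project's debt grows by its share fraction; the
    project with the largest debt runs and has one full slot deducted.
    Illustrative sketch only, not BOINC's real algorithm.
    """
    total = sum(shares.values())
    debt = {name: Fraction(0) for name in shares}
    order = []
    for _ in range(slots):
        for name, share in shares.items():
            debt[name] += Fraction(share, total)
        chosen = max(debt, key=debt.get)  # most "owed" project runs next
        debt[chosen] -= 1
        order.append(chosen)
    return order


plan = schedule({"SETI@home": 100, "Predictor@home": 200}, 6)
for hour, project in enumerate(plan, start=1):
    print(f"{hour}:00 - {hour + 1}:00: {project}")
print(Counter(plan))
```

Over six one-hour slots with shares 100 and 200, this gives Predictor@home four slots and SETI@home two, the same 2:1 split as the single-processor example.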

Preemption
When BOINC switches from one application to another, the first application is said to be preempted. BOINC can do preemption in two different ways; you can select this as part of your General Preferences.

Don't leave the suspended applications in memory (default). Applications are preempted by killing them; they are later restarted, and resume from their last checkpoint. This saves virtual memory (swap space) but can waste CPU time, especially if applications checkpoint infrequently.
Leave suspended applications in memory. Applications are preempted by suspending them; they remain in virtual memory while preempted (they don't necessarily occupy physical memory).

I have also set 100/200 and I am not getting that time split.

Seti just started a WU on my computer right now; it took 6+ hours for it to do so. I have seen time splits from as low as 1.5 hours up to 11 hours.

ID: 21714 · Report as offensive


 