Lookin' Up (May 23 2007)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 574583 - Posted: 23 May 2007, 21:15:51 UTC

Some good news in general. With some extreme debugging by Jeff and the rare manual-reading by me, we got fastCGI working for both the scheduler CGI and the file upload handler under linux/apache2. In hindsight it was not terribly difficult, but it wasn't very easy to track down the issues given fastCGI's penchant for overloading FILE streams and whatnot. The servers were going up and down this afternoon as we were employing the new executables and working out the configuration kinks.

The results were vast and immediate, which then caused us to quickly hit our next (and possibly final) bottleneck: the rate at which we can create new work. As it stands now the splitters (which create the work) can only run on Solaris machines, three of which we recently retired (koloth, kryten, and galileo). Every Solaris box we have is working on this now, including three not-so-hefty desktop systems (milkyway, glenn, and kang). We could put some effort into making a linux version of the splitter, but I don't think we'll bother for several reasons including: 1. we are sending out workunits faster than we get raw data from the telescope (we always claimed that this would be our "ceiling" and wouldn't put any effort into making work beyond this rate if we don't have the resources), and 2. we are quite close to running out of classic work that is of any scientific use. Any programming effort should pour into the new multibeam splitter, and I sure hope we finish that real soon.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 574583
darkclown
Joined: 14 May 99
Posts: 20
Credit: 77,700
RAC: 0
United States
Message 574589 - Posted: 23 May 2007, 21:20:46 UTC - in response to Message 574583.  

Great news. Does this mean the anonymous_platform issue is resolved?
ID: 574589
Henk Haneveld
Volunteer tester
Joined: 16 May 99
Posts: 154
Credit: 1,577,293
RAC: 1
Netherlands
Message 574596 - Posted: 23 May 2007, 21:26:51 UTC
Last modified: 23 May 2007, 21:30:18 UTC

The timeout problem is not solved. It causes large numbers of ghosts.

I just got 12 new ones.
ID: 574596
Sir Ulli
Volunteer tester
Joined: 21 Oct 99
Posts: 2246
Credit: 6,136,250
RAC: 0
Germany
Message 574597 - Posted: 23 May 2007, 21:27:01 UTC

Very good news, but we still have some problems:

HTTP internal server error

HTTP internal server error

I have 3 hosts here and all have the same problem...

All are using the stock SETI clients, not optimized ones...

I thought that problem had been cleared up

...

Greetings from Germany NRW
Ulli


ID: 574597
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 574603 - Posted: 23 May 2007, 21:45:00 UTC

We're still having a different problem than reported earlier in the thread - I have a completed WU that starts to upload, gets to 98.85%, then stalls.
This is on a cable modem, no optimized app, client 5.8.16.

Just tried it at 1445 PDT...

Hello, from Albany, CA!...
ID: 574603
Conrad Human
Volunteer tester
Joined: 17 Nov 00
Posts: 67
Credit: 2,009,224
RAC: 0
South Africa
Message 574607 - Posted: 23 May 2007, 21:50:31 UTC
Last modified: 23 May 2007, 21:52:25 UTC

Ghosts have always been there; they are just more visible to the public now. If there are not too many of them, it is acceptable.

Matt,
I am sure a lot of us would like to see a "result download rate".
This can be derived from the rate at which "ready to send" is falling.

Example:
RCR = result creation rate
RDR = result download rate
If ready to send is standing still, then RDR = RCR
If ready to send is increasing by 1 result a second, then RDR = RCR - 1
If ready to send is decreasing by 1 result a second, then RDR = RCR + 1
You can use averages; this doesn't have to be perfect.
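
A minimal sketch of the estimate described above, assuming the "ready to send" count can be sampled at a fixed interval and that the creation rate is known. The function name, parameter names, and sample numbers are hypothetical and not part of any BOINC tool; the change in the queue is averaged over the whole sampling window.

```python
from typing import Sequence


def estimate_download_rate(creation_rate: float,
                           ready_to_send: Sequence[float],
                           interval_seconds: float) -> float:
    """Estimate RDR (results/second) as RCR minus the average growth
    rate of the ready-to-send queue over the sampled window.

    creation_rate    -- RCR in results per second (assumed known)
    ready_to_send    -- queue sizes sampled every `interval_seconds`
    interval_seconds -- time between consecutive samples
    """
    if len(ready_to_send) < 2:
        raise ValueError("need at least two ready-to-send samples")
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")

    # Total time covered by the samples.
    elapsed = interval_seconds * (len(ready_to_send) - 1)
    # Positive when the queue grows, negative when it shrinks.
    queue_growth_rate = (ready_to_send[-1] - ready_to_send[0]) / elapsed
    # Results leave the queue at the creation rate minus the net growth.
    return creation_rate - queue_growth_rate


if __name__ == "__main__":
    # Hypothetical numbers: RCR of 10 results/s, queue sampled once a minute.
    print(estimate_download_rate(10.0, [500, 500, 500], 60))  # queue steady
    print(estimate_download_rate(10.0, [0, 60, 120], 60))     # queue growing
    print(estimate_download_rate(10.0, [120, 60, 0], 60))     # queue shrinking
```

Using only the first and last samples averages out short-term jitter, which matches the point that the figure doesn't have to be perfect.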


ID: 574607
Henk Haneveld
Volunteer tester
Joined: 16 May 99
Posts: 154
Credit: 1,577,293
RAC: 1
Netherlands
Message 574616 - Posted: 23 May 2007, 22:00:46 UTC - in response to Message 574607.  

Ghosts have always been there; they are just more visible to the public now. If there are not too many of them, it is acceptable.


True, but as far as I know the 5-minute timeout problem only appeared in the last few days.
ID: 574616
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 574622 - Posted: 23 May 2007, 22:11:32 UTC

Just twiddled the timeouts - maybe that'll fix the ghost workunit/result issue.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 574622
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 574625 - Posted: 23 May 2007, 22:18:14 UTC - in response to Message 574622.  

Just twiddled the timeouts - maybe that'll fix the ghost workunit/result issue.

- Matt


Matt...

Uploads are impossible at the moment. Might want to check to see if there's anything wrong other than extra traffic...

Brian
ID: 574625
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 574626 - Posted: 23 May 2007, 22:21:38 UTC - in response to Message 574625.  

I guess I just needed to post something here, because right after I did, the upload I had worked... Now it is just "No work from project", which I would guess is probably either legit or slow feeder...
ID: 574626
Oddbjornik
Volunteer tester
Joined: 15 May 99
Posts: 220
Credit: 349,610,548
RAC: 1,728
Norway
Message 574627 - Posted: 23 May 2007, 22:22:00 UTC

Uploads go to 100%, and *then* they fail... After the network traffic is complete.
ID: 574627
S@NL - Mark Thijssen
Joined: 13 Apr 00
Posts: 1
Credit: 3,799,271
RAC: 0
Netherlands
Message 574635 - Posted: 23 May 2007, 22:30:18 UTC - in response to Message 574622.  

Just twiddled the timeouts - maybe that'll fix the ghost workunit/result issue.

- Matt


I just got a WU on my FreeBSD box (with an app_info.xml) so that appears to work now :)
ID: 574635
Bax
Volunteer tester
Joined: 16 May 99
Posts: 182
Credit: 3,919,072
RAC: 0
Canada
Message 574643 - Posted: 23 May 2007, 22:50:28 UTC

Thanks for the update and all the hard work, Matt!




ID: 574643
Dr. C.E.T.I.
Joined: 29 Feb 00
Posts: 16019
Credit: 794,685
RAC: 0
United States
Message 574668 - Posted: 23 May 2007, 23:44:03 UTC - in response to Message 574583.  

. . . keep up the good work Matt (and that goes without sayin' - the rest @ Berkeley too . . .)



ID: 574668
Kim Vater
Volunteer tester
Joined: 27 May 99
Posts: 227
Credit: 22,743,307
RAC: 0
Norway
Message 574700 - Posted: 24 May 2007, 0:46:53 UTC
Last modified: 24 May 2007, 1:11:02 UTC

Hi,

Thanks for your hard work at the Lab ;)

Just managed to download new work with optimized apps (app_info.xml) without renaming it.
So apparently it works ;-)

BTW: It looks as if it was the "too short timings" that caused the trouble for the optimized apps, as they need slightly longer to connect/communicate than the ordinary stock apps.

Kiva
Greetings from Norway

Crunch3er & AK-V8 Inside
ID: 574700
Francesco Forti
Joined: 24 May 00
Posts: 334
Credit: 204,421,005
RAC: 15
Switzerland
Message 574810 - Posted: 24 May 2007, 5:57:49 UTC - in response to Message 574622.  
Last modified: 24 May 2007, 5:58:40 UTC

Just twiddled the timeouts - maybe that'll fix the ghost workunit/result issue.

- Matt


Uploads are now running with no problems.
What I see is that communicating with the scheduler is still difficult,
and I get error messages 300 secs after the initial reporting request.

The errors are:

24.05.2007 07:46:23||Project communication failed: attempting access to reference site
24.05.2007 07:46:24||Access to reference site succeeded - project servers may be temporarily down.

or

24.05.2007 07:37:51||Access to reference site succeeded - project servers may be temporarily down.
24.05.2007 07:37:51|SETI@home|Scheduler request failed: failed sending data to the peer

(these messages were received a few minutes ago ... my time is UTC+2)

On these hosts I don't have any app_info file, and
at the moment I can't say anything about my remote
hosts that are still using app_info.
If I see them working, I'll tell you.

Bye,
Franz
ID: 574810
aplayer
Joined: 26 Apr 00
Posts: 13
Credit: 15,217,341
RAC: 0
United States
Message 574812 - Posted: 24 May 2007, 6:10:20 UTC

Thank you for the news update. All you guys are doing a great job.
ID: 574812
eaglescouter
Joined: 28 Dec 02
Posts: 162
Credit: 42,012,553
RAC: 0
United States
Message 574815 - Posted: 24 May 2007, 6:21:23 UTC

Is there any way for us to know if the project is offline vs out of work?
It's not too many computers, it's a lack of circuit breakers for this room. But we can fix it :)
ID: 574815
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 574822 - Posted: 24 May 2007, 7:44:21 UTC - in response to Message 574815.  

Is there any way for us to know if the project is offline vs out of work?

Check the Server Status page.
Grant
Darwin NT
ID: 574822
H Elzinga
Volunteer tester
Joined: 20 Aug 99
Posts: 125
Credit: 8,277,116
RAC: 0
Netherlands
Message 574829 - Posted: 24 May 2007, 8:16:19 UTC - in response to Message 574583.  
Last modified: 24 May 2007, 8:23:28 UTC

The results were vast and immediate, which then caused us to quickly hit our next (and possibly final) bottleneck: the rate at which we can create new work. As it stands now the splitters (which create the work) can only run on Solaris machines, three of which we recently retired (koloth, kryten, and galileo). We have every possible Solaris box we have working on this now including three not-so-hefty desktop systems (milkyway, glenn, and kang). We could put some effort into making a linux version of splitter, but I don't think we'll bother for several reasons including: 1. we are sending out workunits faster than we get raw data from the telescope (we always claimed that this would be our "ceiling" and wouldn't put any effort into making work beyond this rate if we don't have the resources), and 2. we are quite close to running out of classic work that is of any scientific use. Any programming effort should pour into the new multibeam splitter, and I sure hope we finish that real soon.


Matt,

I'm somewhat confused.

As far as I understand, the multibeam receiver will increase the amount of data recorded at the radio telescope (7 receivers x 2 polarizations = 14 channels).
As you say yourself, work is still going on on the multibeam splitter.

But wouldn't more data require more tapes, and so more splitters?
Then you would probably also need the Linux machines.

Or have I missed something?

Greetings from the Netherlands
ID: 574829