Of the Woods (Feb 19 2009)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 867158 - Posted: 19 Feb 2009, 20:41:57 UTC

As we move toward the weekend we're sticking with the current raw data storage workarounds, which means servers are loaded heavier than we'd like, but at least data is still flowing. I wouldn't be surprised if there are network hiccups or if the assimilator queue swells during the weekend.

So far this morning, lots of chores. Bob and I got a shipment of empty data drives bundled up to be sent to Arecibo. I finished getting the new CPU server configured (now Eric, Josh, Jeff, and I are competing less for cycles). I made more strides towards retiring the last two Solaris machines. Honestly, depending on the development/production environment, I'd still probably prefer Solaris over Linux. So I'm sad to see these systems go, but they are both very old SPARC machines that we simply don't need anymore.

Late last week Eric, Jeff and I had a quick meeting to discuss current candidate scoring algorithms - we're pretty sure we'll have to tweak them as we go, but we're in enough agreement to get started implementing this part of the NTPCker. Jeff's been all over that this week. I'm just now turning my focus back to actual development, too. My software radar blanker now agrees with the hardware blanker 90% of the time, which is a very good start. I can add an additional 5% just by adjusting thresholds, but the real test is to run software blanked data through the pipeline and see which workunits generate more RFI (the ones using hardware blanking or the ones using software blanking).
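To make the "agrees 90% of the time" figure concrete, here is a minimal sketch of how agreement between two blankers could be measured and how a threshold change might raise it. It is not the actual SETI@home blanker: the function names, the simple power threshold, and the sample data are all assumptions made up for illustration.

```python
def agreement_rate(hardware_mask, software_mask):
    """Fraction of samples where both blankers make the same decision."""
    if len(hardware_mask) != len(software_mask):
        raise ValueError("masks must cover the same samples")
    if not hardware_mask:
        raise ValueError("masks must not be empty")
    matches = sum(h == s for h, s in zip(hardware_mask, software_mask))
    return matches / len(hardware_mask)


def software_blank(power, threshold):
    """Hypothetical detector: blank a sample when its power exceeds threshold."""
    return [p > threshold for p in power]


# Made-up illustrative data, not real telescope measurements.
power = [0.1, 0.9, 0.8, 0.2, 0.7, 0.1, 0.95, 0.3, 0.6, 0.05]
hardware = [False, True, True, False, True, False, True, False, True, False]

# Lowering the (hypothetical) threshold changes how often the two agree.
for threshold in (0.75, 0.5):
    rate = agreement_rate(hardware, software_blank(power, threshold))
    print(f"threshold={threshold}: agreement={rate:.0%}")
```

A high agreement rate only says the two blankers flag the same samples; as the post notes, the real test is still which version produces less RFI downstream.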

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 867158 · Report as offensive
Profile Bob Mahoney Design
Joined: 4 Apr 04
Posts: 178
Credit: 9,205,632
RAC: 0
United States
Message 867237 - Posted: 20 Feb 2009, 1:09:52 UTC

Lots of progress - amazing. This is one intense project.

Thanks for the inside info.

Bob Mahoney
ID: 867237 · Report as offensive
Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 867816 - Posted: 21 Feb 2009, 23:54:07 UTC
Last modified: 21 Feb 2009, 23:55:13 UTC

Patience required. It seems like something is amiss: uploads and downloads are working, but extremely slowly at the moment.

As Matt said, the hiccups he mentioned may have started.

No big deal, have a decent weekend break guys and sort things Monday, you deserve it.
ID: 867816 · Report as offensive
Profile SMW

Joined: 16 May 99
Posts: 22
Credit: 29,285,238
RAC: 16
United States
Message 867823 - Posted: 22 Feb 2009, 0:20:06 UTC

Well, I haven't been able to upload all day :( Oh well, it will wait until next week. I have noticed that all of my computers have had a major drop in work credit. One computer has dropped to 25% of last month and the others have lost about 30%. <----assumes that this too will pass.
"It is better to be hated for what you are than to be loved for what you are not"
- Andre Gide (1869-1951)
ID: 867823 · Report as offensive
John G

Joined: 29 Dec 01
Posts: 68
Credit: 10,932,850
RAC: 0
Canada
Message 867838 - Posted: 22 Feb 2009, 1:24:52 UTC - in response to Message 867823.  

Yes, ditto here. I am running CUDA and I have noticed some WUs are taking almost 3 hours of processing time, and I am only getting credit for about 3.4 minutes on my CUDA. (Most WUs on CUDA finish in less than 25 min.) Oh well, there was some talk about having to adjust the scoring for some reason.

Regards

ID: 867838 · Report as offensive
Profile popandbob
Volunteer tester

Joined: 19 Mar 05
Posts: 551
Credit: 4,673,015
RAC: 0
Canada
Message 867861 - Posted: 22 Feb 2009, 2:23:34 UTC

Traffic

They are currently maxed out on network traffic, hence the issues.

Also the long CUDA tasks are VLAR. A fix is on its way.


Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957
ID: 867861 · Report as offensive
Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 867870 - Posted: 22 Feb 2009, 2:45:28 UTC

As Matt mentioned, they expected some problems to materialize over the weekend.

I do not blame them for not fixing the problems remotely, if indeed they even could. They deserve to have at least one weekend of "leave it until it clears itself or until we are in the lab again".

Just noticed that the replica db is offline:

BOINC Database Engine State       #        As of*
Master database queries/second    192      0m
Replica seconds behind master     Offline  0m

The one Astropulse workunit should keep the processors happy until the problem either clears or is fixed Monday/Tuesday.

Could do with a break from crunching for a bit. Might actually give me time to clear out my liquid cooling loop and replace the liquid, due a replacement soon anyways.

I don't really worry too much about RAC. To be honest, actively searching for something that could be life-changing for the entire planet is more than enough gratification for participating.

Hope all you crunchers out there, wherever you may be, have a good weekend.
ID: 867870 · Report as offensive
Richard Haselgrove
Volunteer tester

Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 867990 - Posted: 22 Feb 2009, 10:05:12 UTC
Last modified: 22 Feb 2009, 10:47:34 UTC

Matt,

I think you need to turn on Coral Cache for distribution of the new Astropulse_v5 5.03 executables. Have a look at WU 417685549 - six download failures, all on the .exe

Edit - is there any way that BOINC can set a separate quota for Astropulse? 100/CPU/day, plus GPU allocation, is ludicrous for an 8MB, 3-day task.
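A quick back-of-the-envelope check shows why that quota looks out of proportion. This is only a sketch using the figures quoted above, and it assumes "3-day task" means roughly three days of processing on one CPU; the constant and function names are made up for illustration.

```python
# Figures quoted in the post; the 3 days is read here as CPU time per task
# (an assumption, since the post does not say which kind of time it means).
QUOTA_TASKS_PER_CPU_PER_DAY = 100
TASK_SIZE_MB = 8
TASK_RUNTIME_DAYS = 3


def daily_download_mb(quota, size_mb):
    """Data one CPU could request per day if it used its whole quota."""
    return quota * size_mb


def cpu_days_fetched_per_day(quota, runtime_days):
    """CPU-days of work one CPU could pull down in a single day."""
    return quota * runtime_days


def tasks_finished_per_day(runtime_days):
    """Tasks one CPU can actually complete per day at that runtime."""
    return 1 / runtime_days


print(daily_download_mb(QUOTA_TASKS_PER_CPU_PER_DAY, TASK_SIZE_MB), "MB per CPU per day")
print(cpu_days_fetched_per_day(QUOTA_TASKS_PER_CPU_PER_DAY, TASK_RUNTIME_DAYS), "CPU-days of work per day")
print(round(tasks_finished_per_day(TASK_RUNTIME_DAYS), 2), "tasks completed per CPU per day")
```

Under those assumptions a single CPU could ask for 800 MB and 300 CPU-days of Astropulse work per day while finishing about a third of one task in the same time.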
ID: 867990 · Report as offensive
Nick

Joined: 17 May 99
Posts: 96
Credit: 17,356,094
RAC: 0
United States
Message 868137 - Posted: 22 Feb 2009, 18:29:29 UTC
Last modified: 22 Feb 2009, 18:29:55 UTC

My messages are reporting:

2/22/2009 11:22:18 AM||Internet access OK - project servers may be temporarily down.

However the stats page doesn't seem to report much in the way of problems. I'm unable to download any work units. Should I assume this is a result of the problems reported above?

Nick
ID: 868137 · Report as offensive
Profile RandyF
Volunteer tester
Joined: 8 Jan 07
Posts: 15
Credit: 12,296,855
RAC: 1
United States
Message 868164 - Posted: 22 Feb 2009, 19:15:18 UTC - in response to Message 868137.  

Yes. This is exactly the reason why I always have 10 days of work on hand.. ;)
ID: 868164 · Report as offensive
Swibby Bear

Joined: 1 Aug 01
Posts: 246
Credit: 7,945,093
RAC: 0
United States
Message 868167 - Posted: 22 Feb 2009, 19:18:10 UTC
Last modified: 22 Feb 2009, 19:23:36 UTC

The Astropulse downloads have overwhelmed the available network bandwidth, causing many download errors, each of which causes a reissue of the WU, which then also errors out, which causes yet another reissue. It is a vicious cycle.
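A toy model of that feedback loop, purely for illustration: every step some new download requests arrive, the link can serve only a fixed number, and whatever fails comes back as a retry. The numbers and the function name are made up, and the real scheduler is far more involved (for instance, failed transfers also eat bandwidth, which this sketch ignores).

```python
def simulate_backlog(new_per_step, capacity_per_step, steps):
    """Return the pending-download backlog after each step.

    Every step, new requests join whatever failed earlier. The link serves
    at most capacity_per_step of them; the rest fail and are retried later.
    """
    backlog = 0
    history = []
    for _ in range(steps):
        demand = backlog + new_per_step
        served = min(demand, capacity_per_step)
        backlog = demand - served
        history.append(backlog)
    return history


# Demand above capacity: the backlog of retries grows without bound.
print(simulate_backlog(new_per_step=12, capacity_per_step=10, steps=5))
# Demand throttled below capacity: nothing piles up.
print(simulate_backlog(new_per_step=5, capacity_per_step=10, steps=5))
```

In this simplified picture, the only way out is to push new demand below what the link can carry, which is the point of throttling the splitters.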

The easy solution is to shut off the AP splitters until the queued AP work runs out, then run only one splitter. That way, the bandwidth can probably handle the downloads efficiently and allow uploads to get a little time, too.

Eventually, everyone will fill their caches, which, by the way, should be only one day or two until things settle down. After things are smoother, then we can greedily (but slowly) raise our AP cache sizes.

Whit
ID: 868167 · Report as offensive
Marook
Joined: 22 Jul 99
Posts: 1
Credit: 3,266,827
RAC: 0
Denmark
Message 868184 - Posted: 22 Feb 2009, 19:49:43 UTC - in response to Message 867158.  

Seems like we have hit a load issue.

None of my workunits will upload... :-(


/Marook
ID: 868184 · Report as offensive
William Roeder
Volunteer tester
Joined: 19 May 99
Posts: 69
Credit: 523,414
RAC: 0
United States
Message 868200 - Posted: 22 Feb 2009, 20:25:30 UTC - in response to Message 868184.  

I just got one of nine to upload.
It's just very slow.
ID: 868200 · Report as offensive
Profile *Viking*
Joined: 2 Nov 03
Posts: 17
Credit: 1,051,900
RAC: 1
Canada
Message 868226 - Posted: 22 Feb 2009, 21:26:52 UTC

I've four workunits that won't upload, too... and that message, "Internet access OK - project servers may be temporarily down."

Cross the fingers and wait, I suppose.
* Viking *
ID: 868226 · Report as offensive
Profile SoNic

Joined: 24 Dec 00
Posts: 140
Credit: 2,963,627
RAC: 0
Romania
Message 868232 - Posted: 22 Feb 2009, 21:37:18 UTC

Since yesterday, none of my units uploaded. I have run out of work...
ID: 868232 · Report as offensive
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 868254 - Posted: 22 Feb 2009, 22:14:41 UTC - in response to Message 868240.  

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.
ID: 868254 · Report as offensive
RJS

Joined: 14 Jan 09
Posts: 2
Credit: 53,526
RAC: 0
United States
Message 868258 - Posted: 22 Feb 2009, 22:22:35 UTC

I am down to only 2 things running, and I have a big list now wanting to upload. My poor processors are starting to get the jitters with nothing to do. But I do understand new things happening and long weekends.
ID: 868258 · Report as offensive
Profile Jack Shaftoe
Joined: 19 Aug 04
Posts: 44
Credit: 2,343,242
RAC: 0
United States
Message 868265 - Posted: 22 Feb 2009, 22:34:48 UTC - in response to Message 868258.  

I am down to only 2 things running, and I have a big list now wanting to upload. My poor processors are starting to get the jitters with nothing to do. But I do understand new things happening and long weekends.


This is why you should always have a second project attached with a low priority - like 5 or 10. When your main project (SETI) locks up, your CPU will work for another project rather than sit idle. There are dozens of other BOINC projects that would appreciate the extra cycles here and there...
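For a rough idea of what a "5 or 10" setting costs the main project, here is a small sketch. It assumes the priority mentioned above is BOINC's resource share, that SETI@home sits at a share of 100, and that CPU time is split in proportion to shares while both projects have work; the project names and function name are made up for illustration.

```python
def cpu_time_fractions(resource_shares):
    """Split CPU time in proportion to each project's resource share."""
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive share")
    return {name: share / total for name, share in resource_shares.items()}


# Assumed shares: main project at 100, backup project at 10.
shares = {"SETI@home": 100, "backup project": 10}

for name, fraction in cpu_time_fractions(shares).items():
    print(f"{name}: {fraction:.1%} of CPU time while both have work")
```

With those assumed shares the backup project gets about 9% of the CPU in normal times, and it is there to pick up the slack when the main project has nothing to send.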
ID: 868265 · Report as offensive
Gorim1

Joined: 15 Nov 06
Posts: 4
Credit: 1,536,081
RAC: 0
Poland
Message 868271 - Posted: 22 Feb 2009, 22:43:13 UTC

I hope that this problem will be solved ASAP. I'm running out of work.
ID: 868271 · Report as offensive
Swibby Bear

Joined: 1 Aug 01
Posts: 246
Credit: 7,945,093
RAC: 0
United States
Message 868279 - Posted: 22 Feb 2009, 22:58:12 UTC - in response to Message 868254.  

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.


... or one or more WUs are suspended in the task list on your computer.
ID: 868279 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.