Of the Woods (Feb 19 2009)



Message boards : Technical News : Of the Woods (Feb 19 2009)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 867158 - Posted: 19 Feb 2009, 20:41:57 UTC

As we move toward the weekend we're sticking with the current raw data storage workarounds, which means the servers are more heavily loaded than we'd like, but at least data is still flowing. I wouldn't be surprised if there are network hiccups or if the assimilator queue swells during the weekend.

So far this morning, lots of chores. Bob and I got a shipment of empty data drives bundled up to be sent to Arecibo. I finished getting the new CPU server configured (now Eric, Josh, Jeff, and I are in less competition for cycles). I made more strides towards retiring the last two Solaris machines. Honestly, depending on the development/production environment I'd still probably prefer Solaris over Linux. So I'm sad to see these systems go, but they are both very old SPARC machines that we simply don't need anymore.

Late last week Eric, Jeff and I had a quick meeting to discuss current candidate scoring algorithms - we're pretty sure we'll have to tweak them as we go, but we're in enough agreement to get started implementing this part of the NTPCker. Jeff's been all over that this week. I'm just now turning my focus back to actual development, too. My software radar blanker now agrees with the hardware blanker 90% of the time, which is a very good start. I can add an additional 5% just by adjusting thresholds, but the real test is to run software blanked data through the pipeline and see which workunits generate more RFI (the ones using hardware blanking or the ones using software blanking).
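
To make the agreement figure concrete, here is a minimal sketch of how two blanking masks could be compared sample by sample. It is an illustration only: the simple power-threshold detector, the sample values, and the names `software_blank` and `agreement` are assumptions made up for this example, not the actual blanker code.

```python
def software_blank(power, threshold):
    """Hypothetical detector: flag a sample as radar when power exceeds threshold."""
    return [p > threshold for p in power]


def agreement(hw_mask, sw_mask):
    """Fraction of samples where both blankers make the same decision."""
    if len(hw_mask) != len(sw_mask):
        raise ValueError("masks must cover the same samples")
    same = sum(1 for h, s in zip(hw_mask, sw_mask) if h == s)
    return same / len(hw_mask)


# Made-up sample powers and a made-up hardware blanking mask.
power = [0.1, 0.9, 0.4, 0.8, 0.2, 0.95, 0.3, 0.55, 0.05, 0.7]
hw_mask = [False, True, False, True, False, True, False, True, False, True]

for threshold in (0.6, 0.5):
    sw_mask = software_blank(power, threshold)
    print(f"threshold={threshold}: agreement={agreement(hw_mask, sw_mask):.0%}")
```

With this toy data, lowering the threshold from 0.6 to 0.5 picks up the one sample the software detector missed. Agreement alone does not say which blanker is right, which is why comparing RFI counts after the pipeline is the real test.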

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile Bob Mahoney Design
Joined: 4 Apr 04
Posts: 178
Credit: 9,205,632
RAC: 0
United States
Message 867237 - Posted: 20 Feb 2009, 1:09:52 UTC

Lots of progress - amazing. This is one intense project.

Thanks for the inside info.

Bob Mahoney

Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 142
Credit: 6,466,200
RAC: 1
Canada
Message 867816 - Posted: 21 Feb 2009, 23:54:07 UTC
Last modified: 21 Feb 2009, 23:55:13 UTC

Patience required. Something seems amiss: uploads and downloads are working but extremely slow at the moment.

As Matt said, the hiccups he mentioned may have started.

No big deal, have a decent weekend break guys and sort things Monday, you deserve it.
____________

SMW
Project donor
Joined: 16 May 99
Posts: 21
Credit: 10,980,436
RAC: 2,848
United States
Message 867823 - Posted: 22 Feb 2009, 0:20:06 UTC

Well, I haven't been able to upload all day :( Oh well, it will wait until next week. I have noticed that all of my computers have had a major drop in work credit. One computer has dropped to 25% of last month and the others have lost about 30%. <----assumes that this too will pass.
____________
"It is better to be hated for what you are than to be loved for what you are not"
- Andre Gide (1869-1951)

John G
Joined: 29 Dec 01
Posts: 63
Credit: 10,142,278
RAC: 0
Canada
Message 867838 - Posted: 22 Feb 2009, 1:24:52 UTC - in response to Message 867823.

Yes, ditto here. I am running CUDA and I have noticed some WUs are taking almost 3 hours of processing time, yet I am only getting credit for about 3.4 minutes. (Most WUs on CUDA finish in less than 25 minutes.) Oh well, there was some talk about having to adjust the scoring for some reason.

Regards

Profile popandbob
Volunteer tester
Joined: 19 Mar 05
Posts: 535
Credit: 1,896,421
RAC: 0
Canada
Message 867861 - Posted: 22 Feb 2009, 2:23:34 UTC

Traffic

They are currently maxed out on network traffic, hence the issues.

Also, the long CUDA tasks are VLAR (very low angle range) work. A fix is on its way.
____________


Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957

Profile Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 142
Credit: 6,466,200
RAC: 1
Canada
Message 867870 - Posted: 22 Feb 2009, 2:45:28 UTC

As Matt mentioned, they expected some problems to materialize over the weekend.

I do not blame them for not fixing them remotely, if indeed they even could. They deserve to have at least one weekend of "leave it until it clears itself or until we are in the lab again".

Just noticed that the replica db is offline:

BOINC Database Engine State        #         As of*
Master database queries/second     192       0m
Replica seconds behind master      Offline   0m

The one Astropulse workunit should keep the processors happy until the problem either clears or is fixed Monday/Tuesday.

Could do with a break from crunching for a bit. It might actually give me time to clear out my liquid cooling loop and replace the liquid, which is due for replacement soon anyway.

I don't really worry too much about RAC. To be honest, actively searching for something that could be life changing for the entire planet is more than enough gratification for participating.

Hope all you crunchers out there, wherever you may be, have a good weekend.
____________

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8431
Credit: 47,875,374
RAC: 56,543
United Kingdom
Message 867990 - Posted: 22 Feb 2009, 10:05:12 UTC
Last modified: 22 Feb 2009, 10:47:34 UTC

Matt,

I think you need to turn on Coral Cache for distribution of the new Astropulse_v5 5.03 executables. Have a look at WU 417685549 - six download failures, all on the .exe

Edit - is there any way that BOINC can set a separate quota for Astropulse? 100/CPU/day, plus GPU allocation, is ludicrous for an 8MB, 3-day task.
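
For scale, the quota arithmetic can be worked through in a few lines. The quota (100 per CPU per day), task size (8 MB), and runtime (3 days) come from the post; the 4-core host is a hypothetical example, and the sketch assumes "3-day task" means three days on a single core.

```python
QUOTA_PER_CPU_PER_DAY = 100  # tasks, from the post
TASK_SIZE_MB = 8             # from the post
TASK_RUNTIME_DAYS = 3        # from the post, assumed to be per core


def max_daily_download_mb(cpus):
    """Download volume if a host fetched its full daily quota."""
    return cpus * QUOTA_PER_CPU_PER_DAY * TASK_SIZE_MB


def days_to_crunch_one_days_quota(cpus):
    """Wall-clock days the host needs to finish one day's worth of quota."""
    tasks = cpus * QUOTA_PER_CPU_PER_DAY
    return tasks * TASK_RUNTIME_DAYS / cpus


cpus = 4  # hypothetical host
print(max_daily_download_mb(cpus), "MB per day at full quota")
print(days_to_crunch_one_days_quota(cpus), "days to process that work")
```

Under these assumptions a single core can finish only a third of a task per day, so a quota of 100 allows roughly 300 times more work to be fetched than can be completed.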

Nick
Joined: 17 May 99
Posts: 88
Credit: 9,027,153
RAC: 1,529
United States
Message 868137 - Posted: 22 Feb 2009, 18:29:29 UTC
Last modified: 22 Feb 2009, 18:29:55 UTC

My messages are reporting:

2/22/2009 11:22:18 AM||Internet access OK - project servers may be temporarily down.

However the stats page doesn't seem to report much in the way of problems. I'm unable to download any work units. Should I assume this is a result of the problems reported above?

Nick
____________

Profile 335deezl
Volunteer tester
Joined: 8 Jan 07
Posts: 15
Credit: 12,207,926
RAC: 0
United States
Message 868164 - Posted: 22 Feb 2009, 19:15:18 UTC - in response to Message 868137.

Yes. This is exactly the reason why I always have 10 days of work on hand.. ;)

Swibby Bear
Joined: 1 Aug 01
Posts: 236
Credit: 7,275,432
RAC: 1,147
United States
Message 868167 - Posted: 22 Feb 2009, 19:18:10 UTC
Last modified: 22 Feb 2009, 19:23:36 UTC

The Astropulse downloads have overwhelmed the available network bandwidth, causing many download errors, each of which causes a reissue of the WU, which then also errors out, which causes yet another reissue. It is a vicious cycle.
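
A rough way to see how quickly this amplifies: if every download attempt fails independently with probability p and every failure triggers another attempt, the expected number of attempts per delivered workunit is 1/(1-p). The sketch below checks that formula against a small simulation. It is a simplified model with assumed names (`expected_attempts`, `simulate`); it ignores any per-workunit error limit on the server and the feedback where extra traffic pushes the failure rate up further.

```python
import random


def expected_attempts(p_fail):
    """Mean attempts per successful download for an independent failure rate."""
    if not 0 <= p_fail < 1:
        raise ValueError("p_fail must be in [0, 1)")
    return 1.0 / (1.0 - p_fail)


def simulate(p_fail, trials, seed=0):
    """Monte Carlo estimate of the same quantity, retrying until success."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        attempts = 1
        while rng.random() < p_fail:  # this attempt failed, so try again
            attempts += 1
        total += attempts
    return total / trials


for p in (0.1, 0.5, 0.75, 0.9):
    print(f"p_fail={p}: formula={expected_attempts(p):.2f}, "
          f"simulated={simulate(p, 20000):.2f}")
```

Because every retry also consumes bandwidth, the real failure rate rises with load, so the actual situation is worse than this fixed-p model suggests.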

The easy solution is to shut off the AP splitters until the ready-to-send work runs out, then run only one splitter. That way, the bandwidth can probably handle the downloads efficiently and allow uploads to get a little time, too.

Eventually, everyone will fill their caches, which, by the way, should be set to only a day or two until things settle down. After things are smoother, we can greedily (but slowly) raise our AP cache sizes.

Whit

Marook
Joined: 22 Jul 99
Posts: 1
Credit: 1,489,347
RAC: 2,154
Denmark
Message 868184 - Posted: 22 Feb 2009, 19:49:43 UTC - in response to Message 867158.

Seems like we have hit a load issue.

None of my workunits will upload... :-(


____________
/Marook

William Roeder
Volunteer tester
Joined: 19 May 99
Posts: 69
Credit: 523,414
RAC: 0
United States
Message 868200 - Posted: 22 Feb 2009, 20:25:30 UTC - in response to Message 868184.

I just got one of nine to upload. It's just very slow.
____________

Profile *Viking*
Joined: 2 Nov 03
Posts: 16
Credit: 505,332
RAC: 0
Canada
Message 868226 - Posted: 22 Feb 2009, 21:26:52 UTC

I've four workunits that won't upload, too... and that message, "Internet access OK - project servers may be temporarily down."

Cross the fingers and wait, I suppose.
____________
Viking

Profile SoNic
Joined: 24 Dec 00
Posts: 137
Credit: 2,849,499
RAC: 0
Romania
Message 868232 - Posted: 22 Feb 2009, 21:37:18 UTC

Since yesterday, none of my units uploaded. I have run out of work...

XWing69
Joined: 3 Jan 08
Posts: 43
Credit: 2,411,695
RAC: 291
United States
Message 868240 - Posted: 22 Feb 2009, 21:53:22 UTC - in response to Message 868232.

Since yesterday, none of my units uploaded. I have run out of work...


I can relate. I have 6 WUs (MB) that are still trying to upload and nothing in the queue to crunch. I have SETI, AP, and AP_v5 all "on" at the website, and have all the optimized apps on board. Every time I do a manual update, I get the "requesting 0 seconds of work" message. I know it's not the optimized apps, because this happened before I put AP_v5 in the XML file (and I did reboot my computer, just to make sure everything was 'clean' before restarting BOINC).

I wonder how others are still getting downloads...



____________

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13563
Credit: 29,757,946
RAC: 16,818
United States
Message 868254 - Posted: 22 Feb 2009, 22:14:41 UTC - in response to Message 868240.

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.
____________

RJS
Joined: 14 Jan 09
Posts: 2
Credit: 53,526
RAC: 0
United States
Message 868258 - Posted: 22 Feb 2009, 22:22:35 UTC

I am down to only 2 things running, and I have a big list now waiting to upload. My poor processors are starting to get the jitters with nothing to do. But I do understand new things happening and long weekends.

Profile Jack Shaftoe
Joined: 19 Aug 04
Posts: 44
Credit: 2,343,242
RAC: 0
United States
Message 868265 - Posted: 22 Feb 2009, 22:34:48 UTC - in response to Message 868258.

I am down to only 2 things running, and I have a big list now waiting to upload. My poor processors are starting to get the jitters with nothing to do. But I do understand new things happening and long weekends.


This is why you should always have a second project attached with a low priority, like 5 or 10. When your main project (SETI) locks up, your CPU will work for another project rather than sit idle. There are dozens of other BOINC projects that would appreciate the extra cycles here and there...
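
As a rough illustration of what a share of 5 or 10 means, the sketch below assumes the "priority" above is BOINC's resource share setting, that SETI is left at the default share of 100, and that CPU time is split in proportion to the shares of projects that currently have work. The function name and the project labels are made up for the example, and the real client scheduler is more involved than this long-run proportion.

```python
def cpu_fractions(shares, has_work=None):
    """Split CPU time among projects in proportion to their resource shares.

    Projects marked as having no work get nothing, and their share is
    redistributed among the projects that do have work.
    """
    has_work = has_work or {}
    active = {name: share for name, share in shares.items()
              if has_work.get(name, True)}
    total = sum(active.values())
    if total == 0:
        return {name: 0.0 for name in shares}
    return {name: active.get(name, 0) / total for name in shares}


shares = {"SETI@home": 100, "backup project": 5}  # hypothetical setup

print(cpu_fractions(shares))                        # both projects have work
print(cpu_fractions(shares, {"SETI@home": False}))  # SETI has nothing to send
```

In this model the backup project gets a little under 5% of CPU time while SETI has work, and all of it when SETI does not.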

Gorim1
Joined: 15 Nov 06
Posts: 4
Credit: 1,474,936
RAC: 0
Poland
Message 868271 - Posted: 22 Feb 2009, 22:43:13 UTC

I hope that this problem will be solved ASAP. I'm running out of work.


Copyright © 2014 University of California