Astropulse! Finally! =D
Author | Message |
---|---|
Hellsheep Joined: 12 Sep 08 Posts: 428 Credit: 784,780 RAC: 0 |
I've been waiting for the day I start to get AP work units! I've now got 11, and they're downloading on every request now; it seems I get an AP work unit every request, but nothing else. :) - Jarryd |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Yes - apparently, they have screwed up MB distribution, if I read the threads about "new" error messages correctly. And, apparently, they aren't about to roll back whatever server changes have been made that are causing the problem. Which sucks. |
Hellsheep Joined: 12 Sep 08 Posts: 428 Credit: 784,780 RAC: 0 |
Quoting Cruncher-American: "Yes - apparently, they have screwed up MB distribution, if I read the threads about "new" error messages correctly. And, apparently, they aren't about to roll back whatever server changes have been made that are causing the problem. Which sucks." http://setiathome.berkeley.edu/forum_thread.php?id=60285 That's a great read about the issues going on at the moment; there is a lot of speculation as to what's going on. Richard has been doing some testing and writing up some of his completed research into the issue relating to the 100-per-day quota. - Jarryd |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Yes - I read that thread, too. The real problem seems to be stubbornness on the part of the Berkeley folks. When these errors surfaced, they should have immediately rolled back to previous versions - any customer-oriented organization would do so. New code should be tested on the Beta site, NOT production. If that means Fermi might have to be delayed, so be it. |
Geek@Play Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
I do agree that the server software here should have been rolled back when the problems surfaced regarding the new Fermi cards. Changes should have been tested on the Beta site first. Also, as Matt described in this post and as mentioned on the front page, there is a problem in removing the radar interference, which is delaying the MB work. So Berkeley has many problems to solve right now. Many of them are self-inflicted. Patience is required by us. Boinc....Boinc....Boinc....Boinc.... |
perryjay Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
The new Fermi app was tested on the Beta site. It was brought over a little early because of all the problems people were having trying to run them on the main site without reading up on how to get them to work right. That is not Berkeley's fault. The problems with the tasks pages are just teething problems getting an improvement working right. That will get worked out when they can. I have no idea where the problem about quota limits being reached popped up, but I'm sure it will be fixed soon also. Please be patient! The guys are working on these problems and will get us back to normal as fast as they can! PROUD MEMBER OF Team Starfire World BOINC |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
Quoting Cruncher-American: "Yes - I read that thread, too. The real problem seems to be stubbornness on the part of the Berkeley folks. When these errors surfaced, they should have immediately rolled back to previous versions - any customer-oriented organization would do so. New code should be tested on the Beta site, NOT production. If that means Fermi might have to be delayed, so be it." It's possible, but I honestly don't read it that way. From the front page news, the SETI staff were wrestling with the radar blanking up until 9pm local time on Friday evening. They seem to have had some success: there is still some data being split now, at 7:30am the following morning. The automated pipeline is feeding tapes into the splitter queue - slowly, but it's a start. And when you say 'Berkeley', who exactly do you mean? Matt, Eric, and the rest of the SETI crew? Their strength and skill - and paymaster - is Radio Astronomy: I would expect them to concentrate on the work supply and the radar blanking. Quotas and suchlike are the province of BOINC: the relevant section was tested at Beta, but it was written and installed by David Anderson. Some glitches showed up: they've been reported, and we'll carry on working through them. Testers, as always, are needed - particularly ones with the patience to observe, and the skill to report accurately. In the meantime, it's a sunny weekend. I've just been out for a walk, and - having only bobbed in for a glass of water - I'm going out again. I suggest we let both the SETI staff, and the BOINC staff, have the weekend off as well. But rest assured that everything I've been able to work out over the weekend, with the help of any actual information gleaned here, will hit their inboxes as they arrive for work on Monday morning. With any luck, we'll be able to explain the source of the problem which is, I suspect, still a mystery to them. E.T. can wait a couple more days.... |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Quoting Cruncher-American: "Yes - I read that thread, too. The real problem seems to be stubbornness on the part of the Berkeley folks. When these errors surfaced, they should have immediately rolled back to previous versions - any customer-oriented organization would do so. New code should be tested on the Beta site, NOT production. If that means Fermi might have to be delayed, so be it." To tell the truth, I said "Berkeley" because I wasn't sure whodunit...yes, ET can certainly wait a few more days (or millennia, for that matter). I was just hoping for a quick rollback so I could get more MB for my 2 rigs, is all. |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Sorry, not buying...or selling, for that matter. Not angry, except at what certainly looks like stubborn incompetence right now...I know the guys have lots of probs they have no control over (old h/w, network probs, lack of $$$ and manpower), but that's no excuse (IMO) for the current MB problems, which are entirely self-inflicted. And I have contributed what limited $$$ I can afford. I give 'em props for the usual amount of effort they put in, but NOT for this one. |
W-K 666 Joined: 18 May 99 Posts: 19085 Credit: 40,757,560 RAC: 67 |
From observation only, the problem is not Seti or BOINC but Nvidia. Nvidia has introduced the Fermi GPU, which has to be regarded as a second-generation CUDA device, and has not informed the world that in many cases it is not compatible with some CUDA applications. If you read some of the hardware review sites you will find that some of their standard applications, including games, fail on the Fermi cards. And as Nvidia did not tell Berkeley about the changes, BOINC does not detect Fermi cards as a new unknown device, and subsequently the Seti CUDA app was allowed to run on these Fermi GPUs. This resulted in thousands of incorrect -9 overflows, which either incorrectly validated if both tasks were done on Fermi GPUs, or caused an inconclusive result and a third issue of the task, which sometimes meant the correct result was rejected. This harmed the science, and Seti was forced into introducing a very little-tested Fermi app and asking the BOINC people to make changes, which again have had to be rushed with little testing. The only other option would have been to stop all MB tasks until the problems were fully tested and resolved. They were damned no matter which option they chose. So, patience my friends; if you want to vent your feelings, go to the Nvidia site. [edit] gotta go http://www.youtube.com/watch?v=DYTwmKOHcGY&feature=related calls |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Quoting W-K 666: "From observation only, the problem is not Seti or BOINC but Nvidia. Nvidia has introduced the Fermi gpu, which has to be regarded as a second generation CUDA device, and not informed the world that in many cases it is not compatible with some CUDA applications." If true, the server side code could have been upgraded to reject results from Fermi cards until the problem has been solved and send msgs to Fermi users warning them about the situation; no need to screw around with any other changes until then. (The results files tell what card was used, I believe, so it couldn't have been too difficult - at least, I assume so). |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
Quoting W-K 666: "From observation only, the problem is not Seti or BOINC but Nvidia. Nvidia has introduced the Fermi gpu, which has to be regarded as a second generation CUDA device, and not informed the world that in many cases it is not compatible with some CUDA applications." Yes, it's true - and a server code update is precisely what they did. Weren't you the one who wanted a server rollback just now? You can't have it both ways. In fact, it's even more ironic than WinterKnight made it sound. Programming a CUDA card - basically, a parallel supercomputer on a chip - is fiendishly complicated. Just ask Jason - he's having to dig out textbooks appropriate to multi-million dollar Crays from the 1980s. There's no way the Berkeley (SETI) staff could have learned to program these beasties while keeping on top of everything else. And they didn't - it was NVidia themselves who wrote the applications to run on their own hardware, and donated them to SETI. So it's NVidia's own code, in the cuda and cuda23 variants, which has turned out to be incompatible with the Fermi. That's what caused the pollution of the science results database. Since NVidia have now donated a third, Fermi-compatible, version of their program - which was being tested at Beta, with good results, when all this blew up - it makes sense to deploy it. But there are teething troubles..... |
Hellsheep Joined: 12 Sep 08 Posts: 428 Credit: 784,780 RAC: 0 |
Quoting W-K 666: "From observation only, the problem is not Seti or BOINC but Nvidia. Nvidia has introduced the Fermi gpu, which has to be regarded as a second generation CUDA device, and not informed the world that in many cases it is not compatible with some CUDA applications." As we all know though, you can only grow so many teeth ;) Eventually these problems will be resolved and we just have to be patient and wait for them to do their best to fix the rest of the current issues. :) - Jarryd |
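
For readers trying to follow the validation problem discussed in the thread (two matching bad results validating each other, a mismatch triggering a third issue of the task, and the suggestion to reject Fermi results on the server side), here is a minimal sketch in Python. It is not BOINC's or SETI@home's actual validator. The `Result` fields, the `cuda_fermi` app name, the quorum of 2 and the `validate` function are all assumptions made up for illustration; only the `cuda` and `cuda23` app names come from the posts above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    host: str
    gpu: str        # hypothetical label for the card family, e.g. "fermi"
    app: str        # "cuda" and "cuda23" are named in the thread; others are assumed
    signature: str  # stand-in for the science output being compared


# Assumption: name of an app version that is safe to run on Fermi cards.
FERMI_SAFE_APPS = {"cuda_fermi"}


def is_suspect(result):
    """True if a Fermi card ran an app that was not built for it."""
    return result.gpu == "fermi" and result.app not in FERMI_SAFE_APPS


def validate(results, quorum=2, filter_suspect=False):
    """Toy quorum check: a signature is valid once `quorum` results agree.

    Returns ("valid", signature), ("inconclusive", None) when enough
    results exist but they disagree, or ("need_more", None) when too
    few usable results are left.
    """
    pool = [r for r in results if not (filter_suspect and is_suspect(r))]
    if len(pool) < quorum:
        return ("need_more", None)
    signature, votes = Counter(r.signature for r in pool).most_common(1)[0]
    if votes >= quorum:
        return ("valid", signature)
    return ("inconclusive", None)


bad_a = Result("host1", "fermi", "cuda23", "overflow_-9")
bad_b = Result("host2", "fermi", "cuda", "overflow_-9")
good = Result("host3", "gt200", "cuda23", "signals_A")

print(validate([bad_a, bad_b]))                       # two bad results agree
print(validate([bad_a, good]))                        # mismatch, third task needed
print(validate([bad_a, good, bad_b]))                 # third result outvotes the good one
print(validate([bad_a, good, bad_b], filter_suspect=True))
```

Under these assumptions, the unfiltered runs reproduce the three outcomes described in the thread, while the filtered run leaves only one usable result and simply asks for more work instead of validating the overflow.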