Astropulse! Finally! =D

Hellsheep
Volunteer tester
Joined: 12 Sep 08
Posts: 428
Credit: 784,780
RAC: 0
Australia
Message 1003605 - Posted: 12 Jun 2010, 13:25:56 UTC

I've been waiting for the day I start to get AP work units! I've now got 11, and they're downloading on every request now. It seems I get an AP work unit with every request, but nothing else. :)


- Jarryd
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1003607 - Posted: 12 Jun 2010, 13:30:57 UTC - in response to Message 1003605.  
Last modified: 12 Jun 2010, 13:31:33 UTC

Yes - apparently, they have screwed up MB distribution, if I read the threads about "new" error messages correctly. And, apparently, they aren't about to roll back whatever server changes have been made that are causing the problem. Which sucks.
Hellsheep
Volunteer tester
Joined: 12 Sep 08
Posts: 428
Credit: 784,780
RAC: 0
Australia
Message 1003608 - Posted: 12 Jun 2010, 13:32:47 UTC - in response to Message 1003607.  

Yes - apparently, they have screwed up MB distribution, if I read the threads about "new" error messages correctly. And, apparently, they aren't about to roll back whatever server changes have been made that are causing the problem. Which sucks.



http://setiathome.berkeley.edu/forum_thread.php?id=60285

That's a great read about the issues going on at the moment. There is a lot of speculation as to what's going on. Richard has been doing some testing and writing up his findings on the issue with the quota of 100 per day.
- Jarryd
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1003611 - Posted: 12 Jun 2010, 13:39:14 UTC - in response to Message 1003608.  

Yes - I read that thread, too. The real problem seems to be stubbornness on the part of the Berkeley folks. When these errors surfaced, they should have immediately rolled back to previous versions - any customer-oriented organization would do so. New code should be tested on the Beta site, NOT production. If that means Fermi might have to be delayed, so be it.
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 1003618 - Posted: 12 Jun 2010, 14:09:05 UTC

I do agree that the server software here should have been rolled back when the problems surfaced regarding the new Fermi cards. Changes should have been tested on the Beta site first.

Also, as Matt described in this post and as mentioned on the front page, there is a problem in removing the radar interference, which is delaying the MB work.

So Berkeley has many problems to solve right now. Many of them are self-inflicted. Patience is required of us.
Boinc....Boinc....Boinc....Boinc....
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1003625 - Posted: 12 Jun 2010, 14:26:26 UTC - in response to Message 1003618.  

The new Fermi app was tested on the Beta site. It was brought over a little early because of all the problems people were having trying to run Fermi cards on the main site without reading up on how to get them to work right. That is not Berkeley's fault. The problems with the tasks pages are just teething problems in getting an improvement working right. They will get worked out when the staff have time. I have no idea where the "quota limit reached" problem came from, but I'm sure it will be fixed soon as well.

Please be patient! The guys are working on these problems and will get us back to normal as fast as they can!


PROUD MEMBER OF Team Starfire World BOINC
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1003631 - Posted: 12 Jun 2010, 14:41:20 UTC - in response to Message 1003611.  

Yes - I read that thread, too. The real problem seems to be stubbornness on the part of the Berkeley folks. When these errors surfaced, they should have immediately rolled back to previous versions - any customer-oriented organization would do so. New code should be tested on the Beta site, NOT production. If that means Fermi might have to be delayed, so be it.

It's possible, but I honestly don't read it that way.

From the front page news, the SETI staff were wrestling with the radar blanking up until 9pm local time on Friday evening. They seem to have had some success: there is still some data being split now, at 7:30am the following morning. The automated pipeline is feeding tapes into the splitter queue - slowly, but it's a start.

And when you say 'Berkeley', who exactly do you mean? Matt, Eric, and the rest of the SETI crew? Their strength and skill - and paymaster - is Radio Astronomy: I would expect them to concentrate on the work supply and the radar blanking. Quotas and suchlike are the province of BOINC: the relevant section was tested at Beta, but it was written and installed by David Anderson. Some glitches showed up: they've been reported, and we'll carry on working through them. Testers, as always, are needed - particularly ones with the patience to observe, and the skill to report accurately.

In the meantime, it's a sunny weekend. I've just been out for a walk, and - having only bobbed in for a glass of water - I'm going out again. I suggest we let both the SETI staff, and the BOINC staff, have the weekend off as well. But rest assured that everything I've been able to work out over the weekend, with the help of any actual information gleaned here, will hit their inboxes as they arrive for work on Monday morning. With any luck, we'll be able to explain the source of the problem which is, I suspect, still a mystery to them.

E.T. can wait a couple more days....
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1003638 - Posted: 12 Jun 2010, 15:13:13 UTC - in response to Message 1003631.  

To tell the truth, I said "Berkeley" because I wasn't sure whodunit... yes, ET can certainly wait a few more days (or millennia, for that matter). I was just hoping for a quick rollback so I could get more MB for my 2 rigs, is all.
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1003648 - Posted: 12 Jun 2010, 15:47:27 UTC - in response to Message 1003639.  



E.T. can wait a couple more days....


From all the angry, impatient, and sometimes downright evil comments towards those at Berkeley, who IMO do an excellent job with the resources they have, I must conclude that at least some people here have found a way to convert their SETI credits into real money.

There can be no other explanation really, since some people react as if any downtime or lack of WUs directly affects their wallet or bank account.

Can those people please tell me how to convert SETI credit to some worldly currency?

Sten-Arne


Sorry, not buying...or selling, for that matter. Not angry, except at what certainly looks like stubborn incompetence right now...I know the guys have lots of probs they have no control over (old h/w, network probs, lack of $$$ and manpower), but that's no excuse (IMO) for the current MB problems, which are entirely self-inflicted.

And I have contributed what limited $$$ I can afford.

I give 'em props for the usual amount of effort they put in, but NOT for this one.
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 19080
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1003649 - Posted: 12 Jun 2010, 15:55:02 UTC
Last modified: 12 Jun 2010, 16:04:30 UTC

From observation only, the problem is not Seti or BOINC but Nvidia. Nvidia has introduced the Fermi GPU, which has to be regarded as a second-generation CUDA device, and has not informed the world that in many cases it is not compatible with some CUDA applications.
If you read some of the hardware review sites you will find that some of their standard applications, including games, fail on the Fermi cards.

And as Nvidia did not tell Berkeley about the changes, BOINC does not detect Fermi cards as a new, unknown device, and consequently the Seti CUDA app was allowed to run on these Fermi GPUs.
This resulted in thousands of incorrect -9 overflows. These either validated incorrectly, if both tasks were done on Fermi GPUs, or caused an inconclusive result and a third issue of the task, which sometimes meant the correct result was rejected. This harmed the science, and Seti was forced into introducing a barely tested Fermi app and asking the BOINC people to make changes, which again had to be rushed with little testing.
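
The failure mode described above can be illustrated with a minimal sketch of pairwise agreement checking. This is an illustration only, not the actual BOINC or SETI@home validator: the `Result` class, the `agrees` and `validate` functions, the host labels, and the signal counts are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Result:
    host: str       # hypothetical host label
    signals: int    # hypothetical summary of the task output
    overflow: bool  # True when the app stopped early with a -9 overflow


def agrees(a: Result, b: Result) -> bool:
    """Two results agree when their (simplified) outputs match."""
    return a.overflow == b.overflow and a.signals == b.signals


def validate(results: List[Result]) -> Optional[Result]:
    """Return the first result that agrees with a later one.

    None means the set is inconclusive and another copy must be issued.
    """
    for i, a in enumerate(results):
        for b in results[i + 1:]:
            if agrees(a, b):
                return a
    return None


good = Result("cpu-host", signals=3, overflow=False)
good2 = Result("cpu-host-2", signals=3, overflow=False)
bad1 = Result("fermi-1", signals=30, overflow=True)
bad2 = Result("fermi-2", signals=30, overflow=True)

# Two bad results that agree with each other are accepted.
both_fermi = validate([bad1, bad2])
# One good and one bad result is inconclusive, so a third copy goes out.
mixed = validate([good, bad1])
# If the third copy also lands on a Fermi card, the good result loses.
third_is_fermi = validate([good, bad1, bad2])
# If the third copy lands on a working host, the good result wins.
third_is_good = validate([good, bad1, good2])
```

The check only measures agreement between hosts, not correctness, which is why a batch of identical wrong answers can pass.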

The only other option would have been to stop all MB tasks until the problems were fully tested and resolved.

They were damned no matter which option they chose.

So, patience my friends. If you want to vent your feelings, go to the Nvidia site.

[edit] gotta go http://www.youtube.com/watch?v=DYTwmKOHcGY&feature=related calls
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1003656 - Posted: 12 Jun 2010, 16:36:41 UTC - in response to Message 1003649.  

If true, the server side code could have been upgraded to reject results from Fermi cards until the problem has been solved and send msgs to Fermi users warning them about the situation; no need to screw around with any other changes until then. (The results files tell what card was used, I believe, so it couldn't have been too difficult - at least, I assume so).
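
A minimal sketch of that kind of server-side filter, under loud assumptions: the dictionary fields (`app`, `device`), the list of device-name markers, and the function names are invented for illustration and do not reflect how the real server stores or checks results. Only the app names `cuda` and `cuda23` come from this thread.

```python
from typing import Dict, List, Tuple

# Assumed, illustrative list of Fermi device-name markers.
FERMI_MARKERS = ("GTX 470", "GTX 480")

# App variants reported in this thread as incompatible with Fermi.
PRE_FERMI_APPS = ("cuda", "cuda23")


def is_fermi(device_name: str) -> bool:
    """Guess from the reported device name whether the card is a Fermi."""
    return any(marker in device_name for marker in FERMI_MARKERS)


def partition_results(
    results: List[Dict[str, str]],
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split results into (accepted, held).

    A result is held when a pre-Fermi app variant ran on a Fermi card.
    """
    accepted: List[Dict[str, str]] = []
    held: List[Dict[str, str]] = []
    for result in results:
        suspect = result["app"] in PRE_FERMI_APPS and is_fermi(result["device"])
        (held if suspect else accepted).append(result)
    return accepted, held
```

Held results could then be re-issued to other hosts, and the owners of the affected hosts sent a notice, rather than letting the results reach validation.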
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1003659 - Posted: 12 Jun 2010, 17:07:08 UTC - in response to Message 1003656.  

Yes, it's true - and a server code update is precisely what they did. Weren't you the one who wanted a server rollback just now? You can't have it both ways.

In fact, it's even more ironic than WinterKnight made it sound. Programming a CUDA card - basically, a parallel supercomputer on a chip - is fiendishly complicated. Just ask Jason - he's having to dig out textbooks appropriate to multi-million dollar Crays from the 1980s. There's no way the Berkeley (SETI) staff could have learned to program these beasties while keeping on top of everything else. And they didn't - it was NVidia themselves who wrote the applications to run on their own hardware, and donated them to SETI.

So it's NVidia's own code, in the cuda and cuda23 variants, which has turned out to be incompatible with the Fermi. That's what caused the pollution of the science results database. Since NVidia have now donated a third, Fermi-compatible, version of their program - which was being tested at Beta, with good results, when all this blew up - it makes sense to deploy it.

But there are teething troubles.....
Hellsheep
Volunteer tester
Joined: 12 Sep 08
Posts: 428
Credit: 784,780
RAC: 0
Australia
Message 1003662 - Posted: 12 Jun 2010, 17:09:28 UTC - in response to Message 1003659.  

As we all know though, you can only grow so many teeth ;) Eventually these problems will be resolved and we just have to be patient and wait for them to do their best to fix the rest of the current issues. :)

- Jarryd
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.