Cuda is as Cuda does,,,,,,,,,,,,,,,,,

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 852185 - Posted: 11 Jan 2009, 12:42:49 UTC - in response to Message 852184.  

And they obviously aren't being truthful and telling everything about the cluster of a forced release of cuda well before it was ready for a production environment, so what else are they not saying??

Much has not been said....................and will never see the light of day........
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 852185
Beau
Joined: 24 Feb 08
Posts: 50
Credit: 129,080
RAC: 0
United States
Message 852186 - Posted: 11 Jan 2009, 12:46:47 UTC - in response to Message 852185.  

Which is sad on what is supposed to be a public project
ID: 852186
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 852188 - Posted: 11 Jan 2009, 12:48:45 UTC - in response to Message 852186.  

Which is sad on what is supposed to be a public project

ROFLMAO.......
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 852188
Beau
Joined: 24 Feb 08
Posts: 50
Credit: 129,080
RAC: 0
United States
Message 852189 - Posted: 11 Jan 2009, 12:49:46 UTC - in response to Message 852188.  

Was that not a correct statement? :-)
ID: 852189
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 852190 - Posted: 11 Jan 2009, 12:57:21 UTC

I bull<<<< a lot..........
But I support this project with my whole heart and soul......you know I do.........this Cuda thingy will pass..............but not before it pisses off enough users that they leave the project..............
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 852190
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 852191 - Posted: 11 Jan 2009, 13:05:00 UTC

I still think the idea is good, but it shouldn't have hit main as soon as it did.

From my point of view it will be a truly awesome idea once this crashing WU issue has been resolved, or should I say isolated, so that we get work that validates in 100% of cases regardless of GPU or CPU.

Once that has been fixed, I really hope they can produce a client that can process those mighty AP units, because 15 hours on a relatively fast PC is quite heavy indeed. If that client could be written for the GPU and worked 100%, it would be really nice.

Then I could start doing some APs again ;)

Kind regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 852191
Beau
Joined: 24 Feb 08
Posts: 50
Credit: 129,080
RAC: 0
United States
Message 852192 - Posted: 11 Jan 2009, 13:05:43 UTC - in response to Message 852190.  

Of course it will pass, eventually. It is obvious that you are dedicated to the project; just your amount of credits on seti alone tells that. You are definitely one of the good ones...
ID: 852192
the silver surfer
Joined: 24 Feb 01
Posts: 131
Credit: 3,739,307
RAC: 0
Austria
Message 852194 - Posted: 11 Jan 2009, 13:13:29 UTC - in response to Message 851921.  
Last modified: 11 Jan 2009, 13:17:11 UTC

Richard

[quote]We can have no idea of the overall CUDA rate from Matt's "currently roughly" remark. I suspect some snapshots would show lower, others higher - but how much lower or higher, neither of us has any way of knowing. An alternative factoid would be the List of recently connected client types, currently showing 14.25% Windows v6.4.5 (and hence potentially CUDA-capable).[/quoted]

I'm using 6.4.5 WITHOUT CUDA, just to be up to date with BOINC.

[quote]As it happens, I strongly agree with Mark that the CUDA release was a deeply flawed technical exercise. He has indicated in public that he has been told the behind-the-scenes reason for the release, but has been forbidden to repeat the information in public. I have heard much the same story in private from other people. It deeply saddens me that such secrecy has come about in an international, academic, scientific, publicly-funded (as in us, the public) research effort as this.

I think that both the technical, and the public relations, aspects of the CUDA release have been extremely poor, and need urgent remedial action. It spoke volumes that no member of the SETI@home project staff was prepared to be named, or quoted, in NVidia's press release on SETI CUDA launch day.[/quoted]

Exactly my point of view...........

Regards

Kurt

BTW: What's wrong with my BBCode ???????????

ID: 852194
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 852195 - Posted: 11 Jan 2009, 13:13:38 UTC - in response to Message 852191.  

I still think the idea is good, but it shouldn't have hit main as soon as it did.

From my point of view it will be a truly awesome idea once this crashing WU issue has been resolved, or should I say isolated, so that we get work that validates in 100% of cases regardless of GPU or CPU.

Once that has been fixed, I really hope they can produce a client that can process those mighty AP units, because 15 hours on a relatively fast PC is quite heavy indeed. If that client could be written for the GPU and worked 100%, it would be really nice.

Then I could start doing some APs again ;)

Kind regards Vyper

Vyp........I could not agree with you more........

I have been this project's most staunch supporter for a long time now..........



But this just cuts to the quick........



Allowing invalid science into the mix due to commercial interests is wrong.......

And do not try to tell me that is not what has gone on......I know.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 852195
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 852198 - Posted: 11 Jan 2009, 13:26:01 UTC - in response to Message 852180.  

Maybe they already cashed the check and had to launch...? Just a thought...

The largest donation in December 2008 was only $1000.00, so it can't have been a very big check.... ;-)

Richard if it was made via other sources, it might not appear on that list. That list is only what they want us to see.

Maybe nVidia paid for the Overland Storage rack? :p The two of them happened pretty close to each other.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 852198
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 852201 - Posted: 11 Jan 2009, 13:49:39 UTC - in response to Message 852194.  
Last modified: 11 Jan 2009, 14:29:47 UTC



Kurt

BTW: What's wrong with my BBCode ???????????


You have a trailing 'd' in your closing tag.

Back on topic;

Well, I'm not too thrilled with the way CUDA made its rollout (AP for that matter either, and it was far better tested), but let's keep the impact in perspective.

The CUDA VLAR problem is really just an annoyance for the project, but can be a real big problem for participants. The reason is the task will not validate and will go to a new replication, so there is only a minimal chance of MSD corruption. The main issue is that it can lead to a string of compute errors until a reboot, at best, up to a project state corruption and/or a graphics driver crash at worst (with all the attendant hazards that brings).

The VHAR issue is the more serious of the two IMHO (at least for the project). The reason is the CUDA host does finish the task successfully, but with an erroneous Dash 9. The problem is that even though the WU will go to an extra replication, I have seen the faulty Dash 9 frequently still pass validation as weakly similar, and in one case I saw the third replication go to another CUDA host and the WU went into the MSD with a Dash 9 as canonical, and my host got Zedded (even though it had made a full length run and found absolutely nothing in the WU). One other thing, I noted one case of the erroneous Dash 9 outside what is commonly considered the VHAR range (something in the .4's IIRC).

There has also been a report of HD MBR corruption when running CUDA. However, I'm more inclined to believe this was a case of the host not really having the reserve PSU/cooling punch to handle running the graphics card and the CPU flat out at the same time (with the cc_config tweak). Keep in mind that the graphics driver is kernel level, and if the machine goes insane because the PSU sags or 'glitches', the card and/or the rest of the system overheats, etc., anything is possible.

That's why I decided to start culling out CUDA hosts from my cache, unless it has reported already and I can take a look at its stderr report first.

So in summary, I'd have to say that the biggest mistake in the rollout was to set the project prefs to use it by default. If it had been opt-in, at least people would have had to make a conscious decision to partake in the experiment and thus hopefully be more inclined to monitor the machine to make sure it wasn't going renegade.

Alinator
ID: 852201
Wandering Willie
Volunteer tester
Joined: 19 Aug 99
Posts: 136
Credit: 2,127,073
RAC: 0
United Kingdom
Message 852218 - Posted: 11 Jan 2009, 15:01:00 UTC
Last modified: 11 Jan 2009, 15:06:14 UTC

IMHO, you are no doubt always welcome to come over and give us a hand in testing products, instead of starting continual threads on how bad they are. Then they might come up to your expectations.

Also take into consideration that there is the possibility of receiving fewer or even no credits for WUs completed.

And of working on a project only to find a whole batch of WUs cancelled.

SETI@HOME BETA

As at 21:10:24 GMT 10/01/2009 there were 6,159 users of which only 1,440 were active.

Michael.
ID: 852218
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 852222 - Posted: 11 Jan 2009, 15:31:11 UTC - in response to Message 852201.  

I do, and any mildly sane SETI enthusiast should, agree completely with Alinator's comment below about the CUDA rollout. Nobody is against progress, but in this case the CUDA rollout has been a black eye for no actual benefit. Isn't there anyone in charge of change control out there in boincland, or is everyone a project-management amateur? Maybe nVidia had their wallets open to force things forward before the process was better tested?

(The only reason I'm not wildly upset is that I can't afford a high end graphics card and suspect most volunteers are like me.)
ID: 852222
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 852230 - Posted: 11 Jan 2009, 15:45:23 UTC
Last modified: 11 Jan 2009, 16:02:55 UTC

I just got done with my morning data logging and came across another CUDA Dash 9 outside the VHAR range.

AR was 0.387354.

@PhonAcq: Agreed. If I had coughed up the cake for a GTX 200 series card, I wouldn't be too inclined to run BOINC on it, because if I buy a high-end component like that, I already have a job for it which is going to take up all of its time.

The problem is that there are mainstream CUDA cards (1-200 dollar class), but those are the ones more likely to be marginal for things like cooling, and more likely to be installed in machines that don't have the reserve capacity to push them to their limits reliably. The combination of the two is just asking for trouble (IMHO).

Alinator
ID: 852230
Beau
Joined: 24 Feb 08
Posts: 50
Credit: 129,080
RAC: 0
United States
Message 852238 - Posted: 11 Jan 2009, 16:03:24 UTC - in response to Message 852230.  

I was thinking that exact same thing Cosmic_Ocean
ID: 852238
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14674
Credit: 200,643,578
RAC: 874
United Kingdom
Message 852240 - Posted: 11 Jan 2009, 16:04:47 UTC - in response to Message 852230.  

I just got done with my morning data logging and came across another CUDA Dash 9 outside the VHAR range.

If you look at the full task list for that host - fortunately it's quite short - you see a typical pattern.

First, some good work (three tasks showing at the moment)

Then, a couple of errors - both VLAR (you have to track the wingmate for one)

Next, three 'false' dash 9s, including yours. I think it's the earlier errors, rather than the AR of the task, which caused the dash 9s.

Finally, the user seems to have walked away (or his rig blew up), leaving four unfinished tasks in his cache.
ID: 852240
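
Richard's reading of that task list (good work, then VLAR errors, then 'false' dash 9s) can be sketched as a simple scan over one host's task history. This is only an illustration under stated assumptions: the outcome labels and the `flag_suspect_overflows` helper are hypothetical and are not part of BOINC or the SETI@home validator, and treating every -9 after an error as suspect until the next clean result is a guess at the pattern, not a confirmed rule.

```python
from typing import List

# Hypothetical outcome labels for one host's tasks, oldest first.
OK = "ok"            # finished and validated normally
ERROR = "error"      # compute error (for example on a VLAR task)
OVERFLOW = "dash9"   # result overflow (-9)


def flag_suspect_overflows(history: List[str]) -> List[int]:
    """Return the indices of -9 results that follow a compute error.

    Assumption: once a host has errored, later -9 results are treated
    as suspect until a clean result is seen again.
    """
    suspect = []
    errored = False
    for i, outcome in enumerate(history):
        if outcome == ERROR:
            errored = True
        elif outcome == OK:
            errored = False
        elif outcome == OVERFLOW and errored:
            suspect.append(i)
    return suspect


# The pattern described above: three good tasks, two errors, three -9s.
history = [OK, OK, OK, ERROR, ERROR, OVERFLOW, OVERFLOW, OVERFLOW]
print(flag_suspect_overflows(history))  # [5, 6, 7]
```

A -9 on a host with no earlier error is left alone, since a genuine overflow is still a legitimate result.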
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 852250 - Posted: 11 Jan 2009, 16:22:15 UTC - in response to Message 852240.  

Man I sure hope no one looks at my task list. I messed up a bunch of WUs getting this thing running. :)

I did get some in the .005713 range that -9ed, but I don't know if that's the VLAR or me messing with the stuff. I was trying a bunch of different things at that time.


PROUD MEMBER OF Team Starfire World BOINC
ID: 852250
Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,979,050
RAC: 0
Malaysia
Message 852254 - Posted: 11 Jan 2009, 16:28:51 UTC

Actually I had been wanting to upgrade my graphics card, and S@H with CUDA is a great temptation to turn over to the dark side. Unfortunately it seems that the whole launch was botched, like most recent nVidian misadventures.

I am not a software developer, but letting CUDA results validate a work unit by majority, whenever two CUDA clients return the same result, before the client is proven stable is sadly very damaging to the scientific case that is the reason so many of us have donated hardware and electricity to SETI over the years. Perhaps the lesson to be learnt is that validation should be scripted so that a result is not validated and integrated if the agreement rests only on a new (beta-ish) client, without concurrence from a CPU client. I certainly hope the team takes it upon itself to ensure that the scientific results are whiter than white before their next publication. The credibility of 10 years of science is too high a price to pay for whatever undisclosed reason CUDA was steamrolled in...
ID: 852254
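
Chelski's suggestion above, that agreement between results from a new client should not count without concurrence from a CPU client, can be sketched roughly as follows. This is a minimal illustration, not the project's validator: the `Result` type, the `platform` labels, the `can_validate` helper and the default quorum of 2 are all hypothetical, and comparing results by an exact `signature` string is a simplification of the tolerance-based comparison a real validator would use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Result:
    host_id: int
    platform: str   # "cpu" or "cuda" (hypothetical labels)
    signature: str  # stand-in for the science payload being compared


def can_validate(results: List[Result], quorum: int = 2) -> bool:
    """Accept a work unit only if some group of agreeing results
    reaches the quorum AND contains at least one CPU result."""
    groups: Dict[str, List[Result]] = {}
    for r in results:
        groups.setdefault(r.signature, []).append(r)
    for agreeing in groups.values():
        has_cpu = any(r.platform == "cpu" for r in agreeing)
        if len(agreeing) >= quorum and has_cpu:
            return True
    return False


# Two CUDA hosts agreeing on an overflow is not enough on its own.
cuda_only = [Result(1, "cuda", "overflow"), Result(2, "cuda", "overflow")]
# One CUDA host and one CPU host agreeing is accepted.
mixed = [Result(1, "cuda", "clean"), Result(3, "cpu", "clean")]
print(can_validate(cuda_only), can_validate(mixed))  # False True
```

Under this rule a work unit returned by two CUDA hosts would simply wait for another replication until a CPU host either confirms or contradicts them.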
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 852256 - Posted: 11 Jan 2009, 16:33:25 UTC - in response to Message 852240.  
Last modified: 11 Jan 2009, 16:46:53 UTC

I just got done with my morning data logging and came across another CUDA Dash 9 outside the VHAR range.

If you look at the full task list for that host - fortunately it's quite short - you see a typical pattern.

First, some good work (three tasks showing at the moment)

Then, a couple of errors - both VLAR (you have to track the wingmate for one)

Next, three 'false' dash 9s, including yours. I think it's the earlier errors, rather than the AR of the task, which caused the dash 9s.

Finally, the user seems to have walked away (or his rig blew up), leaving four unfinished tasks in his cache.


Yep, I saw that pattern, but the curious part was the length of time between the last two reports. I suppose interim ones could have been purged already, but it doesn't seem to have a lot of throughput for a host of its class.

This suggests that it gets shut off, which should have cleared the graphics memory corruption from the VLAR going forward. Perhaps it just gets hibernated when not being used.

<EDIT> Forget what I just said.... What was I thinking!!?? <give self slap on head>

LOL...

Too many tabs, not enough coffee. ;-)

Alinator
ID: 852256
Beau
Joined: 24 Feb 08
Posts: 50
Credit: 129,080
RAC: 0
United States
Message 852258 - Posted: 11 Jan 2009, 16:38:37 UTC - in response to Message 852256.  

SETI's credibility is already going downhill as a result of the mismanaged CUDA force-feeding and the deafening silence from the admins on why it was screwed up so badly. Why don't they just come clean and explain exactly what happened in the dark corner of the parking garage where the whole deal went down with a single handshake and the passing of an envelope?
ID: 852258
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.