Problem with 'cuda_fermi' WU URGENT!!!!

Message boards : Number crunching : Problem with 'cuda_fermi' WU URGENT!!!!
Profile Igogo Project Donor
Volunteer tester
Joined: 18 Dec 04
Posts: 125
Credit: 65,303,299
RAC: 44
Thailand
Message 1372098 - Posted: 26 May 2013, 5:58:15 UTC

I use Windows 7 with a GTX 670 video card (320.18, the newest drivers), BOINC version 7.0.28 and Lunatics v0.40.
Since a few days ago, ALL my GPU WUs end immediately after they start, with the reason 'Computation error'. I'm in PANIC mode NOW. What happened? How can I fix this issue? Thank you in advance.
PS. Sorry for my English.
PPS. If you need additional info, please tell me.
ID: 1372098 · Report as offensive
Profile Tim
Volunteer tester
Joined: 19 May 99
Posts: 211
Credit: 278,575,259
RAC: 0
Greece
Message 1372100 - Posted: 26 May 2013, 6:16:11 UTC - in response to Message 1372098.  

ID: 1372100 · Report as offensive
Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1372101 - Posted: 26 May 2013, 6:30:12 UTC

I use Windows 7 with a GTX 670 video card (320.18, the newest drivers), BOINC version 7.0.28 and Lunatics v0.40.
Since a few days ago, ALL my GPU WUs end immediately after they start, with the reason 'Computation error'. I'm in PANIC mode NOW. What happened? How can I fix this issue?

Did you do that?

http://setiathome.berkeley.edu/forum_thread.php?id=69735#1296126

If you are running the Lunatics optimized apps, the environment variable described in the sticky thread won't help. That fix is for Kepler cards running the outdated stock GPU application and is not needed for cards running the newer optimized applications.

We can't see your computers or tasks to analyze the errors because they are hidden. Set your SETI@home preferences to:
Should SETI@home show your computers on its web site? yes

Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1372101 · Report as offensive
Profile Igogo Project Donor
Volunteer tester
Joined: 18 Dec 04
Posts: 125
Credit: 65,303,299
RAC: 44
Thailand
Message 1372103 - Posted: 26 May 2013, 6:37:19 UTC - in response to Message 1372101.  

I did it
ID: 1372103 · Report as offensive
Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1372104 - Posted: 26 May 2013, 6:49:34 UTC

They still show as hidden, so you might want to try that again.
Another Fred
ID: 1372104 · Report as offensive
Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1372110 - Posted: 26 May 2013, 7:46:44 UTC - in response to Message 1372103.  

I did it


Can you link us to one of the errored tasks instead?

ID: 1372110 · Report as offensive
Profile Igogo Project Donor
Volunteer tester
Joined: 18 Dec 04
Posts: 125
Credit: 65,303,299
RAC: 44
Thailand
Message 1372116 - Posted: 26 May 2013, 8:41:53 UTC - in response to Message 1372110.  
Last modified: 26 May 2013, 8:43:07 UTC

Sure (for 7.1.1)
http://setiathome.berkeley.edu/result.php?resultid=3014253478
http://setiathome.berkeley.edu/result.php?resultid=3014253476
http://setiathome.berkeley.edu/result.php?resultid=3014253473

for 7.0.28
http://setiathome.berkeley.edu/result.php?resultid=3013393820

Is this what you need?
ID: 1372116 · Report as offensive
Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1372147 - Posted: 26 May 2013, 11:16:08 UTC

The computers are now visible - I don't know what caused the lag time.

Computer 6753440 is the one throwing all the errors after only a few seconds of run time. They say:

Exit status -1073741819 (0xffffffffc0000005) Unknown error number
and
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x6754E5DD read attempt to address 0x00000000

Can anyone help on this? The app is x41g.
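
For reference, the two codes quoted above are the same value: the negative exit status is the signed 32-bit view of the Windows status code 0xC0000005 (access violation). A minimal Python sketch of that conversion follows. The function name to_ntstatus is made up for illustration and is not part of BOINC or of the app.

```python
def to_ntstatus(exit_status: int) -> str:
    """Show a signed 32-bit exit status as an unsigned hex status code."""
    # Masking with 0xFFFFFFFF reinterprets the negative number as the
    # unsigned 32-bit value that Windows uses for status codes.
    return f"0x{exit_status & 0xFFFFFFFF:08X}"


print(to_ntstatus(-1073741819))  # prints 0xC0000005
```

This only decodes the number. It says nothing about which component caused the access violation.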
Another Fred
ID: 1372147 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1372150 - Posted: 26 May 2013, 11:37:52 UTC - in response to Message 1372147.  

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x6754E5DD read attempt to address 0x00000000

Can anyone help on this? The app is x41g.


Looks like serious driver/OS crashes, which could be due to a long list of possible issues. Just some of them:
- Some driver installation problem. It could even be the MS-update-induced one. Remove the driver in safe mode, then clean install (custom option) the new driver.
- GPU temperatures: check with a tool like EVGA Precision or MSI Afterburner.
- Card needs reseating.
- Power supply failing or insufficient.
- System memory failing or needing reseating, or BIOS settings.
- Application DLLs somehow corrupted? Check hard drive integrity and reinstall the app.
- Some motherboard utility for software monitoring or OC interfering.

Some worst cases that come to mind:
- Motherboard failing
- GPU failing

There are probably more basic and worst-case scenarios than these. Some diagnostics with other tools might help narrow down the culprit.

HTH
Jason


"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1372150 · Report as offensive
Profile Igogo Project Donor
Volunteer tester
Joined: 18 Dec 04
Posts: 125
Credit: 65,303,299
RAC: 44
Thailand
Message 1372156 - Posted: 26 May 2013, 12:56:37 UTC - in response to Message 1372150.  

PSU - very good and sufficient, I'm sure. SeaSonic X-660 (Gold).
GPU temperature - normal. I always monitor it.
I reinstalled the driver and am waiting for GPU WUs. If the error still occurs, maybe I will roll back to the previous version. It all began when I installed the new version.

Tell me, do I need to install from http://jgopt.org?
I'm an advanced PC user and know how to do a manual install.
Thanks
ID: 1372156 · Report as offensive
Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1372174 - Posted: 26 May 2013, 14:36:42 UTC - in response to Message 1372156.  

PSU - very good and sufficient, I'm sure. SeaSonic X-660 (Gold).
GPU temperature - normal. I always monitor it.
I reinstalled the driver and am waiting for GPU WUs. If the error still occurs, maybe I will roll back to the previous version. It all began when I installed the new version.

Tell me, do I need to install from http://jgopt.org?
I'm an advanced PC user and know how to do a manual install.
Thanks


I would definitely go with the CUDA 5.0 version, as your card and OS will support it and it should run quite well.

http://jgopt.org/download/x41zc_Winx64_cuda50.7z

ID: 1372174 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1372181 - Posted: 26 May 2013, 15:13:56 UTC
Last modified: 26 May 2013, 15:16:42 UTC

Yes, with the 670 the best one is the cuda50 build. You could run 2 or 3 WUs at a time; on my 670 I run 3 WUs at a time and it works perfectly.
ID: 1372181 · Report as offensive
Profile Igogo Project Donor
Volunteer tester
Joined: 18 Dec 04
Posts: 125
Credit: 65,303,299
RAC: 44
Thailand
Message 1372225 - Posted: 26 May 2013, 18:42:22 UTC - in response to Message 1372181.  

Still no GPU WUs
ID: 1372225 · Report as offensive
Profile Tim
Volunteer tester
Joined: 19 May 99
Posts: 211
Credit: 278,575,259
RAC: 0
Greece
Message 1372230 - Posted: 26 May 2013, 19:08:31 UTC
Last modified: 26 May 2013, 19:14:34 UTC

You set your computers to "hidden" again.

We can’t help you like that. Don’t worry, we will not steal anything.

Tim
ID: 1372230 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1372233 - Posted: 26 May 2013, 19:20:31 UTC - in response to Message 1372230.  

You set your computers to "hidden" again.

We can’t help you like that. Don’t worry, we will not steal anything.

Tim

And there's nothing of consequence that we can learn from them by them not being hidden. With them hidden, we can't alert you to problems if you're our wingman.

Claggy
ID: 1372233 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1372234 - Posted: 26 May 2013, 19:28:21 UTC - in response to Message 1372225.  

Still no GPU WUs

You errored out your GPU WUs from this morning:

Computer 6753440

And you're now down to 11 tasks per day for the SETI@home Enhanced (anonymous platform, NVIDIA GPU) app version. Once a new day starts (the time is randomised across hosts), you'll get some more tasks to try.

Claggy
ID: 1372234 · Report as offensive
Profile Igogo Project Donor
Volunteer tester
Joined: 18 Dec 04
Posts: 125
Credit: 65,303,299
RAC: 44
Thailand
Message 1372327 - Posted: 27 May 2013, 3:45:52 UTC - in response to Message 1372234.  

I think everything is OK now. I have GPU WUs and no errors.
I did the following:
1. Uninstall Lunatics.
2. Uninstall BOINC.
3. Clean all of BOINC's folders.
4. Uninstall the NVIDIA drivers (using Driver Cleaner).
5. Install the NVIDIA drivers.
6. Install BOINC.
7. Install Lunatics.
ID: 1372327 · Report as offensive
Profile Igogo Project Donor
Volunteer tester
Joined: 18 Dec 04
Posts: 125
Credit: 65,303,299
RAC: 44
Thailand
Message 1372328 - Posted: 27 May 2013, 3:50:25 UTC - in response to Message 1372327.  
Last modified: 27 May 2013, 3:52:53 UTC

As for http://jgopt.org:
Do I just copy the appropriate files here
C:\ProgramData\BOINC\projects\setiathome.berkeley.edu

and/or do something else, e.g. change any files like MBCuda.aistub?

Thanks.

juan BFB, for a long time I have crunched 5 WUs at a time on my GTX 670. It works perfectly.
I tried 6 WUs - not good. My PC froze periodically. Can you tell me the best number? Why don't you try 4 or 5 WUs? Maybe I don't understand something.
ID: 1372328 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1372336 - Posted: 27 May 2013, 4:14:34 UTC
Last modified: 27 May 2013, 4:15:29 UTC

You could crunch 5 or more WUs on a 670 (the actual maximum is set by the memory available on the GPU), but the overhead of doing that makes the total production worse than with 3. The trick is: use a program that shows you the GPU usage (GPU-Z, for example), then increase the number of WUs until the GPU usage reaches 98/99%. That is the optimal number; more makes your entire system slower. Or you could use a program like Fred's performance test; it runs the test for you automatically and indicates the optimal number. On my hosts that number is 3 on the 670. Maybe that number is different on your host due to your particular CPU/MB/GPU/OC combination, but you need to test to be sure about that.
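
The tuning procedure described here can be sketched in Python as follows. The load readings are hypothetical numbers entered by hand from a monitoring tool such as GPU-Z (they are not measurements from any host in this thread), and pick_task_count is an illustrative name. The last step assumes an anonymous-platform app_info.xml in which the count value of the GPU coproc entry is the fraction of a GPU reserved per task.

```python
def pick_task_count(load_by_count, target=98.0):
    """Return the smallest task count whose GPU load reaches the target.

    load_by_count maps "tasks running at once" to the observed GPU load
    in percent. If no count reaches the target, the count with the
    highest observed load is returned instead.
    """
    for count in sorted(load_by_count):
        if load_by_count[count] >= target:
            return count
    return max(load_by_count, key=load_by_count.get)


# Hypothetical readings, for illustration only.
readings = {1: 71.0, 2: 90.0, 3: 98.0, 4: 99.0, 5: 99.0}

best = pick_task_count(readings)
gpu_fraction = round(1 / best, 2)  # assumed per-task count value
print(best, gpu_fraction)
```

Picking the smallest count that reaches the target, rather than the count with the highest load, reflects the point that extra tasks beyond saturation only add overhead.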
ID: 1372336 · Report as offensive
Profile Igogo Project Donor
Volunteer tester
Joined: 18 Dec 04
Posts: 125
Credit: 65,303,299
RAC: 44
Thailand
Message 1372343 - Posted: 27 May 2013, 4:31:11 UTC - in response to Message 1372336.  
Last modified: 27 May 2013, 4:32:02 UTC

juan BFB, thank you.
I use this one: http://www.myfavoritegadgets.info/monitors/GPUMonitor/GPUMonitor.html
to monitor temperature and GPU load. If I use 5 WUs, my GPU load is 97/98%.
The temperature is normal, no more than 75 degrees Celsius.
I think the number of WUs doesn't depend on the CPU/MB/OS, only on the GPU. I use an ASUS GTX 670 with 2 GB of memory.

Can you post me a direct link to Fred's performance test?
Thanks
ID: 1372343 · Report as offensive