Panic Mode On (111) Server Problems?

Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1929672 - Posted: 12 Apr 2018, 22:37:35 UTC - in response to Message 1929669.  

I keep seeing this kind of post. Can someone show me proof of what the naysayers claim: that older Nvidia cards will go poof or that the system becomes unusable?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1929672 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1929673 - Posted: 12 Apr 2018, 22:37:36 UTC - in response to Message 1929666.  

Let me guess: we're going to get lots and lots of _2, _3, _4, _x tasks being resent when older Nvidia cards go Poof......


. . Think of it as culling the herd ...

Stephen

:)
ID: 1929673 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Message 1929675 - Posted: 12 Apr 2018, 22:40:39 UTC - in response to Message 1929668.  

Let me guess: we're going to get lots and lots of _2, _3, _4, _x tasks being resent when older Nvidia cards go Poof......

Why do you think that? From those older cards letting out the magic blue smoke? Or disheartened volunteers who leave the project without clearing their caches?


. . I expect there will be some of both ...

Stephen
ID: 1929675 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Message 1929676 - Posted: 12 Apr 2018, 22:42:14 UTC - in response to Message 1929671.  

. .. I just hope there are sufficient safeguards to keep them off older nvidia hardware/apps.


Nope, they are going out to all Nvidia cards. No card family restrictions. Looks like they take double the computation time to complete. But pay double too.


. . I am guessing that decision is because the stock apps are now SoG 8.22 and Cuda60, which I am guessing will handle these tasks OK even on some older cards like Kepler; not sure about Fermi or earlier. It could see the retirement (or expiry) of some pre-Fermi cards in use ..

Stephen


I have posted about a Kepler 680 that didn't seem to have any issues with the VLAR. And I know that Richard Haselgrove said he got some on his GTX 470 card. That is probably the worst case I can think of. I don't think the earlier cards are even able to run the SoG app or the CUDA app below 3.2.
ID: 1929676 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Message 1929677 - Posted: 12 Apr 2018, 22:44:01 UTC - in response to Message 1929669.  

The unanswered question is: was allowing Arecibo VLARs on Nvidia cards an accident or something scheduled?
That will surely make a lot of old Nvidia hosts stop crunching S@H.


. . A large part of the historical failures of these tasks on Nvidia hardware can be traced to the app running it. The new apps, SoG and CUDA60 upwards, can cope with them but I am not sure how far back in hardware terms. I suspect Pre-Fermi gear may see lots of blue smoke, if only from the ears of the operators ...

Stephen

:(
ID: 1929677 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1929678 - Posted: 12 Apr 2018, 22:44:45 UTC - in response to Message 1929672.  

I keep seeing this kind of post. Can someone show me proof of what the naysayers claim: that older Nvidia cards will go poof or that the system becomes unusable?

The last time that they were unleashed on us was under the old Cuda apps and back then they just brought my rigs with GTX 550/560 Ti's and GTX 660's to their unusable knees.

But it will be interesting to see how they go with the SoG app.

Cheers.
ID: 1929678 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Message 1929680 - Posted: 12 Apr 2018, 22:53:55 UTC - in response to Message 1929676.  


I have posted about a Kepler 680 that didn't seem to have any issues with the VLAR. And I know that Richard Haselgrove said he got some on his GTX 470 card. That is probably the worst case I can think of. I don't think the earlier cards are even able to run the SoG app or the CUDA app below 3.2.


. . I am confident SoG can handle them on Kepler and I think CUDA60 will too, but I was running an old 8600GS (256MB) at one point, and I know there are still some hosts with that level of hardware, and they will be running CUDA32 or earlier and that will be interesting ....

. . The GTX470 is Fermi right? I am guessing it is running SoG? It will be interesting to hear from Richard on his results.

Stephen

ID: 1929680 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Message 1929681 - Posted: 12 Apr 2018, 22:55:26 UTC - in response to Message 1929678.  

I had a hunch all the negativity was on the older CUDA 32, 42 or 50 apps. Did they ever get tested on Beta with the SoG app which is the stock app now? The SoG app has lots of parameters for tuning to alleviate system lagginess.
ID: 1929681 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Message 1929682 - Posted: 12 Apr 2018, 22:58:08 UTC - in response to Message 1929678.  

I keep seeing this kind of post. Can someone show me proof of what the naysayers claim: that older Nvidia cards will go poof or that the system becomes unusable?

The last time that they were unleashed on us was under the old Cuda apps and back then they just brought my rigs with GTX 550/560 Ti's and GTX 660's to their unusable knees.

But it will be interesting to see how they go with the SoG app.

Cheers.


. . This may be a good time to see if they help with the temp problems on the 560ti. They may let it run cool enough to press on with ...

Stephen

ID: 1929682 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Message 1929683 - Posted: 12 Apr 2018, 22:59:51 UTC - in response to Message 1929680.  
Last modified: 12 Apr 2018, 23:10:09 UTC


I have posted about a Kepler 680 that didn't seem to have any issues with the VLAR. And I know that Richard Haselgrove said he got some on his GTX 470 card. That is probably the worst case I can think of. I don't think the earlier cards are even able to run the SoG app or the CUDA app below 3.2.


. . I am confident SoG can handle them on Kepler and I think CUDA60 will too, but I was running an old 8600GS (256MB) at one point, and I know there are still some hosts with that level of hardware, and they will be running CUDA32 or earlier and that will be interesting ....

. . The GTX470 is Fermi right? I am guessing it is running SoG? It will be interesting to hear from Richard on his results.

Stephen


Yes, the GTX 470 is Fermi. I think the only issue is going to be hosts still running the old CUDA apps. Does the Scheduler even send out those applications anymore now that the SoG app is standard? The only ways I know of to get the older CUDA apps are to run Anonymous platform or to accept the default settings of the Lunatics installer, which are wrong unless you notice the SoG app selection and set it manually.
ID: 1929683 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Message 1929688 - Posted: 13 Apr 2018, 0:29:22 UTC - in response to Message 1929683.  


Yes, the GTX 470 is Fermi. I think the only issue is going to be hosts still running the old CUDA apps. Does the Scheduler even send out those applications anymore now that the SoG app is standard? The only ways I know of to get the older CUDA apps are to run Anonymous platform or to accept the default settings of the Lunatics installer, which are wrong unless you notice the SoG app selection and set it manually.


. . But there will be some old timers out there plugging along with Anonymous platform and running Cuda32 or 22 or whatever who may get some nasty surprises ...

Stephen

:(
ID: 1929688 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Message 1929689 - Posted: 13 Apr 2018, 0:33:47 UTC - in response to Message 1929688.  


Yes, the GTX 470 is Fermi. I think the only issue is going to be hosts still running the old CUDA apps. Does the Scheduler even send out those applications anymore now that the SoG app is standard? The only ways I know of to get the older CUDA apps are to run Anonymous platform or to accept the default settings of the Lunatics installer, which are wrong unless you notice the SoG app selection and set it manually.


. . But there will be some old timers out there plugging along with Anonymous platform and running Cuda32 or 22 or whatever who may get some nasty surprises ...

Stephen

:(

Yes, that might be expected. If so, I'm sure we will begin to see bewildered posts here in Number Crunching or the Questions and Problems forums. Plenty of us experts around to write a reply on how to fix the problem or deliver the bad news.

I think this might be a good thing in the end, as it will force the set-and-forget types to reconnect with the project. They might be some of the bad hosts that plague us, whose owners we have never been able to contact to tell them their hosts have issues.
ID: 1929689 · Report as offensive
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1929693 - Posted: 13 Apr 2018, 1:02:12 UTC - in response to Message 1929689.  

Actually the Old CUDA Apps are still active as a Stock App. If you don't have a working OpenCL Driver you will be sent the OLD Baseline CUDA App, as long as you have a working CUDA Driver. The main problem the last time the Arecibo VLARs were released was that people had their machines set to run 2 or 3 CUDA MB tasks at once. One VLAR will load the One Compute Unit used by the Old CUDA App on the VLARs; more than One VLAR at a time will bring it to its Knees. If you have the machine set to run just One task at a time it should be bearable, but annoying. The New Linux & Mac CUDA Apps use ALL the Compute Units on the Arecibo VLARs, so there really isn't any difference from the OpenCL App.

All My machines are pretty much loaded with the Arecibo VLARs, there must be a Large number of them. If they hadn't started sending them, people would be moaning about being out of GPU tasks. Hopefully these episodes won't be very common and people will be running BLCs most of the time.
ID: 1929693 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Message 1929695 - Posted: 13 Apr 2018, 1:15:27 UTC - in response to Message 1929693.  

Good to know, TBar. Hadn't thought of the case where no OpenCL driver is loaded, forcing CUDA. I was rather surprised by the many Arecibo tapes loaded this morning. More recently it has been one or two Arecibo tapes loaded each day, hit or miss. I think that most of us had expected a constant ramping down of Arecibo work, what with the hurricane and the funding issues the telescope has had recently.

And we have been told that work from Arecibo would become rarer and to expect more BLC tasks as the norm. So these VLAR tasks are rather unexpected. I haven't found an Arecibo VLAR running on my Windows systems yet. Must have loaded up on standard AR tasks beforehand. Lots of Arecibo VLARs have moved through the high output Linux boxes with no issues.
ID: 1929695 · Report as offensive
Ghia
Joined: 7 Feb 17
Posts: 238
Credit: 28,911,438
RAC: 50
Norway
Message 1929701 - Posted: 13 Apr 2018, 1:53:09 UTC

On my 780Ti the Arecibo VLARs run about double the time of non-VLARs and pay off at the same rate...at least for now.
But at the same time, I have a couple of APs running on the CPU (SSE and SSE2). They were calculated to over 5 hours, but they're already past that and ticking down sooo slowly, I wonder how long they'll actually take. I don't think I've seen those before...
Humans may rule the world...but bacteria run it...
ID: 1929701 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1929702 - Posted: 13 Apr 2018, 1:55:52 UTC - in response to Message 1929701.  

Are we sure these are Arecibo? Didn't they mention that a 3rd source of data would be rolling out?
ID: 1929702 · Report as offensive
TBar
Volunteer tester
Message 1929704 - Posted: 13 Apr 2018, 2:11:11 UTC - in response to Message 1929702.  

If you open a task with a text editor, it says it's Arecibo:
<receiver_cfg>
    <s4_id>14</s4_id>
    <name>Arecibo 1.4GHz Array, Beam 5, Pol 1</name>
    <beam_width>0.0500000007</beam_width>
    <center_freq>1420</center_freq>
    <latitude>18.3538056</latitude>
    <longitude>-66.7552222</longitude>

So, it's probably from Arecibo.
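
For anyone who would rather not open each task by hand, the same check can be scripted. This is only a minimal sketch: it assumes the task header contains a `<receiver_cfg>` block with a `<name>` element, as in the excerpt above. The function name `receiver_name` and the sample text are made up for illustration and are not part of any SETI@home or BOINC tool.

```python
import re
from typing import Optional

# Matches the first <name> element that follows a <receiver_cfg> opening tag.
# Assumption: the header looks like the excerpt above. A regex is used rather
# than an XML parser because a partially read header need not be well-formed.
_RECEIVER_NAME = re.compile(r"<receiver_cfg>.*?<name>(.*?)</name>", re.DOTALL)


def receiver_name(header_text: str) -> Optional[str]:
    """Return the receiver name from a task header, or None if absent."""
    match = _RECEIVER_NAME.search(header_text)
    return match.group(1).strip() if match else None


# Hypothetical sample, copied from the excerpt above.
sample = """<receiver_cfg>
    <s4_id>14</s4_id>
    <name>Arecibo 1.4GHz Array, Beam 5, Pol 1</name>
    <beam_width>0.0500000007</beam_width>
"""

print(receiver_name(sample))
```

To try it on a real task, read the first part of the file as text and pass it to `receiver_name`; where the task files live depends on your BOINC installation.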
ID: 1929704 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Message 1929705 - Posted: 13 Apr 2018, 2:12:14 UTC - in response to Message 1929702.  

We're supposed to get Parkes data from Australia eventually. But I don't think that data has arrived yet. I don't think the prep work for that data has been finished yet. If and when we do get it, I bet it has its own naming convention like the Green Bank telescope does.
ID: 1929705 · Report as offensive
Profile petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1929724 - Posted: 13 Apr 2018, 3:34:22 UTC
Last modified: 13 Apr 2018, 3:34:42 UTC

Volta does Arecibo tasks in 75 seconds and a 1080 does them in 150 seconds. Nice!
VLAR work is totally ok. Credit per wu is bigger too. I'll wait till APR scales back up.
If it does not, I'll find out what I need to optimize next.

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1929724 · Report as offensive
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1929731 - Posted: 13 Apr 2018, 4:07:42 UTC - in response to Message 1929636.  

So does anybody that worked on Beta remember in general the compute times for Arecibo VLARs?

About all I can remember is they made the system slightly more laggy than GBT WUs when running SoG. Anyone running a CUDA32/42/50 application will be in for a world of pain.
Grant
Darwin NT
ID: 1929731 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.