Panic Mode On (104) Server Problems?

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1846909 - Posted: 5 Feb 2017, 22:43:28 UTC - in response to Message 1846891.  

. . I am now very new to Einstein but it must need some tweaking because my runtimes are far longer than BLC tasks and the credits don't look that good; in fact they seem about comparable to Seti. Overnight I accrued about 10,000 on a machine that does 31,000 a day on Seti. Looking at the monitors (Afterburner), the CPU is overcommitted and the GPU usage is erratic and low.

Stephen, to get Nvidia GPUs to run at a full and smooth level on Einstein OpenCL tasks you need to run more than one at a time. The easiest way to do that there is in your project preferences.
For your first couple of days on most projects, a lot of your results will go into pending.
You should also run the beta testing; they have a fix for Windoz that at least doubles throughput. Einstein gives credit for beta tasks.


. . Thank you for that information. Though I don't see that running doubles on this machine would be feasible, as the CPU is 100% committed just running singles. A pity about that. But I might try the Einstein beta testing option. Though right now I am going to try Seti again to see if I can get some work, now that Eric has effected some sort of workaround for the problem.

Stephen

:)
ID: 1846909 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1846930 - Posted: 6 Feb 2017, 1:06:01 UTC - in response to Message 1846909.  

Though I don't see that running doubles on this machine would be feasible as the CPU is 100% committed just running singles.

The way OpenCL works on Nvidia cards is that if you run 2 it will grab a second CPU core. Nvidia OpenCL sucks.
ID: 1846930 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1846935 - Posted: 6 Feb 2017, 1:28:40 UTC - in response to Message 1846884.  

I just noticed that my main cruncher with 2x1070 is not getting Nvidia WUs anymore; there are no available tasks and it is processing Einstein in the meantime...


. . I guess I am going to have to join Einstein@home to give the rigs something to do; I am out of work on one rig and getting very low on the others.

Stephen

:(

At least Einstein pays credits well with the beta 1.18 application for FGRPB1G, for those who care about that. On Nvidia Pascal cards, the tasks take about the same time as Guppie BLC VLARs but pay out at 3465 credits.

My slowest cruncher is even out of SETI GPU work and all crunchers are down to about 50 CPU tasks.


. . I am now very new to Einstein but it must need some tweaking because my runtimes are far longer than BLC tasks and the credits don't look that good; in fact they seem about comparable to Seti. Overnight I accrued about 10,000 on a machine that does 31,000 a day on Seti. Looking at the monitors (Afterburner), the CPU is overcommitted and the GPU usage is erratic and low. Oh well, it is hopefully only a temporary thing.

Stephen

.

No tweaking required or even possible. Just run the stock 1.17 app for Nvidia or better still, select to run Beta apps and get the 1.18 app which is about twice as fast as the 1.17 app. They are OpenCL apps so require at least one full CPU core to support a GPU task. Just like here on SETI. I find that two tasks per card is the optimum on Pascal. You can run an app_config.xml file just like SETI but I have identical parameters for both projects. 0.5 GPU and 1.0 CPU per task.
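
For anyone who hasn't set one up before, here is a minimal sketch (Python 3.9+) that prints an app_config.xml with the 0.5 GPU / 1.0 CPU per task settings mentioned above. The app name is a placeholder and purely an assumption; it has to be replaced with the application name the project actually uses, and the resulting file goes in that project's folder under the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage=0.5, cpu_usage=1.0):
    """Return app_config.xml text that runs 1/gpu_usage tasks per GPU."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    # app_name is hypothetical here; use the project's real application name.
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # 0.5 GPU per task means two tasks share one card.
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    # 1.0 CPU per task reserves a full core for each GPU task.
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    ET.indent(root)  # one element per line; needs Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


print(build_app_config("EXAMPLE_APP_NAME"))
```

Redirect the output to app_config.xml, then re-read the config files from the BOINC Manager (or restart the client) for it to take effect.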
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1846935 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1846962 - Posted: 6 Feb 2017, 4:57:56 UTC - in response to Message 1846907.  

It turned out they had left the power connector off the Optical drive, which nearly blew the thing up; everything was running red hot (the IDE bus was still connected).

If the data cable was connected, but not the power cable the result would be- the Optical drive wouldn't work. That's all.
Grant
Darwin NT
ID: 1846962 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1846965 - Posted: 6 Feb 2017, 5:03:26 UTC

My main cruncher has been able to get some GPU work, but still hasn't been able to time a CPU request to match up with available work.

It's a shame the PFB splitters can't keep up with demand (let alone get ahead of it), but a bit of work every now & then is still better than nothing at all.
Grant
Darwin NT
ID: 1846965 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1846970 - Posted: 6 Feb 2017, 5:10:13 UTC - in response to Message 1846965.  

My main cruncher has been able to get some GPU work, but still hasn't been able to time a CPU request to match up with available work.

It's a shame the PFB splitters can't keep up with demand (let alone get ahead of it), but a bit of work every now & then is still better than nothing at all.

If someone can stop those splitters congregating together on 1 file we'll all be a lot happier.

Cheers.
ID: 1846970 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1846974 - Posted: 6 Feb 2017, 5:23:54 UTC - in response to Message 1846970.  

If someone can stop those splitters congregating together on 1 file we'll all be a lot happier.

That seems to be it.
Used to be 4 Arecibo splitters could pump out as much as 45 new WU/sec, but ever since they started ganging up on files even with 7 splitters running they only occasionally peak at 40/s, generally they're around 35/s.

And splitting multiple different files would help by making plenty of different work available, not just all (or mostly) shorties or VLARs.
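
To put the numbers above on a per-splitter basis, here is a quick back-of-the-envelope sketch. It only divides the totals quoted in this post by the number of splitters running; the function name is just for illustration.

```python
def per_splitter(total_wu_per_sec, splitters):
    """Average output of a single splitter, in WU/sec."""
    return total_wu_per_sec / splitters


before = per_splitter(45, 4)         # 4 splitters, up to 45 WU/sec
after_peak = per_splitter(40, 7)     # 7 splitters, occasional peak of 40 WU/sec
after_typical = per_splitter(35, 7)  # 7 splitters, typical 35 WU/sec

print(f"{before:.2f} {after_peak:.2f} {after_typical:.2f}")
```

On those figures each splitter is now producing roughly half of what it used to.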
Grant
Darwin NT
ID: 1846974 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1846976 - Posted: 6 Feb 2017, 5:46:59 UTC - in response to Message 1846974.  
Last modified: 6 Feb 2017, 5:49:47 UTC

If someone can stop those splitters congregating together on 1 file we'll all be a lot happier.


One would think that having a splitter process "lock" a source file would be a no-brainer to put together. I used to do that sort of thing in networked MS-DOS 30 years ago by just using token files.
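
A minimal sketch of the token-file idea, in Python rather than MS-DOS batch. It is not how the SETI splitters are written; the function names and the ".lock" suffix are made up for illustration. The point is that creating the token with O_CREAT | O_EXCL either succeeds or fails as a single step, so two processes cannot both claim the same file (on a shared network drive, whether that holds depends on the file system).

```python
import os


def try_claim(source_path):
    """Try to claim source_path by creating its token file; True on success."""
    token = source_path + ".lock"  # hypothetical naming convention
    try:
        # Fails with FileExistsError if another process already made the token.
        fd = os.open(token, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release(source_path):
    """Give the file back by deleting its token."""
    os.remove(source_path + ".lock")


def pick_unclaimed(source_paths):
    """Return the first file this process managed to claim, or None."""
    for path in source_paths:
        if try_claim(path):
            return path
    return None
```

Each splitter would call pick_unclaimed() on the list of source files, work on whatever it gets back, and call release() when done, so they spread out across files instead of piling onto one.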
ID: 1846976 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1846987 - Posted: 6 Feb 2017, 8:08:05 UTC - in response to Message 1846930.  

Though I don't see that running doubles on this machine would be feasible as the CPU is 100% committed just running singles.

The way OpenCL works on Nvidia cards is that if you run 2 it will grab a second CPU core. Nvidia OpenCL sucks.


. . With SoG the -use_sleep command solves that problem, but it can slow down processing a little bit. Worthwhile though if CPU cores are limited.

Stephen

:)
ID: 1846987 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1846988 - Posted: 6 Feb 2017, 8:14:18 UTC - in response to Message 1846935.  


. . I am now very new to Einstein but it must need some tweaking because my runtimes are far longer than BLC tasks and the credits don't look that good; in fact they seem about comparable to Seti. Overnight I accrued about 10,000 on a machine that does 31,000 a day on Seti. Looking at the monitors (Afterburner), the CPU is overcommitted and the GPU usage is erratic and low. Oh well, it is hopefully only a temporary thing.

Stephen

.

No tweaking required or even possible. Just run the stock 1.17 app for Nvidia or better still, select to run Beta apps and get the 1.18 app which is about twice as fast as the 1.17 app. They are OpenCL apps so require at least one full CPU core to support a GPU task. Just like here on SETI. I find that two tasks per card is the optimum on Pascal. You can run an app_config.xml file just like SETI but I have identical parameters for both projects. 0.5 GPU and 1.0 CPU per task.


. . Hmm, if the OpenCL app in Einstein supports app_config.xml I wonder if it also supports an app commandline file, and in particular the -use_sleep command ??? :) Otherwise the beta-test option sounds feasible.

Stephen

??
ID: 1846988 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1846989 - Posted: 6 Feb 2017, 8:18:41 UTC - in response to Message 1846988.  
Last modified: 6 Feb 2017, 8:19:16 UTC


. . I am now very new to Einstein but it must need some tweaking because my runtimes are far longer than BLC tasks and the credits don't look that good; in fact they seem about comparable to Seti. Overnight I accrued about 10,000 on a machine that does 31,000 a day on Seti. Looking at the monitors (Afterburner), the CPU is overcommitted and the GPU usage is erratic and low. Oh well, it is hopefully only a temporary thing.

Stephen

.

No tweaking required or even possible. Just run the stock 1.17 app for Nvidia or better still, select to run Beta apps and get the 1.18 app which is about twice as fast as the 1.17 app. They are OpenCL apps so require at least one full CPU core to support a GPU task. Just like here on SETI. I find that two tasks per card is the optimum on Pascal. You can run an app_config.xml file just like SETI but I have identical parameters for both projects. 0.5 GPU and 1.0 CPU per task.


. . Hmm, if the OpenCL app in Einstein supports app_config.xml I wonder if it also supports an app commandline file, and in particular the -use_sleep command ??? :) Otherwise the beta-test option sounds feasible.

Stephen

??


app_config.xml is a BOINC feature, not a SETI one, whilst -use_sleep and other command-line options are app-specific.


With each crime and every kindness we birth our future.
ID: 1846989 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1846990 - Posted: 6 Feb 2017, 8:21:23 UTC - in response to Message 1846962.  

It turned out they had left the power connector off the Optical drive, which nearly blew the thing up; everything was running red hot (the IDE bus was still connected).

If the data cable was connected, but not the power cable the result would be- the Optical drive wouldn't work. That's all.


. . Well it certainly didn't work but that certainly wasn't all. The Optical drive was hot, the IDE cable was hot, the PSU was hot. Clearly it was trying to draw much of the power it needed from the IDE bus and wasn't at all happy. After restoring the power connection everything was much cooler. That is a pretty good indication.

Stephen

:(
ID: 1846990 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1846992 - Posted: 6 Feb 2017, 8:24:54 UTC - in response to Message 1846989.  


No tweaking required or even possible. Just run the stock 1.17 app for Nvidia or better still, select to run Beta apps and get the 1.18 app which is about twice as fast as the 1.17 app. They are OpenCL apps so require at least one full CPU core to support a GPU task. Just like here on SETI. I find that two tasks per card is the optimum on Pascal. You can run an app_config.xml file just like SETI but I have identical parameters for both projects. 0.5 GPU and 1.0 CPU per task.


. . Hmm, if the OpenCL app in Einstein supports app_config.xml I wonder if it also supports an app commandline file, and in particular the -use_sleep command ??? :) Otherwise the beta-test option sounds feasible.

Stephen

??


app_config.xml is a BOINC feature, not a SETI one, whilst -use_sleep and other command-line options are app-specific.


. . Hmm, a very good point ....

Stephen

:(
ID: 1846992 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1846997 - Posted: 6 Feb 2017, 9:40:13 UTC - in response to Message 1846990.  

Well it certainly didn't work but that certainly wasn't all. The Optical drive was hot, the IDE cable was hot, the PSU was hot. Clearly it was trying to draw much of the power it needed from the IDE bus and wasn't at all happy. After restoring the power connection everything was much cooler. That is a pretty good indication.

I've pulled the power connector from dozens of HDDs & optical drives over the years when troubleshooting while leaving the data cable connected and never had the issue you described.
And given how a HDD or Optical drive is connected, I can't see how it could occur other than Pin 20 shorting to Pin 19 or Pin 22 on the motherboard (if Pin 20 is even present there). On a standard IDE cable there was no power supplied to the drive; the hole for that pin was usually not even in the connector. On a standard IDE drive the pin wasn't even present.
Grant
Darwin NT
ID: 1846997 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1847038 - Posted: 6 Feb 2017, 15:48:02 UTC - in response to Message 1846997.  

Well it certainly didn't work but that certainly wasn't all. The Optical drive was hot, the IDE cable was hot, the PSU was hot. Clearly it was trying to draw much of the power it needed from the IDE bus and wasn't at all happy. After restoring the power connection everything was much cooler. That is a pretty good indication.

I've pulled the power connector from dozens of HDDs & optical drives over the years when troubleshooting while leaving the data cable connected and never had the issue you described.
And given how a HDD or Optical drive is connected, I can't see how it could occur other than Pin 20 shorting to Pin 19 or Pin 22 on the motherboard (if Pin 20 is even present there). On a standard IDE cable there was no power supplied to the drive; the hole for that pin was usually not even in the connector. On a standard IDE drive the pin wasn't even present.


. . Well you make a good argument but to me the high temps spoke louder :)

. . Considering what you say, there was probably no chance of damage even if I had not corrected the issue, but I still felt much better that I had.

Stephen

:)
ID: 1847038 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1847185 - Posted: 7 Feb 2017, 6:14:13 UTC - in response to Message 1847038.  

Forums are being very, very sluggish at the moment.
Grant
Darwin NT
ID: 1847185 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1847217 - Posted: 7 Feb 2017, 12:49:43 UTC - in response to Message 1847185.  

Forums are being very, very sluggish at the moment.


. . Many things seem quite flaky since they did that OS upgrade.

Stephen

:(
ID: 1847217 · Report as offensive
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1847242 - Posted: 7 Feb 2017, 16:26:24 UTC

Good. PSU is in, installed it.
1. Forgot the CPU cable, I can tell you the system won't boot then.
2. One of the SATA power cables wasn't plugged in right, so only the DVD was found.
But on the third try, everything started up... then Windows told me that it had lost the hibernation file and that it had to boot normally and that everything that wasn't saved would be lost, would I like to continue? Like there's a choice... ;-)

Anyway, will now go back to testing.
ID: 1847242 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22199
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1847280 - Posted: 8 Feb 2017, 6:07:56 UTC

Centurion is still under the weather so work is struggling out through the crack under the door. Strangely I'm getting work for my CPUs but very little for the GPUs, which are hammering through Beta work with great joy.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1847280 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1847293 - Posted: 8 Feb 2017, 8:01:19 UTC - in response to Message 1847280.  

Looks like the current work mix is shorties & VLARs (although there are a couple of mid range units in my last download). If you do get any GPU work, it won't last long. And the odds of getting any GPU work are minimal.
My main system has only been able to get work on 4 occasions since the outage.
Grant
Darwin NT
ID: 1847293 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.