Panic Mode On (106) Server Problems?

Message boards : Number crunching : Panic Mode On (106) Server Problems?

Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1876786 - Posted: 4 Jul 2017, 23:43:28 UTC - in response to Message 1875687.  
Last modified: 4 Jul 2017, 23:45:55 UTC

Looking at stats I noticed the 'Top Computer' has an error rate of ~ 10%.........the rating scale apparently has no measurement for accuracy/efficiency.........


https://setiathome.berkeley.edu/results.php?hostid=7475713


Hi,
It is my machine. The errors are due to experimental code and pushing the GPUs to the thermal limit.

Errors occur when the task hangs and does not finish in 40 minutes or so, sometimes a result file is missing.
Some of the errors are lost tasks (ghosts) that do not show up on my computer, so it cannot process them.
Bad workunit header is a download/file corruption error. A zero time abort seems to be a time limit exceeded error (system abort, not user abort).
Errors are not bad.

Bad results (Invalid count) is zero so if the machine gets the work done the results are correct.

The inconclusive count is near or under 5% and that is acceptable. Most of them need to get processed by a third machine because my wingman has an issue with his/her computer producing mostly rubbish.
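
To make the percentages above concrete, here is a minimal sketch of how error, invalid and inconclusive rates could be worked out from a host's task counts. The counts used are made up for illustration and are not taken from the results page linked above; the class and function names are hypothetical as well.

```python
from dataclasses import dataclass


@dataclass
class TaskCounts:
    """Hypothetical tally of finished tasks for one host."""
    valid: int
    invalid: int
    inconclusive: int
    error: int

    @property
    def total(self) -> int:
        return self.valid + self.invalid + self.inconclusive + self.error


def outcome_rates(counts: TaskCounts) -> dict[str, float]:
    """Return each outcome as a fraction of all finished tasks."""
    names = ("valid", "invalid", "inconclusive", "error")
    if counts.total == 0:
        # Avoid dividing by zero for a host with no finished tasks.
        return {name: 0.0 for name in names}
    return {name: getattr(counts, name) / counts.total for name in names}


if __name__ == "__main__":
    # Illustrative numbers only: 10% errors, 5% inconclusive, no invalids.
    sample = TaskCounts(valid=850, invalid=0, inconclusive=50, error=100)
    for name, rate in outcome_rates(sample).items():
        print(f"{name}: {rate:.1%}")
```

With counts like these, a host can show a sizeable error rate while its invalid rate stays at zero, which is the distinction being drawn in this post.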

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1876786 · Report as offensive
Profile betreger Project Donor
Avatar

Send message
Joined: 29 Jun 99
Posts: 11410
Credit: 29,581,041
RAC: 66
United States
Message 1876788 - Posted: 4 Jul 2017, 23:56:39 UTC - in response to Message 1876786.  

Petri is there any chance your Linux magic can be adapted to Windoz?
ID: 1876788 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1876789 - Posted: 5 Jul 2017, 0:03:57 UTC - in response to Message 1876788.  

Petri is there any chance your Linux magic can be adapted to Windoz?


Hi, I know there is an ongoing effort to do so. Some of the existing and solved bugs need to be
addressed and then a lot of testing with all kinds of NVIDIA platforms and tuning and tweaking with memory limits etc.

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1876789 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1876794 - Posted: 5 Jul 2017, 0:21:51 UTC - in response to Message 1876788.  

Petri is there any chance your Linux magic can be adapted to Windoz?


. . Aaahh! The often asked question ... :)

Stephen

:)
ID: 1876794 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13841
Credit: 208,696,464
RAC: 304
Australia
Message 1876831 - Posted: 5 Jul 2017, 7:22:51 UTC

Since the outage the GBT tasks have started flowing again.
Grant
Darwin NT
ID: 1876831 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1876832 - Posted: 5 Jul 2017, 7:27:33 UTC - in response to Message 1876831.  

You had to say it, didn't you? I haven't said anything because it might jinx the servers.
Yes, work has been flowing quite nicely since maintenance.
ID: 1876832 · Report as offensive
Profile JaundicedEye
Avatar

Send message
Joined: 14 Mar 12
Posts: 5375
Credit: 30,870,693
RAC: 1
United States
Message 1876992 - Posted: 5 Jul 2017, 22:42:33 UTC - in response to Message 1876786.  

Thanks Petri, no disrespect meant to your machine, I was just pointing out that error rate for whatever reason should be considered somehow when ranking hardware combinations......maybe like the Roger Maris asterisk.........IMHO.

"Sour Grapes make a bitter Whine." <(0)>
ID: 1876992 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1877000 - Posted: 5 Jul 2017, 23:51:12 UTC - in response to Message 1876992.  
Last modified: 6 Jul 2017, 0:22:53 UTC

Thanks Petri, no disrespect meant to your machine, I was just pointing out that error rate for whatever reason should be considered somehow when ranking hardware combinations......maybe like the Roger Maris asterisk.........IMHO.


While I'm googling what you meant I keep on respecting all of you with continuous updates to my source code to keep the spirit up and alive.
And my machine has only 4 GPUs, despite telling the servers it has 16 of them. Only one of them is a 1080Ti and the other 3 are only 1080s.

P33.

EDIT: And I found out after some googling that one Roger Maris suffered stress-induced hair loss, and that the same guy played some game in a small part of the world in an ancient era.
EDIT2: His hardware was a bat.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1877000 · Report as offensive
Profile JaundicedEye
Avatar

Send message
Joined: 14 Mar 12
Posts: 5375
Credit: 30,870,693
RAC: 1
United States
Message 1877030 - Posted: 6 Jul 2017, 2:07:49 UTC
Last modified: 6 Jul 2017, 2:09:13 UTC

Sorry to hit you with an unfamiliar reference from American baseball. Roger Maris hit more home runs in a single season than Babe Ruth, but because the season had been lengthened to include more games after Ruth set the record, Maris' 61 home run season was considered an exception that many felt should have been marked in the records with an asterisk.

"Sour Grapes make a bitter Whine." <(0)>
ID: 1877030 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13841
Credit: 208,696,464
RAC: 304
Australia
Message 1877337 - Posted: 7 Jul 2017, 22:13:39 UTC

Looks like we're back to lots of Arecibo VLARs and just a trace of GBT work.
Grant
Darwin NT
ID: 1877337 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13841
Credit: 208,696,464
RAC: 304
Australia
Message 1877396 - Posted: 8 Jul 2017, 3:06:08 UTC

Change the application settings, and down comes the work...
Grant
Darwin NT
ID: 1877396 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1877404 - Posted: 8 Jul 2017, 5:16:47 UTC - in response to Message 1877396.  

Change the application settings, and down comes the work...


. . It seems that though the symptoms have improved the problem has not been eliminated ...

Stephen

:(
ID: 1877404 · Report as offensive
Filipe

Send message
Joined: 12 Aug 00
Posts: 218
Credit: 21,281,677
RAC: 20
Portugal
Message 1877715 - Posted: 10 Jul 2017, 11:08:41 UTC

Are Arecibo WUs given priority over GBT?

It seems there is more Arecibo work coming from the servers than GBT.
ID: 1877715 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1877739 - Posted: 10 Jul 2017, 15:58:39 UTC - in response to Message 1877715.  

In practical effect .... yes. Should not be the case though. Nobody has been able to explain the general flakiness of the scheduling servers for the better part of a year now. Frustrating as heck when you can crunch the GBT work much faster on specific hardware compared to Arecibo. We can complain but there is nothing we can do to effect any change.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1877739 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1877741 - Posted: 10 Jul 2017, 16:33:20 UTC - in response to Message 1877739.  

The grain of salt of course is some hardware crunches faster than others, my 900s are more efficient than the CPU 😉
ID: 1877741 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1877744 - Posted: 10 Jul 2017, 16:39:34 UTC - in response to Message 1877741.  

The grain of salt of course is some hardware crunches faster than others, my 900s are more efficient than the CPU 😉

Yes, that is what I was alluding to. I use the rescheduler and use it most often on the Ryzen system. It seems to get a preponderance of Arecibo work thrown to the CPU and it processes BLC CPU work 50% faster than Arecibo, so I reschedule heavily and move the Arecibo shorties and VLARs to the GTX 1070s.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1877744 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1877797 - Posted: 10 Jul 2017, 22:37:40 UTC - in response to Message 1877744.  

. . Hi Keith,

. . I just had a peek at your results and I was surprised to see that your 8350 rig has a RAC 5,000 higher than your 8370 rig ... is that because you cannot keep the GBT work up to it??

Stephen

??
ID: 1877797 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1877814 - Posted: 11 Jul 2017, 0:16:06 UTC - in response to Message 1877797.  

. . Hi Keith,

. . I just had a peek at your results and I was surprised to see that your 8350 rig has a RAC 5,000 higher than your 8370 rig ... is that because you cannot keep the GBT work up to it??

Stephen

??

No, I am doing the exact same thing on the FX rigs. I haven't been able to fathom why the 8370 has fallen so far behind the 8350. It has been a mystery to me for over a week now.

The 8370 is clocked at 4.6 GHz compared to 4.4 GHz for the 8350. The 8370 CPU APR has always been higher than the 8350. Both rigs have the exact same GTX 1070s in them at the exact same clocks. Up till about a week and a half ago, the 8370 was always slightly ahead on RAC compared to the 8350, as is to be expected. It has fallen way behind for unknown reasons.

The GPU tasks appear to process in comparable times. BoincTasks shows me processing the same amount of CPU tasks per day and per week and the same for GPU tasks. The only thing that would make sense is that I have been awarded significantly less credit on the 8370 compared to the 8350. That would be the case if the 8370 has been doing nothing but Arecibo shorties, but if that were the case then I would expect the number of GPU tasks processed per day/week to be significantly higher than on the 8350.

The only other factor that may have played into this conundrum was that I put the 8370 to work for about 4 days on Beta processing the Alt 8.06 CPU app about a month ago.

The other puzzling fact is that in BOINC Manager on the Statistics tab, I show the 8370 doing 50,000 credits RAC for the past month. So why has the SETI website shown me as low as 44,000 RAC? For some reason I have been penalized in RAC on the website somehow.
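
For anyone following the RAC numbers, here is a rough sketch of how a recent-average figure reacts to a gap in credited work. It assumes RAC behaves like an exponentially weighted average with a one-week half-life; that assumption, the function names and the 50,000 credits/day figure are for illustration only and are not taken from the server code, which may differ in detail.

```python
import math

HALF_LIFE_DAYS = 7.0  # assumed half-life of the average, in days


def update_rac(rac: float, credit: float, days_elapsed: float,
               half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Fold `credit` granted over `days_elapsed` days into a running average."""
    if days_elapsed <= 0:
        return rac
    # The old average decays; the new credit rate fills the remainder.
    weight = math.exp(-days_elapsed * math.log(2) / half_life_days)
    return rac * weight + (1.0 - weight) * (credit / days_elapsed)


def simulate(daily_credit: list[float], start: float = 0.0) -> list[float]:
    """Apply one update per day and return the average after each day."""
    rac = start
    history = []
    for credit in daily_credit:
        rac = update_rac(rac, credit, 1.0)
        history.append(rac)
    return history


if __name__ == "__main__":
    # Hypothetical host: steady 50,000/day, 4 days with no credit
    # (e.g. work sent elsewhere), then 30 days back at 50,000/day.
    history = simulate([0.0] * 4 + [50_000.0] * 30, start=50_000.0)
    print(f"after the 4-day gap: {history[3]:,.0f}")
    print(f"30 days later:       {history[-1]:,.0f}")
```

Under this model a few days without credit pulls the average down quickly, but most of the dip has decayed away a month later, so a short diversion on its own would not be expected to leave a large lasting gap.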
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1877814 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13841
Credit: 208,696,464
RAC: 304
Australia
Message 1877840 - Posted: 11 Jul 2017, 7:12:33 UTC - in response to Message 1877814.  

No, I am doing the exact same thing on the FX rigs. I haven't been able to fathom why the 8370 has fallen so far behind the 8350.

Different work mix?
Today my C2D has been getting just Arecibo work, with the odd GBT WU thrown in (just got a dozen GBT WUs, the most in ages). Even over the last week it has been heavily Arecibo work.
My i7, while getting mostly Arecibo work, has had periods with batches of GBT work coming through.
Grant
Darwin NT
ID: 1877840 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 1877856 - Posted: 11 Jul 2017, 21:38:19 UTC - in response to Message 1876789.  

Petri is there any chance your Linux magic can be adapted to Windoz?


Hi, I know there is an ongoing effort to do so. Some of the existing and solved bugs need to be
addressed and then a lot of testing with all kinds of NVIDIA platforms and tuning and tweaking with memory limits etc.

Petri


I volunteer. I have 3 NVIDIA GPUs. I am beginning to think I just need to leave SetiBeta turned on. Otherwise they start another test without me :)
A proud member of the OFA (Old Farts Association).
ID: 1877856 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.