Panic Mode On (103) Server Problems?

Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1810607 - Posted: 20 Aug 2016, 3:24:16 UTC - in response to Message 1810606.  

But... that still doesn't explain why the task was Pronounced after a mere 5 tries.

If you look at the "Outcome" of each task, they all say "Success", even though they were all probably sitting in an Inconclusive state. The max number for that outcome is 5 (max # of error/total/success tasks: 5, 10, 5). If they were all errors, it would have been the same. I'm pretty sure it will only get to 10 if the outcomes are a mixed lot of success, error and/or invalid.
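
The three limits quoted above can be sketched as a small check. This is only a simplified model of the rule as described in this thread, not BOINC's actual server code: the class and function names are made up for illustration, and the assumption that a workunit is given up when a count goes above its cap (rather than when it reaches it) is inferred from the report of a workunit that ended on its sixth result.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkunitLimits:
    """Hypothetical container for 'max # of error/total/success tasks'."""
    max_error: int = 5
    max_total: int = 10
    max_success: int = 5


def limit_exceeded(n_error: int, n_success: int, n_total: int,
                   limits: Optional[WorkunitLimits] = None) -> Optional[str]:
    """Return the name of the first cap that is exceeded, or None.

    Assumption: a cap is 'exceeded' when the count is strictly greater
    than the cap. n_total counts every task issued, including tasks
    still in progress.
    """
    limits = limits or WorkunitLimits()
    if n_error > limits.max_error:
        return "too many errors"
    if n_success > limits.max_success:
        return "too many success results"
    if n_total > limits.max_total:
        return "too many total results"
    return None


# Six tasks, all "Success" but never agreeing: the success cap trips first.
print(limit_exceeded(n_error=0, n_success=6, n_total=6))
# A mixed lot (5 errors, 5 successes) only runs out at the 11th task.
print(limit_exceeded(n_error=5, n_success=5, n_total=10))
print(limit_exceeded(n_error=5, n_success=5, n_total=11))
```

Under these assumptions, a run of all-success or all-error results stops well before the total cap, and only a mixed lot can reach 10 tasks.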
ID: 1810607 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1810608 - Posted: 20 Aug 2016, 3:24:35 UTC - in response to Message 1810606.  

Probably would be good if someone who is able could get a copy of that WU for an autopsy by those that know what they're doing...


Cursory look says 'perfect storm' of cluster-clowns.
- First result is TBar running 'special' code, known to have some issues
- Second is a Linux host, CPU turning in more invalids than valids
- Third is a stock Windows CPU, usually the reference, AMD CPU massively overclocked or broken (looking at its tasks)
- Fourth is a stock Mac app (nuff said, lots of invalids)
- Fifth, AMD APU, looks fine to me (maybe, maybe not, TBC)
- Sixth, stock Mac OpenCL

So my bet is that the workunit itself is 'fine', and you're just seeing the statistical inevitability of combining 5 kinds of turds, and one poor little APU that seems possibly legit, but we may never really know.

Will be happy if someone wants to dig and disprove or prove my assessment, which is: IMO this is not good enough [Edit: yes I'm starting to get salty, lmao]

But... that still doesn't explain why the task was Pronounced after a mere 5 tries.
It should have been able to survive another Intel iGPU, a Mac ATI HD4, possibly 1 other, and still been able to Validate on the 10th attempt.


Just my opinion: you need to stop begging. Whether it be 4, 5, or 50, the picture is arse. Deal with it, I have.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1810608 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1810609 - Posted: 20 Aug 2016, 3:26:45 UTC - in response to Message 1810607.  
Last modified: 20 Aug 2016, 3:27:38 UTC

But... that still doesn't explain why the task was Pronounced after a mere 5 tries.

If you look at the "Outcome" of each task, they all say "Success", even though they were all probably sitting in an Inconclusive state. The max number for that outcome is 5 (max # of error/total/success tasks: 5, 10, 5). If they were all errors, it would have been the same. I'm pretty sure it will only get to 10 if the outcomes are a mixed lot of success, error and/or invalid.


Exactly --> All arse (except maybe the lone APU, which we can't confirm, so also arse)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1810609 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1810612 - Posted: 20 Aug 2016, 3:43:06 UTC - in response to Message 1810607.  

But... that still doesn't explain why the task was Pronounced after a mere 5 tries.

If you look at the "Outcome" of each task, they all say "Success", even though they were all probably sitting in an Inconclusive state. The max number for that outcome is 5 (max # of error/total/success tasks: 5, 10, 5). If they were all errors, it would have been the same. I'm pretty sure it will only get to 10 if the outcomes are a mixed lot of success, error and/or invalid.

I see. So you only get 10 attempts if they are mixed results.
Considering the current state of SETI Apps and Hosts, perhaps that should just be changed to 10 attempts total. I've seen quite a few that went the full 10, I've never seen one that stopped at 6. Now I have.
Probably see more...
ID: 1810612 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1810615 - Posted: 20 Aug 2016, 3:44:35 UTC - in response to Message 1810612.  

But... that still doesn't explain why the task was Pronounced after a mere 5 tries.

If you look at the "Outcome" of each task, they all say "Success", even though they were all probably sitting in an Inconclusive state. The max number for that outcome is 5 (max # of error/total/success tasks: 5, 10, 5). If they were all errors, it would have been the same. I'm pretty sure it will only get to 10 if the outcomes are a mixed lot of success, error and/or invalid.

I see. So you only get 10 attempts if they are mixed results.
Considering the current state of SETI Apps and Hosts, perhaps that should just be changed to 10 attempts total. I've seen quite a few that went the full 10, I've never seen one that stopped at 6. Now I have.
Probably see more...


You will, because as I said, arse.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1810615 · Report as offensive
Jimbocous
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1810632 - Posted: 20 Aug 2016, 4:25:36 UTC - in response to Message 1810609.  

But... that still doesn't explain why the task was Pronounced after a mere 5 tries.

If you look at the "Outcome" of each task, they all say "Success", even though they were all probably sitting in an Inconclusive state. The max number for that outcome is 5 (max # of error/total/success tasks: 5, 10, 5). If they were all errors, it would have been the same. I'm pretty sure it will only get to 10 if the outcomes are a mixed lot of success, error and/or invalid.


Exactly --> All arse (except maybe the lone APU, which we can't confirm, so also arse)

Not one of mine, thankfully ...
ID: 1810632 · Report as offensive
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1810642 - Posted: 20 Aug 2016, 5:17:46 UTC - in response to Message 1810612.  

I see. So you only get 10 attempts if they are mixed results.
Considering the current state of SETI Apps and Hosts, perhaps that should just be changed to 10 attempts total. I've seen quite a few that went the full 10, I've never seen one that stopped at 6. Now I have.
Probably see more...

I have an interesting one in the queue on one of my boxes, the _5 task from WU 2192117866.
name	blc2_2bit_guppi_57451_70772_HIP117779_0027.11485.831.17.26.7.vlar
application	SETI@home v8
created	22 Jun 2016, 18:17:31 UTC
minimum quorum	2
initial replication	4
max # of error/total/success tasks	5, 10, 5
Task        Computer  Sent                       Time reported or deadline  Status                              Run time (sec)  CPU time (sec)  Credit   Application
5000863925  7827171   23 Jun 2016, 0:45:58 UTC   23 Jun 2016, 13:31:40 UTC  Completed, validation inconclusive  6,077.92        6,036.65        pending  SETI@home v8 v8.00, windows_intelx86
5000863926  3502292   23 Jun 2016, 0:45:47 UTC   25 Jun 2016, 14:31:45 UTC  Completed, validation inconclusive  2,289.01        176.53          pending  SETI@home v8 v8.00 (opencl_nvidia_mac), x86_64-apple-darwin
5005988357  7379791   25 Jun 2016, 21:12:00 UTC  18 Aug 2016, 2:11:42 UTC   Timed out - no response             0.00            0.00            ---      SETI@home v8 v8.00 (cuda50), windows_intelx86
5103387390  7993196   18 Aug 2016, 9:24:56 UTC   18 Aug 2016, 14:57:02 UTC  Completed, validation inconclusive  5,074.25        358.45          pending  SETI@home v8 v8.12 (opencl_intel_gpu_sah), windows_intelx86
5104272180  8027604   18 Aug 2016, 22:31:44 UTC  18 Aug 2016, 23:27:03 UTC  Error while computing               0.00            0.00            ---      SETI@home v8 v8.00 (opencl_ati5_mac), x86_64-apple-darwin
5104834544  6980751   19 Aug 2016, 6:58:27 UTC   11 Oct 2016, 11:58:09 UTC  In progress                         ---             ---             ---      SETI@home v8, Anonymous platform (NVIDIA GPU)

So far, a Windows CPU, a Mac NVIDIA GPU and a Windows Intel GPU all disagree, while a Windows Cuda50 timed out and a Mac ATI GPU crapped out. My Win7 host is next in line and will probably run it as Cuda50 sometime tomorrow, unless I reschedule it to the CPU or to SoG. Let's see....what might produce the most interesting result? Hmmm...
ID: 1810642 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1810664 - Posted: 20 Aug 2016, 7:40:57 UTC - in response to Message 1810632.  
Last modified: 20 Aug 2016, 7:44:46 UTC

But... that still doesn't explain why the task was Pronounced after a mere 5 tries.

If you look at the "Outcome" of each task, they all say "Success", even though they were all probably sitting in an Inconclusive state. The max number for that outcome is 5 (max # of error/total/success tasks: 5, 10, 5). If they were all errors, it would have been the same. I'm pretty sure it will only get to 10 if the outcomes are a mixed lot of success, error and/or invalid.


Exactly --> All arse (except maybe the lone APU, which we can't confirm, so also arse)

Not one of mine, thankfully ...


Don't worry too much Jim. This is very much a 'special' situation that can result only from special circumstances, created by special snowflake people.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1810664 · Report as offensive
Jimbocous
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1810699 - Posted: 20 Aug 2016, 11:59:51 UTC - in response to Message 1810664.  
Last modified: 20 Aug 2016, 12:04:30 UTC

Don't worry too much Jim. This is very much a 'special' situation that can result only from special circumstances, created by special snowflake people.

No worries, Jason.
I've just got this one box I built (and probably should not have) that's been giving me grief. Seems every time I fire up the second 750ti, it starts crashing and blowing error WUs out all over the place. Funky box, Socket 775 with an E5450 Xeon socket 771 with the 775 mod applied. Hoping to add some decent memory today might solve its issues, though who knows why I'm throwing money at an ancient Gigabyte Tech P45 mobo at this point ... Sometimes it's not worth saving stuff from the landfill ... May make me a snowflake people, who knows ...
ID: 1810699 · Report as offensive
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1810728 - Posted: 20 Aug 2016, 14:35:07 UTC - in response to Message 1810699.  

Don't worry too much Jim. This is very much a 'special' situation that can result only from special circumstances, created by special snowflake people.

No worries, Jason.
I've just got this one box I built (and probably should not have) that's been giving me grief. Seems every time I fire up the second 750ti, it starts crashing and blowing error WUs out all over the place. Funky box, Socket 775 with an E5450 Xeon socket 771 with the 775 mod applied. Hoping to add some decent memory today might solve its issues, though who knows why I'm throwing money at an ancient Gigabyte Tech P45 mobo at this point ... Sometimes it's not worth saving stuff from the landfill ... May make me a snowflake people, who knows ...


Lol, probably the opposite, since you're actually noticing the errors.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1810728 · Report as offensive
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1810738 - Posted: 20 Aug 2016, 15:15:32 UTC - in response to Message 1810700.  

All assuming I can get my monitor to light up again.

Fiat lux!

But no sign of any new hipsters yet, and RTS continues to fall. Maybe the whole GBT processing stream is fubar'd, not just Messier?
ID: 1810738 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1810804 - Posted: 20 Aug 2016, 20:35:03 UTC

Kitties are keeping an eye on the RTS cache. Currently down to 322k from its usual 500k.
Hope it does not dive too low.

Meow?
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1810804 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1810811 - Posted: 20 Aug 2016, 21:03:45 UTC

And in 30 minutes or so, RTS has dropped from 322k to 316k.
I have sent messages to the boyz from da lab, but I have no clue if any of them is around this weekend, or what they might be able to do to shore things up a bit.
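
For a rough sense of how long the buffer would last, the two readings above can be extrapolated linearly. This is only a back-of-the-envelope sketch: the function name is made up for illustration, the "30 minutes or so" is treated as exactly half an hour, and it assumes the drain rate stays constant, which it may well not.

```python
def hours_until_empty(current: float, previous: float, interval_hours: float) -> float:
    """Linear estimate of the time left before a shrinking buffer hits zero.

    Assumes the drop observed over `interval_hours` continues unchanged.
    """
    drop_per_hour = (previous - current) / interval_hours
    if drop_per_hour <= 0:
        raise ValueError("buffer is not shrinking")
    return current / drop_per_hour


# RTS readings from this post: 322k, then 316k about half an hour later.
eta = hours_until_empty(current=316_000, previous=322_000, interval_hours=0.5)
print(f"roughly {eta:.1f} hours of work left at this rate")
```

At 6k per half hour that works out to a little over a day, so the buffer would not run dry immediately, but it would not survive a long weekend either.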
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1810811 · Report as offensive
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1810823 - Posted: 20 Aug 2016, 21:24:45 UTC - in response to Message 1810811.  

Do you think it is just a goof in the lab? I almost thought the admins were being kind and turned off the output of the guppies to make amends for the guppi inundation earlier in the week. The Arecibo work certainly is boosting everybody's RAC for the contest.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1810823 · Report as offensive
betreger
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1810827 - Posted: 20 Aug 2016, 21:32:43 UTC - in response to Message 1810823.  

The Arecibo work certainly is boosting everybody's RAC for the contest.

Not mine. I stopped crunching SETI on its GPU and put an i5 on SETI 3 weeks ago, and doubled my RAC. As nearly as I can tell, CPUs don't care.
ID: 1810827 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1810830 - Posted: 20 Aug 2016, 21:34:25 UTC - in response to Message 1810823.  
Last modified: 20 Aug 2016, 21:35:29 UTC

Do you think it is just a goof in the lab? I almost thought the admins were being kind and turned off the output of the guppies to make amends for the guppi inundation earlier in the week. The Arecibo work certainly is boosting everybody's RAC for the contest.

Earlier it appeared that the Messier files were taking up splitter time without producing valid WUs. But they are gone now, so I am not sure what the current problem might be.
Perhaps a dodgy batch of Guppi files? Dunno.

RAC not seeing much action here yet, as my crunchers are still working on the Guppies in their caches.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1810830 · Report as offensive
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1810831 - Posted: 20 Aug 2016, 21:38:30 UTC - in response to Message 1810823.  

Do you think it is just a goof in the lab? I almost thought the admins were being kind and turned off the output of the guppies to make amends for the guppi inundation earlier in the week. The Arecibo work certainly is boosting everybody's RAC for the contest.

It doesn't seem like it's intentional. The gbt splitters still appear to be running full blast but aren't adding anything to the RTS buffer. In the last 48 hours they've blown through all the MESSIER031 files, a small number of blc7 files, and are now rapidly depleting the remaining blc2 files. At the rate they're going, I would think they'll finish all the currently loaded guppi files by the end of the weekend, if not sooner, with no usable WUs to show for the effort.
ID: 1810831 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1810833 - Posted: 20 Aug 2016, 21:44:38 UTC - in response to Message 1810831.  

The current rate indicates only Arecibo tasks are being created, "Current result creation rate 17.9866/sec". It would seem the gbt splitters aren't creating any tasks, and it's been that way for a while. The rate should be above 30/sec.
ID: 1810833 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1810839 - Posted: 20 Aug 2016, 21:58:39 UTC - in response to Message 1810833.  

The current rate indicates only Arecibo tasks are being created, "Current result creation rate 17.9866/sec". It would seem the gbt splitters aren't creating any tasks, and it's been that way for a while. The rate should be above 30/sec.

Well, I sent the warning messages to da lab boyz. No response in the last hour, but it's about 3PM on a Saturday afternoon in Berkeley.
Hopefully one of them gets the message perhaps this evening and can intervene on our behalf.

Meow.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1810839 · Report as offensive
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1810854 - Posted: 20 Aug 2016, 22:53:38 UTC - in response to Message 1810831.  

Do you think it is just a goof in the lab? I almost thought the admins were being kind and turned off the output of the guppies to make amends for the guppi inundation earlier in the week. The Arecibo work certainly is boosting everybody's RAC for the contest.

It doesn't seem like it's intentional. The gbt splitters still appear to be running full blast but aren't adding anything to the RTS buffer. In the last 48 hours they've blown through all the MESSIER031 files, a small number of blc7 files, and are now rapidly depleting the remaining blc2 files. At the rate they're going, I would think they'll finish all the currently loaded guppi files by the end of the weekend, if not sooner, with no usable WUs to show for the effort.

I haven't seen a GUPPI task in almost two days on any machine, other than, I think, one lonely resend. I based my comment about better RAC on the upward bend in the contest RAC curve, which coincided with the absence of any GUPPI tasks and only Arecibo tasks on my machines. I guess it's just coincidence then.
WOW Contest charts
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1810854 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.