Panic Mode On (94) Server Problems?

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1631561 - Posted: 23 Jan 2015, 7:51:00 UTC

One thing is for sure: it will help free up some server space when the AP results start to get purged.
ID: 1631561
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1631592 - Posted: 23 Jan 2015, 10:23:55 UTC

Just got screwed by two errors validating with each other.
http://setiathome.berkeley.edu/workunit.php?wuid=1683786197
Grant
Darwin NT
ID: 1631592
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1631600 - Posted: 23 Jan 2015, 11:13:58 UTC - in response to Message 1631598.  
Last modified: 23 Jan 2015, 11:26:44 UTC

Just got screwed by two errors validating with each other.
http://setiathome.berkeley.edu/workunit.php?wuid=1683786197

And on top of all, that result is now added to the database, as a valid result for the signals in the WU.

I wonder how many millions of invalid results are now in the database as "real/valid" results?

The Science is compromised by all those crazy computers, filling the database with invalid results, and nobody from the project seems to care one iota about that.

And to add to it all, you haven't even received any toaster from SETI I guess....

What's interesting is that the other two computers were NV 750s running the "new" driver and using CUDA32. Previously the same scenario was seen with two NV 900s validating against each other while running CUDA32. It would appear the "new" driver doesn't like CUDA32.

I suppose it was inevitable; first it was the "new" drivers with the older NV cards and OpenCL. Now it seems the new drivers may be having problems with the old CUDA. Maybe someone should test the new NV drivers against CUDA32...
Strange results with new GPU (GTX 970)
ID: 1631600
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1631640 - Posted: 23 Jan 2015, 14:03:52 UTC - in response to Message 1631598.  

Just got screwed by two errors validating with each other.
http://setiathome.berkeley.edu/workunit.php?wuid=1683786197

And on top of all, that result is now added to the database, as a valid result for the signals in the WU.

I wonder how many millions of invalid results are now in the database as "real/valid" results?

The Science is compromised by all those crazy computers, filling the database with invalid results, and nobody from the project seems to care one iota about that.

And to add to it all, you haven't even received any toaster from SETI I guess....

Don't worry, instances of bad GPU results validating against each other and causing good CPU tasks to be flagged as invalid happen with AP as well.
http://setiathome.berkeley.edu/workunit.php?wuid=1673423338
http://setiathome.berkeley.edu/workunit.php?wuid=1659064648
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1631640
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1631644 - Posted: 23 Jan 2015, 14:18:39 UTC - in response to Message 1631640.  

Just got screwed by two errors validating with each other.
http://setiathome.berkeley.edu/workunit.php?wuid=1683786197

And on top of all, that result is now added to the database, as a valid result for the signals in the WU.

I wonder how many millions of invalid results are now in the database as "real/valid" results?

The Science is compromised by all those crazy computers, filling the database with invalid results, and nobody from the project seems to care one iota about that.

And to add to it all, you haven't even received any toaster from SETI I guess....

Don't worry, instances of bad GPU results validating against each other and causing good CPU tasks to be flagged as invalid happen with AP as well.
http://setiathome.berkeley.edu/workunit.php?wuid=1673423338
http://setiathome.berkeley.edu/workunit.php?wuid=1659064648

Well, I just checked the Validation inconclusives on my Test Mac, and all the recent inconclusives are against nVidia driver 347.09: http://setiathome.berkeley.edu/results.php?hostid=6796479&offset=0&show_names=0&state=3&appid=11

If these Bad Results start validating against my test App I will Not be a Happy Tester.
ID: 1631644
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1631659 - Posted: 23 Jan 2015, 14:55:30 UTC - in response to Message 1631644.  

Don't worry, instances of bad GPU results validating against each other and causing good CPU tasks to be flagged as invalid happen with AP as well.
http://setiathome.berkeley.edu/workunit.php?wuid=1673423338
http://setiathome.berkeley.edu/workunit.php?wuid=1659064648

Well, I just checked the Validation inconclusives on my Test Mac, and all the recent inconclusives are against nVidia driver 347.09: http://setiathome.berkeley.edu/results.php?hostid=6796479&offset=0&show_names=0&state=3&appid=11

If these Bad Results start validating against my test App I will Not be a Happy Tester.

Those two results are with 4 different drivers.
[2] NVIDIA GeForce GTX 980 (4095MB) driver: 347.25
NVIDIA GeForce GTX 750 Ti (2048MB) driver: 344.75 OpenCL: 1.1
[2] NVIDIA GeForce GTX 770 (2048MB) driver: 331.82
NVIDIA GeForce GTX 750 Ti (2048MB) driver: 347.09 OpenCL: 1.1

They all do use the 2721 app, but given I have 2 of these vs. 11,586 valid, the rate is a mere 2 / 11,586 ≈ 0.0173%.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1631659
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1631667 - Posted: 23 Jan 2015, 15:23:46 UTC - in response to Message 1631659.  
Last modified: 23 Jan 2015, 16:17:27 UTC

ID: 1631667
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1631674 - Posted: 23 Jan 2015, 15:54:48 UTC

If we don't get some more datasets loaded into the splitting queue, we shall soon be crunching nothing at all.

Meow.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1631674
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1631699 - Posted: 23 Jan 2015, 16:21:47 UTC - in response to Message 1631674.  

If we don't get some more datasets loaded into the splitting queue, we shall soon be crunching nothing at all.

Meow.

We may have to wait for the next shipment to arrive from Arecibo. We already seem to have scraped the bottom of the barrel at least twice since Matt warned us we were "Running out of workunits" on 13 Nov 2014.
ID: 1631699
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1631702 - Posted: 23 Jan 2015, 16:25:54 UTC - in response to Message 1631699.  

If we don't get some more datasets loaded into the splitting queue, we shall soon be crunching nothing at all.

Meow.

We may have to wait for the next shipment to arrive from Arecibo. We already seem to have scraped the bottom of the barrel at least twice since Matt warned us we were "Running out of workunits" on 13 Nov 2014.

Hope not, but you could be right.
Meowsigh, and Godspeed the next shipment.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1631702
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1631715 - Posted: 23 Jan 2015, 16:55:10 UTC

And the export stats haven't been updated since yesterday.
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1631715
Profile Jeff Buck Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1631726 - Posted: 23 Jan 2015, 17:22:56 UTC - in response to Message 1631531.  

I'm curious to see what the splitters will do when they get to "tape" 24se13ad. My records show that my machines already processed 287 tasks from that file just last March. They were all MB v7 tasks, however. I didn't get any APs from it. Looks like it should be the next "tape" in line, but I think I'll have to wait until morning to see what happens.

By the dawn's early light (well, actually I guess it's past 9 AM already), I see that the 24se13ad "tape" is now marked "(done)" for MB. It appears that at least some splitting was performed on it, because I see 9 files from that tape on my xw9400 this morning. I compared those file names with the 287 that I crunched last March and got the following.

New file names start with:
24se13ad.17323
24se13ad.18692
24se13ad.31616
24se13ad.32457

March, 2014, file names start with:
24se13ad.2405
24se13ad.3136
24se13ad.3682
24se13ad.10026
24se13ad.11409
24se13ad.15412
24se13ad.19064
24se13ad.19944
24se13ad.24015
24se13ad.24710
24se13ad.24872
24se13ad.28197
24se13ad.28860
24se13ad.30999

I guess that means there's no overlap, but I'm not really sure. Does each of those 2nd nodes constitute a "channel", or is that buried deeper in the file names?
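
For what it's worth, the overlap question can be double-checked by comparing just that second dot-separated field from each set of names. A minimal sketch (Python, with the prefixes above typed in by hand; purely illustrative, not anything the splitters actually run):

# Compare the second dot-separated field of the new vs. the March 2014
# 24se13ad name prefixes to see whether any of them coincide.
new_names = ["24se13ad.17323", "24se13ad.18692", "24se13ad.31616", "24se13ad.32457"]
old_names = ["24se13ad.2405", "24se13ad.3136", "24se13ad.3682", "24se13ad.10026",
             "24se13ad.11409", "24se13ad.15412", "24se13ad.19064", "24se13ad.19944",
             "24se13ad.24015", "24se13ad.24710", "24se13ad.24872", "24se13ad.28197",
             "24se13ad.28860", "24se13ad.30999"]

new_fields = {name.split(".")[1] for name in new_names}
old_fields = {name.split(".")[1] for name in old_names}

overlap = new_fields & old_fields
print("overlap:", sorted(overlap) if overlap else "none")  # prints "none" for these lists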
ID: 1631726
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1631729 - Posted: 23 Jan 2015, 17:29:55 UTC - in response to Message 1631598.  

And to add to it all, you haven't even received any toaster from SETI I guess....

Fundraising idea: SETI@home could sell these for a good mark-up, with the profits going to the project. Use the same logo as on the t-shirts, maybe without the number.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1631729
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1631735 - Posted: 23 Jan 2015, 17:35:51 UTC - in response to Message 1631729.  

I'm game...where do I sign up??
ID: 1631735
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1631741 - Posted: 23 Jan 2015, 17:46:50 UTC - in response to Message 1631715.  

And the export stats haven't been updated since yesterday.

There are several hours before today's daily stats dump is generated.
http://setiathome.berkeley.edu/stats/
The timestamp is UCB local time or UTC -8.
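
For anyone lining those dumps up with their own clock, a minimal sketch (Python; the dump time shown is made up, and daylight saving is ignored since the note above just says UTC-8):

# Convert a stats-dump timestamp from UCB local time (UTC-8) to UTC.
from datetime import datetime, timedelta, timezone

ucb = timezone(timedelta(hours=-8))                  # UCB local time taken as UTC-8
dump_time = datetime(2015, 1, 23, 3, 0, tzinfo=ucb)  # hypothetical dump timestamp
print(dump_time.astimezone(timezone.utc))            # 2015-01-23 11:00:00+00:00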
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1631741
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1631766 - Posted: 23 Jan 2015, 18:30:18 UTC - in response to Message 1631640.  

...
The Science is compromised by all those crazy computers, filling the database with invalid results, and nobody from the project seems to care one iota about that.
...

Don't worry, instances of bad GPU results validating against each other and causing good CPU tasks to be flagged as invalid happen with AP as well.
http://setiathome.berkeley.edu/workunit.php?wuid=1673423338
http://setiathome.berkeley.edu/workunit.php?wuid=1659064648

I wouldn't call those GPU results bad; IMO the WUs simply had signals so close to critical thresholding levels that quirks of the AP validator caused the CPU result invalidation.

For wuid=1673423338, the single pulse found by both GPU and CPU processing is well below the 1.01*threshold level so the Validator ignored it. The Rep. pulse found only by the GPUs is probably right at threshold, but I've downloaded the WU and will check that.

For wuid=1659064648, the single pulse found by both CPU and GPU is very close to the 1.01*threshold level (87.85995151 for scale 5). The CPU app was made before we added enough digits to the peak_power field in stderr to be sure whether its reported peak_power=87.86 is above or below that critical level, but the GPU peak_power=87.8589 is just below, so the Validator ignored that GPU signal. The fact that it declared the CPU result invalid indicates its peak_power was actually above the critical level, so the Validator was comparing 1 signal from the CPU result to 0 signals from the GPUs.
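
To make that near-threshold comparison concrete, a minimal sketch (Python; the numbers are the ones quoted above, and the real Validator logic is of course more involved than a single comparison):

# Near-threshold comparison as described above (illustrative only).
critical = 87.85995151            # 1.01 * threshold for scale 5, per the figures above

gpu_peak = 87.8589                # GPU stderr value: just below critical, so that signal is ignored
cpu_reported = 87.86              # CPU stderr carries only two decimals

print(gpu_peak >= critical)       # False
print(cpu_reported >= critical)   # True, but any true peak_power in [87.855, 87.865) rounds
                                  # to 87.86, so the two-decimal figure alone cannot show which
                                  # side of the critical level the CPU's actual value fell on.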

A proposed change to the AP Validator would have given the wuid=1673423338 CPU result credit on a "weakly similar" basis (and the accompanying "valid" indication). For wuid=1659064648 the initial validation would have come out "strongly similar" and the CPU result would have become canonical. I hope the server issues will subside enough so that change can move up the todo list.
                                                                  Joe
ID: 1631766
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1631817 - Posted: 23 Jan 2015, 19:53:57 UTC - in response to Message 1631766.  

...
The Science is compromised by all those crazy computers, filling the database with invalid results, and nobody from the project seems to care one iota about that.
...

Don't worry, instances of bad GPU results validating against each other and causing good CPU tasks to be flagged as invalid happen with AP as well.
http://setiathome.berkeley.edu/workunit.php?wuid=1673423338
http://setiathome.berkeley.edu/workunit.php?wuid=1659064648

I wouldn't call those GPU results bad; IMO the WUs simply had signals so close to critical thresholding levels that quirks of the AP validator caused the CPU result invalidation.

For wuid=1673423338, the single pulse found by both GPU and CPU processing is well below the 1.01*threshold level so the Validator ignored it. The Rep. pulse found only by the GPUs is probably right at threshold, but I've downloaded the WU and will check that.

For wuid=1659064648, the single pulse found by both CPU and GPU is very close to the 1.01*threshold level (87.85995151 for scale 5). The CPU app was made before we added enough digits to the peak_power field in stderr to be sure whether its reported peak_power=87.86 is above or below that critical level, but the GPU peak_power=87.8589 is just below, so the Validator ignored that GPU signal. The fact that it declared the CPU result invalid indicates its peak_power was actually above the critical level, so the Validator was comparing 1 signal from the CPU result to 0 signals from the GPUs.

A proposed change to the AP Validator would have given the wuid=1673423338 CPU result credit on a "weakly similar" basis (and the accompanying "valid" indication). For wuid=1659064648 the initial validation would have come out "strongly similar" and the CPU result would have become canonical. I hope the server issues will subside enough so that change can move up the todo list.
                                                                  Joe

A threshold issue sounds much better than one of the alternatives.
I'm sure they want the server issues to be resolved more than we do.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1631817
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1631853 - Posted: 23 Jan 2015, 20:48:41 UTC - in response to Message 1631076.  

Well there goes my 'consecutive valid tasks' count...

Yeah, well, now you're stuck with one of those "guaranteed to fail" MB WUs, too. Join the crowd! :^)

And that WU has moved on to the last person to give it a whirl. So far, the previous stock CPU app's result has been marked as invalid and mine is now inconclusive, pending the last person's result, which will definitely mark mine as invalid, and probably theirs as well. Unless they get lucky and get "can't validate", which, from past experience, won't reset their consecutive valid count. Lucky them.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1631853
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1631926 - Posted: 23 Jan 2015, 22:56:58 UTC - in response to Message 1631920.  

This is going to be interesting. AP assimilators are finally running.

Yay! Hopefully, nearly 10 TiB of disk space can soon be reclaimed once those WUs get purged.

And I would have to imagine it would help the database be a bit more responsive and efficient, as well.

We'll see how it goes.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1631926
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1631937 - Posted: 23 Jan 2015, 23:12:33 UTC - in response to Message 1631920.  

This is going to be interesting. AP assimilators are finally running.

Some are, some aren't.
None of the AP splitters are running though.
Grant
Darwin NT
ID: 1631937