April Showers (Apr 09 2015)

Message boards : Technical News : April Showers (Apr 09 2015)
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1441
Credit: 276,580
RAC: 2,579
United States
Message 1663120 - Posted: 9 Apr 2015, 16:47:04 UTC

So! All the recent headaches are due to continuing issues with the master science database. While all the data seems to be intact, there's something fundamentally wrong causing Informix to keep hanging up (usually when we are continuing work on reconnecting the fragmented result tables).

During the previous recent crashes I would clean up whatever Informix was complaining about in various error messages (always the result table indexes), but this time I'm in the process of doing a comprehensive check of everything in the database just to be sure. And, in fact, I'm seeing minor problems that I've been able to clean up thus far (once again, no loss of data - just internal bookkeeping and broken index issues).

I thought this full check would be done by now (ha ha) but it's not even close. Meanwhile we *should* be able to do Astropulse work, but the software blanking engine requires the master science database to do some integrity checks, so that is all offline as well.

There are ways to speed up recovery from such events in the future, and we're working on enacting several improvements. Yes, we here are all beyond tired of our project grinding to a halt, and things will change for the better.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1663120 · Report as offensive
rob smith
Volunteer tester

Joined: 7 Mar 03
Posts: 14907
Credit: 230,007,595
RAC: 386,749
United Kingdom
Message 1663123 - Posted: 9 Apr 2015, 16:50:33 UTC

Thanks for the update Matt.



May your fingers and mood recover quickly
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1663123 · Report as offensive
kittyman
Project Donor
Volunteer tester
Joined: 9 Jul 00
Posts: 48431
Credit: 865,888,847
RAC: 201,240
United States
Message 1663125 - Posted: 9 Apr 2015, 16:51:53 UTC

Thank you for the update, Matt.
And Godspeed the recovery.

It is good that your more thorough look at the DB issues is uncovering some things not previously noted.
And good as well that some new thoughts are being considered to make the DB work better for Seti.

Best kitty juju for all your efforts being sent.........
Meow....meow.... meow......
A kitty keeps loneliness away.
More meowing, less hissing. I speak meow, do you?

Have made friends in this life.
Most were cats.
ID: 1663125 · Report as offensive
Profile CLYDE
Project Donor
Volunteer tester

Joined: 9 Aug 99
Posts: 9946
Credit: 41,850,015
RAC: 18,481
United States
Message 1663126 - Posted: 9 Apr 2015, 16:52:47 UTC

Thank you, and the others, for your hard work.

Get it done right, not fast!
ID: 1663126 · Report as offensive
Bill Butler
Joined: 26 Aug 03
Posts: 101
Credit: 3,672,557
RAC: 0
United States
Message 1663134 - Posted: 9 Apr 2015, 17:12:16 UTC

Matt,

Thank you for your continued dedication and hard work!
ID: 1663134 · Report as offensive
Profile Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 33452
Credit: 11,770,853
RAC: 8,001
Belgium
Message 1663153 - Posted: 9 Apr 2015, 17:59:18 UTC

Thanx for the update Matt.
rOZZ
Music
Pictures
ID: 1663153 · Report as offensive
Andrew
Joined: 28 Mar 15
Posts: 47
Credit: 1,053,596
RAC: 0
Canada
Message 1663155 - Posted: 9 Apr 2015, 18:00:48 UTC

Good luck. Thanks for the update .. as it snows here..
ID: 1663155 · Report as offensive
Profile ReiAyanami
Project Donor
Joined: 6 Dec 05
Posts: 113
Credit: 145,692,329
RAC: 132,432
Japan
Message 1663175 - Posted: 9 Apr 2015, 19:06:22 UTC

Thank you for the update.
This will give me a chance to clean up and reconfigure my PCs.
ID: 1663175 · Report as offensive
Profile S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 36,343,117
RAC: 23,900
Netherlands
Message 1663176 - Posted: 9 Apr 2015, 19:10:16 UTC

Best of luck! We'll be waiting patiently. Take all the time you need; better a good fix than a quick fix...
ID: 1663176 · Report as offensive
Jeremy

Joined: 4 Apr 15
Posts: 10
Credit: 1,814,001
RAC: 0
United States
Message 1663177 - Posted: 9 Apr 2015, 19:16:21 UTC - in response to Message 1663120.  

I think what he is really trying to say is that the SETI team is looking forward to May flowers!!!

Thanks for the update. I've been an on-and-off number cruncher since the project's inception. Personally, I appreciate all of the effort that the SETI research team puts into this project. It is a truly wonderful concept, and I know that many others are appreciative as well.

I wish you the best of luck on your endeavor. I hope that you have a speedy repair. We all know how frustrating software/hardware can be at times. Naturally, I want to be the person that finds the ET work unit....j/k LOL ;-)

Jeremy
ID: 1663177 · Report as offensive
Profile KC5VDJ - Jim the Enchanter
Joined: 17 May 99
Posts: 81
Credit: 4,083,597
RAC: 0
United States
Message 1663431 - Posted: 10 Apr 2015, 5:16:50 UTC
Last modified: 10 Apr 2015, 5:49:01 UTC

Are workunits being distributed?

I have had computer issues recently, and think I have them cleared up, but reinstalling BOINC and attaching to the project downloads the pictures only, without even downloading the default client.

Also, is there a way to do a mass abort on any workunits not yet returned by my account? I lost a load of them several times in the past week or so. Probably 400 work units that were untouched in queue on my system got lost due to crashes that confused BOINC. If the inability to get it going again is related to that, please abort those work units and allow me to get things going again on my end. I do believe my computer is now stable.

Thanks,

Jim
Delidded i7-4790K (CLU/CLU) at 4.7GHz @ 1.310Vcore 24/7, 32GB DDR3-2400, Corsair H100i v2, Gigabyte Z97X-Gaming G1 WIFI-BK, MSI Radeon RX 480 Gaming 4G, HX-650 PSU, Corsair 750D

ID: 1663431 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2913
Credit: 10,855,708
RAC: 371
United States
Message 1663452 - Posted: 10 Apr 2015, 6:50:22 UTC - in response to Message 1663431.  

> Are workunits being distributed?

I don't think so, no. I'm not totally sure if even "third-opinion" (when two tasks are inconclusive and get sent out again) are working presently.

> Also, is there a way to do a mass abort on any workunits not yet returned by my account? I lost a load of them several times in the past week or so.

Maybe if you do 'detach' in BOINC Manager for this project.. that might let go of them, but I think that only works for whatever you have in your cache on that machine. With 'resend lost tasks' being turned-off server-side, there's nothing else you can do about those.. they'll expire and get sent out to a third person.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1663452 · Report as offensive
Richard Haselgrove
Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 11429
Credit: 101,726,154
RAC: 75,385
United Kingdom
Message 1663465 - Posted: 10 Apr 2015, 7:19:42 UTC - in response to Message 1663452.  

> > Are workunits being distributed?
>
> I don't think so, no. I'm not totally sure if even "third-opinion" (when two tasks are inconclusive and get sent out again) are working presently.
>
> > Also, is there a way to do a mass abort on any workunits not yet returned by my account? I lost a load of them several times in the past week or so.
>
> Maybe if you do 'detach' in BOINC Manager for this project.. that might let go of them, but I think that only works for whatever you have in your cache on that machine. With 'resend lost tasks' being turned-off server-side, there's nothing else you can do about those.. they'll expire and get sent out to a third person.

Since the problem that's holding us up is with the Science database, the BOINC database should be very quiet at the moment. I wonder whether Matt might consider turning 'resend lost results' back on during the break, to flush those and any others out of the system?
ID: 1663465 · Report as offensive
Profile Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 33452
Credit: 11,770,853
RAC: 8,001
Belgium
Message 1663487 - Posted: 10 Apr 2015, 8:02:54 UTC

Welcome to the project Andrew!
rOZZ
Music
Pictures
ID: 1663487 · Report as offensive
Profile Silver Surfer

Joined: 23 Mar 01
Posts: 2
Credit: 2,535,311
RAC: 800
Netherlands
Message 1663494 - Posted: 10 Apr 2015, 8:19:53 UTC

Hi Matt,

I was wondering why I didn't get any new work. :-(

Keep up the good work!!
ID: 1663494 · Report as offensive
Profile Keith Myers
Project Donor
Volunteer tester
Joined: 29 Apr 01
Posts: 1977
Credit: 162,461,336
RAC: 347,774
United States
Message 1663633 - Posted: 10 Apr 2015, 15:32:56 UTC - in response to Message 1663465.  


> Since the problem that's holding us up is with the Science database, the BOINC database should be very quiet at the moment. I wonder whether Matt might consider turning 'resend lost results' back on during the break, to flush those and any others out of the system?


He must have, since when I turned this computer on for the day half an hour ago, I picked up 45 new GPU tasks, and I had already been out of them for over a day. Of course, half of them are already processed as I type. Cricket doesn't show much of anything going out now, though.

Cheers, Keith
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1663633 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 571
Credit: 65,751,688
RAC: 398
Finland
Message 1663673 - Posted: 10 Apr 2015, 17:00:28 UTC - in response to Message 1663633.  


> > Since the problem that's holding us up is with the Science database, the BOINC database should be very quiet at the moment. I wonder whether Matt might consider turning 'resend lost results' back on during the break, to flush those and any others out of the system?
>
> He must have since when I just turned this computer on for the day a half hour ago, I picked up 45 new GPU tasks, and I had already been out of them for over a day. Of course half of them are already processed as I type. Cricket doesn't show much of anything going out now though.


No Keith, you just got lucky and picked up tasks that someone else didn't finish.

'Resend lost tasks' applies when a user's computer has somehow lost tasks that were in progress.
ID: 1663673 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 3652
Credit: 171,835,416
RAC: 209,286
United States
Message 1663688 - Posted: 10 Apr 2015, 17:59:32 UTC - in response to Message 1663673.  
Last modified: 10 Apr 2015, 18:58:58 UTC

Yep, 'resend lost tasks' is still turned off. This host has 2 lost tasks attributed to it, the cache is empty yet there are still 2 tasks listed as in progress, http://setiathome.berkeley.edu/results.php?hostid=7258715. I've already cleared another host by detaching from the project and reattaching which resulted in 44 lost tasks being listed as Abandoned. Since the other host only has 2 lost tasks I'm going to let them just time out. Some hosts have Hundreds of lost tasks, one known host was in the thousands. It might make bookkeeping easier if these tasks were cleared from the database.

Well, it was just turned on;
Fri 10 Apr 2015 02:51:26 PM EDT | SETI@home | Sending scheduler request: To fetch work.
Fri 10 Apr 2015 02:51:26 PM EDT | SETI@home | Requesting new tasks for CPU and ATI
Fri 10 Apr 2015 02:51:26 PM EDT | SETI@home | [sched_op] CPU work request: 69120.00 seconds; 2.00 devices
Fri 10 Apr 2015 02:51:26 PM EDT | SETI@home | [sched_op] ATI work request: 34560.00 seconds; 2.00 devices
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Scheduler request completed: got 0 new tasks
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | [sched_op] Server version 705
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Didn't resend lost task 14ja12aa.5456.23675.438086664199.12.61_1 (expired)
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Didn't resend lost task 06oc12aa.21007.19703.438086664206.12.28_1 (expired)
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Project has no tasks available

Too bad they were expired and not resent..
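
For anyone who wants to pull the same information out of their own event log, here is a minimal sketch in Python. It assumes the log lines look like the ones pasted above (timestamp, project, and message separated by " | "). The regular expression and the function name `find_unsent_lost_tasks` are made up for illustration; they are inferred from these sample lines and are not part of BOINC or any documented log format.

```python
import re

# Pattern inferred from the sample lines above; this is an assumption,
# not an official BOINC log format specification.
LOST_TASK_RE = re.compile(r"Didn't resend lost task (\S+) \((\w+)\)")


def find_unsent_lost_tasks(log_text):
    """Return (task_name, reason) pairs for each "Didn't resend lost task" line."""
    found = []
    for line in log_text.splitlines():
        # Fields are assumed to be: timestamp | project | message
        parts = line.split(" | ", 2)
        if len(parts) != 3:
            continue  # skip anything that does not look like an event log line
        match = LOST_TASK_RE.search(parts[2])
        if match:
            found.append((match.group(1), match.group(2)))
    return found


# Lines copied from the event log excerpt above.
SAMPLE_LOG = """\
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | [sched_op] Server version 705
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Didn't resend lost task 14ja12aa.5456.23675.438086664199.12.61_1 (expired)
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Didn't resend lost task 06oc12aa.21007.19703.438086664206.12.28_1 (expired)
Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Project has no tasks available
"""

if __name__ == "__main__":
    for name, reason in find_unsent_lost_tasks(SAMPLE_LOG):
        print(f"{name}: {reason}")
```

Run against the excerpt, this should list the two task names together with the reason given in parentheses. It only reads log text; it does not talk to the scheduler or change anything on the host.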
ID: 1663688 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 3652
Credit: 171,835,416
RAC: 209,286
United States
Message 1663739 - Posted: 10 Apr 2015, 19:55:05 UTC - in response to Message 1663688.  

I suppose I had my preferences set wrong to have them resent, oh well.

They are being resent to this host though;
http://setiathome.berkeley.edu/results.php?hostid=7033569
ID: 1663739 · Report as offensive
Profile Donald L. Johnson
Project Donor
Joined: 5 Aug 02
Posts: 8221
Credit: 6,207,622
RAC: 7,684
United States
Message 1663785 - Posted: 10 Apr 2015, 21:04:24 UTC - in response to Message 1663688.  

> Yep, 'resend lost tasks' is still turned off. This host has 2 lost tasks attributed to it, the cache is empty yet there are still 2 tasks listed as in progress, http://setiathome.berkeley.edu/results.php?hostid=7258715. I've already cleared another host by detaching from the project and reattaching which resulted in 44 lost tasks being listed as Abandoned. Since the other host only has 2 lost tasks I'm going to let them just time out. Some hosts have Hundreds of lost tasks, one known host was in the thousands. It might make bookkeeping easier if these tasks were cleared from the database.
>
> Well, it was just turned on;
> Fri 10 Apr 2015 02:51:26 PM EDT | SETI@home | Sending scheduler request: To fetch work.
> Fri 10 Apr 2015 02:51:26 PM EDT | SETI@home | Requesting new tasks for CPU and ATI
> Fri 10 Apr 2015 02:51:26 PM EDT | SETI@home | [sched_op] CPU work request: 69120.00 seconds; 2.00 devices
> Fri 10 Apr 2015 02:51:26 PM EDT | SETI@home | [sched_op] ATI work request: 34560.00 seconds; 2.00 devices
> Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Scheduler request completed: got 0 new tasks
> Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | [sched_op] Server version 705
> Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Didn't resend lost task 14ja12aa.5456.23675.438086664199.12.61_1 (expired)
> Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Didn't resend lost task 06oc12aa.21007.19703.438086664206.12.28_1 (expired)
> Fri 10 Apr 2015 02:51:28 PM EDT | SETI@home | Project has no tasks available
>
> Too bad they were expired and not resent..

No, Resend Lost Tasks is NOT turned on. That's why those tasks were expired and sent to another cruncher. And the Resend Lost Tasks function is completely Server side, there is nothing you can do with your preferences to affect it.
Donald
Infernal Optimist / Submariner, retired
ID: 1663785 · Report as offensive