Missing Results?

Message boards : Number crunching : Missing Results?

Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1711655 - Posted: 12 Aug 2015, 10:08:31 UTC

Hi,
I cannot seem to find any results, validated or pending, for the day of the outage (11 Aug UTC) and maybe half a day before.
Has anyone noticed the same?
ID: 1711655
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1711658 - Posted: 12 Aug 2015, 10:23:27 UTC - in response to Message 1711655.  

All mine look normal, except of course there's a big gap between ~15:00 UTC (early start to the outage) and the early hours of 12 August (late finish).

There are problems with the workunit pages for both MB and AP, as we've discussed elsewhere, but not with the result pages. Unless you have information from your logs that suggests otherwise?
ID: 1711658
Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1711659 - Posted: 12 Aug 2015, 10:29:30 UTC

I have no results that seem to have been received by the server from about 5pm on 10 Aug to the end of 11 Aug (UTC).
I am not 100% certain, but my computer must have turned in results during that time.
I noticed a sharp drop in my RAC before the outage.
ID: 1711659
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1711665 - Posted: 12 Aug 2015, 10:51:48 UTC - in response to Message 1711659.  

I'd suggest that you check things at your end. Looking in particular at your Computer 7593028, you have loads of work available to work on, but you've returned very little of it in the last two or three weeks - 25 of the tasks passed their deadlines without being processed.

That looks from here more like the computer wasn't running BOINC, or wasn't switched on, even though you may have thought it was. Although the project has had one or two problems, there's been nothing that could explain a gap as long as that - and your computer should have been able to continue processing the work it had available, even if the project was down.
ID: 1711665
Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1711670 - Posted: 12 Aug 2015, 11:08:14 UTC
Last modified: 12 Aug 2015, 11:09:06 UTC

Thanks.
The timed-out ones were lost units from a few weeks ago.
Maybe it is nothing, but my computer was working all the time.
I have just never had a whole day where nothing was reported (while running).

Working the time back from the first results reported on 12 Aug UTC,
there should have been a few results arriving at the server before the outage.

Thanks for looking into that. Maybe I am just seeing things (or not seeing, in this case).
ID: 1711670
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22202
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1711677 - Posted: 12 Aug 2015, 11:40:49 UTC

Don't forget that any result that has been validated will vanish from your list after 24 hours. So it is possible that you have returned results where your wingman was "waiting" for you, in which case validation would be "instant", and then 24 hours later they don't show...
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1711677
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1711686 - Posted: 12 Aug 2015, 12:01:00 UTC - in response to Message 1711677.  

Don't forget that any result that has been validated will vanish from your list after 24 hours. So it is possible that you have returned results where your wingman was "waiting" for you, in which case validation would be "instant", and then 24 hours later they don't show...

That's certainly the case for Rasputin, because he's currently working on tasks issued on 31 July 2015, before we reached task 2^32. It's even possible that, in their efforts to clean up the database last night, the crew ran an extra DB purge which removed validated tasks earlier than the normal 24-hour grace period.

For the rest of us, working on tasks issued after task 2^32 (2 Aug 2015, 5:44:59 UTC), the normal 'disappears 24 hours after validation' rule doesn't currently apply.
ID: 1711686
Zombu2
Volunteer tester

Joined: 24 Feb 01
Posts: 1615
Credit: 49,315,423
RAC: 0
United States
Message 1711687 - Posted: 12 Aug 2015, 12:11:27 UTC

Hmm, it's weird. I am cranking out WUs, but they do not show up as pending or validated; they disappear into thin air.
I came down with a bad case of i don't give a crap
ID: 1711687
Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1711688 - Posted: 12 Aug 2015, 12:19:49 UTC - in response to Message 1711686.  

the normal 'disappears 24 hours after validation' rule doesn't currently apply.


What does, instead?
ID: 1711688
HAL9000
Volunteer tester

Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1711698 - Posted: 12 Aug 2015, 13:07:54 UTC - in response to Message 1711686.  

Don't forget that any result that has been validated will vanish from your list after 24 hours. So it is possible that you have returned results where your wingman was "waiting" for you, in which case validation would be "instant", and then 24 hours later they don't show...

That's certainly the case for Rasputin, because he's currently working on tasks issued on 31 July 2015, before we reached task 2^32. It's even possible that, in their efforts to clean up the database last night, the crew ran an extra DB purge which removed validated tasks earlier than the normal 24-hour grace period.

For the rest of us, working on tasks issued after task 2^32 (2 Aug 2015, 5:44:59 UTC), the normal 'disappears 24 hours after validation' rule doesn't currently apply.

I have several, like this WU, that were issued and completed just before maintenance started, so I'm not sure whether they ran an early purge.
Since the WU will be purged soon, here are the timestamps of the two results:
Sent: 11 Aug 2015, 10:41:43 UTC, Reported: 11 Aug 2015, 14:41:22 UTC
Sent: 11 Aug 2015, 10:41:46 UTC, Reported: 11 Aug 2015, 12:18:40 UTC
Perhaps that one didn't make it through the validator stage before maintenance started, but I have dozens that were reported and validated prior to maintenance. The oldest WU I can see with both results now goes back to 11 Aug 2015, 13:13:01 UTC, as the rest have started the delete/purge process.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1711698
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1711701 - Posted: 12 Aug 2015, 13:18:03 UTC - in response to Message 1711688.  

the normal 'disappears 24 hours after validation' rule doesn't currently apply.

What does, instead?

Certainly up until yesterday, tasks with TaskID >= 2^32 weren't being purged at all: they were still visible in the database 10 days after validation. That's related to the problem of 12 million (and rising) undeleted result files shown on the server status page.

Certainly, if I were a server admin, my first priority would be to fix the purge/deleter process so it dealt with 64-bit numbers, so that the problem doesn't get any worse. Maybe they reduced the purge delay, so that they could see whether it was working for new validations as soon as they get into the lab today, rather than waiting 24 hours until practically the end of the working day today. That might account for Zombu's observation.

Then they would have capped the problem, and they would work on the backlog at leisure, and make sure they do it right.
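
A minimal sketch of why IDs at or above 2^32 could trip up a process that still handles them as 32-bit numbers. The thread does not say how the purge/deleter actually stores IDs, so the 32-bit masking below is only an assumption for illustration, and the helper names are hypothetical.

```python
# Hypothetical illustration: what happens to a task ID if it is squeezed
# into a 32-bit unsigned field. This is NOT the project's actual code.

TASK_ID_BOUNDARY = 2 ** 32  # 4294967296, the "task 2^32" mentioned in the thread


def truncate_to_uint32(task_id: int) -> int:
    """Keep only the low 32 bits, as a 32-bit unsigned variable would."""
    return task_id & 0xFFFFFFFF


def survives_32bit_storage(task_id: int) -> bool:
    """True if the ID is unchanged after being stored in 32 bits."""
    return truncate_to_uint32(task_id) == task_id


last_32bit_id = TASK_ID_BOUNDARY - 1  # 4294967295
recent_task_id = 4310560648           # a task ID quoted later in the thread

print(survives_32bit_storage(last_32bit_id))   # True
print(survives_32bit_storage(recent_task_id))  # False
print(truncate_to_uint32(recent_task_id))      # 15593352
```

Under that assumption, a lookup for task 4310560648 would go looking for 15593352 instead, which would fit the symptom of newer results never being matched and purged.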
ID: 1711701
Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1711704 - Posted: 12 Aug 2015, 13:24:30 UTC

Thanks, Guys!
I just wanted to know if there was something odd going on.
ID: 1711704
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1711709 - Posted: 12 Aug 2015, 13:34:44 UTC - in response to Message 1711698.  

Watching

Workunit 1865970821

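Task 4310560648 (computer 6910484): Sent: 11 Aug 2015, 12:14:01 UTC; Reported: 11 Aug 2015, 14:34:39 UTC; Status: Completed and validated; Run time: 3,246.83 s; CPU time: 3,223.07 s; Credit: 489.70; Application: AstroPulse v7, Anonymous platform (NVIDIA GPU)
Task 4310560649 (computer 6946318): Sent: 11 Aug 2015, 12:14:01 UTC; Reported: 11 Aug 2015, 13:36:56 UTC; Status: Completed and validated; Run time: 2,575.00 s; CPU time: 436.38 s; Credit: 489.70; Application: AstroPulse v7 v7.07 (opencl_nvidia_mac)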

According to theory, the whole caboodle (WU and results) should be purged in about an hour...
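
The "about an hour" estimate can be checked with a rough sketch. It assumes validation happened when the later of the two results was reported, and uses the 24-hour delay described earlier in the thread; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The 24-hour grace period after validation, as described in the thread.
PURGE_DELAY = timedelta(hours=24)


def expected_purge_time(reported_times):
    """Assumption: the workunit validates when its last result is reported."""
    validated_at = max(reported_times)
    return validated_at + PURGE_DELAY


# "Reported" timestamps of the two results for workunit 1865970821.
reported = [
    datetime(2015, 8, 11, 14, 34, 39, tzinfo=timezone.utc),
    datetime(2015, 8, 11, 13, 36, 56, tzinfo=timezone.utc),
]
# Time this message was posted.
posted_at = datetime(2015, 8, 12, 13, 34, 44, tzinfo=timezone.utc)

purge_at = expected_purge_time(reported)
remaining = purge_at - posted_at

print(purge_at.isoformat())  # 2015-08-12T14:34:39+00:00
print(remaining)             # 0:59:55
```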
ID: 1711709
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1711734 - Posted: 12 Aug 2015, 14:46:37 UTC - in response to Message 1711709.  

No such luck. Task 4310560648 is still in the database, but the associated WU 1865970821 was purged after 24 hours as usual.

So the pile of un-purged results is going to get worse.
ID: 1711734
arkayn
Volunteer tester

Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1711735 - Posted: 12 Aug 2015, 14:47:05 UTC - in response to Message 1711709.  
Last modified: 12 Aug 2015, 14:47:47 UTC

Watching

Workunit 1865970821

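Task 4310560648 (computer 6910484): Sent: 11 Aug 2015, 12:14:01 UTC; Reported: 11 Aug 2015, 14:34:39 UTC; Status: Completed and validated; Run time: 3,246.83 s; CPU time: 3,223.07 s; Credit: 489.70; Application: AstroPulse v7, Anonymous platform (NVIDIA GPU)
Task 4310560649 (computer 6946318): Sent: 11 Aug 2015, 12:14:01 UTC; Reported: 11 Aug 2015, 13:36:56 UTC; Status: Completed and validated; Run time: 2,575.00 s; CPU time: 436.38 s; Credit: 489.70; Application: AstroPulse v7 v7.07 (opencl_nvidia_mac)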

According to theory, the whole caboodle (WU and results) should be purged in about an hour...


And work unit gone.

ID: 1711735
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1711745 - Posted: 12 Aug 2015, 15:00:16 UTC - in response to Message 1711735.  

And work unit gone.

Leaving both results orphaned.
ID: 1711745
HAL9000
Volunteer tester

Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1711760 - Posted: 12 Aug 2015, 15:30:57 UTC - in response to Message 1711734.  

No such luck. Task 4310560648 is still in the database, but the associated WU 1865970821 was purged after 24 hours as usual.

So the pile of un-purged results is going to get worse.

Indeed, every task from 4294967296 (2^32) onward is still present in the system, although its associated workunit has been purged, while tasks from 4294967295 down to 4294967250 and earlier have been purged.

The newest task I have is 4312462714. So that gives us at least 17,495,418 tasks that will be orphaned. So the current total of 12,714,491 Results waiting for deletion doesn't even account for all of the new work generated.
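
A quick check of that figure, as a sketch: it simply subtracts the 2^32 boundary from the newest task ID quoted above. The variable names are made up for the example.

```python
# First task ID that no longer fits in 32 bits ("task 2^32" in the thread).
FIRST_AFFECTED_TASK_ID = 2 ** 32  # 4294967296

# Newest task ID quoted in the message above.
newest_task_id = 4312462714

# Number of task IDs issued between the boundary and the newest task seen.
orphan_candidates = newest_task_id - FIRST_AFFECTED_TASK_ID

print(f"{orphan_candidates:,}")  # 17,495,418
```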
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1711760
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1711769 - Posted: 12 Aug 2015, 15:41:48 UTC - in response to Message 1711760.  

No such luck. Task 4310560648 is still in the database, but the associated WU 1865970821 was purged after 24 hours as usual.

So the pile of un-purged results is going to get worse.

Indeed, every task from 4294967296 (2^32) onward is still present in the system, although its associated workunit has been purged, while tasks from 4294967295 down to 4294967250 and earlier have been purged.

The newest task I have is 4312462714. So that gives us at least 17,495,418 tasks that will be orphaned. So the current total of 12,714,491 Results waiting for deletion doesn't even account for all of the new work generated.

And, of course, the BOINC database is shared between MB and AP, so recent AP WUs (like the mine-canary I picked) have IDs in the vulnerable range. At least the AP heap started smaller, and grows more slowly - but it's nonetheless a heap, and it looks like we'll be adding to the MB heap again soon.
ID: 1711769

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.