Hope not to see anymore of these!

Clyde C. Phillips, III
Joined: 2 Aug 00
Posts: 1851
Credit: 5,955,047
RAC: 0
United States
Message 627196 - Posted: 26 Aug 2007, 18:01:28 UTC

application SETI@home Enhanced
created 16 Aug 2007 11:58:33 UTC
name 05mr07aa.15859.24612.16.4.20
minimum quorum 2
initial replication 2
max # of error/total/success results 5, 10, 5
Result ID | Computer | Sent                     | Time reported or deadline | Server state | Outcome      | Client state  | CPU time (sec) | Claimed credit | Granted credit
599793947 | ---      | ---                      | ---                       | Unsent       | Unknown      | New           | ---            | ---            | ---
591705657 | 3041784  | 16 Aug 2007 11:58:40 UTC | 26 Aug 2007 10:30:25 UTC  | Over         | Client error | Compute error | 345,201.03     | 0.13           | ---
591705658 | 3328573  | 16 Aug 2007 11:58:40 UTC | 19 Aug 2007 14:24:03 UTC  | Over         | Client error | Done          | 49.00          | 0.02           | ---
594589211 | 2398546  | 19 Aug 2007 14:32:26 UTC | 23 Aug 2007 15:50:07 UTC  | Over         | Success      | Done          | 21,276.91      | 0.68           | pending


Mine was the 21,000-second one, probably the one I was looking for. I have sympathy for the unfortunate cruncher who lost four days, and envy the one who skirted the problem in only 49 seconds.
ID: 627196 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 627266 - Posted: 26 Aug 2007, 19:40:05 UTC - in response to Message 627196.  

application SETI@home Enhanced
created 16 Aug 2007 11:58:33 UTC
name 05mr07aa.15859.24612.16.4.20
minimum quorum 2
initial replication 2
max # of error/total/success results 5, 10, 5
Result ID | Computer | Sent                     | Time reported or deadline | Server state | Outcome      | Client state  | CPU time (sec) | Claimed credit | Granted credit
599793947 | ---      | ---                      | ---                       | Unsent       | Unknown      | New           | ---            | ---            | ---
591705657 | 3041784  | 16 Aug 2007 11:58:40 UTC | 26 Aug 2007 10:30:25 UTC  | Over         | Client error | Compute error | 345,201.03     | 0.13           | ---
591705658 | 3328573  | 16 Aug 2007 11:58:40 UTC | 19 Aug 2007 14:24:03 UTC  | Over         | Client error | Done          | 49.00          | 0.02           | ---
594589211 | 2398546  | 19 Aug 2007 14:32:26 UTC | 23 Aug 2007 15:50:07 UTC  | Over         | Success      | Done          | 21,276.91      | 0.68           | pending


Mine was the 21,000-second one, probably the one I was looking for. I have sympathy for the unfortunate cruncher who lost four days, and envy the one who skirted the problem in only 49 seconds.

There are probably 10000 of those with the negative triplet threshold still going the rounds until they get two successes or more than 5 errors. The long execution does reduce the load on the servers, but that's the only positive aspect.
                                                             Joe
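
As a rough illustration of why these results keep circulating, the stopping rule described above can be sketched in a few lines of Python, using the limits from the workunit page (minimum quorum 2, at most 5 error results). The function name and the simplified logic are assumptions made for illustration; this is not BOINC's actual server code, and it ignores whether the successful results agree with each other.

```python
def workunit_status(successes, errors, min_quorum=2, max_errors=5):
    """Simplified reading of the limits shown on the workunit page."""
    if successes >= min_quorum:
        return "done"             # enough successful results to attempt validation
    if errors > max_errors:
        return "too many errors"  # the workunit is given up on
    return "resend"               # another copy goes out to a new host


# The workunit quoted above: one success and two client errors so far.
print(workunit_status(successes=1, errors=2))  # resend
```

Under this reading, every aborted or failed copy only adds to the error count, so a workunit can go round several times before it is retired.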
ID: 627266 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 627314 - Posted: 26 Aug 2007, 20:25:53 UTC - in response to Message 627266.  

There are probably 10000 of those with the negative triplet threshold still going the rounds until they get two successes or more than 5 errors. The long execution does reduce the load on the servers, but that's the only positive aspect.
                                                             Joe

I've been doing some thinking, and researching, on that subject since MadMac posted about his re-send a little earlier.

I've been going through the 'Work Unit problem' thread looking at the results of the people who reported receiving "Splitsville Evercrunch Specials", and the vast majority of the WUs reported are still in the database. You, Joe, are an honourable exception, because three of the four you performed surgery on have been purged - but WU 148063178 is still there on a 16-day deadline cycle, so it could be with us for a couple of months.

The earliest initial send I could find was RID 590516006 at 14 Aug 2007 15:53:07 UTC, and the last I've got logged is RID 591766228 at 16 Aug 2007 13:12:32 UTC (though that's several hours before Matt first posted that he'd been made aware of the problem, and the early hours of the morning PDT: I suspect there may be later ones I haven't come across yet). Even so, subtracting those two RIDs gives a minimum estimate of 1250222 results split while the faulty splitters were in use. Taking Matt's figure of 2.5%, that's well over 30,000 "Evercrunch Specials" created.

How many of them are still in the database? Looking at the 42 WUs in my sample, I'd say nearer a half than a third. And just for fun, those 42 WUs have already consumed a total of 131.15 days of reported processor time: the record stands at 412,035.67 seconds (unless anyone knows better.....) - it errored with a 'Maximum CPU time exceeded'. RID 590730256.
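
The arithmetic behind that estimate can be checked with a few lines of Python. This is only a sketch of the back-of-envelope calculation in the post: the variable names are made up for illustration, and treating the RID difference as the number of results split assumes that RIDs are issued sequentially.

```python
first_rid = 590516006     # earliest initial send found (14 Aug 2007)
last_rid = 591766228      # latest one logged (16 Aug 2007)
faulty_fraction = 0.025   # Matt's 2.5% figure

results_split = last_rid - first_rid
estimated_specials = results_split * faulty_fraction

sample_wus = 42             # WUs in the sample
sample_cpu_days = 131.15    # reported processor time for the sample
record_seconds = 412035.67  # longest single result so far

print(results_split)                            # 1250222
print(round(estimated_specials))                # 31256
print(round(sample_cpu_days / sample_wus, 2))   # 3.12 days per WU on average
print(round(record_seconds / 86400, 2))         # 4.77 days for the record-holder
```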

Your outstanding result, and the record-holder, are unusual with 16-day deadlines: 9 of my 42 are like that. The majority are on 8-day deadlines, and will keep churning round on the weekly treadmill until someone, somewhere, bites the bullet. The 16-day ones will be a longer, slower agony, of course.

What's to be done? I'm sure Eric will ask, in the first staff meeting after he gets back, how the transition to Multibeam went. I'm sure he'll get an honest answer. There'll be a lot of other news to pass on as well. But after that, the agenda will move on to Future Plans: priorities for the next day/week/month. I hope that scripting a purge for these pesky blighters will at least get considered as part of that priority setting. If it gets considered, but rejected, then I'm happy. If they don't even think about it, I'll be cross.
ID: 627314 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 627334 - Posted: 26 Aug 2007, 20:54:43 UTC - in response to Message 627314.  
Last modified: 26 Aug 2007, 21:00:03 UTC

There are probably 10000 of those with the negative triplet threshold still going the rounds until they get two successes or more than 5 errors. The long execution does reduce the load on the servers, but that's the only positive aspect.
                                                             Joe

I've been doing some thinking, and researching, on that subject since MadMac posted about his re-send a little earlier.

I've been going through the 'Work Unit problem' thread looking at the results of the people who reported receiving "Splitsville Evercrunch Specials", and the vast majority of the WUs reported are still in the database. You, Joe, are an honourable exception, because three of the four you performed surgery on have been purged - but WU 148063178 is still there on a 16-day deadline cycle, so it could be with us for a couple of months.

The earliest initial send I could find was RID 590516006 at 14 Aug 2007 15:53:07 UTC, and the last I've got logged is RID 591766228 at 16 Aug 2007 13:12:32 UTC (though that's several hours before Matt first posted that he'd been made aware of the problem, and the early hours of the morning PDT: I suspect there may be later ones I haven't come across yet). Even so, subtracting those two RIDs gives a minimum estimate of 1250222 results split while the faulty splitters were in use. Taking Matt's figure of 2.5%, that's well over 30,000 "Evercrunch Specials" created.

How many of them are still in the database? Looking at the 42 WUs in my sample, I'd say nearer a half than a third. And just for fun, those 42 WUs have already consumed a total of 131.15 days of reported processor time: the record stands at 412,035.67 seconds (unless anyone knows better.....) - it errored with a 'Maximum CPU time exceeded'. RID 590730256.

Your outstanding result, and the record-holder, are unusual with 16-day deadlines: 9 of my 42 are like that. The majority are on 8-day deadlines, and will keep churning round on the weekly treadmill until someone, somewhere, bites the bullet. The 16-day ones will be a longer, slower agony, of course.

What's to be done? I'm sure Eric will ask, in the first staff meeting after he gets back, how the transition to Multibeam went. I'm sure he'll get an honest answer. There'll be a lot of other news to pass on as well. But after that, the agenda will move on to Future Plans: priorities for the next day/week/month. I hope that scripting a purge for these pesky blighters will at least get considered as part of that priority setting. If it gets considered, but rejected, then I'm happy. If they don't even think about it, I'll be cross.


LOL...I love your nomenclature of 'Splitsville Evercrunch Specials'...heck, if that's all the servers handed out, the server traffic problem would be solved at least.
'Course it would raise heck with the RACs (like MB hasn't already) and slow the science end of things down.

EDIT...For my part, I have just bitten the bullet and let them crunch to whatever completion they may arrive at. No sense easing my pain just to pass it along to somebody else.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 627334 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 627686 - Posted: 27 Aug 2007, 8:55:37 UTC

Arrrrgggghhh! Just found another of the little blighters in my own cache: 148048195

I aborted this one, to bump the 'failed' count instead of the 'success' count.

Bump those figures to 43 WUs, 138.25 days.
ID: 627686 · Report as offensive
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 627759 - Posted: 27 Aug 2007, 13:18:46 UTC
Last modified: 27 Aug 2007, 13:41:49 UTC

Here's another one for the list:

147603418

Unfortunately, my host is wandering at the moment, so I can't do anything about it. Worse, I saw there was another T2400 which ran it, and it took it almost a day to beat its way through it, yippee! ;-)

The only good news was it looks like mine has it at bat right now, and when it finishes (or the other current wingman) it's the end of the road for this clinker. :-)

<edit> Sidebar: Richard, I saw some folks mention they were getting DL failures again, and the Cricket Graphs have been squirrely. Have you poked around with the SYN RTT again today? I'll watch the 'Gone in 21 seconds' thread.

Alinator
ID: 627759 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 627762 - Posted: 27 Aug 2007, 13:37:41 UTC - in response to Message 627759.  

Here's another one for the list:

147603418

Unfortunately, my host is wandering at the moment, so I can't do anything about it. Worse, I saw there was another T2400 which ran it, and it took it almost a day to beat its way through it, yippee! ;-)

The only good news was it looks like mine has it at bat right now, and when it finishes (or the other current wingman) it's the end of the road for this clinker. :-)

Added. 46 WUs, 155.19 days

Does anyone know how Eric's spam filter (and indeed, Eric himself) reacts to unsolicited Excel files? I might be tempted to send him a 'welcome back' present.
ID: 627762 · Report as offensive
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 19064
Credit: 40,757,560
RAC: 67
United Kingdom
Message 627779 - Posted: 27 Aug 2007, 14:13:20 UTC - in response to Message 627762.  

Here's another one for the list:

147603418

Unfortunately, my host is wandering at the moment, so I can't do anything about it. Worse, I saw there was another T2400 which ran it, and it took it almost a day to beat its way through it, yippee! ;-)

The only good news was it looks like mine has it at bat right now, and when it finishes (or the other current wingman) it's the end of the road for this clinker. :-)

Added. 46 WUs, 155.19 days

Does anyone know how Eric's spam filter (and indeed, Eric himself) reacts to unsolicited Excel files? I might be tempted to send him a 'welcome back' present.

Save as csv, and send.
ID: 627779 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 627792 - Posted: 27 Aug 2007, 14:28:22 UTC - in response to Message 627779.  

Save as csv, and send.

The trouble is, that strips out the colour coding, calculation formulae and - crucially in this case - hyperlinks.

I would have zipped, of course - are you referring to problems getting it through an 'active content' filter, or non-availability of M$ Office programs in the lab?
ID: 627792 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 627800 - Posted: 27 Aug 2007, 14:38:53 UTC

I've sent .xls to him via the email addy he uses for Boinc alpha. Don't know if he opened them though. He never said so, one way or the other. I can open .xls with openoffice (linux), and the formatting is only a little off, so I think it might work for him. Wouldn't kill you to try one way or another.
ID: 627800 · Report as offensive
gomeyer
Volunteer tester
Joined: 21 May 99
Posts: 488
Credit: 50,370,425
RAC: 0
United States
Message 627831 - Posted: 27 Aug 2007, 15:53:38 UTC - in response to Message 627686.  

Arrrrgggghhh! Just found another of the little blighters in my own cache: 148048195
I aborted this one, to bump the 'failed' count instead of the 'success' count.
Bump those figures to 43 WUs, 138.25 days.

Darn, I should have thought of this a week ago. For Windows users,
- Create a batch file as follows and put it on your desktop. I named mine "find_triplet.bat" but that's obviously up to you.
find /I "<triplet_thresh>-" "C:\Program Files\BOINC\projects\setiathome.berkeley.edu\*.*" > search_result.txt
pause

- This can take a minute to run depending on your queue. Be patient.
- Running the batch file will create a file called "search_result.txt" on your desktop. Open the text file with Notepad and do a search for "triplet"
- Any WU with a negative <triplet_thresh> will have the triplet value displayed on a line immediately below the file name. Either abort it or "fix" it with Joe's "99" trick. If there are no <triplet_thresh>'s in the text file then you are clean for now.
- Note, you may have to vary the path depending on your installation.
- I used the "pause" so I could see that it runs without errors, this can be removed if you wish.
- You could probably get away with running this with BOINC active, but if the bat file happens to have a result open when the CC tries to write to it that might not be good, so I'm going to kill BOINC while it's running.
- As long as you maintain at least a day's worth of work, running this once a day should find most if not all of these WU's before they start.

I just ran this on all of my machines and found 3 more of these WU's
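
For anyone not on Windows, or who prefers a script, the same search can be sketched in Python. This is a minimal sketch under stated assumptions, not a tested BOINC tool: the directory is the default path from the batch file above, the function name is made up for illustration, and the search string is the one the batch file uses.

```python
from pathlib import Path

# Assumed default install path from the post; adjust for your installation.
PROJECT_DIR = Path(r"C:\Program Files\BOINC\projects\setiathome.berkeley.edu")
MARKER = "<triplet_thresh>-"   # same search string as the batch file


def find_negative_triplet(project_dir):
    """Return (file name, matching line) pairs for files containing MARKER."""
    hits = []
    for path in sorted(Path(project_dir).iterdir()):
        if not path.is_file():
            continue
        # latin-1 maps every byte to a character, so any non-text data in a
        # file cannot raise a decode error while we scan it.
        with open(path, "r", encoding="latin-1") as handle:
            for line in handle:
                # Lower-casing the line mimics the /I (ignore case) switch.
                if MARKER in line.lower():
                    hits.append((path.name, line.strip()))
                    break
    return hits


if __name__ == "__main__":
    if PROJECT_DIR.is_dir():
        for name, line in find_negative_triplet(PROJECT_DIR):
            print(name, line)
    else:
        print("Project directory not found:", PROJECT_DIR)
```

As with the batch file, it only reads the files, but the same caution about running it while BOINC is active applies.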
ID: 627831 · Report as offensive
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 627841 - Posted: 27 Aug 2007, 16:15:51 UTC

As a novice, what is a batch file and how do I set about doing one?
ID: 627841 · Report as offensive
gomeyer
Volunteer tester
Joined: 21 May 99
Posts: 488
Credit: 50,370,425
RAC: 0
United States
Message 627844 - Posted: 27 Aug 2007, 16:21:34 UTC - in response to Message 627841.  
Last modified: 27 Aug 2007, 16:22:27 UTC

As a novice what is a batch file and how do I set about doing one.

Use Notepad to create a new text file and give it a name with a .bat extension. This must be Notepad, not Wordpad or Word, since it has to be a pure text file. Then just copy/paste the two command lines from my earlier post (the "find" line and the "pause" line, shown there in red) into the file and save it to your desktop. Double-click the file to make it run. If you need to edit the file after it's been created you can right-click it and choose "Edit" from the context menu.

EDIT - BTW, a bat file is an old DOS filetype that runs DOS commands.
ID: 627844 · Report as offensive
Jim-R.
Volunteer tester
Joined: 7 Feb 06
Posts: 1494
Credit: 194,148
RAC: 0
United States
Message 627845 - Posted: 27 Aug 2007, 16:29:05 UTC - in response to Message 627841.  

As a novice what is a batch file and how do I set about doing one.

A "batch file" is a simple text file which ends in the .bat extension which is recognized by a DOS/Windows computer as an executable file. It is useful for running a single command with the same set of command line options every time, or for running a series of commands.

In his batch file he is using the "find" command with the /I option to search for the string <triplet_thresh>- in all of the files in the SETI project directory and output the value using the "redirect" symbol ">" to a file called "search_result.txt". Then he has a "pause" to keep the batch file from exiting until you hit a key.

find /I "<triplet_thresh>-" "C:\Program Files\BOINC\projects\setiathome.berkeley.edu\*.*" > search_result.txt
pause

You create a batch file with any text editor (notepad, etc.). It MUST end in .bat before it can be executed. The actual "name" of the file doesn't matter, just the .bat extension.
Jim

Some people plan their life out and look back at the wealth they've had.
Others live life day by day and look back at the wealth of experiences and enjoyment they've had.
ID: 627845 · Report as offensive
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 627846 - Posted: 27 Aug 2007, 16:32:36 UTC - in response to Message 627841.  

As a novice what is a batch file and how do I set about doing one.


A "batch" file is a sequentially ordered ASCII file that usually ends in ".bat" (but can end in .CMD for some command processors).

Batch files are useful for executing routinely repetitive tasks. An "@" symbol is used before a command is issued if you do not want the command echoed to the command line.

A sample file would look like this:


@echo off
rem "rem" is short for remark, so that you can comment your files.
rem "echo off" turns off command echoing for the rest of the file.
rem The "@" prevents "echo off" itself from being displayed on the command line.
rem "echo" with a "." adds a blank line to the screen. "echo" with text
rem following prints that text to the screen as a message.
echo.
echo This program will open a Window for the "C" drive if executed from within
echo Windows.
rem After that, you simply insert the commands that you would normally run
rem from the command line.
start C:\



You could copy that entire block of text and paste it into a file and save it with a .BAT extension and it would actually work. There are plenty of ways to make more advanced batch files that can make your life easier.
ID: 627846 · Report as offensive
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 627877 - Posted: 27 Aug 2007, 17:51:02 UTC
Last modified: 27 Aug 2007, 17:52:59 UTC

Thanks for the info. I think I have got another -9 file, a 02mr07ai etc.; I will check and see if someone has done it before. On my last one I was no. 7 doing it; the other 5 had errors next to their names. It is not a -9 one: someone did the work and received no credit due to a compute error. Phew.
ID: 627877 · Report as offensive
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 627885 - Posted: 27 Aug 2007, 18:16:34 UTC - in response to Message 627831.  

Darn, I should have thought of this a week ago. For Windows users,
- Create a batch file as follows and put it on your desktop. I named mine "find_triplet.bat" but that's obviously up to you.
find /I "<triplet_thresh>-" "C:\Program Files\BOINC\projects\setiathome.berkeley.edu\*.*" > search_result.txt
pause

- This can take a minute to run depending on your queue. Be patient.
- Running the batch file will create a file called "search_result.txt" on your desktop. Open the text file with Notepad and do a search for "triplet"
- Any WU with a negative <triplet_thresh> will have the triplet value displayed on a line immediately below the file name. Either abort it or "fix" it with Joe's "99" trick. If there are no <triplet_thresh>'s in the text file then you are clean for now.
- Note, you may have to vary the path depending on your installation.
- I used the "pause" so I could see that it runs without errors, this can be removed if you wish.
- You could probably get away with running this with BOINC active, but if the bat file happens to have a result open when the CC tries to write to it that might not be good, so I'm going to kill BOINC while it's running.
- As long as you maintain at least a day's worth of work, running this once a day should find most if not all of these WU's before they start.

I just ran this on all of my machines and found 3 more of these WU's


Thanks for this information. I created the batch file and ran it on all my boxes. Found 1 which I aborted. Thanks again.


Boinc....Boinc....Boinc....Boinc....
ID: 627885 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 627996 - Posted: 27 Aug 2007, 22:06:31 UTC - in response to Message 627831.  
Last modified: 27 Aug 2007, 22:06:49 UTC

I just ran this on all of my machines and found 3 more of these WU's

Gus,

Would you mind laying out the corpses for inspection in true kitty style, the next time you do this?

I found 147662115 and 147663393 - but I can only spot them if they've reported, and I go cross-eyed scanning big caches like yours.

Still, those two make the running total

49 WUs. 162.17 days

Anyone offer me a nice round 50 for Eric?
ID: 627996 · Report as offensive
gomeyer
Volunteer tester
Joined: 21 May 99
Posts: 488
Credit: 50,370,425
RAC: 0
United States
Message 628018 - Posted: 27 Aug 2007, 22:41:27 UTC - in response to Message 627996.  

I just ran this on all of my machines and found 3 more of these WU's

Gus,

Would you mind laying out the corpses for inspection in true kitty style, the next time you do this?

You bet Richard. I plan to do it before going to bed tonight and will report any found here. Or would you rather a PM?
ID: 628018 · Report as offensive
Urs Echternacht
Volunteer tester
Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 628026 - Posted: 27 Aug 2007, 22:48:21 UTC
Last modified: 27 Aug 2007, 22:51:35 UTC

50: 148074871
51: 148074879
52: 147537922

(Hope they are no doubles.)
edit: One more
53: 148064261
_\|/_
U r s
ID: 628026 · Report as offensive
 