Astropulse Errors-Optimized version 5
Author | Message |
---|---|
Blurf Joined: 2 Sep 06 Posts: 8962 Credit: 12,678,685 RAC: 0 |
@Blurf Done. |
Byron S Goodgame Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
I'd keep an eye on it, to see that it validates & gets credit, but most likely nothing to worry about. wuid=390473712 did validate in the end. Using r103 now. Thanks for the info. |
Stick Joined: 26 Feb 00 Posts: 100 Credit: 5,283,449 RAC: 5 |
My first rev 103 result was a success and validated. But I noticed its claimed credit was about 30 points higher than its wingman's rev 69 result. |
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
Workunit: application Astropulse; created 28 Dec 2008 10:36:00 UTC; name ap_15no08ae_B5_P0_00253_20081228_01028.wu; minimum quorum 2; initial replication 4; max # of error/total/success tasks 5, 10, 10. Tasks, each listed as task ID; computer; sent; time reported or deadline; server state; outcome; client state; CPU time (sec); claimed credit; granted credit: (1) 1105979118; 4730870; 28 Dec 2008 10:37:11 UTC; 4 Jan 2009 8:25:37 UTC; Over; Client error; Compute error; 238,370.00; 142.54; ---. (2) 1105979119; 4677151; 28 Dec 2008 10:37:10 UTC; 3 Jan 2009 1:41:31 UTC; Over; Success; Done; 62,636.84; 213.12; 0.00. (3) 1112661132; 1683125; 4 Jan 2009 8:25:42 UTC; 25 Jan 2009 15:17:26 UTC; Over; Success; Done; 1,813,830.00; 703.68; 0.00. (4) 1134159972; 4070955; 25 Jan 2009 15:17:39 UTC; 26 Jan 2009 20:35:27 UTC; Over; Success; Done; 44,771.58; 782.78; 0.00. (5) 1135724185; 4264561; 26 Jan 2009 20:35:43 UTC; 25 Feb 2009 20:35:43 UTC; In progress; ---; New; ---; ---; ---. Is this one of the reasons the AP validators are turned off, as well as the AP splitters? Because this seems like a waste of time, IMHO. I hope it will validate after all ;) |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
You'll notice that newer issued AstroPulse tasks claim more credit than they used to. This is due to the rising server-side credit multiplier. In all cases where I've looked at it, it was either a recently issued task, or a resend, where the wingmen were issued some months ago, failed to complete successfully and were subsequently reissued. @Fred, the first three wingmen clearly choked on that task, for whatever reasons. Hopefully your new wingman will be in with an accurate V5 result. As to the state of the AP pipeline, probably some work is underway related to testing / implementing the new, substantially different, 5.01 AstroPulse application. I've heard whispers that, when it is ready to come to main, it may be treated as a new application name. This would avoid many of the cross-validation difficulties experienced with the 4.35->5 transition. There could be other reasons for things being turned off, but I would imagine the reconfiguration involved might take a bit of effort. [Note: 5.01 AP tasks are also intrinsically larger in terms of amount of processing, so the credit claim will likely rise further for those (when they appear here).] Jason "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Stick Joined: 26 Feb 00 Posts: 100 Credit: 5,283,449 RAC: 5 |
"You'll notice that newer issued AstroPulse tasks claim more credit than they used to. This is due to the rising server-side credit multiplier. In all cases where I've looked at it, it was either a recently issued task, or a resend, where the wingmen were issued some months ago, failed to complete successfully and were subsequently reissued." Jason, thank you! I didn't realize the credit adjustment was also applied to reissues. (And, obviously, I also have a tendency to jump to the wrong conclusion.) Stick |
KenZaske Joined: 12 Oct 04 Posts: 7 Credit: 456,380 RAC: 0 |
Hello everyone; I have been using the optimized AP version 5.00r69 for quite a while. Suddenly, almost every work unit I have done this month has gotten a zero score. Dozens of them report a "client error!" See http://setiathome.berkeley.edu/results.php?hostid=4752984&offset=40 for more details. I went back to the default client, but I want to use the optimized client. Any idea what happened, or which optimized client still works? Ken |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Hi there, The one Astropulse task I can see in your list appears on the surface to have processed normally, but was teamed with an outdated v4.36 wingman, so it has been sent out for reissue. It is probably OK, but I can't guarantee that, because all the other tasks, which seem to be multibeam (cuda), appear to be getting 'Compute Errors' on your host. If you're looking for a newer AstroPulse build, there is one in the optimised apps sticky, but bear in mind that a newer, incompatible version will be available soon, possibly in a few days or less. What I would suggest for that machine, as it seems to have some health problems, is reverting to stock in the meantime, giving it a good clean, and making sure those cuda drivers & other general machine health indicators are up to scratch. Jason |
KenZaske Joined: 12 Oct 04 Posts: 7 Credit: 456,380 RAC: 0 |
I got those health issues dealt with a few weeks ago. The new motherboard uses an NF4U chipset, which I am not impressed with. I wish I had another ULI MB but alas, I don't. It took six reinstalls of WinXP 64bit to get a stable install. Now if nVidia would just publish a stable driver set I would be happy. Thanks for the quick reply. |
KenZaske Joined: 12 Oct 04 Posts: 7 Credit: 456,380 RAC: 0 |
I have been thinking about it and have one question. Does it require PhysX to be installed? |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
"I have been thinking about it and have one question. Does it require PhysX to be installed?" For AstroPulse: there is no Astropulse GPU / Cuda application at this time (only CPU), so Astropulse applications won't care about any GPU-related installation considerations for now. One may come at a later date, but the exact way the GPU will be used may differ from the current multibeam Cuda-enabled builds. For Multibeam (setiathome_enhanced): AFAIK at the moment, it doesn't matter for the Cuda Multibeam builds whether PhysX is installed or not, but when I put updated drivers in my machine (which is important to do), it installed PhysX and it seems to do no harm. Jason |
KenZaske Joined: 12 Oct 04 Posts: 7 Credit: 456,380 RAC: 0 |
Thank you. As for PhysX doing no harm, I have noticed a small increase in game performance if I uninstall it. But then again, none of the games I play use it anyway. |
Tony Farkas Joined: 29 Apr 02 Posts: 1 Credit: 14,151 RAC: 0 |
I've completed 2 AP WUs on 2 different computers and they won't upload. It took around 100 hours to do each WU. How do I get them uploaded? They have been finished for 4 days now. 2/24/2009 2:57:08 PM|SETI@home|Started upload of ap_16ja09ag_B3_P1_00398_20090218_02436.wu_1_0 2/24/2009 2:57:22 PM||Project communication failed: attempting access to reference site 2/24/2009 2:57:22 PM|SETI@home|Temporarily failed upload of ap_16ja09ag_B3_P1_00398_20090218_02436.wu_1_0: HTTP error 2/24/2009 2:57:22 PM|SETI@home|Backing off 3 hr 26 min 47 sec on upload of ap_16ja09ag_B3_P1_00398_20090218_02436.wu_1_0 2/24/2009 2:57:31 PM||Internet access OK - project servers may be temporarily down. |
Piotr Kunkel Joined: 7 Apr 00 Posts: 18 Credit: 19,385,083 RAC: 0 |
Simply be patient. " Then they came for me and there was no one left to speak out for me." Martin Niemöller |
john deneer Joined: 16 Nov 06 Posts: 331 Credit: 20,996,606 RAC: 0 |
Maybe this has already been covered, or it is simply a fluke, but I decided to post it anyway ... There are a lot of you out there who know more about this kind of stuff than I do. [edit]I just realized this thread is intended for AP 5.0 and not the newer 5.03, but I don't see a thread for posting errors occurring with 5.03. If a mod wants to move it to a more appropriate thread, be my guest ...[/edit] I got a compute error on this unit with optimized AstroPulse 5.03, on a dual Atom running Windows Home Server. It is painfully slow, but it's running all day anyway and the extra power consumption for making it run SETI is something like 4 watts :-) The unit errored after something like 16 h of processing. A couple of others are still running; I'll keep an eye on those too. Most obvious problem: No heartbeat from core client for 30 sec - exiting In ap_gfx_main.cpp: in ap_graphics_init(): Starting client. ### Restart at 12.26 percent. No heartbeat from core client for 30 sec - exiting In ap_gfx_main.cpp: in ap_graphics_init(): Starting client. boinc_graphics_make_shmem failed: 0 What causes this, and can it be avoided somehow, or is it just something that happens from time to time? Regards, John. Name ap_07ja09ad_B4_P0_00119_20090223_28491.wu_2 Workunit 417940236 Created 23 Feb 2009 19:43:08 UTC Sent 23 Feb 2009 20:04:17 UTC Received 25 Feb 2009 23:06:17 UTC Server state Over Outcome Client error Client state Compute error Exit status -202 (0xffffffffffffff36) Computer ID 4728779 Report deadline 25 Mar 2009 20:04:17 UTC CPU time 61519.14 stderr out <core_client_version>6.2.19</core_client_version> <![CDATA[ <message> - exit code -202 (0xffffff36) </message> <stderr_txt> In ap_gfx_main.cpp: in ap_graphics_init(): Starting client. AstroPulse v. 5.03 Non-graphics FFTW USE_CONVERSION_OPT USE_SSE3 Windows x86 rev 112, Don't Panic!, by Raistmer with support of Lunatics.kwsn.net team. SSE3 static fftw lib, built by Jason G. ffa threshold mod, by Joe Segur. SSE3 dechirping by JDWhale CPUID: Intel(R) Atom(TM) CPU 330 @ 1.60GHz Cache: L1=64K L2=512K Features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 No heartbeat from core client for 30 sec - exiting In ap_gfx_main.cpp: in ap_graphics_init(): Starting client. ### Restart at 12.26 percent. No heartbeat from core client for 30 sec - exiting In ap_gfx_main.cpp: in ap_graphics_init(): Starting client. boinc_graphics_make_shmem failed: 0 </stderr_txt> ]]> Validate state Invalid Claimed credit 149.103875429963 Granted credit 0 application version 5.03 |
Scrooge McDuck Joined: 26 Nov 99 Posts: 627 Credit: 1,674,173 RAC: 54 |
I changed to the new optimized ap_5.03r112_SSE3 in the last few days. I'm using WinXP SP3 32bit on an AMD 64X2-3800, with NO OCing. I never had any problems with optimized Astropulse clients before. Now all AP WUs immediately exit with an error. The German error message in the following log reads: "Process creation failed: Access denied (0x5)" 02.03.2009 10:13:00|SETI@home|Starting ap_21ja09aa_B1_P0_00172_20090228_06971.wu_1 02.03.2009 10:13:00|SETI@home|[error] Process creation failed: Zugriff verweigert (0x5) 02.03.2009 10:13:00|SETI@home|[error] Process creation failed: Zugriff verweigert (0x5) 02.03.2009 10:13:01|SETI@home|[error] Process creation failed: Zugriff verweigert (0x5) 02.03.2009 10:13:01|SETI@home|[error] Process creation failed: Zugriff verweigert (0x5) 02.03.2009 10:13:01|SETI@home|[error] Process creation failed: Zugriff verweigert (0x5) 02.03.2009 10:13:03|SETI@home|Computation for task ap_21ja09aa_B1_P0_00172_20090228_06971.wu_1 finished 02.03.2009 10:13:03|SETI@home|Output file ap_21ja09aa_B1_P0_00172_20090228_06971.wu_1_0 for task ap_21ja09aa_B1_P0_00172_20090228_06971.wu_1 absent I have no idea what this is about. Normal S@H MB WUs are handled without problems. I have observed that it is mainly AMD systems that exit with an error on AP WUs. Maybe there is some AMD-specific problem with the optimized AP clients? Some links to my AP WUs that exited in error, with wingmen also using AMD systems: AP WU 419896689 AP WU 419895682 AP WU 418909902 Suggestions? Regards, Michael |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
"I changed to new optimized ap_5.03r112_SSE3 the last days. Using WinXP SP3 32bit, AMD 64X2-3800, NO OCing. Never had any problems with optimized astropulse clients before. Now all AP WUs immediately exit with error. The german error message in the following reads: 'Process creation failed: Access denied (0x5)'" I've not come across this one directly myself, but I suspect the Boinc accounts' permissions could be messed up somehow. I had a similar DLL failure on one machine a while back, which running the repair install via Control Panel seemed to fix (a Boinc reinstall should fix it similarly). I have no idea why the permissions would have been damaged or changed in any way on my system, but if the permissions repair helps you, please let us know here, and we can let the Boinc devs know these may be getting trashed through some unidentified mechanism. The alternative is probably to use the non-service (non-protected application) install type for Boinc (default), but I didn't try that either. Jason [Edit: you could also try raising the priority of the Boinc.exe process as discussed in the next post, about 'No Heartbeat' messages.] |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
For the "no heartbeat" message, first suspect high system load during these times. I'm told it's harmless enough (though annoying ;)). I had that symptom last week on my machine out in the lounge room, and eventually traced it to my mother having secret games of "Railroad Tycoon II" at around 2am, without suspending Boinc. In your case I would look for scheduled tasks or services that run around these times. The hard crash, though, may be of more concern, and may or may not be connected directly to the No heartbeat issue. I've seen mention that raising the priority of the boinc.exe process above normal may help (using Task Manager, or, perhaps easier, 'Process Lasso', which allows automated priority modification of the program on startup). If that helps both issues, indicating they may be connected to system utilisation / Boinc priority, then please let us know. Jason |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14649 Credit: 200,643,578 RAC: 874 |
"I'm told it's harmless enough (though annoying ;))." It's harmless if it only happens once or twice per WU, but if it happens too often (I think 100 times per WU), you eventually reach "Too many normally harmless exits" and the task is abandoned. As with all BOINC error and warning messages (and indeed Windows messages), it's better to understand and eliminate as many as possible, rather than just ignoring them and hoping they'll go away. |
Scrooge McDuck Joined: 26 Nov 99 Posts: 627 Credit: 1,674,173 RAC: 54 |
Oh well, a stupid problem I could easily have identified by myself. Thanks, Jason, for the hint. I'm running BOINC in protected service mode. I simply copied the optimized AP binaries to the appropriate folder and modified the app_info.xml accordingly, but I didn't pay attention to the access rights of the binaries (.exe). The S@H MB binary (running without problems) has the following group access rights set: Administrators: Full; boinc_admins: Full; boinc_projects: Full; boinc_users: Read. So I simply neglected to add access rights on the optimized AP binaries for the boinc_admins, boinc_projects and boinc_users groups in WinXP. But maybe my mistake helps others... ;-) Now I only have to wait for the next AP WUs to see whether the issue occurs again. Greetings, Michael |
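
Editor's sketch for anyone hitting the same "Access denied (0x5)" error after copying optimized binaries into a protected-service install: the group rights Michael lists could be applied with the Windows `cacls` tool. The Python below only builds the command lines and does not run them. The file name in the example is made up, and whether `cacls` (rather than `icacls` on newer Windows versions) is the right tool for your setup is an assumption to check before running anything.

```python
# Hypothetical helper: builds (does not execute) cacls command lines that would
# grant the BOINC groups the rights listed in the post above.

# Group -> cacls permission letter (F = full control, R = read),
# copied from the rights Michael reported on the working MB binary.
BOINC_RIGHTS = {
    "Administrators": "F",
    "boinc_admins": "F",
    "boinc_projects": "F",
    "boinc_users": "R",
}


def build_cacls_commands(exe_path, rights=BOINC_RIGHTS):
    """Return one cacls command string per group.

    /E edits the existing ACL instead of replacing it;
    /G grants the given right to the named group.
    """
    return [
        f'cacls "{exe_path}" /E /G {group}:{perm}'
        for group, perm in rights.items()
    ]


if __name__ == "__main__":
    # "ap_optimized_example.exe" is a placeholder, not a real build name.
    for cmd in build_cacls_commands("ap_optimized_example.exe"):
        print(cmd)
```

Printing the commands first and running them by hand in an administrator prompt keeps the change reviewable; a BOINC repair install, as Jason suggests, remains the simpler route.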
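
Editor's sketch on reading the error reports in this thread: John's task page shows the same exit status both as -202 and as 0xffffff36 (or 0xffffffffffffff36), which is just the two's-complement view of the negative number at 32 or 64 bits. The Python below shows that conversion and counts "No heartbeat" exits in a stderr dump. The limit of 100 is Richard's recollection ("I think 100 times per WU"), not a confirmed BOINC constant, and the function names are made up for illustration.

```python
# Assumption: the limit is taken from Richard's post ("I think 100"),
# not from BOINC documentation.
ASSUMED_MAX_HARMLESS_EXITS = 100

HEARTBEAT_MESSAGE = "No heartbeat from core client"


def to_unsigned_hex(code, bits=32):
    """Show a signed exit code as the unsigned hex value a task page prints."""
    return hex(code & ((1 << bits) - 1))


def count_heartbeat_exits(stderr_text):
    """Count how many times the task exited for lack of a heartbeat."""
    return stderr_text.count(HEARTBEAT_MESSAGE)


def near_exit_limit(stderr_text, limit=ASSUMED_MAX_HARMLESS_EXITS):
    """True once the number of heartbeat exits reaches the assumed limit."""
    return count_heartbeat_exits(stderr_text) >= limit


if __name__ == "__main__":
    print(to_unsigned_hex(-202))      # 32-bit view of the exit status
    print(to_unsigned_hex(-202, 64))  # 64-bit view of the same status
```

John's stderr excerpt contains two heartbeat exits, far below the assumed limit, which fits Jason's and Richard's reading that those messages on their own were not what killed the task.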