Public beta for nVidia AstroPulse, rev 521

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1135623 - Posted: 3 Aug 2011, 19:27:33 UTC

I like this one: http://setiathome.berkeley.edu/workunit.php?wuid=795242703 - 5052 credits for a couple of hours' work is not bad at all.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1135623
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1135783 - Posted: 4 Aug 2011, 4:42:28 UTC
Last modified: 4 Aug 2011, 4:44:32 UTC

Adding the space did the trick; I am now running two APs and one MB on my GPU. The second AP is moving slowly, but it is moving, so I guess everything is now right with the world. Thanks again, Geek, for pointing out my mistake. (By "moving slowly" I mean by GPU standards; it's still at least five times faster than on my CPU.) :-)


PROUD MEMBER OF Team Starfire World BOINC
ID: 1135783
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1135851 - Posted: 4 Aug 2011, 10:04:08 UTC - in response to Message 1135625.  

I like this one: http://setiathome.berkeley.edu/workunit.php?wuid=795242703 - 5052 credits for a couple of hours' work is not bad at all.


That's a good example of how well thought through CreditNew really is. It shows without any doubt that DA has found a flawless credit system.

CreditNew would be fine on a project where all tasks ran fully to completion and none exited early.

The problem with CreditNew and SETI is that Multibeam has the 'SETI@Home Informational message -9 result_overflow' early exit,
and Astropulse has the 'In ap_remove_radar.cpp: get_indices_to_randomize: num_ffts_forecast < 100. Blanking too much rfi?' and the 'Found 30 single pulses and 30 repeating pulses, exiting.' early exits.
If they weren't included in the calculations for APR and CreditNew, credit granted would likely be a bit more stable, i.e. no more high claims.

But there would still be host normalization:

•The host normalization mechanism reduces the claimed credit of hosts that are less efficient than average, and increases the claimed credit of hosts that are more efficient than average.
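
As a rough illustration of that bullet (not the actual BOINC server code), host normalization can be pictured as scaling a claim by the ratio of the app-version average cost per task to the host's own average cost per task. The function name, the argument names and the numbers below are all made up for the example.

```python
def normalize_claim(claimed_credit, host_avg_cost, version_avg_cost):
    """Scale a claim by how the host compares with the app-version average.

    host_avg_cost and version_avg_cost are average claimed cost per task in
    hypothetical units. A host that claims more than average for the same
    work is less efficient, so its claim is scaled down, and vice versa.
    This is an assumed simplification, not the real CreditNew formula.
    """
    if host_avg_cost <= 0:
        raise ValueError("host_avg_cost must be positive")
    return claimed_credit * version_avg_cost / host_avg_cost


# Less efficient host (average cost 1.25x the version average): claim reduced.
print(normalize_claim(500.0, host_avg_cost=125.0, version_avg_cost=100.0))
# More efficient host (average cost 0.8x the version average): claim raised.
print(normalize_claim(500.0, host_avg_cost=80.0, version_avg_cost=100.0))
```

Early exits matter here because a task that quits after a few seconds still feeds those averages, which is why the overflow results can skew the claims.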


Claggy
ID: 1135851
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1135874 - Posted: 4 Aug 2011, 12:58:31 UTC

Maybe I spoke too soon. I completed 5 APs last night. One validated, one is inconclusive. The one inconclusive and three others all show 30 repetitive pulses and will probably go invalid. I'm going to let it run this way for a little while to make sure but it looks like for me running more than one AP at a time is a no go.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1135874
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1135877 - Posted: 4 Aug 2011, 13:09:04 UTC

Also have one inconclusive: http://setiathome.berkeley.edu/result.php?resultid=2007662982

I think it was also while running 2 APs. Another thing when running 2 APs is that, for some reason, it used more CPU. Normally it uses 13% when running 1 with 275+ drivers, but when running 2 it sometimes goes up to 26%, I think it was. I've decided to only run one AP; that seems best.
ID: 1135877
Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1135880 - Posted: 4 Aug 2011, 13:16:15 UTC - in response to Message 1135877.  

Also have one inconclusive: http://setiathome.berkeley.edu/result.php?resultid=2007662982

I think it was also while running 2 APs. Another thing when running 2 APs is that, for some reason, it used more CPU. Normally it uses 13% when running 1 with 275+ drivers, but when running 2 it sometimes goes up to 26%, I think it was. I've decided to only run one AP; that seems best.


One thing I did was take Kittyman's recommendation and leave one core free on my CPU. Might help in your situation.


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1135880
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1135891 - Posted: 4 Aug 2011, 14:03:56 UTC - in response to Message 1135880.  
Last modified: 4 Aug 2011, 14:04:32 UTC

Also have one inconclusive: http://setiathome.berkeley.edu/result.php?resultid=2007662982

I think it was also while running 2 APs. Another thing when running 2 APs is that, for some reason, it used more CPU. Normally it uses 13% when running 1 with 275+ drivers, but when running 2 it sometimes goes up to 26%, I think it was. I've decided to only run one AP; that seems best.


One thing I did was take Kittyman's recommendation and leave one core free on my CPU. Might help in your situation.


Leaving one core free is only helpful with more than one GPU running.
Wasted cycles otherwise.

30/30 units doesn't mean it will not get credit.
It just found too many signals and therefore exited early, like -9 with MB.
Most of them get granted.


With each crime and every kindness we birth our future.
ID: 1135891
X-Files 27
Joined: 17 May 99
Posts: 104
Credit: 111,191,433
RAC: 0
Canada
Message 1135892 - Posted: 4 Aug 2011, 14:05:54 UTC

No invalids or errors so far with 50+ AP.

using cmd:
-ffa_block 8192 -ffa_block_fetch 2048 -unroll 12 -instances_per_device 3 -hp
cuda count: .34

Seti MB cuda count: .31

http://setiathome.berkeley.edu/results.php?hostid=2340442&offset=0&show_names=0&state=0&appid=5
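
For anyone puzzling over what the .34/.31 counts do, here is a minimal sketch of the arithmetic. It assumes BOINC only starts GPU tasks while their counts add up to no more than one whole GPU; that rule is a simplification for illustration, not something taken from the BOINC source, and `fits_on_gpu` is a made-up helper name.

```python
def fits_on_gpu(n_ap, n_mb, ap_count_pct=34, mb_count_pct=31):
    """Return True if the tasks' combined GPU share is at most one whole GPU.

    Counts are kept as integer hundredths (34 means .34) to avoid
    floating-point rounding. The "sum <= 100" rule is an assumption.
    """
    return n_ap * ap_count_pct + n_mb * mb_count_pct <= 100


for n_ap, n_mb in [(2, 1), (3, 0), (1, 2), (0, 3)]:
    print(f"{n_ap} AP + {n_mb} MB -> fits: {fits_on_gpu(n_ap, n_mb)}")
```

Under that assumption, two APs plus one MB come to .99 and fit, while three APs at .34 each come to 1.02 and would not all start together.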
ID: 1135892
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1135895 - Posted: 4 Aug 2011, 14:29:31 UTC

Slavac, with only an E5400 dual and a GTS 450, it kind of defeats the purpose if I only run one core, but thanks for the thought.

X-Files 27, I was running 6144/1536. I will try your suggestion of 8192/2048 and see how that goes. I'm running an unroll of 8 and a per device of 2, so I think I'll leave those that way for now. If I set the per device to 3 it will try to run a third instance, and I'm sure my little 450 can't handle that. I also have .34/.31 set. I was a little afraid of messing with the FFA stuff as I didn't quite understand what to move it to.

Mike, I know 30/30 is usually no problem but that's not what I'm getting. I'm getting the right number of single pulses but 30 repetitive pulses which is not right. I have to fiddle around and try to find out what is causing that. Another cruncher was getting the same problem a couple of days ago and I pointed it out to Raistmer but I haven't seen any solution as yet.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1135895
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1135942 - Posted: 4 Aug 2011, 16:49:01 UTC - in response to Message 1135895.  

...
I know 30/30 is usually no problem but that's not what I'm getting. I'm getting the right number of single pulses but 30 repetitive pulses which is not right. I have to fiddle around and try to find out what is causing that. Another cruncher was getting the same problem a couple of days ago and I pointed it out to Raistmer but I haven't seen any solution as yet.

Actually, 30 repetitive pulses isn't too unusual. True, if it isn't confirmed by a wingmate it's a problem. What is exceedingly unusual is 30 single pulses with no repetitive pulses, but even that may validate as your Task 2013445088 illustrates. Excerpts:

...
Run time	13,945.13
CPU time	1,965.64
Validate state	Valid
Credit	666.64
Application version	Astropulse v505
Anonymous platform (NVIDIA GPU)
...
    single pulses: 30
repetitive pulses: 0
  percent blanked: 5.52
...

The wingmate ran stock CPU, which doesn't show signal counts. And the Astropulse validation only compares single pulse signals which are 1% above threshold, so there's a possibility it simply ignored those 30. The files have been deleted, so there's no way to check further.
                                                              Joe
ID: 1135942
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1136036 - Posted: 4 Aug 2011, 21:10:11 UTC - in response to Message 1135942.  

All my units result in 30 repetitive pulses for that host.

- all ?/30 become invalids.

- all the 30/30 get validated. And this is what I find quite bizarre: I mean, the repetitive pulses are obviously wrong. Are they not checked by the validators at all in this case?
ID: 1136036
Lint trap
Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 1136051 - Posted: 4 Aug 2011, 21:53:21 UTC

I have one inconclusive where the 3rd wingman is running Linux and Astropulse V505 v5.06. What is that??

I ask because all his AP efforts have gone for naught. Just errors, more than two pages of them.

His Host and the common workunit.

Lt

ID: 1136051
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1136052 - Posted: 4 Aug 2011, 21:54:20 UTC


Got my second invalid today.
Sad I couldn't download it for further testing.

Single pulses 2
repetitive pulses 2

http://setiathome.berkeley.edu/workunit.php?wuid=776557486




With each crime and every kindness we birth our future.
ID: 1136052
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1136059 - Posted: 4 Aug 2011, 22:31:05 UTC - in response to Message 1136051.  

I have one inconclusive where the 3rd wingman is running Linux and Astropulse V505 v5.06. What is that??

That's just the ordinary stock application for Linux: see applications.

I think segmentation violations are common with that app, but we'd best wait for a Linux specialist.
ID: 1136059
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1136088 - Posted: 5 Aug 2011, 1:10:13 UTC - in response to Message 1136036.  

All my units result in 30 repetitive pulses for that host.

- all ?/30 become invalids.

- all the 30/30 get validated. And this is what I find quite bizarre: I mean, the repetitive pulses are obviously wrong. Are they not checked by the validators at all in this case?

Not all the 30/30 get validated; for WU 788416427 yours was invalidated even though the other two hosts also found 30/30. Excerpt:

Task        Computer  Sent                        Reported                    Status
2007953553  5790778   24 Jul 2011 | 2:18:50 UTC   25 Jul 2011 | 13:15:59 UTC  Completed, marked as invalid
2008013186  5759991   24 Jul 2011 | 3:22:02 UTC   24 Jul 2011 | 14:35:21 UTC  Completed and validated
2010893549  2430488   25 Jul 2011 | 16:29:54 UTC  4 Aug 2011 | 17:14:26 UTC   Completed and validated


As to why your host is getting credit on the majority of its 30/30 tasks: in such a result there are actually 40 single pulses (30 reportable, 10 "best"). The criterion for a "Weakly similar" judgement by validation is that at least half the signals match, so in that sense wrong repetitive pulses don't prohibit getting credit (though your host's result should never be canonical). The reason it's not always granted credit is that single pulses are not compared if they aren't at least 1% above threshold, so in effect there will usually be fewer than 40 actually considered.
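
To make the arithmetic concrete, here is a toy model of that rule. It is only a sketch and not the real Astropulse validator: signals are matched by a made-up key instead of by comparing their parameters, and it assumes repetitive pulses are always counted while single pulses are skipped unless they are at least 1% above threshold.

```python
def considered(signals, margin=0.01):
    """Return the set of signal keys the comparison would look at.

    Each signal is a (kind, key, ratio) tuple: kind is "single" or
    "repetitive", key is a made-up hashable identifier, and ratio is peak
    power divided by threshold. Single pulses less than `margin` above
    threshold are skipped; repetitive pulses are always kept (assumption).
    """
    return {
        (kind, key)
        for kind, key, ratio in signals
        if kind != "single" or ratio >= 1.0 + margin
    }


def weakly_similar(result_a, result_b, margin=0.01):
    """True if at least half of the considered signals match."""
    a = considered(result_a, margin)
    b = considered(result_b, margin)
    total = max(len(a), len(b))
    return total == 0 or 2 * len(a & b) >= total


def make_result(strong_singles, rep_tag):
    """Build 40 single pulses (the first `strong_singles` well above
    threshold) plus 30 repetitive pulses whose keys depend on `rep_tag`."""
    singles = [("single", i, 1.05 if i < strong_singles else 1.005)
               for i in range(40)]
    reps = [("repetitive", (rep_tag, i), 1.05) for i in range(30)]
    return singles + reps


# All 40 singles count: 40 of 70 signals match despite bogus repetitive pulses.
print(weakly_similar(make_result(40, "wingmate"), make_result(40, "bogus")))
# Only 25 singles count: 25 of 55 signals match, which is under half.
print(weakly_similar(make_result(25, "wingmate"), make_result(25, "bogus")))
```

In this model the first comparison passes and the second fails, which is the same pattern as a host with wrong repetitive pulses getting credit on most, but not all, of its 30/30 tasks.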
                                                                 Joe
ID: 1136088
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1136178 - Posted: 5 Aug 2011, 7:27:50 UTC - in response to Message 1136036.  

All my units result in 30 repetitive pulses for that host.

- all ?/30 become invalids.

- all the 30/30 get validated. And this is what I find quite bizarre: I mean, the repetitive pulses are obviously wrong. Are they not checked by the validators at all in this case?

I can't check the result's stderr. Try using the default FFA values if you are using something else.
ID: 1136178
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1136249 - Posted: 5 Aug 2011, 14:34:00 UTC

Any idea why, when an AP task gets suspended or when I exit BOINC while an AP task is running, the screen often goes black for a second, and when it comes back a yellow triangle appears in the taskbar? Win7 then later wants to report a video hardware failure to MS.
ID: 1136249
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1136253 - Posted: 5 Aug 2011, 14:38:12 UTC - in response to Message 1136249.  

Any idea why, when an AP task gets suspended or when I exit BOINC while an AP task is running, the screen often goes black for a second, and when it comes back a yellow triangle appears in the taskbar? Win7 then later wants to report a video hardware failure to MS.

Sounds like your AP app hasn't used the threadsafe exit BOINC library yet.

Sorry, that's a technote to the developer.
ID: 1136253
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1136255 - Posted: 5 Aug 2011, 14:42:56 UTC - in response to Message 1136249.  
Last modified: 5 Aug 2011, 14:44:12 UTC

Any idea why, when an AP task gets suspended or when I exit BOINC while an AP task is running, the screen often goes black for a second, and when it comes back a yellow triangle appears in the taskbar? Win7 then later wants to report a video hardware failure to MS.


Reduce unroll factor.
Try -unroll 6.


With each crime and every kindness we birth our future.
ID: 1136255
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1136288 - Posted: 5 Aug 2011, 16:14:45 UTC - in response to Message 1136255.  

Any idea why, when an AP task gets suspended or when I exit BOINC while an AP task is running, the screen often goes black for a second, and when it comes back a yellow triangle appears in the taskbar? Win7 then later wants to report a video hardware failure to MS.


Reduce unroll factor.
Try -unroll 6.

Nope, didn't help.
ID: 1136288
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.