Lunatics Help

Message boards : Number crunching : Lunatics Help

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1778663 - Posted: 13 Apr 2016, 8:09:34 UTC - in response to Message 1778657.  

Don't worry.
The apps are tested with GBT tasks.

I'd say that the apps were designed to work to the GBT specification, and have received some limited testing on a sample tape at Beta.

In the past, we have sometimes found that real-world (or should I say, real-space) data contains quirks which didn't show up on the test tapes. So, we need to keep a watching brief on the GBT data flow as it ramps up, just in case some tweaking is needed.

So far, I seem to have been allocated the grand total of two GBT tasks - both VLARs - and both are still waiting to run. So I'm not panicking yet.


+1 - vigilance will be the thing, in what amounts to a new game.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1778663

Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1778691 - Posted: 13 Apr 2016, 9:34:45 UTC

I have tested approx. 50 different tasks already.
So I wouldn't worry too much.


With each crime and every kindness we birth our future.
ID: 1778691

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1778696 - Posted: 13 Apr 2016, 9:51:59 UTC - in response to Message 1778691.  

I have tested approx. 50 different tasks already.
So I wouldn't worry too much.

From how many different tapes?
ID: 1778696

Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1778714 - Posted: 13 Apr 2016, 12:14:44 UTC - in response to Message 1778696.  
Last modified: 13 Apr 2016, 12:17:58 UTC

I have tested approx. 50 different tasks already.
So I wouldn't worry too much.

From how many different tapes?


Didn't check, sorry.
Since there was nothing special, I deleted them a while ago.
As soon as I see an interesting work unit, I will download it for further testing, of course.


With each crime and every kindness we birth our future.
ID: 1778714

Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1778723 - Posted: 13 Apr 2016, 12:35:42 UTC

Have 9 tasks in my cache.

Starting benchmarks ...


With each crime and every kindness we birth our future.
ID: 1778723

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1778725 - Posted: 13 Apr 2016, 12:43:19 UTC - in response to Message 1778631.  
Last modified: 13 Apr 2016, 12:53:08 UTC

. . Hello Richard,

. . Does today's announcement mean extra headaches for you? Now that the promised reason for V8 release has come to be, do you expect any issues?

Stephen

Even though I'm not Richard, if there are any problems with the new WUs, someone will soon let us know here. ;-)

I've got quite a few here now, but I hope that they don't add a complication to what I'm doing.

Cheers.


. . I thought it might have been a very clever disguise ... :)

. . I have several queued [update: 6] as well but none have started yet. All of mine are VLAR WUs. To go where ... aaah nope, that's been done.

P.S. [update] While I wasn't watching, one started on the other computer and is 25 mins in, A-OK.
ID: 1778725

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1778726 - Posted: 13 Apr 2016, 12:48:12 UTC - in response to Message 1778657.  

Don't worry.
The apps are tested with GBT tasks.

I'd say that the apps were designed to work to the GBT specification, and have received some limited testing on a sample tape at Beta.

In the past, we have sometimes found that real-world (or should I say, real-space) data contains quirks which didn't show up on the test tapes. So, we need to keep a watching brief on the GBT data flow as it ramps up, just in case some tweaking is needed.

So far, I seem to have been allocated the grand total of two GBT tasks - both VLARs - and both are still waiting to run. So I'm not panicking yet.



. . Only 2? So far I have six, also all VLARs :)

. . The first one is now running and is 20 mins in with no probs so far.
ID: 1778726

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1778728 - Posted: 13 Apr 2016, 12:49:24 UTC - in response to Message 1778691.  

All systems go then :)
ID: 1778728

Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1778738 - Posted: 13 Apr 2016, 14:27:57 UTC

Looking good here.

WU : blc3_2bit_guppi_57451_19958_HIP62472_0005.7585.831.17.26.109.wu
MB8_win_x64_AVX_VS2010_r3308.exe -verb -nog :
Elapsed 5183.603 secs
CPU 5159.951 secs
MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3430.exe -device 0 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32 :
Elapsed 1386.068 secs, speedup: 73.26% ratio: 3.74x
CPU 384.917 secs, speedup: 92.54% ratio: 13.41x
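
The speedup and ratio figures above are consistent with a simple comparison against the CPU run as the baseline. A minimal sketch of that arithmetic, assuming the speedup is the fraction of baseline time saved and the ratio is baseline time divided by the new time (the `compare` helper is hypothetical, not part of the benchmark tool):

```python
def compare(baseline_secs, candidate_secs):
    """Return (speedup_percent, ratio) of a candidate run against a baseline run."""
    # Fraction of the baseline time that the candidate saves, as a percentage.
    speedup_percent = (1.0 - candidate_secs / baseline_secs) * 100.0
    # How many times faster the candidate is than the baseline.
    ratio = baseline_secs / candidate_secs
    return speedup_percent, ratio


# Timings copied from the post: AVX CPU app vs. OpenCL ATi app.
elapsed = compare(5183.603, 1386.068)
cpu = compare(5159.951, 384.917)

print(f"Elapsed speedup: {elapsed[0]:.2f}% ratio: {elapsed[1]:.2f}x")
print(f"CPU     speedup: {cpu[0]:.2f}% ratio: {cpu[1]:.2f}x")
```

Rounded to two decimals, both lines reproduce the percentages and ratios quoted in the post.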


With each crime and every kindness we birth our future.
ID: 1778738

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1778857 - Posted: 14 Apr 2016, 0:37:12 UTC - in response to Message 1778631.  

. . I have a quandary, a slight paradox. With two WUs running on the GPU that finish at different times, and an AP needing the whole GPU to itself, the AP task never starts because there is always one other task still running on the GPU. Instead of reserving the second slot until the GPU is completely free for the AP, it just starts another CUDA50 task. How do I tell it to give the AP some priority?
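
The quandary can be reproduced with a toy model. The sketch below is not BOINC's actual scheduler; it assumes a purely greedy rule (whenever some GPU fraction is free, start the first queued task that fits) and uses made-up task names and durations. Under that assumption, a task that needs the whole GPU waits until no half-GPU tasks are left in the queue.

```python
import heapq


def simulate(queue, capacity=1.0):
    """Greedy toy scheduler. queue holds (name, gpu_usage, duration) tuples.

    Returns a dict mapping each task name to the time it started.
    """
    pending = list(queue)
    running = []          # min-heap of (finish_time, name, gpu_usage)
    free = capacity       # GPU fraction currently unused
    now = 0.0
    starts = {}
    while pending:
        # Start every queued task that fits in the free fraction, in queue order.
        for task in list(pending):
            name, usage, duration = task
            if usage <= free + 1e-9:
                pending.remove(task)
                free -= usage
                starts[name] = now
                heapq.heappush(running, (now + duration, name, usage))
        if not pending:
            break
        # Nothing else fits: jump to the next task completion and free its share.
        finish_time, _, usage = heapq.heappop(running)
        now = finish_time
        free += usage
    return starts


# Hypothetical queue: half-GPU CUDA50 tasks around one whole-GPU AP task.
queue = [
    ("cuda50_1", 0.5, 20.0),
    ("cuda50_2", 0.5, 30.0),
    ("AP", 1.0, 60.0),
    ("cuda50_3", 0.5, 20.0),
    ("cuda50_4", 0.5, 20.0),
]
starts = simulate(queue)
for name, started in starts.items():
    print(f"{name} started at t={started}")
```

Because the two half-GPU tasks finish at different times, each freed half is immediately refilled by the next half-GPU task, and the AP only gets its turn once the queue has nothing smaller left to run.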
ID: 1778857

Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1778893 - Posted: 14 Apr 2016, 3:16:59 UTC - in response to Message 1778857.  
Last modified: 14 Apr 2016, 3:18:58 UTC

You can't. But the AP will start at some point. I notice that it seems to take at least 2 days for the APs to start running on my computers. I run 2 APs or 3 WUs on my GPUs. I know it seems like the APs will never start, but they do.

The AP, when it starts, will just suspend the still-running WU.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1778893

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1779057 - Posted: 14 Apr 2016, 17:30:47 UTC - in response to Message 1778893.  

You can't. But the AP will start at some point. I notice that it seems to take at least 2 days for the APs to start running on my computers. I run 2 APs or 3 WUs on my GPUs. I know it seems like the APs will never start, but they do.

The AP, when it starts, will just suspend the still-running WU.


. . That will really kill my proud record of same day results :)

. . Oh well, maybe I will just have to go back to one WU at a time.

. . Thanks for the reply
ID: 1779057

Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1779150 - Posted: 14 Apr 2016, 22:21:35 UTC - in response to Message 1779057.  

One thing you can do is to pause all the other WUs and the AP will start. You can then resume the other work and in my experience the AP will finish before the GPU is taken over again.

I did notice that you seem to have APs that have finished without problems.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1779150

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1779164 - Posted: 14 Apr 2016, 22:59:46 UTC - in response to Message 1779150.  

One thing you can do is to pause all the other WUs and the AP will start. You can then resume the other work and in my experience the AP will finish before the GPU is taken over again.

I did notice that you seem to have APs that have finished without problems.



. . That was what I resorted to, freezing all other work so the AP had sole use of the GPU. It worked. But I don't want to have to do that every time there is an AP in my queue.

. . Yes, I have had successful AP runs on both the CPU and the GPU when I was running onesies (a single GPU task at a time), but this was my first AP since going to twosies (you guessed it, 2 GPU tasks at once :) ). All have completed without a hitch, once I got the last one to actually start, that is.

. . My objective, as I said before, is minimum turnaround time; I like to return a result in less than half a day if possible.

. . It irritates the OCD side of me when I see massive numbers of WU's sitting in queues for weeks, or worse, months until they time out, when they could be completed in far less time. Oh well, "c'est la vie!"
ID: 1779164

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1779256 - Posted: 15 Apr 2016, 7:32:36 UTC - in response to Message 1779164.  

. . My objective, as I said before, is minimum turnaround time; I like to return a result in less than half a day if possible.

Most people like to keep their systems busy, so will carry the largest cache they can in order to get through outages without running out of work.
Grant
Darwin NT
ID: 1779256

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1779597 - Posted: 16 Apr 2016, 13:13:11 UTC - in response to Message 1779256.  

. . My objective, as I said before, is minimum turnaround time; I like to return a result in less than half a day if possible.

Most people like to keep their systems busy, so will carry the largest cache they can in order to get through outages without running out of work.



. . Yes the down times over the last couple of weeks have left me without work for many hours, most of one day in fact, which dented the old RAC. But there are a lot of volunteers who have like 200 jobs in their queue and haven't contacted the servers for weeks .... :(

. . This came to my attention when I reviewed some old completed jobs that have been in my cache for weeks or months (well, 2 anyway) without being validated, only to have many of those end up timing out and going to another host before being completed. When you look at the host that timed out, you see what I described above. It's fine to run with large queues of work in your cache when you are getting through them, such that it might take a few days to get a result in. But when your rig is only doing a few WUs a week and you only make contact once a fortnight or less, why have dozens or hundreds of jobs sitting there until they time out? That is the part that makes no sense to me :(

. . But as in the words of the song ... "now now, mustn't grumble". :)
ID: 1779597

Rich Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 27 Oct 14
Posts: 4
Credit: 49,285,910
RAC: 0
United States
Message 1779609 - Posted: 16 Apr 2016, 14:27:06 UTC - in response to Message 1779597.  

I have to agree with you, Stephen. I had one in my queue since December 25 and it was finally validated last night. No reason to have a queue that large; the servers are not down for more than a day or two at most.
ID: 1779609

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1779740 - Posted: 16 Apr 2016, 22:03:55 UTC - in response to Message 1779609.  
Last modified: 16 Apr 2016, 22:04:45 UTC

I have to agree with you, Stephen. I had one in my queue since December 25 and it was finally validated last night. No reason to have a queue that large; the servers are not down for more than a day or two at most.


. . That is how I see it too. I am guessing that one WU from December had at least one host time out on the job, maybe two in that time frame. In my limited experience the servers are not usually down for even a full day, let alone much more. I must admit my cache is too small for the longer outages, but I like to keep the turnaround snappy :)
ID: 1779740

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1779760 - Posted: 16 Apr 2016, 22:40:52 UTC - in response to Message 1779609.  

I have to agree with you Stephen. I had one in my queue since December 25 and it was finally validated last night. No reason to have a queue that large,

That isn't the result of a large queue (although one might be involved); it's the result of the length of the deadline for the WU, and whether, when the WU is reissued, it goes to a machine that returns a result or to another machine that isn't returning work.
Long running WUs have deadlines of roughly 8 weeks. While it's not common, it's not unusual for it to take almost 6 months for some results to validate due to the first machine timing out, and the next 2 re-issues also timing out before you finally get a system that returns a valid result.
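
The "almost 6 months" figure follows from the numbers in the paragraph above. A minimal sketch of the arithmetic, assuming each non-returning host holds the WU for the full deadline before it is reissued (the variable names and the 30.44-day average month are illustrative assumptions):

```python
DEADLINE_WEEKS = 8          # "roughly 8 weeks" for long-running WUs
consecutive_timeouts = 3    # first machine plus the next 2 re-issues

# Each timed-out host holds the WU for one full deadline period.
wait_weeks = DEADLINE_WEEKS * consecutive_timeouts
wait_months = wait_weeks * 7 / 30.44   # average calendar month length in days

print(f"{wait_weeks} weeks, about {wait_months:.1f} months before a working host gets the WU")
```

The time the final host takes to return its result comes on top of that wait.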
Grant
Darwin NT
ID: 1779760

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1779849 - Posted: 17 Apr 2016, 7:12:55 UTC - in response to Message 1779760.  
Last modified: 17 Apr 2016, 7:13:55 UTC

I have to agree with you Stephen. I had one in my queue since December 25 and it was finally validated last night. No reason to have a queue that large,

That isn't the result of a large queue (although one might be involved); it's the result of the length of the deadline for the WU, and whether, when the WU is reissued, it goes to a machine that returns a result or to another machine that isn't returning work.
Long running WUs have deadlines of roughly 8 weeks. While it's not common, it's not unusual for it to take almost 6 months for some results to validate due to the first machine timing out, and the next 2 re-issues also timing out before you finally get a system that returns a valid result.



. . And there you have the essence of my comment. A case in point is a job in my queue that is only 3 weeks old, but the other host has 400 jobs in his cache and has not contacted the server in that 3-week period. I am sure he has lots of jobs time out, because there are still half a dozen listed that timed out in February from a December issue. How many thousands of WUs are bouncing around taking months to get a result because of things like that? And as for it not being common: for a job to bounce several times requires several hosts to be doing the same thing, and those are just the ones that job encounters. With hundreds of thousands of possible hosts, what are the odds of that happening if it is so uncommon?

. . The reason those 6 WUs are still in his error queue is that they all went to the same new host, who is just as bad. He took them on the 29th of Feb and hasn't contacted the server since. There were probably dozens, maybe even hundreds, of others that timed out as well, but they were lucky and went to hosts that gave a result.

. . There is nothing I can do about it, but it just irks me. There are so many volunteers who seriously want to get the job done, so why do these individuals do what they do?

P.S. Maybe they need to shorten task deadlines ??
ID: 1779849