Posts by Eric B


1) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769874)
Posted 5 Mar 2016 by Profile Eric B
Everything is working fine now with the Lunatics binary, so it appears to be a bug in the official v8 MB software, wouldn't you say? Although I don't understand why it doesn't affect other machines.
Thanks for everyone's help with this issue, it's much appreciated!
2) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769873)
Posted 5 Mar 2016 by Profile Eric B
I noticed your host 7011730 that was having the issue looks to be running an older OS, Linux 3.11.10, than your two other machines, which have Linux 4.1.2 & Linux 4.4.3.

Possibly an older lib somewhere mucking up the works for the stock app?


The normal kernel for OpenSuse 13.1 is 3.11. In my case, I updated the kernel on those machines myself. I have a 4.4 kernel I'm working on as time permits and will be switching to that on erb1 at some point down the road. erb1 got rebuilt/reformatted about 10 months ago or so because it had been through too many upgrades and things were getting weird there, so I did a fresh 13.1 install. (I tried 13.2 at that time too, but I didn't feel it was really ready for prime time, so I reformatted and went back to 13.1.)
3) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769871)
Posted 5 Mar 2016 by Profile Eric B
I tried an experiment this afternoon and installed the Lunatics V8 MB executable just to see if that was going to have the same issue, and so far it hasn't. Of course it's going to take several days to see if this is really working, but it looks promising after 6 hours or so on Lunatics. One thing I do notice with Lunatics, though, and maybe it's nothing: the estimated total time per work unit seems high. Most are 5 hours plus, some are 14, and a few are around 2 hours.

Switching to anonymous platform means the server has to relearn how long it takes to complete tasks for the application. It also happens when switching to a new application like the change from SETI@home v7 to SETI@home v8.
After about 10-11 completed tasks you will see the estimated time for newly received tasks will be lower and lower until it is accurate.

I would guess the 5 and 2 hour tasks are probably VHAR tasks and the 14 hour ones are VLAR tasks.


Yup, you are right, after 12 hours or so the estimated times dropped down to the normal 2.5 hrs I used to see. Good to understand that now, thanks.
4) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769634)
Posted 5 Mar 2016 by Profile Eric B
I think this looks just like the good old stuck in benchmarks bug.

Eric, could you unhide your hosts or give a link to this one. It would be nice to see the stderr from more tasks.


I just unhid them. The troubled machine is erb1; the others seem fine.
5) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769633)
Posted 5 Mar 2016 by Profile Eric B
SETI@home preferences | Should SETI@home show your computers on its web site?

Got it! Thanks! They are visible now.
6) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769632)
Posted 5 Mar 2016 by Profile Eric B
I tried an experiment this afternoon and installed the Lunatics V8 MB executable just to see if that was going to have the same issue, and so far it hasn't. Of course it's going to take several days to see if this is really working, but it looks promising after 6 hours or so on Lunatics. One thing I do notice with Lunatics, though, and maybe it's nothing: the estimated total time per work unit seems high. Most are 5 hours plus, some are 14, and a few are around 2 hours.
7) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769629)
Posted 5 Mar 2016 by Profile Eric B
I think this looks just like the good old stuck in benchmarks bug.

Eric, could you unhide your hosts or give a link to this one. It would be nice to see the stderr from more tasks.

OK, maybe I'm dense, but for the life of me I can't find the place where you allow others to view your computers/tasks. Is it under account? I've seen it before (a long time past) but can't find it now.
8) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769563)
Posted 4 Mar 2016 by Profile Eric B
setiathome_8.00_x86_64-pc-linux-gnu is the main program and needs to be executable, but at 98 bytes, that can't be the real file, just a symlink. Let's assume the real file is executing properly, else nothing at all would be written to stderr.txt

I don't recognise boinc_setiathome_8, and at 266080 bytes it should be significant. Anyone?


It's a boinc link, not a symlink. In other words, it's an ordinary file that contains a reference. Only boinc does this; it isn't a standard part of Linux at all.

# cat setiathome_8.00_x86_64-pc-linux-gnu
<soft_link>../../projects/setiathome.berkeley.edu/setiathome_8.00_x86_64-pc-linux-gnu</soft_link>
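Following such a link by hand can be sketched in a few lines of Python. This is only an illustration, not part of boinc: the function name `resolve_boinc_link` is made up, and it assumes a relative target is meant relative to the directory holding the link file.

```python
import os
import re
from pathlib import Path

# Matches the single-element format shown in the cat output above.
SOFT_LINK_RE = re.compile(r"<soft_link>(.*?)</soft_link>", re.DOTALL)


def resolve_boinc_link(link_file):
    """Return the path a boinc link file refers to, or None if it is not one.

    Hypothetical helper: assumes a relative target is relative to the
    directory that contains the link file.
    """
    link_file = Path(link_file)
    # Link files are tiny (98 bytes in the listing), so the first few KB
    # are enough and we avoid reading a large real binary in full.
    with open(link_file, "rb") as handle:
        head = handle.read(4096).decode("utf-8", errors="replace")
    match = SOFT_LINK_RE.search(head)
    if match is None:
        return None
    target = match.group(1).strip()
    # Collapse the "../.." segments lexically; no filesystem access needed.
    return Path(os.path.normpath(link_file.parent / target))
```

Checking `os.access(resolved, os.X_OK)` on the returned path would then show whether the real binary, rather than the 98-byte link file, is executable.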
9) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769561)
Posted 4 Mar 2016 by Profile Eric B
Brent Norman:
No, not running as admin (or root, as we say in the Linux world), just an ordinary user. I'm not sure I understand your question about x86_64, but that's the CPU mode, not a filesystem thing.
10) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769554)
Posted 4 Mar 2016 by Profile Eric B
Also, do these file sizes and perms look ok (my username has been replaced with x's)?
# ls -l
total 72
-rw-r--r-- 1 xxxxxxxx users      0 Mar 3 06:02 boinc_lockfile
-rw-rw---- 1 xxxxxxxx users 266080 Mar 3 06:02 boinc_setiathome_8
-rw-r--r-- 1 xxxxxxxx users    100 Mar 2 22:39 graphics_app
-rw-r--r-- 1 xxxxxxxx users   8111 Mar 4 12:14 init_data.xml
-rw-r--r-- 1 xxxxxxxx users    104 Mar 2 22:39 result.sah
-rw-r--r-- 1 xxxxxxxx users     86 Mar 2 22:39 setiathome-8.00_AUTHORS
-rw-r--r-- 1 xxxxxxxx users     86 Mar 2 22:39 setiathome-8.00_COPYING
-rw-r--r-- 1 xxxxxxxx users     88 Mar 2 22:39 setiathome-8.00_COPYRIGHT
-rw-r--r-- 1 xxxxxxxx users     85 Mar 2 22:39 setiathome-8.00_README
-rw-r--r-- 1 xxxxxxxx users     98 Mar 2 22:39 setiathome_8.00_x86_64-pc-linux-gnu
-rw-r--r-- 1 xxxxxxxx users     75 Mar 2 22:39 seti_logo
-rw-r--r-- 1 xxxxxxxx users     77 Mar 2 22:39 sponsor_bkg
-rw-r--r-- 1 xxxxxxxx users     73 Mar 2 22:39 sponsor_logo
-rw-r--r-- 1 xxxxxxxx users   1356 Mar 3 06:02 stderr.txt
-rw-r--r-- 1 xxxxxxxx users   6590 Mar 3 06:02 wisdom.sah
-rw-r--r-- 1 xxxxxxxx users    100 Mar 2 22:39 work_unit.sah
11) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769550)
Posted 4 Mar 2016 by Profile Eric B
The stderr.txt (in full):

setiathome_v8 8.00 Revision: 3290 g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
libboinc: BOINC 7.7.0

Work Unit Info:
...............
WU true angle range is : 0.009480
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_BaseLineSmooth (no other)
v_avxGetPowerSpectrum 0.000065 0.00000
setiathome_v8 8.00 Revision: 3290 g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
libboinc: BOINC 7.7.0

Work Unit Info:
...............
WU true angle range is : 0.009480
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_BaseLineSmooth (no other)
v_avxGetPowerSpectrum 0.000111 0.00000
setiathome_v8 8.00 Revision: 3290 g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
libboinc: BOINC 7.7.0

Work Unit Info:
...............
WU true angle range is : 0.009480
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_BaseLineSmooth (no other)
v_avxGetPowerSpectrum 0.000101 0.00000
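The log above contains the startup banner three times with the same angle range, which reads as if the same work unit was started three times. A rough way to count that in a longer stderr.txt might look like the sketch below. The function name `summarize_stderr` is hypothetical, and treating each banner line as one application start is an assumption.

```python
import re

# One banner line per application start (assumed), e.g.
# "setiathome_v8 8.00 Revision: 3290 g++ (GCC) 4.4.7 ..."
HEADER_RE = re.compile(r"^setiathome_v8 \S+ Revision: \d+", re.MULTILINE)
ANGLE_RE = re.compile(r"WU true angle range is\s*:\s*([0-9.]+)")


def summarize_stderr(text):
    """Count startup banners and collect the distinct angle ranges seen."""
    starts = len(HEADER_RE.findall(text))
    angle_ranges = sorted(set(ANGLE_RE.findall(text)))
    return {"starts": starts, "angle_ranges": angle_ranges}
```

Several starts with a single angle range would be consistent with one task being restarted from the beginning rather than several different tasks running.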
12) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769504)
Posted 4 Mar 2016 by Profile Eric B
The cpu is fully loaded according to top, here is the info from boinc:
13) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769467)
Posted 4 Mar 2016 by Profile Eric B
As near as I can tell they are all CPU units.
Here is a sample image showing the progress of some WU's that were started yesterday and are still showing 0%. After another 24 or 48 hours they might show 100%; it varies with the particular WU. It's 4 AM here, so forgive me if I'm mumbling.

14) Message boards : Number crunching : Need help - Seti v8 MB units running forever (Message 1769374)
Posted 4 Mar 2016 by Profile Eric B
OpenSuse Linux 13.1 x86_64, Intel i7-3960X 6 core/HT 32G ram, NVIDIA GTX 460

Since I started crunching v8 MB units I have this problem on 1 PC where the majority of work units crunch forever. (I've let them run up to 60 hours and they stay at 100% and are still crunching with no expected completion time, just "---" Remaining.) I tried going back to earlier versions of boinc but have the same problem. I did a project reset and a fresh install, no dice. Sometimes shutting down boinc and restarting will get them to complete, sometimes not. Restarting boinc always resets the Elapsed Time back to zero.
Is this a known v8 issue? Is there anything I can do to fix this? Curious how it only affects this one system; my other two systems are fine (all 3 same OS).
15) Message boards : Number crunching : vlar running over 21hrs on cpu (Message 1398760)
Posted 4 Aug 2013 by Profile Eric B
Just a follow-up: using my locally compiled version of boinc 7.0.65 and these 2 executable files:
MBv7_7.05r1848_sse3_linux64 (Seti V7.05)
setiathome_x41g_x86_64-pc-linux-gnu_cuda32 (seti v7.00 cuda32)

has worked fine. I can crunch both CPU MB and NVidia Cuda WU's with no apparent issues.
My OS is OpenSuse 12.3, kernel 3.7.10-1.16-desktop, on a SNB-E 6 Core/HT with an Nvidia GTX-460 using the proprietary Nvidia Driver 319.32.
I'm posting this info in case anyone else has similar problems and could benefit from my resolution.
16) Message boards : Number crunching : vlar running over 21hrs on cpu (Message 1395930)
Posted 28 Jul 2013 by Profile Eric B
Sorry it's taken me so long to reply here. I ended up reformatting and re-installing. It didn’t help; the MB v7 stock app still gets stuck. So I am now running x41g cuda and MBv7_7.05r1848_sse3_linux64, which I just started up about 20 minutes ago. It looks promising, I'm crunching MB v7 CPU and Nvidia cuda. I will know more in a day or so.
The root cause of the previously posted problems seems to be something in the stock seti app that just doesn’t sit well with this OpenSuse 12.3 installation.
17) Message boards : Number crunching : vlar running over 21hrs on cpu (Message 1394666)
Posted 25 Jul 2013 by Profile Eric B
Good idea on ownership, but it's all OK. Could it be a library outside of boinc that it's using? BTW: I don’t have the ability to run native, so to speak. The change to use a new glibc 2.3.2 or higher really screwed me, and it's not practical to try to upgrade glibc either, too much depends on it. I have to compile boinc in order to use it. Maybe it's related to that somehow. But is it boinc, or is it the seti executable? How far back can I go with boinc versions and still run seti v7?
18) Message boards : Number crunching : vlar running over 21hrs on cpu (Message 1394517)
Posted 25 Jul 2013 by Profile Eric B
Even with a completely fresh install of boinc, I have issues with CPU MB. They seem to get stuck and make no forward progress; this happens about every 4 hours or so. If I restart boinc they take off like they should until the next few WU's, then I'm back in the same boat: the WU runs but makes no forward progress at all. When I do restart boinc, I also notice that every WU that was in progress is reset back to 0%. I have another system with the same OS, well, sorta the same. One is OpenSuse 12.1 with a 3.1 kernel, the other is OpenSuse 12.3 with a 3.8 kernel. I don’t have trouble on OpenSuse 12.1 for some reason.
19) Message boards : Number crunching : vlar running over 21hrs on cpu (Message 1393478)
Posted 22 Jul 2013 by Profile Eric B
I also have need of some instruction on these app_info.xml files. Is it just me or are these an absolute abomination? I don’t see the order there. When I try to run with the app_info.xml I ONLY get Nvidia work, never any CPU work. I tried removing the setiathome_enhanced sections but no change.
Basically I have 2 apps:
setiathome_7.01_x86_64-pc-linux-gnu
setiathome_x41g_x86_64-pc-linux-gnu_cuda32
How do I get a proper app_info.xml written?

<app_info>
    <app>
        <name>setiathome_v7</name>
    </app>
    <file_info>
        <name>setiathome_7.01_x86_64-pc-linux-gnu</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>701</version_num>
        <platform>x86_64-pc-linux-gnu</platform>
        <avg_ncpus>1.000000</avg_ncpus>
        <max_ncpus>1.000000</max_ncpus>
        <file_ref>
            <file_name>setiathome_7.01_x86_64-pc-linux-gnu</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <file_info>
        <name>setiathome_x41g_x86_64-pc-linux-gnu_cuda32</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libcudart.so.3</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libcufft.so.3</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_v7</app_name>
        <version_num>700</version_num>
        <platform>x86_64-pc-linux-gnu</platform>
        <plan_class>cuda32</plan_class>
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>1.0</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1.0</count>
        </coproc>
        <file_ref>
            <file_name>setiathome_x41g_x86_64-pc-linux-gnu_cuda32</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>libcudart.so.3</file_name>
        </file_ref>
        <file_ref>
            <file_name>libcufft.so.3</file_name>
        </file_ref>
    </app_version>
</app_info>
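One thing that stands out in the file above is that the only <app> declared is setiathome_v7, while the CPU <app_version> names setiathome_enhanced. Whether that mismatch is what keeps CPU work from arriving is an assumption, not something confirmed in this thread. A small Python sketch (the function name `undeclared_app_names` is made up) can at least flag such mismatches:

```python
import xml.etree.ElementTree as ET


def undeclared_app_names(app_info_xml):
    """List (app_name, version_num) pairs whose app_name has no <app> entry.

    Hypothetical consistency check; it says nothing about what the boinc
    client itself does with such a file.
    """
    root = ET.fromstring(app_info_xml)
    # Names declared via <app><name>...</name></app>
    declared = {
        el.text.strip() for el in root.findall("./app/name") if el.text
    }
    problems = []
    for version in root.findall("./app_version"):
        name = (version.findtext("app_name") or "").strip()
        number = (version.findtext("version_num") or "").strip()
        if name not in declared:
            problems.append((name, number))
    return problems
```

Run against the file above, this would report the version 701 entry, since setiathome_enhanced is never declared in an <app> section.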

20) Message boards : Number crunching : vlar running over 21hrs on cpu (Message 1393471)
Posted 22 Jul 2013 by Profile Eric B
Tbar: I ran a bunch of open browser windows, some graphics and a series of cuda apps and watched the remaining free GPU RAM. It never dropped below 600M, so I don't think I am doing anything outside of seti that would cause GPU RAM to get used up. In fact, most of the time the computer is just crunching with not much else running.

