Posts by Eric B |
1)
Message boards :
Number crunching :
SNB-E not using all threads as it should
(Message 1296822)
Posted 217 days ago by Eric B
Thanks for the tags tip! The [] part kind of threw me for a minute and made me wonder why <tag></tag> didn't do anything; then I realized the format was not standard, and I would assume that's to prevent attacks. Yes, this post is better looking now. I also re-wrapped some stuff for easier reading.
> ps aux|grep AK|grep -v grep |wc -l
11
From boinc mgr msgs:
Sun 14 Oct 2012 03:15:10 AM PDT Starting BOINC client version 6.10.58 for x86_64-pc-linux-gnu
Sun 14 Oct 2012 03:15:10 AM PDT Config: GUI RPC allowed from:
Sun 14 Oct 2012 03:15:10 AM PDT Config: 192.168.1.17
Sun 14 Oct 2012 03:15:10 AM PDT Config: 192.168.1.103
Sun 14 Oct 2012 03:15:10 AM PDT log flags: file_xfer, sched_ops, task
Sun 14 Oct 2012 03:15:10 AM PDT Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.5 c-ares/1.5.1
Sun 14 Oct 2012 03:15:10 AM PDT Data directory: /home/erbenton/BOINC
Sun 14 Oct 2012 03:15:10 AM PDT Processor: 12 GenuineIntel Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz [Family 6 Model 45 Stepping 7]
Sun 14 Oct 2012 03:15:10 AM PDT Processor: 15.00 MB cache
Sun 14 Oct 2012 03:15:10 AM PDT Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc
arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni
Sun 14 Oct 2012 03:15:10 AM PDT OS: Linux: 3.1.10cstm-1.16-cstm
Sun 14 Oct 2012 03:15:10 AM PDT Memory: 15.63 GB physical, 512.00 MB virtual
Sun 14 Oct 2012 03:15:10 AM PDT Disk: 52.87 GB total, 19.04 GB free
Sun 14 Oct 2012 03:15:10 AM PDT Local time is UTC -7 hours
Sun 14 Oct 2012 03:15:10 AM PDT NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 4020,
compute capability 2.1, 1024MB, 641 GFLOPS peak)
Sun 14 Oct 2012 03:15:10 AM PDT SETI@home Found app_info.xml; using anonymous platform
Sun 14 Oct 2012 03:15:10 AM PDT SETI@home URL http://setiathome.berkeley.edu/; Computer ID 4520457; resource share 100
Sun 14 Oct 2012 03:15:10 AM PDT General prefs: from http://milkyway.cs.rpi.edu/milkyway/ (last modified 29-May-2011 00:31:18)
Sun 14 Oct 2012 03:15:10 AM PDT Host location: none
Sun 14 Oct 2012 03:15:10 AM PDT General prefs: using your defaults
Sun 14 Oct 2012 03:15:10 AM PDT Reading preferences override file
Sun 14 Oct 2012 03:15:10 AM PDT Preferences:
Sun 14 Oct 2012 03:15:10 AM PDT max memory usage when active: 8002.02MB
Sun 14 Oct 2012 03:15:10 AM PDT max memory usage when idle: 12803.23MB
Sun 14 Oct 2012 03:15:10 AM PDT max disk usage: 4.00GB
Sun 14 Oct 2012 03:15:10 AM PDT (to change preferences, visit the web site of an attached project,
or select Preferences in the Manager)
Could it be that Milkyway@home entry? I have not been able to figure out how to get fully rid of it; e.g. it never shows in the boinc mgr, but it's in various files. How do I completely clean that thing out of there? It seems a likely candidate for trouble in my case, so it would be good to clean it out and see if that clears up the missing instance problem. I wonder if it's safe to just go delete all these references?
>grep milkyway * |less
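One way to see where the leftover Milkyway references live before deleting anything is to list which files in the data directory mention the project, roughly what `grep -l milkyway *` would report. Below is a minimal Python sketch of that idea. The function name `files_mentioning` and the example file names are made up for illustration, and the sketch only reads files; it changes nothing.

```python
import tempfile
from pathlib import Path


def files_mentioning(directory, needle):
    """Return sorted names of regular files directly inside `directory`
    whose contents contain `needle` (case-insensitive)."""
    needle_bytes = needle.lower().encode()
    hits = []
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue  # skip subdirectories such as projects/
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file: skip it rather than abort
        if needle_bytes in data.lower():
            hits.append(path.name)
    return sorted(hits)


if __name__ == "__main__":
    # Demo on a throwaway directory with hypothetical file names.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "prefs_example.xml").write_text(
            "<source_project>http://milkyway.cs.rpi.edu/milkyway/</source_project>\n"
        )
        Path(tmp, "other_example.txt").write_text("setiathome only\n")
        print(files_mentioning(tmp, "milkyway"))
```

Pointing it at the real data directory (/home/erbenton/BOINC in the log above) would give a list of files to inspect by hand. Whether it is safe to edit or delete any of them is a separate question that this sketch does not answer.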
|
|
2)
Message boards :
Number crunching :
SNB-E not using all threads as it should
(Message 1296785)
Posted 217 days ago by Eric B
Oh, I missed answering one of your questions: yes, I have plenty of CPU and GPU tasks on both machines, according to the boinc manager anyway (I hand counted well over 25 each of CPU and GPU before I stopped counting).
I wrote a script to track some things, and here is its output. I don't claim it's 100% accurate: the estimates of "available" work are only estimates based on how I see the average work progress, so they could be off a bit, but it's darn close. My stats come from analyzing the client_state.xml file and deducing what things meant by looking at the boinc manager for clues, e.g. find WU xx_yy and see what its state was in the manager, then go find it in client_state.xml and see what I could learn. I think I have the IDs of most of the states pretty well nailed down. There are actually 2 other states I haven't worked into my script yet, called "active_task_states": state 0, "started but currently suspended", and state 1, "actually executing". I'm always on the hunt for more info I can ferret out of that file and add to my script.
I do network uploads/downloads once a day and run this script via cron about 5 minutes before that. I'm watching for errors and so forth, because I find that if you try to do 3 CUDA tasks you start to see some errors, maybe 7 out of 50 completed fermi tasks or so (could just be the fermi SW, as it's the only linux fermi app out there that I know of anyway).
sys1 is the 12 thread SNB-E and snb2 is the quad core HT system. Sorry for the formatting; there doesn't seem to be a way to get the script output to space out properly. You can try to copy and paste it into an editor with fixed spacing and it should be more readable.
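The two "active_task_states" mentioned above could be tallied along the following lines. This is only a sketch, not the script from the post: it assumes client_state.xml parses as well-formed XML and that each running task appears as an `<active_task>` element with an `<active_task_state>` child. The sample document and the function name `count_active_task_states` are invented for illustration, and the state labels are the poster's own reading of the values.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Labels as described in the post (the poster's interpretation, not an official list).
STATE_LABELS = {0: "started but currently suspended", 1: "actually executing"}

# Hypothetical, heavily trimmed stand-in for client_state.xml.
SAMPLE = """<client_state>
  <active_task_set>
    <active_task>
      <result_name>wu_a</result_name>
      <active_task_state>1</active_task_state>
    </active_task>
    <active_task>
      <result_name>wu_b</result_name>
      <active_task_state>1</active_task_state>
    </active_task>
    <active_task>
      <result_name>wu_c</result_name>
      <active_task_state>0</active_task_state>
    </active_task>
  </active_task_set>
</client_state>"""


def count_active_task_states(xml_text):
    """Return {state_number: how_many_active_tasks_are_in_that_state}."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for task in root.iter("active_task"):
        state = task.findtext("active_task_state")
        if state is not None:
            counts[int(state.strip())] += 1
    return dict(counts)


if __name__ == "__main__":
    for state, n in sorted(count_active_task_states(SAMPLE).items()):
        print(state, STATE_LABELS.get(state, "unknown"), n)
```

Comparing the count of state 1 tasks against the `ps aux|grep AK` count is one way to cross-check what the manager shows.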
|
3)
Message boards :
Number crunching :
SNB-E not using all threads as it should
(Message 1296772)
Posted 217 days ago by Eric B
I did find that the SNB-E has this cc_config.xml file in the BOINC/projects directory (but it's named cc_config.xml.off), which I assume means it won't be read, and there is no corresponding file on the other system. Other than that, both are set to use 100% of processors, and "at most use xx processors" is set to 192, because on a few rare occasions I get to play with a very big server and 192 more than covers the number of threads that thing has. I checked these settings on the website and also in the manager preferences menu. Could this be causing the problem even though its name is cc_config.xml.off? Is there any config file in ~/BOINC I can examine to help determine why it only runs 11 CPU tasks? All threads seem fully occupied if I go by the gkrellm display.
cat ~/BOINC/cc_config.xml.off
<cc_config>
  <log_flags>
    <cpu_sched>1</cpu_sched>
    <debt_debug>1</debt_debug>
    <cpu_sched_debug>1</cpu_sched_debug>
    <coproc_debug>1</coproc_debug>
    <cpu_sched>1</cpu_sched>
    <file_xfer>0</file_xfer>
    <file_xfer_debug>0</file_xfer_debug>
    <app_msg_send>1</app_msg_send>
    <app_msg_receive>1</app_msg_receive>
    <unparsed_xml>1</unparsed_xml>
    <work_fetch_debug>1</work_fetch_debug>
  </log_flags>
</cc_config>
|
4)
Message boards :
Number crunching :
SNB-E not using all threads as it should
(Message 1296597)
Posted 217 days ago by Eric B
I have 2 OpenSuse 12.1 x64 Linux systems. One is a 4 core HT Sandy Bridge system with an Nvidia GTX460 and 8G DRAM; on that system the total is 10 boinc tasks: 8 cpu and 2 gpu. OK, that's great and what I would expect. I also have an SNB-E system which is 6 core HT (12 threads), and it also has an Nvidia GTX460, but 16G DRAM. On that system I get only 11 cpu and 2 cuda tasks running at a time. Both systems are using seti boinc version 6.10.58. The app_info.xml is virtually identical on both systems, and both are using Alex's AK_V* optimized linux fermi apps, e.g. on the SNB-E system:
cat ~/BOINC/projects/setiathome.berkeley.edu/app_info.xml
<app_info>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>AK_V8_linux64_ssse3</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <file_ref>
      <file_name>AK_V8_linux64_ssse3</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>setiathome-6.11.x86_64-pc-linux-gnu__cuda32</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>611</version_num>
    <plan_class>cuda_fermi</plan_class>
    <avg_ncpus>0.250</avg_ncpus>
    <max_ncpus>0.50</max_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>0.50</count>
    </coproc>
    <file_ref>
      <file_name>setiathome-6.11.x86_64-pc-linux-gnu__cuda32</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
ldd AK_V8_linux64_ssse3
linux-vdso.so.1 => (0x00007fff563aa000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa3a5570000)
libc.so.6 => /lib64/libc.so.6 (0x00007fa3a51e0000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa3a578d000)
libm.so.6 => /lib64/libm.so.6 (0x00007fa3a4f89000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fa3a4d73000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fa3a4b6f000)
locate libcuda:
BOINC/projects/setiathome.berkeley.edu/libcudart.so.3
/usr/lib/libcuda.so
/usr/lib/libcuda.so.1
/usr/lib/libcuda.so.304.43
/usr/lib64/libcuda.so
/usr/lib64/libcuda.so.1
/usr/lib64/libcuda.so.304.43
/usr/local/cuda/lib/libcudart.so
/usr/local/cuda/lib/libcudart.so.4
/usr/local/cuda/lib/libcudart.so.4.1.28
/usr/local/cuda/lib64/libcudart.so
/usr/local/cuda/lib64/libcudart.so.4
/usr/local/cuda/lib64/libcudart.so.4.1.28
And on the 8 thread SNB system it looks like this:
cat ~/BOINC/projects/setiathome.berkeley.edu/app_info.xml
<app_info>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>AK_V8_linux64_ssse3</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <file_ref>
      <file_name>AK_V8_linux64_ssse3</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>setiathome-6.11.x86_64-pc-linux-gnu__cuda32</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>611</version_num>
    <plan_class>cuda_fermi</plan_class>
    <avg_ncpus>0.250</avg_ncpus>
    <max_ncpus>0.50</max_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>0.50</count>
    </coproc>
    <file_ref>
      <file_name>setiathome-6.11.x86_64-pc-linux-gnu__cuda32</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
and:
ldd AK_V8_linux64_ssse3
linux-vdso.so.1 => (0x00007fff129c6000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007feb1d660000)
libc.so.6 => /lib64/libc.so.6 (0x00007feb1d2d0000)
/lib64/ld-linux-x86-64.so.2 (0x00007feb1d87d000)
libm.so.6 => /lib64/libm.so.6 (0x00007feb1d079000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007feb1ce63000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007feb1cc5f000)
libcuda seems to be 4.1.28:
BOINC/projects/setiathome.berkeley.edu/libcudart.so.3
/usr/lib/libcuda.so
/usr/lib/libcuda.so.1
/usr/lib/libcuda.so.304.43
/usr/lib64/libcuda.so
/usr/lib64/libcuda.so.1
/usr/lib64/libcuda.so.304.43
/usr/local/cuda/lib/libcudart.so
/usr/local/cuda/lib/libcudart.so.4
/usr/local/cuda/lib/libcudart.so.4.1.28
/usr/local/cuda/lib64/libcudart.so
/usr/local/cuda/lib64/libcudart.so.4
/usr/local/cuda/lib64/libcudart.so.4.1.28
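The GPU numbers in the app_info.xml above can be read off with a little arithmetic: a `<coproc>` `<count>` of 0.50 means each task claims half a GPU, so two tasks fit on the one GTX460 (matching the 2 cuda tasks seen on both systems), and at `<avg_ncpus>` 0.250 each, the pair budgets half a CPU between them. The Python sketch below does that arithmetic on the GPU part of the file. It is a reading of the numbers in the file, not a model of the BOINC scheduler, and it does not by itself explain the missing twelfth CPU task. The function name `gpu_sharing` is made up for illustration.

```python
import math
import xml.etree.ElementTree as ET

# GPU app_version copied from the app_info.xml in the post (file entries trimmed).
APP_INFO = """<app_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>611</version_num>
    <plan_class>cuda_fermi</plan_class>
    <avg_ncpus>0.250</avg_ncpus>
    <max_ncpus>0.50</max_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>0.50</count>
    </coproc>
  </app_version>
</app_info>"""


def gpu_sharing(xml_text):
    """For each app_version that has a <coproc>, return a tuple of
    (plan_class, tasks that fit on one GPU, CPU budgeted by those tasks)."""
    root = ET.fromstring(xml_text)
    rows = []
    for app_version in root.iter("app_version"):
        coproc = app_version.find("coproc")
        if coproc is None:
            continue  # CPU-only app_version
        gpu_fraction = float(coproc.findtext("count"))
        avg_ncpus = float(app_version.findtext("avg_ncpus", default="1"))
        # Small epsilon guards against float rounding, e.g. 1/0.1.
        tasks_per_gpu = math.floor(1.0 / gpu_fraction + 1e-9)
        rows.append(
            (app_version.findtext("plan_class"), tasks_per_gpu, tasks_per_gpu * avg_ncpus)
        )
    return rows


if __name__ == "__main__":
    print(gpu_sharing(APP_INFO))
```

Since the file is identical on both machines, this arithmetic comes out the same for the 8 thread and the 12 thread system, which suggests the difference lies somewhere other than app_info.xml.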
|
5)
Questions and Answers :
Unix/Linux :
boinc 6.10.17 requests CUDA tasks but cant get any
(Message 976575)
Posted 1173 days ago by Eric B
The problem was in app_info.xml, and contributing to the confusion were the very misleading behavior and error messages from BOINC; they are almost meaningless. It wasn't calling for GPU & CPU apps to run. Now I have setiathome-CUDA-6.08.x86_64-pc-linux-gnu running the CUDA stuff and AK_V8_linux64_ssse3 running Seti MB. The app_info.xml below works. Does anyone know where the app_info.xml is documented? I'd like to understand why the file_info information has to be repeated in the app_name section. I can think of a much simpler format; anyway, I'm running.
<app_info>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>AK_V8_linux64_ssse3</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <file_ref>
      <file_name>AK_V8_linux64_ssse3</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</name>
    <executable/>
  </file_info>
  <file_info>
    <name>libcudart.so.2</name>
    <executable/>
  </file_info>
  <file_info>
    <name>libcufft.so.2</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <plan_class>cuda</plan_class>
    <avg_ncpus>0.350000</avg_ncpus>
    <max_ncpus>0.350000</max_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>libcudart.so.2</file_name>
    </file_ref>
    <file_ref>
      <file_name>libcufft.so.2</file_name>
    </file_ref>
  </app_version>
</app_info>
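On why the names show up twice: judging from the structure of the file above, each `<file_info>` declares a file once, and each `<file_ref>` inside an `<app_version>` points back at one of those declarations by name, so the same file can be shared by several app versions. A quick consistency check follows from that reading. The Python sketch below lists any `<file_ref>` whose name has no matching `<file_info>`. The function name `unresolved_file_refs` and the deliberately broken sample are invented for illustration, and this is not BOINC's own validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical broken example: libcufft.so.2 is referenced but never declared.
BROKEN = """<app_info>
  <file_info><name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</name><executable/></file_info>
  <file_info><name>libcudart.so.2</name><executable/></file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <file_ref><file_name>setiathome-CUDA-6.08.x86_64-pc-linux-gnu</file_name><main_program/></file_ref>
    <file_ref><file_name>libcudart.so.2</file_name></file_ref>
    <file_ref><file_name>libcufft.so.2</file_name></file_ref>
  </app_version>
</app_info>"""


def unresolved_file_refs(xml_text):
    """Return (version_num, file_name) pairs for every <file_ref> whose
    file_name does not match the <name> of any <file_info>."""
    root = ET.fromstring(xml_text)
    declared = {info.findtext("name") for info in root.iter("file_info")}
    missing = []
    for app_version in root.iter("app_version"):
        for ref in app_version.iter("file_ref"):
            name = ref.findtext("file_name")
            if name not in declared:
                missing.append((app_version.findtext("version_num"), name))
    return missing


if __name__ == "__main__":
    print(unresolved_file_refs(BROKEN))
```

Run against the working app_info.xml from this post, the list should come back empty, since every referenced name is declared.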
|
6)
Questions and Answers :
Unix/Linux :
boinc 6.10.17 requests CUDA tasks but cant get any
(Message 976309)
Posted 1174 days ago by Eric B
I installed crunch3r's app and now CUDA is processing happily away. HOWEVER... on the first boinc run all my WUs got discarded. Boinc downloaded some CUDA work and has been trying to get CPU work for a while now, but like before with CUDA, now the CPU can't ever get work. Here's some selected output:
Sat 06 Mar 2010 11:46:02 AM PST SETI@home Sending scheduler request: To fetch work.
Sat 06 Mar 2010 11:46:02 AM PST SETI@home Requesting new tasks for CPU
Sat 06 Mar 2010 11:46:07 AM PST SETI@home Scheduler request completed: got 0 new tasks
Sat 06 Mar 2010 11:46:07 AM PST SETI@home Message from server: No work sent
Sat 06 Mar 2010 11:47:22 AM PST SETI@home Sending scheduler request: To fetch work.
Sat 06 Mar 2010 11:47:22 AM PST SETI@home Requesting new tasks for CPU
Sat 06 Mar 2010 11:47:25 AM PST Project communication failed: attempting access to reference site
Sat 06 Mar 2010 11:47:27 AM PST SETI@home Scheduler request failed: Server returned nothing (no headers, no data)
Sat 06 Mar 2010 11:47:31 AM PST Internet access OK - project servers may be temporarily down.
Sat 06 Mar 2010 11:48:27 AM PST SETI@home Sending scheduler request: To fetch work.
Sat 06 Mar 2010 11:48:27 AM PST SETI@home Requesting new tasks for CPU
Sat 06 Mar 2010 11:48:30 AM PST Project communication failed: attempting access to reference site
Sat 06 Mar 2010 11:48:32 AM PST SETI@home Scheduler request failed: Server returned nothing (no headers, no data)
Sat 06 Mar 2010 11:48:36 AM PST Internet access OK - project servers may be temporarily down.
Sat 06 Mar 2010 11:49:32 AM PST SETI@home Sending scheduler request: To fetch work.
Sat 06 Mar 2010 11:49:32 AM PST SETI@home Requesting new tasks for CPU
Sat 06 Mar 2010 11:49:37 AM PST SETI@home Scheduler request completed: got 0 new tasks
Sat 06 Mar 2010 11:49:37 AM PST SETI@home Message from server: No work sent
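When the Messages tab fills up with retries like these, a tally can make the pattern easier to see: how many requests went out, how many completed, how many failed outright, and how many tasks actually arrived. The Python sketch below matches only the message texts that appear in the log above. The function name `summarize_scheduler_log` is made up, and the sample is the first two request cycles from the post.

```python
import re
from collections import Counter

# First two request cycles, copied from the log in the post.
LOG = """Sat 06 Mar 2010 11:46:02 AM PST SETI@home Sending scheduler request: To fetch work.
Sat 06 Mar 2010 11:46:02 AM PST SETI@home Requesting new tasks for CPU
Sat 06 Mar 2010 11:46:07 AM PST SETI@home Scheduler request completed: got 0 new tasks
Sat 06 Mar 2010 11:46:07 AM PST SETI@home Message from server: No work sent
Sat 06 Mar 2010 11:47:22 AM PST SETI@home Sending scheduler request: To fetch work.
Sat 06 Mar 2010 11:47:22 AM PST SETI@home Requesting new tasks for CPU
Sat 06 Mar 2010 11:47:25 AM PST Project communication failed: attempting access to reference site
Sat 06 Mar 2010 11:47:27 AM PST SETI@home Scheduler request failed: Server returned nothing (no headers, no data)
Sat 06 Mar 2010 11:47:31 AM PST Internet access OK - project servers may be temporarily down.
"""

COMPLETED = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def summarize_scheduler_log(text):
    """Count scheduler requests sent, completed and failed, plus tasks received."""
    summary = Counter()
    for line in text.splitlines():
        match = COMPLETED.search(line)
        if match:
            summary["completed"] += 1
            summary["tasks_received"] += int(match.group(1))
        elif "Scheduler request failed" in line:
            summary["failed"] += 1
        elif "Sending scheduler request" in line:
            summary["sent"] += 1
    return dict(summary)


if __name__ == "__main__":
    print(summarize_scheduler_log(LOG))
```

A run where requests complete but `tasks_received` stays at 0 points at the server declining to send work, while a run dominated by failures points at connectivity or server load. The log above shows both.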
|
7)
Questions and Answers :
Unix/Linux :
boinc 6.10.17 requests CUDA tasks but cant get any
(Message 976136)
Posted 1175 days ago by Eric B
I installed boinc yesterday on a newly built system. Boinc seems to recognize that my nvidia card is CUDA capable; it even requests CUDA tasks, but I never get any. It's been trying for the last 20 hours. I do get regular seti MB tasks OK. I'm running Mandriva 64 bit 2010 with a slightly modified kernel. This fresh boinc install shows:
~/BOINC >file libcudart.so
libcudart.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped
and from the Messages tab I see:
Thu 04 Mar 2010 11:30:53 PM PST OS: Linux: 2.6.31.12-1mnbX
Thu 04 Mar 2010 11:30:53 PM PST Memory: 3.87 GB physical, 0 bytes virtual
Thu 04 Mar 2010 11:30:53 PM PST Disk: 70.33 GB total, 59.15 GB free
Thu 04 Mar 2010 11:30:53 PM PST Local time is UTC -8 hours
Thu 04 Mar 2010 11:30:53 PM PST NVIDIA GPU 0: GeForce GT 220 (driver version unknown, CUDA version 2030, compute capability 1.2, 1024MB, 131 GFLOPS peak)
[snip]
Fri 05 Mar 2010 07:14:04 PM PST SETI@home Requesting new tasks for GPU
Fri 05 Mar 2010 07:14:09 PM PST SETI@home Scheduler request completed: got 0 new tasks
Fri 05 Mar 2010 07:14:09 PM PST SETI@home Message from server: No work sent
I'm running this nvidia driver: NVIDIA-Linux-x86_64-190.53-pkg2.run
|
8)
Questions and Answers :
Wish list :
my wish list item
(Message 770733)
Posted 1799 days ago by Eric B
I'd like to see my ranking displayed on the boinc status bar. It should be updated each time boinc connects to the project. Those with more than one project should be able to select which project's ranking they'd like to see on the status bar.
|
9)
Questions and Answers :
Wish list :
science improvements with multi core cpu's
(Message 518901)
Posted 2287 days ago by Eric B
Intel just demo'd an 80 core CPU. IBM has opened the door to huge CPU caches. It won't be too far down the road before computer systems with dozens or even hundreds of CPU cores and large, fast memories are common. Has any thought or work gone into redesigning the Seti software to take advantage of such an environment? How much more in-depth analysis and processing could be done with 80 or 100 heavily cached cores and gigabytes of fast memory? How would you change the software if you knew it had teraflop performance at its disposal?
|
10)
Questions and Answers :
Web site :
Boinc is not ready
(Message 129756)
Posted 2886 days ago by Eric B
Has this boinc software been tested? Not too much, apparently. My boinc credits have plateaued even though I'm crunching like crazy on several machines, uploads are sporadic at best, the software only partially works (a little better on Windows than it does on Linux), messages are totally confusing, and the interface is clunky. My impression is that it was hurriedly put together to "just get it out". It looks like some cheap MSVC project written by an amateur. I'm really disappointed. Bring back seti-classic. It worked.
| Copyright © 2013 University of California |