Multi core greater than 80 core

Message boards : Number crunching : Multi core greater than 80 core

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1085315 - Posted: 9 Mar 2011, 15:06:29 UTC - in response to Message 1085270.  


Try to set:
On multiprocessors, use at most 50% of the processors

to see what happens (will BOINC use 80 or 30?)




I set this, suspended, then updated, then resumed. Before that I was up to 37% with 40 of my CPUs pegged. The setting didn't seem to change anything. I am certain that the missing memory in the lower tray, attached to the second 4 sockets, is the reason why the other processors won't register. I'll find out in a few hours when I get in.


Last night the system was displaying 40 processors, now it is at 60. Did setting to 50% increase the processor count for BOINC & are you saying you had it set to 37% before?

In Datacenter there are settings to limit CPU & memory resources per application. As you are running Enterprise I would guess it might have the same feature set? I'm not sure what the defaults are, but you might be running into that.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1085315
David Anderson
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 13 Feb 99
Posts: 173
Credit: 502,653
RAC: 0
Message 1085377 - Posted: 9 Mar 2011, 20:02:41 UTC

The BOINC client has no limit on the # of CPUs.
It uses the Windows GetSystemInfo() function to get the # of CPUs.
-- David
ID: 1085377
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085398 - Posted: 9 Mar 2011, 21:12:30 UTC - in response to Message 1085273.  

thanks,

here is some output:

3/9/2011 12:55:01 AM Starting BOINC client version 6.10.58 for windows_x86_64
3/9/2011 12:55:01 AM log flags: file_xfer, sched_ops, task
3/9/2011 12:55:01 AM Libraries: libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
3/9/2011 12:55:01 AM Data directory: C:\ProgramData\BOINC
3/9/2011 12:55:01 AM Running under account Administrator
3/9/2011 12:55:01 AM Processor: 60 GenuineIntel Intel(R) Xeon(R) CPU E7- 2870 @ 2.40GHz [Family 6 Model 47 Stepping 2]
3/9/2011 12:55:01 AM Processor: 256.00 KB cache
3/9/2011 12:55:01 AM Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx smx tm2 dca popcnt aes pbe
3/9/2011 12:55:01 AM OS: Microsoft Windows Server 2008 "R2": Enterprise x64 Edition, Service Pack 1, (06.01.7601.00)
3/9/2011 12:55:01 AM Memory: 63.99 GB physical, 127.98 GB virtual
3/9/2011 12:55:01 AM Disk: 136.60 GB total, 62.52 GB free
3/9/2011 12:55:01 AM Local time is UTC -8 hours
3/9/2011 12:55:01 AM No usable GPUs found
3/9/2011 12:55:01 AM SETI@home Found app_info.xml; using anonymous platform
3/9/2011 12:55:01 AM SETI@home URL http://setiathome.berkeley.edu/; Computer ID 5847405; resource share 100
3/9/2011 12:55:01 AM SETI@home General prefs: from SETI@home (last modified 08-Mar-2011 22:10:17)
3/9/2011 12:55:01 AM SETI@home Computer location: work
3/9/2011 12:55:01 AM SETI@home General prefs: no separate prefs for work; using your defaults
3/9/2011 12:55:01 AM Reading preferences override file
3/9/2011 12:55:01 AM Preferences:
3/9/2011 12:55:01 AM max memory usage when active: 32762.76MB
3/9/2011 12:55:01 AM max memory usage when idle: 58972.97MB
3/9/2011 12:55:01 AM max disk usage: 10.00GB
3/9/2011 12:55:01 AM don't use GPU while active
3/9/2011 12:55:01 AM (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
3/9/2011 12:55:01 AM Using proxy info from GUI
3/9/2011 12:55:01 AM Using HTTP proxy itgproxy:80

3/9/2011 12:55:01 AM SETI@home Started upload of 09se10ac.25092.3748.3.10.111_0_0
3/9/2011 12:55:01 AM SETI@home Started upload of 09se10ac.25092.7020.3.10.119_1_0
3/9/2011 12:55:01 AM GPUs have become unusable; disabling tasks


According to your startup log, you've set this host to the 'work' venue,
but you don't have any 'work' preferences, so BOINC is using the default prefs
and then applying the local preferences, which override the web preferences.
You could try clearing the local preferences to see if more CPUs are utilised.

Claggy



ok, thanks for the pointer, I updated work preferences.
ID: 1085398
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085401 - Posted: 9 Mar 2011, 21:20:06 UTC - in response to Message 1085285.  

Hmm

I have an idea: perhaps BOINC is limited to handling 64 subprocesses and can't load more cores by itself.

If you take a look at my blog, I have a workaround that could help you fully utilize that system.

Unfortunately the instances get separate host IDs even though they are running on the same machine, but you can easily add up the numbers you get to see the total RAC of that system.

Look here for the principles of running more than one BOINC installation on one system.
http://vyper.kafit.se/wp/index.php/2011/02/04/running-different-nvidia-architectures-most-optimal-at-setihome/

Kind regards Vyper



very cool, I will look at running more instances.
ID: 1085401
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085403 - Posted: 9 Mar 2011, 21:24:30 UTC - in response to Message 1085315.  


Last night the system was displaying 40 processors, now it is at 60. Did setting to 50% increase the processor count for BOINC & are you saying you had it set to 37% before?

In Datacenter there are settings to limit CPU & memory resources per application. As you are running Enterprise I would guess it might have the same feature set? I'm not sure what the defaults are, but you might be running into that.



Yes, I set it to 50%; that's all I did. I set it back, so let's see if it changes back. No, in Task Manager, as you can see in the screenshot, it's at 37% utilization over the cores. I was at 25%, then it shot up.

Is there an outage today? I am not getting new jobs. Down to 2% now.
ID: 1085403
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1085412 - Posted: 9 Mar 2011, 21:41:54 UTC - in response to Message 1085403.  
Last modified: 9 Mar 2011, 21:53:34 UTC

The only tasks you've got left are 5 Astropulse tasks; make sure none of them are suspended.

Is BOINC asking for work, and how much?

If you make a cc_config.xml (with Notepad) containing the following,
drop it in your BOINC Data directory, and have BOINC read the config file:

<cc_config>
   <log_flags>
      <file_xfer>1</file_xfer>
      <sched_ops>1</sched_ops>
      <sched_op_debug>1</sched_op_debug>
      <task>1</task>
   </log_flags>
</cc_config>


The sched_op_debug flag will show how much you're asking for.

Claggy
ID: 1085412
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1085421 - Posted: 9 Mar 2011, 21:56:30 UTC - in response to Message 1085403.  

The replica just came back online; that could be part of the issue.
Janice
ID: 1085421
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1085425 - Posted: 9 Mar 2011, 22:01:18 UTC - in response to Message 1085377.  
Last modified: 9 Mar 2011, 22:43:55 UTC

The BOINC client has no limit on the # of CPUs.
It uses the Windows GetSystemInfo() function to get the # of CPUs.
-- David


Do you have any idea why Windows Task Manager shows 160 virtual CPUs (so Windows "knows" about all of them) but BOINC "thinks" there are 60 virtual CPUs?

Is it possible for the GetSystemInfo() function to "lie" about the # of CPUs?


GetSystemInfo Function:
http://msdn.microsoft.com/en-us/blogvisualizer/ms724381

SYSTEM_INFO Structure:
http://msdn.microsoft.com/en-us/blogvisualizer/ms724958

"dwNumberOfProcessors
The number of logical processors in the current group. To retrieve this value, use the GetLogicalProcessorInformation function."


GetLogicalProcessorInformation Function:
http://msdn.microsoft.com/en-us/blogvisualizer/ms683194

"On systems with more than 64 logical processors, the GetLogicalProcessorInformation function retrieves logical processor information about processors in the processor group to which the calling thread is currently assigned. Use the GetLogicalProcessorInformationEx function to retrieve information about processors in all processor groups on the system."


Processor Groups:
http://msdn.microsoft.com/en-us/blogvisualizer/dd405503

"Support for systems that have more than 64 logical processors is based on the concept of a processor group, which is a static set of up to 64 logical processors that is treated as a single scheduling entity. Processor groups are numbered starting with 0. Systems with fewer than 64 logical processors always have a single group, Group 0."


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1085425
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085434 - Posted: 9 Mar 2011, 22:10:10 UTC - in response to Message 1085412.  

Forgive my ignorance, but where is the BOINC data dir? You mean just the BOINC dir?
ID: 1085434
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1085439 - Posted: 9 Mar 2011, 22:15:19 UTC - in response to Message 1085434.  

Forgive my ignorance, but where is the BOINC data dir? You mean just the BOINC dir?

It's posted in your startup log, 4th line:

3/9/2011 12:55:01 AM Data directory: C:\ProgramData\BOINC


It'll be a hidden folder; either unhide it or paste the file into that location.

Claggy
ID: 1085439
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085441 - Posted: 9 Mar 2011, 22:21:19 UTC - in response to Message 1085439.  

3/9/2011 2:20:17 PM SETI@home [sched_op_debug] handle_scheduler_reply(): got ack for result ap_10ja11aa_B3_P0_00075_20110308_21339.wu_0
3/9/2011 2:20:17 PM SETI@home [sched_op_debug] Deferring communication for 5 min 3 sec
3/9/2011 2:20:17 PM SETI@home [sched_op_debug] Reason: requested by project
3/9/2011 2:20:19 PM Re-reading cc_config.xml
3/9/2011 2:20:19 PM Re-read config file
3/9/2011 2:20:19 PM log flags: file_xfer, sched_ops, task, sched_op_debug
3/9/2011 2:20:25 PM SETI@home resumed by user
3/9/2011 2:20:25 PM SETI@home Restarting task ap_10ja11aa_B4_P0_00172_20110308_01236.wu_0 using astropulse_v505 version 505
3/9/2011 2:20:25 PM SETI@home Restarting task ap_01dc10aa_B5_P0_00207_20110303_04185.wu_2 using astropulse_v505 version 505

ID: 1085441
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085443 - Posted: 9 Mar 2011, 22:23:31 UTC - in response to Message 1085441.  

I had to order some new memory risers for the lower 40 CPUs. They should be here by tomorrow. Then we can see about getting this thing to scale.

For now I will swap back to the IA-64 and see if I can't get that going.
ID: 1085443
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20252
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1085480 - Posted: 9 Mar 2011, 23:38:27 UTC - in response to Message 1085220.  

... 160 LP system not being able to scale properly. ...


You could check it out with Linux... That would make for a very interesting comparison! (Linux is designed to scale up well, especially for the scheduler.)


Good luck, and:

Happy fast crunchin'!,
Martin


See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1085480
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1085491 - Posted: 10 Mar 2011, 0:20:02 UTC - in response to Message 1085425.  
Last modified: 10 Mar 2011, 0:27:36 UTC


If BOINC is using the GetSystemInfo() function to find the number of logical processors in the computer,
I think BOINC will see no more than 64 CPUs.

I hope the workaround will be to set this in cc_config.xml:
   <options>
      <ncpus>160</ncpus>
   </options>


(this was proposed already in the second post in this thread)

I'm not sure whether the 160 running tasks will be properly assigned to different logical CPUs or will run in just one group of 64 CPUs.
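
For reference, a complete cc_config.xml that combines this option with the logging flags suggested earlier in the thread might look like the sketch below. It assumes both sections sit side by side under a single <cc_config> root, and the value 160 is simply the logical processor count discussed for this machine, so treat it as a placeholder rather than a recommendation.

```xml
<cc_config>
   <!-- Logging flags from the earlier post; sched_op_debug shows how much work is requested -->
   <log_flags>
      <file_xfer>1</file_xfer>
      <sched_ops>1</sched_ops>
      <sched_op_debug>1</sched_op_debug>
      <task>1</task>
   </log_flags>
   <!-- Assumption: overrides the detected CPU count; 160 is this thread's example value -->
   <options>
      <ncpus>160</ncpus>
   </options>
</cc_config>
```

As with the logging-only file, it would go in the BOINC Data directory and take effect once the client re-reads its config. Whether the extra tasks actually land on processors outside the first group is still the open question above.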


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1085491
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085492 - Posted: 10 Mar 2011, 0:21:52 UTC - in response to Message 1085480.  

... 160 LP system not being able to scale properly. ...


You could check it out with Linux... That would make for a very interesting comparison! (Linux is designed to scale up well, especially for the scheduler.)


Good luck, and:

Happy fast crunchin'!,
Martin



Linux..... barf... :) Sorry, we are Windows Server users only. LOL
ID: 1085492
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085493 - Posted: 10 Mar 2011, 0:24:10 UTC - in response to Message 1085492.  

This is a pic of the HP RX6600 (16GB, 4 sockets with threads, 16 LP) on Win2k8R2SP1 IA-64. With the binary I got, it seems to scale well.



ID: 1085493
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085494 - Posted: 10 Mar 2011, 0:24:44 UTC - in response to Message 1085493.  

Next is the 256-way! Cross your fingers.
ID: 1085494
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085505 - Posted: 10 Mar 2011, 0:58:10 UTC - in response to Message 1085494.  

How do I request more tasks? The 160-processor box is down to 1 and it won't get more. Last night it had like 60+.
ID: 1085505
Bry B
Joined: 3 Apr 99
Posts: 53
Credit: 832,165
RAC: 0
United States
Message 1085506 - Posted: 10 Mar 2011, 1:04:24 UTC - in response to Message 1085505.  

How do I request more tasks? The 160-processor box is down to 1 and it won't get more. Last night it had like 60+.



nevermind I got it :)
ID: 1085506
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1085509 - Posted: 10 Mar 2011, 1:11:59 UTC - in response to Message 1085491.  
Last modified: 10 Mar 2011, 1:30:07 UTC


The ncpus option just tells BOINC to run that number of instances of the science app, IIRC. Windows SHOULD take care of assigning resources correctly.

EDIT: ncpus definitely works
3/9/2011 8:24:57 PM Processor: 256 GenuineIntel Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz [Family 6 Model 30 Stepping 5]
Also, if you do not have 256 CPUs, make sure you suspend running work before setting that & running BOINC, as trying to run 16 threads per CPU on a wee i7 with only 4GB of RAM isn't a good thing. lol
3/9/2011 8:29:23 PM Number of usable CPUs has changed from 256 to 8. Running benchmarks. Phew!
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1085509


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.