ARM Community: Meaning of ACTLR.smp - ARM Community


Meaning of ACTLR.smp

#1 User is offline   SJS 

  • Member
  • Pip
  • Group: Members
  • Posts: 14
  • Joined: 07-November 11

Posted 20 December 2011 - 07:09 PM

Hi,

Is it really important to have the SMP bit in ACTLR set to 1 (Cortex-A9 MPCore)? What is the purpose of this bit? The ARM website says that this bit indicates whether the processor takes responsibility for cache coherence or not. I thought that the SCU had something to do with cache coherence (I read that it uses the MESI algorithm for that). So will the cache be coherent if ACTLR.SMP = 0?

Thanks
0

#2 User is offline   ttfn 

  • Super Contributor
  • PipPipPipPip
  • Group: Members
  • Posts: 576
  • Joined: 29-September 06

Posted 21 December 2011 - 08:24 AM

The ACTLR.SMP bit is the per CPU enable. Basically as well as the global setting, each individual CPU also needs to enable coherency support.

To use the hardware coherency support you need to do several things...

* Enable the SCU, through the SCU Control Register (can be done by any of the CPUs)
* Enable coherency management in the CPU, through the ACTLR.SMP bit (must be done by EACH CPU you want coherency for)
* Enable the MMU on the CPU (again, do this on each CPU)
* Mark the appropriate address regions as Normal, WB/WA, shareable
* Optionally set the ACTLR.FW bit
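As a rough sketch of the register side of that list (not the MMU or page-table steps), the helpers below compute the new ACTLR and SCU Control Register values. The bit positions (ACTLR.SMP = bit 6, ACTLR.FW = bit 0, SCU enable = bit 0) are the ones documented for the Cortex-A9 and should be checked against the TRM for your part; the function names are made up for illustration. The CP15 access only compiles on an ARMv7 target with a GCC-style compiler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as documented for the Cortex-A9; verify against your TRM. */
#define ACTLR_FW        (1u << 0)  /* broadcast cache/TLB maintenance operations */
#define ACTLR_SMP       (1u << 6)  /* this CPU takes part in coherency */
#define SCU_CTRL_ENABLE (1u << 0)  /* SCU Control Register enable bit */

/* Return the ACTLR value with SMP set, and FW set only when requested. */
static uint32_t actlr_with_coherency(uint32_t actlr, bool broadcast_maintenance)
{
    actlr |= ACTLR_SMP;
    if (broadcast_maintenance)
        actlr |= ACTLR_FW;
    return actlr;
}

/* Return the SCU Control Register value with the SCU enabled. */
static uint32_t scu_ctrl_with_enable(uint32_t scu_ctrl)
{
    return scu_ctrl | SCU_CTRL_ENABLE;
}

#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
/* Read-modify-write of ACTLR (CP15 c1, c0, 1). Must run on EACH CPU, in a
 * privileged mode that is permitted to write the register. */
static inline void cpu_enable_coherency(bool broadcast_maintenance)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 1" : "=r"(v));
    v = actlr_with_coherency(v, broadcast_maintenance);
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 1\n\tisb" : : "r"(v) : "memory");
}
#endif
```

The SCU enable is a memory-mapped write at an offset from the private peripheral base, whose address is chip-specific, so it is left out here.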

The story is a little different on the A15
0

#3 User is offline   SJS 

  • Member
  • Pip
  • Group: Members
  • Posts: 14
  • Joined: 07-November 11

Posted 21 December 2011 - 09:40 AM

It seems that this bit is important, but there is no possibility to set it. I am just wondering what was the reason to put it in ACTLR, which cannot be modified... (there is no possibility to switch to secure mode). Can coherency be achieved by some sort of emulation in software? There should be a way out... It cannot be that stupid.
0

#4 User is offline   isogen74 

  • Super Contributor
  • PipPipPipPip
  • Group: Members
  • Posts: 1097
  • Joined: 20-March 07

Posted 21 December 2011 - 12:30 PM

If the secure world bootstrap has set the NSACR.NS_SMP bit to 1 then the SMP-bit in the ACTLR is non-secure writeable.

If you have a multi-core A9 then it is usual for the secure bootstrap to either:

(1) enable SMP for all cores before handing over to non-secure, leaving the SMP-bit read-only in the non-secure world. This means you cannot run AMP, but that's fairly uncommon.

(2) set the NSACR.NS_SMP bit to 1, and let non-secure decide how to use SMP/AMP across the cores. But any secure software running has to cope with SMP/AMP changing beneath its feet.
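A small sketch of how non-secure boot code might decide what to do, given values already read from NSACR and ACTLR. It assumes the Cortex-A9 layout (NSACR.NS_SMP = bit 18, ACTLR.SMP = bit 6); the enum and function names are hypothetical.

```c
#include <stdint.h>

#define NSACR_NS_SMP (1u << 18)  /* Cortex-A9: ACTLR.SMP writable from non-secure */
#define ACTLR_SMP    (1u << 6)

enum smp_action {
    SMP_ALREADY_SET,         /* case (1): secure bootstrap enabled SMP already */
    SMP_SET_FROM_NONSECURE,  /* case (2): non-secure code may set the bit itself */
    SMP_NEEDS_SECURE_WORLD   /* neither: only secure software can set it */
};

/* Decide what non-secure code can do about ACTLR.SMP.
 * On the target, NSACR is read with: mrc p15, 0, <Rt>, c1, c1, 2 */
static enum smp_action plan_smp_enable(uint32_t nsacr, uint32_t actlr)
{
    if (actlr & ACTLR_SMP)
        return SMP_ALREADY_SET;
    if (nsacr & NSACR_NS_SMP)
        return SMP_SET_FROM_NONSECURE;
    return SMP_NEEDS_SECURE_WORLD;
}
```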
When optimizing software, consider that the quickest code to run is the bit you removed from the call path.
0

#5 User is offline   ttfn 

  • Super Contributor
  • PipPipPipPip
  • Group: Members
  • Posts: 576
  • Joined: 29-September 06

Posted 21 December 2011 - 12:54 PM

Couple of additional thoughts...

The NSACR.NS_SMP bit is only available if your part is based on the r1p0 (or later) A9.

The SCU registers can also be restricted to Secure access only... So you potentially have the same problem with the SCU enable as with the ACTLR.
0

#6 User is offline   calvinhung 

  • Member
  • Pip
  • Group: Members
  • Posts: 3
  • Joined: 22-March 12

Posted 22 March 2012 - 06:35 AM

I'm running Linux on a single core of Cortex-A9 MPCore.
If I enable SMP config in Linux, it will set ACTLR.SMP=1.

Is it correct to set ACTLR.SMP=1 with a single core of the CA9 MPCore?
0

#7 User is offline   ttfn 

  • Super Contributor
  • PipPipPipPip
  • Group: Members
  • Posts: 576
  • Joined: 29-September 06

Posted 22 March 2012 - 08:53 AM

It shouldn't make much difference (unless your part uses ACP).
0

#8 User is offline   calvinhung 

  • Member
  • Pip
  • Group: Members
  • Posts: 3
  • Joined: 22-March 12

Posted 22 March 2012 - 04:04 PM

Thanks to ttfn's response.

However, the problem is that if I don't set ACTLR.SMP=1 in Linux SMP mode for a single-core CA9 MPCore, the kernel hangs in the atomic lock loop when doing LDREX/STREX operations.

Is this normal, or did I miss something?
0

#9 User is offline   ttfn 

  • Super Contributor
  • PipPipPipPip
  • Group: Members
  • Posts: 576
  • Joined: 29-September 06

Posted 23 March 2012 - 02:38 PM

Ahh.... it depends on what Linux does with page tables and whether you have a global monitor.

So what does setting the ACTLR.SMP bit actually do? Well (assuming the SCU is enabled) it configures the core as being part of the inner-shareable domain. This affects all the regions you mark as Write-Back/Write-Allocate inner cacheable + shared in the translation tables.

* SCU enabled + ACTLR.SMP bit set
Inner WB/WA + shared regions treated as cacheable at L1, SCU maintains coherency between cores in cluster

* SCU disabled and/or ACTLR.SMP bit not set
Inner WB/WA + shared regions treated as NON-CACHEABLE. This is the same behaviour as on the Cortex-A8.

The Shareable attribute tells the processor whether _other_ processors/masters access the region, or if it's just this processor. For cores without any coherency logic, marking a region as shareable means the region will NOT be cached by the integrated caches, regardless of what you set the inner cache policy to. This is because you've told it that another master might modify the region, and the core would have no way to detect this. So to be "safe" it won't cache the shared region.

For the MPCores, the shared regions are cached because the coherency can be maintained with the other cores in the cluster.
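The two cases above boil down to a single condition. A minimal model of it, with hypothetical names, covering only regions marked Inner WB/WA + shared:

```c
#include <stdbool.h>

enum l1_treatment {
    L1_CACHED_COHERENT,  /* cached at L1, SCU keeps the cores coherent */
    L1_NON_CACHEABLE     /* treated as non-cacheable, as on the Cortex-A8 */
};

/* How an Inner WB/WA + shared region is treated on an A9 MPCore,
 * according to the description above. */
static enum l1_treatment shared_wbwa_treatment(bool scu_enabled, bool smp_bit_set)
{
    return (scu_enabled && smp_bit_set) ? L1_CACHED_COHERENT
                                        : L1_NON_CACHEABLE;
}
```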


Why does this matter to you????

Well, I'm guessing that by setting CONFIG_SMP you are causing the kernel to mark cacheable memory as shared. That will work fine as long as the ACTLR.SMP bit is set (which the kernel should do) and the other cores are suitably configured.

For mutexes/semaphores, you use the special LDREX/STREX instructions. These are there to allow you to implement mutex/semaphore lock functions. Basically when you do a STREX it "checks" whether the location has changed since you read it with a LDREX. This checking can be done inside the core (Local Monitor) or in the memory system (Global Monitor). Basically any region which gets cached will only use the Local Monitor, while regions that are not cached use the Global Monitor. Flipping the ACTLR.SMP bit therefore changes which Monitor you use for Inner WB/WA + Shared regions...

Problem is not all chips actually have a Global Monitor. So the STREX instructions which try to use the Global Monitor will just fail.
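To make the failure mode concrete, here is a minimal spinlock sketch written with C11 atomics rather than hand-written assembly (this is not the Linux kernel's implementation). On ARMv7 targets, GCC and Clang typically lower the compare-exchange to an LDREX/STREX retry loop, so if every STREX fails because no monitor tracks the access, even the try-lock never returns and the CPU appears to hang. The type and function names are illustrative only.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_int locked;  /* 0 = free, 1 = held */
} spinlock_t;

static void spin_init(spinlock_t *l)
{
    atomic_init(&l->locked, 0);
}

/* Try once to take the lock; returns true on success. */
static bool spin_trylock(spinlock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock is taken. */
static void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l)) {
        /* busy-wait; real code would add a pause/WFE hint here */
    }
}

static void spin_unlock(spinlock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```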
1

#10 User is offline   calvinhung 

  • Member
  • Pip
  • Group: Members
  • Posts: 3
  • Joined: 22-March 12

Posted 23 March 2012 - 03:02 PM

Thanks a lot, ttfn. Very detailed explanation.

One more question, do you mean ACP here by "Global Monitor"? or other common implementations?
1

#11 User is offline   ttfn 

  • Super Contributor
  • PipPipPipPip
  • Group: Members
  • Posts: 576
  • Joined: 29-September 06

Posted 23 March 2012 - 04:12 PM

If by ACP you mean Accelerator Coherency Port, no. They are unrelated.

Imagine you had two processors (say an A9 and R4) in one chip. They share some data, and you want to use a mutex to control which of the two processors can access the data at once. You need some hardware support for ensuring that a STREX from one processor can detect if the other got there first. That is the job of the Global Monitor. In my experience, the Global Monitor is usually part of the memory controller. That is if you have one - not all chips do.
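A toy software model of that idea, for illustration only: each master gets one tagged address, a load-exclusive sets the tag, and a successful store-exclusive clears every tag on that address so the other master's pending STREX fails. Real monitors differ in granularity and in implementation-defined details; all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_MASTERS 2  /* e.g. the A9 and the R4 in the example above */

typedef struct {
    bool     tagged[NUM_MASTERS];  /* does this master hold a reservation? */
    uint32_t addr[NUM_MASTERS];    /* address the reservation covers */
} global_monitor_t;

static void monitor_reset(global_monitor_t *m)
{
    for (int i = 0; i < NUM_MASTERS; i++) {
        m->tagged[i] = false;
        m->addr[i] = 0;
    }
}

/* Models LDREX: record a reservation for this master. */
static void monitor_load_exclusive(global_monitor_t *m, int master, uint32_t a)
{
    assert(master >= 0 && master < NUM_MASTERS);
    m->tagged[master] = true;
    m->addr[master] = a;
}

/* Models STREX: returns true if the store would be performed. */
static bool monitor_store_exclusive(global_monitor_t *m, int master, uint32_t a)
{
    assert(master >= 0 && master < NUM_MASTERS);
    if (!m->tagged[master] || m->addr[master] != a) {
        m->tagged[master] = false;  /* reservation lost or never held */
        return false;
    }
    /* Success: clear every reservation on this address, including our own. */
    for (int i = 0; i < NUM_MASTERS; i++) {
        if (m->tagged[i] && m->addr[i] == a)
            m->tagged[i] = false;
    }
    return true;
}
```

With no such monitor in the memory system, nothing ever records the reservation, which corresponds to every store-exclusive in this model returning false.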
1

#12 User is offline   Sandeep Mukherjee 

  • Member
  • Pip
  • Group: Members
  • Posts: 9
  • Joined: 14-March 12

Posted 02 April 2012 - 08:18 PM

How is the ACTLR.SMP bit in Cortex-A15 different than that in Cortex-A9MPCore / Cortex-A5MPCore ? A15 TRM says that:

"In the Cortex-A15 processor, the L1 data cache and L2 cache are always coherent, for shared or non-shared data, regardless of the value of the SMP bit."

Does this mean that page-table shareability bit doesn't have any effect for data accesses in A15 (with regard to coherency maintenance inside A15 cluster) ?

If that is the case, then are all LDREX and STREX accesses checked against both the local and global monitors, irrespective of the page-table shareability bit and ACTLR.SMP bit settings?
0

#13 User is offline   isogen74 

  • Super Contributor
  • PipPipPipPip
  • Group: Members
  • Posts: 1097
  • Joined: 20-March 07

Posted 02 April 2012 - 09:51 PM

Quote

Does this mean that page-table shareability bit doesn't have any effect for data accesses in A15


Hmm there was some thread on this a few weeks back, and the shared bit did make a difference. I can't remember the details, but if you want data to be shared then set it as shared in the MMU. Anything else is coding to the microarchitecture of the core, not the "ARM Architecture"; when you move that kind of code to a different core it may well break in confusing and hard to debug ways, so you should always try and conform to the ARM ARM if possible.

This post has been edited by isogen74: 02 April 2012 - 09:52 PM

0

#14 User is offline   lsjk 

  • Member
  • Pip
  • Group: Members
  • Posts: 3
  • Joined: 06-April 12

Posted 19 April 2012 - 07:00 AM

Quote (ttfn @ 23 March 2012 - 04:12 PM)

If by ACP you mean Accelerator Coherency Port, no. They are unrelated.

Imagine you had two processors (say an A9 and R4) in one chip. They share some data, and you want to use a mutex to control which of the two processors can access the data at once. You need some hardware support for ensuring that a STREX from one processor can detect if the other got there first. That is the job of the Global Monitor. In my experience, the Global Monitor is usually part of the memory controller. That is if you have one - not all chips do.


Hi, do you happen to know if the ARM Versatile Express A9x4 board has a global monitor? If I do atomic operations (LDREX/STREX) on non-cached shared normal memory regions in SMP mode, it just gets stuck forever on those instructions. If I remove the shared attribute, then it succeeds. However, the memory controller manual (PL341) claims that it has 2 exclusive access monitors. I suppose this should be the global monitor here?
0

#15 User is offline   ttfn 

  • Super Contributor
  • PipPipPipPip
  • Group: Members
  • Posts: 576
  • Joined: 29-September 06

Posted 19 April 2012 - 07:24 AM

You're probably best asking that to ARM's support team.
1
