AMP & L2 Cache: L2 cache usage in AMP mode
#1
Posted 01 April 2011 - 05:48 AM
If yes, then what are the take care items?
Note: Currently using ARM-CA9 MPCore (x3)
#2
Posted 04 April 2011 - 04:11 PM
I think the Snoop Control Unit (SCU) is responsible for data coherency during conflicting accesses to L2.
Therefore I would advise you to look at the SCU chapter in the ARM Cortex-A9 TRM...
HTH.
Kind regards, Alban
Partnership Marketing Specialist, ARM.
#3
Posted 05 April 2011 - 12:45 PM
As to L2 accesses: the SCU only manages L1 data cache coherency (if it is turned on, and if the CPUs have the SMP bit set). It does provide some arbitration for accesses to the main memory system, but that's about it.
If you have an L2 shared between multiple CPUs (MPCore or not), you need to coordinate which CPU or CPUs try to configure it. You also need to ensure that if multiple CPUs use the same physical address range, each specifies the same attributes, i.e. that CPU0 doesn't treat it as non-cacheable while CPU1 treats it as cacheable. Otherwise odd things can happen.
Basically if you have such shared regions:
* Make sure the attributes match on each CPU
* Use the SCU to maintain L1 coherency
* Be very careful about how you do maintenance operations
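To illustrate the first bullet, here is a minimal sketch of a host-side check that two CPUs describe a shared 1 MB section with the same memory attributes. It assumes ARMv7-A short-descriptor section entries and that TEX remap is off (or that PRRR/NMRR are identical on both CPUs); the function names are hypothetical and not from any ARM library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of an ARMv7-A short-descriptor *section* entry. */
#define SECT_TYPE_MASK  0x00040003u   /* bits[1:0] plus bit 18 */
#define SECT_TYPE_VALUE 0x00000002u   /* 0b10 with bit 18 clear: 1 MB section */
#define SECT_B          (1u << 2)     /* bufferable */
#define SECT_C          (1u << 3)     /* cacheable */
#define SECT_TEX_MASK   (7u << 12)    /* TEX[2:0] */
#define SECT_S          (1u << 16)    /* shareable */
#define SECT_ATTR_MASK  (SECT_B | SECT_C | SECT_TEX_MASK | SECT_S)

/* True if the descriptor encodes a 1 MB section. */
bool is_section(uint32_t desc)
{
    return (desc & SECT_TYPE_MASK) == SECT_TYPE_VALUE;
}

/* Compare only the memory-type bits (TEX, C, B, S) of two section
 * descriptors. Access permissions and domain may legitimately differ
 * between CPUs, so they are ignored here. The caller is assumed to pass
 * the entries that both CPUs use for the same physical range. */
bool section_attrs_match(uint32_t desc_cpu0, uint32_t desc_cpu1)
{
    assert(is_section(desc_cpu0) && is_section(desc_cpu1));
    return (desc_cpu0 & SECT_ATTR_MASK) == (desc_cpu1 & SECT_ATTR_MASK);
}
```

A check like this could run at build time over the page tables of each image, so a mismatch is caught before the "odd things" show up on hardware.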
#4
Posted 07 April 2011 - 07:27 AM
> What do you mean by AMP mode?
Different software packages running independently on each CPU. For example, one CPU runs an OS-based application while the other two CPUs run non-OS-based applications.
These applications on separate CPUs may sometimes read and write data structures in shared memory, and may also access FIFOs in shared memory.
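As an illustration of the shared-memory FIFO case, here is a minimal single-producer, single-consumer ring buffer using C11 atomics. It is only a sketch: the structure, names, and slot count are hypothetical, and it assumes the FIFO lives in a region that both CPUs map with identical attributes and that is either hardware-coherent or non-cacheable. Without that, explicit cache maintenance would be needed around each access.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_SLOTS 8u   /* hypothetical size; must be a power of two */

struct amp_fifo {
    _Atomic uint32_t head;        /* written only by the producer CPU */
    _Atomic uint32_t tail;        /* written only by the consumer CPU */
    uint32_t slot[FIFO_SLOTS];
};

/* Producer side: returns false if the FIFO is full. */
bool amp_fifo_push(struct amp_fifo *f, uint32_t value)
{
    assert(f != NULL);
    uint32_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);
    if (head - tail == FIFO_SLOTS)
        return false;
    f->slot[head % FIFO_SLOTS] = value;
    /* Release: the slot write becomes visible before the new head. */
    atomic_store_explicit(&f->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false if the FIFO is empty. */
bool amp_fifo_pop(struct amp_fifo *f, uint32_t *out)
{
    assert(f != NULL && out != NULL);
    uint32_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&f->head, memory_order_acquire);
    if (head == tail)
        return false;
    *out = f->slot[tail % FIFO_SLOTS];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&f->tail, tail + 1u, memory_order_release);
    return true;
}
```

Because each index has exactly one writer, no exclusive-access (LDREX/STREX) sequence is needed; only the ordering between the data write and the index update matters.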
My references are:
1. DDI0406B_arm_architecture_reference_manual.pdf
2. DDI0407F_cortex_a9_r2p2_mpcore_trm.pdf
3. http://infocenter.ar...f/CIHCHFCG.html
> You also need to ensure that if multiple CPUs use the same physical address range that each specifies the same attributes.
> I.e. that CPU0 doesn't treat it as non-cacheable and CPU1 as cacheable. Otherwise odd things can happen.
>
> Basically if you have such shared regions:
> * Make sure the attributes match on each CPU
> * Use the SCU to maintain L1 coherency
> * Be very careful about how you do maintenance operations
I understand these precautions and will definitely take note of them in the system memory allocation design.
However, I want to confirm one point regarding the ACTLR.SMP bit. By default it is '0', which means that the CA9 does not take part in coherency maintenance. Regardless of which software architecture is used (SMP or AMP), in an MPCore environment this bit should always be set so that the CA9 takes part in coherency maintenance. I interpreted it this way because most MPCore systems pass information from one CPU to another, which creates a shared memory requirement. In MPCore systems, when L1 is used by the CPUs, it is mandatory to make use of L2 (please correct me if I am wrong regarding the underlined part) with the SCU enabled and with coherency maintenance, which looks like the safer system design.
My inference at present with your support is: L2 can be enabled in AMP mode.
#5
Posted 07 April 2011 - 08:05 AM
I am not sure that I would agree with this statement. With an MPCore system, it is likely that for most systems it makes sense to set ACTLR.SMP (and enable the SCU), because coherency management is useful. However, enabling coherency management does have a cost (energy). If you had an AMP setup which very rarely accesses shared data, it's possible that the energy cost is higher than the benefit. This kind of setup seems unlikely (given you chose an MPCore in the first place). So I'd summarize it as: in most cases you want coherency management, rather than always.
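For reference, a minimal sketch of setting ACTLR.SMP on a Cortex-A9. The bit positions (SMP is bit 6, FW is bit 0) are from the Cortex-A9 TRM; the helper names are hypothetical, and whether to also set FW (broadcast of maintenance operations) in an AMP system is a design assumption left to the caller. The value computation is kept separate from the CP15 access so it can be checked on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACTLR_FW   (1u << 0)   /* broadcast cache/TLB maintenance operations */
#define ACTLR_SMP  (1u << 6)   /* take part in SCU-managed coherency */

/* Compute the new ACTLR value: always set SMP, and set or clear FW
 * according to the caller's choice. All other bits are preserved. */
uint32_t actlr_join_coherency(uint32_t actlr, bool broadcast_maintenance)
{
    actlr |= ACTLR_SMP;
    if (broadcast_maintenance)
        actlr |= ACTLR_FW;
    else
        actlr &= ~ACTLR_FW;
    assert((actlr & ACTLR_SMP) != 0u);
    return actlr;
}

#if defined(__ARM_ARCH_7A__) && defined(__GNUC__)
/* CP15 accessors for the Auxiliary Control Register (target only). */
static inline uint32_t read_actlr(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 1" : "=r"(v));
    return v;
}

static inline void write_actlr(uint32_t v)
{
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 1\n\tisb" : : "r"(v) : "memory");
}
#endif
```

On the target, each CPU that should take part in coherency would run something like `write_actlr(actlr_join_coherency(read_actlr(), false));` in a privileged mode, with the SCU itself enabled separately.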
" In MPCore systems, when L1 is used by CPUs, it is mandatory to make use of L2..."
No, it is not mandatory. But if you have an L2 cache, it would seem a very strange decision not to use it!
"My inference at present with your support is: L2 can be enabled in AMP mode."
Correct. But you will have to ensure that you are consistent in your use of attributes across the two processors, and that you are careful over cache maintenance ops.
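As one example of a maintenance operation that needs care, here is a sketch of cleaning a physical address range from an L2C-310 by line. The register offsets (Clean Line by PA at 0x7B0, Cache Sync at 0x730) and the 32-byte line size are from the L2C-310 TRM; the function name and the idea of passing the register base as a pointer are assumptions made so the loop can be exercised against a plain array on a host. On real hardware you would also clean L1 first, wait for the sync to complete, and, in an AMP system, serialize access to the controller between CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L2C_LINE_SIZE      32u
#define L2C_CACHE_SYNC     0x730u   /* Cache Sync register offset */
#define L2C_CLEAN_LINE_PA  0x7B0u   /* Clean Line by PA register offset */

/* Write one 32-bit controller register at byte offset 'off'. */
static void l2c_write(volatile uint32_t *base, uint32_t off, uint32_t val)
{
    base[off / 4u] = val;
}

/* Clean every L2 line that overlaps [pa, pa + len). Returns the number
 * of line operations issued. 'base' points at the L2C-310 register block
 * (or at a stand-in array when testing on a host). */
uint32_t l2c_clean_range(volatile uint32_t *base, uint32_t pa, uint32_t len)
{
    assert(base != NULL);
    if (len == 0u)
        return 0u;

    uint32_t line = pa & ~(L2C_LINE_SIZE - 1u);
    uint32_t last = (pa + len - 1u) & ~(L2C_LINE_SIZE - 1u);
    uint32_t count = 0u;

    for (;;) {
        l2c_write(base, L2C_CLEAN_LINE_PA, line);
        count++;
        if (line == last)
            break;                    /* compare before adding: no wraparound */
        line += L2C_LINE_SIZE;
    }

    l2c_write(base, L2C_CACHE_SYNC, 0u);   /* drain the controller's buffers */
    return count;
}
```

The range is aligned outward to whole lines, so a buffer that shares a line with unrelated data will have that data cleaned as well. That is harmless for a clean, but the same rounding on an invalidate can discard another CPU's dirty data, which is one reason shared buffers are usually aligned and padded to the line size.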
#7
Posted 10 September 2012 - 04:00 AM
ttfn, on 05 April 2011 - 12:45 PM, said:
As to L2 accesses: the SCU only manages L1 data cache coherency (if it is turned on, and if the CPUs have the SMP bit set). It does provide some arbitration for accesses to the main memory system, but that's about it.
If you have an L2 shared between multiple CPUs (MPCore or not), you need to coordinate which CPU or CPUs try to configure it. You also need to ensure that if multiple CPUs use the same physical address range, each specifies the same attributes, i.e. that CPU0 doesn't treat it as non-cacheable while CPU1 treats it as cacheable. Otherwise odd things can happen.
Basically if you have such shared regions:
* Make sure the attributes match on each CPU
* Use the SCU to maintain L1 coherency
* Be very careful about how you do maintenance operations
On a related topic, I have been trying to run firmware on an MPCore (x1) Cortex-A9 with the L2 cache enabled. Everything runs fine if I enable only the L1 caches (D and I), but if I follow the sequence to enable the L2 cache, the firmware starts failing randomly. Before it fails, the event counters show thousands of L2 instruction and data hits. With the L2 cache enabled from the boot code, I can branch to functions and execute them successfully, but, as I said, the firmware then fails at random places. I tried all of this to gain confidence that the L2 cache really is enabled and working from the initial setup.
From single-stepping, it looks to me as if execution has not yet reached the stage where I should have to worry about maintaining coherency between L1, L2 and main memory. The speculative read feature is enabled in the L2 cache controller configuration, so I am also taking care of this on the Cortex-A9 side before enabling the SCU and the L2 cache controller. My understanding is that if the MPCore is working with only one core, the SMP bit isn't relevant and can be left as 0. Is this understanding correct?
I have also reviewed the MMU page table settings, where the same rules apply to the inner and outer cache attributes, i.e. TEX[2] is 0. With these page table settings everything works fine as long as only L1 (both I and D) is enabled. Another point I would like to mention is that the L1 I-cache is VIPT whereas the L1 D-cache and the (unified) L2 are PIPT. I understand the consequences of a VIPT cache, which can be resolved through page coloring.
From what I understand of the L1 and L2 cache specifications, this configuration is quite common on the Cortex-A9. Is there any specific course of action required when dealing with the different levels of cache?
I have spent a good amount of time looking at a simulation of the L2 cache with the MPCore Cortex-A9 (x1), and to the extent that I can run it, the signals look fine. However, the firmware I run in simulation is quite different from the real firmware (I can't do everything in simulation that I do in the real firmware). The intention of trying the L2 cache in simulation was to confirm its configuration.
I have been going through the page tables and the MPCore specification along with the L2C-310 cache controller specification. I am not sure what exactly is missing from my setup of the L2 cache with the MPCore Cortex-A9 (x1). Any pointers? I appreciate all input, and thanks in advance.
Vaibhav
















