Commit Graph

31 Commits

Author SHA1 Message Date
Babu Moger 51c5ecfd85 x86/resctrl: Introduce interface to list monitor states of all the groups
ANBZ: #9790

cherry-picked from https://lore.kernel.org/all/cover.1722981659.git.babu.moger@amd.com/

Provide the interface to list the monitor states of all the resctrl
groups in ABMC mode.

Example:
$cat /sys/fs/resctrl/info/L3_MON/mbm_control

The list uses the following format:

"<CTRL_MON group>/<MON group>/<domain_id>=<flags>"

Format for specific type of groups:

- Default CTRL_MON group:
  "//<domain_id>=<flags>"

- Non-default CTRL_MON group:
  "<CTRL_MON group>//<domain_id>=<flags>"

- Child MON group of default CTRL_MON group:
  "/<MON group>/<domain_id>=<flags>"

- Child MON group of non-default CTRL_MON group:
  "<CTRL_MON group>/<MON group>/<domain_id>=<flags>"

Flags can be one of the following:
t  MBM total event is enabled
l  MBM local event is enabled
tl Both total and local MBM events are enabled
_  None of the MBM events are enabled
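
As an illustration of the format above, here is a minimal user-space sketch that splits one entry into its parts. It is not part of the patch; the names `MbmControlEntry`, `FLAG_MEANINGS` and `parse_mbm_control_entry` are hypothetical, and it assumes each entry carries exactly one `<domain_id>=<flags>` pair as shown in the commit message.

```python
from typing import NamedTuple

# Flag strings as listed in the commit message, mapped to
# (total event enabled, local event enabled).
FLAG_MEANINGS = {
    "t": (True, False),
    "l": (False, True),
    "tl": (True, True),
    "_": (False, False),
}


class MbmControlEntry(NamedTuple):
    ctrl_mon_group: str  # "" means the default CTRL_MON group
    mon_group: str       # "" means the entry describes the CTRL_MON group itself
    domain_id: int
    total_enabled: bool
    local_enabled: bool


def parse_mbm_control_entry(entry: str) -> MbmControlEntry:
    """Parse one "<CTRL_MON group>/<MON group>/<domain_id>=<flags>" entry."""
    # Exactly two "/" separators are expected; anything else raises ValueError.
    ctrl_mon_group, mon_group, assignment = entry.strip().split("/")
    domain_id, flags = assignment.split("=")
    total_enabled, local_enabled = FLAG_MEANINGS[flags]
    return MbmControlEntry(
        ctrl_mon_group, mon_group, int(domain_id), total_enabled, local_enabled
    )
```

An entry such as "//0=tl" then describes the default CTRL_MON group on domain 0 with both events enabled.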

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3731
2024-08-20 07:58:05 +00:00
Babu Moger 9bd6f677fe x86/resctrl: Enable AMD ABMC feature by default when supported
ANBZ: #9790

cherry-picked from https://lore.kernel.org/all/cover.1722981659.git.babu.moger@amd.com/

Enable ABMC by default at boot when the hardware supports it.

Users will not see any difference in behavior when resctrl is
mounted. With automatic assignment, everything works as it does
in the legacy monitor mode.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3731
2024-08-20 07:58:05 +00:00
Babu Moger 0461e39b43 x86/resctrl: Add the interface to assign a hardware counter
ANBZ: #9790

cherry-picked from https://lore.kernel.org/all/cover.1722981659.git.babu.moger@amd.com/

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID and monitor the bandwidth as long as it is assigned.
The assigned RMID will be tracked by the hardware until the user unassigns
it manually.

Counters are configured by writing to L3_QOS_ABMC_CFG MSR and
specifying the counter id, bandwidth source, and bandwidth types.

Provide the interface to assign the counter ids to RMID.

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
    Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
    Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3731
2024-08-20 07:58:05 +00:00
Babu Moger e44c49fa51 x86/resctrl: Remove MSR reading of event configuration value
ANBZ: #9790

cherry-picked from https://lore.kernel.org/all/cover.1722981659.git.babu.moger@amd.com/

The event configuration is domain specific and initialized during domain
initialization. The value is stored in rdt_hw_mon_domain.

It is not required to read the configuration register every time the user
asks for it. Use the value stored in rdt_hw_mon_domain instead.

Introduce resctrl_arch_event_config_get() and
resctrl_arch_event_config_set() to get/set architecture domain specific
mbm_total_cfg/mbm_local_cfg values. Also, remove unused config value
definitions.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3731
2024-08-20 07:58:05 +00:00
Babu Moger 1616e89531 x86/resctrl: Add support to enable/disable AMD ABMC feature
ANBZ: #9790

cherry-picked from https://lore.kernel.org/all/cover.1722981659.git.babu.moger@amd.com/

Add the functionality to enable/disable AMD ABMC feature.

The AMD ABMC feature is enabled by setting the enable bit (bit 0) in MSR
L3_QOS_EXT_CFG.  When the state of ABMC is changed, the MSR needs
to be updated on all the logical processors in the QOS Domain.

Hardware counters are reset when the ABMC state is changed. Reset the
architectural state so that the next read of a hardware counter is not
treated as an overflow.
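
To see why the saved state has to be cleared, consider this toy model of overflow handling. It is only a sketch and not the kernel code: the 44-bit `COUNTER_WIDTH` and the `ArchMbmState` class are assumptions made for illustration.

```python
COUNTER_WIDTH = 44  # hypothetical counter width in bits, for illustration only


class ArchMbmState:
    """Toy model of the per-counter state kept between reads."""

    def __init__(self):
        self.prev_raw = 0      # last raw value read from the hardware counter
        self.total_bytes = 0   # accumulated value reported to the user

    def update(self, raw):
        # A raw value smaller than prev_raw is interpreted as a wrap-around,
        # so the delta is taken modulo the counter range.
        delta = (raw - self.prev_raw) % (1 << COUNTER_WIDTH)
        self.prev_raw = raw
        self.total_bytes += delta
        return self.total_bytes

    def reset(self):
        # Called when the hardware counters were reset, e.g. on a mode change.
        self.prev_raw = 0
        self.total_bytes = 0


stale = ArchMbmState()
stale.update(1000)
bogus = stale.update(10)   # hardware restarted from 0, state was not reset

fresh = ArchMbmState()
fresh.update(1000)
fresh.reset()              # state reset together with the hardware
sane = fresh.update(10)
```

Without the reset, the small raw value after the mode change is mistaken for a wrap and `bogus` becomes enormous; with the reset, `sane` is simply 10.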

The ABMC feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3731
2024-08-20 07:58:05 +00:00
Linus Torvalds 6a49fd755f x86/resctl: fix scheduler confusion with 'current'
ANBZ: #9739

commit 7fef099702 upstream.

The implementation of 'current' on x86 is very intentionally special: it
is a very common thing to look up, and it uses 'this_cpu_read_stable()'
to get the current thread pointer efficiently from per-cpu storage.

And the keyword in there is 'stable': the current thread pointer never
changes as far as a single thread is concerned.  Even when a thread
is preempted, or moved to another CPU, or even across an explicit call
to 'schedule()', that thread will still have the same value for 'current'.

It is, after all, the kernel base pointer to thread-local storage.
That's why it's stable to begin with, but it's also why it's important
enough that we have that special 'this_cpu_read_stable()' access for it.

So this is all done very intentionally to allow the compiler to treat
'current' as a value that never visibly changes, so that the compiler
can do CSE and combine multiple different 'current' accesses into one.

However, there is obviously one very special situation when the
currently running thread does actually change: inside the scheduler
itself.

So the scheduler code paths are special, and do not have a 'current'
thread at all.  Instead there are _two_ threads: the previous and the
next thread - typically called 'prev' and 'next' (or prev_p/next_p)
internally.

So this is all actually quite straightforward and simple, and not all
that complicated.

Except for when you then have special code that is run in scheduler
context, that code then has to be aware that 'current' isn't really a
valid thing.  Did you mean 'prev'? Did you mean 'next'?

In fact, even if you then look at the code, and you use 'current' after the
new value has been assigned to the percpu variable, we have explicitly
told the compiler that 'current' is magical and always stable.  So the
compiler is quite free to use an older (or newer) value of 'current',
and the actual assignment to the percpu storage is not relevant even if
it might look that way.

Which is exactly what happened in the resctl code, that blithely used
'current' in '__resctrl_sched_in()' when it really wanted the new
process state (as implied by the name: we're scheduling 'into' that new
resctl state).  And clang would end up just using the old thread pointer
value at least in some configurations.

This could have happened with gcc too, and purely depends on random
compiler details.  Clang just seems to have been more aggressive about
moving the read of the per-cpu current_task pointer around.

The fix is trivial: just make the resctl code adhere to the scheduler
rules of using the prev/next thread pointer explicitly, instead of using
'current' in a situation where it just wasn't valid.

That same code is then also used outside of the scheduler context (when
a thread resctl state is explicitly changed), and then we will just pass
in 'current' as that pointer, of course.  There is no ambiguity in that
case.

The fix may be trivial, but noticing and figuring out what went wrong
was not.  The credit for that goes to Stephane Eranian.

  [kun: make resctrl_sched_in change extend to arm mpam.]

Reported-by: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/
Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Stephane Eranian <eranian@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3694
2024-08-16 08:45:19 +00:00
Babu Moger e2760fecdc x86/resctrl: Add interface to write mbm_total_bytes_config
ANBZ: #8969

commit 92bd5a1390 upstream.

The event configuration for mbm_total_bytes can be changed by the user by
writing to the file /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config.

The event configuration settings are domain specific and affect all the
CPUs in the domain.

Following are the types of events supported:

  ====  ===========================================================
  Bits   Description
  ====  ===========================================================
  6      Dirty Victims from the QOS domain to all types of memory
  5      Reads to slow memory in the non-local NUMA domain
  4      Reads to slow memory in the local NUMA domain
  3      Non-temporal writes to non-local NUMA domain
  2      Non-temporal writes to local NUMA domain
  1      Reads to memory in the non-local NUMA domain
  0      Reads to memory in the local NUMA domain
  ====  ===========================================================

For example:

To change the mbm_total_bytes to count only reads on domain 0, the bits
0, 1, 4 and 5 need to be set, which is 110011b (in hex 0x33).
Run the command:

  $echo  0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config

To change the mbm_total_bytes to count all the slow memory reads on domain 1,
the bits 4 and 5 need to be set, which is 110000b (in hex 0x30).
Run the command:

  $echo  1=0x30 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
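
The bit arithmetic in the two examples above can be checked with a short sketch. The `MbmEvent` member names and the `config_line` helper are hypothetical; only the bit positions come from the table in this commit message.

```python
from enum import IntFlag


class MbmEvent(IntFlag):
    """Event bits, numbered as in the table above (names are illustrative)."""
    LOCAL_READS = 1 << 0
    REMOTE_READS = 1 << 1
    LOCAL_NT_WRITES = 1 << 2
    REMOTE_NT_WRITES = 1 << 3
    LOCAL_SLOW_READS = 1 << 4
    REMOTE_SLOW_READS = 1 << 5
    DIRTY_VICTIMS = 1 << 6


def config_line(domain_id, events):
    """Build the "<domain_id>=<hex mask>" string written to the config file."""
    return f"{domain_id}={int(events):#x}"


ALL_READS = (MbmEvent.LOCAL_READS | MbmEvent.REMOTE_READS
             | MbmEvent.LOCAL_SLOW_READS | MbmEvent.REMOTE_SLOW_READS)
SLOW_READS = MbmEvent.LOCAL_SLOW_READS | MbmEvent.REMOTE_SLOW_READS

reads_on_domain0 = config_line(0, ALL_READS)        # "0=0x33"
slow_reads_on_domain1 = config_line(1, SLOW_READS)  # "1=0x30"
```

The resulting strings are the ones passed to echo in the commands above.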

  [ kun: modified due to previous MPAM changes. ]

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20230113152039.770054-12-babu.moger@amd.com
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3140
2024-05-15 02:05:14 +00:00
Babu Moger b2036cd04a x86/resctrl: Add interface to read mbm_total_bytes_config
ANBZ: #8969

commit dc2a3e8579 upstream.

The event configuration can be viewed by the user by reading the
configuration file /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config.  The
event configuration settings are domain specific and will affect all the CPUs in
the domain.

Following are the types of events supported:

  ====  ===========================================================
  Bits   Description
  ====  ===========================================================
  6      Dirty Victims from the QOS domain to all types of memory
  5      Reads to slow memory in the non-local NUMA domain
  4      Reads to slow memory in the local NUMA domain
  3      Non-temporal writes to non-local NUMA domain
  2      Non-temporal writes to local NUMA domain
  1      Reads to memory in the non-local NUMA domain
  0      Reads to memory in the local NUMA domain
  ====  ===========================================================

By default, the mbm_total_bytes_config is set to 0x7f to count all the
event types.

For example:

  $cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
  0=0x7f;1=0x7f;2=0x7f;3=0x7f

In this case, the event mbm_total_bytes is configured with 0x7f on
domains 0 to 3.
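
A minimal sketch of how a user-space tool might decode that output, assuming the semicolon-separated "<domain_id>=<hex value>" layout shown above; `parse_event_config` is a hypothetical helper, not part of the patch.

```python
def parse_event_config(text):
    """Turn "0=0x7f;1=0x7f" into {0: 0x7f, 1: 0x7f}."""
    config = {}
    for item in text.strip().split(";"):
        domain_id, value = item.split("=")
        config[int(domain_id)] = int(value, 16)  # base 16 accepts the 0x prefix
    return config


domains = parse_event_config("0=0x7f;1=0x7f;2=0x7f;3=0x7f\n")
```

Here `domains` maps each of the four domain ids to 0x7f, that is, all seven event bits set.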

  [ kun: modified due to previous MPAM changes. ]

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20230113152039.770054-10-babu.moger@amd.com
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3140
2024-05-15 02:05:14 +00:00
Babu Moger 6528b0214b x86/resctrl: Support monitor configuration
ANBZ: #8969

commit d507f83ced upstream.

Add a new field in struct mon_evt to support Bandwidth Monitoring Event
Configuration (BMEC) and also update the "mon_features" display.

The resctrl file "mon_features" will display the supported events
and files that can be used to configure those events if monitor
configuration is supported.

Before the change:

  $ cat /sys/fs/resctrl/info/L3_MON/mon_features
  llc_occupancy
  mbm_total_bytes
  mbm_local_bytes

After the change when BMEC is supported:

  $ cat /sys/fs/resctrl/info/L3_MON/mon_features
  llc_occupancy
  mbm_total_bytes
  mbm_total_bytes_config
  mbm_local_bytes
  mbm_local_bytes_config
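
The listing logic can be modelled as follows. This is a sketch of the behavior described above, not the kernel implementation; `mon_features` and the `configurable` flag stand in for the new field in struct mon_evt and are hypothetical names.

```python
def mon_features(events, bmec_supported):
    """Return the lines shown in mon_features.

    events is a list of (name, configurable) pairs in display order.
    """
    lines = []
    for name, configurable in events:
        lines.append(name)
        # A "<name>_config" file is listed only for configurable events,
        # and only when monitor configuration is supported.
        if bmec_supported and configurable:
            lines.append(f"{name}_config")
    return lines


EVENTS = [
    ("llc_occupancy", False),
    ("mbm_total_bytes", True),
    ("mbm_local_bytes", True),
]

before = mon_features(EVENTS, bmec_supported=False)
after = mon_features(EVENTS, bmec_supported=True)
```

`before` and `after` reproduce the two listings shown above.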

  [kun: modified due to previous MPAM changes. ]

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20230113152039.770054-9-babu.moger@amd.com
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/3140
2024-05-15 02:05:14 +00:00
Shawn Wang dcfedc900d anolis: fs/resctrl: Add a new resctrl monitoring event to get MB in Bps
ANBZ: #8044

Some platforms like Yitian710 can get the memory bandwidth of a specific
PARTID in Bps directly, while the current resctrl file system only supports
mbm_{local,total}_bytes as counters in bytes. Add a new resctrl monitoring
event mbm_Bps to support this feature.

To avoid introducing a new interface, the file keeps the name
"mbm_local_bytes" as before instead of using "mbm_Bps".

Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/2661
2024-03-27 06:40:37 +00:00
Shawn Wang 202d8a95aa anolis: arm_mpam: Identify different types of machines for MPAM implementation specific features
ANBZ: #8044

ARM MPAM allows different machines to have different MPAM
implementation-specific features. To avoid affecting MPAM standard
features and distinguish different machines, introduce a new variable
mpam_current_machine as a machine identifier, which is based on the
information from MPAM ACPI table or the device tree.

Currently only Yitian710 is supported. Machines without specific features are not
affected.

Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/2661
2024-03-27 06:40:37 +00:00
Guanjun eaa885aa00 anolis: x86/resctrl: Add mount option for memory bandwidth HWDRC
ANBZ: #7999

Add a mount option to resctrl filesystem so that the admin that enables
memory bandwidth HWDRC can make sure that resctrl doesn't provide any
hooks to control MBA.

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Guanjun <guanjun@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Acked-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/2643
2024-02-20 03:17:22 +00:00
Guanjun 1c839d157e anolis: x86/resctrl: Workaround to detect if memory bandwidth HWDRC is capable
ANBZ: #7999

The CPUID bit for the memory bandwidth HWDRC feature is not exposed by ICX
hardware, but the kernel needs an interface to detect whether the platform is
capable of this feature.

Add a workaround to check the capability via the HWDRC OS mailbox. The mount
option "hwdrc_mb" of resctrl fs only takes effect when the platform is capable
of memory bandwidth HWDRC.

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Guanjun <guanjun@linux.alibaba.com>
Reviewed-by: Artie Ding <artie.ding@linux.alibaba.com>
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Acked-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/2643
2024-02-20 03:17:22 +00:00
James Morse 0ede43de3c arm_mpam: Allow MBWU counters to be used, even when not free running
ANBZ: #1697

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

Resctrl exposes the cache occupancy and bandwidth counters to user space
via files. Each control and monitor group has a copy of these files, and
user space can read a counter by reading the corresponding file.

MPAM needs to allocate a hardware monitor to do the counting work, and there
may not be enough for every control and monitor group to have one. Currently
MPAM doesn't expose the MBM/MBWU counters to resctrl unless there are enough
monitors.

To allow perf to read these counters via a PMU driver, add support for
allocating a monitor. To prevent these appearing in the resctrl filesystem,
report false from resctrl_arch_event_is_free_running() for the MBM event
types.

Backport Notes:
Remove the old `resctrl_arch_event_is_free_running()` definition.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/561
2022-08-02 16:43:21 +08:00
James Morse 4d05198351 arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
ANBZ: #1697

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

Enough MPAM support is present to enable ARCH_HAS_CPU_RESCTRL.
Let it rip^Wlink!

Remove the temporary resctrl_mon_ctx_waiters that was previously
used to hide a link error.

Backport Notes:
Add a default definition for function `resctrl_arch_event_is_free_running()` in
include/linux/arm_mpam.h to fix the compilation error on ARM64 architecture.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/561
2022-08-02 16:43:15 +08:00
James Morse 6c2ca482de arm_mpam: resctrl: Add empty definitions for fine-grained enables
ANBZ: #1697

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

resctrl has individual hooks to separately enable and disable the
closid/partid and rmid/pmg context switching code.

For MPAM this is all the same thing, as the value in struct task_struct
is used to cache the value that should be written to hardware.
arm64's context switching code is enabled once MPAM is usable, but
doesn't touch the hardware unless the value has changed.

Resctrl doesn't need to ask. Add empty definitions for these hooks.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/561
2022-08-02 16:43:14 +08:00
James Morse 80a9425434 arm_mpam: resctrl: Add empty definitions for pseudo lock
ANBZ: #1697

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

Pseudo lock isn't supported on arm64. Add empty definitions of the
functions arm64 doesn't implement. Because the Kconfig option is not
selected, none of these will be called.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/561
2022-08-02 16:43:12 +08:00
James Morse 226351bed6 arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid()
ANBZ: #1697

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

resctrl uses resctrl_arch_rmid_read() to read counters. CDP emulation
means the counter may need reading twice to get both the I and D side
allocations. The same goes for reset.
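
The double read can be sketched as below. This is not the driver code: `read_counter` and `read_hw_monitor` are hypothetical names, and the mapping of a closid to the partid pair (2 * closid, 2 * closid + 1) under CDP emulation is an assumption made for illustration.

```python
def read_counter(read_hw_monitor, closid, rmid, cdp_enabled):
    """Read a monitor value for (closid, rmid).

    read_hw_monitor(partid, pmg) stands in for the hardware access.
    """
    if not cdp_enabled:
        return read_hw_monitor(closid, rmid)
    # With CDP emulation one resctrl closid is backed by two partids
    # (assumed here to be 2 * closid and 2 * closid + 1), so both
    # sides are read and added together.
    data_partid = 2 * closid
    code_partid = 2 * closid + 1
    return read_hw_monitor(data_partid, rmid) + read_hw_monitor(code_partid, rmid)


# Fake hardware: occupancy keyed by (partid, pmg).
fake_hw = {(2, 0): 4096, (3, 0): 1024, (1, 0): 512}
combined = read_counter(lambda p, g: fake_hw[(p, g)], 1, 0, cdp_enabled=True)
single = read_counter(lambda p, g: fake_hw[(p, g)], 1, 0, cdp_enabled=False)
```

A reset under CDP emulation would visit the same two partids.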

Add the rounding helper for checking monitor values while we're here.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/561
2022-08-02 16:43:11 +08:00
James Morse c433c54665 arm_mpam: resctrl: Allow resctrl to allocate monitors for CSU
ANBZ: #1697

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

When resctrl wants to read a domain's 'QOS_L3_OCCUP', it needs
to allocate a monitor on the corresponding resource. Monitors are
allocated by class instead of component because any per-component
user needs to have pre-emption disabled to avoid being migrated to
another CPU.

Add helpers to do this.

This patch temporarily creates resctrl_mon_ctx_waiters as the
resctrl version can't be selected until it will link. This gets
removed in a later patch.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/561
2022-08-02 16:43:10 +08:00
James Morse 8d891b0c0e arm_mpam: resctrl: Add rmid index helpers
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

Because MPAM's pmg values aren't identical to RDT's rmid, resctrl handles
some data structures by index. This allows x86 to map indexes to
RMID, and MPAM to map them to partid-and-pmg.

Add the helpers to do this.
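
One way such an index mapping could work is to pack both values into a single integer. The sketch below assumes a fixed number of pmg bits (`PMG_BITS`, a made-up value) and uses hypothetical helper names; it illustrates the idea and does not reproduce the driver's helpers.

```python
PMG_BITS = 2  # hypothetical number of bits available for pmg


def rmid_idx_encode(partid, pmg):
    """Pack a (partid, pmg) pair into one index."""
    if pmg >= (1 << PMG_BITS):
        raise ValueError("pmg does not fit in PMG_BITS")
    return (partid << PMG_BITS) | pmg


def rmid_idx_decode(idx):
    """Recover the (partid, pmg) pair from an index."""
    return idx >> PMG_BITS, idx & ((1 << PMG_BITS) - 1)


idx = rmid_idx_encode(5, 3)
pair = rmid_idx_decode(idx)
```

On x86 the same interface could simply use the RMID itself as the index.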

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:52 +08:00
James Morse cc46c518c1 arm64: mpam: Add helpers to change a tasks and cpu mpam partid/pmg values
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

Care must be taken when modifying the partid and pmg of a task, as
writing these values may race with the task being scheduled in, and
reading the modified values.

Add helpers to set the task properties, and the cpu default value,
and add the plumbing to the mpam driver that lets resctrl use them.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:51 +08:00
James Morse aa956c1273 arm_mpam: resctrl: Add CDP emulation
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

Intel RDT's CDP feature allows the cache to use a different control value
depending on whether the access was an instruction fetch or a data
access. MPAM's equivalent feature is the other way up: the CPU assigns a
different partid label to traffic depending on whether it was instruction
fetch or a data access, which causes the cache to use a different control
value based solely on the partid.

MPAM can emulate CDP, with the side effect that the alternative partid is
seen by all caches, it can't be enabled per-cache.

Add the resctrl hooks to turn this on or off. Add the helpers that
match a closid against a task, which need to be aware that the value
written to hardware is not the same as the one resctrl is using.
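
The closid matching problem can be sketched as follows. The helper names are hypothetical, and the specific mapping (data partid = 2 * closid, instruction partid = 2 * closid + 1) is an assumption chosen for illustration, not a statement about the driver.

```python
def hw_partids(closid, cdp_enabled):
    """Return the (data, instruction) partids used for a resctrl closid."""
    if not cdp_enabled:
        return closid, closid
    return 2 * closid, 2 * closid + 1


def task_matches_closid(task_partid_d, task_partid_i, closid, cdp_enabled):
    """Compare the values cached for a task against a resctrl closid.

    The cached values are hardware partids, so the closid is translated
    first instead of being compared directly.
    """
    return (task_partid_d, task_partid_i) == hw_partids(closid, cdp_enabled)


cdp_pair = hw_partids(3, cdp_enabled=True)
```

With CDP emulation enabled, a task running in closid 3 carries partids 6 and 7, so a direct comparison against 3 would never match.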

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:50 +08:00
James Morse bf3cf98241 arm_mpam: resctrl: Implement resctrl_arch_reset_resources()
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

We already have a helper for resetting an mpam class. Hook it up to
resctrl_arch_reset_resources().

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:45 +08:00
James Morse 1daf975a81 arm_mpam: resctrl: Pick the caches we will use as resctrl resources
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

Systems with MPAM support may have a variety of control types at any
point of their system layout. We can only expose certain types of
control, and only if they exist at particular locations.

Start with the well-known caches. These have to be depth 2 or 3
and support MPAM's cache portion bitmap controls, with a number
of portions fewer than resctrl's limit.
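
The selection rule can be written as a simple predicate. This is a sketch with hypothetical names: `CacheClass`, `usable_as_resctrl_cache` and the value of `RESCTRL_PORTION_LIMIT` are made up, and treating the limit as inclusive is an assumption.

```python
from dataclasses import dataclass

RESCTRL_PORTION_LIMIT = 32  # hypothetical limit, for illustration only


@dataclass
class CacheClass:
    level: int                # cache depth, e.g. 2 for L2
    has_portion_bitmap: bool  # supports cache portion bitmap controls
    num_portions: int         # width of the portion bitmap


def usable_as_resctrl_cache(cache):
    """Apply the three conditions described above."""
    return (
        cache.level in (2, 3)
        and cache.has_portion_bitmap
        and cache.num_portions <= RESCTRL_PORTION_LIMIT
    )


candidates = [
    CacheClass(level=2, has_portion_bitmap=True, num_portions=16),
    CacheClass(level=3, has_portion_bitmap=False, num_portions=16),
    CacheClass(level=4, has_portion_bitmap=True, num_portions=16),
    CacheClass(level=3, has_portion_bitmap=True, num_portions=64),
]
picked = [c for c in candidates if usable_as_resctrl_cache(c)]
```

Only the first candidate passes all three checks.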

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:43 +08:00
James Morse d02a5712a4 arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

resctrl has its own data structures to describe its resources. We
can't use these directly as we play tricks with the 'MBA' resource,
picking the MPAM controls or monitors that best apply. We may export
the same component as both L3 and MBA.

Add mpam_resctrl_exports[] as the array of class->resctrl mappings we
are exporting, and add the cpuhp hooks that allocate and free the
resctrl domain structures.

While we're here, plumb in a few other obvious things.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:42 +08:00
James Morse 1089a9f6b3 arm_mpam: Probe MSCs to find the supported partid/pmg values
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

CPUs can generate traffic with a range of PARTID and PMG values,
but each MSC may have its own maximum size for these fields.
Before MPAM can be used, the driver needs to probe each RIS on
each MSC, to find the system-wide smallest value that can be used.
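
The reduction itself is a minimum over everything probed. In this sketch `system_wide_limits` is a hypothetical helper and each RIS is represented only by the pair of maximum values it reports.

```python
def system_wide_limits(ris_limits):
    """Return the (partid_max, pmg_max) usable across all probed RIS.

    ris_limits is a non-empty iterable of (partid_max, pmg_max) pairs,
    one per RIS on each MSC.
    """
    ris_limits = list(ris_limits)
    partid_max = min(partid for partid, _ in ris_limits)
    pmg_max = min(pmg for _, pmg in ris_limits)
    return partid_max, pmg_max


limits = system_wide_limits([(63, 3), (31, 1), (127, 3)])
```

A value is only safe to use if every MSC on the path can represent it, which is why the smallest maximum wins for each field independently.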

While doing this, RIS entries that firmware didn't describe are created
under MPAM_CLASS_UNKNOWN.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:27 +08:00
James Morse 3ee0e8fafd arm_mpam: Add the class and component structures for ris firmware described
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

An MSC is a container of resources, each identified by their RIS index.
Some RIS are described by firmware to provide their position in the system.
Others are discovered when the driver probes the hardware.

To configure a resource it needs to be found by its class, e.g. 'L2'.
There are two kinds of grouping, a class is a set of components, which
are visible as there are likely to be multiple instances of the L2 cache.
struct mpam_components are a set of struct mpam_msc_ris, which are not
visible as each L2 cache may be composed of individual slices which need
to be configured the same as the hardware is not able to distribute the
configuration.

Add support for creating and destroying these structures.
A gfp is passed as the structure for 'unknown' may need creating
if a new RIS entry is discovered when probing the MSC.

Backport Notes:
To avoid conflicts, modify `acpi_pptt_get_cpumask_from_cache_id()` to
`acpi_pptt_get_cpumask_from_cache_id_and_level()`.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:24 +08:00
James Morse 807314cc3d ACPI / MPAM: Parse the MPAM table
ANBZ: #1654

cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git

Add code to parse the arm64 specific MPAM table, looking up the cache
level from the PPTT and feeding the end result into the MPAM driver.

Backport Notes:
1. As the kernel needs to support two sets of MPAM code (Yitian & Kunpeng),
add an OEM ID check in function `acpi_mpam_parse()` to avoid parsing the wrong
MPAM table version.
2. Different types of MSC may have the same identifier number. To keep each
platform device unique, assign each platform device a value `msc_num` that is
incremented by one in function `_parse_table()`.
3. As different levels of cache may have the same cache id number,
find_acpi_cache_level_from_id() will have conflicts. Here we set the level of
cache resource to a default value, 3.
4. Change the parameter `ACPI_ACTIVE_LOW` passed to function
`acpi_register_gsi()` to `ACPI_ACTIVE_HIGH`.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-02 16:41:14 +08:00
Xin Hao 75cbe537d0 anolis: arm64:mpam: Move KunPeng mpam relative codes to staging dir
ANBZ: #331

In order to unify the support of MPAM (Memory system performance resource
Partitioning and Monitoring) for the same kernel source across different
arm architecture chips, move Kunpeng's MPAM-related code to the
'drivers/staging/kunpeng' directory temporarily.

Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-01 12:16:26 +00:00
James Morse 641de1cc04 openEuler: arm64/mpam: Enabling registering and logging error interrupts
to #34407882

commit 4b4c3e7a5a95e840109d3dae7afa74b4aafe2213 openEuler

hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

The MPAM MSC error interrupt tells us how we misconfigured the MSC.
We don't expect to do this. If the interrupt fires, print a
summary, and mark MPAM as broken. Eventually we will try to cleanly
tear down when we see this.

Now a helper, mpam_register_device_irq(), registers the overflow and
error interrupts of an mpam device. When devices come and go we want to
make sure the error irq is enabled. We disable the error irq when cpus
are taken offline in case the component remains online even when the
associated CPUs are offline.

The code in this patch is borrowed from James Morse <james.morse@arm.com>.

[Wang ShaoBo: few version adaptation changes]

Signed-off-by: James Morse <james.morse@arm.com>
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=6d1ceca3eb5953fc16a524c9aad933519aa3f64c
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=81d178c198165fd557431d6879135d2e03ea92c0
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-01 11:16:39 +00:00
Wang ShaoBo 7b57c1f9f9 openEuler: arm64/mpam: Migrate old MSCs' discovery process to new branch
to #34407882

commit 414f08a62123c8129f5897af201d92ff6bae6fef openEuler

hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

We used to use the mpam_node structure to initialize MSCs and directly
used the resctrl_resource structure to store the MSCs' probing information.
That was a good choice until we needed to support multiple MSC nodes per
domain. Now that the new framework mpam_device->mpam_component->mpam_class
has been constructed, first make the MPAM setup process compatible with
it.

At present, we only parse the base address to create the mpam devices and
do not handle interrupt registration, which will be dealt with later.

We will continue to update the discovery process from the MPAM ACPI table
according to the latest MPAM ACPI spec.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
2022-08-01 11:16:32 +00:00