  1. Jan 16, 2018
    • powerpc/drmem: Add support for ibm,dynamic-memory-v2 property · 2b31e3ae
      Nathan Fontenot authored
      
      
      The Power Hypervisor has introduced a new device tree format for
      the property describing the dynamic reconfiguration LMBs for a system,
      ibm,dynamic-memory-v2. This new format condenses the size of the
      property, especially on large memory systems, by reporting sets
      of LMBs that have the same properties (flags and associativity array
      index).
      
      This patch updates the powerpc/mm/drmem.c code to provide routines
      that can parse the new device tree format during the walk_drmem_lmb*
      routines used during boot, the creation of the LMB array, and updating
      the device tree to create a new property in the proper format for
      ibm,dynamic-memory-v2.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2b31e3ae
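
The condensed format described above can be pictured as run-length encoding of LMBs. The following is a minimal user-space sketch, not the kernel's drmem.c code: the struct names, field names, and the assumption that addresses and DRC indexes advance sequentially within a set are all illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical condensed entry: one set of LMBs sharing flags and aa_index. */
struct lmb_set {
	uint32_t num_lmbs;   /* number of sequential LMBs in this set */
	uint64_t base_addr;  /* address of the first LMB in the set */
	uint32_t drc_index;  /* identifier of the first LMB (assumed sequential) */
	uint32_t aa_index;   /* associativity array index shared by the set */
	uint32_t flags;      /* flags shared by the set */
};

/* Hypothetical expanded, per-LMB entry. */
struct lmb {
	uint64_t base_addr;
	uint32_t drc_index;
	uint32_t aa_index;
	uint32_t flags;
};

/*
 * Expand condensed sets into per-LMB entries.
 * Returns the number of entries written, never more than max.
 */
static size_t expand_lmb_sets(const struct lmb_set *sets, size_t nsets,
			      uint64_t lmb_size, struct lmb *out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < nsets; i++) {
		for (uint32_t j = 0; j < sets[i].num_lmbs && n < max; j++) {
			out[n].base_addr = sets[i].base_addr + (uint64_t)j * lmb_size;
			out[n].drc_index = sets[i].drc_index + j;
			out[n].aa_index = sets[i].aa_index;
			out[n].flags = sets[i].flags;
			n++;
		}
	}
	return n;
}
```

Two sets describing three and two LMBs expand to five per-LMB entries, which is why the condensed property stays small on large-memory systems where long runs of LMBs share the same properties.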
    • powerpc: Move of_drconf_cell struct to asm/drmem.h · 2c777215
      Nathan Fontenot authored
      
      
      Now that the powerpc code parses dynamic reconfiguration memory
      LMB information from the LMB array and not the device tree
      directly, we can move the of_drconf_cell struct to drmem.h, where
      it fits better.
      
      In addition, the struct is renamed to of_drconf_cell_v1 in
      anticipation of upcoming support for version 2 of the dynamic
      reconfiguration property and the members are typed as __be*
      values to reflect how they exist in the device tree.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2c777215
    • powerpc/pseries: Update memory hotplug code to use drmem LMB array · 6195a500
      Nathan Fontenot authored
      
      
      Update the pseries memory hotplug code to use the newly added
      dynamic reconfiguration LMB array. Doing this is required for the
      upcoming support of version 2 of the dynamic reconfiguration
      device tree property.
      
      In addition, making this change cleans up the code that parses the
      LMB information as we no longer need to worry about device tree
      format. This allows us to discard one of the first steps on memory
      hotplug where we make a working copy of the device tree property and
      convert the entire property to cpu format. Instead we just use the
      LMB array directly while holding the memory hotplug lock.
      
      This patch also moves the updating of the device tree property to
      powerpc/mm/drmem.c. This allows the hotplug code to work without
      needing to know the device tree format and provides a single
      routine for updating the device tree property. This new routine
      will handle determination of the proper device tree format and
      generate a properly formatted device tree property.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      6195a500
    • powerpc/numa: Update numa code to use walk_drmem_lmbs · 514a9cb3
      Nathan Fontenot authored
      
      
      Update code in powerpc/numa.c to use the walk_drmem_lmbs()
      routine instead of parsing the device tree directly. This is
      in anticipation of introducing a new ibm,dynamic-memory-v2
      property with a different format. This will allow the numa code
      to use a single initialization routine per-LMB regardless of
      the device tree format.
      
      Additionally, to support additional routines in numa.c that need
      to look up LMB information, a late_init routine is added to drmem.c
      to allocate the array of LMB information. This LMB array will provide
      per-LMB information to separate the LMB data from the device tree
      format.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      514a9cb3
    • powerpc/mm: Separate ibm,dynamic-memory data from DT format · 6c6ea537
      Nathan Fontenot authored
      
      
      We currently have code to parse the dynamic reconfiguration LMB
      information from the ibm,dynamic-memory device tree property in
      multiple locations: numa.c, prom.c, and pseries/hotplug-memory.c.
      In anticipation of adding support for a version 2 of the
      ibm,dynamic-memory property this patch aims to separate the device
      tree information from the device tree format.
      
      Doing this requires a two-step process to avoid a possibly very large
      bootmem allocation early in boot. During initial boot, new routines
      are provided to walk the device tree property and make a call-back
      for each LMB.
      
      The second step (introduced in later patches) will allocate an
      array of LMB information that can be used directly without needing
      to know the DT format.
      
      This approach provides the benefit of consolidating the device tree
      property parsing to a single location and (eventually) providing
      a common data structure for retrieving LMB information.
      
      This patch introduces a routine to walk the ibm,dynamic-memory
      property in the flattened device tree and updates the prom.c code
      to use this to initialize memory.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      6c6ea537
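
The first step described above, walking the property and calling back once per LMB without allocating an array, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the kernel routine: the property layout (a big-endian 32-bit count followed by 24-byte cells), the field order inside a cell, and every name here are assumed for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Read big-endian values from raw bytes, independent of host endianness. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t read_be64(const unsigned char *p)
{
	return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}

/* Hypothetical CPU-endian view of one LMB handed to the callback. */
struct lmb_info {
	uint64_t base_addr;
	uint32_t drc_index;
	uint32_t aa_index;
	uint32_t flags;
};

typedef void (*lmb_callback)(const struct lmb_info *lmb, void *data);

/* Assumed cell layout: be64 base, be32 drc_index, be32 reserved,
 * be32 aa_index, be32 flags. */
#define CELL_BYTES 24u

/* Decode one cell at a time on the stack and hand it to the callback. */
static void walk_lmb_property(const unsigned char *prop, size_t len,
			      lmb_callback cb, void *data)
{
	if (len < 4)
		return;

	uint32_t count = read_be32(prop);
	const unsigned char *p = prop + 4;
	size_t remaining = len - 4;

	for (uint32_t i = 0; i < count && remaining >= CELL_BYTES; i++) {
		struct lmb_info lmb;

		lmb.base_addr = read_be64(p);
		lmb.drc_index = read_be32(p + 8);
		/* bytes 12..15 are the reserved word and are skipped */
		lmb.aa_index = read_be32(p + 16);
		lmb.flags = read_be32(p + 20);
		cb(&lmb, data);

		p += CELL_BYTES;
		remaining -= CELL_BYTES;
	}
}

/* Example callback state and callback: count LMBs and collect their flags. */
struct walk_stats {
	unsigned int count;
	uint64_t last_base;
	uint32_t flags_or;
};

static void collect_stats(const struct lmb_info *lmb, void *data)
{
	struct walk_stats *st = data;

	st->count++;
	st->last_base = lmb->base_addr;
	st->flags_or |= lmb->flags;
}
```

Because each LMB is decoded into a stack variable and passed straight to the callback, callers never see the on-disk format and no per-LMB storage is needed this early in boot.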
    • powerpc/numa: Look up associativity array in of_drconf_to_nid_single · b88fc309
      Nathan Fontenot authored
      
      
      Look up the associativity arrays in of_drconf_to_nid_single when
      deriving the nid for a LMB instead of having it passed in as a
      parameter.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      b88fc309
    • powerpc/numa: Look up device node in of_get_usable_memory() · 22508f3d
      Nathan Fontenot authored
      
      
      Look up the device node for the usable memory property instead
      of having it passed in as a parameter. This change precedes an update
      in which the calling routines for of_get_usable_memory() will not have
      the device node pointer to pass in.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      22508f3d
    • powerpc/numa: Look up device node in of_get_assoc_arrays() · 35f80deb
      Nathan Fontenot authored
      
      
      Look up the device node for the associativity array property instead
      of having it passed in as a parameter. This change precedes an update
      in which the calling routines for of_get_assoc_arrays() will not have
      the device node pointer to pass in.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      35f80deb
  2. Jan 03, 2018
  3. Dec 22, 2017
  4. Dec 20, 2017
    • powerpc: capture the PTE format changes in the dump pte report · 7e436355
      Ram Pai authored
      
      
      H_PAGE_F_SECOND and H_PAGE_F_GIX are not in the 64K main-PTE.
      Capture these changes in the dump pte report.
      
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      7e436355
    • powerpc: use helper functions to get and set hash slots · a8548686
      Ram Pai authored
      
      
      Replace redundant code in __hash_page_4K() and flush_hash_page()
      with helper functions pte_get_hash_gslot() and pte_set_hidx().
      
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      a8548686
    • powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6 · 273b4936
      Ram Pai authored
      
      
      We need PTE bits 3, 4, 5, 6 and 57 to support protection keys,
      because these are the bits we want to consolidate on across all
      configurations.
      
      Bits 3, 4, 5 and 6 are currently used on 4K-PTE kernels, but bits 9
      and 10 are available. Hence we use the two available bits and
      free up bits 5 and 6. We still cannot free up bits 3 and 4. In the
      absence of any other free bits, we have to stay satisfied with
      what we have :-(. This means we will not be able to support 32
      protection keys, but only 8. The bit numbers are big-endian, as
      defined in ISA 3.0.
      
      This patch makes the following changes to the 4K PTE:
      
      H_PAGE_F_SECOND (S), which occupied bit 4, moves to bit 7.
      H_PAGE_F_GIX (G,I,X), which occupied bits 5, 6 and 7, moves
      to bits 8, 9 and 10 respectively.
      H_PAGE_HASHPTE (H), which occupied bit 8, moves to bit 4.
      
      Before the patch, the 4k PTE format was as follows
      
       0 1 2 3 4  5  6  7  8 9 10....................57.....63
       : : : : :  :  :  :  : : :                      :     :
       v v v v v  v  v  v  v v v                      v     v
      ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
      |x|x|x|B|S |G |I |X |H| | |x|x|................| |x|x|x|
      '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
      
      After the patch, the 4k PTE format is as follows
      
       0 1 2 3 4  5  6  7  8 9 10....................57.....63
       : : : : :  :  :  :  : : :                      :     :
       v v v v v  v  v  v  v v v                      v     v
      ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
      |x|x|x|B|H |  |  |S |G|I|X|x|x|................| |.|.|.|
      '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
      
      The patch has no code changes; just swizzles around bits.
      
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      273b4936
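
Since the bit numbers in the figures are big-endian (bit 0 is the most significant bit), translating them to masks is easy to get wrong. The sketch below encodes the post-patch 4K layout from the figure; the `EX_` macro names and helper functions are hypothetical and only mirror the positions stated in the commit message.

```c
#include <stdint.h>

/* Mask for a bit given in big-endian numbering, where bit 0 is the MSB. */
#define PTE_BE_BIT(n) (UINT64_C(1) << (63 - (n)))

/* Post-patch 4K PTE positions taken from the figure; names are illustrative. */
#define EX_PAGE_BUSY     PTE_BE_BIT(3)
#define EX_PAGE_HASHPTE  PTE_BE_BIT(4)
#define EX_PAGE_F_SECOND PTE_BE_BIT(7)
#define EX_PAGE_F_GIX    (PTE_BE_BIT(8) | PTE_BE_BIT(9) | PTE_BE_BIT(10))

/* S,G,I,X form one 4-bit field whose lowest bit is big-endian bit 10. */
#define EX_HIDX_SHIFT    (63 - 10)
#define EX_HIDX_MASK     (EX_PAGE_F_SECOND | EX_PAGE_F_GIX)

/* Extract the 4-bit slot value (S is its top bit, G,I,X the low three). */
static inline unsigned int ex_pte_get_hidx(uint64_t pte)
{
	return (unsigned int)((pte & EX_HIDX_MASK) >> EX_HIDX_SHIFT);
}

/* Return pte with the slot field replaced; other bits are left untouched. */
static inline uint64_t ex_pte_set_hidx(uint64_t pte, unsigned int hidx)
{
	return (pte & ~EX_HIDX_MASK) |
	       (((uint64_t)hidx << EX_HIDX_SHIFT) & EX_HIDX_MASK);
}
```

With this layout, big-endian bits 5 and 6 are never touched by the slot helpers, which is the point of the swizzle.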
    • powerpc: shifted-by-one hidx value · 7b84947c
      Ram Pai authored
      
      
      0xf is considered an invalid hidx value. It indicates the absence of a
      backing HPTE. A PTE is initialized to 0xf either
      a) when it is newly allocated to hold a 4k-backing-HPTE
      	or
      b) any time it gets demoted to a 4k-backing-HPTE.
      
      This patch shifts the representation by one, modulo 16; i.e. hidx 0 is
      represented as 1, 1 as 2, ..., and 0xf as 0. This convention lets us
      initialize the secondary part of the PTE to all zeroes. PTEs are zeroed
      when allocated anyway, so we do not have to zero them again, which saves
      on initialization.
      
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      7b84947c
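
The mapping described above (0 stored as 1, 1 as 2, and 0xf stored as 0) is an add-one on a 4-bit value. A minimal sketch follows, with hypothetical function names; it illustrates the arithmetic and is not a copy of the kernel macros.

```c
#include <stdint.h>

#define HIDX_MASK 0xfu

/* Convert a real hidx (0..0xf) to its stored form: add one, wrap at 16. */
static inline unsigned int hidx_to_stored(unsigned int hidx)
{
	return (hidx + 1u) & HIDX_MASK;
}

/* Convert a stored value back: adding 0xf within 4 bits subtracts one. */
static inline unsigned int stored_to_hidx(unsigned int stored)
{
	return (stored + 0xfu) & HIDX_MASK;
}
```

A freshly zeroed PTE therefore decodes to 0xf, the invalid hidx, without any explicit initialization.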
    • powerpc: Free up four 64K PTE bits in 64K backed HPTE pages · bf9a95f9
      Ram Pai authored
      
      
      Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
      in the 64K backed HPTE pages. This, along with the earlier
      patch, will entirely free up the four bits from the 64K PTE.
      The bit numbers are big-endian, as defined in ISA 3.0.
      
      This patch makes the following changes to the 64K PTE backed
      by a 64K HPTE.
      
      H_PAGE_F_SECOND (S), which occupied bit 4, moves to the
      	second part of the pte, to bit 60.
      H_PAGE_F_GIX (G,I,X), which occupied bits 5, 6 and 7, also
      	moves to the second part of the pte, to bits 61,
      	62 and 63 respectively.
      
      Since bit 7 is now freed up, we move H_PAGE_BUSY (B) from
      bit 9 to bit 7.
      
      The second part of the PTE will hold
      (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bits 60, 61, 62 and 63.
      NOTE: None of the bits in the secondary PTE were used
      by 64k-HPTE backed PTEs.
      
      Before the patch, the 64K HPTE backed 64k PTE format was
      as follows
      
       0 1 2 3 4  5  6  7  8 9 10...........................63
       : : : : :  :  :  :  : : :                            :
       v v v v v  v  v  v  v v v                            v
      
      ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
      |x|x|x| |S |G |I |X |x|B| |x|x|................|x|x|x|x| <- primary pte
      '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
      | | | | |  |  |  |  | | | | |..................| | | | | <- secondary pte
      '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
      
      After the patch, the 64k HPTE backed 64k PTE format is
      as follows
      
       0 1 2 3 4  5  6  7  8 9 10...........................63
       : : : : :  :  :  :  : : :                            :
       v v v v v  v  v  v  v v v                            v
      
      ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
      |x|x|x| |  |  |  |B |x| | |x|x|................|.|.|.|.| <- primary pte
      '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
      | | | | |  |  |  |  | | | | |..................|S|G|I|X| <- secondary pte
      '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
      
      The above PTE changes are applicable to hugetlb pages as well.
      
      The patch makes the following code changes:
      
      a) moves H_PAGE_F_SECOND and H_PAGE_F_GIX to the 4k PTE
      	header, since they are no longer needed by the 64k PTEs.
      b) abstracts out __real_pte() and __rpte_to_hidx() so the
      	caller need not know the bit location of the slot.
      c) moves the slot bits to the secondary pte.
      
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      bf9a95f9
    • powerpc: Free up four 64K PTE bits in 4K backed HPTE pages · 9d2edb18
      Ram Pai authored
      
      
      Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
      in the 4K backed HPTE pages. These bits continue to be used
      for 64K backed HPTE pages in this patch, but will be freed
      up in the next patch. The bit numbers are big-endian, as
      defined in ISA 3.0.
      
      The patch makes the following changes to the 4k HPTE backed
      64K PTE's format.
      
      H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure
      		below).
      V0 which occupied bit 4 is not used anymore.
      V1 which occupied bit 5 is not used anymore.
      V2 which occupied bit 6 is not used anymore.
      V3 which occupied bit 7 is not used anymore.
      
      Before the patch, the 4k backed 64k PTE format was as follows
      
       0 1 2 3 4  5  6  7  8 9 10...........................63
       : : : : :  :  :  :  : : :                            :
       v v v v v  v  v  v  v v v                            v
      
      ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
      |x|x|x|B|V0|V1|V2|V3|x| | |x|x|................|x|x|x|x| <- primary pte
      '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
      |S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
      '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
      
      After the patch, the 4k backed 64k PTE format is as follows
      
       0 1 2 3 4  5  6  7  8 9 10...........................63
       : : : : :  :  :  :  : : :                            :
       v v v v v  v  v  v  v v v                            v
      
      ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
      |x|x|x| |  |  |  |  |x|B| |x|x|................|.|.|.|.| <- primary pte
      '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
      |S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
      '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
      
      The four bits S,G,I,X (one quadruplet per 4k HPTE) that
      cache the hash-bucket slot value are initialized to
      1,1,1,1, indicating an invalid slot. If a HPTE gets
      cached in a 1111 slot (i.e. the 7th slot of the secondary
      hash bucket), it is released immediately. In other words,
      even though 1111 is a valid slot value in the hash
      bucket, we consider it invalid and release the slot and
      the HPTE. This gives us the opportunity to determine
      the validity of the S,G,I,X bits based on their contents
      and not on any of the bits V0, V1, V2 or V3 in the primary PTE.
      
      When we release a HPTE cached in the 1111 slot
      we also release a legitimate slot in the primary
      hash bucket and unmap its corresponding HPTE. This
      is to ensure that we do get a HPTE cached in a slot
      of the primary hash bucket, the next time we retry.
      
      Though treating the 1111 slot as invalid reduces the
      number of available slots in the hash bucket and may
      have an effect on performance, the probability of
      hitting a 1111 slot is extremely low.
      
      Compared to the current scheme, the above scheme
      reduces the number of false hash table updates
      significantly and has the added advantage of releasing
      four valuable PTE bits for other purposes.
      
      NOTE: even though bits 3, 4, 5, 6, 7 are not used when
      the 64K PTE is backed by 4k HPTE, they continue to be
      used if the PTE gets backed by 64k HPTE. The next
      patch will decouple that as well, and truly release the
      bits.
      
      This idea was jointly developed by Paul Mackerras,
      Aneesh, Michael Ellerman and myself.
      
      4K PTE format remains unchanged currently.
      
      The patch does the following code changes
      a) PTE flags are split between 64k and 4k header files.
      b) __hash_page_4K() is reimplemented to reflect the
       above logic.
      
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      9d2edb18
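
The validity rule described above, where a quadruplet of 1111 means "no HPTE cached" so no separate valid bits are needed, can be sketched as follows. The names are hypothetical, and the placement of subpage i in bits 4i..4i+3 counted from the least significant bit is an assumption made for the example, not something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

#define SUBPAGES_PER_64K 16u        /* sixteen 4k subpages per 64k page */
#define SLOT_INVALID     0xfu       /* 1111: treated as "no HPTE cached" */
#define HIDX_ALL_INVALID UINT64_MAX /* every quadruplet starts as 1111 */

/* Fetch the cached slot value for one 4k subpage (subpage must be < 16). */
static inline unsigned int subpage_slot(uint64_t hidx_word, unsigned int subpage)
{
	return (unsigned int)((hidx_word >> (4u * subpage)) & 0xfu);
}

/* Validity comes from the slot contents alone. */
static inline bool subpage_slot_valid(uint64_t hidx_word, unsigned int subpage)
{
	return subpage_slot(hidx_word, subpage) != SLOT_INVALID;
}

/* Return the word with one subpage's quadruplet replaced. */
static inline uint64_t set_subpage_slot(uint64_t hidx_word, unsigned int subpage,
					unsigned int slot)
{
	unsigned int shift = 4u * subpage;

	hidx_word &= ~(UINT64_C(0xf) << shift);
	return hidx_word | ((uint64_t)(slot & 0xfu) << shift);
}
```

The cost of this encoding is the one visible in the text: a genuine slot value of 1111 cannot be stored, so an HPTE that lands there has to be released and retried.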
    • powerpc: introduce pte_get_hash_gslot() helper · 318995b4
      Ram Pai authored
      
      
      Introduce pte_get_hash_gslot(), which returns the global slot number of
      the HPTE in the global hash table.
      
      This function will come in handy as we work towards re-arranging the PTE
      bits in the later patches.
      
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      318995b4
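
As a rough illustration of what a "global slot number" means, the sketch below derives one from a hash value and a 4-bit hidx. It assumes eight HPTEs per hash group, that the top hidx bit selects the secondary bucket (reached by inverting the hash), and that the hash is already computed. The function and macro names are hypothetical, and the real helper takes different parameters.

```c
#include <stdint.h>

#define EX_HPTES_PER_GROUP 8u   /* assumed number of HPTEs per hash bucket */
#define EX_HIDX_SECONDARY  0x8u /* S bit: entry lives in the secondary bucket */
#define EX_HIDX_GROUP_IX   0x7u /* G,I,X bits: index within the bucket */

/*
 * Compute the global slot for an entry, given its primary hash, the hash
 * table mask, and the cached 4-bit hidx.
 */
static inline uint64_t ex_hash_gslot(uint64_t hash, uint64_t hash_mask,
				     unsigned int hidx)
{
	if (hidx & EX_HIDX_SECONDARY)
		hash = ~hash; /* secondary bucket uses the inverted hash */

	return (hash & hash_mask) * EX_HPTES_PER_GROUP +
	       (hidx & EX_HIDX_GROUP_IX);
}
```

Keeping this computation in one helper means callers no longer need to know where the slot bits sit in the PTE, which is what the later bit rearrangement relies on.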
    • powerpc: introduce pte_set_hidx() helper · 59aa31fd
      Ram Pai authored
      
      
      Introduce pte_set_hidx(). It sets the (H_PAGE_F_SECOND|H_PAGE_F_GIX) bits
      at the appropriate location in the PTE for a 4K PTE. For a 64K PTE, it sets
      the bits in the second part of the PTE. Though the implementation for
      the former just needs the slot parameter, it does take some additional
      parameters to keep the prototype consistent.
      
      This function will be handy as we work towards re-arranging the bits in
      the subsequent patches.
      
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      59aa31fd
  5. Dec 11, 2017
  6. Dec 04, 2017