Skip to content
  1. Aug 25, 2021
  2. Aug 23, 2021
  3. Aug 20, 2021
  4. Aug 18, 2021
  5. Aug 17, 2021
  6. Aug 16, 2021
  7. Aug 15, 2021
    • Fangrui Song's avatar
      powerpc: Add "-z notext" flag to disable diagnostic · 03557853
      Fangrui Song authored
      Object files used to link .tmp_vmlinux.kallsyms1 have many
      R_PPC64_ADDR64 relocations in non-SHF_WRITE sections. There are many
      text relocations (e.g. in .rela___ksymtab_gpl+* and .rela__mcount_loc
      sections) in a -pie link and are disallowed by LLD:
      
        ld.lld: error: can't create dynamic relocation R_PPC64_ADDR64 against local symbol in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
        >>> defined in arch/powerpc/kernel/head_64.o
        >>> referenced by arch/powerpc/kernel/head_64.o:(__restart_table+0x10)
      
      Newer GNU ld configured with "--enable-textrel-check=error" will report
      an error as well:
      
        $ ld-new -EL -m elf64lppc -pie ... -o .tmp_vmlinux.kallsyms1 ...
        ld-new: read-only segment has dynamic relocations
      
      Add "-z notext" to suppress the errors. Non-CONFIG_RELOCATABLE builds
      use the default -no-pie mode and thus R_PPC64_ADDR64 relocations can be
      resolved at link-time.
      
      Reported-by: It...
      03557853
    • Christophe Leroy's avatar
      powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto · 1e688dd2
      Christophe Leroy authored
      
      
      Using asm goto in __WARN_FLAGS() and WARN_ON() allows more
      flexibility to GCC.
      
      For that add an entry to the exception table so that
      program_check_exception() knowns where to resume execution
      after a WARNING.
      
      Here are two exemples. The first one is done on PPC32 (which
      benefits from the previous patch), the second is on PPC64.
      
      	unsigned long test(struct pt_regs *regs)
      	{
      		int ret;
      
      		WARN_ON(regs->msr & MSR_PR);
      
      		return regs->gpr[3];
      	}
      
      	unsigned long test9w(unsigned long a, unsigned long b)
      	{
      		if (WARN_ON(!b))
      			return 0;
      		return a / b;
      	}
      
      Before the patch:
      
      	000003a8 <test>:
      	 3a8:	81 23 00 84 	lwz     r9,132(r3)
      	 3ac:	71 29 40 00 	andi.   r9,r9,16384
      	 3b0:	40 82 00 0c 	bne     3bc <test+0x14>
      	 3b4:	80 63 00 0c 	lwz     r3,12(r3)
      	 3b8:	4e 80 00 20 	blr
      
      	 3bc:	0f e0 00 00 	twui    r0,0
      	 3c0:	80 63 00 0c 	lwz     r3,12(r3)
      	 3c4:	4e 80 00 20 	blr
      
      	0000000000000bf0 <.test9w>:
      	 bf0:	7c 89 00 74 	cntlzd  r9,r4
      	 bf4:	79 29 d1 82 	rldicl  r9,r9,58,6
      	 bf8:	0b 09 00 00 	tdnei   r9,0
      	 bfc:	2c 24 00 00 	cmpdi   r4,0
      	 c00:	41 82 00 0c 	beq     c0c <.test9w+0x1c>
      	 c04:	7c 63 23 92 	divdu   r3,r3,r4
      	 c08:	4e 80 00 20 	blr
      
      	 c0c:	38 60 00 00 	li      r3,0
      	 c10:	4e 80 00 20 	blr
      
      After the patch:
      
      	000003a8 <test>:
      	 3a8:	81 23 00 84 	lwz     r9,132(r3)
      	 3ac:	71 29 40 00 	andi.   r9,r9,16384
      	 3b0:	40 82 00 0c 	bne     3bc <test+0x14>
      	 3b4:	80 63 00 0c 	lwz     r3,12(r3)
      	 3b8:	4e 80 00 20 	blr
      
      	 3bc:	0f e0 00 00 	twui    r0,0
      
      	0000000000000c50 <.test9w>:
      	 c50:	7c 89 00 74 	cntlzd  r9,r4
      	 c54:	79 29 d1 82 	rldicl  r9,r9,58,6
      	 c58:	0b 09 00 00 	tdnei   r9,0
      	 c5c:	7c 63 23 92 	divdu   r3,r3,r4
      	 c60:	4e 80 00 20 	blr
      
      	 c70:	38 60 00 00 	li      r3,0
      	 c74:	4e 80 00 20 	blr
      
      In the first exemple, we see GCC doesn't need to duplicate what
      happens after the trap.
      
      In the second exemple, we see that GCC doesn't need to emit a test
      and a branch in the likely path in addition to the trap.
      
      We've got some WARN_ON() in .softirqentry.text section so it needs
      to be added in the OTHER_TEXT_SECTIONS in modpost.c
      
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/389962b1b702e3c78d169e59bcfac56282889173.1618331882.git.christophe.leroy@csgroup.eu
      1e688dd2
  8. Aug 14, 2021
    • Christophe Leroy's avatar
      powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 · db87a719
      Christophe Leroy authored
      powerpc BUG_ON() and WARN_ON() are based on using twnei instruction.
      
      For catching simple conditions like a variable having value 0, this
      is efficient because it does the test and the trap at the same time.
      But most conditions used with BUG_ON or WARN_ON are more complex and
      forces GCC to format the condition into a 0 or 1 value in a register.
      This will usually require 2 to 3 instructions.
      
      The most efficient solution would be to use __builtin_trap() because
      GCC is able to optimise the use of the different trap instructions
      based on the requested condition, but this is complex if not
      impossible for the following reasons:
      - __builtin_trap() is a non-recoverable instruction, so it can't be
      used for WARN_ON
      - Knowing which line of code generated the trap would require the
      analysis of DWARF information. This is not a feature we have today.
      
      As mentioned in commit 8d4fbcfb
      
       ("Fix WARN_ON() on bitfield ops")
      the way WARN_ON() is implemented is suboptimal. That commit also
      mentions an issue with 'long long' condition. It fixed it for
      WARN_ON() but the same problem still exists today with BUG_ON() on
      PPC32. It will be fixed by using the generic implementation.
      
      By using the generic implementation, gcc will naturally generate a
      branch to the unconditional trap generated by BUG().
      
      As modern powerpc implement zero-cycle branch,
      that's even more efficient.
      
      And for the functions using WARN_ON() and its return, the test
      on return from WARN_ON() is now also used for the WARN_ON() itself.
      
      On PPC64 we don't want it because we want to be able to use CFAR
      register to track how we entered the code that trapped. The CFAR
      register would be clobbered by the branch.
      
      A simple test function:
      
      	unsigned long test9w(unsigned long a, unsigned long b)
      	{
      		if (WARN_ON(!b))
      			return 0;
      		return a / b;
      	}
      
      Before the patch:
      
      	0000046c <test9w>:
      	 46c:	7c 89 00 34 	cntlzw  r9,r4
      	 470:	55 29 d9 7e 	rlwinm  r9,r9,27,5,31
      	 474:	0f 09 00 00 	twnei   r9,0
      	 478:	2c 04 00 00 	cmpwi   r4,0
      	 47c:	41 82 00 0c 	beq     488 <test9w+0x1c>
      	 480:	7c 63 23 96 	divwu   r3,r3,r4
      	 484:	4e 80 00 20 	blr
      
      	 488:	38 60 00 00 	li      r3,0
      	 48c:	4e 80 00 20 	blr
      
      After the patch:
      
      	00000468 <test9w>:
      	 468:	2c 04 00 00 	cmpwi   r4,0
      	 46c:	41 82 00 0c 	beq     478 <test9w+0x10>
      	 470:	7c 63 23 96 	divwu   r3,r3,r4
      	 474:	4e 80 00 20 	blr
      
      	 478:	0f e0 00 00 	twui    r0,0
      	 47c:	38 60 00 00 	li      r3,0
      	 480:	4e 80 00 20 	blr
      
      So we see before the patch we need 3 instructions on the likely path
      to handle the WARN_ON(). With the patch the trap goes on the unlikely
      path.
      
      See below the difference at the entry of system_call_exception where
      we have several BUG_ON(), allthough less impressing.
      
      With the patch:
      
      	00000000 <system_call_exception>:
      	   0:	81 6a 00 84 	lwz     r11,132(r10)
      	   4:	90 6a 00 88 	stw     r3,136(r10)
      	   8:	71 60 00 02 	andi.   r0,r11,2
      	   c:	41 82 00 70 	beq     7c <system_call_exception+0x7c>
      	  10:	71 60 40 00 	andi.   r0,r11,16384
      	  14:	41 82 00 6c 	beq     80 <system_call_exception+0x80>
      	  18:	71 6b 80 00 	andi.   r11,r11,32768
      	  1c:	41 82 00 68 	beq     84 <system_call_exception+0x84>
      	  20:	94 21 ff e0 	stwu    r1,-32(r1)
      	  24:	93 e1 00 1c 	stw     r31,28(r1)
      	  28:	7d 8c 42 e6 	mftb    r12
      	...
      	  7c:	0f e0 00 00 	twui    r0,0
      	  80:	0f e0 00 00 	twui    r0,0
      	  84:	0f e0 00 00 	twui    r0,0
      
      Without the patch:
      
      	00000000 <system_call_exception>:
      	   0:	94 21 ff e0 	stwu    r1,-32(r1)
      	   4:	93 e1 00 1c 	stw     r31,28(r1)
      	   8:	90 6a 00 88 	stw     r3,136(r10)
      	   c:	81 6a 00 84 	lwz     r11,132(r10)
      	  10:	69 60 00 02 	xori    r0,r11,2
      	  14:	54 00 ff fe 	rlwinm  r0,r0,31,31,31
      	  18:	0f 00 00 00 	twnei   r0,0
      	  1c:	69 60 40 00 	xori    r0,r11,16384
      	  20:	54 00 97 fe 	rlwinm  r0,r0,18,31,31
      	  24:	0f 00 00 00 	twnei   r0,0
      	  28:	69 6b 80 00 	xori    r11,r11,32768
      	  2c:	55 6b 8f fe 	rlwinm  r11,r11,17,31,31
      	  30:	0f 0b 00 00 	twnei   r11,0
      	  34:	7d 8c 42 e6 	mftb    r12
      
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/b286e07fb771a664b631cd07a40b09c06f26e64b.1618331881.git.christophe.leroy@csgroup.eu
      db87a719
  9. Aug 13, 2021