  1. May 24, 2020
  2. May 23, 2020
  3. May 22, 2020
  4. May 21, 2020
  5. May 20, 2020
    • io_uring: reset -EBUSY error when io sq thread is waken up · d4ae271d
      Xiaoguang Wang authored
      
      
      In io_sq_thread(), if we get an -EBUSY error and go to sleep, we
      currently never clear it, so io_sq_thread() never gets a chance to
      submit sqes again. The test program test.c below reveals this bug:
      
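      #define _GNU_SOURCE     /* for O_DIRECT */
      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/uio.h>
      #include <liburing.h>

      int main(int argc, char *argv[])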
      {
              struct io_uring ring;
              int i, fd, ret;
              struct io_uring_sqe *sqe;
              struct io_uring_cqe *cqe;
              struct iovec *iovecs;
              void *buf;
              struct io_uring_params p;
      
              if (argc < 2) {
                      printf("%s: file\n", argv[0]);
                      return 1;
              }
      
              memset(&p, 0, sizeof(p));
              p.flags = IORING_SETUP_SQPOLL;
              ret = io_uring_queue_init_params(4, &ring, &p);
              if (ret < 0) {
                      fprintf(stderr, "queue_init: %s\n", strerror(-ret));
                      return 1;
              }
      
              fd = open(argv[1], O_RDONLY | O_DIRECT);
              if (fd < 0) {
                      perror("open");
                      return 1;
              }
      
              iovecs = calloc(10, sizeof(struct iovec));
              for (i = 0; i < 10; i++) {
                      if (posix_memalign(&buf, 4096, 4096))
                              return 1;
                      iovecs[i].iov_base = buf;
                      iovecs[i].iov_len = 4096;
              }
      
              ret = io_uring_register_files(&ring, &fd, 1);
              if (ret < 0) {
                      fprintf(stderr, "%s: register %d\n", __FUNCTION__, ret);
                      return ret;
              }
      
              for (i = 0; i < 10; i++) {
                      sqe = io_uring_get_sqe(&ring);
                      if (!sqe)
                              break;
      
                      io_uring_prep_readv(sqe, 0, &iovecs[i], 1, 0);
                      sqe->flags |= IOSQE_FIXED_FILE;
      
                      ret = io_uring_submit(&ring);
                      sleep(1);
                      printf("submit %d\n", i);
              }
      
              for (i = 0; i < 10; i++) {
                      io_uring_wait_cqe(&ring, &cqe);
                      printf("receive: %d\n", i);
                      if (cqe->res != 4096) {
                              fprintf(stderr, "ret=%d, wanted 4096\n", cqe->res);
                              ret = 1;
                      }
                      io_uring_cqe_seen(&ring, cqe);
              }
      
              close(fd);
              io_uring_queue_exit(&ring);
              return 0;
      }
      sudo ./test testfile
      The above command hangs on the tenth request. To fix this bug, reset
      the variable 'ret' to zero when io_sq_thread() is woken up.
      
      Suggested-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
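
      The sticky-error problem described above can be illustrated with a
      small user-space model. This is only a sketch under stated
      assumptions, not the kernel code: struct poller, poller_iterate() and
      poller_wake() are hypothetical names, and the loop is reduced to a
      single decision on 'ret'.

      ```c
      #include <errno.h>
      #include <stdbool.h>

      /* Hypothetical stand-in for the submission thread's state; not kernel code. */
      struct poller {
              int ret;        /* result of the last submission attempt */
              int pending;    /* sqes waiting to be submitted */
              int submitted;  /* sqes submitted so far */
      };

      /* One loop iteration: submission is skipped while ret is still -EBUSY. */
      static void poller_iterate(struct poller *p)
      {
              if (p->ret == -EBUSY)
                      return; /* nothing here clears ret, so this repeats forever */
              p->submitted += p->pending;
              p->pending = 0;
              p->ret = 0;
      }

      /* Wake-up path; reset_on_wake models the behaviour described in the fix. */
      static void poller_wake(struct poller *p, bool reset_on_wake)
      {
              if (reset_on_wake)
                      p->ret = 0;
              poller_iterate(p);
      }
      ```

      With reset_on_wake set to false, a poller that went to sleep with
      ret == -EBUSY never submits its pending entry; with it set to true,
      the entry is submitted on the first iteration after wake-up.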
    • Revert "powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits." · 40bb0e90
      Christophe Leroy authored
      
      
      This reverts commit 697ece78.
      
      The implementation of SWAP on powerpc requires page protection
      bits to not be one of the least significant PTE bits.
      
      Until the SWAP implementation is changed and this requirement no
      longer applies, we have to keep at least _PAGE_RW outside of the
      3 last bits.
      
      For now, revert to previous PTE bits order. A further rework
      may come later.
      
      Fixes: 697ece78 ("powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits.")
      Reported-by: Rui Salvaterra <rsalvaterra@gmail.com>
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/b34706f8de87f84d135abb5f3ede6b6f16fb1f41.1589969799.git.christophe.leroy@csgroup.eu
    • arm64: Fix PTRACE_SYSEMU semantics · 1cf6022b
      Keno Fischer authored
      
      
      Quoth the man page:
      ```
             If the tracee was restarted by PTRACE_SYSCALL or PTRACE_SYSEMU, the
             tracee enters syscall-enter-stop just prior to entering any system
             call (which will not be executed if the restart was using
             PTRACE_SYSEMU, regardless of any change made to registers at this
             point or how the tracee is restarted after this stop).
      ```
      
      The parenthetical comment is currently true on x86 and powerpc,
      but not currently true on arm64. arm64 re-checks the _TIF_SYSCALL_EMU
      flag after the syscall entry ptrace stop. However, at this point,
      it reflects which method was used to re-start the syscall
      at the entry stop, rather than the method that was used to reach it.
      Fix that by recording the original flag before performing the ptrace
      stop, bringing the behavior in line with documentation and x86/powerpc.
      
      Fixes: f086f674 ("arm64: ptrace: add support for syscall emulation")
      Cc: <stable@vger.kernel.org> # 5.3.x-
      Signed-off-by: Keno Fischer <keno@juliacomputing.com>
      Acked-by: Will Deacon <will@kernel.org>
      Tested-by: Sudeep Holla <sudeep.holla@arm.com>
      Tested-by: Bin Lu <Bin.Lu@arm.com>
      [catalin.marinas@arm.com: moved 'flags' bit masking]
      [catalin.marinas@arm.com: changed 'flags' type to unsigned long]
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
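
      The difference between re-checking the flag after the stop and
      recording it beforehand can be sketched in plain C. This is a
      user-space model under stated assumptions, not the arm64 code:
      FLAG_SYSCALL_EMU stands in for _TIF_SYSCALL_EMU, and struct task,
      stop_fn and the three functions are hypothetical names.

      ```c
      #include <stdbool.h>

      #define FLAG_SYSCALL_EMU 0x1UL  /* hypothetical stand-in for _TIF_SYSCALL_EMU */

      struct task {
              unsigned long flags;
      };

      /* Models the entry stop, during which the tracer picks a restart method. */
      typedef void (*stop_fn)(struct task *);

      /* Re-checks the flag after the stop, so it sees the restart method. */
      static bool skip_syscall_recheck(struct task *t, stop_fn entry_stop)
      {
              entry_stop(t);
              return (t->flags & FLAG_SYSCALL_EMU) != 0;
      }

      /* Records the flag before the stop, so it sees how the stop was reached. */
      static bool skip_syscall_snapshot(struct task *t, stop_fn entry_stop)
      {
              unsigned long flags = t->flags;

              entry_stop(t);
              return (flags & FLAG_SYSCALL_EMU) != 0;
      }

      /* A tracer that restarts the tracee without syscall emulation. */
      static void restart_without_emu(struct task *t)
      {
              t->flags &= ~FLAG_SYSCALL_EMU;
      }
      ```

      For a task that reached the stop with the emulation flag set and is
      restarted by restart_without_emu(), the re-check variant reports that
      the syscall should run, while the snapshot variant still skips it,
      which is the behaviour the man page describes.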
    • io_uring: don't add non-IO requests to iopoll pending list · b532576e
      Jens Axboe authored
      
      
      We normally disable any commands that aren't specifically poll commands
      for a ring that is set up for polling, but we do allow buffer provide
      and remove commands to support buffer selection for polled IO. Once a
      request is issued, we add it to the poll list to poll for completion.
      But we should not do that for non-IO commands, as those requests
      complete inline immediately and aren't pollable. If we do, we can leave
      requests on the iopoll list after they are freed.
      
      Fixes: ddf0322d ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
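
      The rule described above (only requests that actually perform IO go
      on the poll list) can be sketched as follows. This is a simplified
      model under stated assumptions, not the io_uring implementation:
      struct request, struct ring, the does_io field and issue_request()
      are all hypothetical.

      ```c
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical request; does_io is false for buffer provide/remove commands. */
      struct request {
              bool does_io;
              struct request *next;   /* link on the poll list */
      };

      /* Hypothetical ring with a singly linked list of requests to poll. */
      struct ring {
              bool iopoll;            /* ring is set up for polled IO */
              struct request *poll_list;
      };

      /* Issue a request; returns true if it was queued for completion polling. */
      static bool issue_request(struct ring *r, struct request *req)
      {
              /* Non-IO requests complete inline, so there is nothing to poll for. */
              if (!r->iopoll || !req->does_io)
                      return false;

              req->next = r->poll_list;
              r->poll_list = req;
              return true;
      }
      ```

      Because a non-IO request never reaches the list, freeing it right
      after inline completion cannot leave a dangling entry behind.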
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 115a5416
      Linus Torvalds authored
      Pull vfs fix from Al Viro:
       "Stable fodder fix: copy_fdtable() would get screwed on 64bit boxen
        with sysctl_nr_open raised to 512M or higher, which became possible
        since 2.6.25.
      
        Nobody sane would set the things up that way, but..."
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix multiplication overflow in copy_fdtable()
    • Merge tag 'arc-5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 3c9e6656
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - fix recent DSP code regression on ARC700 platforms
      
       - fix thinkos in ICCM/DCCM size checks
      
       - USB regression fix
      
       - other small fixes here and there
      
      * tag 'arc-5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: show_regs: avoid extra line of output
        ARC: guard dsp early init against non ARCv2
        ARC: [plat-eznps]: Restrict to CONFIG_ISA_ARCOMPACT
        ARC: entry: comment
        arc: remove #ifndef CONFIG_AS_CFI_SIGNAL_FRAME
        arc: ptrace: hard-code "arc" instead of UTS_MACHINE
        ARC: [plat-hsdk]: fix USB regression
        ARC: Fix ICCM & DCCM runtime size checks
    • fix multiplication overflow in copy_fdtable() · 4e89b721
      Al Viro authored
      
      
      cpy and set really should be size_t; we won't get an overflow on that,
      since sysctl_nr_open can't be set above ~(size_t)0 / sizeof(void *),
      so nr that would've managed to overflow size_t on that multiplication
      won't get anywhere near copy_fdtable() - we'll fail with EMFILE
      before that.
      
      Cc: stable@kernel.org # v2.6.25+
      Fixes: 9cfe015a (get rid of NR_OPEN and introduce a sysctl_nr_open)
      Reported-by: Thiago Macieira <thiago.macieira@intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
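
      The arithmetic behind the overflow can be checked with a short
      example. It is a sketch under stated assumptions: the pointer size is
      fixed at 8 bytes (a 64-bit machine), the narrower variable is assumed
      to be 32 bits wide, and copy_bytes_u32() and copy_bytes_u64() are
      hypothetical helpers, not kernel functions.

      ```c
      #include <stdint.h>

      #define PTR_SIZE 8u     /* assumed pointer size on a 64-bit machine */

      /* Byte count stored in a 32-bit variable: the 64-bit product is truncated. */
      static uint32_t copy_bytes_u32(uint32_t nr)
      {
              return (uint32_t)((uint64_t)nr * PTR_SIZE);
      }

      /* Byte count stored in a 64-bit variable, as size_t is on such machines. */
      static uint64_t copy_bytes_u64(uint32_t nr)
      {
              return (uint64_t)nr * PTR_SIZE;
      }
      ```

      With nr = 512M (1u << 29), the product is 2^32 bytes. The 32-bit
      variant wraps to 0, while the 64-bit variant keeps the full value,
      which matches the 512M threshold mentioned in the pull request text.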
    • io_uring: don't use kiocb.private to store buf_index · 4f4eeba8
      Bijan Mottahedeh authored
      
      
      kiocb.private is used in iomap_dio_rw() so store buf_index separately.
      
      Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
      
      Move 'buf_index' to a hole in io_kiocb.
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>