Skip to content
  1. Jul 15, 2013
  2. Jul 13, 2013
    • J. Bruce Fields's avatar
      nfsd4: fix minorversion support interface · 35f7a14f
      J. Bruce Fields authored
      
      
      You can turn on or off support for minorversions using e.g.
      
      	echo "-4.2" >/proc/fs/nfsd/versions
      
      However, the current implementation is a little wonky.  For example, the
      above will turn off 4.2 support, but it will also turn *on* 4.1 support.
      
      This didn't matter as long as we only had 2 minorversions, which was
      true till very recently.
      
      And do a little cleanup here.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      35f7a14f
  3. Jul 12, 2013
    • David Jeffery's avatar
      lockd: protect nlm_blocked access in nlmsvc_retry_blocked · 1c327d96
      David Jeffery authored
      In nlmsvc_retry_blocked, the check that the list is non-empty and acquiring
      the pointer of the first entry is unprotected by any lock.  This allows a rare
      race condition when there is only one entry on the list.  A function such as
      nlmsvc_grant_callback() can be called, which will temporarily remove the entry
      from the list.  Between the list_empty() and list_entry(),the list may become
      empty, causing an invalid pointer to be used as an nlm_block, leading to a
      possible crash.
      
      This patch adds the nlm_block_lock around these calls to prevent concurrent
      use of the nlm_blocked list.
      
      This was a regression introduced by
      f904be9c
      
        "lockd: Mostly remove BKL from
      the server".
      
      Cc: Bryan Schumaker <bjschuma@netapp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDavid Jeffery <djeffery@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      1c327d96
  4. Jul 09, 2013
    • J. Bruce Fields's avatar
      nfsd4: support minorversion 1 by default · d1091481
      J. Bruce Fields authored
      
      
      We now have minimal minorversion 1 support; turn it on by default.
      
      This can still be turned off with "echo -4.1 >/proc/fs/nfsd/versions".
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      d1091481
    • J. Bruce Fields's avatar
      nfsd4: allow destroy_session over destroyed session · f0f51f5c
      J. Bruce Fields authored
      
      
      RFC 5661 allows a client to destroy a session using a compound
      associated with the destroyed session, as long as the DESTROY_SESSION op
      is the last op of the compound.
      
      We attempt to allow this, but testing against a Solaris client (which
      does destroy sessions in this way) showed that we were failing the
      DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference
      count on the session (held by us) represented another rpc in progress
      over this session.
      
      Fix this by noting that in this case the expected reference count is 1,
      not 0.
      
      Also, note as long as the session holds a reference to the compound
      we're destroying, we can't free it here--instead, delay the free till
      the final put in nfs4svc_encode_compoundres.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      f0f51f5c
    • J. Bruce Fields's avatar
      svcrpc: fix failures to handle -1 uid's · 0979292b
      J. Bruce Fields authored
      As of f025adf1 "sunrpc: Properly decode
      kuids and kgids in RPC_AUTH_UNIX credentials" any rpc containing a -1
      (0xffff) uid or gid would fail with a badcred error.
      
      Commit afe3c3fd
      
       "svcrpc: fix failures to
      handle -1 uid's and gid's" fixed part of the problem, but overlooked the
      gid upcall--the kernel can request supplementary gid's for the -1 uid,
      but mountd's attempt write a response will get -EINVAL.
      
      Symptoms were nfsd failing to reply to the first attempt to use a newly
      negotiated krb5 context.
      
      Reported-by: default avatarSven Geggus <lists@fuchsschwanzdomain.de>
      Tested-by: default avatarSven Geggus <lists@fuchsschwanzdomain.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0979292b
  5. Jul 02, 2013
    • NeilBrown's avatar
      sunrpc: Don't schedule an upcall on a replaced cache entry. · 0bebc633
      NeilBrown authored
      
      
      When a cache entry is replaced, the "expiry_time" get set to
      zero by a call to "cache_fresh_locked(..., 0)" at the end of
      "sunrpc_cache_update".
      
      This low expiry time makes cache_check() think that the 'refresh_age'
      is negative, so the 'age' is comparatively large and a refresh is
      triggered.
      However refreshing a replaced entry it pointless, it cannot achieve
      anything useful.
      
      So teach cache_check to ignore a low refresh_age when expiry_time
      is zero.
      
      Reported-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0bebc633
    • NeilBrown's avatar
      net/sunrpc: xpt_auth_cache should be ignored when expired. · 7715cde8
      NeilBrown authored
      commit d202cce8
      
      
          sunrpc: never return expired entries in sunrpc_cache_lookup
      
      moved the 'entry is expired' test from cache_check to
      sunrpc_cache_lookup, so that it happened early and some races could
      safely be ignored.
      
      However the ip_map (in svcauth_unix.c) has a separate single-item
      cache which allows quick lookup without locking.  An entry in this
      case would not be subject to the expiry test and so could be used
      well after it has expired.
      
      This is not normally a big problem because the first time it is used
      after it is expired an up-call will be scheduled to refresh the entry
      (if it hasn't been scheduled already) and the old entry will then
      be invalidated.  So on the second attempt to use it after it has
      expired, ip_map_cached_get will discard it.
      
      However that is subtle and not ideal, so replace the "!cache_valid"
      test with "cache_is_expired".
      In doing this we drop the test on the "CACHE_VALID" bit.  This is
      unnecessary as the bit is never cleared, and an entry will only
      be cached if the bit is set.
      
      Reported-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      7715cde8
    • NeilBrown's avatar
      sunrpc/cache: ensure items removed from cache do not have pending upcalls. · 013920eb
      NeilBrown authored
      
      
      It is possible for a race to set CACHE_PENDING after cache_clean()
      has removed a cache entry from the cache.
      If CACHE_PENDING is still set when the entry is finally 'put',
      the cache_dequeue() will never happen and we can leak memory.
      
      So set a new flag 'CACHE_CLEANED' when we remove something from
      the cache, and don't queue any upcall if it is set.
      
      If CACHE_PENDING is set before CACHE_CLEANED, the call that
      cache_clean() makes to cache_fresh_unlocked() will free memory
      as needed.  If CACHE_PENDING is set after CACHE_CLEANED, the
      test in sunrpc_cache_pipe_upcall will ensure that the memory
      is not allocated.
      
      Reported-by: default avatar <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      013920eb
    • NeilBrown's avatar
      sunrpc/cache: use cache_fresh_unlocked consistently and correctly. · 2a1c7f53
      NeilBrown authored
      
      
      cache_fresh_unlocked() is called when a cache entry
      has been updated and ensures that if there were any
      pending upcalls, they are cleared.
      
      So every time we update a cache entry, we should call this,
      and this should be the only way that we try to clear
      pending calls (that sort of uniformity makes code sooo much
      easier to read).
      
      try_to_negate_entry() will (possibly) mark an entry as
      negative.  If it doesn't, it is because the entry already
      is VALID.
      So the entry will be valid on exit, so it is appropriate to
      call cache_fresh_unlocked().
      So tidy up try_to_negate_entry() to do that, and remove
      partial open-coded cache_fresh_unlocked() from the one
      call-site of try_to_negate_entry().
      
      In the other branch of the 'switch(cache_make_upcall())',
      we again have a partial open-coded version of cache_fresh_unlocked().
      Replace that with a real call.
      
      And again in cache_clean(), use a real call to cache_fresh_unlocked().
      
      These call sites might previously have called
      cache_revisit_request() if CACHE_PENDING wasn't set.
      This is never necessary because cache_revisit_request() can
      only do anything if the item is in the cache_defer_hash,
      However any time that an item is added to the cache_defer_hash
      (setup_deferral), the code immediately tests CACHE_PENDING,
      and removes the entry again if it is clear.  So all other
      places we only need to 'cache_revisit_request' if we've
      just cleared CACHE_PENDING.
      
      Reported-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      2a1c7f53
    • NeilBrown's avatar
      sunrpc/cache: remove races with queuing an upcall. · f9e1aedc
      NeilBrown authored
      
      
      We currently queue an upcall after setting CACHE_PENDING,
      and dequeue after clearing CACHE_PENDING.
      So a request should only be present when CACHE_PENDING is set.
      
      However we don't combine the test and the enqueue/dequeue in
      a protected region, so it is possible (if unlikely) for a race
      to result in a request being queued without CACHE_PENDING set,
      or a request to be absent despite CACHE_PENDING.
      
      So: include a test for CACHE_PENDING inside the regions of
      enqueue and dequeue where queue_lock is held, and abort
      the operation if the value is not as expected.
      
      Also remove the early 'return' from cache_dequeue() to ensure that it
      always removes all entries: As there is no locking between setting
      CACHE_PENDING and calling sunrpc_cache_pipe_upcall it is not
      inconceivable for some other thread to clear CACHE_PENDING and then
      someone else to set it and call sunrpc_cache_pipe_upcall, both before
      the original threads completed the call.
      
      With this, it perfectly safe and correct to:
       - call cache_dequeue() if and only if we have just
         cleared CACHE_PENDING
       - call sunrpc_cache_pipe_upcall() (via cache_make_upcall)
         if and only if we have just set CACHE_PENDING.
      
      Reported-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      f9e1aedc
    • J. Bruce Fields's avatar
      nfsd4: return delegation immediately if lease fails · d08d32e6
      J. Bruce Fields authored
      
      
      This case shouldn't happen--the administrator shouldn't really allow
      other applications access to the export until clients have had the
      chance to reclaim their state--but if it does then we should set the
      "return this lease immediately" bit on the reply.  That still leaves
      some small races, but it's the best the protocol allows us to do in the
      case a lease is ripped out from under us....
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      d08d32e6
    • J. Bruce Fields's avatar
      nfsd4: do not throw away 4.1 lock state on last unlock · 0a262ffb
      J. Bruce Fields authored
      This reverts commit eb2099f3
      
       "nfsd4:
      release lockowners on last unlock in 4.1 case".  Trond identified
      language in rfc 5661 section 8.2.4 which forbids this behavior:
      
      	Stateids associated with byte-range locks are an exception.
      	They remain valid even if a LOCKU frees all remaining locks, so
      	long as the open file with which they are associated remains
      	open, unless the client frees the stateids via the FREE_STATEID
      	operation.
      
      And bakeathon 2013 testing found a 4.1 freebsd client was getting an
      incorrect BAD_STATEID return from a FREE_STATEID in the above situation
      and then failing.
      
      The spec language honestly was probably a mistake but at this point with
      implementations already following it we're probably stuck with that.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0a262ffb
    • J. Bruce Fields's avatar
      nfsd4: delegation-based open reclaims should bypass permissions · 89f6c336
      J. Bruce Fields authored
      
      
      We saw a v4.0 client's create fail as follows:
      
      	- open create succeeds and gets a read delegation
      	- client attempts to set mode on new file, gets DELAY while
      	  server recalls delegation.
      	- client attempts a CLAIM_DELEGATE_CUR open using the
      	  delegation, gets error because of new file mode.
      
      This probably can't happen on a recent kernel since we're no longer
      giving out delegations on create opens.  Nevertheless, it's a
      bug--reclaim opens should bypass permission checks.
      
      Reported-by: default avatarSteve Dickson <steved@redhat.com>
      Reported-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      89f6c336
    • J. Bruce Fields's avatar
      svcrpc: don't error out on small tcp fragment · 1f691b07
      J. Bruce Fields authored
      
      
      Though clients we care about mostly don't do this, it is possible for
      rpc requests to be sent in multiple fragments.  Here we have a sanity
      check to ensure that the final received rpc isn't too small--except that
      the number we're actually checking is the length of just the final
      fragment, not of the whole rpc.  So a perfectly legal rpc that's
      unluckily fragmented could cause the server to close the connection
      here.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      1f691b07
    • J. Bruce Fields's avatar
      svcrpc: fix handling of too-short rpc's · cf3aa02c
      J. Bruce Fields authored
      
      
      If we detect that an rpc is too short, we abort and close the
      connection.  Except, there's a bug here: we're leaving sk_datalen
      nonzero without leaving any pages in the sk_pages array.  The most
      likely result of the inconsistency is a subsequent crash in
      svc_tcp_clear_pages.
      
      Also demote the BUG_ON in svc_tcp_clear_pages to a WARN.
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      cf3aa02c
    • J. Bruce Fields's avatar
      nfsd4: minor read_buf cleanup · 590b7431
      J. Bruce Fields authored
      
      
      The code to step to the next page seems reasonably self-contained.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      590b7431
    • J. Bruce Fields's avatar
      nfsd4: fix decoding of compounds across page boundaries · 24750082
      J. Bruce Fields authored
      
      
      A freebsd NFSv4.0 client was getting rare IO errors expanding a tarball.
      A network trace showed the server returning BAD_XDR on the final getattr
      of a getattr+write+getattr compound.  The final getattr started on a
      page boundary.
      
      I believe the Linux client ignores errors on the post-write getattr, and
      that that's why we haven't seen this before.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarRick Macklem <rmacklem@uoguelph.ca>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      24750082
    • J. Bruce Fields's avatar
      nfsd4: clean up nfs4_open_delegation · 99c41515
      J. Bruce Fields authored
      
      
      The nfs4_open_delegation logic is unecessarily baroque.
      
      Also stop pretending we support write delegations in several places.
      
      Some day we will support write delegations, but when that happens adding
      back in these flag parameters will be the easy part.  For now they're
      just confusing.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      99c41515
    • Steve Dickson's avatar
      NFSD: Don't give out read delegations on creates · 9a0590ae
      Steve Dickson authored
      
      
      When an exclusive create is done with the mode bits
      set (aka open(testfile, O_CREAT | O_EXCL, 0777)) this
      causes a OPEN op followed by a SETATTR op. When a
      read delegation is given in the OPEN, it causes
      the SETATTR to delay with EAGAIN until the
      delegation is recalled.
      
      This patch caused exclusive creates to give out
      a write delegation (which turn into no delegation)
      which allows the SETATTR seamlessly succeed.
      
      Signed-off-by: default avatarSteve Dickson <steved@redhat.com>
      [bfields: do this for any CREATE, not just exclusive; comment]
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9a0590ae
    • J. Bruce Fields's avatar
      nfsd4: allow client to send no cb_sec flavors · 57569a70
      J. Bruce Fields authored
      
      
      In testing I notice that some of the pynfs tests forget to send any
      cb_sec flavors, and that we haven't necessarily errored out in that case
      before.
      
      I'll fix pynfs, but am also inclined to default to trying AUTH_NONE in
      that case in case this is something clients actually do.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      57569a70
    • J. Bruce Fields's avatar
      nfsd4: fail attempts to request gss on the backchannel · b78724b7
      J. Bruce Fields authored
      
      
      We don't support gss on the backchannel.  We should state that fact up
      front rather than just letting things continue and later making the
      client try to figure out why the backchannel isn't working.
      
      Trond suggested instead returning NFS4ERR_NOENT.  I think it would be
      tricky for the client to distinguish between the case "I don't support
      gss on the backchannel" and "I can't find that in my cache, please
      create another context and try that instead", and I'd prefer something
      that currently doesn't have any other meaning for this operation, hence
      the (somewhat arbitrary) NFS4ERR_ENCR_ALG_UNSUPP.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      b78724b7
    • J. Bruce Fields's avatar
      nfsd4: implement minimal SP4_MACH_CRED · 57266a6e
      J. Bruce Fields authored
      
      
      Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring
      the client-provided spo_must_* arrays and just enforcing credential
      checks for the minimum required operations.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      57266a6e
    • J. Bruce Fields's avatar
      svcrpc: store gss mech in svc_cred · 0dc1531a
      J. Bruce Fields authored
      
      
      Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
      This will make it easier to enforce SP4_MACH_CRED, which needs to
      compare the mechanism used on the exchange_id with that used on
      protected operations.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0dc1531a
    • J. Bruce Fields's avatar
      svcrpc: introduce init_svc_cred · 44234063
      J. Bruce Fields authored
      
      
      Common helper to zero out fields of the svc_cred.
      
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      44234063
    • J. Bruce Fields's avatar
      Merge branch 'for-3.10' into 'for-3.11' · 0de93493
      J. Bruce Fields authored
      Merge bugfixes into my for-3.11 branch.
      0de93493
  6. May 29, 2013
  7. May 21, 2013
  8. May 15, 2013
  9. May 13, 2013
  10. May 12, 2013