Commit 38764c73 authored by Linus Torvalds

Merge tag 'nfsd-5.16' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "A slow cycle for nfsd: mainly cleanup, including Neil's patch dropping
  support for a filehandle format deprecated 20 years ago, and further
  xdr-related cleanup from Chuck"

* tag 'nfsd-5.16' of git://linux-nfs.org/~bfields/linux: (26 commits)
  nfsd4: remove obselete comment
  nfsd: document server-to-server-copy parameters
  NFSD:fix boolreturn.cocci warning
  nfsd: update create verifier comment
  SUNRPC: Change return value type of .pc_encode
  SUNRPC: Replace the "__be32 *p" parameter to .pc_encode
  NFSD: Save location of NFSv4 COMPOUND status
  SUNRPC: Change return value type of .pc_decode
  SUNRPC: Replace the "__be32 *p" parameter to .pc_decode
  SUNRPC: De-duplicate .pc_release() call sites
  SUNRPC: Simplify the SVC dispatch code path
  SUNRPC: Capture value of xdr_buf::page_base
  SUNRPC: Add trace event when alloc_pages_bulk() makes no progress
  svcrdma: Split svcrmda_wc_{read,write} tracepoints
  svcrdma: Split the svcrdma_wc_send() tracepoint
  svcrdma: Split the svcrdma_wc_receive() tracepoint
  NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment()
  SUNRPC: xdr_stream_subsegment() must handle non-zero page_bases
  NFSD: Initialize pointer ni with NULL and not plain integer 0
  NFSD: simplify struct nfsfh
  ...
parents 2ec20f48 80479eb8
+14 −0
@@ -3253,6 +3253,19 @@
			driver. A non-zero value sets the minimum interval
			in seconds between layoutstats transmissions.

	nfsd.inter_copy_offload_enable =
			[NFSv4.2] When set to 1, the server will support
			server-to-server copies for which this server is
			the destination of the copy.

	nfsd.nfsd4_ssc_umount_timeout =
			[NFSv4.2] When used as the destination of a
			server-to-server copy, knfsd temporarily mounts
			the source server.  It caches the mount in case
			it will be needed again, and discards it if not
			used for the number of milliseconds specified by
			this parameter.

	nfsd.nfs4_disable_idmapping=
			[NFSv4] When set to the default of '1', the NFSv4
			server will return only numeric uids and gids to
@@ -3260,6 +3273,7 @@
			and gids from such clients.  This is intended to ease
			migration from NFSv2/v3.


	nmi_backtrace.backtrace_idle [KNL]
			Dump stacks even of idle CPUs in response to an
			NMI stack-backtrace request.
+1 −0
@@ -11,3 +11,4 @@ NFS
   rpc-server-gss
   nfs41-server
   knfsd-stats
   reexport
+113 −0
Reexporting NFS filesystems
===========================

Overview
--------

It is possible to reexport an NFS filesystem over NFS.  However, this
feature comes with a number of limitations.  Before trying it, we
recommend some careful research to determine whether it will work for
your purposes.

A discussion of current known limitations follows.

"fsid=" required, crossmnt broken
---------------------------------

We require the "fsid=" export option on any reexport of an NFS
filesystem.  You can use "uuidgen -r" to generate a unique argument.

The "crossmnt" export option does not propagate "fsid=", so it will not
allow traversing into further NFS filesystems; if you wish to export NFS
filesystems mounted under the exported filesystem, you'll need to export
them explicitly, assigning each its own unique "fsid=" option.

Reboot recovery
---------------

The NFS protocol's normal reboot recovery mechanisms don't work for the
case when the reexport server reboots.  Clients will lose any locks
they held before the reboot, and further IO will result in errors.
Closing and reopening files should clear the errors.

Filehandle limits
-----------------

If the original server uses an X byte filehandle for a given object, the
reexport server's filehandle for the reexported object will be X+22
bytes, rounded up to the nearest multiple of four bytes.

The result must fit into the RFC-mandated filehandle size limits:

+-------+-----------+
| NFSv2 |  32 bytes |
+-------+-----------+
| NFSv3 |  64 bytes |
+-------+-----------+
| NFSv4 | 128 bytes |
+-------+-----------+

So, for example, you will only be able to reexport a filesystem over
NFSv2 if the original server gives you filehandles that fit in 10
bytes--which is unlikely.

In general there's no way to know the maximum filehandle size given out
by an NFS server without asking the server vendor.

But the following table gives a few examples.  The first column is the
typical length of the filehandle from a Linux server exporting the given
filesystem, the second is the length after that nfs export is reexported
by another Linux host:

+--------+-------------------+----------------+
|        | filehandle length | after reexport |
+========+===================+================+
| ext4:  | 28 bytes          | 52 bytes       |
+--------+-------------------+----------------+
| xfs:   | 32 bytes          | 56 bytes       |
+--------+-------------------+----------------+
| btrfs: | 40 bytes          | 64 bytes       |
+--------+-------------------+----------------+

All will therefore fit in an NFSv3 or NFSv4 filehandle after reexport,
but none are reexportable over NFSv2.
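
The arithmetic behind the table can be checked with a short sketch.  The
C helpers below are illustrative only and are not kernel code: the names
reexport_fh_len() and reexport_fh_fits() and the constants are invented
here, while the 22-byte overhead, the rounding rule and the size limits
are taken from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

/* RFC-mandated maximum filehandle sizes in bytes (from the table above). */
enum {
	NFS2_FH_MAX = 32,
	NFS3_FH_MAX = 64,
	NFS4_FH_MAX = 128,
};

/* Bytes the reexport server adds to each original filehandle (per the text). */
#define REEXPORT_FH_OVERHEAD 22

/*
 * Hypothetical helper: length of the reexported filehandle for an
 * original filehandle of orig_len bytes.  Add the overhead, then round
 * up to the nearest multiple of four.
 */
size_t reexport_fh_len(size_t orig_len)
{
	return (orig_len + REEXPORT_FH_OVERHEAD + 3) & ~(size_t)3;
}

/* Hypothetical helper: does the reexported filehandle fit the limit? */
bool reexport_fh_fits(size_t orig_len, size_t proto_max)
{
	return reexport_fh_len(orig_len) <= proto_max;
}
```

With these helpers, reexport_fh_len(28) is 52 (the ext4 row), and
reexport_fh_fits(10, NFS2_FH_MAX) is true while
reexport_fh_fits(11, NFS2_FH_MAX) is false, which is where the 10-byte
figure for NFSv2 comes from.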

Linux server filehandles are a bit more complicated than this, though;
for example:

        - The (non-default) "subtreecheck" export option generally
          requires another 4 to 8 bytes in the filehandle.
        - If you export a subdirectory of a filesystem (instead of
          exporting the filesystem root), that also usually adds 4 to 8
          bytes.
        - If you export over NFSv2, knfsd usually uses a shorter
          filesystem identifier that saves 8 bytes.
        - The root directory of an export uses a filehandle that is
          shorter.

As you can see, the 128-byte NFSv4 filehandle is large enough that
you're unlikely to have trouble using NFSv4 to reexport any filesystem
exported from a Linux server.  In general, if the original server is
something that also supports NFSv3, you're *probably* OK, since its
filehandles are then likely to be no larger than the 64 bytes NFSv3
allows.  Reexporting over NFSv3 may be dicier, and reexporting over
NFSv2 will probably never work.

For more details of Linux filehandle structure, the best reference is
the source code and comments; see in particular:

        - include/linux/exportfs.h:enum fid_type
        - include/uapi/linux/nfsd/nfsfh.h:struct nfs_fhbase_new
        - fs/nfsd/nfsfh.c:set_version_and_fsid_type
        - fs/nfs/export.c:nfs_encode_fh

Open DENY bits ignored
----------------------

NFS since NFSv4 supports ALLOW and DENY bits taken from Windows, which
allow you, for example, to open a file in a mode which forbids other
read opens or write opens. The Linux client doesn't use them, and the
server's support has always been incomplete: they are enforced only
against other NFS users, not against processes accessing the exported
filesystem locally. A reexport server will also not pass them along to
the original server, so they will not be enforced between clients of
different reexport servers.
+2 −4
@@ -780,11 +780,9 @@ module_exit(exit_nlm);
 static int nlmsvc_dispatch(struct svc_rqst *rqstp, __be32 *statp)
 {
 	const struct svc_procedure *procp = rqstp->rq_procinfo;
-	struct kvec *argv = rqstp->rq_arg.head;
-	struct kvec *resv = rqstp->rq_res.head;

 	svcxdr_init_decode(rqstp);
-	if (!procp->pc_decode(rqstp, argv->iov_base))
+	if (!procp->pc_decode(rqstp, &rqstp->rq_arg_stream))
 		goto out_decode_err;

 	*statp = procp->pc_func(rqstp);
@@ -794,7 +792,7 @@ static int nlmsvc_dispatch(struct svc_rqst *rqstp, __be32 *statp)
 		return 1;

 	svcxdr_init_encode(rqstp);
-	if (!procp->pc_encode(rqstp, resv->iov_base + resv->iov_len))
+	if (!procp->pc_encode(rqstp, &rqstp->rq_res_stream))
 		goto out_encode_err;

 	return 1;
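
The shape of this change can be modelled outside the kernel.  The sketch
below is a user-space approximation, not the SUNRPC code: struct stream,
struct request, struct procedure, the echo_* handlers and the status
names are all hypothetical stand-ins, and the hunk above does not show
what the out_decode_err and out_encode_err labels do, so the two error
statuses here are placeholders.  It only illustrates a dispatcher that
hands each callback a stream object instead of a raw buffer pointer and
tests a boolean result.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct xdr_stream: a bounded cursor. */
struct stream {
	uint8_t *pos;
	uint8_t *end;
};

/* Hypothetical stand-in for struct svc_rqst. */
struct request {
	struct stream arg_stream;	/* models rq_arg_stream */
	struct stream res_stream;	/* models rq_res_stream */
	uint32_t arg;
	uint32_t res;
};

/* Hypothetical stand-in for struct svc_procedure. */
struct procedure {
	bool (*decode)(struct request *rq, struct stream *xdr);
	void (*func)(struct request *rq);
	bool (*encode)(struct request *rq, struct stream *xdr);
};

/* Placeholder statuses; the real error handling is not shown in the hunk. */
enum status { ST_OK, ST_DECODE_FAILED, ST_ENCODE_FAILED };

/* Sample decoder: copy one 32-bit word from the stream, bounds-checked. */
bool echo_decode(struct request *rq, struct stream *xdr)
{
	if ((size_t)(xdr->end - xdr->pos) < sizeof(rq->arg))
		return false;
	memcpy(&rq->arg, xdr->pos, sizeof(rq->arg));
	xdr->pos += sizeof(rq->arg);
	return true;
}

/* Sample procedure body: the result is the argument. */
void echo_func(struct request *rq)
{
	rq->res = rq->arg;
}

/* Sample encoder: copy one 32-bit word into the stream, bounds-checked. */
bool echo_encode(struct request *rq, struct stream *xdr)
{
	if ((size_t)(xdr->end - xdr->pos) < sizeof(rq->res))
		return false;
	memcpy(xdr->pos, &rq->res, sizeof(rq->res));
	xdr->pos += sizeof(rq->res);
	return true;
}

const struct procedure echo_proc = { echo_decode, echo_func, echo_encode };

/* The dispatcher passes the request's own streams to the callbacks. */
enum status dispatch(const struct procedure *proc, struct request *rq)
{
	if (!proc->decode(rq, &rq->arg_stream))
		return ST_DECODE_FAILED;
	proc->func(rq);
	if (!proc->encode(rq, &rq->res_stream))
		return ST_ENCODE_FAILED;
	return ST_OK;
}
```

Because the callbacks receive the stream explicitly, they no longer need
to look it up from the request themselves, which is the declaration the
decoder and encoder hunks in the next file remove.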
+71 −81
@@ -145,137 +145,131 @@ svcxdr_encode_testrply(struct xdr_stream *xdr, const struct nlm_res *resp)
  * Decode Call arguments
  */

-int
-nlmsvc_decode_void(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_void(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	return 1;
+	return true;
 }

-int
-nlmsvc_decode_testargs(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_testargs(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
 	struct nlm_args *argp = rqstp->rq_argp;
 	u32 exclusive;

 	if (!svcxdr_decode_cookie(xdr, &argp->cookie))
-		return 0;
+		return false;
 	if (xdr_stream_decode_bool(xdr, &exclusive) < 0)
-		return 0;
+		return false;
 	if (!svcxdr_decode_lock(xdr, &argp->lock))
-		return 0;
+		return false;
 	if (exclusive)
 		argp->lock.fl.fl_type = F_WRLCK;

-	return 1;
+	return true;
 }

-int
-nlmsvc_decode_lockargs(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_lockargs(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
 	struct nlm_args *argp = rqstp->rq_argp;
 	u32 exclusive;

 	if (!svcxdr_decode_cookie(xdr, &argp->cookie))
-		return 0;
+		return false;
 	if (xdr_stream_decode_bool(xdr, &argp->block) < 0)
-		return 0;
+		return false;
 	if (xdr_stream_decode_bool(xdr, &exclusive) < 0)
-		return 0;
+		return false;
 	if (!svcxdr_decode_lock(xdr, &argp->lock))
-		return 0;
+		return false;
 	if (exclusive)
 		argp->lock.fl.fl_type = F_WRLCK;
 	if (xdr_stream_decode_bool(xdr, &argp->reclaim) < 0)
-		return 0;
+		return false;
 	if (xdr_stream_decode_u32(xdr, &argp->state) < 0)
-		return 0;
+		return false;
 	argp->monitor = 1;		/* monitor client by default */

-	return 1;
+	return true;
 }

-int
-nlmsvc_decode_cancargs(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_cancargs(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
 	struct nlm_args *argp = rqstp->rq_argp;
 	u32 exclusive;

 	if (!svcxdr_decode_cookie(xdr, &argp->cookie))
-		return 0;
+		return false;
 	if (xdr_stream_decode_bool(xdr, &argp->block) < 0)
-		return 0;
+		return false;
 	if (xdr_stream_decode_bool(xdr, &exclusive) < 0)
-		return 0;
+		return false;
 	if (!svcxdr_decode_lock(xdr, &argp->lock))
-		return 0;
+		return false;
 	if (exclusive)
 		argp->lock.fl.fl_type = F_WRLCK;

-	return 1;
+	return true;
 }

-int
-nlmsvc_decode_unlockargs(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_unlockargs(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
 	struct nlm_args *argp = rqstp->rq_argp;

 	if (!svcxdr_decode_cookie(xdr, &argp->cookie))
-		return 0;
+		return false;
 	if (!svcxdr_decode_lock(xdr, &argp->lock))
-		return 0;
+		return false;
 	argp->lock.fl.fl_type = F_UNLCK;

-	return 1;
+	return true;
 }

-int
-nlmsvc_decode_res(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_res(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
 	struct nlm_res *resp = rqstp->rq_argp;

 	if (!svcxdr_decode_cookie(xdr, &resp->cookie))
-		return 0;
+		return false;
 	if (!svcxdr_decode_stats(xdr, &resp->status))
-		return 0;
+		return false;

-	return 1;
+	return true;
 }

-int
-nlmsvc_decode_reboot(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_reboot(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
 	struct nlm_reboot *argp = rqstp->rq_argp;
+	__be32 *p;
 	u32 len;

 	if (xdr_stream_decode_u32(xdr, &len) < 0)
-		return 0;
+		return false;
 	if (len > SM_MAXSTRLEN)
-		return 0;
+		return false;
 	p = xdr_inline_decode(xdr, len);
 	if (!p)
-		return 0;
+		return false;
 	argp->len = len;
 	argp->mon = (char *)p;
 	if (xdr_stream_decode_u32(xdr, &argp->state) < 0)
-		return 0;
+		return false;
 	p = xdr_inline_decode(xdr, SM_PRIV_SIZE);
 	if (!p)
-		return 0;
+		return false;
 	memcpy(&argp->priv.data, p, sizeof(argp->priv.data));

-	return 1;
+	return true;
 }

-int
-nlmsvc_decode_shareargs(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_shareargs(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
 	struct nlm_args *argp = rqstp->rq_argp;
 	struct nlm_lock	*lock = &argp->lock;

@@ -284,35 +278,34 @@ nlmsvc_decode_shareargs(struct svc_rqst *rqstp, __be32 *p)
 	lock->svid = ~(u32)0;

 	if (!svcxdr_decode_cookie(xdr, &argp->cookie))
-		return 0;
+		return false;
 	if (!svcxdr_decode_string(xdr, &lock->caller, &lock->len))
-		return 0;
+		return false;
 	if (!svcxdr_decode_fhandle(xdr, &lock->fh))
-		return 0;
+		return false;
 	if (!svcxdr_decode_owner(xdr, &lock->oh))
-		return 0;
+		return false;
 	/* XXX: Range checks are missing in the original code */
 	if (xdr_stream_decode_u32(xdr, &argp->fsm_mode) < 0)
-		return 0;
+		return false;
 	if (xdr_stream_decode_u32(xdr, &argp->fsm_access) < 0)
-		return 0;
+		return false;

-	return 1;
+	return true;
 }

-int
-nlmsvc_decode_notify(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_decode_notify(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
 	struct nlm_args *argp = rqstp->rq_argp;
 	struct nlm_lock	*lock = &argp->lock;

 	if (!svcxdr_decode_string(xdr, &lock->caller, &lock->len))
-		return 0;
+		return false;
 	if (xdr_stream_decode_u32(xdr, &argp->state) < 0)
-		return 0;
+		return false;

-	return 1;
+	return true;
 }


@@ -320,45 +313,42 @@ nlmsvc_decode_notify(struct svc_rqst *rqstp, __be32 *p)
  * Encode Reply results
  */

-int
-nlmsvc_encode_void(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_encode_void(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	return 1;
+	return true;
 }

-int
-nlmsvc_encode_testres(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_encode_testres(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_res_stream;
 	struct nlm_res *resp = rqstp->rq_resp;

 	return svcxdr_encode_cookie(xdr, &resp->cookie) &&
 		svcxdr_encode_testrply(xdr, resp);
 }

-int
-nlmsvc_encode_res(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_encode_res(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_res_stream;
 	struct nlm_res *resp = rqstp->rq_resp;

 	return svcxdr_encode_cookie(xdr, &resp->cookie) &&
 		svcxdr_encode_stats(xdr, resp->status);
 }

-int
-nlmsvc_encode_shareres(struct svc_rqst *rqstp, __be32 *p)
+bool
+nlmsvc_encode_shareres(struct svc_rqst *rqstp, struct xdr_stream *xdr)
 {
-	struct xdr_stream *xdr = &rqstp->rq_res_stream;
 	struct nlm_res *resp = rqstp->rq_resp;

 	if (!svcxdr_encode_cookie(xdr, &resp->cookie))
-		return 0;
+		return false;
 	if (!svcxdr_encode_stats(xdr, resp->status))
-		return 0;
+		return false;
 	/* sequence */
 	if (xdr_stream_encode_u32(xdr, 0) < 0)
-		return 0;
+		return false;

-	return 1;
+	return true;
 }
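
The decoders in this file share one pattern: read a field from the
stream, return false as soon as a read fails or a length is out of
range, and return true only after every field has been consumed.
nlmsvc_decode_reboot() shows the length-checked variant.  The sketch
below reproduces that pattern in user space under stated assumptions:
struct cursor, cursor_take(), decode_u32(), decode_name() and
MAX_NAME_LEN are hypothetical names invented for this illustration (the
cap merely stands in for SM_MAXSTRLEN), and the code is not the kernel's
xdr_stream implementation.  It does follow the XDR convention of
big-endian 32-bit words and of padding opaque data to a multiple of four
bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on the name length; stands in for SM_MAXSTRLEN. */
#define MAX_NAME_LEN 1024

/* Hypothetical bounded read cursor; stands in for struct xdr_stream. */
struct cursor {
	const uint8_t *pos;
	const uint8_t *end;
};

/*
 * Return a pointer to the next nbytes of input and advance past them,
 * including XDR padding up to a multiple of four.  Return NULL if the
 * buffer is too short.
 */
const uint8_t *cursor_take(struct cursor *c, size_t nbytes)
{
	size_t padded = (nbytes + 3) & ~(size_t)3;
	const uint8_t *p = c->pos;

	if (padded < nbytes || padded > (size_t)(c->end - c->pos))
		return NULL;
	c->pos += padded;
	return p;
}

/* Decode one big-endian 32-bit word. */
bool decode_u32(struct cursor *c, uint32_t *val)
{
	const uint8_t *p = cursor_take(c, 4);

	if (!p)
		return false;
	*val = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return true;
}

struct name {
	const uint8_t *data;	/* points into the input buffer */
	uint32_t len;
};

/* Decode a length-prefixed name, rejecting oversized or truncated input. */
bool decode_name(struct cursor *c, struct name *out)
{
	const uint8_t *p;
	uint32_t len;

	if (!decode_u32(c, &len))
		return false;
	if (len > MAX_NAME_LEN)
		return false;
	p = cursor_take(c, len);
	if (!p)
		return false;
	out->data = p;
	out->len = len;
	return true;
}
```

As in the kernel decoder, the length is validated before any data is
touched, and the decoded name is a pointer into the receive buffer
rather than a copy.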