Commit 9ae440b8 authored by Jakub Kicinski's avatar Jakub Kicinski
Browse files

Merge branch 'splice-net-switch-over-users-of-sendpage-and-remove-it'

David Howells says:

====================
splice, net: Switch over users of sendpage() and remove it

Here's the final set of patches towards the removal of sendpage.  All the
drivers that use sendpage() get switched over to using sendmsg() with
MSG_SPLICE_PAGES.

The following changes are made:

 (1) Make the protocol drivers behave according to MSG_MORE, not
     MSG_SENDPAGE_NOTLAST.  The latter is restricted to turning on MSG_MORE
     in the sendpage() wrappers.

 (2) Fix ocfs2 to allocate its global protocol buffers with folio_alloc()
     rather than kzalloc() so as not to invoke the !sendpage_ok warning in
     skb_splice_from_iter().

 (3) Make ceph/rds, skb_send_sock, dlm, nvme, smc, ocfs2, drbd and iscsi
     use sendmsg(), not sendpage and make them specify MSG_MORE instead of
     MSG_SENDPAGE_NOTLAST.

 (4) Kill off sendpage and clean up MSG_SENDPAGE_NOTLAST.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6 # part 1
Link: https://lore.kernel.org/r/20230616161301.622169-1-dhowells@redhat.com/ # v1
Link: https://lore.kernel.org/r/20230617121146.716077-1-dhowells@redhat.com/ # v2
Link: https://lore.kernel.org/r/20230620145338.1300897-1-dhowells@redhat.com/ # v3
====================

Link: https://lore.kernel.org/r/20230623225513.2732256-1-dhowells@redhat.com


Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
parents b545a13c b848b26c
Loading
Loading
Loading
Loading
+5 −5
Original line number Diff line number Diff line
@@ -240,11 +240,11 @@ offsets into ``msg``, respectively.
If a program of type ``BPF_PROG_TYPE_SK_MSG`` is run on a ``msg`` it can only
parse data that the (``data``, ``data_end``) pointers have already consumed.
For ``sendmsg()`` hooks this is likely the first scatterlist element. But for
calls relying on the ``sendpage`` handler (e.g., ``sendfile()``) this will be
the range (**0**, **0**) because the data is shared with user space and by
default the objective is to avoid allowing user space to modify data while (or
after) BPF verdict is being decided. This helper can be used to pull in data
and to set the start and end pointers to given values. Data will be copied if
calls relying on MSG_SPLICE_PAGES (e.g., ``sendfile()``) this will be the
range (**0**, **0**) because the data is shared with user space and by default
the objective is to avoid allowing user space to modify data while (or after)
BPF verdict is being decided. This helper can be used to pull in data and to
set the start and end pointers to given values. Data will be copied if
necessary (i.e., if data was not linear and if start and end pointers do not
point to the same chunk).

+0 −2
Original line number Diff line number Diff line
@@ -521,8 +521,6 @@ prototypes::
	int (*fsync) (struct file *, loff_t start, loff_t end, int datasync);
	int (*fasync) (int, struct file *, int);
	int (*lock) (struct file *, int, struct file_lock *);
	ssize_t (*sendpage) (struct file *, struct page *, int, size_t,
			loff_t *, int);
	unsigned long (*get_unmapped_area)(struct file *, unsigned long,
			unsigned long, unsigned long, unsigned long);
	int (*check_flags)(int);
+0 −1
Original line number Diff line number Diff line
@@ -1086,7 +1086,6 @@ This describes how the VFS can manipulate an open file. As of kernel
		int (*fsync) (struct file *, loff_t, loff_t, int datasync);
		int (*fasync) (int, struct file *, int);
		int (*lock) (struct file *, int, struct file_lock *);
		ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
		unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
		int (*check_flags)(int);
		int (*flock) (struct file *, int, struct file_lock *);
+2 −2
Original line number Diff line number Diff line
@@ -269,8 +269,8 @@ a single application thread handles flows with many different flow hashes.
rps_sock_flow_table is a global flow table that contains the *desired* CPU
for flows: the CPU that is currently processing the flow in userspace.
Each table value is a CPU index that is updated during calls to recvmsg
and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
and tcp_splice_read()).
and sendmsg (specifically, inet_recvmsg(), inet_sendmsg() and
tcp_splice_read()).

When the scheduler moves a thread to a new CPU while it has outstanding
receive packets on the old CPU, packets may arrive out of order. To
+0 −28
Original line number Diff line number Diff line
@@ -482,7 +482,6 @@ static const struct proto_ops alg_proto_ops = {
	.listen		=	sock_no_listen,
	.shutdown	=	sock_no_shutdown,
	.mmap		=	sock_no_mmap,
	.sendpage	=	sock_no_sendpage,
	.sendmsg	=	sock_no_sendmsg,
	.recvmsg	=	sock_no_recvmsg,

@@ -1106,33 +1105,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
EXPORT_SYMBOL_GPL(af_alg_sendmsg);

/**
 * af_alg_sendpage - sendpage system call handler
 * @sock: socket of connection to user space to write to
 * @page: data to send
 * @offset: offset into page to begin sending
 * @size: length of data
 * @flags: message send/receive flags
 *
 * This is a generic implementation of sendpage to fill ctx->tsgl_list.
 */
ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
			int offset, size_t size, int flags)
{
	struct bio_vec bvec;
	struct msghdr msg = {
		.msg_flags = flags | MSG_SPLICE_PAGES,
	};

	if (flags & MSG_SENDPAGE_NOTLAST)
		msg.msg_flags |= MSG_MORE;

	bvec_set_page(&bvec, page, size, offset);
	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
	return sock_sendmsg(sock, &msg);
}
EXPORT_SYMBOL_GPL(af_alg_sendpage);

/**
 * af_alg_free_resources - release resources required for crypto request
 * @areq: Request holding the TX and RX SGL
Loading