  1. Dec 19, 2023
  2. Dec 18, 2023
  3. Dec 17, 2023
    • Merge branch 'skb-coalescing-page_pool' · 3a3af3ae
      David S. Miller authored
      
      
      Liang Chen says:
      
      ====================
      skbuff: Optimize SKB coalescing for page pool
      
      The following combination of conditions was excluded from skb coalescing:
      
      from->pp_recycle = 1
      from->cloned = 1
      to->pp_recycle = 1
      
      With page pool in use, this combination can be quite common (e.g.
      NetworkManager may cause an additional packet_type to be registered,
      which leads to cloning). In scenarios with a high number of small
      packets, it can significantly reduce the success rate of coalescing.
      
      This patchset aims to optimize this scenario and enable coalescing of this
      particular combination. That also involves supporting multiple users
      referencing the same fragment of a pp page, to accommodate the need to
      increment the "from" SKB page's pp page reference count.
      
      Changes from v10:
      - re-number patches to 1/3, 2/3, 3/3
      
      Changes from v9:
      - patch 1 was already applied
      - improve description for patch 2
      - make sure skb_pp_frag_ref only works for pp aware skbs
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • skbuff: Optimization of SKB coalescing for page pool · f7dc3248
      Liang Chen authored
      
      In order to address the issues encountered with commit 1effe8ca
      ("skbuff: fix coalescing for page_pool fragment recycling"), the
      following combination of conditions was excluded from skb coalescing:
      
      from->pp_recycle = 1
      from->cloned = 1
      to->pp_recycle = 1
      
      However, in page pool environments, the aforementioned combination can
      be quite common (e.g. NetworkManager may cause an additional
      packet_type to be registered, which leads to cloning). In scenarios with a
      high number of small packets, it can significantly reduce the success
      rate of coalescing. For example, for packets of 256 bytes,
      our comparison of coalescing success rates is as follows:
      
      Without page pool: 70%
      With page pool: 13%
      
      Consequently, this has an impact on performance:
      
      Without page pool: 2.57 Gbits/sec
      With page pool: 2.26 Gbits/sec
      
      Therefore, it seems worthwhile to optimize this scenario and enable
      coalescing of this particular combination. To achieve this, we need to
      ensure the correct increment of the "from" SKB page's page pool
      reference count (pp_ref_count).
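
The change in the coalescing condition described above can be sketched as a small userspace model. This is not the kernel code: `struct skb_model`, `can_coalesce_before()`, `can_coalesce_after()` and the `take_pp_ref` out-parameter are hypothetical names for illustration, and only the two flags named in the commit message (`pp_recycle`, `cloned`) are modeled.

```c
#include <stdbool.h>

/* Userspace model only: the two flags named in the commit message. */
struct skb_model {
	bool pp_recycle;	/* skb pages come from a page pool */
	bool cloned;		/* skb data is shared with a clone */
};

/* Before: a cloned page-pool "from" skb was never coalesced. */
bool can_coalesce_before(const struct skb_model *to,
			 const struct skb_model *from)
{
	if (to->pp_recycle != from->pp_recycle)
		return false;
	if (from->pp_recycle && from->cloned)
		return false;
	return true;
}

/*
 * After: only mixing page-pool and non-page-pool skbs is rejected.
 * *take_pp_ref reports whether the caller would have to take an extra
 * page pool reference on each "from" fragment (hypothetical parameter).
 */
bool can_coalesce_after(const struct skb_model *to,
			const struct skb_model *from,
			bool *take_pp_ref)
{
	*take_pp_ref = false;
	if (to->pp_recycle != from->pp_recycle)
		return false;
	*take_pp_ref = from->pp_recycle && from->cloned;
	return true;
}
```

In this model, the combination listed above (both skbs with `pp_recycle = 1` and a cloned "from") is rejected by the first function and accepted by the second, with `take_pp_ref` set so that the fragment references stay balanced.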
      
      Following this optimization, the success rate of coalescing measured in
      our environment has improved as follows:
      
      With page pool: 60%
      
      This success rate is approaching the rate achieved without using page
      pool, and the performance has also been improved:
      
      With page pool: 2.52 Gbits/sec
      
      Below is the performance comparison for small packets before and after
      this optimization. We observe no impact to packets larger than 4K.
      
      packet size     before      after       improved
      (bytes)         (Gbits/sec) (Gbits/sec)
      128             1.19        1.27        7.13%
      256             2.26        2.52        11.75%
      512             4.13        4.81        16.50%
      1024            6.17        6.73        9.05%
      2048            14.54       15.47       6.45%
      4096            25.44       27.87       9.52%
      
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Suggested-by: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Mina Almasry <almasrymina@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • skbuff: Add a function to check if a page belongs to page_pool · 8cfa2dee
      Liang Chen authored
      
      
      Wrap code for checking if a page is a page_pool page into a
      function for better readability and ease of reuse.
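
A minimal userspace sketch of such a helper is shown below. It assumes the check compares a per-page magic word against a signature with the low two bits masked off; the names `struct page_model`, `PP_SIGNATURE_MODEL` and `is_pp_page_model()`, as well as the signature value, are placeholders for illustration and not the kernel definitions.

```c
#include <stdbool.h>

/* Placeholder signature value, chosen only for this model. */
#define PP_SIGNATURE_MODEL 0x40UL

/* Stand-in for the one page field this check needs. */
struct page_model {
	unsigned long pp_magic;
};

/*
 * Returns true if the page carries the page pool signature.
 * The low two bits are ignored (assumed to be reserved for other uses).
 */
bool is_pp_page_model(const struct page_model *page)
{
	return (page->pp_magic & ~0x3UL) == PP_SIGNATURE_MODEL;
}
```

Wrapping the comparison in one function means every caller applies the same mask, which is the readability and reuse benefit the commit message refers to.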
      
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Reviewed-by: Mina Almasry <almarsymina@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • page_pool: halve BIAS_MAX for multiple user references of a fragment · aaf153ae
      Liang Chen authored
      
      
      Up to now, we were only subtracting from the number of used page fragments
      to figure out when a page could be freed or recycled. A following patch
      introduces support for multiple users referencing the same fragment. So
      reduce the initial page fragments value to half to avoid overflowing.
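
The overflow concern can be illustrated with a small sketch. It assumes the fragment counter is a signed `long` and that the previous initial value was `LONG_MAX`; `BIAS_MAX_OLD`, `BIAS_MAX_NEW` and `bias_has_headroom()` are hypothetical names used only for this example.

```c
#include <limits.h>
#include <stdbool.h>

#define BIAS_MAX_OLD	LONG_MAX	/* assumed previous initial value */
#define BIAS_MAX_NEW	(LONG_MAX >> 1)	/* halved initial value */

/*
 * Returns true if extra_refs additional references (extra_refs >= 0)
 * can be added on top of the initial bias without overflowing a long.
 */
bool bias_has_headroom(long bias, long extra_refs)
{
	return extra_refs <= LONG_MAX - bias;
}
```

With the old value, a single increment would already overflow; with the halved value, roughly as many increments as the bias itself fit before the counter reaches `LONG_MAX`.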
      
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Reviewed-by: Mina Almasry <almarsymina@google.com>
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tcp-ao-selftests' · 66fe8963
      David S. Miller authored
      
      
      Dmitry Safonov says:
      
      ====================
      selftests/net: Add TCP-AO tests
      
      An essential part of any big kernel submission is selftests.
      At the beginning of TCP-AO project, I made patches to fcnal-test.sh
      and nettest.c to have the benefits of easy refactoring, early noticing
      breakages, putting a moat around the code, documenting
      and designing uAPI.
      
      While tests based on fcnal-test.sh/nettest.c provided initial testing*
      and were very easy to add, the pile of TCP-AO quickly grew out of
      one-binary + shell-script testing.
      
      The design of the TCP-AO testing is a bit different than one-big
      selftest binary as I did previously in net/ipsec.c. I found it
      beneficial to avoid implementing a tests runner/scheduler and delegate
      it to the user or Makefile. The approach is very influenced
      by CRIU/ZDTM testing[1]: it provides a static library with helper
      functions and selftest binaries that create specific scenarios.
      I also tried to utilize kselftest.h.
      
      The test_init() function does all needed preparations. To not leave
      any traces after a selftest exits, it creates a network namespace
      and, if the test wants to establish a TCP connection, a child netns.
      The parent and child netns have a veth pair with proper IP addresses
      and routes set up. Both peers, the client and the server, are different
      pthreads. The threading model was chosen over forking mostly for ease
      of cleanup on a failure: no need to search for children, handle SIGCHLD,
      make sure not to wait for a dead peer to perform anything, etc.
      Any thread that does exit() naturally kills the tests, sweet!
      The selftests are compiled currently in two variants: ipv4 and ipv6.
      Ipv4-mapped-ipv6 addresses might be a third variant to add, but it's not
      there in this version. As pretty much all tests are shared between two
      address families, most of the code can be shared, too. To distinguish in
      code which kind of test is running, the Makefile supplies -DIPV6_TEST to
      the compiler, and ifdeffery in tests can do things that have to differ
      between address families. This is similar to TARGETS_C_BOTHBITS in x86
      selftests and also to test code sharing in CRIU/ZDTM.
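
A minimal sketch of how one source file can serve both address families under -DIPV6_TEST follows. Only the IPV6_TEST macro comes from the description above; `TEST_FAMILY`, `test_sockaddr_t` and `test_loopback_addr()` are hypothetical names and are not taken from the selftest library.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Select the address family at compile time, as -DIPV6_TEST would. */
#ifdef IPV6_TEST
#define TEST_FAMILY	AF_INET6
typedef struct sockaddr_in6 test_sockaddr_t;
#else
#define TEST_FAMILY	AF_INET
typedef struct sockaddr_in test_sockaddr_t;
#endif

/* Fill *addr with the loopback address and port; returns its length. */
socklen_t test_loopback_addr(test_sockaddr_t *addr, unsigned short port)
{
	memset(addr, 0, sizeof(*addr));
#ifdef IPV6_TEST
	addr->sin6_family = AF_INET6;
	addr->sin6_port = htons(port);
	addr->sin6_addr = in6addr_loopback;
#else
	addr->sin_family = AF_INET;
	addr->sin_port = htons(port);
	addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
#endif
	return sizeof(*addr);
}
```

Building the same file twice, once with and once without -DIPV6_TEST, would yield the two variants while the test logic that calls `test_loopback_addr()` stays shared.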
      
      The total number of tests is 832.
      Of them, rst_ipv{4,6} currently has one flaky subtest that may fail:
      > not ok 9 client connection was not reset: 0
      I'll investigate what happens there. Also, unsigned-md5_ipv{4,6}
      are flaky because of netns counter checks: it doesn't expect that
      there may be retransmitted TCP segments from a previous sub-selftest.
      That will be fixed. Besides, key-management_ipv{4,6} has 3 sub-tests
      passing with XFAIL:
      > ok 15 # XFAIL listen() after current/rnext keys set: the socket has current/rnext keys: 100:200
      > ok 16 # XFAIL listen socket, delete current key from before listen(): failed to delete the key 100:100 -16
      > ok 17 # XFAIL listen socket, delete rnext key from before listen(): failed to delete the key 200:200 -16
      ...
      > # Totals: pass:117 fail:0 xfail:3 xpass:0 skip:0 error:0
      Those need some more kernel work to pass instead of xfail.
      
      The overview of selftests (see the diffstat at the bottom):
      ├── lib
      │   ├── aolib.h
      │   │   The header for all selftests to include.
      │   ├── kconfig.c
      │   │   Kernel kconfig detector to SKIP tests that depend on something.
      │   ├── netlink.c
      │   │   Netlink helper to add/modify/delete VETH/IPs/routes/VRFs
      │   │   I considered just using libmnl, but this is around 400 lines
      │   │   and avoids selftests dependency on out-of-tree sources/packages.
      │   ├── proc.c
      │   │   SNMP/netstat procfs parser and the counters comparator.
      │   ├── repair.c
      │   │   Heavily influenced by libsoccr and reduced to minimum TCP
      │   │   socket checkpoint/repair. Shouldn't be used out of selftests,
      │   │   though.
      │   ├── setup.c
      │   │   All the needed netns/veth/ips/etc preparations for test init.
      │   ├── sock.c
      │   │   Socket helpers: {s,g}etsockopt()s/connect()/listen()/etc.
      │   └── utils.c
      │       Random stuff (a pun intended).
      ├── bench-lookups.c
      │   The only benchmark in selftests currently: checks how well TCP-AO
      │   setsockopt()s perform, depending on the amount of keys on a socket.
      ├── connect.c
      │   Trivial sample, can be used as a boilerplate to write a new test.
      ├── connect-deny.c
      │   More-or-less what could be expected for TCP-AO in fcnal-test.sh
      ├── icmps-accept.c -> icmps-discard.c
      ├── icmps-discard.c
      │   Verifies RFC5925 (7.8) by checking that TCP-AO connection can be
      │   broken if ICMPs are accepted and survives when ::accept_icmps = 0
      ├── key-management.c
      │   Key manipulations, rotations between randomized hashing algorithms
      │   and counter checks for those scenarios.
      ├── restore.c
      │   TCP_AO_REPAIR: verifies that a socket can be re-created without
      │   TCP-AO connection being interrupted.
      ├── rst.c
      │   As RST segments are signed on a separate code-path in kernel,
      │   verifies passive/active TCP send_reset().
      ├── self-connect.c
      │   Verifies that TCP self-connect and also simultaneous open work.
      ├── seq-ext.c
      │   Utilizes TCP_AO_REPAIR to check that on SEQ roll-over SNE
      │   increment is performed and segments with different SNEs fail to
      │   pass verification.
      ├── setsockopt-closed.c
      │   Checks that {s,g}etsockopt()s are extendable syscalls and common
      │   error-paths for them.
      └── unsigned-md5.c
          Checks listen() socket for (non-)matching peers with: AO/MD5/none
          keys. As well as their interaction with VRFs and AO_REQUIRED flag.
      
      There are certainly more test scenarios that can be added, but even so,
      I'm pretty happy that this much of TCP-AO functionality and uAPIs got
      covered. These selftests were iteratively developed by me during TCP-AO
      kernel upstreaming and the resulting kernel patches would have been
      worse without having these tests. They provided the user-side
      perspective but also allowed safer refactoring with less possibility
      of introducing a regression. Now it's time to use them to dig
      a moat around the TCP-AO code!
      
      There are also people from other network companies that work on TCP-AO
      (+testing), so sharing these selftests will allow them to contribute
      and may benefit from their efforts.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: Add TCP-AO key-management test · 3c3ead55
      Dmitry Safonov authored
      
      
      Check multiple keys on a socket:
      - rotation on closed socket
      - current/rnext operations shouldn't be possible on listen sockets
      - current/rnext key set should be the one, that's used on connect()
      - key rotations with pseudo-random generated keys
      - copying matching keys on connect() and on accept()
      
      At this moment there are 3 tests that are "expected" to fail: a kernel
      fix is needed to improve the situation, they are marked XFAIL.
      
      Sample output:
      > # ./key-management_ipv4
      > 1..120
      > # 1601[lib/setup.c:239] rand seed 1700526653
      > TAP version 13
      > ok 1 closed socket, delete a key: the key was deleted
      > ok 2 closed socket, delete all keys: the key was deleted
      > ok 3 closed socket, delete current key: key deletion was prevented
      > ok 4 closed socket, delete rnext key: key deletion was prevented
      > ok 5 closed socket, delete a key + set current/rnext: the key was deleted
      > ok 6 closed socket, force-delete current key: the key was deleted
      > ok 7 closed socket, force-delete rnext key: the key was deleted
      > ok 8 closed socket, delete current+rnext key: key deletion was prevented
      > ok 9 closed socket, add + change current key
      > ok 10 closed socket, add + change rnext key
      > ok 11 listen socket, delete a key: the key was deleted
      > ok 12 listen socket, delete all keys: the key was deleted
      > ok 13 listen socket, setting current key not allowed
      > ok 14 listen socket, setting rnext key not allowed
      > ok 15 # XFAIL listen() after current/rnext keys set: the socket has current/rnext keys: 100:200
      > ok 16 # XFAIL listen socket, delete current key from before listen(): failed to delete the key 100:100 -16
      > ok 17 # XFAIL listen socket, delete rnext key from before listen(): failed to delete the key 200:200 -16
      > ok 18 listen socket, getsockopt(TCP_AO_REPAIR) is restricted
      > ok 19 listen socket, setsockopt(TCP_AO_REPAIR) is restricted
      > ok 20 listen socket, delete a key + set current/rnext: key deletion was prevented
      > ok 21 listen socket, force-delete current key: key deletion was prevented
      > ok 22 listen socket, force-delete rnext key: key deletion was prevented
      > ok 23 listen socket, delete a key: the key was deleted
      > ok 24 listen socket, add + change current key
      > ok 25 listen socket, add + change rnext key
      > ok 26 server: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations
      > ok 27 client: Check current/rnext keys unset before connect(): current key 19 as expected
      > ok 28 client: Check current/rnext keys unset before connect(): rnext key 146 as expected
      > ok 29 server: Check current/rnext keys unset before connect(): server alive
      > ok 30 server: Check current/rnext keys unset before connect(): passed counters checks
      > ok 31 client: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations
      > ok 32 server: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations
      > ok 33 server: Check current/rnext keys unset before connect(): passed counters checks
      > ok 34 client: Check current/rnext keys unset before connect(): passed counters checks
      > ok 35 server: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 36 server: Check current/rnext keys set before connect(): server alive
      > ok 37 server: Check current/rnext keys set before connect(): passed counters checks
      > ok 38 client: Check current/rnext keys set before connect(): current key 10 as expected
      > ok 39 client: Check current/rnext keys set before connect(): rnext key 137 as expected
      > ok 40 server: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 41 client: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 42 client: Check current/rnext keys set before connect(): passed counters checks
      > ok 43 server: Check current/rnext keys set before connect(): passed counters checks
      > ok 44 server: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 45 server: Check current != rnext keys set before connect(): server alive
      > ok 46 server: Check current != rnext keys set before connect(): passed counters checks
      > ok 47 client: Check current != rnext keys set before connect(): current key 10 as expected
      > ok 48 client: Check current != rnext keys set before connect(): rnext key 132 as expected
      > ok 49 server: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 50 client: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 51 client: Check current != rnext keys set before connect(): passed counters checks
      > ok 52 server: Check current != rnext keys set before connect(): passed counters checks
      > ok 53 server: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations
      > ok 54 server: Check current flapping back on peer's RnextKey request: server alive
      > ok 55 server: Check current flapping back on peer's RnextKey request: passed counters checks
      > ok 56 client: Check current flapping back on peer's RnextKey request: current key 10 as expected
      > ok 57 client: Check current flapping back on peer's RnextKey request: rnext key 132 as expected
      > ok 58 server: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations
      > ok 59 client: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations
      > ok 60 server: Check current flapping back on peer's RnextKey request: passed counters checks
      > ok 61 client: Check current flapping back on peer's RnextKey request: passed counters checks
      > ok 62 server: Rotate over all different keys: The socket keys are consistent with the expectations
      > ok 63 server: Rotate over all different keys: server alive
      > ok 64 server: Rotate over all different keys: passed counters checks
      > ok 65 server: Rotate over all different keys: current key 128 as expected
      > ok 66 client: Rotate over all different keys: rnext key 128 as expected
      > ok 67 server: Rotate over all different keys: current key 129 as expected
      > ok 68 client: Rotate over all different keys: rnext key 129 as expected
      > ok 69 server: Rotate over all different keys: current key 130 as expected
      > ok 70 client: Rotate over all different keys: rnext key 130 as expected
      > ok 71 server: Rotate over all different keys: current key 131 as expected
      > ok 72 client: Rotate over all different keys: rnext key 131 as expected
      > ok 73 server: Rotate over all different keys: current key 132 as expected
      > ok 74 client: Rotate over all different keys: rnext key 132 as expected
      > ok 75 server: Rotate over all different keys: current key 133 as expected
      > ok 76 client: Rotate over all different keys: rnext key 133 as expected
      > ok 77 server: Rotate over all different keys: current key 134 as expected
      > ok 78 client: Rotate over all different keys: rnext key 134 as expected
      > ok 79 server: Rotate over all different keys: current key 135 as expected
      > ok 80 client: Rotate over all different keys: rnext key 135 as expected
      > ok 81 server: Rotate over all different keys: current key 136 as expected
      > ok 82 client: Rotate over all different keys: rnext key 136 as expected
      > ok 83 server: Rotate over all different keys: current key 137 as expected
      > ok 84 client: Rotate over all different keys: rnext key 137 as expected
      > ok 85 server: Rotate over all different keys: current key 138 as expected
      > ok 86 client: Rotate over all different keys: rnext key 138 as expected
      > ok 87 server: Rotate over all different keys: current key 139 as expected
      > ok 88 client: Rotate over all different keys: rnext key 139 as expected
      > ok 89 server: Rotate over all different keys: current key 140 as expected
      > ok 90 client: Rotate over all different keys: rnext key 140 as expected
      > ok 91 server: Rotate over all different keys: current key 141 as expected
      > ok 92 client: Rotate over all different keys: rnext key 141 as expected
      > ok 93 server: Rotate over all different keys: current key 142 as expected
      > ok 94 client: Rotate over all different keys: rnext key 142 as expected
      > ok 95 server: Rotate over all different keys: current key 143 as expected
      > ok 96 client: Rotate over all different keys: rnext key 143 as expected
      > ok 97 server: Rotate over all different keys: current key 144 as expected
      > ok 98 client: Rotate over all different keys: rnext key 144 as expected
      > ok 99 server: Rotate over all different keys: current key 145 as expected
      > ok 100 client: Rotate over all different keys: rnext key 145 as expected
      > ok 101 server: Rotate over all different keys: current key 146 as expected
      > ok 102 client: Rotate over all different keys: rnext key 146 as expected
      > ok 103 server: Rotate over all different keys: current key 127 as expected
      > ok 104 client: Rotate over all different keys: rnext key 127 as expected
      > ok 105 client: Rotate over all different keys: current key 0 as expected
      > ok 106 client: Rotate over all different keys: rnext key 127 as expected
      > ok 107 server: Rotate over all different keys: The socket keys are consistent with the expectations
      > ok 108 client: Rotate over all different keys: The socket keys are consistent with the expectations
      > ok 109 client: Rotate over all different keys: passed counters checks
      > ok 110 server: Rotate over all different keys: passed counters checks
      > ok 111 server: Check accept() => established key matching: The socket keys are consistent with the expectations
      > ok 112 Can't add a key with non-matching ip-address for established sk
      > ok 113 Can't add a key with non-matching VRF for established sk
      > ok 114 server: Check accept() => established key matching: server alive
      > ok 115 server: Check accept() => established key matching: passed counters checks
      > ok 116 client: Check connect() => established key matching: current key 0 as expected
      > ok 117 client: Check connect() => established key matching: rnext key 128 as expected
      > ok 118 client: Check connect() => established key matching: The socket keys are consistent with the expectations
      > ok 119 server: Check accept() => established key matching: The socket keys are consistent with the expectations
      > ok 120 server: Check accept() => established key matching: passed counters checks
      > # Totals: pass:120 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: Add TCP-AO selfconnect/simultaneous connect test · 8c4e8dd0
      Dmitry Safonov authored
      
      
      Check that a rarely used TCP feature named self-connect works with
      TCP-AO. Under the hood, this also checks TCP simultaneous connect
      (TCP_SYN_RECV socket state), which would be harder to check in other ways.
      
      In order to verify that it's indeed TCP simultaneous connect, check
      the counters TCPChallengeACK and TCPSYNChallenge.
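
For readers unfamiliar with self-connect, the following is a minimal sketch of plain TCP self-connect over IPv4 loopback, without TCP-AO keys and without the counter checks. `tcp_self_connect()` is a hypothetical helper written for this example and is not part of the selftest library; it assumes a Linux host with the loopback interface up.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Bind a TCP socket to an ephemeral loopback port and connect it to its
 * own address. Returns the connected fd, or -1 on error.
 */
int tcp_self_connect(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int sk = socket(AF_INET, SOCK_STREAM, 0);

	if (sk < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	/* Learn the assigned port, then connect the socket to itself. */
	if (bind(sk, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(sk, (struct sockaddr *)&addr, &len) < 0 ||
	    connect(sk, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(sk);
		return -1;
	}
	return sk;
}
```

Because the socket's own SYN arrives while it is still connecting, the connection is established through the simultaneous-open path; data written to the returned fd can then be read back from the same fd.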
      
      Sample of the output:
      > # ./self-connect_ipv6
      > 1..4
      > # 1738[lib/setup.c:254] rand seed 1696451931
      > TAP version 13
      > ok 1 self-connect(same keyids): connect TCPAOGood 0 => 24
      > ok 2 self-connect(different keyids): connect TCPAOGood 26 => 50
      > ok 3 self-connect(restore): connect TCPAOGood 52 => 97
      > ok 4 self-connect(restore, different keyids): connect TCPAOGood 99 => 144
      > # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: Add TCP-AO RST test · c6df7b23
      Dmitry Safonov authored
      
      
      Check that both active and passive resets work and correctly sign
      segments with TCP-AO, or don't send RSTs if it is not possible to sign.
      A listening socket with backlog = 0 gets one connection in the accept
      queue and another in the syn queue. Once the server/listener socket is
      forcibly closed, client sockets aren't connected to anything.
      In a regular situation they would receive RST on any segment, but
      with TCP-AO, as there's no listener, no AO-key and unknown ISNs,
      no RST should be sent.
      
      For the "passive" reset, where RST is sent in reply to some segment
      (tcp_v{4,6}_send_reset()), the test uses TCP_REPAIR to corrupt SEQ numbers,
      which later results in a TCP-AO signed RST, which will be verified, and
      the client socket will get EPIPE.
      
      No TCPAORequired/TCPAOBad segments are expected during these tests.
      
      Sample of the output:
      > # ./rst_ipv4
      > 1..15
      > # 1462[lib/setup.c:254] rand seed 1686611171
      > TAP version 13
      > ok 1 servered 1000 bytes
      > ok 2 Verified established tcp connection
      > ok 3 sk[0] = 7, connection was reset
      > ok 4 sk[1] = 8, connection was reset
      > ok 5 sk[2] = 9
      > ok 6 MKT counters are good on server
      > ok 7 Verified established tcp connection
      > ok 8 client connection broken post-seq-adjust
      > ok 9 client connection was reset
      > ok 10 No segments without AO sign (server)
      > ok 11 Signed AO segments (server): 0 => 30
      > ok 12 No segments with bad AO sign (server)
      > ok 13 No segments without AO sign (client)
      > ok 14 Signed AO segments (client): 0 => 30
      > ok 15 No segments with bad AO sign (client)
      > # Totals: pass:15 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: Add SEQ number extension test · 0d16eae5
      Dmitry Safonov authored
      
      
      Check that on SEQ number wraparound there is no disruption or TCPAOBad
      segments produced.
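
To illustrate what a sequence number extension (SNE) has to track across wraparound, here is a minimal userspace sketch. It assumes the SNE for a segment is derived from the SNE of a nearby reference sequence number using wraparound-safe comparison; `seq_before()` and `compute_sne()` are names chosen for this example, and the code is not claimed to match the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "seq1 is before seq2" for 32-bit sequence numbers. */
bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/*
 * Given the SNE that applies to next_seq, return the SNE for seq,
 * assuming seq lies within 2^31 of next_seq.
 */
uint32_t compute_sne(uint32_t next_sne, uint32_t next_seq, uint32_t seq)
{
	uint32_t sne = next_sne;

	if (seq_before(seq, next_seq)) {
		if (seq > next_seq)	/* seq is from before the wrap */
			sne--;
	} else {
		if (seq < next_seq)	/* seq has already wrapped */
			sne++;
	}
	return sne;
}
```

For example, with a reference of `0xfffffff0` at SNE 0, a segment at `0x10` is logically after the wrap and gets SNE 1, while segments that do not cross the 32-bit boundary keep the reference SNE.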
      
      Sample of expected output:
      > # ./seq-ext_ipv4
      > 1..7
      > # 1436[lib/setup.c:254] rand seed 1686611079
      > TAP version 13
      > ok 1 server alive
      > ok 2 post-migrate connection alive
      > ok 3 TCPAOGood counter increased 1002 => 3002
      > ok 4 TCPAOGood counter increased 1003 => 3003
      > ok 5 TCPAOBad counter didn't increase
      > ok 6 TCPAOBad counter didn't increase
      > ok 7 SEQ extension incremented: 1/1999, 1/998999
      > # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: Add TCP_REPAIR TCP-AO tests · 3715d32d
      Dmitry Safonov authored
      
      
      The test plan is:
      1. check that TCP-AO connection may be restored on another socket
      2. check restore with wrong send/recv ISN (checking that they are
         part of MAC generation)
      3. check restore with wrong SEQ number extension (checking that
         high bytes of it are taken into MAC generation)
      
      Sample output expected:
      > # ./restore_ipv4
      > 1..20
      > # 1412[lib/setup.c:254] rand seed 1686610825
      > TAP version 13
      > ok 1 TCP-AO migrate to another socket: server alive
      > ok 2 TCP-AO migrate to another socket: post-migrate connection is alive
      > ok 3 TCP-AO migrate to another socket: counter TCPAOGood increased 23 => 44
      > ok 4 TCP-AO migrate to another socket: counter TCPAOGood increased 22 => 42
      > ok 5 TCP-AO with wrong send ISN: server couldn't serve
      > ok 6 TCP-AO with wrong send ISN: post-migrate connection is broken
      > ok 7 TCP-AO with wrong send ISN: counter TCPAOBad increased 0 => 4
      > ok 8 TCP-AO with wrong send ISN: counter TCPAOBad increased 0 => 3
      > ok 9 TCP-AO with wrong receive ISN: server couldn't serve
      > ok 10 TCP-AO with wrong receive ISN: post-migrate connection is broken
      > ok 11 TCP-AO with wrong receive ISN: counter TCPAOBad increased 4 => 8
      > ok 12 TCP-AO with wrong receive ISN: counter TCPAOBad increased 5 => 10
      > ok 13 TCP-AO with wrong send SEQ ext number: server couldn't serve
      > ok 14 TCP-AO with wrong send SEQ ext number: post-migrate connection is broken
      > ok 15 TCP-AO with wrong send SEQ ext number: counter TCPAOBad increased 9 => 10
      > ok 16 TCP-AO with wrong send SEQ ext number: counter TCPAOBad increased 11 => 19
      > ok 17 TCP-AO with wrong receive SEQ ext number: post-migrate connection is broken
      > ok 18 TCP-AO with wrong receive SEQ ext number: server couldn't serve
      > ok 19 TCP-AO with wrong receive SEQ ext number: counter TCPAOBad increased 10 => 18
      > ok 20 TCP-AO with wrong receive SEQ ext number: counter TCPAOBad increased 20 => 23
      > # Totals: pass:20 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: Add test/benchmark for removing MKTs · d1066c9c
      Dmitry Safonov authored
      
      
      Sample output:
      > 1..36
      > # 1106[lib/setup.c:207] rand seed 1660754406
      > TAP version 13
      > ok 1   Worst case connect       512 keys: min=0ms max=1ms mean=0.583329ms stddev=0.076376
      > ok 2 Connect random-search      512 keys: min=0ms max=1ms mean=0.53412ms stddev=0.0516779
      > ok 3    Worst case delete       512 keys: min=2ms max=11ms mean=6.04139ms stddev=0.245792
      > ok 4        Add a new key       512 keys: min=0ms max=13ms mean=0.673415ms stddev=0.0820618
      > ok 5 Remove random-search       512 keys: min=5ms max=9ms mean=6.65969ms stddev=0.258064
      > ok 6         Remove async       512 keys: min=0ms max=0ms mean=0.041825ms stddev=0.0204512
      > ok 7   Worst case connect       1024 keys: min=0ms max=2ms mean=0.520357ms stddev=0.0721358
      > ok 8 Connect random-search      1024 keys: min=0ms max=2ms mean=0.535312ms stddev=0.0517355
      > ok 9    Worst case delete       1024 keys: min=5ms max=9ms mean=8.27219ms stddev=0.287614
      > ok 10        Add a new key      1024 keys: min=0ms max=1ms mean=0.688121ms stddev=0.0829531
      > ok 11 Remove random-search      1024 keys: min=5ms max=9ms mean=8.37649ms stddev=0.289422
      > ok 12         Remove async      1024 keys: min=0ms max=0ms mean=0.0457096ms stddev=0.0213798
      > ok 13   Worst case connect      2048 keys: min=0ms max=2ms mean=0.748804ms stddev=0.0865335
      > ok 14 Connect random-search     2048 keys: min=0ms max=2ms mean=0.782993ms stddev=0.0625697
      > ok 15    Worst case delete      2048 keys: min=5ms max=10ms mean=8.23106ms stddev=0.286898
      > ok 16        Add a new key      2048 keys: min=0ms max=1ms mean=0.812988ms stddev=0.0901658
      > ok 17 Remove random-search      2048 keys: min=8ms max=9ms mean=8.84949ms stddev=0.297481
      > ok 18         Remove async      2048 keys: min=0ms max=0ms mean=0.0297223ms stddev=0.0172402
      > ok 19   Worst case connect      4096 keys: min=1ms max=5ms mean=1.53352ms stddev=0.123836
      > ok 20 Connect random-search     4096 keys: min=1ms max=5ms mean=1.52226ms stddev=0.0872429
      > ok 21    Worst case delete      4096 keys: min=5ms max=9ms mean=8.25874ms stddev=0.28738
      > ok 22        Add a new key      4096 keys: min=0ms max=3ms mean=1.67382ms stddev=0.129376
      > ok 23 Remove random-search      4096 keys: min=5ms max=10ms mean=8.26178ms stddev=0.287433
      > ok 24         Remove async      4096 keys: min=0ms max=0ms mean=0.0340009ms stddev=0.0184393
      > ok 25   Worst case connect      8192 keys: min=2ms max=4ms mean=2.86208ms stddev=0.169177
      > ok 26 Connect random-search     8192 keys: min=2ms max=4ms mean=2.87592ms stddev=0.119915
      > ok 27    Worst case delete      8192 keys: min=6ms max=11ms mean=7.55291ms stddev=0.274826
      > ok 28        Add a new key      8192 keys: min=1ms max=5ms mean=2.56797ms stddev=0.160249
      > ok 29 Remove random-search      8192 keys: min=5ms max=10ms mean=7.14002ms stddev=0.267208
      > ok 30         Remove async      8192 keys: min=0ms max=0ms mean=0.0320066ms stddev=0.0178904
      > ok 31   Worst case connect      16384 keys: min=5ms max=6ms mean=5.55334ms stddev=0.235655
      > ok 32 Connect random-search     16384 keys: min=5ms max=6ms mean=5.52614ms stddev=0.166225
      > ok 33    Worst case delete      16384 keys: min=5ms max=11ms mean=7.39109ms stddev=0.271866
      > ok 34        Add a new key      16384 keys: min=2ms max=4ms mean=3.35799ms stddev=0.183248
      > ok 35 Remove random-search      16384 keys: min=5ms max=8ms mean=6.86078ms stddev=0.261931
      > ok 36         Remove async      16384 keys: min=0ms max=0ms mean=0.0302384ms stddev=0.0173892
      > # Totals: pass:36 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      From the output it's visible that the current simplified approach with
      a linked list of MKTs scales quite well even for thousands of keys.
      That also means that the majority of the time for delete is spent in
      synchronize_rcu() [which I can confirm separately by tracing].
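
To illustrate why lookups grow linearly with the number of keys, here is a minimal userspace sketch of a singly linked list of keys. The struct mkt_entry type and the mkt_add(), mkt_lookup() and mkt_del() helpers are hypothetical names invented for this example; they are not the kernel's data structures, and the sketch has no RCU protection, so it does not model the synchronize_rcu() cost mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a key entry; not the kernel's struct. */
struct mkt_entry {
	uint8_t sndid;
	uint8_t rcvid;
	struct mkt_entry *next;
};

/* Push an entry at the head of the list: O(1). */
static void mkt_add(struct mkt_entry **head, struct mkt_entry *e)
{
	assert(head && e);
	e->next = *head;
	*head = e;
}

/* Linear scan: O(n) in the number of keys on the list. */
static struct mkt_entry *mkt_lookup(struct mkt_entry *head,
				    uint8_t sndid, uint8_t rcvid)
{
	for (struct mkt_entry *e = head; e; e = e->next)
		if (e->sndid == sndid && e->rcvid == rcvid)
			return e;
	return NULL;
}

/* Unlink an entry; returns 1 if it was found and removed, 0 otherwise. */
static int mkt_del(struct mkt_entry **head, struct mkt_entry *victim)
{
	for (struct mkt_entry **pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			return 1;
		}
	}
	return 0;
}
```

In this sketch both lookup and delete walk the list, so the worst case is the entry at the tail; adding a key is constant time because it is pushed at the head.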
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d1066c9c
    • Dmitry Safonov's avatar
      selftests/net: Add TCP-AO + TCP-MD5 + no sign listen socket tests · 6f0c472a
      Dmitry Safonov authored
      
      
      The test plan was (most of the tests have all 3 client types):
      1. TCP-AO listen (INADDR_ANY)
      2. TCP-MD5 listen (INADDR_ANY)
      3. non-signed listen (INADDR_ANY)
      4. TCP-AO + TCP-MD5 listen (prefix)
      5. TCP-AO subprefix add failure [checked in setsockopt-closed.c]
      6. TCP-AO out of prefix connect [checked in connect-deny.c]
      7. TCP-AO + TCP-MD5 on connect()
      8. TCP-AO intersect with TCP-MD5 failure
      9. Established TCP-AO: add TCP-MD5 key
      10. Established TCP-MD5: add TCP-AO key
      11. Established non-signed: add TCP-AO key
      
      Output produced:
      > # ./unsigned-md5_ipv6
      > 1..72
      > # 1592[lib/setup.c:239] rand seed 1697567046
      > TAP version 13
      > ok 1 AO server (INADDR_ANY): AO client: counter TCPAOGood increased 0 => 2
      > ok 2 AO server (INADDR_ANY): AO client: connected
      > ok 3 AO server (INADDR_ANY): MD5 client
      > ok 4 AO server (INADDR_ANY): MD5 client: counter TCPMD5Unexpected increased 0 => 1
      > ok 5 AO server (INADDR_ANY): no sign client: counter TCPAORequired increased 0 => 1
      > ok 6 AO server (INADDR_ANY): unsigned client
      > ok 7 AO server (AO_REQUIRED): AO client: connected
      > ok 8 AO server (AO_REQUIRED): AO client: counter TCPAOGood increased 4 => 6
      > ok 9 AO server (AO_REQUIRED): unsigned client
      > ok 10 AO server (AO_REQUIRED): unsigned client: counter TCPAORequired increased 1 => 2
      > ok 11 MD5 server (INADDR_ANY): AO client: counter TCPAOKeyNotFound increased 0 => 1
      > ok 12 MD5 server (INADDR_ANY): AO client
      > ok 13 MD5 server (INADDR_ANY): MD5 client: connected
      > ok 14 MD5 server (INADDR_ANY): MD5 client: no counter checks
      > ok 15 MD5 server (INADDR_ANY): no sign client
      > ok 16 MD5 server (INADDR_ANY): no sign client: counter TCPMD5NotFound increased 0 => 1
      > ok 17 no sign server: AO client
      > ok 18 no sign server: AO client: counter TCPAOKeyNotFound increased 1 => 2
      > ok 19 no sign server: MD5 client
      > ok 20 no sign server: MD5 client: counter TCPMD5Unexpected increased 1 => 2
      > ok 21 no sign server: no sign client: connected
      > ok 22 no sign server: no sign client: counter CurrEstab increased 0 => 1
      > ok 23 AO+MD5 server: AO client (matching): connected
      > ok 24 AO+MD5 server: AO client (matching): counter TCPAOGood increased 8 => 10
      > ok 25 AO+MD5 server: AO client (misconfig, matching MD5)
      > ok 26 AO+MD5 server: AO client (misconfig, matching MD5): counter TCPAOKeyNotFound increased 2 => 3
      > ok 27 AO+MD5 server: AO client (misconfig, non-matching): counter TCPAOKeyNotFound increased 3 => 4
      > ok 28 AO+MD5 server: AO client (misconfig, non-matching)
      > ok 29 AO+MD5 server: MD5 client (matching): connected
      > ok 30 AO+MD5 server: MD5 client (matching): no counter checks
      > ok 31 AO+MD5 server: MD5 client (misconfig, matching AO)
      > ok 32 AO+MD5 server: MD5 client (misconfig, matching AO): counter TCPMD5Unexpected increased 2 => 3
      > ok 33 AO+MD5 server: MD5 client (misconfig, non-matching)
      > ok 34 AO+MD5 server: MD5 client (misconfig, non-matching): counter TCPMD5Unexpected increased 3 => 4
      > ok 35 AO+MD5 server: no sign client (unmatched): connected
      > ok 36 AO+MD5 server: no sign client (unmatched): counter CurrEstab increased 0 => 1
      > ok 37 AO+MD5 server: no sign client (misconfig, matching AO)
      > ok 38 AO+MD5 server: no sign client (misconfig, matching AO): counter TCPAORequired increased 2 => 3
      > ok 39 AO+MD5 server: no sign client (misconfig, matching MD5)
      > ok 40 AO+MD5 server: no sign client (misconfig, matching MD5): counter TCPMD5NotFound increased 1 => 2
      > ok 41 AO+MD5 server: client with both [TCP-MD5] and TCP-AO keys: connect() was prevented
      > ok 42 AO+MD5 server: client with both [TCP-MD5] and TCP-AO keys: no counter checks
      > ok 43 AO+MD5 server: client with both TCP-MD5 and [TCP-AO] keys: connect() was prevented
      > ok 44 AO+MD5 server: client with both TCP-MD5 and [TCP-AO] keys: no counter checks
      > ok 45 TCP-AO established: add TCP-MD5 key: postfailed as expected
      > ok 46 TCP-AO established: add TCP-MD5 key: counter TCPAOGood increased 12 => 14
      > ok 47 TCP-MD5 established: add TCP-AO key: postfailed as expected
      > ok 48 TCP-MD5 established: add TCP-AO key: no counter checks
      > ok 49 non-signed established: add TCP-AO key: postfailed as expected
      > ok 50 non-signed established: add TCP-AO key: counter CurrEstab increased 0 => 1
      > ok 51 TCP-AO key intersects with existing TCP-MD5 key: prefailed as expected: Key was rejected by service
      > ok 52 TCP-MD5 key intersects with existing TCP-AO key: prefailed as expected: Key was rejected by service
      > ok 53 TCP-MD5 key + TCP-AO required: prefailed as expected: Key was rejected by service
      > ok 54 TCP-AO required on socket + TCP-MD5 key: prefailed as expected: Key was rejected by service
      > ok 55 VRF: TCP-AO key (no l3index) + TCP-MD5 key (no l3index): prefailed as expected: Key was rejected by service
      > ok 56 VRF: TCP-MD5 key (no l3index) + TCP-AO key (no l3index): prefailed as expected: Key was rejected by service
      > ok 57 VRF: TCP-AO key (no l3index) + TCP-MD5 key (l3index=0): prefailed as expected: Key was rejected by service
      > ok 58 VRF: TCP-MD5 key (l3index=0) + TCP-AO key (no l3index): prefailed as expected: Key was rejected by service
      > ok 59 VRF: TCP-AO key (no l3index) + TCP-MD5 key (l3index=N): prefailed as expected: Key was rejected by service
      > ok 60 VRF: TCP-MD5 key (l3index=N) + TCP-AO key (no l3index): prefailed as expected: Key was rejected by service
      > ok 61 VRF: TCP-AO key (l3index=0) + TCP-MD5 key (no l3index): prefailed as expected: Key was rejected by service
      > ok 62 VRF: TCP-MD5 key (no l3index) + TCP-AO key (l3index=0): prefailed as expected: Key was rejected by service
      > ok 63 VRF: TCP-AO key (l3index=0) + TCP-MD5 key (l3index=0): prefailed as expected: Key was rejected by service
      > ok 64 VRF: TCP-MD5 key (l3index=0) + TCP-AO key (l3index=0): prefailed as expected: Key was rejected by service
      > ok 65 VRF: TCP-AO key (l3index=0) + TCP-MD5 key (l3index=N)
      > ok 66 VRF: TCP-MD5 key (l3index=N) + TCP-AO key (l3index=0)
      > ok 67 VRF: TCP-AO key (l3index=N) + TCP-MD5 key (no l3index): prefailed as expected: Key was rejected by service
      > ok 68 VRF: TCP-MD5 key (no l3index) + TCP-AO key (l3index=N): prefailed as expected: Key was rejected by service
      > ok 69 VRF: TCP-AO key (l3index=N) + TCP-MD5 key (l3index=0)
      > ok 70 VRF: TCP-MD5 key (l3index=0) + TCP-AO key (l3index=N)
      > ok 71 VRF: TCP-AO key (l3index=N) + TCP-MD5 key (l3index=N): prefailed as expected: Key was rejected by service
      > ok 72 VRF: TCP-MD5 key (l3index=N) + TCP-AO key (l3index=N): prefailed as expected: Key was rejected by service
      > # Totals: pass:72 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f0c472a
    • Dmitry Safonov's avatar
      selftests/net: Add test for TCP-AO add setsockopt() command · b2666053
      Dmitry Safonov authored
      
      
      Verify corner-cases for UAPI.
      Sample output:
      > # ./setsockopt-closed_ipv4
      > 1..120
      > # 1657[lib/setup.c:254] rand seed 1681938184
      > TAP version 13
      > ok 1 AO add: minimum size
      > ok 2 AO add: extended size
      > ok 3 AO add: null optval
      > ok 4 AO del: minimum size
      > ok 5 AO del: extended size
      > ok 6 AO del: null optval
      > ok 7 AO set info: minimum size
      > ok 8 AO set info: extended size
      > ok 9 AO info get: : extended size
      > ok 10 AO set info: null optval
      > ok 11 AO get info: minimum size
      > ok 12 AO get info: extended size
      > ok 13 AO get info: null optval
      > ok 14 AO get info: null optlen
      > ok 15 AO get keys: minimum size
      > ok 16 AO get keys: extended size
      > ok 17 AO get keys: null optval
      > ok 18 AO get keys: null optlen
      > ok 19 key add: too big keylen
      > ok 20 key add: using reserved padding
      > ok 21 key add: using reserved2 padding
      > ok 22 key add: wrong address family
      > ok 23 key add: port (unsupported)
      > ok 24 key add: no prefix, addr
      > ok 25 key add: no prefix, any addr
      > ok 26 key add: prefix, any addr
      > ok 27 key add: too big prefix
      > ok 28 key add: too short prefix
      > ok 29 key add: bad key flags
      > ok 30 key add: add current key on a listen socket
      > ok 31 key add: add rnext key on a listen socket
      > ok 32 key add: add current+rnext key on a listen socket
      > ok 33 key add: add key and set as current
      > ok 34 key add: add key and set as rnext
      > ok 35 key add: add key and set as current+rnext
      > ok 36 key add: ifindex without TCP_AO_KEYF_IFNINDEX
      > ok 37 key add: non-existent VRF
      > ok 38 optmem limit was hit on adding 69 key
      > ok 39 key add: maclen bigger than TCP hdr
      > ok 40 key add: bad algo
      > ok 41 key del: using reserved padding
      > ok 42 key del: using reserved2 padding
      > ok 43 key del: del and set current key on a listen socket
      > ok 44 key del: del and set rnext key on a listen socket
      > ok 45 key del: del and set current+rnext key on a listen socket
      > ok 46 key del: bad key flags
      > ok 47 key del: ifindex without TCP_AO_KEYF_IFNINDEX
      > ok 48 key del: non-existent VRF
      > ok 49 key del: set non-exising current key
      > ok 50 key del: set non-existing rnext key
      > ok 51 key del: set non-existing current+rnext key
      > ok 52 key del: set current key
      > ok 53 key del: set rnext key
      > ok 54 key del: set current+rnext key
      > ok 55 key del: set as current key to be removed
      > ok 56 key del: set as rnext key to be removed
      > ok 57 key del: set as current+rnext key to be removed
      > ok 58 key del: async on non-listen
      > ok 59 key del: non-existing sndid
      > ok 60 key del: non-existing rcvid
      > ok 61 key del: incorrect addr
      > ok 62 key del: correct key delete
      > ok 63 AO info set: set current key on a listen socket
      > ok 64 AO info set: set rnext key on a listen socket
      > ok 65 AO info set: set current+rnext key on a listen socket
      > ok 66 AO info set: using reserved padding
      > ok 67 AO info set: using reserved2 padding
      > ok 68 AO info set: accept_icmps
      > ok 69 AO info get: accept_icmps
      > ok 70 AO info set: ao required
      > ok 71 AO info get: ao required
      > ok 72 AO info set: ao required with MD5 key
      > ok 73 AO info set: set non-existing current key
      > ok 74 AO info set: set non-existing rnext key
      > ok 75 AO info set: set non-existing current+rnext key
      > ok 76 AO info set: set current key
      > ok 77 AO info get: set current key
      > ok 78 AO info set: set rnext key
      > ok 79 AO info get: set rnext key
      > ok 80 AO info set: set current+rnext key
      > ok 81 AO info get: set current+rnext key
      > ok 82 AO info set: set counters
      > ok 83 AO info get: set counters
      > ok 84 AO info set: no-op
      > ok 85 AO info get: no-op
      > ok 86 get keys: no ao_info
      > ok 87 get keys: proper tcp_ao_get_mkts()
      > ok 88 get keys: set out-only pkt_good counter
      > ok 89 get keys: set out-only pkt_bad counter
      > ok 90 get keys: bad keyflags
      > ok 91 get keys: ifindex without TCP_AO_KEYF_IFNINDEX
      > ok 92 get keys: using reserved field
      > ok 93 get keys: no prefix, addr
      > ok 94 get keys: no prefix, any addr
      > ok 95 get keys: prefix, any addr
      > ok 96 get keys: too big prefix
      > ok 97 get keys: too short prefix
      > ok 98 get keys: prefix + addr
      > ok 99 get keys: get_all + prefix
      > ok 100 get keys: get_all + addr
      > ok 101 get keys: get_all + sndid
      > ok 102 get keys: get_all + rcvid
      > ok 103 get keys: current + prefix
      > ok 104 get keys: current + addr
      > ok 105 get keys: current + sndid
      > ok 106 get keys: current + rcvid
      > ok 107 get keys: rnext + prefix
      > ok 108 get keys: rnext + addr
      > ok 109 get keys: rnext + sndid
      > ok 110 get keys: rnext + rcvid
      > ok 111 get keys: get_all + current
      > ok 112 get keys: get_all + rnext
      > ok 113 get keys: current + rnext
      > ok 114 key add: duplicate: full copy
      > ok 115 key add: duplicate: any addr key on the socket
      > ok 116 key add: duplicate: add any addr key
      > ok 117 key add: duplicate: add any addr for the same subnet
      > ok 118 key add: duplicate: full copy of a key
      > ok 119 key add: duplicate: RecvID differs
      > ok 120 key add: duplicate: SendID differs
      > # Totals: pass:120 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b2666053
    • Dmitry Safonov's avatar
      selftests/net: Add a test for TCP-AO keys matching · ed9d09b3
      Dmitry Safonov authored
      
      
      Add TCP-AO tests on connect()/accept() pair.
      SNMP counters exposed by kernel are very useful here to verify the
      expected behavior of TCP-AO.
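
As a rough illustration of how such a counter could be read from userspace, the sketch below looks up one value in text laid out like /proc/net/netstat, where a header line of counter names is followed by a line of values with the same prefix. The snmp_counter() and next_line() helpers are names made up for this example and are not the selftest library's API; the sketch parses an in-memory string so that it stays self-contained.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy one line (without '\n') into buf; return the start of the next
 * line, or NULL when the input is exhausted. Overlong lines are truncated. */
static const char *next_line(const char *p, char *buf, size_t len)
{
	size_t full = strcspn(p, "\n");
	size_t n = full < len ? full : len - 1;

	if (!*p)
		return NULL;
	memcpy(buf, p, n);
	buf[n] = '\0';
	p += full;
	return *p == '\n' ? p + 1 : p;
}

/* Find counter `name` in the "proto:" header/value line pair.
 * Returns 0 and stores the value on success, -1 if it is not found. */
static int snmp_counter(const char *text, const char *proto,
			const char *name, unsigned long long *val)
{
	char hdr[8192], vals[8192];
	size_t plen;

	assert(text && proto && name && val);
	plen = strlen(proto);

	while ((text = next_line(text, hdr, sizeof(hdr))) != NULL) {
		char *h, *v, *hs, *vs;

		text = next_line(text, vals, sizeof(vals));
		if (!text)
			break;
		if (strncmp(hdr, proto, plen) || hdr[plen] != ':' ||
		    strncmp(vals, proto, plen) || vals[plen] != ':')
			continue;
		/* Walk names and values in lockstep. */
		h = strtok_r(hdr + plen + 1, " ", &hs);
		v = strtok_r(vals + plen + 1, " ", &vs);
		while (h && v) {
			if (!strcmp(h, name)) {
				*val = strtoull(v, NULL, 10);
				return 0;
			}
			h = strtok_r(NULL, " ", &hs);
			v = strtok_r(NULL, " ", &vs);
		}
	}
	return -1;
}
```

A test could call such a helper before and after a connection attempt and compare the two values, which is the kind of "increased 0 => 1" check visible in the output below.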
      
      Expected output for ipv4 version:
      > # ./connect-deny_ipv4
      > 1..19
      > # 1702[lib/setup.c:254] rand seed 1680553689
      > TAP version 13
      > ok 1 Non-AO server + AO client
      > ok 2 Non-AO server + AO client: counter TCPAOKeyNotFound increased 0 => 1
      > ok 3 AO server + Non-AO client
      > ok 4 AO server + Non-AO client: counter TCPAORequired increased 0 => 1
      > ok 5 Wrong password
      > ok 6 Wrong password: counter TCPAOBad increased 0 => 1
      > ok 7 Wrong rcv id
      > ok 8 Wrong rcv id: counter TCPAOKeyNotFound increased 1 => 2
      > ok 9 Wrong snd id
      > ok 10 Wrong snd id: counter TCPAOGood increased 0 => 1
      > ok 11 Server: Wrong addr: counter TCPAOKeyNotFound increased 2 => 3
      > ok 12 Server: Wrong addr
      > ok 13 Client: Wrong addr: connect() was prevented
      > ok 14 rcv id != snd id: connected
      > ok 15 rcv id != snd id: counter TCPAOGood increased 1 => 3
      > ok 16 Server: prefix match: connected
      > ok 17 Server: prefix match: counter TCPAOGood increased 4 => 6
      > ok 18 Client: prefix match: connected
      > ok 19 Client: prefix match: counter TCPAOGood increased 7 => 9
      > # Totals: pass:19 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Expected output for ipv6 version:
      > # ./connect-deny_ipv6
      > 1..19
      > # 1725[lib/setup.c:254] rand seed 1680553711
      > TAP version 13
      > ok 1 Non-AO server + AO client
      > ok 2 Non-AO server + AO client: counter TCPAOKeyNotFound increased 0 => 1
      > ok 3 AO server + Non-AO client: counter TCPAORequired increased 0 => 1
      > ok 4 AO server + Non-AO client
      > ok 5 Wrong password: counter TCPAOBad increased 0 => 1
      > ok 6 Wrong password
      > ok 7 Wrong rcv id: counter TCPAOKeyNotFound increased 1 => 2
      > ok 8 Wrong rcv id
      > ok 9 Wrong snd id: counter TCPAOGood increased 0 => 1
      > ok 10 Wrong snd id
      > ok 11 Server: Wrong addr
      > ok 12 Server: Wrong addr: counter TCPAOKeyNotFound increased 2 => 3
      > ok 13 Client: Wrong addr: connect() was prevented
      > ok 14 rcv id != snd id: connected
      > ok 15 rcv id != snd id: counter TCPAOGood increased 1 => 3
      > ok 16 Server: prefix match: connected
      > ok 17 Server: prefix match: counter TCPAOGood increased 5 => 7
      > ok 18 Client: prefix match: connected
      > ok 19 Client: prefix match: counter TCPAOGood increased 8 => 10
      > # Totals: pass:19 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ed9d09b3
    • Dmitry Safonov's avatar
      selftests/net: Add TCP-AO ICMPs accept test · d11301f6
      Dmitry Safonov authored
      
      
      The reverse of the icmps-discard test: the server accepts ICMPs, using
      TCP_AO_CMDF_ACCEPT_ICMP, and is expected to fail under an ICMP flood
      from the client. Test that the default pre-TCP-AO behaviour functions
      when TCP_AO_CMDF_ACCEPT_ICMP is set.
      
      Expected output for ipv4 version (in case it receives ICMP_PROT_UNREACH):
      > # ./icmps-accept_ipv4
      > 1..3
      > # 3209[lib/setup.c:166] rand seed 1642623870
      > TAP version 13
      > # 3209[lib/proc.c:207]    Snmp6             Ip6InReceives: 0 => 1
      > # 3209[lib/proc.c:207]    Snmp6             Ip6InNoRoutes: 0 => 1
      > # 3209[lib/proc.c:207]    Snmp6               Ip6InOctets: 0 => 76
      > # 3209[lib/proc.c:207]    Snmp6            Ip6InNoECTPkts: 0 => 1
      > # 3209[lib/proc.c:207]      Tcp                    InSegs: 3 => 23
      > # 3209[lib/proc.c:207]      Tcp                   OutSegs: 2 => 22
      > # 3209[lib/proc.c:207]  IcmpMsg                   InType3: 0 => 4
      > # 3209[lib/proc.c:207]     Icmp                    InMsgs: 0 => 4
      > # 3209[lib/proc.c:207]     Icmp            InDestUnreachs: 0 => 4
      > # 3209[lib/proc.c:207]       Ip                InReceives: 3 => 27
      > # 3209[lib/proc.c:207]       Ip                InDelivers: 3 => 27
      > # 3209[lib/proc.c:207]       Ip               OutRequests: 2 => 22
      > # 3209[lib/proc.c:207]    IpExt                  InOctets: 288 => 3420
      > # 3209[lib/proc.c:207]    IpExt                 OutOctets: 124 => 3244
      > # 3209[lib/proc.c:207]    IpExt               InNoECTPkts: 3 => 25
      > # 3209[lib/proc.c:207]   TcpExt               TCPPureAcks: 1 => 2
      > # 3209[lib/proc.c:207]   TcpExt           TCPOrigDataSent: 0 => 20
      > # 3209[lib/proc.c:207]   TcpExt              TCPDelivered: 0 => 19
      > # 3209[lib/proc.c:207]   TcpExt                 TCPAOGood: 3 => 23
      > ok 1 InDestUnreachs delivered 4
      > ok 2 server failed with -92: Protocol not available
      > ok 3 TCPAODroppedIcmps counter didn't change: 0 >= 0
      > # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Expected output for ipv6 version (in case it receives ADM_PROHIBITED):
      > # ./icmps-accept_ipv6
      > 1..3
      > # 3277[lib/setup.c:166] rand seed 1642624035
      > TAP version 13
      > # 3277[lib/proc.c:207]    Snmp6             Ip6InReceives: 6 => 31
      > # 3277[lib/proc.c:207]    Snmp6             Ip6InDelivers: 4 => 29
      > # 3277[lib/proc.c:207]    Snmp6            Ip6OutRequests: 4 => 24
      > # 3277[lib/proc.c:207]    Snmp6               Ip6InOctets: 592 => 4492
      > # 3277[lib/proc.c:207]    Snmp6              Ip6OutOctets: 332 => 3852
      > # 3277[lib/proc.c:207]    Snmp6            Ip6InNoECTPkts: 6 => 31
      > # 3277[lib/proc.c:207]    Snmp6               Icmp6InMsgs: 1 => 6
      > # 3277[lib/proc.c:207]    Snmp6       Icmp6InDestUnreachs: 0 => 5
      > # 3277[lib/proc.c:207]    Snmp6              Icmp6InType1: 0 => 5
      > # 3277[lib/proc.c:207]      Tcp                    InSegs: 3 => 23
      > # 3277[lib/proc.c:207]      Tcp                   OutSegs: 2 => 22
      > # 3277[lib/proc.c:207]   TcpExt               TCPPureAcks: 1 => 2
      > # 3277[lib/proc.c:207]   TcpExt           TCPOrigDataSent: 0 => 20
      > # 3277[lib/proc.c:207]   TcpExt              TCPDelivered: 0 => 19
      > # 3277[lib/proc.c:207]   TcpExt                 TCPAOGood: 3 => 23
      > ok 1 Icmp6InDestUnreachs delivered 5
      > ok 2 server failed with -13: Permission denied
      > ok 3 TCPAODroppedIcmps counter didn't change: 0 >= 0
      > # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      With some luck the server may fail with ECONNREFUSED (depending on which
      ICMP packet was delivered first).
      For the kernel error handlers see: tab_unreach[] and icmp_err_convert[].
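
The mapping from an ICMP "destination unreachable" code to the error a socket sees can be pictured as a small lookup table. The sketch below is a partial, illustrative userspace table for the first four ICMPv4 codes only; the names unreach_tab, unreach_lookup() and unreach_errno() are invented for this example, and the kernel's icmp_err_convert[] has more entries than are shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ICMPv4 "destination unreachable" codes (RFC 792); local names. */
enum {
	UNREACH_NET  = 0,
	UNREACH_HOST = 1,
	UNREACH_PROT = 2,
	UNREACH_PORT = 3,
};

struct unreach_err {
	int err;	/* errno reported to the socket */
	int fatal;	/* non-zero: treated as a hard error */
};

/* Partial table for illustration; indexed by ICMP code. */
static const struct unreach_err unreach_tab[] = {
	[UNREACH_NET]  = { ENETUNREACH,  0 },
	[UNREACH_HOST] = { EHOSTUNREACH, 0 },
	[UNREACH_PROT] = { ENOPROTOOPT,  1 },
	[UNREACH_PORT] = { ECONNREFUSED, 1 },
};

/* Return the table entry for a code, or NULL if the sketch has none. */
static const struct unreach_err *unreach_lookup(unsigned int code)
{
	size_t n = sizeof(unreach_tab) / sizeof(unreach_tab[0]);

	return code < n ? &unreach_tab[code] : NULL;
}

/* Convenience wrapper for codes known to be in the table. */
static int unreach_errno(unsigned int code)
{
	const struct unreach_err *e = unreach_lookup(code);

	assert(e);
	return e->err;
}
```

This lines up with the ipv4 output above, where a protocol-unreachable ICMP makes the server fail with "Protocol not available" (ENOPROTOOPT), while a port-unreachable ICMP arriving first would give ECONNREFUSED instead.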
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d11301f6
    • Dmitry Safonov's avatar
      selftests/net: Verify that TCP-AO complies with ignoring ICMPs · a8fcf8ca
      Dmitry Safonov authored
      
      
      Hand-crafted ICMP packets are sent to the server; the server checks for
      hard/soft errors and fails if there are any.
      
      Expected output for ipv4 version:
      > # ./icmps-discard_ipv4
      > 1..3
      > # 3164[lib/setup.c:166] rand seed 1642623745
      > TAP version 13
      > # 3164[lib/proc.c:207]    Snmp6             Ip6InReceives: 0 => 1
      > # 3164[lib/proc.c:207]    Snmp6             Ip6InNoRoutes: 0 => 1
      > # 3164[lib/proc.c:207]    Snmp6               Ip6InOctets: 0 => 76
      > # 3164[lib/proc.c:207]    Snmp6            Ip6InNoECTPkts: 0 => 1
      > # 3164[lib/proc.c:207]      Tcp                    InSegs: 2 => 203
      > # 3164[lib/proc.c:207]      Tcp                   OutSegs: 1 => 202
      > # 3164[lib/proc.c:207]  IcmpMsg                   InType3: 0 => 543
      > # 3164[lib/proc.c:207]     Icmp                    InMsgs: 0 => 543
      > # 3164[lib/proc.c:207]     Icmp            InDestUnreachs: 0 => 543
      > # 3164[lib/proc.c:207]       Ip                InReceives: 2 => 746
      > # 3164[lib/proc.c:207]       Ip                InDelivers: 2 => 746
      > # 3164[lib/proc.c:207]       Ip               OutRequests: 1 => 202
      > # 3164[lib/proc.c:207]    IpExt                  InOctets: 132 => 61684
      > # 3164[lib/proc.c:207]    IpExt                 OutOctets: 68 => 31324
      > # 3164[lib/proc.c:207]    IpExt               InNoECTPkts: 2 => 744
      > # 3164[lib/proc.c:207]   TcpExt               TCPPureAcks: 1 => 2
      > # 3164[lib/proc.c:207]   TcpExt           TCPOrigDataSent: 0 => 200
      > # 3164[lib/proc.c:207]   TcpExt              TCPDelivered: 0 => 199
      > # 3164[lib/proc.c:207]   TcpExt                 TCPAOGood: 2 => 203
      > # 3164[lib/proc.c:207]   TcpExt         TCPAODroppedIcmps: 0 => 541
      > ok 1 InDestUnreachs delivered 543
      > ok 2 Server survived 20000 bytes of traffic
      > ok 3 ICMPs ignored 541
      > # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Expected output for ipv6 version:
      > # ./icmps-discard_ipv6
      > 1..3
      > # 3186[lib/setup.c:166] rand seed 1642623803
      > TAP version 13
      > # 3186[lib/proc.c:207]    Snmp6             Ip6InReceives: 4 => 568
      > # 3186[lib/proc.c:207]    Snmp6             Ip6InDelivers: 3 => 564
      > # 3186[lib/proc.c:207]    Snmp6            Ip6OutRequests: 2 => 204
      > # 3186[lib/proc.c:207]    Snmp6            Ip6InMcastPkts: 1 => 4
      > # 3186[lib/proc.c:207]    Snmp6           Ip6OutMcastPkts: 0 => 1
      > # 3186[lib/proc.c:207]    Snmp6               Ip6InOctets: 320 => 70420
      > # 3186[lib/proc.c:207]    Snmp6              Ip6OutOctets: 160 => 35512
      > # 3186[lib/proc.c:207]    Snmp6          Ip6InMcastOctets: 72 => 336
      > # 3186[lib/proc.c:207]    Snmp6         Ip6OutMcastOctets: 0 => 76
      > # 3186[lib/proc.c:207]    Snmp6            Ip6InNoECTPkts: 4 => 568
      > # 3186[lib/proc.c:207]    Snmp6               Icmp6InMsgs: 1 => 361
      > # 3186[lib/proc.c:207]    Snmp6              Icmp6OutMsgs: 1 => 2
      > # 3186[lib/proc.c:207]    Snmp6       Icmp6InDestUnreachs: 0 => 360
      > # 3186[lib/proc.c:207]    Snmp6      Icmp6OutMLDv2Reports: 0 => 1
      > # 3186[lib/proc.c:207]    Snmp6              Icmp6InType1: 0 => 360
      > # 3186[lib/proc.c:207]    Snmp6           Icmp6OutType143: 0 => 1
      > # 3186[lib/proc.c:207]      Tcp                    InSegs: 2 => 203
      > # 3186[lib/proc.c:207]      Tcp                   OutSegs: 1 => 202
      > # 3186[lib/proc.c:207]   TcpExt               TCPPureAcks: 1 => 2
      > # 3186[lib/proc.c:207]   TcpExt           TCPOrigDataSent: 0 => 200
      > # 3186[lib/proc.c:207]   TcpExt              TCPDelivered: 0 => 199
      > # 3186[lib/proc.c:207]   TcpExt                 TCPAOGood: 2 => 203
      > # 3186[lib/proc.c:207]   TcpExt         TCPAODroppedIcmps: 0 => 360
      > ok 1 Icmp6InDestUnreachs delivered 360
      > ok 2 Server survived 20000 bytes of traffic
      > ok 3 ICMPs ignored 360
      > # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a8fcf8ca
    • Dmitry Safonov's avatar
      selftests/net: Add TCP-AO library · cfbab37b
      Dmitry Safonov authored
      
      
      Provide functions to create selftests dedicated to TCP-AO.
      They can run in parallel, as they use temporary net namespaces.
      They can be very specific to the feature being tested.
      This will allow creating a lot of TCP-AO tests without complicating
      one binary with many --options, and creating scenarios that are
      hard to put in a bash script that uses one binary.
      
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cfbab37b
    • Vladimir Oltean's avatar
      net: phylink: reimplement population of pl->supported for in-band · 37a8997f
      Vladimir Oltean authored
      
      
      phylink_parse_mode() populates all possible supported link modes for a
      given phy_interface_t, for the case where a phylib phy may be absent and
      we can't retrieve the supported link modes from that.
      
      Russell points out that since the introduction of the generic validation
      helpers phylink_get_capabilities() and phylink_caps_to_linkmodes(), we
      can rewrite this procedure to populate the pl->supported mask, so that
      instead of spelling out the link modes, we derive an intermediary
      mac_capabilities bit field, and we convert that to the equivalent link
      modes.
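
The two-step derivation described above (interface type to capability bits, then capability bits to link modes) can be sketched in plain C. Every identifier below (the CAP_* and MODE_* bits, enum iface, iface_caps(), caps_to_modes(), supported_for()) is hypothetical and much simplified; this is not the phylink API, only an illustration of replacing a hand-written per-interface list with a capability mask and one shared conversion table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits: speed/duplex pairs a MAC can do. */
enum {
	CAP_10HD   = 1u << 0,
	CAP_10FD   = 1u << 1,
	CAP_100HD  = 1u << 2,
	CAP_100FD  = 1u << 3,
	CAP_1000FD = 1u << 4,
};

/* Hypothetical link-mode bits: one per media type at a given speed. */
enum {
	MODE_10T_HALF   = 1u << 0,
	MODE_10T_FULL   = 1u << 1,
	MODE_100T_HALF  = 1u << 2,
	MODE_100T_FULL  = 1u << 3,
	MODE_1000T_FULL = 1u << 4,
	MODE_1000X_FULL = 1u << 5,
};

/* Hypothetical interface types. */
enum iface { IFACE_SGMII, IFACE_1000BASEX };

/* One shared table: each capability expands to its link modes. */
static const struct { uint32_t cap; uint32_t modes; } cap_map[] = {
	{ CAP_10HD,   MODE_10T_HALF },
	{ CAP_10FD,   MODE_10T_FULL },
	{ CAP_100HD,  MODE_100T_HALF },
	{ CAP_100FD,  MODE_100T_FULL },
	{ CAP_1000FD, MODE_1000T_FULL | MODE_1000X_FULL },
};

static uint32_t caps_to_modes(uint32_t caps)
{
	uint32_t modes = 0;

	for (size_t i = 0; i < sizeof(cap_map) / sizeof(cap_map[0]); i++)
		if (caps & cap_map[i].cap)
			modes |= cap_map[i].modes;
	return modes;
}

/* Capabilities an interface type can carry at all (assumed values). */
static uint32_t iface_caps(enum iface i)
{
	switch (i) {
	case IFACE_SGMII:
		return CAP_10HD | CAP_10FD | CAP_100HD | CAP_100FD | CAP_1000FD;
	case IFACE_1000BASEX:
		return CAP_1000FD;
	}
	return 0;
}

/* Intermediate capability mask first, then convert to link modes. */
static uint32_t supported_for(enum iface i, uint32_t mac_caps)
{
	assert(i == IFACE_SGMII || i == IFACE_1000BASEX);
	return caps_to_modes(iface_caps(i) & mac_caps);
}
```

The design point is that the per-interface knowledge shrinks to a capability mask, and the expansion to individual link modes lives in a single table.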
      
      Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      37a8997f
  4. Dec 16, 2023
  5. Dec 15, 2023
    • Wang Jinchao's avatar
      hv_netvsc: remove duplicated including of slab.h · e91db161
      Wang Jinchao authored
      
      
      Remove the second #include <linux/slab.h>.
      
      Signed-off-by: Wang Jinchao <wangjinchao@xfusion.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e91db161
    • David S. Miller's avatar
      Merge branch 'netlink-specs-legacy' · f06f0891
      David S. Miller authored
      
      
      Jakub Kicinski says:
      
      ====================
      netlink: specs: prep legacy specs for C code gen
      
      Minor adjustments to some specs to make them ready for C code gen.
      
      v2:
       - fix MAINTAINERS and subject of patch 3
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f06f0891
    • Jakub Kicinski's avatar
      netlink: specs: mptcp: rename the MPTCP path management spec · b059aef7
      Jakub Kicinski authored
      
      
      We assume in a handful of places that the name of the spec is
      the same as the name of the family. We could fix that but
      it seems like a fair assumption to make. Rename the MPTCP
      spec instead.
      
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b059aef7
    • Jakub Kicinski's avatar
      netlink: specs: ovs: correct enum names in specs · 209bcb9a
      Jakub Kicinski authored
      
      
      Align the enum-names of OVS with what's actually in the uAPI.
      Either correct the names, or mark the enum as empty because
      the values are in fact #defines.
      
      Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      209bcb9a
    • Jakub Kicinski's avatar
      netlink: specs: ovs: remove fixed header fields from attrs · 3ada0b33
      Jakub Kicinski authored
      
      
      Op's "attributes" list is a workaround for families with a single
      attr set. We don't want to render a single huge request structure,
      the same for each op since we know that most ops accept only a small
      set of attributes. "Attributes" list lets us narrow down the attributes
      to what the op actually pays attention to.
      
      It doesn't make sense to put names of fixed headers in there.
      They are not "attributes" and we can't really narrow down the struct
      members.
      
      Remove the fixed header fields from attrs for ovs families
      in preparation for C codegen support.
      
      Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3ada0b33
    • David S. Miller's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 283f105b
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      add v2 FW logging for ice driver
      
      Paul Stillwell says:
      
      Firmware (FW) log support was added to the ice driver, but that version is
      no longer supported. There is a newer version of FW logging (v2) that
      adds more control knobs to get the exact data out of the FW
      for debugging.
      
      The interface for FW logging is debugfs. This was chosen based on
      discussions here:
      https://lore.kernel.org/netdev/20230214180712.53fc8ba2@kernel.org/ and
      https://lore.kernel.org/netdev/20231012164033.1069fb4b@kernel.org/
      
      We talked about using devlink in a variety of ways, but none of those
      options made any sense for the way the FW reports data. We briefly talked
      about using ethtool, but that seemed to go by the wayside. Ultimately it
      seems like using debugfs is the way to go so re-implement the code to use
      that.
      
      FW logging is across all the PFs on the device so restrict the commands to
      only PF0.
      
      If the device supports FW logging then a directory named 'fwlog' will be
      created under '/sys/kernel/debug/ice/<pci_dev>'. A variety of files will be
      created to manage the behavior of logging. The following files will be
      created:
      - modules/<module>
      - nr_messages
      - enable
      - log_size
      - data
      
      where
      modules/<module> is used to read/write the log level for a specific module
      
      nr_messages is used to determine how many events should be in each message
      sent to the driver
      
      enable is used to start/stop FW logging. This is a boolean value so only 1
      or 0 are permissible values
      
      log_size is used to configure the amount of memory the driver uses for log
      data
      
      data is used to read/clear the log data
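
The file layout above can be exercised from user space with a small helper. This is only a sketch under stated assumptions: the function names (`fwlog_path`, `fwlog_enable_value_ok`, `fwlog_write`) are hypothetical and not part of the ice driver, and the PCI address in the usage note is a made-up example. Only the directory layout and the 0/1 rule for 'enable' come from the description above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/kernel/debug/ice/<pci_dev>/fwlog/<file>" into buf.
 * Returns 0 on success, -1 if the path does not fit in buf.
 * (Hypothetical helper; the layout is taken from the cover letter.)
 */
static int fwlog_path(char *buf, size_t len, const char *pci_dev,
		      const char *file)
{
	int n = snprintf(buf, len, "/sys/kernel/debug/ice/%s/fwlog/%s",
			 pci_dev, file);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* 'enable' is a boolean file: only "0" and "1" are permissible values. */
static int fwlog_enable_value_ok(const char *val)
{
	return strcmp(val, "0") == 0 || strcmp(val, "1") == 0;
}

/* Write a string to a debugfs file. Returns 0 on success, -1 on error. */
static int fwlog_write(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");
	int rc;

	if (!f)
		return -1;
	rc = fputs(val, f) < 0 ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}
```

For instance, with a hypothetical device address of 0000:18:00.0, calling fwlog_path(buf, sizeof(buf), "0000:18:00.0", "enable") followed by fwlog_write(buf, "1") would start FW logging, assuming the device supports it and the caller can write to debugfs.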
      
      Generally there is a lot of data and dumping that data to syslog will
      result in a loss of data. This causes problems when decoding the data and
      the user doesn't know that data is missing until later. Instead of dumping
      the FW log output to syslog use debugfs. This ensures that all the data the
      driver has gets retrieved correctly.
      
      The FW log data is binary data that the FW team decodes to determine what
      happened in firmware. The binary blob is sent to Intel for decoding.
      ---
      v6:
      - use seq_printf() for outputting module info when reading from 'module' file
      - replace code that created argc and argv for handling command line input
      - removed checks in all the _read() and _write() functions to see if FW logging
        is supported because the files will not exist if it is not supported
      - removed warnings on allocation failures and on debugfs file creation failures
      - removed a newline between memory allocation and checking if the memory was
        allocated
      - fixed cases where we could just return the value from a function call
        instead of saving the value in a variable
      - moved the check for PF0 in ice_fwlog_init() to an earlier patch
      - reworked all of argument scanning in the _write() functions in ice_debugfs.c
        to remove adding characters past the end of the buffer
      
      v5: https://lore.kernel.org/netdev/20231205211251.2122874-1-anthony.l.nguyen@intel.com/
      - changed the log level configuration from a single file for all modules to a
        file per module.
      - changed 'nr_buffs' to 'log_size' because users understand memory sizes
        better than a number of buffers
      - changed 'resolution' to 'nr_messages' to better reflect what it represents
      - updated documentation to reflect these changes
      - updated documentation to indicate that FW logging must be disabled to
        clear the data. Also clarified that any value written to the 'data' file will
        clear the data
      
      v4: https://lore.kernel.org/netdev/20231005170110.3221306-1-anthony.l.nguyen@intel.com/
      - removed CONFIG_DEBUG_FS wrapper around code because the debugfs calls handle
        this case already
      - moved ice_debugfs_exit() call to remove unreachable code issue
      - minor changes to documentation based on feedback
      
      v3: https://lore.kernel.org/netdev/20230815165750.2789609-1-anthony.l.nguyen@intel.com/
      - Adjust error path cleanup in ice_module_init() for unreachable code.
      
      v2: https://lore.kernel.org/netdev/20230810170109.1963832-1-anthony.l.nguyen@intel.com/
      - Rewrote code to use debugfs instead of devlink
      
      v1: https://lore.kernel.org/netdev/20230209190702.3638688-1-anthony.l.nguyen@intel.com/
      
      
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      283f105b
    • David S. Miller's avatar
      Merge branch 'mv88e6xxx-counters' · b84d66b0
      David S. Miller authored
      
      
      Tobias Waldekranz says:
      
      ====================
      net: dsa: mv88e6xxx: Add "eth-mac" and "rmon" counter group support
      
      The majority of the changes (2/8) are about refactoring the existing
      ethtool statistics support to make it possible to read individual
      counters, rather than the whole set.
      
      4/8 tries to collect all information about a stat in a single place
      using a mapper macro, which is then used to generate the original list
      of stats, along with a matching enum. checkpatch is less than amused
      with this construct, but prior art exists (__BPF_FUNC_MAPPER in
      include/uapi/linux/bpf.h, for example).
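
The mapper-macro construct described above is commonly known as an X-macro. A minimal sketch of the idea follows; every name in it (`DEMO_STAT_MAPPER`, `demo_stat_id`, `demo_stat_names`, the three counters) is hypothetical and chosen for illustration, not taken from the mv88e6xxx driver.

```c
#include <stddef.h>

/*
 * Hypothetical stat list: each entry carries everything known about one
 * stat (here just an enum suffix and a display name) in a single place.
 */
#define DEMO_STAT_MAPPER(_fn)		\
	_fn(RX_GOOD, "rx_good")		\
	_fn(RX_BAD,  "rx_bad")		\
	_fn(TX_GOOD, "tx_good")

/* First expansion: one enum value per stat, plus a trailing count. */
#define DEMO_STAT_ENUM(_id, _name) DEMO_STAT_ID_ ## _id,
enum demo_stat_id {
	DEMO_STAT_MAPPER(DEMO_STAT_ENUM)
	DEMO_STAT_ID_MAX
};

/* Second expansion: a name table indexed by the matching enum value. */
#define DEMO_STAT_NAME(_id, _name) [DEMO_STAT_ID_ ## _id] = _name,
static const char *const demo_stat_names[] = {
	DEMO_STAT_MAPPER(DEMO_STAT_NAME)
};

/* Look up one stat by ID; returns NULL for an out-of-range ID. */
static const char *demo_stat_name(enum demo_stat_id id)
{
	if ((unsigned int)id >= DEMO_STAT_ID_MAX)
		return NULL;
	return demo_stat_names[id];
}
```

Because the enum and the table are generated from the same list, adding a stat means touching one line, and the two can never drift out of sync. That is also what makes addressing an individual counter by ID straightforward.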
      
      To support the histogram counters from the "rmon" group, we have to
      change mv88e6xxx's configuration of them. Instead of counting rx and
      tx, we restrict them to rx-only. 6/8 has the details.
      
      With that in place, adding the actual counter groups is pretty
      straightforward (5,7/8).
      
      Tie it all together with a selftest (8/8).
      
      v3 -> v4:
      - Return size_t from mv88e6xxx_stats_get_stats
      - Spelling errors in commit message of 6/8
      - Improve selftest:
        - Report progress per-bucket
        - Test both ports in the pair
        - Increase MTU, if required
      
      v2 -> v3:
      - Added 6/8
      - Added 8/8
      
      v1 -> v2:
      - Added 1/6
      - Added 3/6
      - Changed prototype of stats operation to reflect the fact that the
        number of stats read is returned, never an error
      - Moved comma into MV88E6XXX_HW_STAT_MAPPER definition
      - Avoid the construction of mapping table iteration which relied on
        struct layouts outside of mv88e6xxx's control
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b84d66b0