Skip to content
  1. Jul 02, 2023
  2. Jul 01, 2023
  3. Jun 30, 2023
  4. Jun 29, 2023
    • Paolo Abeni's avatar
      Merge branch 'fix-ptp-received-on-wrong-port-with-bridged-sja1105-dsa' · 5998bb76
      Paolo Abeni authored
      Vladimir Oltean says:
      
      ====================
      Fix PTP received on wrong port with bridged SJA1105 DSA
      
      Since the changes were made to tag_8021q to support imprecise RX for
      bridged ports, the tag_sja1105 driver still prefers the source port
      information deduced from the VLAN headers for link-local traffic, even
      though the switch can theoretically do better and report the precise
      source port.
      
      The problem is that the tagger doesn't know when to trust one source of
      information over another, because the INCL_SRCPT option (to "tag" link
      local frames) is sometimes enabled and sometimes it isn't.
      
      The first patch makes the switch provide the hardware tag for link local
      traffic under all circumstances, and the second patch makes the tagger
      always use that hardware tag as primary source of information for link
      local packets.
      ====================
      
      Link: https://lore.kernel.org/r/20230627094207.3385231-1-vladimir.oltean@nxp.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      5998bb76
    • Vladimir Oltean's avatar
      net: dsa: tag_sja1105: always prefer source port information from INCL_SRCPT · c1ae02d8
      Vladimir Oltean authored
      
      
      Currently the sja1105 tagging protocol prefers using the source port
      information from the VLAN header if that is available, falling back to
      the INCL_SRCPT option if it isn't. The VLAN header is available for all
      frames except for META frames initiated by the switch (containing RX
      timestamps), and thus, the "if (is_link_local)" branch is practically
      dead.
      
      The tag_8021q source port identification has become more loose
      ("imprecise") and will report a plausible rather than exact bridge port,
      when under a bridge (be it VLAN-aware or VLAN-unaware). But link-local
      traffic always needs to know the precise source port. With incorrect
      source port reporting, for example PTP traffic over 2 bridged ports will
      all be seen on sockets opened on the first such port, which is incorrect.
      
      Now that the tagging protocol has been changed to make link-local frames
      always contain source port information, we can reverse the order of the
      checks so that we always give precedence to that information (which is
      always precise) in lieu of the tag_8021q VID which is only precise for a
      standalone port.
      
      Fixes: d7f9787a ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID")
      Fixes: 91495f21 ("net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      c1ae02d8
    • Vladimir Oltean's avatar
      net: dsa: sja1105: always enable the INCL_SRCPT option · b4638af8
      Vladimir Oltean authored
      
      
      Link-local traffic on bridged SJA1105 ports is sometimes tagged by the
      hardware with source port information (when the port is under a VLAN
      aware bridge).
      
      The tag_8021q source port identification has become more loose
      ("imprecise") and will report a plausible rather than exact bridge port,
      when under a bridge (be it VLAN-aware or VLAN-unaware). But link-local
      traffic always needs to know the precise source port.
      
      Modify the driver logic (and therefore: the tagging protocol itself) to
      always include the source port information with link-local packets,
      regardless of whether the port is standalone, under a VLAN-aware or
      VLAN-unaware bridge. This makes it possible for the tagging driver to
      give priority to that information over the tag_8021q VLAN header.
      
      The big drawback with INCL_SRCPT is that it makes it impossible to
      distinguish between an original MAC DA of 01:80:C2:XX:YY:ZZ and
      01:80:C2:AA:BB:ZZ, because the tagger just patches MAC DA bytes 3 and 4
      with zeroes. Only if PTP RX timestamping is enabled, the switch will
      generate a META follow-up frame containing the RX timestamp and the
      original bytes 3 and 4 of the MAC DA. Those will be used to patch up the
      original packet. Nonetheless, in the absence of PTP RX timestamping, we
      have to live with this limitation, since it is more important to have
      the more precise source port information for link-local traffic.
      
      Fixes: d7f9787a ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID")
      Fixes: 91495f21 ("net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      b4638af8
    • Paolo Abeni's avatar
      Merge branch 'fix-ptp-packet-drops-with-ocelot-8021q-dsa-tag-protocol' · e999c897
      Paolo Abeni authored
      Vladimir Oltean says:
      
      ====================
      Fix PTP packet drops with ocelot-8021q DSA tag protocol
      
      Changes in v2:
      - Distinguish between L2 and L4 PTP packets
      v1 at:
      https://lore.kernel.org/netdev/20230626154003.3153076-1-vladimir.oltean@nxp.com/
      
      Patch 3/3 fixes an issue with the ocelot/felix driver, where it would
      drop PTP traffic on RX unless hardware timestamping for that packet type
      was enabled.
      
      Fixing that requires the driver to know whether it had previously
      configured the hardware to timestamp PTP packets on that port. But it
      cannot correctly determine that today using the existing code structure,
      so patches 1/3 and 2/3 fix the control path of the code such that
      ocelot->ports[port]->trap_proto faithfully reflects whether that
      configuration took place.
      ====================
      
      Link: https://lore.kernel.org/r/20230627163114.3561597-1-vladimir.oltean@nxp.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      e999c897
    • Vladimir Oltean's avatar
      net: dsa: felix: don't drop PTP frames with tag_8021q when RX timestamping is disabled · 2edcfcbb
      Vladimir Oltean authored
      
      
      The driver implements a workaround for the fact that it doesn't have an
      IRQ source to tell it whether PTP frames are available through the
      extraction registers, for those frames to be processed and passed
      towards the network stack. That workaround is to configure the switch,
      through felix_hwtstamp_set() -> felix_update_trapping_destinations(),
      to create two copies of PTP packets: one sent over Ethernet to the DSA
      master, and one to be consumed through the aforementioned CPU extraction
      queue registers.
      
      The reason why we want PTP packets to be consumed through the CPU
      extraction registers in the first place is because we want to see their
      hardware RX timestamp. With tag_8021q, that is only visible that way,
      and it isn't visible with the copy of the packet that's transmitted over
      Ethernet.
      
      The problem with the workaround implementation is that it drops the
      packet received over Ethernet, in expectation of its copy being present
      in the CPU extraction registers. However, if felix_hwtstamp_set() hasn't
      run (aka PTP RX timestamping is disabled), the driver will drop the
      original PTP frame and there will be no copy of it in the CPU extraction
      registers. So, the network stack will simply not see any PTP frame.
      
      Look at the port's trapping configuration to see whether the driver has
      previously enabled the CPU extraction registers. If it hasn't, just
      don't RX timestamp the frame and let it be passed up the stack by DSA,
      which is perfectly fine.
      
      Fixes: 0a6f17c6 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      2edcfcbb
    • Vladimir Oltean's avatar
      net: mscc: ocelot: don't keep PTP configuration of all ports in single structure · 45d0fcb5
      Vladimir Oltean authored
      
      
      In a future change, the driver will need to determine whether PTP RX
      timestamping is enabled on a port (including whether traps were set up
      on that port in particular) and that is currently not possible.
      
      The driver supports different RX filters (L2, L4) and kinds of TX
      timestamping (one-step, two-step) on its ports, but it saves all
      configuration in a single struct hwtstamp_config that is global to the
      switch. So, the latest timestamping configuration on one port
      (including a request to disable timestamping) affects what gets reported
      for all ports, even though the configuration itself is still individual
      to each port.
      
      The port timestamping configurations are only coupled because of the
      common structure, so replace the hwtstamp_config with a mask of trapped
      protocols saved per port. We also have the ptp_cmd to distinguish
      between one-step and two-step PTP timestamping, so with those 2 bits of
      information we can fully reconstruct a descriptive struct
      hwtstamp_config for each port, during the SIOCGHWTSTAMP ioctl.
      
      Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support")
      Fixes: 96ca08c0 ("net: mscc: ocelot: set up traps for PTP packets")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      45d0fcb5
    • Vladimir Oltean's avatar
      net: mscc: ocelot: don't report that RX timestamping is enabled by default · 4fd44b82
      Vladimir Oltean authored
      
      
      PTP RX timestamping should be enabled when the user requests it, not by
      default. If it is enabled by default, it can be problematic when the
      ocelot driver is a DSA master, and it sidesteps what DSA tries to avoid
      through __dsa_master_hwtstamp_validate().
      
      Additionally, after the change which made ocelot trap PTP packets only
      to the CPU at ocelot_hwtstamp_set() time, it is no longer even true that
      RX timestamping is enabled by default, because until ocelot_hwtstamp_set()
      is called, the PTP traps are actually not set up. So the rx_filter field
      of ocelot->hwtstamp_config reflects an incorrect reality.
      
      Fixes: 96ca08c0 ("net: mscc: ocelot: set up traps for PTP packets")
      Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      4fd44b82
    • Paolo Abeni's avatar
      Merge branch 'net-sched-act_ipt-bug-fixes' · 3c4bb45a
      Paolo Abeni authored
      Florian Westphal says:
      
      ====================
      net/sched: act_ipt bug fixes
      
      v3: prefer skb_header() helper in patch 2.  No other changes.
      I've retained Acks and RvB-Tags of v2.
      
      While checking if netfilter could be updated to replace selected
      instances of NF_DROP with kfree_skb_reason+NF_STOLEN to improve
      debugging info via drop monitor I found that act_ipt is incompatible
      with such an approach.  Moreover, it lacks multiple sanity checks
      to avoid certain code paths that make assumptions that the tc layer
      doesn't meet, such as header sanity checks, availability of skb_dst,
      skb_nfct() and so on.
      
      act_ipt test in the tc selftest still pass with this applied.
      
      I think that we should consider removal of this module, while
      this should take care of all problems, its ipv4 only and I don't
      think there are any netfilter targets that lack a native tc
      equivalent, even when ignoring bpf.
      ====================
      
      Link: https://lore.kernel.org/r/20230627123813.3036-1-fw@strlen.de
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      3c4bb45a
    • Florian Westphal's avatar
      net/sched: act_ipt: zero skb->cb before calling target · 93d75d47
      Florian Westphal authored
      
      
      xtables relies on skb being owned by ip stack, i.e. with ipv4
      check in place skb->cb is supposed to be IPCB.
      
      I don't see an immediate problem (REJECT target cannot be used anymore
      now that PRE/POSTROUTING hook validation has been fixed), but better be
      safe than sorry.
      
      A much better patch would be to either mark act_ipt as
      "depends on BROKEN" or remove it altogether. I plan to do this
      for -next in the near future.
      
      This tc extension is broken in the sense that tc lacks an
      equivalent of NF_STOLEN verdict.
      
      With NF_STOLEN, target function takes complete ownership of skb, caller
      cannot dereference it anymore.
      
      ACT_STOLEN cannot be used for this: it has a different meaning, caller
      is allowed to dereference the skb.
      
      At this time NF_STOLEN won't be returned by any targets as far as I can
      see, but this may change in the future.
      
      It might be possible to work around this via list of allowed
      target extensions known to only return DROP or ACCEPT verdicts, but this
      is error prone/fragile.
      
      Existing selftest only validates xt_LOG and act_ipt is restricted
      to ipv4 so I don't think this action is used widely.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      93d75d47
    • Florian Westphal's avatar
      net/sched: act_ipt: add sanity checks on skb before calling target · b2dc32dc
      Florian Westphal authored
      
      
      Netfilter targets make assumptions on the skb state, for example
      iphdr is supposed to be in the linear area.
      
      This is normally done by IP stack, but in act_ipt case no
      such checks are made.
      
      Some targets can even assume that skb_dst will be valid.
      Make a minimum effort to check for this:
      
      - Don't call the targets eval function for non-ipv4 skbs.
      - Don't call the targets eval function for POSTROUTING
        emulation when the skb has no dst set.
      
      v3: use skb_protocol helper (Davide Caratti)
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      b2dc32dc
    • Florian Westphal's avatar
      net/sched: act_ipt: add sanity checks on table name and hook locations · b4ee9338
      Florian Westphal authored
      
      
      Looks like "tc" hard-codes "mangle" as the only supported table
      name, but on kernel side there are no checks.
      
      This is wrong.  Not all xtables targets are safe to call from tc.
      E.g. "nat" targets assume skb has a conntrack object assigned to it.
      Normally those get called from netfilter nat core which consults the
      nat table to obtain the address mapping.
      
      "tc" userspace either sets PRE or POSTROUTING as hook number, but there
      is no validation of this on kernel side, so update netlink policy to
      reject bogus numbers.  Some targets may assume skb_dst is set for
      input/forward hooks, so prevent those from being used.
      
      act_ipt uses the hook number in two places:
      1. the state hook number, this is fine as-is
      2. to set par.hook_mask
      
      The latter is a bit mask, so update the assignment to make
      xt_check_target() to the right thing.
      
      Followup patch adds required checks for the skb/packet headers before
      calling the targets evaluation function.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      b4ee9338