Skip to content
Commit b3bce6a3 authored by Roman Tereshin's avatar Roman Tereshin
Browse files

[MachineVerifier] Doing ::calcRegsPassed over faster sets: ~15-20% faster MV, NFC

MachineVerifier still takes 45-50% of total compile time with
-verify-machineinstrs, with calcRegsPassed dataflow taking ~50-60% of
MachineVerifier.

The majority of that time is spent in BBInfo::addPassed, mostly within
DenseSet implementing the sets the dataflow is operating over.

In particular, 1/4 of that DenseSet time is spent just iterating over it
(operator++), 40-50% on insertions, and most of the rest in ::count.

Given that, we're implementing custom sets just for this analysis here,
focusing on cheap insertions and O(n) iteration time (as opposed to
O(U), where U is the universe).

As it's based _mostly_ on BitVector for sparse and SmallVector for
dense, it may remotely resemble SparseSet. The difference is, our
solution is a lot less clever, doesn't have constant time `clear` that
we won't use anyway as reusing these sets across analyses is cumbersome,
and thus more space efficient and safer (got a resizable Universe and a
fallback to DenseSet for sparse if it gets too big).

With this patch MachineVerifier gets ~15-20% faster, its contribution to
total compile time drops from 45-50% to ~35%, while contribution of
calcRegsPassed to MachineVerifier drops from 50-60% to ~35% as well.

calcRegsPassed itself gets another 2x faster here.

All measured on a large suite of shaders targeting a number of GPUs.

Reviewers: bogner, stoklund, rudkx, qcolombet

Reviewed By: rudkx

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75033
parent 50cac248
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment