[MachineVerifier] Doing ::calcRegsPassed over faster sets: ~15-20% faster MV, NFC
MachineVerifier still takes 45-50% of total compile time with -verify-machineinstrs, with calcRegsPassed dataflow taking ~50-60% of MachineVerifier. The majority of that time is spent in BBInfo::addPassed, mostly within DenseSet implementing the sets the dataflow is operating over. In particular, 1/4 of that DenseSet time is spent just iterating over it (operator++), 40-50% on insertions, and most of the rest in ::count. Given that, we're implementing custom sets just for this analysis here, focusing on cheap insertions and O(n) iteration time (as opposed to O(U), where U is the universe). As it's based _mostly_ on BitVector for sparse and SmallVector for dense, it may remotely resemble SparseSet. The difference is, our solution is a lot less clever, doesn't have constant time `clear` that we won't use anyway as reusing these sets across analyses is cumbersome, and thus more space efficient and safer (got a resizable Universe and a fallback to DenseSet for sparse if it gets too big). With this patch MachineVerifier gets ~15-20% faster, its contribution to total compile time drops from 45-50% to ~35%, while contribution of calcRegsPassed to MachineVerifier drops from 50-60% to ~35% as well. calcRegsPassed itself gets another 2x faster here. All measured on a large suite of shaders targeting a number of GPUs. Reviewers: bogner, stoklund, rudkx, qcolombet Reviewed By: rudkx Tags: #llvm Differential Revision: https://reviews.llvm.org/D75033
Loading
Please register or sign in to comment