I'm amazed that out of so many good responses above, only this one mentions fuzzing. In the context of security, inputs might be non-linear things like adjacent memory, so I don't see any way to be confident about equivalence without substantial fuzzing.
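To make that concrete, here's a minimal differential-fuzzing sketch: feed identical random inputs to both implementations and flag any divergence. The `original` and `rewrite` functions are hypothetical stand-ins; a real harness would invoke the two binaries under test (e.g. via subprocess or ctypes) and use a coverage-guided fuzzer rather than plain random bytes.

    import os

    def original(data: bytes) -> bytes:
        # Placeholder for the reference implementation.
        return data[::-1]

    def rewrite(data: bytes) -> bytes:
        # Placeholder for the candidate "equivalent" implementation.
        return bytes(reversed(data))

    def fuzz_equivalence(trials: int = 100_000, max_len: int = 256) -> None:
        for _ in range(trials):
            # Random-length random input for each trial.
            length = int.from_bytes(os.urandom(2), "big") % max_len
            data = os.urandom(length)
            a, b = original(data), rewrite(data)
            if a != b:
                raise AssertionError(
                    f"divergence on input {data!r}: {a!r} != {b!r}")
        # Absence of divergence raises confidence but proves nothing.
        print(f"no divergence found in {trials} trials")

    if __name__ == "__main__":
        fuzz_equivalence()

Even a harness like this only builds confidence; it can never establish equivalence, which is exactly the problem.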
Honestly, I just don't see a way to formally verify this at all. It sounds like it could be a very useful tool, but I don't see a way for it to be fully confident. But, heck, just getting you 90% of the way toward understanding it with LLMs is still amazing and useful in real life.