Here I want to present my thoughts on the decompiler techniques I’d like to see. Maybe a lot of this is already implemented somewhere, but I haven’t seen a working decompiler yet.
- The ability to load an existing disassembly instead of the decompiler disassembling the binary by itself.
- A good flow analyzer. REC, for example, produces a lot of silly gotos. Is it so hard to build a directed graph of the blocks and separate out conditional code and loops? IDA does so. And it’s pretty easy to recognize typical schemes like
- Watching ‘live’ registers. Each instruction may affect some registers and flags, but some of them won’t be needed later (for example, sometimes a subtraction is used only to modify a value, and sometimes the resulting flags are checked as well). And a block of instructions may depend on register values set earlier (if they are not overwritten before being used). Boomerang had something like this, but the resulting code was too LISPy.
- Reiterations – if the decompiler finds out that a function uses registers for passing parameters, then the code must be changed to reflect this.
- Pattern recognition – it would be very nice if the decompiler could recognize the same patterns throughout the code (of the form: A = B + constant; B = A | constant;). And if it could also automatically label bit-reading functions… But I fear that this is an AI-complete problem.
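To make the ‘live’ registers point a bit more concrete, here is a minimal sketch in Python of a backward pass over one straight-line block. Everything in it is my own illustration and not taken from any existing decompiler: the `Insn` record, the `dead_writes` function and the hand-written read/write sets are hypothetical, memory is ignored, and a real tool would have to derive those sets from the instruction semantics and iterate over the whole flow graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Hypothetical instruction record: text plus the locations it reads and writes."""
    text: str
    reads: frozenset
    writes: frozenset


def dead_writes(block, live_out):
    """Walk the block backwards and report, per instruction, the written
    locations that nobody reads before they are overwritten or the block ends.
    Also returns the set of locations that must be set before the block runs."""
    live = set(live_out)
    result = []
    for insn in reversed(block):
        dead = insn.writes - live            # written here, never needed later
        result.append((insn, dead))
        # Whatever this instruction writes is no longer needed from above;
        # whatever it reads becomes needed from above.
        live = (live - insn.writes) | insn.reads
    result.reverse()
    return result, live


# Read/write sets are filled in by hand for this example (an assumption).
BLOCK = [
    Insn("sub eax, 4",     frozenset({"eax"}),        frozenset({"eax", "flags"})),
    Insn("mov [ebx], eax", frozenset({"eax", "ebx"}), frozenset()),
    Insn("sub ecx, 1",     frozenset({"ecx"}),        frozenset({"ecx", "flags"})),
    Insn("jnz loop",       frozenset({"flags"}),      frozenset()),
]

if __name__ == "__main__":
    info, live_in = dead_writes(BLOCK, live_out={"ecx"})
    for insn, dead in info:
        print(f"{insn.text:16} dead writes: {sorted(dead)}")
    print("block depends on:", sorted(live_in))
```

In this block the flags set by the first `sub` come out as dead, so it can be shown as a plain `eax -= 4`, while the flags of the second `sub` are consumed by `jnz` and have to become a condition. The live-in set (`eax`, `ebx`, `ecx`) is what the block depends on from earlier code.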
Well, my rant ends here. Back to work.