The intermediate language is called microcode.
Each processor instruction is represented by several microinstructions in the microcode.
The decompiler performs a very straightforward sequence of steps:
1. Generate microcode
2. Transform microcode (optimize, resolve memrefs, analyze calls, etc.)
3. Allocate local variables
4. Generate ctree (it is very similar to the AST used in compilers)
5. Beautify ctree, make it more readable
6. Print ctree
Before we delve into details, let us justify the use of microcode. Why do we need an intermediate
language when building a decompiler? Below are the reasons:
• It helps to get rid of the complexity of processor instructions.
• It also gets rid of processor idiosyncrasies. Below are examples; they all require special handling
one way or another:
  • x86: segment registers, fpu stack
  • ARM: thumb mode addresses
  • PowerPC: multiple copies of CF register (and other condition registers)
  • MIPS: delay slots
  • Sparc: stack windows
• It makes the decompiler portable. We "just" need to replace the microcode generator.
I'm using quotes here because writing a microcode generator is still a complex task.
Note: once different instruction sets have been lowered to the same intermediate language, the optimization rules do not have to be developed again for each of them.
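The portability argument can be illustrated with a small sketch. Everything in it is hypothetical: the tuple format, the two toy generators (`lift_x86`, `lift_arm`) and the rule `drop_add_zero` are invented for illustration and have nothing to do with the real decompiler API. It only shows that a rule written once against the intermediate language serves every front end. Flags are ignored entirely, which a real generator could not do.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical microinstruction format: (opcode, left, right, destination).
Micro = Tuple[str, str, str, str]


def lift_x86(insn: str) -> List[Micro]:
    """Toy x86 generator; understands only 'add reg, imm' and ignores flags."""
    mnem, ops = insn.split(None, 1)
    if mnem.lower() != "add":
        raise ValueError(f"unsupported x86 instruction: {insn}")
    dst, imm = [o.strip() for o in ops.split(",")]
    return [("add", dst, "#" + imm, dst)]


def lift_arm(insn: str) -> List[Micro]:
    """Toy ARM generator; understands only 'ADD Rd, Rn, #imm'."""
    mnem, ops = insn.split(None, 1)
    if mnem.lower() != "add":
        raise ValueError(f"unsupported ARM instruction: {insn}")
    rd, rn, imm = [o.strip() for o in ops.split(",")]
    return [("add", rn.lower(), imm, rd.lower())]


def drop_add_zero(code: List[Micro]) -> List[Micro]:
    """Shared rule: 'add x, #0, x' changes nothing and can be removed."""
    return [m for m in code
            if not (m[0] == "add" and m[2] == "#0" and m[1] == m[3])]


# Only this table (and the generators behind it) is processor specific.
GENERATORS: Dict[str, Callable[[str], List[Micro]]] = {
    "x86": lift_x86,
    "arm": lift_arm,
}


def to_microcode(arch: str, insns: List[str]) -> List[Micro]:
    """Lift with the generator for `arch`, then apply the shared rule."""
    lift = GENERATORS[arch]
    code = [m for insn in insns for m in lift(insn)]
    return drop_add_zero(code)


print(to_microcode("x86", ["add eax, 0", "add ebx, 4"]))
print(to_microcode("arm", ["ADD R0, R0, #0", "ADD R1, R2, #8"]))
```

Supporting another processor in this sketch means adding one more entry to `GENERATORS`; `drop_add_zero` stays untouched.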
Design goals
The main design goal for the microcode was simplicity:
• No processor specific stuff
• One microinstruction does one thing
• Small number of instructions (only 45 in 1999, now 72)
• Simple instruction operands (register, number, memory)
• Consider only compiler generated code
x86 code:
004014FB mov eax, [ebx+4]
004014FE mov dl, [eax+1]
00401501 sub dl, 61h ; 'a'
00401504 jz short loc_401517
Converted to microcode, it looks a bit like RISC code:
2. 0 mov ebx.4, eoff.4 ; 4014FB u=ebx.4 d=eoff.4
2. 1 mov ds.2, seg.2 ; 4014FB u=ds.2 d=seg.2
2. 2 add eoff.4, #4.4, eoff.4 ; 4014FB u=eoff.4 d=eoff.4
2. 3 ldx seg.2, eoff.4, et1.4 ; 4014FB u=eoff.4,seg.2,(STACK,GLBMEM) d=et1.4
2. 4 mov et1.4, eax.4 ; 4014FB u=et1.4 d=eax.4
2. 5 mov eax.4, eoff.4 ; 4014FE u=eax.4 d=eoff.4
2. 6 mov ds.2, seg.2 ; 4014FE u=ds.2 d=seg.2
2. 7 add eoff.4, #1.4, eoff.4 ; 4014FE u=eoff.4 d=eoff.4
2. 8 ldx seg.2, eoff.4, t1.1 ; 4014FE u=eoff.4,seg.2,(STACK,GLBMEM) d=t1.1
2. 9 mov t1.1, dl.1 ; 4014FE u=t1.1 d=dl.1
2.10 mov #0x61.1, t1.1 ; 401501 u= d=t1.1
2.11 setb dl.1, t1.1, cf.1 ; 401501 u=dl.1,t1.1 d=cf.1
2.12 seto dl.1, t1.1, of.1 ; 401501 u=dl.1,t1.1 d=of.1
2.13 sub dl.1, t1.1, dl.1 ; 401501 u=dl.1,t1.1 d=dl.1
2.14 setz dl.1, #0.1, zf.1 ; 401501 u=dl.1 d=zf.1
2.15 setp dl.1, #0.1, pf.1 ; 401501 u=dl.1 d=pf.1
2.16 sets dl.1, sf.1 ; 401501 u=dl.1 d=sf.1
2.17 mov cs.2, seg.2 ; 401504 u=cs.2 d=seg.2
2.18 mov #0x401517.4, eoff.4 ; 401504 u= d=eoff.4
2.19 jcnd zf.1, $loc_401517 ; 401504 u=zf.1
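To make the expansion concrete, here is a minimal sketch that reproduces lines 2.10 to 2.16 of the listing above for `sub dl, 61h`. The function `lift_sub_reg_imm` is hypothetical and simply emits text in the same notation; it is not how the real microcode generator is implemented. It shows why one x86 instruction turns into seven microinstructions: the subtraction itself plus one microinstruction per affected flag.

```python
from typing import List


def lift_sub_reg_imm(reg: str, size: int, imm: int) -> List[str]:
    """Emit microinstruction text for 'sub reg, imm' (toy model).

    The order mirrors the listing: cf and of are computed from the
    operands before the subtraction, zf/pf/sf from the result after it.
    """
    r = f"{reg}.{size}"      # operand with its size suffix, e.g. dl.1
    t = f"t1.{size}"         # temporary register holding the immediate
    zero = f"#0.{size}"
    return [
        f"mov #{imm:#x}.{size}, {t}",
        f"setb {r}, {t}, cf.1",      # carry: unsigned borrow
        f"seto {r}, {t}, of.1",      # signed overflow
        f"sub {r}, {t}, {r}",        # the actual subtraction
        f"setz {r}, {zero}, zf.1",   # zero flag from the result
        f"setp {r}, {zero}, pf.1",   # parity flag from the result
        f"sets {r}, sf.1",           # sign flag from the result
    ]


for line in lift_sub_reg_imm("dl", 1, 0x61):
    print(line)
```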
The 4 processor instructions got translated into 20 microinstructions. Each microinstruction does just one
thing.
This approach simplifies analyzing and optimizing microcode.
However, microcode can represent more complex expressions. Let us see how it looks after the pre-optimization pass:
2. 0 ldx ds.2, (ebx.4+#4.4), eax.4 ; 4014FB u=ebx.4,ds.2,
;(STACK,GLBMEM) d=eax.4
2. 1 ldx ds.2, (eax.4+#1.4), dl.1 ; 4014FE u=eax.4,ds.2,
;(STACK,GLBMEM) d=dl.1
2. 2 setb dl.1, #0x61.1, cf.1 ; 401501 u=dl.1 d=cf.1
2. 3 seto dl.1, #0x61.1, of.1 ; 401501 u=dl.1 d=of.1
2. 4 sub dl.1, #0x61.1, dl.1 ; 401501 u=dl.1 d=dl.1
2. 5 setz dl.1, #0.1, zf.1 ; 401501 u=dl.1 d=zf.1
2. 6 setp dl.1, #0.1, pf.1 ; 401501 u=dl.1 d=pf.1
2. 7 sets dl.1, sf.1 ; 401501 u=dl.1 d=sf.1
2. 8 jcnd zf.1, $loc_401517 ; 401504 u=zf.1
As we see, only 9 microinstructions remain; some intermediate registers disappeared. Sub-instructions
(like eax.4+#1.4) appeared. Overall the code is still too noisy and verbose.
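The way temporaries disappear and sub-instructions appear can be sketched as forward substitution. The code below is an illustration under stated assumptions, not the decompiler's algorithm: the tuple format, the names `propagate` and `merge_moves`, and the set `TEMPS` are made up, operand sizes are omitted, and it assumes that a value copied into a temporary is not overwritten before the temporary's last use (true for this listing).

```python
from typing import Dict, List, Tuple

# Hypothetical format: (opcode, sources, destination); sizes omitted.
Insn = Tuple[str, Tuple[str, ...], str]
TEMPS = {"eoff", "seg", "et1", "t1"}  # temporaries, named as in the listing


def propagate(code: List[Insn]) -> List[Insn]:
    """Fold 'mov'/'add' into temporaries and substitute them at their uses."""
    env: Dict[str, str] = {}  # temporary -> expression it currently holds
    out: List[Insn] = []
    for op, srcs, dst in code:
        srcs = tuple(env.get(s, s) for s in srcs)
        if dst in TEMPS and op == "mov":
            env[dst] = srcs[0]
        elif dst in TEMPS and op == "add":
            env[dst] = f"({srcs[0]}+{srcs[1]})"
        else:
            env.pop(dst, None)  # dst is redefined by an emitted instruction
            out.append((op, srcs, dst))
    return out


def merge_moves(code: List[Insn]) -> List[Insn]:
    """Rewrite 'X ..., tmp' directly followed by 'mov tmp, r' as 'X ..., r'.

    Assumes the temporary is not read again afterwards.
    """
    out: List[Insn] = []
    for op, srcs, dst in code:
        if op == "mov" and srcs[0] in TEMPS and out and out[-1][2] == srcs[0]:
            prev_op, prev_srcs, _ = out[-1]
            out[-1] = (prev_op, prev_srcs, dst)
        else:
            out.append((op, srcs, dst))
    return out


# Microinstructions 2.0 to 2.11 of the first listing.
CODE: List[Insn] = [
    ("mov", ("ebx",), "eoff"),
    ("mov", ("ds",), "seg"),
    ("add", ("eoff", "#4"), "eoff"),
    ("ldx", ("seg", "eoff"), "et1"),
    ("mov", ("et1",), "eax"),
    ("mov", ("eax",), "eoff"),
    ("mov", ("ds",), "seg"),
    ("add", ("eoff", "#1"), "eoff"),
    ("ldx", ("seg", "eoff"), "t1"),
    ("mov", ("t1",), "dl"),
    ("mov", ("#0x61",), "t1"),
    ("setb", ("dl", "t1"), "cf"),
]

for insn in merge_moves(propagate(CODE)):
    print(insn)
```

On this input the twelve microinstructions collapse into three, which correspond to lines 2.0 to 2.2 of the pre-optimized listing.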
After further microcode transformations we have:
2. 1 ldx ds.2{3}, ([ds.2{3}:(ebx.4+#4.4)].4+#1.4), dl.1{5} ; 4014FE
; u=ebx.4,ds.2,(GLBLOW,sp+20..,GLBHIGH) d=dl.1
2. 2 sub dl.1{5}, #0x61.1, dl.1{6} ; 401501 u=dl.1 d=dl.1
2. 3 jz dl.1{6}, #0.1, @7 ; 401504 u=dl.1
(numbers in curly braces are value numbers)
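The cf, of, pf and sf computations no longer appear in this listing. The text does not say which pass removed them, but one plausible explanation is that nothing reads those flags, so they are dead. The sketch below shows that idea as a backward liveness scan over a single block. The tuple format, the function `remove_dead` and the `live_out` set are assumptions made for illustration; a real implementation has to work across the whole control flow graph.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical format: (opcode, sources, destination or None); sizes omitted.
Insn = Tuple[str, Tuple[str, ...], Optional[str]]


def is_location(operand: str) -> bool:
    """Constants (#...) and labels ($...) are not registers."""
    return not operand.startswith(("#", "$"))


def remove_dead(code: List[Insn], live_out: Set[str]) -> List[Insn]:
    """Drop instructions whose result is never read (single block only)."""
    live = set(live_out)
    kept: List[Insn] = []
    for op, srcs, dst in reversed(code):
        if dst is not None and dst not in live:
            continue  # nobody reads this result: drop the instruction
        if dst is not None:
            live.discard(dst)  # the old value of dst dies here
        live.update(s for s in srcs if is_location(s))
        kept.append((op, srcs, dst))
    kept.reverse()
    return kept


# Lines 2.2 to 2.8 of the pre-optimized listing.
BLOCK: List[Insn] = [
    ("setb", ("dl", "#0x61"), "cf"),
    ("seto", ("dl", "#0x61"), "of"),
    ("sub", ("dl", "#0x61"), "dl"),
    ("setz", ("dl", "#0"), "zf"),
    ("setp", ("dl", "#0"), "pf"),
    ("sets", ("dl",), "sf"),
    ("jcnd", ("zf", "$loc_401517"), None),
]

# Assumption: only dl may be read after this block, no flag is.
for insn in remove_dead(BLOCK, live_out={"dl"}):
    print(insn)
```

Only `sub`, `setz` and `jcnd` survive in this sketch; in the listing above, `setz` and `jcnd` additionally appear combined into a single `jz`.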
The final microcode is:
jz [ds.2:([ds.2:(ebx.4+#4.4)].4+#1.4)].1, #0x61.1, @7
I would not call this code "very simple" but it is ready to be translated to ctree. It maps to C in a natural
way. The output will look like this:
if ( argv[1][1] == 'a' )
...
I'm happy to tell you that it is possible to write plugins for the decompiler. Plugins can invoke the
decompiler engine and use the results, or improve the decompiler output. It is also possible to use the
microcode to find the possible register values at any given point, compare blocks of code, etc.
It is possible to hook to the optimization engine and add your own transformation rules. Please check the
Decompiler SDK: it has many examples.
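This is the key point of the article: with microcode available, one can write plugins that optimize it in order to deal with junk instructions, instruction bloat, spurious jumps and similar obfuscation.
In IDA 7.2, microcode is available only to C++ plugins.
IDA 7.3 adds support for writing microcode handlers in Python.
References: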
https://bbs.pediy.com/thread-250625.htm
http://www.hexblog.com/?p=1248
https://i.blackhat.com/us-18/Thu-August-9/us-18-Guilfanov-Decompiler-Internals-Microcode.pdf
https://i.blackhat.com/us-18/Thu-August-9/us-18-Guilfanov-Decompiler-Internals-Microcode-wp.pdf
https://github.com/patois/genmc/blob/master/genmc.py
https://github.com/NeatMonster/MCExplorer/blob/master/mcexplorer.py