>>> from miasm.arch.x86.arch import mn_x86
>>> from miasm.core.locationdb import LocationDB
mn_x86 is the class that represents the x86 architecture and handles x86 instructions.
About LocationDB:
class LocationDB(__builtin__.object)
| LocationDB is a "database" of information associated to location.
|
| An entry in a LocationDB is uniquely identified with a LocKey.
| Additional information which can be associated with a LocKey are:
| - an offset (uniq per LocationDB)
| - several names (each are uniqs per LocationDB)
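# A LocationDB is a database of information associated with locations.
# Each entry in a LocationDB is uniquely identified by a LocKey.
# Additional information that can be attached to a LocKey:
# - an offset (unique within the LocationDB)
# - several names (each unique within the LocationDB)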
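To make the uniqueness rules concrete, here is a toy model in plain Python. It is only a sketch of the rules quoted in the docstring: ToyLocationDB and its add_location method are names I made up for illustration, and this is not how miasm implements LocationDB.

```python
class ToyLocationDB:
    """Toy model only: NOT miasm's LocationDB, just the uniqueness rules."""

    def __init__(self):
        self._next_key = 0
        self._offset_to_key = {}
        self._name_to_key = {}

    def add_location(self, name=None, offset=None):
        # An offset is unique per database: reuse the key if it is already known
        if offset is not None and offset in self._offset_to_key:
            key = self._offset_to_key[offset]
        else:
            key = self._next_key
            self._next_key += 1
            if offset is not None:
                self._offset_to_key[offset] = key
        # A name is unique per database too, but one key may carry several names
        if name is not None:
            owner = self._name_to_key.setdefault(name, key)
            if owner != key:
                raise ValueError("name %r already belongs to key %d" % (name, owner))
        return key


db = ToyLocationDB()
k0 = db.add_location(name="main", offset=0x0)
k1 = db.add_location(name="start", offset=0x0)   # same offset -> same key, second name
k2 = db.add_location(offset=0x10)                # new offset -> new key
print(k0, k1, k2)
```

Reusing the name "main" for the location at 0x10 would raise a ValueError in this sketch, because a name may only point at one key.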
>>> loc_db = LocationDB()
>>> l = mn_x86.fromstring('XOR ECX, ECX', loc_db, 32)
>>> print(l)
XOR ECX, ECX
>>> mn_x86.asm(l)
['1\xc9', '3\xc9', 'g1\xc9', 'g3\xc9']
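mn_x86.fromstring takes three arguments: the instruction string (which must be uppercase), a LocationDB object, and the mode (32 here, for 32-bit mode).
The returned l is a miasm.arch.x86.arch.instruction_x86 object.
mn_x86.asm turns that object into a list of byte strings; every item in the list is a valid encoding of XOR ECX, ECX.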
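The Python 2 repr hides what the four encodings have in common, so here is a small Python 3 sketch that prints them as hex. The bytes literals are copied from the asm output above; nothing here calls miasm.

```python
# The four results from the session above, written as bytes literals
encodings = [b'1\xc9', b'3\xc9', b'g1\xc9', b'g3\xc9']

for enc in encodings:
    print(enc.hex())
```

In hex, '1' is 0x31 and '3' is 0x33, the two opcode forms of XOR (r/m32, r32 and r32, r/m32), and 'g' is 0x67, the address-size prefix, which changes nothing here because no memory operand is involved.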
>>> l.args[0] = mn_x86.regs.EAX
>>> print(l)
XOR EAX, ECX
>>> a = mn_x86.asm(l)
>>> print(a)
['1\xc8', '3\xc1', 'g1\xc8', 'g3\xc1']
l.args[0] is the first operand of the XOR instruction, and mn_x86.regs.EAX represents the EAX register.
>>> print(mn_x86.dis(a[0], 32))
XOR EAX, ECX
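mn_x86.dis is the disassembly method. It takes two arguments: the first is either one item of the list returned by asm above or raw bytecode written as a string; the second selects 32-bit mode.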
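To see why both '1\xc8' and '3\xc1' from the asm output disassemble to XOR EAX, ECX, here is a hand-written decoder for the ModRM byte. It is a minimal sketch in plain Python 3 that only handles these two register-to-register opcodes; decode_xor_reg_reg is a name I made up, not part of miasm.

```python
REGS32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_xor_reg_reg(code):
    """Hand-decode a 2-byte register-to-register XOR (opcodes 0x31 / 0x33 only)."""
    opcode, modrm = code[0], code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 3:
        raise ValueError("only register operands are handled in this sketch")
    if opcode == 0x31:      # XOR r/m32, r32: destination is the r/m field
        dst, src = REGS32[rm], REGS32[reg]
    elif opcode == 0x33:    # XOR r32, r/m32: destination is the reg field
        dst, src = REGS32[reg], REGS32[rm]
    else:
        raise ValueError("not a 32-bit XOR opcode")
    return "XOR %s, %s" % (dst, src)


print(decode_xor_reg_reg(b'1\xc8'))
print(decode_xor_reg_reg(b'3\xc1'))
```

The two opcodes swap the roles of the reg and r/m fields, which is why the same instruction has two encodings with different ModRM bytes.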
>>> from miasm.analysis.machine import Machine
>>> mn = Machine('x86_32').mn
>>> print(mn.dis('\x33\x30', 32))
XOR ESI, DWORD PTR [EAX]
About Machine:
class Machine(__builtin__.object)
| Abstract machine architecture to restrict architecture dependent code
# An abstract machine that confines architecture-dependent code
Machine('x86_32').mn is in fact just mn_x86.
>>> mn = Machine('mips32b').mn
>>> print(mn.dis(b'\x97\xa3\x00 ', "b"))
LHU V1, 0x20(SP)
This shows how to disassemble MIPS bytecode.
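The same four bytes can be taken apart by hand to check the result. This is a sketch in plain Python 3 using the standard MIPS I-type field layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate); the MIPS_REGS table is my own and only lists the two registers needed here.

```python
import struct

MIPS_REGS = {3: "V1", 29: "SP"}   # only the two registers this example needs

word, = struct.unpack(">I", b'\x97\xa3\x00 ')   # ">" = big-endian, as in 'mips32b'
opcode = word >> 26            # top 6 bits
rs = (word >> 21) & 0x1F       # base register
rt = (word >> 16) & 0x1F       # destination register
imm = word & 0xFFFF            # 16-bit offset

print(hex(word), hex(opcode), MIPS_REGS[rt], hex(imm), MIPS_REGS[rs])
```

Opcode 0x25 is LHU, so the fields read as LHU V1, 0x20(SP), matching what mn.dis printed. The trailing space in the byte string is simply the byte 0x20.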
"Intermediate representation" here is what is usually abbreviated as IR.
The official docs use ARM for this part; I switched it to x86_32.
>>> machine = Machine('x86_32')
>>> instr = machine.mn.dis('\x33\xc1', 32)
>>> print(instr)
XOR EAX, ECX
>>> ira = machine.ira(loc_db)
machine.ira takes one argument, a LocationDB.
>>> ircfg = ira.new_ircfg()
ircfg is a miasm.ir.ir.IRCFG object. As the name suggests, it is (presumably) the control-flow graph of the IR :)
>>> ira.add_instr_to_ircfg(instr, ircfg)
As the name says: add instr to ircfg.
>>> for lbl, irblock in ircfg.blocks.items():
...     print(irblock.to_string(loc_db))
...
loc_key_0:
pf = parity((EAX ^ ECX) & 0xFF)
zf = FLAG_EQ_CMP(EAX, ECX)
of = 0x0
EAX = EAX ^ ECX
cf = 0x0
nf = FLAG_SIGN_SUB(EAX ^ ECX, 0x0)
IRDst = loc_key_1
# (So the official docs did have a reason for picking ARM.)
ircfg.blocks is a dict that maps each LocKey to its IRBlock, and ircfg.blocks.items() returns its entries as a list:
[(<LocKey 0>, <miasm.ir.ir.IRBlock object at 0x7fd38cb052d0>)]
So in the for loop, lbl is the LocKey and irblock is a miasm.ir.ir.IRBlock object.
About IRBlock:
class IRBlock(__builtin__.object)
| Intermediate representation block object.
|
| Stand for an intermediate representation basic block.
# An intermediate representation block object.
# It stands for one basic block of the intermediate representation.
>>> for lbl, irblock in ircfg.blocks.items():
...     for assignblk in irblock:
...         rw = assignblk.get_rw()
...         for dst, reads in rw.items():
...             print('read: ', [str(x) for x in reads])
...             print('written:', dst)
...             print()
...
read: ['ECX', 'EAX']
written: pf
read: ['ECX', 'EAX']
written: zf
read: []
written: of
read: ['ECX', 'EAX']
written: nf
read: []
written: cf
read: ['ECX', 'EAX']
written: EAX
read: []
written: IRDst
(Actually quite easy to follow.)
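The read/written pairs line up with what XOR does on real hardware. Below is a hand-written model of those side effects in plain Python 3; it follows the documented x86 behaviour of XOR (CF and OF cleared, ZF/SF/PF computed from the result) and does not evaluate miasm's IR. xor_effects is a name I made up.

```python
def xor_effects(eax, ecx):
    """Model the x86 XOR side effects listed in the IR block, on 32-bit values."""
    result = (eax ^ ecx) & 0xFFFFFFFF
    low_byte = result & 0xFF
    return {
        "EAX": result,
        "zf": int(result == 0),                        # set when EAX == ECX
        "nf": result >> 31,                            # sign bit of the result
        "pf": int(bin(low_byte).count("1") % 2 == 0),  # x86 PF: even number of 1 bits
        "cf": 0,                                       # XOR always clears CF and OF
        "of": 0,
    }


print(xor_effects(5, 5))
print(xor_effects(0x80000000, 1))
```

This is why pf, zf, nf and EAX each read both ECX and EAX, while cf and of read nothing: they are constants.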
00000000 8d4904 lea ecx, [ecx+0x4]
00000003 8d5b01 lea ebx, [ebx+0x1]
00000006 80f901 cmp cl, 0x1
00000009 7405 jz 0x10
0000000b 8d5bff lea ebx, [ebx-1]
0000000e eb03 jmp 0x13
00000010 8d5b01 lea ebx, [ebx+0x1]
00000013 89d8 mov eax, ebx
00000015 c3 ret
>>> s = '\x8dI\x04\x8d[\x01\x80\xf9\x01t\x05\x8d[\xff\xeb\x03\x8d[\x01\x89\xd8\xc3'
>>> from miasm.analysis.binary import Container
>>> c = Container.from_string(s)
>>> c
<miasm.analysis.binary.ContainerUnknown object at 0x7f34cefe6090>
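As a sanity check, the string s can be rebuilt from the hex column of the listing. This is a plain Python 3 sketch (bytes instead of the Python 2 str used in the session) and does not touch miasm.

```python
listing_hex = "8d4904 8d5b01 80f901 7405 8d5bff eb03 8d5b01 89d8 c3"
s = bytes.fromhex(listing_hex)   # fromhex ignores the spaces

print(s)
print(len(s))
```

Printable bytes show up as characters in the literal: 0x49 is 'I', 0x5b is '[' and 0x74 is 't'. The length, 22 bytes, is the 0x16 that appears later as the size of the mapped page.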
About Container:
class Container(__builtin__.object)
| Container abstraction layer
|
| This class aims to offer a common interface for abstracting container
| such as PE or ELF.
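# Container abstraction layer.
# The class aims to offer a common interface that abstracts containers such as PE or ELF.
# As the name suggests, it is basically a container that holds the binary's information.
About Container.from_string: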
from_string(cls, data, *args, **kwargs) method of __builtin__.type instance
Instantiate a container and parse the binary
@data: str containing the binary
Instantiates a container and parses the binary.
@data is a str holding the binary.
>>> from miasm.analysis.machine import Machine
>>> machine = Machine('x86_32')
>>> mdis = machine.dis_engine(c.bin_stream)
>>> asmcfg = mdis.dis_multiblock(0)
>>> for block in asmcfg.blocks:
...     print(block.to_string(asmcfg.loc_db))
...
loc_0
LEA ECX, DWORD PTR [ECX + 0x4]
LEA EBX, DWORD PTR [EBX + 0x1]
CMP CL, 0x1
JZ loc_10
-> c_next:loc_b c_to:loc_10
loc_10
LEA EBX, DWORD PTR [EBX + 0x1]
-> c_next:loc_13
loc_b
LEA EBX, DWORD PTR [EBX + 0xFFFFFFFF]
JMP loc_13
-> c_to:loc_13
loc_13
MOV EAX, EBX
RET
About machine.dis_engine:
It is a member of machine; calling it here gives a miasm.arch.x86.disasm.dis_x86_32 object (mdis above).
About mdis.dis_multiblock:
dis_multiblock(self, offset, blocks=None) method of miasm.arch.x86.disasm.dis_x86_32 instance
Disassemble every block reachable from @offset regarding
specific disasmEngine conditions
Return an AsmCFG instance containing disassembled blocks
@offset: starting offset
@blocks: (optional) AsmCFG instance of already disassembled blocks to
merge with
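# Disassembles every block reachable from @offset,
# subject to the disassembly engine's specific conditions.
# Returns an AsmCFG instance containing the disassembled blocks (much like IRCFG, in my opinion).
# @offset: the starting offset
# @blocks: (optional) an AsmCFG instance of already disassembled blocks to merge with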
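"Every block reachable from @offset" can be pictured with the edges printed in the block dump. The sketch below copies those c_next / c_to edges into a plain dict and walks them with a worklist. It only illustrates reachability on this one graph and is not how miasm's disassembly engine is written; successors and reachable are my own names.

```python
# Edges copied from the "->" lines of the printed blocks
successors = {
    "loc_0":  ["loc_b", "loc_10"],   # c_next (fall through) and c_to (JZ taken)
    "loc_b":  ["loc_13"],            # c_to (JMP)
    "loc_10": ["loc_13"],            # c_next
    "loc_13": [],                    # ends with RET
}


def reachable(start):
    """Collect every block reachable from `start` with a simple worklist."""
    seen, todo = set(), [start]
    while todo:
        block = todo.pop()
        if block not in seen:
            seen.add(block)
            todo.extend(successors[block])
    return seen


print(sorted(reachable("loc_0")))
```

Starting from loc_0 all four blocks are found, while starting from loc_b only loc_b and loc_13 are, which is the difference the @offset argument makes.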
>>> jitter = machine.jitter(jit_type='python')
>>> jitter.init_stack()
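JIT stands for just-in-time compilation: code is compiled while the program is running.
If the first statement fails with "Unsupported jit arch: x86", run it from somewhere other than the miasm source directory and the error goes away.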
>>> run_addr = 0x40000000
>>> from miasm.jitter.csts import PAGE_READ, PAGE_WRITE
>>> jitter.vm.add_memory_page(run_addr, PAGE_READ | PAGE_WRITE, s)
About jitter.vm.add_memory_page:
add_memory_page(...)
add_memory_page(address, access, content [, cmt]) -> Maps a memory page at @address of len(@content) bytes containing @content with protection @access
@cmt is a comment linked to the memory page
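# Maps a memory page at @address, len(@content) bytes long, holding @content and protected by @access.
# @cmt is a comment attached to the memory page (it appears in the Comment column when the vm is printed).
miasm.jitter.csts holds many related constants; see the source for details.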
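PAGE_READ | PAGE_WRITE is an ordinary bitmask. The sketch below shows the idea in plain Python 3 and renders a mask the way the Access column of the jitter.vm dump looks (RW_). The numeric values and the letter X for execute are my assumptions for illustration; check miasm/jitter/csts.py for the real constants. access_to_str is a made-up helper.

```python
# Assumed values for illustration; check miasm/jitter/csts.py for the real ones
PAGE_READ, PAGE_WRITE, PAGE_EXEC = 1, 2, 4


def access_to_str(access):
    """Render an access bitmask like the vm dump's Access column, e.g. 'RW_'."""
    flags = ((PAGE_READ, "R"), (PAGE_WRITE, "W"), (PAGE_EXEC, "X"))
    return "".join(letter if access & bit else "_" for bit, letter in flags)


print(access_to_str(PAGE_READ | PAGE_WRITE))
```

Each flag owns one bit, so OR-ing them combines permissions and AND-ing tests for one.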
def code_sentinelle(jitter):
    jitter.run = False
    jitter.pc = 0
    return True
>>> jitter.add_breakpoint(0x1337beef, code_sentinelle)
>>> jitter.push_uint32_t(0x1337beef)
About add_breakpoint:
add_breakpoint(self, addr, callback) method of miasm.arch.x86.jit.jitter_x86_32 instance
Add a callback associated with addr.
@addr: breakpoint address
@callback: function with definition (jitter instance)
# Adds a callback associated with an address.
# @addr: the breakpoint address
# @callback: a user-defined callback that receives the jitter instance
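Look at the callback: it sets jitter.run to False, which stops the JIT run.
jitter.pc is the program counter. Anyone who has studied the 80x86 knows that the next instruction is addressed as CS:IP.
jitter.push_uint32_t pushes an unsigned 32-bit value onto the stack; jitter.init_stack must be called first. Here the pushed value 0x1337beef acts as a fake return address, so the final RET of the code lands on the breakpoint.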
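The sentinel trick can be replayed without miasm. ToyJitter below is a hypothetical stand-in written in plain Python 3: it keeps a stack, a pc and a table of address callbacks, and its ret method mimics what the final RET does. It reuses the method names from the session only to make the parallel obvious and says nothing about how miasm's jitter works internally.

```python
import struct


class ToyJitter:
    """Hypothetical stand-in for a jitter: a stack, a pc, and address callbacks."""

    def __init__(self):
        self.stack = bytearray()
        self.breakpoints = {}
        self.pc = 0
        self.run = True

    def push_uint32_t(self, value):
        # Store the value as 4 little-endian bytes; the newest value sits at the front
        self.stack[0:0] = struct.pack("<I", value)

    def pop_uint32_t(self):
        value, = struct.unpack("<I", bytes(self.stack[:4]))
        del self.stack[:4]
        return value

    def add_breakpoint(self, addr, callback):
        self.breakpoints[addr] = callback

    def ret(self):
        # What the final RET does: jump to the address on top of the stack
        self.pc = self.pop_uint32_t()
        if self.pc in self.breakpoints:
            self.breakpoints[self.pc](self)


def code_sentinelle(jitter):
    jitter.run = False
    jitter.pc = 0
    return True


toy = ToyJitter()
toy.add_breakpoint(0x1337beef, code_sentinelle)
toy.push_uint32_t(0x1337beef)   # fake return address
toy.ret()                       # the RET lands on the sentinel address
print(toy.run, toy.pc)
```

Nothing is ever mapped at 0x1337beef; the address only needs to be one the code would never reach by accident, so that arriving there means "the function returned".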
>>> jitter.set_trace_log()
About jitter.set_trace_log:
set_trace_log(self, trace_instr=True, trace_regs=True, trace_new_blocks=False) method of miasm.arch.x86.jit.jitter_x86_32 instance
Activate/Deactivate trace log options
@trace_instr: activate instructions tracing log
@trace_regs: activate registers tracing log
@trace_new_blocks: dump new code blocks log
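# Activates/deactivates the trace log options.
# @trace_instr: enables the instruction trace log
# @trace_regs: enables the register trace log
# @trace_new_blocks: dumps a log of new code blocks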
>>> jitter.init_run(run_addr)
>>> jitter.continue_run()
About jitter.continue_run:
continue_run(self, step=False) method of miasm.arch.x86.jit.jitter_x86_32 instance
PRE: init_run.
Continue the run of the current session until iterator returns or run is
set to False.
If step is True, run only one time.
Return the iterator value
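# Precondition: init_run has been called.
# Continues the current session until the iterator returns or run is set to False.
# If step is True, runs a single step only.
# Returns the iterator's value.
The official docs show 64-bit registers in this output, which is probably a mistake on their side...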
>>> jitter.vm
Addr Size Access Comment
0x1230000 0x10000 RW_ Stack
0x40000000 0x16 RW_
>>> hex(jitter.cpu.EAX)
'0x0L'
>>> jitter.cpu.ESI = 12
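The EAX value of 0 can be double-checked by following the listing by hand. The sketch below replays the nine instructions on plain Python 3 ints. It assumes ECX and EBX start at 0, which is my assumption about the initial register state; run_shellcode is a made-up name.

```python
def run_shellcode(ecx=0, ebx=0):
    """Follow the listing instruction by instruction on plain Python ints."""
    mask = 0xFFFFFFFF
    ecx = (ecx + 4) & mask          # lea ecx, [ecx+0x4]
    ebx = (ebx + 1) & mask          # lea ebx, [ebx+0x1]
    if (ecx & 0xFF) == 1:           # cmp cl, 0x1 / jz 0x10
        ebx = (ebx + 1) & mask      # lea ebx, [ebx+0x1]
    else:
        ebx = (ebx - 1) & mask      # lea ebx, [ebx-1] / jmp 0x13
    return ebx                      # mov eax, ebx / ret


print(hex(run_shellcode()))
```

With both registers at 0, CL becomes 4, the JZ is not taken, EBX goes back to 0 and is copied into EAX.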
That's all for now.