
hint bits

端木乐语
2023-12-01
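While testing pg_rewind, I found that it requires turning on the parameter that logs hint bits. I had no idea what hint bits were, so I worked through the English wiki page and annotated it, hoping this helps anyone who needs it. I also talked with people on Alibaba's kernel team, but we never reached a conclusion about exactly why hint-bit logging has to be enabled. If an expert reads this, corrections and pointers are welcome.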

Reference: see the wiki link below:
https://wiki.postgresql.org/wiki/Hint_Bits?spm=5176.100239.blogcont.4.EvuBsU


Hint bits are used to mark tuples as created and/or deleted by transactions that are known committed or aborted. To determine the visibility of a tuple without these bits set, you need to consult pg_clog and possibly pg_subtrans, so it is an expensive check. On the other hand, if the tuple has the bits set, then its state is known (or, at worst, it can be calculated easily from your current snapshot, without looking at pg_clog). -- Note: this addresses a problem that MVCC introduces, namely deciding whether the current row is visible. (Since PostgreSQL 10 the pg_clog directory is named pg_xact.)

There are four hint bits:
 XMIN_COMMITTED -- creating transaction is known committed
 XMIN_ABORTED -- creating transaction is known aborted
 XMAX_COMMITTED -- deleting transaction is known committed
 XMAX_ABORTED -- deleting transaction is known aborted
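
To make the four flags concrete, here is a minimal Python sketch that decodes them from the t_infomask field of a tuple header. The bit values are the ones I believe are defined in src/include/access/htup_details.h, where the two "aborted" flags are spelled HEAP_XMIN_INVALID and HEAP_XMAX_INVALID. The function name decode_hint_bits is made up for this illustration.

```python
# Hint-bit masks within t_infomask (values as I understand htup_details.h).
HEAP_XMIN_COMMITTED = 0x0100
HEAP_XMIN_INVALID = 0x0200    # called XMIN_ABORTED in the wiki's wording
HEAP_XMAX_COMMITTED = 0x0400
HEAP_XMAX_INVALID = 0x0800    # called XMAX_ABORTED in the wiki's wording

# (wiki name, mask) pairs, in bit order.
HINT_FLAGS = [
    ("XMIN_COMMITTED", HEAP_XMIN_COMMITTED),
    ("XMIN_ABORTED", HEAP_XMIN_INVALID),
    ("XMAX_COMMITTED", HEAP_XMAX_COMMITTED),
    ("XMAX_ABORTED", HEAP_XMAX_INVALID),
]


def decode_hint_bits(t_infomask):
    """Return the wiki names of the hint bits set in a t_infomask value."""
    return [name for name, mask in HINT_FLAGS if t_infomask & mask]


# A typical live row: creator known committed, no valid deleter.
print(decode_hint_bits(0x0900))  # ['XMIN_COMMITTED', 'XMAX_ABORTED']
```

A value of 0 decodes to an empty list, which is the "no hint bits set yet" case discussed next.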


If neither of the XMIN bits is set, then either:
 The creating transaction is still in progress, which you can check by examining the list of running transactions in shared memory;
 You are the first one to check since it ended, in which case you need to consult pg_clog to know the transaction's status, and you can update the hint bits if you find out its final state. -- In other words, the transaction has finished but no session has yet examined the row and set its hint bits.
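
The two cases above can be sketched as a small lookup routine. This is a deliberately simplified model, not PostgreSQL's actual visibility code: it ignores snapshots, subtransactions and the XMAX side, and every name in it (HeapTuple, xmin_status, the running_xids set, the clog dict) is an assumption made for illustration.

```python
from dataclasses import dataclass

IN_PROGRESS, COMMITTED, ABORTED = "in progress", "committed", "aborted"


@dataclass
class HeapTuple:
    xmin: int                     # XID of the creating transaction
    xmin_committed: bool = False  # hint bit
    xmin_aborted: bool = False    # hint bit


def xmin_status(tup, running_xids, clog):
    """Resolve the creator's status, setting a hint bit on first lookup.

    running_xids stands in for the running-transaction list in shared memory.
    clog is a dict of finished XIDs and stands in for pg_clog.
    """
    # Cheap path: a hint bit already records the final state.
    if tup.xmin_committed:
        return COMMITTED
    if tup.xmin_aborted:
        return ABORTED
    # No hint bit set: the creating transaction may still be running.
    if tup.xmin in running_xids:
        return IN_PROGRESS
    # Otherwise we are the first to look since it ended: consult the
    # commit log and remember the answer in the tuple itself.
    status = clog[tup.xmin]
    if status == COMMITTED:
        tup.xmin_committed = True
    else:
        tup.xmin_aborted = True
    return status


t = HeapTuple(xmin=100)
print(xmin_status(t, running_xids={102}, clog={100: COMMITTED}))  # committed
print(t.xmin_committed)                                            # True
print(xmin_status(t, running_xids={102}, clog={}))                 # committed
```

The third call passes an empty clog on purpose: once the hint bit is set, the expensive lookup is never reached.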


If the tuple has been marked deleted, then similar remarks apply to the XMAX bits.



Any examination whatsoever of a tuple --- whether by vacuum or any ordinary DML operation --- will update its hint bits to match the commit/abort status of the inserting/deleting transaction(s) as of the instant of the examination. A plain SELECT, count(*), or VACUUM on the entire table will check every tuple for visibility and set its hint bits. -- Note: it follows that an ordinary query can also modify data blocks. The commit status is stored in four bits of t_infomask in the tuple header, but those bits are not set when the transaction ends. They are set later, when a DML statement, a query, or VACUUM scans the tuple.


Another point to note is that the hint bits are checked and set on a per tuple basis. Although a simple scan will visit all the tuples on a page and update all their hint bits at once, piecemeal access (such as fetching single tuples via index scans) might result in many writes of the same page as various hint bits get updated over time.
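
A toy simulation may help show why access pattern matters here. It assumes a worst case in which the page is written out between every single-tuple fetch. The Page class and both access functions are hypothetical and only count how often a dirty page would reach disk.

```python
class Page:
    def __init__(self, n_tuples):
        self.hinted = [False] * n_tuples  # one hint flag per tuple
        self.dirty = False
        self.writes = 0                   # times the page went to disk

    def examine(self, i):
        """Visit tuple i; setting its hint bit dirties the page."""
        if not self.hinted[i]:
            self.hinted[i] = True
            self.dirty = True

    def flush(self):
        """Write the page out if dirty (eviction or checkpoint)."""
        if self.dirty:
            self.writes += 1
            self.dirty = False


def full_scan(page):
    # One pass sets every hint bit before the page is written.
    for i in range(len(page.hinted)):
        page.examine(i)
    page.flush()


def piecemeal(page):
    # Assumed worst case: the page is flushed between every fetch.
    for i in range(len(page.hinted)):
        page.examine(i)
        page.flush()


a, b = Page(50), Page(50)
full_scan(a)
piecemeal(b)
print(a.writes, b.writes)  # 1 50
```

Real workloads fall somewhere between these two extremes, depending on how long the page stays in shared buffers between accesses.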


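When a row-level lock conflict occurs, the session that was waiting also sets hint bits on the modified tuple once it acquires the lock. For example, when two UPDATE statements target the same row, the transaction that gets the lock second sets hint bits on the row the first transaction modified, after the first transaction commits.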


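By default, hint-bit updates on the primary are not written to WAL, so a standby will very likely have to set the same hint bits again on its own copy of the data. As a result, even a read-only standby still writes to disk, although the data itself does not change. In hot standby mode, then, no database is truly read-only. Since version 9.4, the wal_log_hints parameter controls whether hint-bit updates are logged to WAL.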


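With wal_log_hints enabled, the first modification of a data block after a checkpoint writes the whole block to WAL, whether that modification comes from DML or from a so-called hint-bit update. Other modifications do not write the whole block. If data checksums are enabled, hint-bit updates are always WAL-logged and the wal_log_hints setting is ignored. Data checksums are chosen when the database cluster is initialized, with the --data-checksums option of initdb (or of pg_ctl init).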
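
The rule in the previous paragraph can be written down as a small decision function. This is only my reading of the behavior, sketched in Python: the function name and its arguments are hypothetical, and the DML branch assumes full_page_writes is left at its default of on.

```python
def needs_full_page_image(first_change_since_checkpoint, hint_only,
                          wal_log_hints=False, data_checksums=False):
    """Decide whether a page change writes the whole block to WAL."""
    # Only the first change to a block after a checkpoint can trigger it.
    if not first_change_since_checkpoint:
        return False
    # A hint-bit-only change is logged only if one of the two
    # settings is on; checksums make wal_log_hints irrelevant.
    if hint_only:
        return bool(wal_log_hints or data_checksums)
    # Ordinary DML (assumption: full_page_writes = on).
    return True


print(needs_full_page_image(True, hint_only=True))                       # False
print(needs_full_page_image(True, hint_only=True, wal_log_hints=True))   # True
print(needs_full_page_image(True, hint_only=True, data_checksums=True))  # True
print(needs_full_page_image(False, hint_only=True, wal_log_hints=True))  # False
```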



Commit logging

Some details here are in src/backend/access/transam/README:
 "pg_clog records the commit status for each transaction that has been assigned an XID."
 "Transactions and subtransactions are assigned permanent XIDs only when/if they first do something that requires one -- typically, insert/update/delete a tuple, though there are a few other places that need an XID assigned."


pg_clog is updated only at sub or main transaction end. When a transaction ID is assigned, the clog page that would contain it is checked to see whether it already exists; if not, it is initialised.

pg_clog is allocated in pages of 8 kB apiece. Each transaction needs 2 bits, so on an 8 kB page there is space for 4 transactions/byte * 8k bytes = 32k transactions.
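
The arithmetic above also tells you where a given XID's two status bits live. Here is a short Python sketch of that mapping. The constant names follow what I recall from src/backend/access/transam/clog.c, the helper clog_location is hypothetical, and the default 8 kB block size is assumed.

```python
BLCKSZ = 8192                                       # default page size in bytes
CLOG_BITS_PER_XACT = 2                              # status bits per transaction
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT       # 4
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE  # 32768


def clog_location(xid):
    """Return (page number, byte offset in page, bit shift in byte)."""
    page = xid // CLOG_XACTS_PER_PAGE
    byte = (xid % CLOG_XACTS_PER_PAGE) // CLOG_XACTS_PER_BYTE
    shift = (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT
    return page, byte, shift


print(CLOG_XACTS_PER_PAGE)   # 32768
print(clog_location(100))    # (0, 25, 0)
print(clog_location(32769))  # (1, 0, 2)
```

XID 32769 is the second transaction on the second page, so its status sits in byte 0 of page 1, two bits in.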


CLOG pages don't make their way out to disk until the internal CLOG buffers are filled, at which point the least recently used buffer there is evicted to permanent storage.
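
The eviction policy described here is ordinary least-recently-used replacement. A minimal Python sketch follows, assuming every buffered page is dirty and therefore has to be written when it is evicted. The ClogBuffers class is an illustration, not PostgreSQL's SLRU implementation.

```python
from collections import OrderedDict


class ClogBuffers:
    def __init__(self, n_slots):
        self.n_slots = n_slots
        self.slots = OrderedDict()  # page number -> contents, oldest first
        self.written = []           # page numbers flushed to disk, in order

    def touch(self, page_no):
        """Access a CLOG page, evicting the least recently used if full."""
        if page_no in self.slots:
            self.slots.move_to_end(page_no)  # now the most recently used
            return
        if len(self.slots) >= self.n_slots:
            victim, _ = self.slots.popitem(last=False)
            self.written.append(victim)      # eviction writes it out
        self.slots[page_no] = bytearray(8192)


bufs = ClogBuffers(n_slots=2)
for page_no in [0, 1, 0, 2]:
    bufs.touch(page_no)
print(bufs.written)  # [1]
```

Page 0 survives because it was touched again before page 2 arrived, so page 1 is the one that goes to disk.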