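The trellis parameter controls trellis quantization: 0 = off, 1 = enabled only on the final encode of a macroblock, 2 = enabled for all mode decisions.
Reference on trellis quantization: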
http://cbloomrants.blogspot.com/2009/05/05-15-09-trellis-quantization.html
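An example:
{17,3,9,4,0,1,2,0} is the result after quantization.
If the DCT coefficients can be adjusted slightly so that the quantized result becomes
{17,3,9,4,0,1,1,0} or {17,0,8,4,0,0,0,0}, the block takes fewer bits to code (the smaller levels and longer zero runs compress better under entropy coding). The precondition is that the adjustment to the DCT coefficients stays small, so that it does not introduce too much distortion.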
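The trade-off in the example can be sketched as a per-coefficient rate-distortion decision: for each coefficient, try the rounded level q and the next smaller level q-1, and keep whichever minimizes distortion + lambda * bits. This is only a toy illustration, not x264 code: toy_bits() is a made-up bit-cost model, and step and lambda are hypothetical parameters.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bit-cost model (an assumption, not a real entropy coder):
 * one bit for the significance flag, plus two bits per magnitude bit. */
static int toy_bits( int level )
{
    int a = abs( level );
    int bits = 1;
    while( a )
    {
        bits += 2;
        a >>= 1;
    }
    return bits;
}

/* For a uniform quantizer with step size `step`, choose between the
 * rounded level q and q-1 by minimizing d*d + lambda * bits. */
static int toy_rd_level( int coef, int step, int lambda )
{
    int sign = coef < 0 ? -1 : 1;
    int q = (abs( coef ) + step/2) / step;   /* plain rounding */
    int best = 0;
    int64_t best_cost = INT64_MAX;
    for( int level = q > 0 ? q-1 : 0; level <= q; level++ )
    {
        int64_t d = abs( coef ) - (int64_t)level * step;   /* reconstruction error */
        int64_t cost = d*d + (int64_t)lambda * toy_bits( level );
        if( cost < best_cost )
        {
            best_cost = cost;
            best = level;
        }
    }
    return sign * best;
}
```

With lambda = 0 the function just returns the rounded level. As lambda grows, levels shrink toward zero, which is the {…,1,2,0} to {…,1,1,0} kind of change described above.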
Compare this with the x264 source:
x264_quant_4x4_trellis
x264_quant_8x8_trellis
x264_quant_chroma_dc_trellis
x264_quant_luma_dc_trellis
-->quant_trellis_cabac()
-->quant_trellis_cavlc()
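DCT coefficients are split into DC and AC coefficients. The DC coefficient is a single value in the top-left corner: it is the lowest-frequency term and the most important one. The AC coefficients are all the remaining, higher-frequency terms.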
dct[0] = trellis_dc_shortcut( orig_coefs[0], quant_coefs[0], unquant_mf[0], coef_weight2[0], lambda2, cabac_state, cost_sig );
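Parameter notes: quant_mf is the table of quantization multipliers derived from QP. Quantizing with it is a multiply and shift rather than a division, which is cheaper to compute.
quant_coefs holds the coefficients after quantization; orig_coefs holds the coefficients before quantization.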
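To make the "multiply instead of divide" point concrete, here is a minimal sketch of fixed-point quantization. It is an illustration under assumptions, not the x264 routine: make_mf() and quant_mul_shift() are hypothetical helpers, and the 16-bit precision is chosen for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: 16-bit fixed-point reciprocal of the step size,
 * computed once per step size so the per-coefficient work needs no division. */
static uint32_t make_mf( uint32_t step )
{
    return ((1u << 16) + step/2) / step;
}

/* Quantize by multiply-and-shift: approximately (|coef| + bias) / step.
 * `bias` controls rounding (step/2 rounds to nearest, 0 truncates). */
static int quant_mul_shift( int coef, uint32_t mf, uint32_t bias )
{
    uint32_t level = ((uint32_t)abs( coef ) + bias) * mf >> 16;
    return coef < 0 ? -(int)level : (int)level;
}
```

For step = 8, make_mf( 8 ) is 8192, and quant_mul_shift( 100, 8192, 4 ) yields 13, the same as rounding 100/8 = 12.5 upward.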
// The DC coefficient is handled separately here.
static NOINLINE
int trellis_dc_shortcut( int sign_coef, int quant_coef, int unquant_mf, int coef_weight, int lambda2, uint8_t *cabac_state, int cost_sig )
{
    uint64_t bscore = TRELLIS_SCORE_MAX;
    int ret = 0;
    int q = abs( quant_coef );
    for( int abs_level = q-1; abs_level <= q; abs_level++ ) // try q-1 and q to fine-tune the DC level
    {
        int unquant_abs_level = (unquant_mf * abs_level + 128) >> 8;
        /* Optimize rounding for DC coefficients in DC-only luma 4x4/8x8 blocks. */
        int d = sign_coef - ((SIGN(unquant_abs_level, sign_coef) + 8)&~15);
        uint64_t score = (uint64_t)d*d * coef_weight;
        /* code the proposed level, and count how much entropy it would take */
        if( abs_level )
        {
            unsigned f8_bits = cost_sig;
            int prefix = X264_MIN( abs_level - 1, 14 );
            f8_bits += x264_cabac_size_decision_noup2( cabac_state+1, prefix > 0 );
            f8_bits += x264_cabac_size_unary[prefix][cabac_state[5]];
            if( abs_level >= 15 )
                f8_bits += bs_size_ue_big( abs_level - 15 ) << CABAC_SIZE_BITS;
            score += (uint64_t)f8_bits * lambda2 >> ( CABAC_SIZE_BITS - LAMBDA_BITS );
        }
        COPY2_IF_LT( bscore, score, ret, abs_level );
    }
    return SIGN(ret, sign_coef);
}
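The distortion arithmetic in trellis_dc_shortcut() is easier to follow when split into steps. The sketch below restates it with hypothetical helper names (apply_sign, dequant_abs, round_to_16, dc_candidate_error are not x264 functions): dequantize the candidate magnitude with an 8-bit fixed-point multiplier, restore the sign, round to the nearest multiple of 16, and subtract from the original coefficient. It assumes two's-complement integers, so that & ~15 also works for negative values.

```c
#include <stdlib.h>

/* Negate x when ref is negative (the role SIGN() plays in the function above). */
static int apply_sign( int x, int ref )
{
    return ref < 0 ? -x : x;
}

/* Dequantize a magnitude with an 8-bit fixed-point multiplier, rounding to nearest. */
static int dequant_abs( int unquant_mf, int abs_level )
{
    return (unquant_mf * abs_level + 128) >> 8;
}

/* Round a signed value to the nearest multiple of 16 (ties round upward). */
static int round_to_16( int x )
{
    return (x + 8) & ~15;
}

/* Error between the original coefficient and the rounded reconstruction
 * of one candidate level. */
static int dc_candidate_error( int sign_coef, int unquant_mf, int abs_level )
{
    int recon = round_to_16( apply_sign( dequant_abs( unquant_mf, abs_level ), sign_coef ) );
    return sign_coef - recon;
}
```

For example, with unquant_mf = 2560 and abs_level = 3 the dequantized magnitude is 30, which rounds to 32, so an original coefficient of 35 gives an error of 3. The squared error is then weighted and added to the bit cost scaled by lambda2 to form the score.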
// The remaining coefficients, including the AC coefficients, go through the general trellis loop:
#define TRELLIS_LOOP(ctx_hi)\
    for( ; i >= b_ac; i-- )\
    {\
        /* skip 0s: this doesn't affect the output, but saves some unnecessary computation. */\
        if( !quant_coefs[i] )\
        {\
            /* ... body elided in this excerpt ... */\
        }\
        /* ... lines elided in this excerpt ... */\
        uint64_t ssd0[2], ssd1[2];\
        for( int k = 0; k < 2; k++ )\
        {\
            int abs_level = q-1+k;\
            int unquant_abs_level = (((dc?unquant_mf[0]<<1:unquant_mf[zigzag[i]]) * abs_level + 128) >> 8);\
            int d = abs_coef - unquant_abs_level;\
            /* Psy trellis: bias in favor of higher AC coefficients in the reconstructed frame. */\
            if( h->mb.i_psy_trellis && i && !dc && !b_chroma )\
            {\
                int orig_coef = (num_coefs == 64) ? h->mb.pic.fenc_dct8[idx][zigzag[i]] : h->mb.pic.fenc_dct4[idx][zigzag[i]];\
                int predicted_coef = orig_coef - sign_coef;\
                int psy_value = abs(unquant_abs_level + SIGN(predicted_coef, sign_coef));\
                int psy_weight = coef_weight1[zigzag[i]] * h->mb.i_psy_trellis;\
                ssd1[k] = (uint64_t)d*d * coef_weight2[zigzag[i]] - psy_weight * psy_value;\
            }\
            else\
            /* FIXME: for i16x16 dc is this weight optimal? */\
                ssd1[k] = (uint64_t)d*d * (dc?256:coef_weight2[zigzag[i]]);\
            ssd0[k] = ssd1[k];\
            if( !i && !dc && !ctx_hi )\
            {\
                /* Optimize rounding for DC coefficients in DC-only luma 4x4/8x8 blocks. */\
                d = sign_coef - ((SIGN(unquant_abs_level, sign_coef) + 8)&~15);\
                ssd0[k] = (uint64_t)d*d * coef_weight2[zigzag[i]];\
            }\
        }\
// For each coefficient, the two candidate levels q-1 and q are dequantized and scored; the chosen levels become the new quantized AC data.
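The psy-trellis branch of the loop can be read as "squared error minus a reward for keeping energy in the reconstruction". The sketch below restates that branch as a standalone function so the effect can be tried with numbers. The names with_sign_of and psy_cost are hypothetical, and a signed 64-bit result is used here for readability, whereas the macro stores the value in a uint64_t. Reading predicted_coef as the DCT of the prediction (source DCT minus residual DCT) is an interpretation of the variable names, not something stated in the code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Negate x when ref is negative (the role SIGN() plays in the macro). */
static int with_sign_of( int x, int ref )
{
    return ref < 0 ? -x : x;
}

/* Cost of one candidate level for one AC coefficient:
 *   sign_coef          coefficient being quantized (before quantization)
 *   orig_coef          matching coefficient of the source block's DCT
 *   unquant_abs_level  dequantized magnitude of the candidate level
 *   weight2            squared-error weight for this position
 *   psy_weight         psy strength for this position (0 disables the reward) */
static int64_t psy_cost( int sign_coef, int orig_coef, int unquant_abs_level,
                         int weight2, int psy_weight )
{
    int d = abs( sign_coef ) - unquant_abs_level;
    int predicted_coef = orig_coef - sign_coef;
    int psy_value = abs( unquant_abs_level + with_sign_of( predicted_coef, sign_coef ) );
    return (int64_t)d*d * weight2 - (int64_t)psy_weight * psy_value;
}
```

With sign_coef = 40, orig_coef = 100 and weight2 = 16, the candidates 32 and 48 have the same squared-error cost (1024). With psy_weight = 4 the larger candidate becomes cheaper (592 versus 656), which is the "bias in favor of higher AC coefficients" mentioned in the source comment.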