2021SC@SDUSC ZXing Open-Source Code (7): QR Code Encoding (4): Encoder in Detail

庄欣然

2023-12-01

2021SC@SDUSC

Preface: this post continues the analysis of the QR code generation code. It focuses on the Encoder class; see the code comments for details.

1、ALPHANUMERIC_TABLE

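When a QR code is created in alphanumeric mode, the encoder needs the alphanumeric value of each character. ALPHANUMERIC_TABLE maps the characters that alphanumeric mode supports to those values; every other character maps to -1. It is defined as follows: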

private static final int[] ALPHANUMERIC_TABLE = {
      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  // 0x00-0x0f
      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  // 0x10-0x1f
      36, -1, -1, -1, 37, 38, -1, -1, -1, -1, 39, 40, -1, 41, 42, 43,  // 0x20-0x2f
      0,   1,  2,  3,  4,  5,  6,  7,  8,  9, 44, -1, -1, -1, -1, -1,  // 0x30-0x3f
      -1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,  // 0x40-0x4f
      25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, -1, -1, -1, -1, -1,  // 0x50-0x5f
  };
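
The table can also be read as "the position of each character in the 45-character alphanumeric set". The sketch below rebuilds the same mapping from that set; the class name `AlphanumericTableSketch` and the `CHARSET` constant are illustrative and not part of ZXing.

```java
import java.util.Arrays;

final class AlphanumericTableSketch {
  // The 45 characters of QR alphanumeric mode, ordered by their code value (illustrative constant).
  static final String CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

  // Builds a 0x60-entry lookup table indexed by ASCII code; unsupported characters map to -1.
  static int[] buildTable() {
    int[] table = new int[0x60];
    Arrays.fill(table, -1);
    for (int i = 0; i < CHARSET.length(); i++) {
      table[CHARSET.charAt(i)] = i;
    }
    return table;
  }

  public static void main(String[] args) {
    int[] table = buildTable();
    System.out.println(table['A']);  // 10
    System.out.println(table[' ']);  // 36
    System.out.println(table[':']);  // 44
    System.out.println(table['#']);  // -1 (not supported in alphanumeric mode)
  }
}
```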

2、calculateMaskPenalty

calculateMaskPenalty computes the mask penalty. It applies the four penalty rules and sums the results.

  private static int calculateMaskPenalty(ByteMatrix matrix) {
    return MaskUtil.applyMaskPenaltyRule1(matrix) // Rule 1
        + MaskUtil.applyMaskPenaltyRule2(matrix)  // Rule 2
        + MaskUtil.applyMaskPenaltyRule3(matrix)  // Rule 3
        + MaskUtil.applyMaskPenaltyRule4(matrix);  // Rule 4
  }
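
To give a feel for what a single rule contributes, here is a minimal sketch of rule 2, assuming its usual definition: a penalty of 3 for every 2x2 block whose four modules share one colour. The class `MaskPenaltySketch` and the plain `byte[][]` matrix are stand-ins for illustration; this is not the MaskUtil implementation.

```java
final class MaskPenaltySketch {
  // Assumed penalty weight for each same-coloured 2x2 block.
  static final int N2 = 3;

  // Counts every 2x2 block whose four modules have the same value and weights the count by N2.
  static int penaltyRule2(byte[][] modules) {
    int blocks = 0;
    for (int y = 0; y < modules.length - 1; y++) {
      for (int x = 0; x < modules[y].length - 1; x++) {
        byte v = modules[y][x];
        if (v == modules[y][x + 1] && v == modules[y + 1][x] && v == modules[y + 1][x + 1]) {
          blocks++;
        }
      }
    }
    return N2 * blocks;
  }

  public static void main(String[] args) {
    byte[][] allLight = new byte[3][3];          // 3x3 matrix, every module 0
    byte[][] checker = {{0, 1}, {1, 0}};         // no uniform 2x2 block
    System.out.println(penaltyRule2(allLight));  // 12 (four overlapping 2x2 blocks)
    System.out.println(penaltyRule2(checker));   // 0
  }
}
```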

3、encode

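encode is the most important method of the Encoder class. It encodes the content and returns a QRCode object representing the result.
Parameters: content is the text to encode;
ecLevel is the error correction level to use;
hints carries optional settings (character set, GS1 format, QR version, mask pattern).
If encoding cannot succeed, for example because of invalid content or configuration, the method throws a WriterException.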

  public static QRCode encode(String content,
                              ErrorCorrectionLevel ecLevel,
                              Map<EncodeHintType,?> hints) throws WriterException {

    // Determine what character encoding has been specified by the caller, if any
    Charset encoding = DEFAULT_BYTE_MODE_ENCODING;
    boolean hasEncodingHint = hints != null && hints.containsKey(EncodeHintType.CHARACTER_SET);
    if (hasEncodingHint) {
      encoding = Charset.forName(hints.get(EncodeHintType.CHARACTER_SET).toString());
    }
    // Pick an encoding mode appropriate for the content. Note that this will not attempt to use multiple modes / segments.
    Mode mode = chooseMode(content, encoding);
    // This will store the header information, like mode and length, as well as "header" segments like an ECI segment.
    BitArray headerBits = new BitArray();

    // Append ECI segment if applicable
    if (mode == Mode.BYTE && hasEncodingHint) {
      CharacterSetECI eci = CharacterSetECI.getCharacterSetECI(encoding);
      if (eci != null) {
        appendECI(eci, headerBits);
      }
    }
    // Append the FNC1 mode header for GS1 formatted data if applicable
    boolean hasGS1FormatHint = hints != null && hints.containsKey(EncodeHintType.GS1_FORMAT);
    if (hasGS1FormatHint && Boolean.parseBoolean(hints.get(EncodeHintType.GS1_FORMAT).toString())) {
      // GS1 formatted codes are prefixed with a FNC1 in first position mode header
      appendModeInfo(Mode.FNC1_FIRST_POSITION, headerBits);
    }
    // (With ECI in place,) write the mode marker
    appendModeInfo(mode, headerBits);
    // Collect data within the main segment separately, to count its size if needed. Don't add it to the main payload yet.
    BitArray dataBits = new BitArray();
    appendBytes(content, mode, dataBits, encoding);

    Version version;
    if (hints != null && hints.containsKey(EncodeHintType.QR_VERSION)) {
      int versionNumber = Integer.parseInt(hints.get(EncodeHintType.QR_VERSION).toString());
      version = Version.getVersionForNumber(versionNumber);
      int bitsNeeded = calculateBitsNeeded(mode, headerBits, dataBits, version);
      if (!willFit(bitsNeeded, version, ecLevel)) {
        throw new WriterException("Data too big for requested version");
      }
    } else {
      version = recommendVersion(ecLevel, mode, headerBits, dataBits);
    }

    BitArray headerAndDataBits = new BitArray();
    headerAndDataBits.appendBitArray(headerBits);
    // Find the "length" of the main segment and write it
    int numLetters = mode == Mode.BYTE ? dataBits.getSizeInBytes() : content.length();
    appendLengthInfo(numLetters, version, mode, headerAndDataBits);
    // Put the data together into the overall payload
    headerAndDataBits.appendBitArray(dataBits);

    Version.ECBlocks ecBlocks = version.getECBlocksForLevel(ecLevel);
    int numDataBytes = version.getTotalCodewords() - ecBlocks.getTotalECCodewords();
    // Terminate the bits properly.
    terminateBits(numDataBytes, headerAndDataBits);
    // Interleave data bits with error correction code.
    BitArray finalBits = interleaveWithECBytes(headerAndDataBits,
                                               version.getTotalCodewords(),
                                               numDataBytes,
                                               ecBlocks.getNumBlocks());

    QRCode qrCode = new QRCode();

    qrCode.setECLevel(ecLevel);
    qrCode.setMode(mode);
    qrCode.setVersion(version);  
    // Choose the mask pattern and set it on "qrCode".
    int dimension = version.getDimensionForVersion();
    ByteMatrix matrix = new ByteMatrix(dimension, dimension);
    // Allow manual selection of the mask pattern via a hint
    int maskPattern = -1;
    if (hints != null && hints.containsKey(EncodeHintType.QR_MASK_PATTERN)) {
      int hintMaskPattern = Integer.parseInt(hints.get(EncodeHintType.QR_MASK_PATTERN).toString());
      maskPattern = QRCode.isValidMaskPattern(hintMaskPattern) ? hintMaskPattern : -1;
    }
    if (maskPattern == -1) {
      maskPattern = chooseMaskPattern(finalBits, ecLevel, version, matrix);
    }
    qrCode.setMaskPattern(maskPattern);
    // Build the matrix and set it on "qrCode".
    MatrixUtil.buildMatrix(finalBits, ecLevel, version, maskPattern, matrix);
    qrCode.setMatrix(matrix);

    return qrCode;
  }

  private static void appendECI(CharacterSetECI eci, BitArray bits) {
    bits.appendBits(Mode.ECI.getBits(), 4);
    // This is correct for values up to 127, which is all we need now.
    bits.appendBits(eci.getValue(), 8);
  }
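
The header that encode assembles for byte mode with an ECI segment can be written out by hand. The sketch below uses a `StringBuilder` of '0'/'1' characters in place of BitArray; the class and method names are illustrative. The mode indicators 0111 (ECI) and 0100 (byte) come from the QR specification, and 26 is the ECI assignment number for UTF-8.

```java
final class HeaderBitsSketch {
  // 4-bit mode indicators from the QR specification.
  static final int MODE_BYTE = 0x4;
  static final int MODE_ECI = 0x7;

  // Appends the lowest numBits bits of value, most significant bit first.
  static void appendBits(StringBuilder bits, int value, int numBits) {
    for (int i = numBits - 1; i >= 0; i--) {
      bits.append(((value >> i) & 1) == 1 ? '1' : '0');
    }
  }

  // ECI mode indicator, 8-bit ECI value (valid for values up to 127), then the byte mode indicator.
  static String byteModeHeaderWithEci(int eciValue) {
    StringBuilder bits = new StringBuilder();
    appendBits(bits, MODE_ECI, 4);
    appendBits(bits, eciValue, 8);
    appendBits(bits, MODE_BYTE, 4);
    return bits.toString();
  }

  public static void main(String[] args) {
    System.out.println(byteModeHeaderWithEci(26));  // 0111000110100100
  }
}
```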

4、recommendVersion

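Decides the smallest QR code version that can hold all of the provided data, and returns that version.
The difficulty: the version must be known to determine how many bits the length field takes, but the total number of bits must be known to determine the version. The code therefore first assumes the smallest version, then uses that guess to calculate the correct version.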

  private static Version recommendVersion(ErrorCorrectionLevel ecLevel,
                                          Mode mode,
                                          BitArray headerBits,
                                          BitArray dataBits) throws WriterException {
    // First, make a guess at the version by assuming the smallest version (1):
    int provisionalBitsNeeded = calculateBitsNeeded(mode, headerBits, dataBits, Version.getVersionForNumber(1));
    Version provisionalVersion = chooseVersion(provisionalBitsNeeded, ecLevel);

    // Use that guess to calculate the right version.
    int bitsNeeded = calculateBitsNeeded(mode, headerBits, dataBits, provisionalVersion);
    return chooseVersion(bitsNeeded, ecLevel);
  }

  private static int calculateBitsNeeded(Mode mode,
                                         BitArray headerBits,
                                         BitArray dataBits,
                                         Version version) {
    return headerBits.getSize() + mode.getCharacterCountBits(version) + dataBits.getSize();
  }
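
The two-pass guess matters when the payload sits near a boundary where the length field widens. The sketch below replays the idea for byte mode only. It assumes the byte-mode length field is 8 bits for versions 1 to 9 and 16 bits from version 10, and it hard-codes the level-L data codeword capacities of versions 1 to 11 as an illustrative table; all names are hypothetical and do not mirror the Version class.

```java
final class RecommendVersionSketch {
  // Assumed data codeword capacities for versions 1..11 at error correction level L.
  static final int[] DATA_CODEWORDS_L = {19, 34, 55, 80, 108, 136, 156, 194, 232, 274, 324};

  // Width of the byte-mode character count field for a given version.
  static int byteModeCountBits(int version) {
    return version <= 9 ? 8 : 16;
  }

  static int bitsNeeded(int headerBits, int dataBits, int version) {
    return headerBits + byteModeCountBits(version) + dataBits;
  }

  // Smallest version in the table whose data capacity holds the given number of bits.
  static int chooseVersion(int numInputBits) {
    int numBytes = (numInputBits + 7) / 8;
    for (int v = 1; v <= DATA_CODEWORDS_L.length; v++) {
      if (DATA_CODEWORDS_L[v - 1] >= numBytes) {
        return v;
      }
    }
    throw new IllegalArgumentException("Data too big for this sketch's table");
  }

  // Guess with version 1, then recompute with the guessed version's length-field width.
  static int recommendVersion(int headerBits, int dataBits) {
    int provisional = chooseVersion(bitsNeeded(headerBits, dataBits, 1));
    return chooseVersion(bitsNeeded(headerBits, dataBits, provisional));
  }

  public static void main(String[] args) {
    int dataBits = 272 * 8;                                         // 272 payload bytes, 4-bit mode header
    System.out.println(chooseVersion(bitsNeeded(4, dataBits, 1)));  // 10 (provisional guess)
    System.out.println(recommendVersion(4, dataBits));              // 11 (16-bit length field no longer fits in 10)
  }
}
```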

5、getAlphanumericCode

Returns the code point in the table used in alphanumeric mode, or -1 if the table has no corresponding code.

  static int getAlphanumericCode(int code) {
    if (code < ALPHANUMERIC_TABLE.length) {
      return ALPHANUMERIC_TABLE[code];
    }
    return -1;
  }

6、chooseMode

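Chooses the best mode by examining the content. Note that the encoding is only used as a hint: if it is Shift_JIS and the input consists solely of double-byte Kanji, the method returns Mode.KANJI.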

  private static Mode chooseMode(String content, Charset encoding) {
    if (StringUtils.SHIFT_JIS_CHARSET.equals(encoding) && isOnlyDoubleByteKanji(content)) {
      // Choose Kanji mode if all input characters are double-byte
      return Mode.KANJI;
    }
    boolean hasNumeric = false;
    boolean hasAlphanumeric = false;
    for (int i = 0; i < content.length(); ++i) {
      char c = content.charAt(i);
      if (c >= '0' && c <= '9') {
        hasNumeric = true;
      } else if (getAlphanumericCode(c) != -1) {
        hasAlphanumeric = true;
      } else {
        return Mode.BYTE;
      }
    }
    if (hasAlphanumeric) {
      return Mode.ALPHANUMERIC;
    }
    if (hasNumeric) {
      return Mode.NUMERIC;
    }
    return Mode.BYTE;
  }

  private static boolean isOnlyDoubleByteKanji(String content) {
    byte[] bytes = content.getBytes(StringUtils.SHIFT_JIS_CHARSET);
    int length = bytes.length;
    if (length % 2 != 0) {
      return false;
    }
    for (int i = 0; i < length; i += 2) {
      int byte1 = bytes[i] & 0xFF;
      if ((byte1 < 0x81 || byte1 > 0x9F) && (byte1 < 0xE0 || byte1 > 0xEB)) {
        return false;
      }
    }
    return true;
  }
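
The decision logic is easiest to see on a few inputs. The following sketch leaves out the Kanji branch and replaces the table lookup with `String.indexOf` on the 45-character set; the class name and the local `Mode` enum are illustrative only.

```java
final class ChooseModeSketch {
  enum Mode { NUMERIC, ALPHANUMERIC, BYTE }

  // The 45 characters accepted by alphanumeric mode (illustrative constant).
  static final String ALPHANUMERIC_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

  // Same decision order as chooseMode, without the Shift_JIS/Kanji check.
  static Mode chooseMode(String content) {
    boolean hasNumeric = false;
    boolean hasAlphanumeric = false;
    for (int i = 0; i < content.length(); i++) {
      char c = content.charAt(i);
      if (c >= '0' && c <= '9') {
        hasNumeric = true;
      } else if (ALPHANUMERIC_CHARS.indexOf(c) >= 0) {
        hasAlphanumeric = true;
      } else {
        return Mode.BYTE;  // one unsupported character forces byte mode
      }
    }
    if (hasAlphanumeric) {
      return Mode.ALPHANUMERIC;
    }
    return hasNumeric ? Mode.NUMERIC : Mode.BYTE;
  }

  public static void main(String[] args) {
    System.out.println(chooseMode("12345"));        // NUMERIC
    System.out.println(chooseMode("HELLO WORLD"));  // ALPHANUMERIC
    System.out.println(chooseMode("hello"));        // BYTE (lower case is not in the table)
    System.out.println(chooseMode(""));             // BYTE
  }
}
```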

Similar methods such as chooseMaskPattern and chooseVersion are not covered here.

7、Other

willFit
Returns true if the number of input bits fits in a code with the specified version and error correction level.

  private static boolean willFit(int numInputBits, Version version, ErrorCorrectionLevel ecLevel) {
      // numBytes = 196 (version 7-H)
      int numBytes = version.getTotalCodewords();
      // getNumECBytes = 130
      Version.ECBlocks ecBlocks = version.getECBlocksForLevel(ecLevel);
      int numEcBytes = ecBlocks.getTotalECCodewords();
      // getNumDataBytes = 196 - 130 = 66
      int numDataBytes = numBytes - numEcBytes;
      int totalInputBytes = (numInputBits + 7) / 8;
      return numDataBytes >= totalInputBytes;
  }
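
Using the version 7-H figures from the comments (196 total codewords, 130 error correction codewords), the check reduces to simple arithmetic. This sketch takes the two codeword counts as plain integers instead of Version and ErrorCorrectionLevel objects; the class name is illustrative.

```java
final class WillFitSketch {
  // Same arithmetic as willFit, with the codeword counts passed in directly.
  static boolean willFit(int numInputBits, int totalCodewords, int ecCodewords) {
    int numDataBytes = totalCodewords - ecCodewords;
    int totalInputBytes = (numInputBits + 7) / 8;  // round up to whole bytes
    return numDataBytes >= totalInputBytes;
  }

  public static void main(String[] args) {
    // Version 7-H: 196 - 130 = 66 data bytes = 528 bits.
    System.out.println(willFit(528, 196, 130));  // true
    System.out.println(willFit(529, 196, 130));  // false (rounds up to 67 bytes)
  }
}
```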

terminateBits
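Terminates the bit stream: appends the terminator (up to four 0 bits), pads with 0 bits to a byte boundary, then fills the remaining capacity with the alternating pad bytes 0xEC and 0x11.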

  static void terminateBits(int numDataBytes, BitArray bits) throws WriterException {
    int capacity = numDataBytes * 8;
    if (bits.getSize() > capacity) {
      throw new WriterException("data bits cannot fit in the QR Code" + bits.getSize() + " > " +
          capacity);
    }
    for (int i = 0; i < 4 && bits.getSize() < capacity; ++i) {
      bits.appendBit(false);
    }
    // If the last byte isn't 8-bit aligned, add padding bits.
    int numBitsInLastByte = bits.getSize() & 0x07;
    if (numBitsInLastByte > 0) {
      for (int i = numBitsInLastByte; i < 8; i++) {
        bits.appendBit(false);
      }
    }
    // If there is more space, fill it with the defined padding patterns.
    int numPaddingBytes = numDataBytes - bits.getSizeInBytes();
    for (int i = 0; i < numPaddingBytes; ++i) {
      bits.appendBits((i & 0x01) == 0 ? 0xEC : 0x11, 8);
    }
    if (bits.getSize() != capacity) {
      throw new WriterException("Bits size does not equal capacity");
    }
  }
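
The three padding stages can be traced on a tiny input. In this sketch a `String` of '0'/'1' characters stands in for BitArray, and the class and method names are illustrative.

```java
final class TerminateBitsSketch {
  // Terminator, byte alignment, then alternating pad bytes, as in terminateBits.
  static String terminate(String inputBits, int numDataBytes) {
    int capacity = numDataBytes * 8;
    if (inputBits.length() > capacity) {
      throw new IllegalArgumentException("data bits cannot fit: " + inputBits.length() + " > " + capacity);
    }
    StringBuilder bits = new StringBuilder(inputBits);
    // Terminator: up to four 0 bits, fewer if the capacity is reached first.
    for (int i = 0; i < 4 && bits.length() < capacity; i++) {
      bits.append('0');
    }
    // Pad with 0 bits up to the next byte boundary.
    while (bits.length() % 8 != 0) {
      bits.append('0');
    }
    // Fill the remaining bytes with 0xEC and 0x11 alternately.
    int numPaddingBytes = numDataBytes - bits.length() / 8;
    for (int i = 0; i < numPaddingBytes; i++) {
      bits.append((i & 0x01) == 0 ? "11101100" : "00010001");
    }
    return bits.toString();
  }

  public static void main(String[] args) {
    // 5 data bits, 3 bytes of capacity: 4 terminator bits, 7 alignment bits, one 0xEC pad byte.
    System.out.println(terminate("10101", 3));  // 101010000000000011101100
  }
}
```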

getNumDataBytesAndNumECBytesForBlockID
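Gets the number of data bytes and the number of error correction bytes for the block with ID "blockID". The results are stored in "numDataBytesInBlock" and "numECBytesInBlock".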

  static void getNumDataBytesAndNumECBytesForBlockID(int numTotalBytes,
                                                     int numDataBytes,
                                                     int numRSBlocks,
                                                     int blockID,
                                                     int[] numDataBytesInBlock,
                                                     int[] numECBytesInBlock) throws WriterException {
    if (blockID >= numRSBlocks) {
      throw new WriterException("Block ID too large");
    }
    // numRsBlocksInGroup2 = 196 % 5 = 1
    int numRsBlocksInGroup2 = numTotalBytes % numRSBlocks;
    // numRsBlocksInGroup1 = 5 - 1 = 4
    int numRsBlocksInGroup1 = numRSBlocks - numRsBlocksInGroup2;
    // numTotalBytesInGroup1 = 196 / 5 = 39
    int numTotalBytesInGroup1 = numTotalBytes / numRSBlocks;
    // numTotalBytesInGroup2 = 39 + 1 = 40
    int numTotalBytesInGroup2 = numTotalBytesInGroup1 + 1;
    // numDataBytesInGroup1 = 66 / 5 = 13
    int numDataBytesInGroup1 = numDataBytes / numRSBlocks;
    // numDataBytesInGroup2 = 13 + 1 = 14
    int numDataBytesInGroup2 = numDataBytesInGroup1 + 1;
    // numEcBytesInGroup1 = 39 - 13 = 26
    int numEcBytesInGroup1 = numTotalBytesInGroup1 - numDataBytesInGroup1;
    // numEcBytesInGroup2 = 40 - 14 = 26
    int numEcBytesInGroup2 = numTotalBytesInGroup2 - numDataBytesInGroup2;
    // Sanity checks.
    // 26 = 26
    if (numEcBytesInGroup1 != numEcBytesInGroup2) {
      throw new WriterException("EC bytes mismatch");
    }
    // 5 = 4 + 1.
    if (numRSBlocks != numRsBlocksInGroup1 + numRsBlocksInGroup2) {
      throw new WriterException("RS blocks mismatch");
    }
    // 196 = (13 + 26) * 4 + (14 + 26) * 1
    if (numTotalBytes !=
        ((numDataBytesInGroup1 + numEcBytesInGroup1) *
            numRsBlocksInGroup1) +
            ((numDataBytesInGroup2 + numEcBytesInGroup2) *
                numRsBlocksInGroup2)) {
      throw new WriterException("Total bytes mismatch");
    }

    if (blockID < numRsBlocksInGroup1) {
      numDataBytesInBlock[0] = numDataBytesInGroup1;
      numECBytesInBlock[0] = numEcBytesInGroup1;
    } else {
      numDataBytesInBlock[0] = numDataBytesInGroup2;
      numECBytesInBlock[0] = numEcBytesInGroup2;
    }
  }
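
The worked numbers in the comments (version 7-H: 196 total bytes, 66 data bytes, 5 blocks) can be reproduced with a condensed version of the same arithmetic. This sketch returns the pair as an `int[]` instead of writing into output arrays and omits the sanity checks; the names are illustrative.

```java
final class BlockSizesSketch {
  // Returns {dataBytes, ecBytes} for the given block, following the group 1 / group 2 split.
  static int[] sizesForBlock(int numTotalBytes, int numDataBytes, int numRSBlocks, int blockID) {
    if (blockID >= numRSBlocks) {
      throw new IllegalArgumentException("Block ID too large");
    }
    int blocksInGroup2 = numTotalBytes % numRSBlocks;     // blocks holding one extra data byte
    int blocksInGroup1 = numRSBlocks - blocksInGroup2;
    int totalBytesInGroup1 = numTotalBytes / numRSBlocks;
    int dataBytesInGroup1 = numDataBytes / numRSBlocks;
    int ecBytes = totalBytesInGroup1 - dataBytesInGroup1;  // identical for both groups
    int dataBytes = blockID < blocksInGroup1 ? dataBytesInGroup1 : dataBytesInGroup1 + 1;
    return new int[] {dataBytes, ecBytes};
  }

  public static void main(String[] args) {
    int[] first = sizesForBlock(196, 66, 5, 0);
    int[] last = sizesForBlock(196, 66, 5, 4);
    System.out.println(first[0] + " data, " + first[1] + " EC");  // 13 data, 26 EC
    System.out.println(last[0] + " data, " + last[1] + " EC");    // 14 data, 26 EC
  }
}
```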

interleaveWithECBytes
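Interleaves "bits" with the corresponding error correction bytes and returns the interleaved result. The interleaving rule is complicated.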

  static BitArray interleaveWithECBytes(BitArray bits,
                                        int numTotalBytes,
                                        int numDataBytes,
                                        int numRSBlocks) throws WriterException {
    // "bits" must have "numDataBytes" bytes of data.
    if (bits.getSizeInBytes() != numDataBytes) {
      throw new WriterException("Number of bits and data bytes does not match");
    }
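    // Step 1. Divide the data bytes into blocks and generate error correction bytes for them. The divided data byte blocks and error correction byte blocks are stored in "blocks".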
    int dataBytesOffset = 0;
    int maxNumDataBytes = 0;
    int maxNumEcBytes = 0;
    // Since we know the number of Reed-Solomon blocks, we can initialize the list with that capacity.
    Collection<BlockPair> blocks = new ArrayList<>(numRSBlocks);

    for (int i = 0; i < numRSBlocks; ++i) {
      int[] numDataBytesInBlock = new int[1];
      int[] numEcBytesInBlock = new int[1];
      getNumDataBytesAndNumECBytesForBlockID(
          numTotalBytes, numDataBytes, numRSBlocks, i,
          numDataBytesInBlock, numEcBytesInBlock);

      int size = numDataBytesInBlock[0];
      byte[] dataBytes = new byte[size];
      bits.toBytes(8 * dataBytesOffset, dataBytes, 0, size);
      byte[] ecBytes = generateECBytes(dataBytes, numEcBytesInBlock[0]);
      blocks.add(new BlockPair(dataBytes, ecBytes));

      maxNumDataBytes = Math.max(maxNumDataBytes, size);
      maxNumEcBytes = Math.max(maxNumEcBytes, ecBytes.length);
      dataBytesOffset += numDataBytesInBlock[0];
    }
    if (numDataBytes != dataBytesOffset) {
      throw new WriterException("Data bytes does not match offset");
    }

    BitArray result = new BitArray();
    // First, place the data blocks.
    for (int i = 0; i < maxNumDataBytes; ++i) {
      for (BlockPair block : blocks) {
        byte[] dataBytes = block.getDataBytes();
        if (i < dataBytes.length) {
          result.appendBits(dataBytes[i], 8);
        }
      }
    }
    // Then, place the error correction blocks.
    for (int i = 0; i < maxNumEcBytes; ++i) {
      for (BlockPair block : blocks) {
        byte[] ecBytes = block.getErrorCorrectionBytes();
        if (i < ecBytes.length) {
          result.appendBits(ecBytes[i], 8);
        }
      }
    }
    if (numTotalBytes != result.getSizeInBytes()) {  // Should be same.
      throw new WriterException("Interleaving error: " + numTotalBytes + " and " +
          result.getSizeInBytes() + " differ.");
    }

    return result;
  }
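
The placement loops take byte 0 of every block, then byte 1 of every block, and so on, skipping blocks that have run out. A minimal sketch of just that column-wise step, applied to one set of blocks and without Reed-Solomon generation (the class and method names are illustrative):

```java
import java.util.Arrays;
import java.util.List;

final class InterleaveSketch {
  // Emits byte i of each block in turn, for i = 0, 1, 2, ...; shorter blocks are skipped once exhausted.
  static byte[] interleave(List<byte[]> blocks) {
    int maxLength = 0;
    int total = 0;
    for (byte[] block : blocks) {
      maxLength = Math.max(maxLength, block.length);
      total += block.length;
    }
    byte[] result = new byte[total];
    int pos = 0;
    for (int i = 0; i < maxLength; i++) {
      for (byte[] block : blocks) {
        if (i < block.length) {
          result[pos++] = block[i];
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<byte[]> blocks = Arrays.asList(new byte[] {1, 2}, new byte[] {3, 4, 5});
    System.out.println(Arrays.toString(interleave(blocks)));  // [1, 3, 2, 4, 5]
  }
}
```

In interleaveWithECBytes this step runs twice: once over the data bytes of all blocks, then once over their error correction bytes.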

appendModeInfo
Appends the mode info to "bits".

  static void appendModeInfo(Mode mode, BitArray bits) {
    bits.appendBits(mode.getBits(), 4);
  }

appendLengthInfo
Appends the length info to "bits".

  static void appendLengthInfo(int numLetters, Version version, Mode mode, BitArray bits) throws WriterException {
    int numBits = mode.getCharacterCountBits(version);
    if (numLetters >= (1 << numBits)) {
      throw new WriterException(numLetters + " is bigger than " + ((1 << numBits) - 1));
    }
    bits.appendBits(numLetters, numBits);
  }
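
The length field is simply the character count written in a fixed number of bits, with an overflow check. The sketch below takes the field width as a parameter instead of deriving it from Mode and Version, and uses a `String` of bits in place of BitArray; the names are illustrative.

```java
final class LengthInfoSketch {
  // Writes numLetters as a numBits-wide field, most significant bit first.
  static String lengthField(int numLetters, int numBits) {
    if (numLetters >= (1 << numBits)) {
      throw new IllegalArgumentException(numLetters + " is bigger than " + ((1 << numBits) - 1));
    }
    StringBuilder bits = new StringBuilder();
    for (int i = numBits - 1; i >= 0; i--) {
      bits.append(((numLetters >> i) & 1) == 1 ? '1' : '0');
    }
    return bits.toString();
  }

  public static void main(String[] args) {
    // 11 characters in a 9-bit length field.
    System.out.println(lengthField(11, 9));  // 000001011
  }
}
```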

appendBytes
Appends the content, encoded according to "mode" (and, for byte mode, the given character encoding), to "bits".

  static void appendBytes(String content,
                          Mode mode,
                          BitArray bits,
                          Charset encoding) throws WriterException {
    switch (mode) {
      case NUMERIC:
        appendNumericBytes(content, bits);
        break;
      case ALPHANUMERIC:
        appendAlphanumericBytes(content, bits);
        break;
      case BYTE:
        append8BitBytes(content, bits, encoding);
        break;
      case KANJI:
        appendKanjiBytes(content, bits);
        break;
      default:
        throw new WriterException("Invalid mode: " + mode);
    }
  }

  static void appendNumericBytes(CharSequence content, BitArray bits) {
    int length = content.length();
    int i = 0;
    while (i < length) {
      int num1 = content.charAt(i) - '0';
      if (i + 2 < length) {
        // Encode three numeric letters in ten bits.
        int num2 = content.charAt(i + 1) - '0';
        int num3 = content.charAt(i + 2) - '0';
        bits.appendBits(num1 * 100 + num2 * 10 + num3, 10);
        i += 3;
      } else if (i + 1 < length) {
        // Encode two numeric letters in seven bits.
        int num2 = content.charAt(i + 1) - '0';
        bits.appendBits(num1 * 10 + num2, 7);
        i += 2;
      } else {
        // Encode one numeric letter in four bits.
        bits.appendBits(num1, 4);
        i++;
      }
    }
  }

  static void appendAlphanumericBytes(CharSequence content, BitArray bits) throws WriterException {
    int length = content.length();
    int i = 0;
    while (i < length) {
      int code1 = getAlphanumericCode(content.charAt(i));
      if (code1 == -1) {
        throw new WriterException();
      }
      if (i + 1 < length) {
        int code2 = getAlphanumericCode(content.charAt(i + 1));
        if (code2 == -1) {
          throw new WriterException();
        }
        // Encode two alphanumeric letters in 11 bits.
        bits.appendBits(code1 * 45 + code2, 11);
        i += 2;
      } else {
        // Encode one alphanumeric letter in six bits.
        bits.appendBits(code1, 6);
        i++;
      }
    }
  }

  static void append8BitBytes(String content, BitArray bits, Charset encoding) {
    byte[] bytes = content.getBytes(encoding);
    for (byte b : bytes) {
      bits.appendBits(b, 8);
    }
  }

  static void appendKanjiBytes(String content, BitArray bits) throws WriterException {
    byte[] bytes = content.getBytes(StringUtils.SHIFT_JIS_CHARSET);
    if (bytes.length % 2 != 0) {
      throw new WriterException("Kanji byte size not even");
    }
    int maxI = bytes.length - 1; // bytes.length must be even
    for (int i = 0; i < maxI; i += 2) {
      int byte1 = bytes[i] & 0xFF;
      int byte2 = bytes[i + 1] & 0xFF;
      int code = (byte1 << 8) | byte2;
      int subtracted = -1;
      if (code >= 0x8140 && code <= 0x9ffc) {
        subtracted = code - 0x8140;
      } else if (code >= 0xe040 && code <= 0xebbf) {
        subtracted = code - 0xc140;
      }
      if (subtracted == -1) {
        throw new WriterException("Invalid byte sequence");
      }
      int encoded = ((subtracted >> 8) * 0xc0) + (subtracted & 0xff);
      bits.appendBits(encoded, 13);
    }
  }
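
The packing rules of the numeric, alphanumeric and Kanji helpers above can be checked on small inputs. This sketch writes bits into a `StringBuilder` instead of a BitArray, and `packKanji` takes an already-encoded two-byte Shift_JIS code so that it does not depend on the charset being installed. All class and method names are illustrative.

```java
final class SegmentPackingSketch {
  static final String ALPHANUMERIC_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

  // Appends the lowest numBits bits of value, most significant bit first.
  static void appendBits(StringBuilder bits, int value, int numBits) {
    for (int i = numBits - 1; i >= 0; i--) {
      bits.append(((value >> i) & 1) == 1 ? '1' : '0');
    }
  }

  // Groups of 3, 2 or 1 digits become 10, 7 or 4 bits. Assumes the input holds digits only.
  static String packNumeric(String digits) {
    StringBuilder bits = new StringBuilder();
    for (int i = 0; i < digits.length(); i += 3) {
      String group = digits.substring(i, Math.min(i + 3, digits.length()));
      appendBits(bits, Integer.parseInt(group), 3 * group.length() + 1);
    }
    return bits.toString();
  }

  // Pairs become code1 * 45 + code2 in 11 bits; a trailing single character takes 6 bits.
  static String packAlphanumeric(String text) {
    StringBuilder bits = new StringBuilder();
    for (int i = 0; i < text.length(); i += 2) {
      int code1 = ALPHANUMERIC_CHARS.indexOf(text.charAt(i));
      int code2 = i + 1 < text.length() ? ALPHANUMERIC_CHARS.indexOf(text.charAt(i + 1)) : 0;
      if (code1 < 0 || code2 < 0) {
        throw new IllegalArgumentException("Not an alphanumeric-mode character");
      }
      if (i + 1 < text.length()) {
        appendBits(bits, code1 * 45 + code2, 11);
      } else {
        appendBits(bits, code1, 6);
      }
    }
    return bits.toString();
  }

  // Compacts one two-byte Shift_JIS code into 13 bits, as in appendKanjiBytes.
  static String packKanji(int sjisCode) {
    int subtracted;
    if (sjisCode >= 0x8140 && sjisCode <= 0x9ffc) {
      subtracted = sjisCode - 0x8140;
    } else if (sjisCode >= 0xe040 && sjisCode <= 0xebbf) {
      subtracted = sjisCode - 0xc140;
    } else {
      throw new IllegalArgumentException("Invalid byte sequence");
    }
    StringBuilder bits = new StringBuilder();
    appendBits(bits, ((subtracted >> 8) * 0xc0) + (subtracted & 0xff), 13);
    return bits.toString();
  }

  public static void main(String[] args) {
    System.out.println(packNumeric("01234567"));   // 000000110001010110011000011
    System.out.println(packAlphanumeric("AC-42")); // 0011100111011100111001000010
    System.out.println(packKanji(0x935F));         // 0110110011111
  }
}
```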
