Using the Google Protobuf Java Serialization Tool

逑衡

2023-12-01


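Why?

There are several ways to serialize and retrieve structured data:
Use Java serialization. This is the default approach since it is built into the language, but it is relatively inefficient.
Serialize the data to XML. This approach can be very attractive since XML is (sort of) human readable and there are binding libraries for lots of languages. It can be a good choice if you want to share data with other applications or projects. However, XML is notoriously space intensive, and encoding and decoding it can impose a huge performance penalty on applications. Also, navigating an XML DOM tree is considerably more complicated than navigating simple fields in a class.
Use protobuf. Protocol buffers are the flexible, efficient, automated solution to solve exactly this problem. With protocol buffers, you write a .proto description of the data structure you wish to store. From that, the protocol buffer compiler creates a class that implements automatic encoding and parsing of the protocol buffer data with an efficient binary format. The generated class provides getters and setters for the fields that make up a protocol buffer and takes care of the details of reading and writing the protocol buffer as a unit. Importantly, the protocol buffer format supports the idea of extending the format over time in such a way that the code can still read data encoded with the old format. (Old code keeps working after the format changes, which is impressive.)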
https://developers.google.com/protocol-buffers/docs/javatutorial
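
As a rough baseline for the efficiency claim about built-in Java serialization, the sketch below serializes a small object with java.io.ObjectOutputStream and prints the byte count. The PlainPerson class is a hypothetical stand-in that carries the same three values as the Person example at the end of this post, whose protobuf encoding is 13 bytes. The exact size printed may vary with the class name and JVM, so treat it as an illustration rather than a benchmark.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Illustration only: measures the size of built-in Java serialization.
final class JavaSerializationSize {

    // Hypothetical POJO with the same fields as the Person message used later.
    static final class PlainPerson implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String name;
        final String email;

        PlainPerson(int id, String name, String email) {
            this.id = id;
            this.name = name;
            this.email = email;
        }
    }

    // Serializes the value with ObjectOutputStream and returns the raw bytes.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new PlainPerson(1, "nnn", "dddd"));
        System.out.println("Java serialization size: " + bytes.length + " bytes");
    }
}
```

Most of the extra bytes come from the stream header and the class descriptor (class name, field names, and types), which Java serialization writes alongside the values.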

How to use it? https://developers.google.com/protocol-buffers/docs/proto

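Whether you use XML, JSON, or Google Protobuf, the format has to be defined up front. Only with a fixed format can the data be parsed correctly and both sides agree on what each field means. The serialized binary data is exchanged between programs written in different languages, so both sides must stay consistent or they cannot communicate.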

Create a file xxx.proto
The definitions in a .proto file are simple: you add a message for each data structure you want to serialize, then specify a name and a type for each field in the message. In other words, a message defines the fields we want to serialize.
The syntax of this file is similar to C++ or Java.
The following options are set in the proto file: the Java package name and the name of the generated class.

option java_package = "com.test.proto"; // package of the generated Java code
option java_outer_classname = "Person"; // name of the generated data access class

A message is just an aggregate containing a set of typed fields. Many standard simple data types are available as field types, including bool, int32, float, double, and string.

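syntax = "proto2"; // required fields are a proto2 feature
option java_package = "com.test.proto"; // package of the generated Java code
option java_outer_classname = "PersonEntity"; // name of the generated data access class
message Person {
  required int32 id = 1; // required field, must be set before the message is built
  required string name = 2; // required field, must be set before the message is built
  optional string email = 3; // optional field, may be set or left unset
}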

Messages can also be nested inside other messages and combined with enum types. You can add further structure to your messages by using other message types as field types: the Person message contains PhoneNumber messages, while the AddressBook message contains Person messages. You can even define message types nested inside other messages; as you can see, the PhoneNumber type is defined inside Person. You can also define enum types if you want one of your fields to have one of a predefined list of values; here you want to specify that a phone number can be one of MOBILE, HOME, or WORK.

syntax = "proto2";

package tutorial;

option java_package = "com.example.tutorial";
option java_outer_classname = "AddressBookProtos";

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phones = 4;
}

message AddressBook {
  repeated Person people = 1;
}

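The field format is: role type name = tag [default value];
role takes one of three values:
required: a value for the field must be provided, otherwise the message is considered "uninitialized". Trying to build an uninitialized message throws a RuntimeException (UninitializedMessageException in the Java library). Parsing an uninitialized message throws an IOException (InvalidProtocolBufferException).

optional: the field may or may not be set. If an optional field is not set, a default value is used. You can specify your own default, as with the type field in the example. Otherwise a system default is used: zero for numeric types, the empty string for strings, and false for bools. For embedded messages, the default value is the "default instance" or "prototype" of the message, which has none of its fields set.

repeated: the field may be repeated any number of times (including zero).

Note: be very careful when marking a field as required, because the choice is effectively permanent. If you later want to stop writing the field, or change it to optional or repeated, old readers that receive a new message without the field will consider the message incomplete and may reject or drop it.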
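
The three roles can be mimicked in plain Java to make the behavior concrete. The sketch below is a hand-written stand-in, not protoc-generated code: the FieldRulesSketch class, its Builder, and the use of IllegalStateException (in place of the library's own RuntimeException subclass) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hand-written stand-in that mimics proto2 field roles; NOT generated code.
final class FieldRulesSketch {

    static final class Person {
        final String name;          // required
        final String email;         // optional, falls back to ""
        final List<String> phones;  // repeated, zero or more values

        private Person(Builder b) {
            this.name = b.name;
            this.email = b.email == null ? "" : b.email;
            this.phones = Collections.unmodifiableList(new ArrayList<>(b.phones));
        }
    }

    static final class Builder {
        private String name;
        private String email;
        private final List<String> phones = new ArrayList<>();

        Builder setName(String value) { name = value; return this; }
        Builder setEmail(String value) { email = value; return this; }
        Builder addPhone(String value) { phones.add(value); return this; }

        // Refuses to build while a required field is missing.
        Person build() {
            if (name == null) {
                throw new IllegalStateException("Message missing required fields: name");
            }
            return new Person(this);
        }
    }

    public static void main(String[] args) {
        Person ok = new Builder().setName("nnn").addPhone("555-0100").build();
        System.out.println("email default: \"" + ok.email + "\", phones: " + ok.phones);
        try {
            new Builder().setEmail("dddd").build(); // required name never set
        } catch (IllegalStateException e) {
            System.out.println("build failed: " + e.getMessage());
        }
    }
}
```

The real generated builders follow the same pattern: setters return the builder, repeated fields get add methods, and build() performs the required-field check.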
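tag: the "= 1", "= 2" markers on each element identify the unique "tag" that field uses in the binary encoding. Tag numbers 1-15 require one less byte to encode than higher numbers, so as an optimization you can decide to use those tags for the commonly used or repeated elements, leaving tags 16 and higher for less-commonly used optional elements. Each element in a repeated field requires re-encoding the tag number, so repeated fields are particularly good candidates for this optimization. (The saving comes from the encoding itself: the field number and a 3-bit wire type are packed into one varint, and only field numbers up to 15 fit in a single byte.)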
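
The one-byte limit for tags 1-15 can be checked with a few lines of arithmetic. The sketch below computes how many bytes the key (field number shifted left by 3 bits, combined with the wire type) occupies as a varint, which stores 7 payload bits per byte. The class and method names are made up for this example.

```java
// Illustration only: size of a protobuf field key encoded as a varint.
final class TagSizeSketch {

    // Returns the number of bytes needed for the key of the given field number.
    // The wire type occupies the low 3 bits, so it never changes the size.
    static int tagSizeInBytes(int fieldNumber) {
        int key = fieldNumber << 3; // key = (field_number << 3) | wire_type
        int size = 1;
        // Each varint byte carries 7 bits; keep going while higher bits remain.
        while ((key & ~0x7F) != 0) {
            key >>>= 7;
            size++;
        }
        return size;
    }

    public static void main(String[] args) {
        int[] samples = {1, 15, 16, 2047, 2048};
        for (int fieldNumber : samples) {
            System.out.println("field " + fieldNumber + " -> "
                    + tagSizeInBytes(fieldNumber) + " byte(s)");
        }
    }
}
```

Field 15 gives a key of 120, which fits in 7 bits; field 16 gives 128, which needs a second byte. The next step up happens at field 2048.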

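Compiling Your Protocol Buffers (generating the Java or C++ classes you need)

Download the compiler from https://developers.google.com/protocol-buffers/docs/downloads.html
Add the directory that contains protoc.exe to your PATH.

protoc -I=<directory containing the .proto files> --java_out=<output directory for the generated Java files> <path to the .proto file to compile, e.g. addressbook.proto> (several .proto files may be listed)

Example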

E:\Demo\google-proto>protoc.exe -I=E:\Demo\google-proto --java_out=E:\Demo\google-proto E:\Demo\google-proto\person.proto

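syntax = "proto2"; // required fields are a proto2 feature
option java_package = "test.proto"; // package of the generated Java code
option java_outer_classname = "PersonEntity"; // name of the generated data access class
message Person {
  required int32 id = 1; // required field, must be set before the message is built
  required string name = 2; // required field, must be set before the message is built
  optional string email = 3; // optional field, may be set or left unset
}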
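
It helps to know where the generated source lands: protoc writes one file named after java_outer_classname into a directory tree under --java_out that mirrors java_package. The helper below is a small sketch of that mapping; the class and method names are invented for this example, and "out" is a placeholder output directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustration only: where protoc places the generated Java file.
final class GeneratedPathSketch {

    // Maps (--java_out, java_package, java_outer_classname) to the output file.
    static Path generatedFile(Path javaOut, String javaPackage, String outerClassname) {
        Path dir = javaOut;
        // Each package segment becomes one directory level.
        for (String part : javaPackage.split("\\.")) {
            dir = dir.resolve(part);
        }
        return dir.resolve(outerClassname + ".java");
    }

    public static void main(String[] args) {
        Path file = generatedFile(Paths.get("out"), "test.proto", "PersonEntity");
        System.out.println(file);
    }
}
```

For the person.proto above, the file to add to your project is therefore test/proto/PersonEntity.java under the output directory.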

package test.proto;

import com.google.protobuf.InvalidProtocolBufferException;
import lombok.extern.slf4j.Slf4j;

/**
 * Created by wangji on 2017/3/30.
 */
@Slf4j
public class Test {
    public static void main(String[] args) {
        PersonEntity.Person.Builder builder = PersonEntity.Person.newBuilder();
        builder.setName("nnn").setId(1).setEmail("dddd");
        PersonEntity.Person person = builder.build();
        log.info(person.toString());
        for(byte b : person.toByteArray()){
            System.out.print(b);
        }
        log.info(person.toByteString().toString());
        try {
            // Simulate receiving a byte[] and deserializing it back into a Person
            byte[] byteArray = person.toByteArray();
            PersonEntity.Person p2 = PersonEntity.Person.parseFrom(byteArray);
            log.info("after :" +p2.toString());
        } catch (InvalidProtocolBufferException e) {
            e.printStackTrace();
        }

    }
}

INFO [main] (Test.java:15) - id: 1
name: "nnn"
email: "dddd"

81183110110110264100100100100 INFO [main] <ByteString@61dc03ce size=13>
INFO [main] (Test.java:24) - after :id: 1
name: "nnn"
email: "dddd"
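
The run of digits printed before the ByteString line is the 13 serialized bytes written back to back without separators: 8 1 18 3 110 110 110 26 4 100 100 100 100. The sketch below decodes them by hand to show how the wire format is laid out. It is a minimal decoder that assumes a flat message with only varint (wire type 0) and length-delimited (wire type 2) fields; it is not how the protobuf library itself is implemented, and the class name is made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustration only: hand-decodes a flat protobuf message.
final class WireDecodeSketch {

    // Returns one "fieldNumber = value" entry per field found in the data.
    static List<String> decode(byte[] data) {
        List<String> fields = new ArrayList<>();
        int[] pos = {0}; // current read offset, shared with readVarint
        while (pos[0] < data.length) {
            long key = readVarint(data, pos);
            int fieldNumber = (int) (key >>> 3); // upper bits: field number
            int wireType = (int) (key & 0x7);    // low 3 bits: wire type
            if (wireType == 0) {
                fields.add(fieldNumber + " = " + readVarint(data, pos));
            } else if (wireType == 2) {
                int length = (int) readVarint(data, pos);
                String text = new String(data, pos[0], length, StandardCharsets.UTF_8);
                pos[0] += length;
                fields.add(fieldNumber + " = \"" + text + "\"");
            } else {
                throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
        }
        return fields;
    }

    // Reads a base-128 varint: 7 payload bits per byte, high bit means "more".
    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {8, 1, 18, 3, 110, 110, 110, 26, 4, 100, 100, 100, 100};
        for (String field : decode(bytes)) {
            System.out.println(field);
        }
    }
}
```

Reading the bytes this way: 8 is the key for field 1 as a varint, followed by the value 1; 18 is the key for field 2 as a length-delimited value of 3 bytes ("nnn"); 26 is the key for field 3 with 4 bytes ("dddd"). Field names never appear on the wire, only the tag numbers from the .proto file.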

Messages can also be converted to JSON; see https://github.com/bivas/protobuf-java-format
