Coding Guidelines

小牛编辑

2023-12-01

This document describes the coding guidelines for the Libra Core Rust codebase.

Code Formatting

All code formatting is enforced with rustfmt using a project-specific configuration. Here is a sample command to run rustfmt and adhere to the project conventions:

libra$ cargo fmt

Code Analysis

Clippy is used to catch common mistakes and is run as a part of continuous integration. Before submitting your code for review, you can run clippy with our configuration:

libra$ ./scripts/clippy.sh

In general, we follow the recommendations from rust-lang-nursery. The remainder of this guide provides detailed guidelines on specific topics, to achieve uniformity of the codebase.

Code Documentation

All public fields, functions, and methods should be documented with Rustdoc. Follow the conventions described below for modules, structs, enums, and functions. The content marked [Single line] in the following template is used as a preview in the generated Rustdoc output. For examples, refer to the 'Structs' and 'Enums' sections in the collections Rustdoc.

/// [Single line] One line summary description
///
/// [Longer description] Multiple lines, inline code 
/// examples, invariants, purpose, usage, etc. 
[Attributes] If attributes exist, add after Rustdoc

Here is an example:

/// Represents (x, y) of a 2-dimensional grid
///
/// A line is defined by 2 instances.
/// A plane is defined by 3 instances.
#[repr(C)]
struct Point {
    x: i32,
    y: i32,
}

Constants and Fields

Describe the purpose and definition of the constants and fields in the code.

Functions and Methods

Document the following for each function:

Describe the action the method performs, for example: “This method adds a new transaction to the mempool.” Use active voice and present tense (e.g., adds/creates/checks/updates/deletes).
Describe how and why to use this method.
State any condition that must be met before calling the method.
State the conditions under which the method panics or returns an error.
Provide a brief description of return values.
Clearly state any special behavior that is not obvious from the code itself.
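As an illustration of the checklist above, here is a minimal sketch of a documented method. The `Mempool` and `MempoolError` types are hypothetical and exist only for this example; they are not the Libra mempool API.

```rust
use std::collections::HashMap;

/// Error returned when a transaction cannot be added (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum MempoolError {
    /// The pool already holds `capacity` transactions.
    Full,
    /// A transaction with the same id is already in the pool.
    Duplicate,
}

/// A toy in-memory transaction pool (hypothetical, for illustration only).
pub struct Mempool {
    capacity: usize,
    txns: HashMap<u64, Vec<u8>>,
}

impl Mempool {
    /// Creates an empty pool that holds at most `capacity` transactions.
    ///
    /// # Panics
    ///
    /// Panics if `capacity` is zero, because such a pool could never
    /// accept a transaction.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "mempool capacity must be non-zero");
        Mempool {
            capacity,
            txns: HashMap::new(),
        }
    }

    /// Adds a new transaction to the pool.
    ///
    /// Call this only after the transaction has been validated; the pool
    /// does not inspect `payload`.
    ///
    /// # Errors
    ///
    /// Returns `MempoolError::Duplicate` if `id` is already present, and
    /// `MempoolError::Full` if the pool is at capacity.
    ///
    /// # Returns
    ///
    /// The number of transactions in the pool after the insertion.
    pub fn add_transaction(&mut self, id: u64, payload: Vec<u8>) -> Result<usize, MempoolError> {
        if self.txns.contains_key(&id) {
            return Err(MempoolError::Duplicate);
        }
        if self.txns.len() >= self.capacity {
            return Err(MempoolError::Full);
        }
        self.txns.insert(id, payload);
        Ok(self.txns.len())
    }
}
```

The summary line uses present tense, and the preconditions, panic conditions, error conditions, and return value each get their own short paragraph.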

README.md for Top-Level Directories and Other Major Components

Each major component of the system needs to have a README.md file. Major components are:

Top-level directories (e.g., libra/network, libra/language).
The most important crates in the system (e.g., vm_runtime).

The README.md should contain:

The conceptual documentation of the component.
A link to external API documentation for the component.

You can refer to this sample README libra/network/README.md which describes the network crate.

Here is a template for a README.md:

# Component Name

[Summary line: Start with one sentence about this component.]

## Overview

* Describe the purpose of this component and how the code in this directory works.
* Describe the interaction of code in this directory with other components.
* Describe the security model and assumptions about the crates in this directory. 

## Implementation Details

* Describe how the component is modeled. For example, why is the code organized the way it is?
* Other relevant implementation details.

## API Documentation

For the external API of this crate refer to the API documentation.

For a top-level directory, link to the most important APIs within.

Code Suggestions

In the following sections we have suggested some best practices for a uniform codebase. We will investigate and identify the practices that can be enforced using Clippy. This information will evolve and improve over time.

Attributes

Use appropriate attributes for handling dead code:

// For code that is intended for production usage in the future
#[allow(dead_code)]
// For code that is only intended for testing and 
// has no intended production use
#[cfg(test)]
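A slightly fuller sketch of the two attributes in context may help. The function names below (`checksum`, `sample_payload`) are made up for illustration.

```rust
/// Planned for production use but not called from anywhere yet, so the
/// dead-code lint is silenced explicitly.
#[allow(dead_code)]
fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |acc, b| acc.wrapping_add(u32::from(*b)))
}

/// Test-only helper: compiled when running `cargo test`, never part of a
/// production build.
#[cfg(test)]
fn sample_payload() -> Vec<u8> {
    vec![1, 2, 3]
}

#[test]
fn checksum_sums_bytes() {
    assert_eq!(checksum(&sample_payload()), 6);
}
```

Because `sample_payload` is gated with `#[cfg(test)]`, it cannot accidentally be called from production code, whereas `checksum` stays visible to the compiler in every build.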

Avoid Deref Polymorphism

Don't abuse the Deref trait to emulate inheritance between structs in order to reuse methods. For more information, see the "Deref polymorphism" anti-pattern in the Rust Design Patterns catalogue.
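One common alternative is composition with explicit delegation. The sketch below uses hypothetical `Account` and `ValidatorAccount` types to show the idea; it is one possible shape, not a prescribed one.

```rust
/// Shared behaviour lives in its own type (hypothetical example type).
pub struct Account {
    balance: u64,
}

impl Account {
    pub fn new(balance: u64) -> Self {
        Account { balance }
    }

    pub fn balance(&self) -> u64 {
        self.balance
    }
}

/// `ValidatorAccount` has an `Account`; it does not pretend to be one.
pub struct ValidatorAccount {
    account: Account,
    voting_power: u64,
}

impl ValidatorAccount {
    pub fn new(balance: u64, voting_power: u64) -> Self {
        ValidatorAccount {
            account: Account::new(balance),
            voting_power,
        }
    }

    /// Explicit delegation instead of `impl Deref<Target = Account>`.
    pub fn balance(&self) -> u64 {
        self.account.balance()
    }

    pub fn voting_power(&self) -> u64 {
        self.voting_power
    }
}
```

Delegating by hand costs a few lines, but the public surface of `ValidatorAccount` stays explicit and does not grow silently whenever `Account` gains a method.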

Comments

We recommend that you use // and /// comments rather than block comments /* ... */ for uniformity and easier grepping.

Cloning

If x is reference counted, prefer Arc::clone(&x) over x.clone(). Arc::clone(&x) makes it explicit that we are cloning the pointer (incrementing the reference count), not the underlying data. This avoids confusion about whether we are performing an expensive clone of a struct, enum, or other type, or just a cheap reference copy.

Also, if you are passing around Arc<T> types, consider using a newtype wrapper:

#[derive(Clone, Debug)]
pub struct Foo(Arc<FooInner>);
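The following sketch puts both suggestions together. `FooInner`'s field and the helper names (`share`, `handle_count`) are assumptions made for the example.

```rust
use std::sync::Arc;

#[derive(Debug)]
struct FooInner {
    name: String,
}

/// Cheap-to-clone handle: cloning `Foo` only bumps the reference count.
#[derive(Clone, Debug)]
pub struct Foo(Arc<FooInner>);

impl Foo {
    pub fn new(name: &str) -> Self {
        Foo(Arc::new(FooInner {
            name: name.to_owned(),
        }))
    }

    pub fn name(&self) -> &str {
        &self.0.name
    }

    /// Number of handles currently sharing the same `FooInner`.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.0)
    }
}

/// With a bare `Arc`, spell out `Arc::clone` so readers can see that only
/// the pointer is copied, not the `String` behind it.
pub fn share(config: &Arc<String>) -> Arc<String> {
    Arc::clone(config)
}
```

With the newtype, callers write `foo.clone()` and the type name itself documents that the clone is cheap; the `Arc` stays an implementation detail.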

Concurrent Types

Concurrent types such as CHashMap, AtomicUsize, etc. take an immutable borrow of self even in mutating methods (e.g. fn foo_mut(&self, ...)) to support concurrent access through interior mutability. Good practices (such as those in the examples mentioned) avoid exposing synchronization primitives externally (e.g. Mutex, RwLock) and document the method semantics and invariants clearly.
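A minimal sketch of such a type, using only the standard library, might look like this. `PeerScores` is a hypothetical example; the point is that the `RwLock` is private and every method takes `&self`.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Thread-safe score table. The lock is a private implementation detail;
/// callers only see `&self` methods with documented semantics.
#[derive(Default)]
pub struct PeerScores {
    scores: RwLock<HashMap<String, u32>>,
}

impl PeerScores {
    pub fn new() -> Self {
        Self::default()
    }

    /// Adds `delta` to the score of `peer` (starting from 0 for unknown
    /// peers) and returns the new score. The read-modify-write happens
    /// under a single write lock, so concurrent bumps are never lost.
    /// Scores saturate at `u32::MAX` instead of overflowing.
    pub fn bump(&self, peer: &str, delta: u32) -> u32 {
        let mut scores = self.scores.write().unwrap();
        let entry = scores.entry(peer.to_owned()).or_insert(0);
        *entry = entry.saturating_add(delta);
        *entry
    }

    /// Returns the current score of `peer`, or `None` if it was never bumped.
    pub fn get(&self, peer: &str) -> Option<u32> {
        let scores = self.scores.read().unwrap();
        scores.get(peer).copied()
    }
}
```

Since no guard or lock type appears in the public signatures, the locking strategy can later change (for example to a sharded map) without touching callers.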

When to use channels versus concurrent types?

Here are some high-level suggestions:

Channels are for ownership transfer, decoupling of types, and coarse-grained messages. They fit well for transferring ownership of data, distributing units of work, and communicating async results. Furthermore, they help break circular dependencies (e.g. struct Foo contains an Arc<Bar> and struct Bar contains an Arc<Foo>, which leads to complex initialization).
Concurrent types (such as CHashMap or structs that have interior mutability building on Mutex, RwLock, etc.) are better suited for caches and states.
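To make the channel side of this comparison concrete, here is a small sketch using `std::sync::mpsc` that hands units of work to a worker thread and collects the results. The function name `sum_of_squares` is made up, and the example assumes inputs small enough that squaring does not overflow.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends each input to a worker thread over a channel and sums the
/// squared results that come back on a second channel.
pub fn sum_of_squares(inputs: Vec<u64>) -> u64 {
    let (work_tx, work_rx) = mpsc::channel::<u64>();
    let (result_tx, result_rx) = mpsc::channel::<u64>();

    let worker = thread::spawn(move || {
        // The worker owns every value it receives; nothing is shared.
        for n in work_rx {
            result_tx
                .send(n * n)
                .expect("result receiver dropped before the worker finished");
        }
        // `result_tx` is dropped here, which closes the result channel.
    });

    for n in inputs {
        work_tx
            .send(n)
            .expect("worker exited before all work was sent");
    }
    // Closing the work channel ends the worker's `for` loop.
    drop(work_tx);

    let total: u64 = result_rx.iter().sum();
    worker.join().expect("worker thread panicked");
    total
}
```

Neither side needs to know the other's type, only the message type, which is what makes channels useful for decoupling components.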

Error Handling

Error handling suggestions follow the Rust book guidance. Rust groups errors into two major categories: recoverable and unrecoverable errors. Recoverable errors should be handled with Result. For our suggestions on unrecoverable errors, see below:

Panic

panic!() - Runtime panic! should only be used when the resulting state cannot be processed going forward. It should not be used for any recoverable errors.
unwrap() - Unwrap should only be used for mutexes (e.g. lock().unwrap()) and test code. For all other use cases, prefer expect(). The only exception is when the error message is custom-generated; in that case, use .unwrap_or_else(|| panic!("error: {}", foo)).
expect() - Expect should be invoked when a system invariant is expected to be preserved. expect() is preferred over unwrap(), and in most cases it should carry a detailed error message.
assert!() - This macro is kept in both debug and release builds and should be used to protect invariants of the system as necessary.
unreachable!() - This macro will panic on code that should not be reached (violating an invariant), and it can be used where appropriate.
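The sketch below shows how these suggestions can fit together: recoverable input errors travel through `Result`, `unwrap()` appears only on a mutex lock, `expect()` guards an invariant with a descriptive message, and `unreachable!()` marks an impossible branch. All names (`PortError`, `Registry`, `parse_port`, `parity`) are hypothetical.

```rust
use std::collections::HashMap;
use std::fmt;
use std::sync::Mutex;

/// Recoverable failure when reading a port number (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum PortError {
    Empty,
    Invalid(String),
}

impl fmt::Display for PortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PortError::Empty => write!(f, "port is empty"),
            PortError::Invalid(raw) => write!(f, "invalid port: {}", raw),
        }
    }
}

impl std::error::Error for PortError {}

/// Bad user input is recoverable, so it is reported through `Result`.
pub fn parse_port(input: &str) -> Result<u16, PortError> {
    let trimmed = input.trim();
    if trimmed.is_empty() {
        return Err(PortError::Empty);
    }
    trimmed
        .parse::<u16>()
        .map_err(|_| PortError::Invalid(trimmed.to_owned()))
}

#[derive(Default)]
pub struct Registry {
    ports: Mutex<HashMap<String, u16>>,
}

impl Registry {
    pub fn register(&self, name: &str, raw_port: &str) -> Result<u16, PortError> {
        let port = parse_port(raw_port)?;
        // `unwrap()` is acceptable on a mutex lock.
        self.ports.lock().unwrap().insert(name.to_owned(), port);
        Ok(port)
    }

    /// Invariant: callers only ask for services they registered earlier.
    pub fn port_of(&self, name: &str) -> u16 {
        let ports = self.ports.lock().unwrap();
        *ports
            .get(name)
            .expect("port_of called for a service that was never registered")
    }
}

pub fn parity(n: u32) -> &'static str {
    match n % 2 {
        0 => "even",
        1 => "odd",
        _ => unreachable!("n % 2 is always 0 or 1"),
    }
}
```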

Generics

Generics allow dynamic behavior (similar to trait methods) with static dispatch. As the number of generic type parameters increases, the difficulty of using the type or method also increases (e.g. you have to consider the combination of trait bounds required for this type, duplicate trait bounds on related types, etc.). To avoid this complexity, we generally try to avoid using a large number of generic type parameters. We have found that converting code with a large number of generic type parameters to trait objects with dynamic dispatch often simplifies our code.
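As a rough illustration of that trade-off, the sketch below defines the same service twice: once with two generic parameters and once with boxed trait objects. The `Storage` and `Clock` traits and all type names are invented for the example.

```rust
use std::collections::HashMap;

pub trait Storage {
    fn get(&self, key: &str) -> Option<String>;
}

pub trait Clock {
    fn now(&self) -> u64;
}

/// Generic version: every signature that mentions this type must also
/// carry `S` and `C` together with their bounds.
pub struct GenericService<S: Storage, C: Clock> {
    pub storage: S,
    pub clock: C,
}

/// Trait-object version: one concrete type, dispatch happens at runtime.
pub struct Service {
    storage: Box<dyn Storage>,
    clock: Box<dyn Clock>,
}

impl Service {
    pub fn new(storage: Box<dyn Storage>, clock: Box<dyn Clock>) -> Self {
        Service { storage, clock }
    }

    pub fn describe(&self, key: &str) -> String {
        match self.storage.get(key) {
            Some(value) => format!("{}={} at {}", key, value, self.clock.now()),
            None => format!("{} missing at {}", key, self.clock.now()),
        }
    }
}

/// Simple implementations used to exercise the sketch.
pub struct FixedClock(pub u64);

impl Clock for FixedClock {
    fn now(&self) -> u64 {
        self.0
    }
}

impl Storage for HashMap<String, String> {
    fn get(&self, key: &str) -> Option<String> {
        // Calls the inherent `HashMap::get`, then clones the value out.
        HashMap::get(self, key).cloned()
    }
}
```

The trait-object version pays for a heap allocation and a virtual call per method, which is usually negligible next to the simpler signatures it buys.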

Getters and Setters

Outside of test code, keep field visibility private as much as possible. Private fields allow constructors to enforce internal invariants. Implement getters for data that consumers may need, but avoid setters unless mutable state is necessary.

Public fields are most appropriate for struct types in the C spirit: compound, passive data structures without internal invariants. Naming suggestions follow the Rust API Guidelines naming conventions, as shown below.

use std::collections::HashMap;

struct Foo {
    size: usize,
    key_to_value: HashMap<u32, u32>,
}

impl Foo {
    /// Return a copy when inexpensive
    fn size(&self) -> usize {
        self.size
    }

    /// Borrow for expensive copies
    fn key_to_value(&self) -> &HashMap<u32, u32> {
        &self.key_to_value
    }

    /// Setter follows set_xxx pattern
    fn set_size(&mut self, size: usize) {
        self.size = size;
    }

    /// For a more complex getter, using get_XXX is acceptable 
    /// (similar to HashMap) with well-defined and 
    /// commented semantics
    fn get_value(&self, key: u32) -> Option<&u32> {
        self.key_to_value.get(&key)
    }
}

Logging

We currently use slog for logging.

error! - Error-level messages have the highest urgency in slog. They are used for unexpected errors (e.g. exceeded the maximum number of retries to complete an RPC or inability to store data to local storage).
warn! - Warn-level messages help notify admins about automatically handled issues (e.g. retrying a failed network connection or receiving the same message multiple times, etc.).
info! - Info-level messages are well suited for "one time" events (such as logging state on one-time startup and shutdown) or periodic events that are not frequently occurring - e.g. changing the validator set every day.
debug! - Debug-level messages are frequently occurring (i.e. potentially > 1 message per second) and are not typically expected to be enabled in production.
trace! - Trace-level logging is typically only used for function entry/exit.

Testing

Unit tests

Ideally, all code will be unit tested. Unit test files should exist in the same directory as mod.rs, and their file names should end in _test.rs. A module to be tested should have the test modules annotated with #[cfg(test)]. For example, if in a crate there is a db module, the expected directory structure is as follows:

src/db                        -> directory of db module
src/db/mod.rs                 -> code of db module
src/db/read_test.rs           -> db test 1
src/db/write_test.rs          -> db test 2
src/db/access/mod.rs          -> code of access submodule
src/db/access/access_test.rs  -> test of access submodule

Property-based tests

Libra contains property-based tests written in Rust using the proptest framework. Property-based tests generate random test cases and assert that invariants, also called properties, hold for the code under test.

Some examples of properties tested in Libra Core:

Every serializer and deserializer pair is tested for correctness, with random inputs to the serializer. Any pair of functions that are inverses of each other can be tested this way.
The results of executing common transactions through the VM are tested using randomly generated scenarios, and a simplified model as an oracle.
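To show the shape of the round-trip property without pulling in an external crate, the sketch below swaps proptest for a tiny hand-rolled xorshift generator. The varint codec, the generator, and `check_round_trip` are all invented for illustration; real Libra tests use proptest strategies, which also shrink failing inputs.

```rust
/// Encodes `value` as a little-endian base-128 varint (toy serializer).
pub fn encode_varint(mut value: u32) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Decodes a varint; returns `None` for empty, truncated, overlong, or
/// trailing-byte input.
pub fn decode_varint(bytes: &[u8]) -> Option<u32> {
    let mut value: u32 = 0;
    for (i, byte) in bytes.iter().enumerate() {
        if i >= 5 {
            return None;
        }
        let chunk = u32::from(*byte & 0x7f);
        // The fifth byte may only carry the top 4 bits of a u32.
        if i == 4 && chunk > 0x0f {
            return None;
        }
        value |= chunk << (7 * i);
        if (*byte & 0x80) == 0 {
            return if i + 1 == bytes.len() { Some(value) } else { None };
        }
    }
    None
}

/// Minimal xorshift32 generator standing in for proptest's input generation.
pub struct XorShift32(u32);

impl XorShift32 {
    pub fn new(seed: u32) -> Self {
        // The state must be non-zero for xorshift to produce output.
        XorShift32(if seed == 0 { 1 } else { seed })
    }

    pub fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}

/// Property: `decode_varint(encode_varint(v)) == Some(v)` for every `v`.
/// Checks a few edge cases plus `cases` pseudo-random values and returns
/// the first counterexample, if any.
pub fn check_round_trip(seed: u32, cases: usize) -> Option<u32> {
    let mut rng = XorShift32::new(seed);
    let edge_cases = [0, 1, 127, 128, u32::MAX];
    let counterexample = edge_cases
        .iter()
        .copied()
        .chain((0..cases).map(|_| rng.next_u32()))
        .find(|&v| decode_varint(&encode_varint(v)) != Some(v));
    counterexample
}
```

A test would then assert that `check_round_trip(seed, n)` returns `None`; a returned value is the input on which serializer and deserializer disagree.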

A tutorial for proptest can be found in the proptest book. For further information on property testing refer to:

What is Property Based Testing? (includes a comparison with fuzzing)
An introduction to property-based testing
Choosing properties for property-based testing

Fuzzing

Libra contains harnesses for fuzzing crash-prone code like deserializers, which use libFuzzer through cargo fuzz. For more, see the testsuite/libra_fuzzer directory.