在Java中，相同代码块的运行时间不同。这是为什么？

陶宜民

2023-03-14

问题内容：

我有下面的代码。我只想检查代码块的运行时间。错误地，我再次复制并粘贴了相同的代码，并得到了有趣的结果。尽管代码块相同，但运行时间不同。而且code block 1 比其他人花费更多的时间。如果我切换代码块，(say i move the code blocks 4 to the top)则代码块4将比其他代码花费更多时间。

我在代码块中使用了两种不同类型的数组来检查它是否依赖于此。结果是一样的。如果代码块具有相同类型的数组，则最上面的代码块将花费更多时间。参见下面的代码和给出的输出。

public class ABBYtest {

public static void main(String[] args) {
    long startTime;
    long endTime;

    //code block 1
    startTime = System.nanoTime();
    Long a[] = new Long[10];
    for (int i = 0; i < a.length; i++) {
        a[i] = 12l;
    }
    Arrays.sort(a);
    endTime = System.nanoTime();
    System.out.println("code block (has Long array) 1 = " + (endTime - startTime));

    //code block 6
    startTime = System.nanoTime();
    Long aa[] = new Long[10];
    for (int i = 0; i < aa.length; i++) {
        aa[i] = 12l;
    }
    Arrays.sort(aa);
    endTime = System.nanoTime();
    System.out.println("code block (has Long array) 6 = " + (endTime - startTime));


    //code block 7
    startTime = System.nanoTime();
    Long aaa[] = new Long[10];
    for (int i = 0; i < aaa.length; i++) {
        aaa[i] = 12l;
    }
    Arrays.sort(aaa);
    endTime = System.nanoTime();
    System.out.println("code block (has Long array) 7 = " + (endTime - startTime));

    //code block 2
    startTime = System.nanoTime();
    long c[] = new long[10];
    for (int i = 0; i < c.length; i++) {
        c[i] = 12l;
    }
    Arrays.sort(c);
    endTime = System.nanoTime();
    System.out.println("code block (has long array) 2 = " + (endTime - startTime));

    //code block 3
    startTime = System.nanoTime();
    long d[] = new long[10];
    for (int i = 0; i < d.length; i++) {
        d[i] = 12l;
    }
    Arrays.sort(d);
    endTime = System.nanoTime();
    System.out.println("code block (has long array) 3 = " + (endTime - startTime));

    //code block 4
    startTime = System.nanoTime();
    long b[] = new long[10];
    for (int i = 0; i < b.length; i++) {
        b[i] = 12l;
    }
    Arrays.sort(b);
    endTime = System.nanoTime();
    System.out.println("code block (has long array) 4 = " + (endTime - startTime));

    //code block 5
    startTime = System.nanoTime();
    Long e[] = new Long[10];
    for (int i = 0; i < e.length; i++) {
        e[i] = 12l;
    }
    Arrays.sort(e);
    endTime = System.nanoTime();
    System.out.println("code block (has Long array) 5 = " + (endTime - startTime));


}
}

运行时间：

code block (has Long array) 1 = 802565

code block (has Long array) 6 = 6158

code block (has Long array) 7 = 4619

code block (has long array) 2 = 171906

code block (has long array) 3 = 4105

code block (has long array) 4 = 3079

code block (has Long array) 5 = 8210

如我们所见，包含的第一个代码块Long array将比包含的其他代码花费更多的时间Long arrays。包含的第一个代码块也是如此long array。

任何人都可以解释这种行为。还是我在这里做错了？

问题答案：

基准测试错误。错误的非详尽清单：

没有预热 ：单次测量几乎总是错误的；
在单个方法中混合几个代码路径 ：我们可能开始使用仅适用于该方法中第一个循环的执行数据来编译该方法；
来源是可预测的 ：如果循环编译，我们实际上可以预测结果；
结果消除了死代码 ：如果循环编译，我们可以将循环丢掉

这可以说是用jmh正确地做到的：

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 3, time = 1)
@Fork(10)
@State(Scope.Thread)
public class Longs {

    public static final int COUNT = 10;

    private Long[] refLongs;
    private long[] primLongs;

    /*
     * Implementation notes:
     *   - copying the array from the field keeps the constant
     *     optimizations away, but we implicitly counting the
     *     costs of arraycopy() in;
     *   - two additional baseline experiments quantify the
     *     scale of arraycopy effects (note you can't directly
     *     subtract the baseline scores from the tests, because
     *     the code is mixed together;
     *   - the resulting arrays are always fed back into JMH
     *     to prevent dead-code elimination
     */

    @Setup
    public void setup() {
        primLongs = new long[COUNT];
        for (int i = 0; i < COUNT; i++) {
            primLongs[i] = 12l;
        }

        refLongs = new Long[COUNT];
        for (int i = 0; i < COUNT; i++) {
            refLongs[i] = 12l;
        }
    }

    @GenerateMicroBenchmark
    public long[] prim_baseline() {
        long[] d = new long[COUNT];
        System.arraycopy(primLongs, 0, d, 0, COUNT);
        return d;
    }

    @GenerateMicroBenchmark
    public long[] prim_sort() {
        long[] d = new long[COUNT];
        System.arraycopy(primLongs, 0, d, 0, COUNT);
        Arrays.sort(d);
        return d;
    }

    @GenerateMicroBenchmark
    public Long[] ref_baseline() {
        Long[] d = new Long[COUNT];
        System.arraycopy(refLongs, 0, d, 0, COUNT);
        return d;
    }

    @GenerateMicroBenchmark
    public Long[] ref_sort() {
        Long[] d = new Long[COUNT];
        System.arraycopy(refLongs, 0, d, 0, COUNT);
        Arrays.sort(d);
        return d;
    }

}

…产生：

Benchmark                   Mode   Samples         Mean   Mean error    Units
o.s.Longs.prim_baseline     avgt        30       19.604        0.327    ns/op
o.s.Longs.prim_sort         avgt        30       51.217        1.873    ns/op
o.s.Longs.ref_baseline      avgt        30       16.935        0.087    ns/op
o.s.Longs.ref_sort          avgt        30       25.199        0.430    ns/op

在这一点上，您可能会开始怀疑为什么排序Long[]和排序long[]会花费不同的时间。答案在于Array.sort()重载：OpenJDK通过不同的算法（使用TimSort的引用，使用双数据点快速排序的基元）对基元数组和引用数组进行排序。这是使用选择另一个算法的亮点-Djava.util.Arrays.useLegacyMergeSort=true，这又落到了合并排序的参考上：

Benchmark                   Mode   Samples         Mean   Mean error    Units
o.s.Longs.prim_baseline     avgt        30       19.675        0.291    ns/op
o.s.Longs.prim_sort         avgt        30       50.882        1.550    ns/op
o.s.Longs.ref_baseline      avgt        30       16.742        0.089    ns/op
o.s.Longs.ref_sort          avgt        30       64.207        1.047    ns/op

希望这有助于解释差异。

上面的解释几乎没有涉及排序的性能。当使用不同的源数据（包括可用的预排序子序列，它们的模式和游程长度，数据本身的大小）呈现时，性能会有很大不同。

在Java中，相同代码块的运行时间不同。这是为什么？

相关阅读

相关文章

相关问答

相关工具

相关文档