嵌套循环性能

算法 刘宇帅 3年前 阅读量: 613

下面有两个函数,猜一下哪一个执行的快

func nestFor1() {
    a := 0
    for i := 0; i < 10000000; i++ {
        for j := 0; j < 2; j++ {
            a = a + i
            a = a + j
        }
    }
}

func nestFor2() {
    a := 0
    for i := 0; i < 2; i++ {
        for j := 0; j < 10000000; j++ {
            a = a + i
            a = a + j
        }
    }
}

从 golang 语言层面看这两个函数运行起来怎么都是一样的。
我们跑下 Benchmark

func BenchmarkNestFor1(b *testing.B) {
    for i := 0; i < b.N; i++ {
        nestFor1()
    }
}
func BenchmarkNestFor2(b *testing.B) {
    for i := 0; i < b.N; i++ {
        nestFor2()
    }
}

跑测试

> $ go test -bench=NestFor                                                                                          [20:38:19]
goos: darwin
goarch: amd64
pkg: github.com/yushuailiu/go-algorithm/golang
BenchmarkNestFor1-4          100      10355409 ns/op
BenchmarkNestFor2-4          200       6776159 ns/op
PASS
ok      github.com/yushuailiu/go-algorithm/golang   3.127s

我们可以看到 nestFor1 比 nestFor2 慢了快一半了。
go 语言最终是要编译成汇编进而转化成机器语言,而性能就体现在指令少,而上面问题的根本原因是 nestFor1 比 nestFor2 执行的指令少,首先两者的内层循环跳转次数都是 10000000*2 而 nestFor1 的外层跳转次数是 10000000 nestFor2 的外层跳转次数是 2,同理 nestFor2 的外层循环中的比较次数和递增次数都少于 nestFor1,所以结论就是把循环次数较多的循环放在能不可以提升性能,虽然这是微乎其微的但是养成这样的习惯也挺好的。
感兴趣的同学可以看下两个函数汇编指令各个指令执行的次数:
例子源码
生成汇编源码指令

go tool compile -N -S nest_for_test.go

nestFor1 汇编 ()

.nestFor1 STEXT nosplit size=135 args=0x0 locals=0x20
    0x0000 00000 (nest_for_test.go:5)   TEXT    "".nestFor1(SB), NOSPLIT, $32-0
    0x0000 00000 (nest_for_test.go:5)   SUBQ    $32, SP
    0x0004 00004 (nest_for_test.go:5)   MOVQ    BP, 24(SP)
    0x0009 00009 (nest_for_test.go:5)   LEAQ    24(SP), BP
    0x000e 00014 (nest_for_test.go:5)   FUNCDATA    $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x000e 00014 (nest_for_test.go:5)   FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x000e 00014 (nest_for_test.go:5)   FUNCDATA    $3, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x000e 00014 (nest_for_test.go:6)   PCDATA  $2, $0
    0x000e 00014 (nest_for_test.go:6)   PCDATA  $0, $0
    0x000e 00014 (nest_for_test.go:6)   MOVQ    $0, "".a+16(SP)
    0x0017 00023 (nest_for_test.go:7)   MOVQ    $0, "".i+8(SP)
    0x0020 00032 (nest_for_test.go:7)   JMP 34
    0x0022 00034 (nest_for_test.go:7)   CMPQ    "".i+8(SP), $10000000
    0x002b 00043 (nest_for_test.go:7)   JLT 47
    0x002d 00045 (nest_for_test.go:7)   JMP 125
    0x002f 00047 (nest_for_test.go:8)   MOVQ    $0, "".j(SP)
    0x0037 00055 (nest_for_test.go:8)   JMP 57
    0x0039 00057 (nest_for_test.go:8)   CMPQ    "".j(SP), $2
    0x003e 00062 (nest_for_test.go:8)   JLT 66
    0x0040 00064 (nest_for_test.go:8)   JMP 108
    0x0042 00066 (nest_for_test.go:9)   MOVQ    "".a+16(SP), AX
    0x0047 00071 (nest_for_test.go:9)   ADDQ    "".i+8(SP), AX
    0x004c 00076 (nest_for_test.go:9)   MOVQ    AX, "".a+16(SP)
    0x0051 00081 (nest_for_test.go:10)  MOVQ    "".j(SP), CX
    0x0055 00085 (nest_for_test.go:10)  ADDQ    CX, AX
    0x0058 00088 (nest_for_test.go:10)  MOVQ    AX, "".a+16(SP)
    0x005d 00093 (nest_for_test.go:10)  JMP 95
    0x005f 00095 (nest_for_test.go:8)   MOVQ    "".j(SP), AX
    0x0063 00099 (nest_for_test.go:8)   INCQ    AX
    0x0066 00102 (nest_for_test.go:8)   MOVQ    AX, "".j(SP)
    0x006a 00106 (nest_for_test.go:8)   JMP 57
    0x006c 00108 (nest_for_test.go:7)   PCDATA  $2, $-2
    0x006c 00108 (nest_for_test.go:7)   PCDATA  $0, $-2
    0x006c 00108 (nest_for_test.go:7)   JMP 110
    0x006e 00110 (nest_for_test.go:7)   PCDATA  $2, $0
    0x006e 00110 (nest_for_test.go:7)   PCDATA  $0, $0
    0x006e 00110 (nest_for_test.go:7)   MOVQ    "".i+8(SP), AX
    0x0073 00115 (nest_for_test.go:7)   INCQ    AX
    0x0076 00118 (nest_for_test.go:7)   MOVQ    AX, "".i+8(SP)
    0x007b 00123 (nest_for_test.go:7)   JMP 34
    0x007d 00125 (<unknown line number>)    PCDATA  $2, $-2
    0x007d 00125 (<unknown line number>)    PCDATA  $0, $-2
    0x007d 00125 (<unknown line number>)    MOVQ    24(SP), BP
    0x0082 00130 (<unknown line number>)    ADDQ    $32, SP
    0x0086 00134 (<unknown line number>)    RET
    0x0000 48 83 ec 20 48 89 6c 24 18 48 8d 6c 24 18 48 c7  H.. H.l$.H.l$.H.
    0x0010 44 24 10 00 00 00 00 48 c7 44 24 08 00 00 00 00  D$.....H.D$.....
    0x0020 eb 00 48 81 7c 24 08 80 96 98 00 7c 02 eb 4e 48  ..H.|$.....|..NH
    0x0030 c7 04 24 00 00 00 00 eb 00 48 83 3c 24 02 7c 02  ..$......H.<$.|.
    0x0040 eb 2a 48 8b 44 24 10 48 03 44 24 08 48 89 44 24  .*H.D$.H.D$.H.D$
    0x0050 10 48 8b 0c 24 48 01 c8 48 89 44 24 10 eb 00 48  .H..$H..H.D$...H
    0x0060 8b 04 24 48 ff c0 48 89 04 24 eb cd eb 00 48 8b  ..$H..H..$....H.
    0x0070 44 24 08 48 ff c0 48 89 44 24 08 eb a5 48 8b 6c  D$.H..H.D$...H.l
    0x0080 24 18 48 83 c4 20 c3                             $.H.. .

nestFor2 汇编

.nestFor2 STEXT nosplit size=135 args=0x0 locals=0x20
    0x0000 00000 (nest_for_test.go:15)  TEXT    "".nestFor2(SB), NOSPLIT, $32-0
    0x0000 00000 (nest_for_test.go:15)  SUBQ    $32, SP
    0x0004 00004 (nest_for_test.go:15)  MOVQ    BP, 24(SP)
    0x0009 00009 (nest_for_test.go:15)  LEAQ    24(SP), BP
    0x000e 00014 (nest_for_test.go:15)  FUNCDATA    $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x000e 00014 (nest_for_test.go:15)  FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x000e 00014 (nest_for_test.go:15)  FUNCDATA    $3, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x000e 00014 (nest_for_test.go:16)  PCDATA  $2, $0
    0x000e 00014 (nest_for_test.go:16)  PCDATA  $0, $0
    0x000e 00014 (nest_for_test.go:16)  MOVQ    $0, "".a+16(SP)
    0x0017 00023 (nest_for_test.go:17)  MOVQ    $0, "".i+8(SP)
    0x0020 00032 (nest_for_test.go:17)  JMP 34
    0x0022 00034 (nest_for_test.go:17)  CMPQ    "".i+8(SP), $2
    0x0028 00040 (nest_for_test.go:17)  JLT 44
    0x002a 00042 (nest_for_test.go:17)  JMP 125
    0x002c 00044 (nest_for_test.go:18)  MOVQ    $0, "".j(SP)
    0x0034 00052 (nest_for_test.go:18)  JMP 54
    0x0036 00054 (nest_for_test.go:18)  CMPQ    "".j(SP), $10000000
    0x003e 00062 (nest_for_test.go:18)  JLT 66
    0x0040 00064 (nest_for_test.go:18)  JMP 108
    0x0042 00066 (nest_for_test.go:19)  MOVQ    "".a+16(SP), AX
    0x0047 00071 (nest_for_test.go:19)  ADDQ    "".i+8(SP), AX
    0x004c 00076 (nest_for_test.go:19)  MOVQ    AX, "".a+16(SP)
    0x0051 00081 (nest_for_test.go:20)  MOVQ    "".j(SP), CX
    0x0055 00085 (nest_for_test.go:20)  ADDQ    CX, AX
    0x0058 00088 (nest_for_test.go:20)  MOVQ    AX, "".a+16(SP)
    0x005d 00093 (nest_for_test.go:20)  JMP 95
    0x005f 00095 (nest_for_test.go:18)  MOVQ    "".j(SP), AX
    0x0063 00099 (nest_for_test.go:18)  INCQ    AX
    0x0066 00102 (nest_for_test.go:18)  MOVQ    AX, "".j(SP)
    0x006a 00106 (nest_for_test.go:18)  JMP 54
    0x006c 00108 (nest_for_test.go:17)  PCDATA  $2, $-2
    0x006c 00108 (nest_for_test.go:17)  PCDATA  $0, $-2
    0x006c 00108 (nest_for_test.go:17)  JMP 110
    0x006e 00110 (nest_for_test.go:17)  PCDATA  $2, $0
    0x006e 00110 (nest_for_test.go:17)  PCDATA  $0, $0
    0x006e 00110 (nest_for_test.go:17)  MOVQ    "".i+8(SP), AX
    0x0073 00115 (nest_for_test.go:17)  INCQ    AX
    0x0076 00118 (nest_for_test.go:17)  MOVQ    AX, "".i+8(SP)
    0x007b 00123 (nest_for_test.go:17)  JMP 34
    0x007d 00125 (<unknown line number>)    PCDATA  $2, $-2
    0x007d 00125 (<unknown line number>)    PCDATA  $0, $-2
    0x007d 00125 (<unknown line number>)    MOVQ    24(SP), BP
    0x0082 00130 (<unknown line number>)    ADDQ    $32, SP
    0x0086 00134 (<unknown line number>)    RET
    0x0000 48 83 ec 20 48 89 6c 24 18 48 8d 6c 24 18 48 c7  H.. H.l$.H.l$.H.
    0x0010 44 24 10 00 00 00 00 48 c7 44 24 08 00 00 00 00  D$.....H.D$.....
    0x0020 eb 00 48 83 7c 24 08 02 7c 02 eb 51 48 c7 04 24  ..H.|$..|..QH..$
    0x0030 00 00 00 00 eb 00 48 81 3c 24 80 96 98 00 7c 02  ......H.<$....|.
    0x0040 eb 2a 48 8b 44 24 10 48 03 44 24 08 48 89 44 24  .*H.D$.H.D$.H.D$
    0x0050 10 48 8b 0c 24 48 01 c8 48 89 44 24 10 eb 00 48  .H..$H..H.D$...H
    0x0060 8b 04 24 48 ff c0 48 89 04 24 eb ca eb 00 48 8b  ..$H..H..$....H.
    0x0070 44 24 08 48 ff c0 48 89 44 24 08 eb a5 48 8b 6c  D$.H..H.D$...H.l
    0x0080 24 18 48 83 c4 20 c3                             $.H.. .

提示

功能待开通!


暂无评论~