
Using NSG

焦阎宝
2023-12-01

Error:

Could not find OpenBLAS include

Installing the OpenBLAS development package resolves it (for reference):

$ sudo apt-get install libopenblas-dev

Steps to run NSG:

Build the KNN graph: ./test_nndescent sift_base.fvecs sift.200NN.graph 200 200 10 10 100

Convert to NSG: nsg/build/tests/test_nsg_index efanna_graph/tests/sift_base.fvecs efanna_graph/tests/sift.200NN.graph 40 50 500 sift.nsg

Search: nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/sift_base.fvecs efanna_graph/tests/sift_query.fvecs sift.nsg 70 50 nsg/search_result.ivecs
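The three steps above can be scripted so that parameters are changed in one place. The following is a minimal sketch, not part of the original notes: the function name nsg_pipeline_commands is hypothetical, the parameter names (K, L, iter, S, R for test_nndescent; L, R, C for test_nsg_index; search L and search K for the search tool) follow my reading of the efanna_graph and NSG READMEs, and it assumes every path is given relative to a single working directory.

```python
import shlex

def nsg_pipeline_commands(base, query, knn_graph, nsg_index, result,
                          K=200, L=200, iters=10, S=10, R=100,
                          index_L=40, index_R=50, index_C=500,
                          search_L=70, search_K=50):
    """Return the three command lines (as argument lists) for one NSG run."""
    # Step 1: build the approximate KNN graph with NN-descent.
    build_knn = ["./test_nndescent", base, knn_graph,
                 str(K), str(L), str(iters), str(S), str(R)]
    # Step 2: convert the KNN graph into an NSG index.
    build_nsg = ["nsg/build/tests/test_nsg_index", base, knn_graph,
                 str(index_L), str(index_R), str(index_C), nsg_index]
    # Step 3: search the index and write the neighbor IDs to an ivecs file.
    search = ["nsg/build/tests/test_nsg_optimized_search", base, query,
              nsg_index, str(search_L), str(search_K), result]
    return [build_knn, build_nsg, search]

cmds = nsg_pipeline_commands("sift_base.fvecs", "sift_query.fvecs",
                             "sift.200NN.graph", "sift.nsg",
                             "search_result.ivecs")
for cmd in cmds:
    # Print only; pass each list to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Building argument lists rather than shell strings avoids quoting problems if a path contains spaces.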

The datasets and the generated results can be inspected with MATLAB (this requires a MATLAB installation); argparse is also used.

Build the KNN graph:

./test_nndescent sift_base.fvecs myindex/sift.200NN.graph 200 200 10 10 100





data dimension: 128
recall : 0.0003
iter: 0
recall : 0.0019
iter: 1
recall : 0.011
iter: 2
recall : 0.0577
iter: 3
recall : 0.1761
iter: 4
recall : 0.376
iter: 5
recall : 0.5735
iter: 6
recall : 0.7368
iter: 7
recall : 0.8494
iter: 8
recall : 0.9114
iter: 9
1000000
Time cost: 53.4881
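To compare runs, the per-iteration recall in a log like the one above can be extracted programmatically. This is a small sketch with a hypothetical helper name (parse_nndescent_log); it assumes the log keeps the layout shown here, where each "recall :" line is followed by its "iter:" line and the run ends with "Time cost:".

```python
import re

def parse_nndescent_log(text):
    """Return ([(iteration, recall), ...], time_cost) from test_nndescent output."""
    recalls = [float(x) for x in re.findall(r"^recall\s*:\s*([0-9.]+)", text, re.M)]
    iters = [int(x) for x in re.findall(r"^iter:\s*(\d+)", text, re.M)]
    match = re.search(r"^Time cost:\s*([0-9.]+)", text, re.M)
    time_cost = float(match.group(1)) if match else None
    # Pair each recall value with the iteration number printed right after it.
    return list(zip(iters, recalls)), time_cost

sample_log = """data dimension: 128
recall : 0.0003
iter: 0
recall : 0.0019
iter: 1
1000000
Time cost: 53.4881
"""

history, seconds = parse_nndescent_log(sample_log)
print(history, seconds)
```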




Degree Statistics: Max = 50, Min = 1, Avg = 29
indexing time: 87.951


search time: 2.81812

-----------------------------------------
With 100 nearest neighbors:

nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/sift_base.fvecs efanna_graph/tests/sift_query.fvecs sift_33.nsg 110 100 myindex/search_result_33.ivecs

search time: 4.09004

The first recall calculation gave 1.0 (the calculation method was wrong)
recall: 0.973034 (using the calculation method of SPTAG and NSG)

100_search_result.ivecs----------recall: 0.958752
50_search_result.ivecs-----------recall: 0.9426990000000001
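The notes do not show how recall was computed. One common definition of recall@k is the fraction of the true k nearest neighbors that appear among the k returned IDs, averaged over all queries. The sketch below implements that definition; the function name recall_at_k is hypothetical, and it is an assumption that this matches what SPTAG and NSG compute.

```python
import numpy as np

def recall_at_k(results, groundtruth, k):
    """Average overlap between the first k returned IDs and the first k true neighbors."""
    results = np.asarray(results)[:, :k]
    groundtruth = np.asarray(groundtruth)[:, :k]
    hits = 0
    for returned, truth in zip(results.tolist(), groundtruth.tolist()):
        # Order inside the top k does not matter, so compare as sets.
        hits += len(set(returned) & set(truth))
    return hits / (results.shape[0] * k)

# Two toy queries: the first finds 2 of 3 true neighbors, the second finds 1 of 3.
found = [[1, 2, 3], [4, 5, 6]]
truth = [[3, 2, 9], [4, 7, 8]]
print(recall_at_k(found, truth, k=3))
```

With real data, results would come from the search output file and groundtruth from the dataset's ground-truth ivecs file, both truncated to the same k.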

Unpacking the dataset archives requires tarfile; the following functions read fvecs and ivecs files:

import numpy as np


def ivecs_read(filename, c_contiguous=True):
    # Each record is an int32 dimension d followed by d int32 values.
    fv = np.fromfile(filename, dtype=np.int32)
    if fv.size == 0:
        return np.zeros((0, 0), dtype=np.int32)
    dim = fv[0]
    assert dim > 0
    fv = fv.reshape(-1, 1 + dim)
    if not np.all(fv[:, 0] == dim):
        raise IOError("Non-uniform vector sizes in " + filename)
    # Drop the leading dimension column.
    fv = fv[:, 1:]
    if c_contiguous:
        fv = fv.copy()
    return fv


def fvecs_read(filename, c_contiguous=True):
    # Each record is an int32 dimension d followed by d float32 values.
    fv = np.fromfile(filename, dtype=np.float32)
    if fv.size == 0:
        return np.zeros((0, 0), dtype=np.float32)
    # Reinterpret the first 4 bytes as int32 to recover the dimension.
    dim = fv.view(np.int32)[0]
    assert dim > 0
    fv = fv.reshape(-1, 1 + dim)
    if not np.all(fv.view(np.int32)[:, 0] == dim):
        raise IOError("Non-uniform vector sizes in " + filename)
    # Drop the leading dimension column.
    fv = fv[:, 1:]
    if c_contiguous:
        fv = fv.copy()
    return fv
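
The readers above rely on the fvecs/ivecs record layout: every vector is stored as a 4-byte integer holding its dimension, followed by that many 4-byte values. The round trip below illustrates the layout on a tiny file. It is a self-contained sketch; write_fvecs and read_fvecs are hypothetical helpers written for this example, not functions from NSG or efanna_graph.

```python
import os
import tempfile
import numpy as np

def write_fvecs(filename, vectors):
    """Write a 2-D array in fvecs layout: int32 dim, then dim float32 values per row."""
    vectors = np.ascontiguousarray(vectors, dtype=np.float32)
    n, dim = vectors.shape
    records = np.empty((n, dim + 1), dtype=np.int32)
    records[:, 0] = dim
    # Copy the float32 bit patterns unchanged into the int32 buffer.
    records[:, 1:] = vectors.view(np.int32)
    records.tofile(filename)

def read_fvecs(filename):
    """Read an fvecs file back into an (n, dim) float32 array."""
    raw = np.fromfile(filename, dtype=np.int32)
    dim = raw[0]
    return raw.reshape(-1, dim + 1)[:, 1:].copy().view(np.float32)

data = np.arange(12, dtype=np.float32).reshape(3, 4)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny.fvecs")
    write_fvecs(path, data)
    size = os.path.getsize(path)   # 3 records * (1 + 4) fields * 4 bytes
    loaded = read_fvecs(path)

print(size, loaded.shape)
```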

NSG experiments:

./test_nndescent sift_base.fvecs sift.100NN.graph 100 100 10 10 100
data dimension: 128
recall : 0.0003
iter: 0
recall : 0.0021
iter: 1
recall : 0.0147
iter: 2
recall : 0.0759
iter: 3
recall : 0.2259
iter: 4
recall : 0.4538
iter: 5
recall : 0.6686
iter: 6
recall : 0.8114
iter: 7
recall : 0.895
iter: 8
recall : 0.9396
iter: 9
1000000
Time cost: 42.9568



nsg/build/tests/test_nsg_index efanna_graph/tests/sift_base.fvecs efanna_graph/tests/sift.200NN.graph 40 50 500 sift.nsg

Degree Statistics: Max = 50, Min = 1, Avg = 25
indexing time: 51.8853




nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/sift_base.fvecs efanna_graph/tests/sift_query.fvecs sift.nsg 100 100 myindex/100_search_result.ivecs

search time: 3.49593
recall:1

------------------------------------------------------

./test_nndescent sift_base.fvecs sift.50NN.graph 50 70 8 10 100
data dimension: 128
recall : 0.0003
iter: 0
recall : 0.002
iter: 1
recall : 0.0218
iter: 2
recall : 0.1011
iter: 3
recall : 0.2857
iter: 4
recall : 0.5116
iter: 5
recall : 0.6401
iter: 6
recall : 0.6847
iter: 7
1000000
Time cost: 26.3734


nsg/build/tests/test_nsg_index efanna_graph/tests/sift_base.fvecs efanna_graph/tests/sift.50NN.graph 40 50 500 sift.nsg

Degree Statistics: Max = 50, Min = 1, Avg = 21
indexing time: 36.3078


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/sift_base.fvecs efanna_graph/tests/sift_query.fvecs sift.nsg 100 100 myindex/50_search_result.ivecs

search time: 4.38013




GIST:

./test_nndescent gist/gist_base.fvecs gist_400nn.graph 400 400 12 15 100
data dimension: 960
recall : 0.0006
iter: 0
recall : 0.0082
iter: 1
recall : 0.0711
iter: 2
recall : 0.1992
iter: 3
recall : 0.391
iter: 4
recall : 0.5886
iter: 5
recall : 0.741
iter: 6
recall : 0.8399
iter: 7
recall : 0.8976
iter: 8
recall : 0.9299
iter: 9
recall : 0.9496
iter: 10
recall : 0.9628
iter: 11
1000000
Time cost: 820.812


nsg/build/tests/test_nsg_index efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist_400nn.graph 60 70 500 gist.nsg

Degree Statistics: Max = 81, Min = 1, Avg = 21
indexing time: 923.836


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist.nsg 100 100 myindex_n/gist_result.ivecs

search time: 2.10635

recall:0.26157
recall: 0.86575 (using the recall calculation from SPTAG)

---------------------------Changing L at query time--------
nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist.nsg 110 100 myindex_n/gist_t4.ivecs

search time: 4.67848
recall: 0.8796299999999999


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist.nsg 120 100 myindex_n/gist_t5.ivecs

search time: 4.42567
recall: 0.89137


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist.nsg 130 100 myindex_n/gist_t6.ivecs

search time: 4.7973
recall: 0.9015000000000001


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist.nsg 140 100 myindex_n/gist_t7.ivecs

search time: 5.48264
recall: 0.9104099999999999


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist.nsg 150 100 myindex_n/gist_t8.ivecs

search time: 5.68317
recall: 0.91892


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist.nsg 160 100 myindex_n/gist_t9.ivecs

search time: 5.99639
recall: 0.92527
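
The sweep above shows recall rising as the search-time L grows, with search time generally rising too. A small sketch for choosing the smallest L that reaches a target recall follows. The tuples are copied from the gist.nsg runs above (recall rounded to five decimals, using the SPTAG-style value for L = 100); the helper name smallest_L_for is hypothetical, and each timing comes from a single run, so the times are only indicative.

```python
# (search L, search time in seconds, recall) for gist.nsg with K = 100
runs = [
    (100, 2.10635, 0.86575),
    (110, 4.67848, 0.87963),
    (120, 4.42567, 0.89137),
    (130, 4.7973, 0.9015),
    (140, 5.48264, 0.91041),
    (150, 5.68317, 0.91892),
    (160, 5.99639, 0.92527),
]

def smallest_L_for(target_recall, runs):
    """Return the (L, seconds, recall) entry with the smallest L meeting the target, or None."""
    for L, seconds, recall in sorted(runs):
        if recall >= target_recall:
            return L, seconds, recall
    return None

print(smallest_L_for(0.90, runs))
print(smallest_L_for(0.95, runs))
```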


--------------------------------------------------------------------------------
Second run:

./test_nndescent gist/gist_base.fvecs gist_400nn_it14.graph 400 400 14 15 100
data dimension: 960
recall : 0.0006
iter: 0
recall : 0.0086
iter: 1
recall : 0.0697
iter: 2
recall : 0.1904
iter: 3
recall : 0.3844
iter: 4
recall : 0.5842
iter: 5
recall : 0.7404
iter: 6
recall : 0.8379
iter: 7
recall : 0.8952
iter: 8
recall : 0.934
iter: 9
recall : 0.9522
iter: 10
recall : 0.9634
iter: 11
recall : 0.9713
iter: 12
recall : 0.9765
iter: 13
1000000
Time cost: 1204.44


nsg/build/tests/test_nsg_index efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist_400nn_it14.graph 70 80 500 gist_70.nsg

Degree Statistics: Max = 91, Min = 1, Avg = 21
indexing time: 916.75



nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_70.nsg 100 100 myindex_n/gist_t2.ivecs

search time: 4.37568
recall: 0.8725700000000001

---------------Changing L at query time-----------
nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_70.nsg 110 100 myindex_n/gist_t3.ivecs

search time: 4.69543
recall: 0.8859999999999999


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_70.nsg 120 100 myindex_n/gist_t10.ivecs

search time: 5.1802
recall: 0.89782


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_70.nsg 130 100 myindex_n/gist_t11.ivecs

search time: 5.52218
recall: 0.90739



-----------------------------------------------------------
Third run:

./test_nndescent gist/gist_base.fvecs gist_400nn_t3.graph 400 400 12 10 100


data dimension: 960
recall : 0.0005
iter: 0
recall : 0.0025
iter: 1
recall : 0.0214
iter: 2
recall : 0.0808
iter: 3
recall : 0.1699
iter: 4
recall : 0.2947
iter: 5
recall : 0.4444
iter: 6
recall : 0.5894
iter: 7
recall : 0.6987
iter: 8
recall : 0.7769
iter: 9
recall : 0.8317
iter: 10
recall : 0.8753
iter: 11
1000000
Time cost: 325.05


nsg/build/tests/test_nsg_index efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist_400nn_t3.graph 60 70 500 gist_t3.nsg

Degree Statistics: Max = 82, Min = 1, Avg = 21
indexing time: 889.543



nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t3.nsg 100 100 myindex_n/gist_t12.ivecs

search time: 2.92435
recall: 0.8654600000000001

--------------------Changing L at query time---------
nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t3.nsg 110 100 myindex_n/gist_t13.ivecs

search time: 4.2919
recall: 0.8793500000000001


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t3.nsg 120 100 myindex_n/gist_t14.ivecs

search time: 5.0113
recall: 0.8911199999999999


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t3.nsg 130 100 myindex_n/gist_t15.ivecs

search time: 5.32371
recall: 0.90133


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t3.nsg 140 100 myindex_n/gist_t16.ivecs

search time: 5.81129
recall: 0.91012


----------------------Same KNN graph as the third run; L and R changed when converting to NSG-------------

nsg/build/tests/test_nsg_index efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist_400nn_t3.graph 70 80 500 gist_t4.nsg

Degree Statistics: Max = 93, Min = 1, Avg = 21
indexing time: 886.57


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t4.nsg 100 100 myindex_n/gist_t17.ivecs

search time: 3.66675
recall: 0.87237

--------------------------------------Changing L at query time---

nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t4.nsg 110 100 myindex_n/gist_t18.ivecs

search time: 2.94138
recall: 0.88579


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t4.nsg 120 100 myindex_n/gist_t19.ivecs

search time: 4.15369
recall: 0.8971899999999999


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t4.nsg 130 100 myindex_n/gist_t20.ivecs

search time: 4.29626
recall: 0.90703


----------------------Same KNN graph as the third run; L and R changed when converting to NSG-------------

nsg/build/tests/test_nsg_index efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist_400nn_t3.graph 80 90 500 gist_t5.nsg


Degree Statistics: Max = 103, Min = 1, Avg = 21
indexing time: 897.102


nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t5.nsg 100 100 myindex_n/gist_t21.ivecs

search time: 3.3715
recall: 0.87698

-------------------------Changing L at query time----------

nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t5.nsg 110 100 myindex_n/gist_t22.ivecs

search time: 3.75139
recall: 0.88996

nsg/build/tests/test_nsg_optimized_search efanna_graph/tests/gist/gist_base.fvecs efanna_graph/tests/gist/gist_query.fvecs gist_t5.nsg 120 100 myindex_n/gist_t23.ivecs

search time: 5.45572
recall: 0.90122
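
The last three indexes were built from the same KNN graph (gist_400nn_t3.graph) with different L and R at conversion time, so their recall can be compared directly at each search L. The sketch below does that with values copied from the runs above (rounded to five decimals); the helper name gains_over_baseline is hypothetical.

```python
# Keys are the (L, R) pairs passed to test_nsg_index; values map search L to recall.
recall_by_config = {
    (60, 70): {100: 0.86546, 110: 0.87935, 120: 0.89112},
    (70, 80): {100: 0.87237, 110: 0.88579, 120: 0.89719},
    (80, 90): {100: 0.87698, 110: 0.88996, 120: 0.90122},
}

def gains_over_baseline(table, baseline):
    """Recall difference of every other config against the baseline, per search L."""
    base = table[baseline]
    return {
        config: {L: round(recalls[L] - base[L], 5) for L in base if L in recalls}
        for config, recalls in table.items()
        if config != baseline
    }

gains = gains_over_baseline(recall_by_config, (60, 70))
for config, per_L in gains.items():
    print(config, per_L)
```

In these runs the gain from a larger index-time L and R stays at roughly one percentage point of recall, while each extra 10 on the search-time L adds a similar amount.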

 
