Gremlin Parameterized Queries

邴修远
2023-12-01

First, here is how to query a Gremlin-capable graph database from Python, once with a parameterized query and once without.

  • Parameterized query

import json
import time

import requests

start_time = time.time()
request_body = {
    # The script refers to the variable `category` instead of a literal value.
    "gremlin": "g.V().has('Category', 'code', category)",
    # Values for the script's variables are sent separately in `bindings`.
    "bindings": {"category": "aircondition"},
    "language": "gremlin-groovy",
    "aliases": {
        "graph": "home",
        "g": "__g_{0}".format("home"),
    },
}
r = requests.post('http://127.0.0.1:8183/gremlin', data=json.dumps(request_body),
                  headers={"Content-Type": "application/json"})
print(f"time cost is {time.time()-start_time}")
print(r.json())

  • Non-parameterized query

import json
import time

import requests

start_time = time.time()
request_body = {
    # The value 'aircondition' is written directly into the script text.
    "gremlin": "g.V().has('Category', 'code', 'aircondition')",
    # No variables are used, so the bindings are empty.
    "bindings": {},
    "language": "gremlin-groovy",
    "aliases": {
        "graph": "home",
        "g": "__g_{0}".format("home"),
    },
}
r = requests.post('http://127.0.0.1:8183/gremlin', data=json.dumps(request_body),
                  headers={"Content-Type": "application/json"})
print(f"time cost is {time.time()-start_time}")
print(r.json())

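Comparing the two requests: unlike the non-parameterized form, the parameterized form defines a variable in the Gremlin script, and each request supplies the value of that variable in bindings, which is a dictionary of variable names and values.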
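To make the split between script and bindings concrete, here is a minimal sketch of a request builder that keeps the script text constant and changes only the bindings. The helper name `build_gremlin_request` and the second category value `refrigerator` are hypothetical and used only for illustration. The field names (`gremlin`, `bindings`, `language`, `aliases`) follow the Gremlin Server HTTP request format, and nothing is sent over the network here.

```python
import json


def build_gremlin_request(script, bindings, graph="home"):
    """Build a JSON-serializable request body for a Gremlin script.

    Hypothetical helper: the script text is passed through unchanged and
    the variable values travel separately in `bindings`.
    """
    return {
        "gremlin": script,
        "bindings": dict(bindings),
        "language": "gremlin-groovy",
        "aliases": {"graph": graph, "g": "__g_{0}".format(graph)},
    }


# One script text, reused for every category value.
SCRIPT = "g.V().has('Category', 'code', category)"

bodies = [
    build_gremlin_request(SCRIPT, {"category": value})
    for value in ("aircondition", "refrigerator")  # second value is made up
]

# The script is identical across requests; only the bindings differ.
assert bodies[0]["gremlin"] == bodies[1]["gremlin"]
print(json.dumps(bodies[1], indent=2))
```

Each body could then be posted with `requests.post(...)` exactly as in the snippets above. Because the script string never changes, the server sees the same script on every call.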

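The Gremlin Server documentation stresses that parameterized requests are critical to query performance: they avoid repeated compilation of the Gremlin script, which shortens query time. Because Gremlin Server caches every script it receives, parameterized requests also reduce the server's resource usage. An online service should therefore implement as many of its Gremlin queries as possible as parameterized requests. Otherwise every request that carries a new literal value is compiled and cached as a new script, and the memory used by Gremlin Server risks growing steadily. The relevant passages from the documentation: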

Parameterized request are considered the most efficient way to send Gremlin to the server as they can be cached, which will boost performance and reduce resources required on the server.
The bindings argument is a Map of variables where the keys become available as variables in the Gremlin script. Note that parameterization of requests is critical to performance, as repeated script compilation can be avoided on each request.
Use script parameterization. Period. Gremlin Server caches all scripts that are passed to it. The cache is keyed based on the a hash of the script. Therefore g.V(1) and g.V(2) will be recognized as two separate scripts in the cache. If that script is parameterized to g.V(x) where x is passed as a parameter from the client, there will be no additional compilation cost for future requests on that script. Compilation of a script should be considered "expensive" and avoided when possible.
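The caching behavior described in that passage can be illustrated with a toy model. This is only a sketch under stated assumptions: `ScriptCache` is a hypothetical class, not Gremlin Server's actual implementation, and "compilation" is reduced to a counter. It shows why inlined values produce one cache entry per distinct script, while a parameterized script produces a single entry.

```python
import hashlib


class ScriptCache:
    """Toy model of a compiled-script cache keyed by a hash of the script text.

    Hypothetical stand-in for illustration; `compile_count` counts how often
    the "expensive" compile step would have run.
    """

    def __init__(self):
        self._compiled = {}
        self.compile_count = 0

    def evaluate(self, script, bindings=None):
        key = hashlib.sha256(script.encode("utf-8")).hexdigest()
        if key not in self._compiled:
            # Cache miss: "compile" the script and remember it.
            self.compile_count += 1
            self._compiled[key] = script  # placeholder for a compiled script
        return self._compiled[key], bindings or {}

    def size(self):
        return len(self._compiled)


# Values inlined into the script text: every request is a new script.
inline = ScriptCache()
for vertex_id in range(1000):
    inline.evaluate("g.V({0})".format(vertex_id))

# Value passed as a binding: the script text never changes.
parameterized = ScriptCache()
for vertex_id in range(1000):
    parameterized.evaluate("g.V(x)", {"x": vertex_id})

print(inline.compile_count, inline.size())                # 1000 1000
print(parameterized.compile_count, parameterized.size())  # 1 1
```

In this model the inlined variant compiles and stores 1000 scripts, while the parameterized variant compiles once and reuses the cached entry, which mirrors the compile-cost and memory argument made above.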
