[Erlang 0008] Match Specifications in Erlang

司空瑾瑜
2023-12-01

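At the end of the previous post I mentioned using ets:fun2ms for complex queries against ETS. For details, see: [Erlang 0007] Erlang ETS Table 二三事

Recently I came across the following question about match specifications in the Erlang community: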

http://erlang.2086793.n4.nabble.com/RE-select-count-bug-td2087163.html

--------------- begin quote ---------------

1> T = ets:new(task_table, [set, protected, {keypos, 1}]).
16
2> ets:insert(T, {1, "hi", b}).
true
3> ets:select(T, [{{'_', "hi", '_'},[],['$_']}]).
[{1,"hi",b}]
4> ets:select_count(T, [{{'_', "hi", '_'},[],['$_']}]).
0
5> ets:select_count(T, [{{'_', "hi", '_'},[],[true]}]).
1

You have to modify the select pattern slightly for
select_count. It takes the same type of match spec as
select_delete/2 -- not as select/2.

 

--------------- end quote ---------------
 
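This is really a misunderstanding of ets:select_count/2. I struggled with it myself when I first used the function. Let's look at how it is meant to be used, starting with the documentation: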
select_count(Tab, MatchSpec) -> NumMatched
Types:
Tab = tid() | atom()
Object = tuple()
MatchSpec = match_spec()
NumMatched = integer()
Matches the objects in the table Tab using a match_spec. If the match_spec returns true for an object, that object
considered a match and is counted. For any other result from the match_spec the object is not considered a match
and is therefore not counted.
The function could be described as a match_delete/2 that does not actually delete any elements, but only counts
them.
The function returns the number of objects matched.
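The key sentence is this one: If the match_spec returns true for an object, that object considered a match and is counted.
An object is counted only when the match_spec returns true for it. This is easy to get wrong, so let's reproduce the scenario above.
First, build the data: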
ets:new(task_table, [set, protected,named_table, {keypos, 1}]),
ets:insert(task_table ,{1, "hi", 100 }),
ets:insert(task_table ,{2, "hi", 200 }),
ets:insert(task_table,{3, "ab",300}),
Next, the first form he tried:
Count0= ets:select_count(task_table,[{{'_', "hi", '_'},[],['$_']}]),
io:format("Count0: ~p ~n" ,[Count0]),

This form yields 0.

Let's rewrite it with fun2ms, which is easier to read:
MS=ets:fun2ms(fun(T={A,B,C}) when B=:="hi" -> T end),
io:format("MS: ~p ~n" ,[MS]),
Count= ets:select_count(task_table,MS),
io:format("Count: ~p ~n" ,[Count]),
Output: MS: [{{'$1','$2','$3'},[{'=:=','$2',"hi"}],['$_']}]
%Count: 0

MS2=ets:fun2ms(fun(T={A,B,C}) when B=:="hi" -> true end),
io:format("MS2 ~p ~n ",[MS2]),
Count2 = ets:select_count(task_table,MS2),
io:format("Count2: ~p ~n" ,[Count2]).
Output of this version:
%MS2 [{{'$1','$2','$3'},[{'=:=','$2',"hi"}],[true]}]
% Count2: 2
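The comparison above can be packaged as a small module. This is a minimal sketch: the module name count_demo and the function run/0 are made up for illustration, and the table is unnamed so the function can be called repeatedly. Inside a module, ets:fun2ms only works when ms_transform.hrl is included, which the shell does not require.

```erlang
-module(count_demo).
-export([run/0]).

%% Required so that ets:fun2ms/1 is expanded at compile time in a module.
-include_lib("stdlib/include/ms_transform.hrl").

%% Returns {CountWithObjectBody, CountWithTrueBody}.
run() ->
    Tab = ets:new(task_table, [set, protected, {keypos, 1}]),
    true = ets:insert(Tab, [{1, "hi", 100}, {2, "hi", 200}, {3, "ab", 300}]),
    %% The body returns the whole object, so select_count counts nothing.
    ReturnObject = ets:fun2ms(fun(T = {_, B, _}) when B =:= "hi" -> T end),
    %% The body returns true, so every matching object is counted.
    ReturnTrue = ets:fun2ms(fun({_, B, _}) when B =:= "hi" -> true end),
    Result = {ets:select_count(Tab, ReturnObject),
              ets:select_count(Tab, ReturnTrue)},
    true = ets:delete(Tab),
    Result.
```

Calling count_demo:run() should return {0,2}, matching the Count and Count2 values shown above.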

Match specifications

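Match specifications in Erlang are used in two main places: erlang:trace_pattern/2 and ETS queries. A match specification is built from Erlang terms rather than Erlang code. In form it resembles a small Erlang function, but once compiled it runs faster than the equivalent Erlang function. Because it consists only of terms and not the full language, its expressive power is limited. I have only used match specifications (MS for short below) with ETS, so this post collects the ETS-related points: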

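An MS used with ETS always produces a return value, and its MatchHead is matched against a tuple() (the stored object). So "If the match_spec returns true for an object" in the documentation refers to the value returned by the MS body, which in the code above is this line: MS2=ets:fun2ms(fun(T={A,B,C}) when B=:="hi" -> true end). select_delete(Tab, MatchSpec) -> NumDeleted works the same way and also requires the MS to return true. In the call that the poster thought returned the wrong value, ets:select_count(task_table,[{{'_', "hi", '_'},[],['$_']}]), the MS body is '$_' rather than true, which does not meet the function's requirement on the MS return value. So what does '$_' stand for?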

If we change select_count to select and look at the result of ets:select(task_table,[{{'_', "hi", '_'},[],['$_']}]), we get [{1,"hi",100},{2,"hi",200}]. In other words, '$_' returns the whole record that matched.

  ets:select(task_table,[{{'_', "hi", '_'},[],['$_']}]).
 [{1,"hi",100},{2,"hi",200}]
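The same requirement on the body applies to select_delete/2, mentioned earlier. The following sketch (the module name delete_demo is hypothetical) runs the two bodies against the same data and reports how many objects each call deleted and what is left in the table.

```erlang
-module(delete_demo).
-export([run/0]).

%% Returns {DeletedWithObjectBody, DeletedWithTrueBody, RemainingObjects}.
run() ->
    Tab = ets:new(task_table, [set, protected, {keypos, 1}]),
    true = ets:insert(Tab, [{1, "hi", 100}, {2, "hi", 200}, {3, "ab", 300}]),
    Head = {'_', "hi", '_'},
    %% The body '$_' does not evaluate to true, so nothing is deleted.
    NotDeleted = ets:select_delete(Tab, [{Head, [], ['$_']}]),
    %% The body true deletes (and counts) every matching object.
    Deleted = ets:select_delete(Tab, [{Head, [], [true]}]),
    Remaining = ets:tab2list(Tab),
    true = ets:delete(Tab),
    {NotDeleted, Deleted, Remaining}.
```

Calling delete_demo:run() should return {0,2,[{3,"ab",300}]}.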

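'$$' is similarly special: it returns all bound match variables as a list, ordered by variable number ('$1', '$2', ...) rather than by their position in the MatchHead: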

 ets:select(task_table,[{{'$2', "hi", '$1'},[],['$$']}]).
[[100,1],[200,2]]

  ets:select(task_table,[{{'$3', "hi", '$1'},[],['$$']}]).
[[100,1],[200,2]]
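To see '$_', '$$' and an explicitly constructed result side by side, here is a small sketch. The module name vars_demo is an assumption made for illustration. Results are sorted before being returned because a set table gives no ordering guarantee.

```erlang
-module(vars_demo).
-export([run/0]).

%% Returns {WholeObjects, AllVariables, PickedFields}, each sorted.
run() ->
    Tab = ets:new(task_table, [set, protected, {keypos, 1}]),
    true = ets:insert(Tab, [{1, "hi", 100}, {2, "hi", 200}, {3, "ab", 300}]),
    %% '$2' binds the key, '$1' binds the third field.
    Head = {'$2', "hi", '$1'},
    %% '$_' yields the whole matched object.
    Whole = ets:select(Tab, [{Head, [], ['$_']}]),
    %% '$$' yields the bound variables ordered by number: ['$1', '$2'].
    AllVars = ets:select(Tab, [{Head, [], ['$$']}]),
    %% Double braces construct a tuple in the body, here {Key, Third}.
    Picked = ets:select(Tab, [{Head, [], [{{'$2', '$1'}}]}]),
    true = ets:delete(Tab),
    {lists:sort(Whole), lists:sort(AllVars), lists:sort(Picked)}.
```

Calling vars_demo:run() should return {[{1,"hi",100},{2,"hi",200}], [[100,1],[200,2]], [{1,100},{2,200}]}.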

 

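Below is an excerpt from the MS specification. The rules for ExprMatchVariable and Constant are the ones most relevant to this post: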

        A match_spec used in ets can be described in this informal grammar:

  • MatchExpression ::= [ MatchFunction, ... ]
  • MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
  • MatchHead ::= MatchVariable | '_' | { MatchHeadPart, ... }
  • MatchHeadPart ::= term() | MatchVariable | '_'
  • MatchVariable ::= '$<number>'
  • MatchConditions ::= [ MatchCondition, ...] | []
  • MatchCondition ::= { GuardFunction } | { GuardFunction, ConditionExpression, ... }
  • BoolFunction ::= is_atom | is_constant | is_float | is_integer | is_list | is_number | is_pid | is_port | is_reference | is_tuple | is_binary | is_function | is_record | is_seq_trace | 'and' | 'or' | 'not' | 'xor' | andalso | orelse
  • ConditionExpression ::= ExprMatchVariable | { GuardFunction } | { GuardFunction, ConditionExpression, ... } | TermConstruct
  • ExprMatchVariable ::= MatchVariable (bound in the MatchHead) | '$_' | '$$'
  • TermConstruct ::= {{}} | {{ ConditionExpression, ... }} | [] | [ConditionExpression, ...] | NonCompositeTerm | Constant
  • NonCompositeTerm ::= term() (not list or tuple)
  • Constant ::= {const, term()}
  • GuardFunction ::= BoolFunction | abs | element | hd | length | node | round | size | tl | trunc | '+' | '-' | '*' | 'div' | 'rem' | 'band' | 'bor' | 'bxor' | 'bnot' | 'bsl' | 'bsr' | '>'| '>=' | '<' | '=<' | '=:=' | '==' | '=/=' | '/=' | self | get_tcw
  • MatchBody ::= [ ConditionExpression, ... ]
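As an illustration of how the grammar composes, here is a hand-written MS checked with ets:test_ms/2, which runs an MS against a single tuple without needing a table. This is a sketch: the module name grammar_demo and the threshold 150 are arbitrary choices for the example.

```erlang
-module(grammar_demo).
-export([ms/0, run/0]).

%% One MatchFunction: {MatchHead, MatchConditions, MatchBody}.
ms() ->
    [{{'$1', '$2', '$3'},
      %% MatchConditions: one guard built from nested GuardFunctions.
      [{'andalso', {'=:=', '$2', "hi"}, {'>', '$3', {const, 150}}}],
      %% MatchBody: {{...}} constructs a tuple; '*' is a GuardFunction.
      [{{'$1', {'*', '$3', 2}}}]}].

%% Runs the MS against a matching and a non-matching tuple.
run() ->
    {ets:test_ms({2, "hi", 200}, ms()),
     ets:test_ms({1, "hi", 100}, ms())}.
```

Calling grammar_demo:run() should return {{ok,{2,400}},{ok,false}}: the first tuple passes the guard and yields the constructed result, and the second fails the '>' test, which test_ms reports as false.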

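The grammar above includes a rule for constants. stdlib has an example of this: when a variable is passed into the MS from outside, its value is embedded into the expression as a constant at the point where the MS is built: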

X=3.

ets:fun2ms(fun({M,N}) when N > X -> M end).
[{{'$1','$2'},[{'>','$2',{const,3}}],['$1']}]
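The same behaviour can be observed from a compiled module. A minimal sketch, assuming a hypothetical module const_demo with a function bigger_than/1: the outer variable X is bound when the function is called, and its current value ends up wrapped in {const, ...} inside the resulting MS.

```erlang
-module(const_demo).
-export([bigger_than/1]).

%% Required so that ets:fun2ms/1 is expanded at compile time in a module.
-include_lib("stdlib/include/ms_transform.hrl").

%% Builds an MS that selects M from {M, N} objects where N > X.
%% X is imported from the enclosing function and becomes {const, X}.
bigger_than(X) ->
    ets:fun2ms(fun({M, N}) when N > X -> M end).
```

Calling const_demo:bigger_than(3) should return [{{'$1','$2'},[{'>','$2',{const,3}}],['$1']}], the same term the shell produced above.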

 

The full documentation for match specifications is here: <Match specifications in Erlang> http://www.erlang.org/doc/apps/erts/match_spec.html
