2.1 Subsumption and Unification

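It is standard to think of feature structures as providing partial information about some object, in the sense that we can order feature structures according to how general they are. For example, (23a) is more general (carries less information) than (23b), which is in turn more general than (23c).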

  1. [NUMBER = 74]
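The ordering by generality described above can be sketched in plain Python. This is only a rough illustration that models feature structures as nested dicts; the function name `subsumes` and the values `b` and `c` are assumptions chosen to extend `[NUMBER = 74]`, and the sketch ignores structure sharing, which a full definition of subsumption would also have to respect.

```python
def subsumes(general, specific):
    """Return True if every feature/value pair in `general` also occurs in `specific`."""
    for feature, value in general.items():
        if feature not in specific:
            return False
        other = specific[feature]
        if isinstance(value, dict) and isinstance(other, dict):
            # Nested structures are compared recursively.
            if not subsumes(value, other):
                return False
        elif value != other:
            # Atomic values (or mismatched kinds) must be equal.
            return False
    return True


a = {"NUMBER": 74}
b = {"NUMBER": 74, "STREET": "rue Pascal"}                   # illustrative value
c = {"NUMBER": 74, "STREET": "rue Pascal", "CITY": "Paris"}  # illustrative value

print(subsumes(a, b), subsumes(b, c), subsumes(c, a))
```

The more general structure always appears as the first argument: `a` subsumes `b`, but `c` does not subsume `a`.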

Unification is formally defined as a (partial) binary operation: FS<sub>0</sub> ⊔ FS<sub>1</sub>. Unification is symmetric, so FS<sub>0</sub> ⊔ FS<sub>1</sub> = FS<sub>1</sub> ⊔ FS<sub>0</sub>. The same is true in Python:

    >>> print(fs2.unify(fs1))
    [ CITY   = 'Paris'      ]
    [ NUMBER = 74           ]
    [ STREET = 'rue Pascal' ]

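If we unify two feature structures that stand in the subsumption relationship, the result of unification is the more specific of the two. Unification fails, however, when the two structures assign distinct atomic values to the same path; in that case the result is `None`: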

    >>> import nltk
    >>> fs0 = nltk.FeatStruct(A='a')
    >>> fs1 = nltk.FeatStruct(A='b')
    >>> fs2 = fs0.unify(fs1)
    >>> print(fs2)
    None
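The properties discussed so far (symmetry, unification of subsuming structures, and failure on clashing atoms) can be illustrated with a small stand-in for unification over nested dicts. This is a simplified sketch, not NLTK's implementation: the function `unify` and the variable names below are hypothetical, and structure sharing and variables are not handled.

```python
def unify(fs0, fs1):
    """Unify two nested-dict feature structures; return None on a clash."""
    result = dict(fs0)  # shallow copy; nested values are never mutated in place
    for feature, value in fs1.items():
        if feature not in result:
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            merged = unify(result[feature], value)
            if merged is None:
                return None  # a clash deeper down makes the whole unification fail
            result[feature] = merged
        elif result[feature] != value:
            return None  # distinct atomic values on the same path
    return result


# Illustrative values, not taken from the text.
street_info = {"NUMBER": 74, "STREET": "rue Pascal"}
city_info = {"CITY": "Paris"}
full = {"NUMBER": 74, "STREET": "rue Pascal", "CITY": "Paris"}

print(unify(street_info, city_info) == unify(city_info, street_info))  # symmetry
print(unify(street_info, full) == full)     # the more specific structure wins
print(unify({"A": "a"}, {"A": "b"}))        # clash on A
```

Returning `None` for failure mirrors the behaviour shown in the session above, which is why callers need to check the result before using it.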

Now, if we look at how unification interacts with structure sharing, things become really interesting. First, let's define (21) in Python:

    >>> fs0 = nltk.FeatStruct("""[NAME=Lee,
    ...                           ADDRESS=[NUMBER=74,
    ...                                    STREET='rue Pascal'],
    ...                           SPOUSE= [NAME=Kim,
    ...                                    ADDRESS=[NUMBER=74,
    ...                                             STREET='rue Pascal']]]""")
    >>> print(fs0)
    [ ADDRESS = [ NUMBER = 74           ]               ]
    [           [ STREET = 'rue Pascal' ]               ]
    [                                                   ]
    [ NAME    = 'Lee'                                   ]
    [                                                   ]
    [           [ ADDRESS = [ NUMBER = 74           ] ] ]
    [ SPOUSE  = [           [ STREET = 'rue Pascal' ] ] ]
    [           [                                     ] ]
    [           [ NAME    = 'Kim'                     ] ]

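What happens when we augment Kim's address with a specification for CITY? Notice that fs1 needs to include the whole path from the root of the feature structure down to CITY.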

    >>> fs1 = nltk.FeatStruct("[SPOUSE = [ADDRESS = [CITY = Paris]]]")
    >>> print(fs1.unify(fs0))
    [ ADDRESS = [ NUMBER = 74           ]               ]
    [           [ STREET = 'rue Pascal' ]               ]
    [                                                   ]
    [ NAME    = 'Lee'                                   ]
    [                                                   ]
    [           [           [ CITY   = 'Paris'      ] ] ]
    [           [ ADDRESS = [ NUMBER = 74           ] ] ]
    [ SPOUSE  = [           [ STREET = 'rue Pascal' ] ] ]
    [           [                                     ] ]
    [           [ NAME    = 'Kim'                     ] ]
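The requirement that the new information spell out the whole path from the root can be made concrete with a small helper. The sketch below uses nested dicts as a stand-in for feature structures; the name `from_path` is hypothetical and is not part of NLTK.

```python
def from_path(path, value):
    """Wrap `value` in nested dicts so that it sits at the end of `path`."""
    fs = value
    # Build from the innermost feature outwards.
    for feature in reversed(path):
        fs = {feature: fs}
    return fs


city_for_spouse = from_path(("SPOUSE", "ADDRESS", "CITY"), "Paris")
print(city_for_spouse)
```

A structure that contained only `CITY` at the top level would instead describe a city for the root object, not for the spouse's address.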

By contrast, the result is very different if fs1 is unified with the structure-sharing version fs2 (shown as the graph (22)):

    >>> fs2 = nltk.FeatStruct("""[NAME=Lee, ADDRESS=(1)[NUMBER=74, STREET='rue Pascal'],
    ...                           SPOUSE=[NAME=Kim, ADDRESS->(1)]]""")
    >>> print(fs1.unify(fs2))
    [               [ CITY   = 'Paris'      ] ]
    [ ADDRESS = (1) [ NUMBER = 74           ] ]
    [               [ STREET = 'rue Pascal' ] ]
    [                                         ]
    [ NAME    = 'Lee'                         ]
    [                                         ]
    [ SPOUSE  = [ ADDRESS -> (1)  ]           ]
    [           [ NAME    = 'Kim' ]           ]

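Rather than just updating what was in effect Kim's "copy" of Lee's address, we have now updated both their addresses at the same time. More generally, if a unification involves specifying the value of some path π, then that unification simultaneously specifies the value of any path that is equivalent to π.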
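One way to build intuition for the difference between the two results is Python's own distinction between equal objects and identical objects. The sketch below is an analogy only: it mutates nested dicts in place rather than performing unification, and the names `copied` and `shared` are illustrative.

```python
import copy

address = {"NUMBER": 74, "STREET": "rue Pascal"}

# Two separate but equal address values, in the spirit of fs0.
copied = {
    "NAME": "Lee",
    "ADDRESS": copy.deepcopy(address),
    "SPOUSE": {"NAME": "Kim", "ADDRESS": copy.deepcopy(address)},
}

# One address object reachable by two paths, in the spirit of fs2.
shared_address = copy.deepcopy(address)
shared = {
    "NAME": "Lee",
    "ADDRESS": shared_address,
    "SPOUSE": {"NAME": "Kim", "ADDRESS": shared_address},
}

# Add CITY along the path SPOUSE -> ADDRESS in both structures.
for fs in (copied, shared):
    fs["SPOUSE"]["ADDRESS"]["CITY"] = "Paris"

print("CITY" in copied["ADDRESS"])  # Lee's separate copy is unaffected
print("CITY" in shared["ADDRESS"])  # the shared value is updated on both paths
```

In the shared case the two paths are equivalent because they lead to the same object, so information added along one path is visible along the other.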

As we have already seen, structure sharing can also be stated using variables such as ?x:

    >>> fs1 = nltk.FeatStruct("[ADDRESS1=[NUMBER=74, STREET='rue Pascal']]")
    >>> fs2 = nltk.FeatStruct("[ADDRESS1=?x, ADDRESS2=?x]")
    >>> print(fs2)
    [ ADDRESS1 = ?x ]
    [ ADDRESS2 = ?x ]
    >>> print(fs2.unify(fs1))
    [ ADDRESS1 = (1) [ NUMBER = 74           ] ]
    [                [ STREET = 'rue Pascal' ] ]
    [                                          ]
    [ ADDRESS2 -> (1)                          ]
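The effect of a repeated variable can be imitated in a toy form: once a variable is bound, every occurrence is replaced by the same object. The sketch below assumes flat templates whose variables are strings beginning with `?`; the function `instantiate` is hypothetical and much simpler than real unification (for instance, it does not check for conflicting bindings).

```python
def instantiate(template, data):
    """Bind '?name' variables in a flat template using matching features in `data`."""
    bindings = {}
    for feature, value in template.items():
        is_variable = isinstance(value, str) and value.startswith("?")
        if is_variable and feature in data:
            # First binding wins; a fuller version would unify repeated bindings.
            bindings.setdefault(value, data[feature])
    # Replace every occurrence of a bound variable with the same object.
    return {
        feature: bindings.get(value, value) if isinstance(value, str) else value
        for feature, value in template.items()
    }


address = {"NUMBER": 74, "STREET": "rue Pascal"}
template = {"ADDRESS1": "?x", "ADDRESS2": "?x"}

result = instantiate(template, {"ADDRESS1": address})
print(result["ADDRESS1"] is result["ADDRESS2"])
```

Because both features end up pointing at one object, the outcome corresponds to the `(1)` tag and the `-> (1)` pointer in the printed result above.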