2023-05-05 11:08 已编辑暨南大学研发工程师

关注

表达式问题的三种解法

本文稍后将同步发布于知乎

本文基于

[1]Wadler, P.: The Expression Problem. Email (Nov 1998), discussion on the JavaGenericity mailing list

[2]Zenger, M., Odersky, M.: Independently extensible solutions to the expression prob-lem. In: FOOL’05 (2005)

[3]Oliveira, B.C.d.S., Cook, W.R. (2012). Extensibility for the Masses. In: Noble, J. (eds) ECOOP 2012 – Object-Oriented Programming. ECOOP 2012. Lecture Notes in Computer Science, vol 7313. Springer, Berlin, Heidelberg. doi:10.1007/978-3-642-31057-7_2

[4]SWIERSTRA, W. (2008). Data types à la carte. Journal of Functional Programming, 18(4), 423-436. doi:10.1017/S0956796808006758

[5]Bahr, P. (2014). Composing and decomposing data types. Proceedings of the 10th ACM SIGPLAN Workshop on Generic Programming, p. 71-82, 2014. https://doi.org/10.1145/2633628.2633635

[6]Carette, J., Kiselyov, O., Shan, Cc. (2007). Finally Tagless, Partially Evaluated. In: Shao, Z. (eds) Programming Languages and Systems. APLAS 2007. Lecture Notes in Computer Science, vol 4807. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76637-7_15

[7]Kiselyov, O. (2012). Typed Tagless Final Interpreters.In: Gibbons, J. (eds) Generic and Indexed Programming. Lecture Notes in Computer Science, vol 7470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32202-0_3

[8]Kiselyov, O., & Ishii, H. (2015a). Freer Monads, More Extensible Effects. Proceedings of the 2015 ACM SIGPLAN Symposium on Haskell, 94–105. Presented at the Vancouver, BC, Canada. doi:10.1145/2804302.2804319

表达式问题

如何在一个类型内部的 representation 和外部的 behavior 均根据需求不断修改、增加，而无需重新编译乃至修改原有代码的问题称为表达式问题(Expression Problem, EP)[Wadler, 1998]

要求：
双向可扩展性：解决方案必须允许添加新的数据变体和新的操作
无修改或重复：现有代码不得被修改或重复
分离编译和类型检查：安全检查或编译步骤不得推迟到链接或运行时进行
独立可扩展性：不同的拓展可以组合使用[Zenger & Odersky, 2005]

在 OOP 中

interface Exp { int eval(); }

class Lit implements Exp {
    int x;
    Lit(int x) { this.x = x; }
    public int eval() { return x; }
}

class Add implements Exp {
    Exp l, r;
    Add(Exp l, Exp r) { this.l = l; this.r = r; }
    public int eval() { return l.eval() + r.eval(); }
}

public class Main {
    public static void main(String[] args) {
        Exp e = new Add(new Lit(1), new Lit(2));
        System.out.println(r.eval());
    }
}

添加新的 data constructor 是简单的：

interface Exp { int eval(); }

class Lit implements Exp {
    int x;
    Lit(int x) { this.x = x; }
    public int eval() { return x; }
}

class Add implements Exp {
    Exp l, r;
    Add(Exp l, Exp r) { this.l = l; this.r = r; }
    public int eval() { return l.eval() + r.eval(); }
}

class Mult implements Exp {
    Exp l, r;
    Mult(Exp l, Exp r) { this.l = l; this.r = r; }
    public int eval() { return l.eval() * r.eval(); }
}

public class Main {
    public static void main(String[] args) {
        Exp e = new Add(new Lit(1), new Lit(2));
        Exp r = new Mult(e, new Lit(4));
        System.out.println(r.eval());
    }
}

修改 representation 就不那么简单了：如果我们在 IExp 中添加一个格式化输出的方法，那么下面的每一个类都要修改添加对应的实现方法

解决方法（Object Algebra）

An object algebra is a class that implements a generic abstract factory interface, which corresponds to a particular kind of algebraic signature[Oliveira, 2005]

使用 Visitor Pattern

interface Exp { <A> A visit(Visitor<A> vis); }

interface Visitor<A> {
  A lit(Lit a);
  A add(Add a);
}

class Lit implements Exp {
  int value;
  public <A> A visit(Visitor<A> vis) { return vis.lit(this); }
}

class Add implements Exp {
  Exp left, right;
  public <A> A visit(Visitor<A> vis) {
    return vis.add(left.visit(vis), right.visit(vis));
  }
}

class Eval implements Visitor<Integer> {
  public Integer lit(Lit a) { return a.value; }
  public Integer add(Add a) { return a.visit(this) + b.visit(this); }
}

无论是加 data constructor 还是 method 都只需修改 visitor，原有的代码无需修改

interface Exp { <A> A visit(Visitor<A> vis); }

interface Visitor<A> {
    A lit(Lit a);
    A add(Add a);
    A mult(Mult a);
}

class Lit implements Exp {
    int value;
    public Lit(int value) { this.value = value; }
    public <A> A visit(Visitor<A> vis) { return vis.lit(this); }
}

class Add implements Exp {
    Exp left, right;
    public Add(Exp left, Exp right) { this.left = left; this.right = right; }
    public <A> A visit(Visitor<A> vis) {
        return vis.add(this);
    }
}

class Mult implements Exp {
    Exp left, right;
    public Mult(Exp left, Exp right) { this.left = left; this.right = right; }
    public <A> A visit(Visitor<A> vis) {
        return vis.mult(this);
    }
}

class Eval implements Visitor<Integer> {
    public Integer lit(Lit a) { return a.value; }
    public Integer add(Add a) { return a.left.visit(this) + a.right.visit(this); }
    public Integer mult(Mult a) { return a.left.visit(this) * a.right.visit(this); }
}

class PrettyPrint implements Visitor<String> {
    public String lit(Lit a) { return Integer.toString(a.value); }
    public String add(Add a) { return "(" + a.left.visit(this) + " + " + a.right.visit(this) + ")"; }
    public String mult(Mult a) { return "(" + a.left.visit(this) + " * " + a.right.visit(this) + ")"; }
}

例题

PatternCraft - Visitor

在 FP 中

在函数式语言中的情形恰好相反，我们可以很轻松地添加新的函数，却不能简单地修改数据的 representation

Haskell 没有 OOP，并不能使用 Object Algebra。但是我们有别的解决方案。

解决方法（Finally Tagless）

remove the tags

将原本的数据构造子写作 typeclass 中的 method，添加不同的操作就是为这个 typeclass 实现新的实例。

tagless 得名于我们最后的结果里用方法而非代数数据类型（ADT）来表示数据构造。

现在我们只需要标注类型就可调用不同的操作

这个实现依然存在问题，如下：

我们的语义阻止不了把布尔值相加这种不合法的操作

稍微改写一下，我们不直接修改最后的结果，而是多加一层 Functor

然后给每个想要的操作写一个 newtype

问题解决

除了直接计算，我们还可以构建语法树

解决方法（Data types à la carte）

Fixing

将我们的 constructor 拆分开来

首先我们需要表达任意次函数应用的嵌套。我们引入了 Fix 不动点来解决这个问题。

其中 In 的类型为 `f (Fix f) -> Fix f`， out 的类型为 `Fix f -> f (Fix f)`

有了 Fix 我们就可以表达树状结构的表达式

现在我们需要表达不同操作之间的组合，这就引入了 coproduct

两个 Functor 的 coproduct 还是 Functor

如果 f 是 Functor，我们可以折叠任意层数的 f (F-algebra : https://en.wikipedia.org/wiki/F-algebra)

其中 `foldFix f :: Fix f -> a`，`fmap (foldFix f) :: f (Fix f) -> f a`

foldFix 的第一个参数被称为 algebra，她每一次折叠一层 f a，而 foldFix 将这些操作应用到每一层

Evaluating

定义运算：

Automating Injection

现在这个写法很别扭，写一个表达式要套很多层 In。我们可以用自动化注入来解决这个问题。

:<: 称为 Subsume。这一段 typeclass 定义了 Subtyping (sub 是 sup 的子类型，sup 是 sub 的父类型)。OOP 中提及到的多态基本都是子类型多态。一句话解释：subtyping 是类型上的偏序关系，如果 A 是 B 的子类型，那么 A 可以出现在任何 B 应该出现的位置。