Haskell 学习笔记06

导航页

Haskell 是纯的 lambda 演算法实现。Haskell 的类型系统是 System F 的改进版本，改进部分包括通用递归、Hindley-Milner 系统实现的类型推断等。

为何需要类型？

TL;DR：lol，看看 JavaScript。

类型避免或减少错误产生，这显而易见。Bool 类型只有两个可能的值，True 或 False，如果期待 Bool 但传入 123，这无疑过不了编译。Haskell 的类型是静态的（static），并在编译时（compile time）检查。

良好设计的类型系统还能使程序更快。因为编译器能够更好地预测程序运行和优化。

类型签名

可以查询类型签名：

ghci> :t 't'
't' :: Char
ghci> :t "Hello"
"Hello" :: String
ghci> :t False
False :: Bool

查询数字类型时，:t 将返回类型信息，而不是具体类型。直到类型被声明或编译器被迫推断，编译器都不知道具体类型。因此，对于数字字面量，编译器默认为最多态的类型 Num。

ghci> :t a
a :: Num p => p
ghci> b = 13 :: Integer
ghci> :t b
b :: Integer

之前我们已经提到，Num p => 这种写法为类型约束。

函数类型

在 Haskell 中，箭头符 (->) 是函数的类型构造器。其签名类似元组构造器 (,)：

ghci> :i (->)
data (->) a b
ghci> :i (,)
data (,) a b = (,) a b

元组需要应用两个参数才能构造，函数同样有两个参数，一个输入和一个结果。但和元组构造函数不同，函数类型没有数据构造器。因此，函数就是值。

函数的特点是可以被应用，箭头符是一个中缀操作符，有两个形参并向右结合。

看看两个例子：

fst :: (a, b) ->   a
--       [1]  [2] [3]

第一个形参为 (a, b) 类型，一个双元组。
函数类型，(->)，有两个形参，一个是 (a, b) 另一个是结果 a。
函数的结果类型 a，和元组中的 a 一致。

length :: [a] -> Int

length 函数有一个形参 [a]，是一个列表，并且返回 Int 结果。该函数不关心列表的元素类型，因此没有类型限制。

类型限制

(+) :: Num a => a -> a -> a
(/) :: Fractional a => a -> a -> a

简单地说，这些函数要求两个实参，并返回一个结果，三者的类型一样。

编译器给出的是最不具体、最通用的类型。a 是一个有类型类限制的多态类型变量。我们说 a 是限制了的（constrained），因为我们不知道 a 的具体类型。但我们要求，a 必须具有 Num 的实例。

ghci> fifteen = 15
ghci> :t fifteen
fifteen :: Num p => p
ghci> fifteenInt = fifteen :: Int
ghci> fifteenDouble = fifteen :: Double
ghci> :t fifteenInt
fifteenInt :: Int
ghci> :t fifteenDouble
fifteenDouble :: Double
ghci> :info Num
...
instance Num Int -- Defined in ‘GHC.Num’
instance Num Double -- Defined in ‘GHC.Float’

Num 和 Int 或 Double 相加是可行的：

ghci> fifteen + fifteenDouble
30.0
ghci> fifteen + fifteenInt
30

然而 Double 不能和 Int 相加：

ghci> fifteenDouble + fifteenInt
<interactive>:40:18: error:
    • Couldn't match expected type ‘Double’ with actual type ‘Int’
    • In the second argument of ‘(+)’, namely ‘fifteenInt’
      In the expression: fifteenDouble + fifteenInt
      In an equation for ‘it’: it = fifteenDouble + fifteenInt

一个变量可能有多个类型签名：

(Num a, Num b) => a -> b -> b
-- or
(Ord a, Num a) => a -> a -> Ordering

这里的类型限制长得像元组，但实际上这个语法表示积类型。

在第二个例子中，a 必须同时具有 Ord 和 Num 的实例。

柯里化

Haskell 的函数只能有一个实参和返回值。因此要实现多个实参则必须柯里化。

(+) :: Num a => a -> a -> a
      |    1   |    2    | [3]

类型限制，要求 a 必须具有 Num 实例。
通过将函数嵌套，可以实现类似多个参数的效果。连续应用函数，每次都接受一个参数并返回值。
返回类型为 a 的值

因为 (->) 是右结合的中缀操作符，所以会这样解析：

f :: a -> a -> a
f :: a -> (a -> a)
map :: (a -> b) -> [a] -> [b]
map :: (a -> b) -> ([a] -> [b])

请时刻注意，当 lambda 表达式有两个形参时，实质上是嵌套的 lambda。

部分应用

柯里化允许我们部分应用一个函数。

addStuff :: Integer -> Integer -> Integer
addStuff a b = a + b + 5

ghci> :t addStuff
addStuff :: Integer -> Integer -> Integer
ghci> let addTen = addStuff 5
ghci> :t addStuff
addStuff :: Integer -> Integer -> Integer
ghci> addTen = addStuff 5
ghci> fifteen = addTen 5
ghci> fifteen
15
ghci> addTen 15
25
ghci> addStuff 5 5
15

如上面写的，柯里化允许我们应用部分参数并创建一个新函数，这也叫部分应用（partial application）。

手动柯里和反柯里

Haskell 默认柯里化，但也可以反柯里化函数。反柯里化（uncurrying）意味着将嵌套结构铺平，并用相应个数的元组替代。如果反柯里化 (+)，类型签名将从 Num a => a -> a -> a 变为 Num a => (a, a) -> a。

反柯里化的函数：一个函数多个实参
柯里化的函数：很多函数，每次单个实参

nonsense :: Bool -> Integer
nonsense True = 805
nonsense False = 31337

curriedFunction :: Integer -> Bool -> Integer
curriedFunction i b =
  i + nonsense b

uncurriedFunction :: (Integer, Bool) -> Integer
uncurriedFunction (i, b) = i + nonsense b

anonymous :: Integer -> Bool -> Integer
anonymous = \i b -> i + nonsense b

anonNested :: Integer -> Bool -> Integer
anonNested = \i -> \b -> i + nonsense b

ghci> curriedFunction 10 False
31347
ghci> anonymous 10 False
31347
ghci> anonNested 10 False
31347

以函数功能上上都是相同的。annoNested 手动嵌套了匿名 lambda 实现和 curriedFunction 一样的柯里化。那些看起来接受多个参数的函数如 a -> a -> a -> a 实质上是高阶函数（higher-order functions）。每次应用都返回一个新函数，知道没有更多 (->) 为止。

柯里化和反柯里化现存函数

柯里化：

ghci> curry f a b = f (a, b)
ghci> :t curry
curry :: ((a, b) -> t) -> a -> b -> t
ghci> :t fst
fst :: (a, b) -> a
ghci> :t curry fst
curry fst :: t -> b -> t
ghci> fst (1, 2)
1
ghci> curry fst 1 2
1

反柯里化：

ghci> uncurry f (a, b) = f a b
ghci> :t uncurry
uncurry :: (t1 -> t2 -> t3) -> (t1, t2) -> t3
ghci> :t (+)
(+) :: Num a => a -> a -> a
ghci> (+) 1 2
3
ghci> uncurry (+) (1, 2)
3

分片

分片（sectioning）特指对中缀操作符的部分应用，有一个特殊的语法，并且可以选择应用于第一个参数还是第二个。

ghci> let x = 5
ghci> let y = (2^)
ghci> let z = (^2)
ghci> y x
32
ghci> z x
25

满足交换律的函数，例如加，实参的位置并不重要。看看另一个例子：

ghci> let celebrate = (++ " woot!")
ghci> celebrate "naptime"
"naptime woot!"
ghci> celebrate "dogs"
"dogs woot!"

也可以对反引号包裹的前缀函数使用：

ghci> elem 9 [1..10]
True
ghci> 9 `elem` [1..10]
True
ghci> let c = (`elem` [1..10])
ghci> c 9
True
ghci> c 25
False

如果直接应用 elem，那么只能应用于第一个参数：

ghci> let hasTen = elem 10
ghci> hasTen [1..9]
False
ghci> hasTen [5..15]
True

多态性

多态能够避免反复实现同一个功能。

在 Haskell 中，多态分为两种：参数多态（parametric polymorphism）和约束多态（constrained polymorphism）。约束多态也叫特设多态（ad hoc polymorphism）。

参数多态比约束多态更加广泛。参数多态是指完全多态的类型变量或参数，具体类型不受任何限制。另一方面，特设多态允许增加更多操作，但减少了具体类型的数量。

类型签名中，小写的名字是一个类型变量，而且是多态的；如果名字是大写的，那么就是一个特定的、具体的类型。

不妨查看恒等式的类型签名：

id :: a -> a

这个类型意味着：对于所有 \(a\)，都可取得一个 \(a\) 的实参，并返回一个类型为 \(a\) 的值。

该函数可以和任何类型交互：

Prelude> id 1
1
Prelude> id "blah"
"blah"
Prelude> id [(1,2,3), (1,2,3), (2,3,4)]
[(1,2,3),(1,2,3),(2,3,4)]

以上的签名除了返回输入，不可能有更多功能了。因为参数没有附加任何信息或方案。

而特设多态允许更多的功能，比如 negate 的类型签名为：

Num a => a -> a

现在可以使用 Num 实现的方法。

具体的类型可以有更多灵活性。Int 可以使用 Num 和 Integral 类型的方法，因为具有两种类型的特征。

多态常量

Haskell 具有类型限制，但直觉上而言，我们希望 -10 + 6.3 这种运算能够通过编译：

Prelude> (-10) + 6.3
-3.7

为什么成功了？看看它们的类型：

 Prelude> :t (-10) + 6.3
(-10) + 6.3 :: Fractional a => a
 Prelude> :t (-10)
(-10) :: Num a => a

数字字面量如 (-10) 和 6.3 都是多态的，直到被要求一个显式的类型。在这个例子中，Num a => 或 Fractional => 是类型类约束，\(a\) 是类型变量。对于整个表达式的类型，编译器推断为 Fractional。这是为了容纳 6.3 所必须的。同时，(-10) 默认为约束最少的多态 Num，可以是任何类型的数。我们称其为多态常量（polymorphic constant）。(-10) 不是常量，但它实例化的类型可以是任意类型，所以它是多态的。

可以通过手动写类型注解的方式，让编译器知道具体的类型：

Prelude> let x = 5 + 5
Prelude> :t x
x :: Num a => a
Prelude> let x = 5 + 5 :: Int
Prelude> :t x
x :: Int

类型转换

Prelude> 6 / length [1, 2, 3]
No instance for (Fractional Int) arising
  from a use of ‘/’

以上代码当然不会通过编译，因为 length 函数返回 Int。Int 没有 Fractional 的实例。

可以使用以下函数转换类型：

fromIntegral :: (Num b, Integral a) => a -> b

Prelude> 6 / fromIntegral (length [1, 2, 3])
2.0

类型推断

Haskell 不要求为每个表达式都标准类型。因为有类型推断（type inference）。类型推断是一种用于确定表达式类型的算法。Haskell 的类型推断建立在 Damas-Hindley-Milner 类型系统的拓展之上。Haskell 将推断出最普遍，最为广泛的类型。

在编写新代码时，这很有用。但如果代码已经写完，最好显式标注顶层的函数或常量。

可以试试让 GHC 推断类型：

Prelude> myGreet x = "Hello, " ++ x
Prelude> :t myGreet
myGreet :: [Char] -> [Char]

这里编译器推断的很正确，因为 ++ 给了足够的信息。

而如果这样：

Prelude> let myGreet x y = x ++ y
Prelude> :type myGreet
myGreet :: [a] -> [a] -> [a]

编译器便只能给出最广泛版本的泛型。

为声明断言类型

多数时候，我们希望显式声明类型而不是依赖类型推断：

-- type declaration
triple :: Integer -> Integer
-- function declaration
triple x = x * 3

虽然不常用，但也可以在 where 或 let 子句中声明函数的类型：

triple x = tripleItYo x
  where tripleItYo :: Integer -> Integer
    tripleItYo y = y * 3

若我们不声明，GHC 将推断为：

Prelude> tripleItYo y = y * 3
Prelude> :t tripleItYo
tripleItYo :: Num a => a -> a

当然，错误的类型会报错：

Prelude> let x = 5 + 5 :: String
No instance for (Num String) arising from a use of ‘+’
In the expression: 5 + 5 :: String
In an equation for ‘x’: x = 5 + 5 :: String

以上，即为本文的全部内容了。