5. Higher-Order Functions - Recognizing text - Eloquent JavaScript, 3rd Edition (Chinese translation)

Recognizing text

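We now have a characterScript function and a way to loop over characters correctly. The next step is to count the characters that belong to each script (writing system). The following counting abstraction will be useful for that: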

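function countBy(items, groupName) {
  // One {name, count} entry per group, in order of first appearance.
  let counts = [];
  for (let item of items) {
    let name = groupName(item);
    // Look for an existing entry for this group.
    let known = counts.findIndex(c => c.name == name);
    if (known == -1) {
      // First element of this group: start a new entry.
      counts.push({name, count: 1});
    } else {
      counts[known].count++;
    }
  }
  return counts;
}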
console.log(countBy([1, 2, 3, 4, 5], n => n > 2));
// → [{name: false, count: 2}, {name: true, count: 3}]

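The countBy function expects a collection (anything that we can loop over with for/of) and a function that computes a group name for a given element. It returns an array of objects, each of which names a group and gives the number of elements found in that group.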
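
As a small illustration of "anything that we can loop over with for/of", the sketch below feeds countBy a string and a Set instead of an array. countBy is repeated from above so the example runs on its own; the grouping functions and the sample inputs are made up for this illustration.

```javascript
// countBy as defined above, repeated so this sketch is self-contained.
function countBy(items, groupName) {
  let counts = [];
  for (let item of items) {
    let name = groupName(item);
    let known = counts.findIndex(c => c.name == name);
    if (known == -1) {
      counts.push({name, count: 1});
    } else {
      counts[known].count++;
    }
  }
  return counts;
}

// for/of walks a string one character (code point) at a time.
const vowels = countBy("banana", ch => {
  return "aeiou".includes(ch) ? "vowel" : "consonant";
});
console.log(vowels);
// → [{name: "consonant", count: 3}, {name: "vowel", count: 3}]

// A Set is iterable as well.
const parity = countBy(new Set([1, 2, 3, 4, 5]), n => {
  return n % 2 == 0 ? "even" : "odd";
});
console.log(parity);
// → [{name: "odd", count: 3}, {name: "even", count: 2}]
```

Groups appear in the order in which their first element is encountered, which is why "consonant" and "odd" come first here.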

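It uses another array method, findIndex. This method is somewhat like indexOf, but instead of looking for a specific value, it returns the index of the first element for which the given function returns true. Like indexOf, it returns -1 when no such element is found.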
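
To make the difference concrete, here is a minimal comparison of the two methods. The sample arrays are invented for this illustration.

```javascript
// indexOf compares each element against a specific value.
const names = ["Latin", "Han"];
const byValue = names.indexOf("Han");

// findIndex calls a test function on each element instead, which is
// what we need when the elements are objects.
const counts = [{name: "Latin", count: 4}, {name: "Han", count: 11}];
const byPredicate = counts.findIndex(c => c.name == "Han");

// Both return -1 when nothing matches.
const missingValue = names.indexOf("Cyrillic");
const missingPredicate = counts.findIndex(c => c.name == "Cyrillic");

console.log(byValue, byPredicate, missingValue, missingPredicate);
// → 1 1 -1 -1
```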

Using countBy, we can write the function that tells us which scripts are used in a piece of text.

function textScripts(text) {
  // Count characters per script; characters without a script go under "none".
  let scripts = countBy(text, char => {
    let script = characterScript(char.codePointAt(0));
    return script ? script.name : "none";
  }).filter(({name}) => name != "none");
  // Total number of characters that belong to some script.
  let total = scripts.reduce((n, {count}) => n + count, 0);
  if (total == 0) return "No scripts found";
  // Turn each entry into a string such as "61% Han".
  return scripts.map(({name, count}) => {
    return `${Math.round(count * 100 / total)}% ${name}`;
  }).join(", ");
}
console.log(textScripts('英国的狗说"woof", 俄罗斯的狗说"тяв"'));
// → 61% Han, 22% Latin, 17% Cyrillic

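The function first counts the characters by name, using characterScript to assign them a name and falling back to the string "none" for characters that aren't part of any script. The filter call then drops the entry for "none" from the resulting array, since we aren't interested in those characters.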
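
The fallback and the filter step can be tried in isolation. The sketch below is only an approximation: toyCharacterScript and TOY_SCRIPTS are hypothetical stand-ins for characterScript and its data set, covering just basic Latin letters and the main Cyrillic block, and the counting loop is written out inline rather than calling countBy.

```javascript
// Hypothetical, heavily reduced stand-in for the real script data.
// Each range is [from, to) in code points.
const TOY_SCRIPTS = [
  {name: "Latin", ranges: [[65, 91], [97, 123]]},
  {name: "Cyrillic", ranges: [[1024, 1280]]}
];

// Hypothetical stand-in for characterScript: returns a script or null.
function toyCharacterScript(code) {
  for (let script of TOY_SCRIPTS) {
    if (script.ranges.some(([from, to]) => code >= from && code < to)) {
      return script;
    }
  }
  return null;
}

// Label every character, falling back to "none".
const labels = Array.from('woof, "тяв"!', char => {
  let script = toyCharacterScript(char.codePointAt(0));
  return script ? script.name : "none";
});

// Count the labels the same way countBy does.
const allCounts = [];
for (let name of labels) {
  let entry = allCounts.find(c => c.name == name);
  if (entry) entry.count++;
  else allCounts.push({name, count: 1});
}
console.log(allCounts);
// → [{name: "Latin", count: 4}, {name: "none", count: 5}, {name: "Cyrillic", count: 3}]

// Drop the "none" entry, as textScripts does.
const scriptCounts = allCounts.filter(({name}) => name != "none");
console.log(scriptCounts);
// → [{name: "Latin", count: 4}, {name: "Cyrillic", count: 3}]
```

The five "none" characters are the comma, the space, the two double quotes and the exclamation mark.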

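To be able to compute percentages, we first need the total number of characters that belong to a script, which we can compute with reduce. If no such characters are found, the function returns a specific string. Otherwise, it transforms the counting entries into readable strings with map and then combines them with join.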
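
The reduce, map and join steps can also be followed by hand on a fixed array. The counts below are the ones that produce the output shown for the example sentence above (11 Han, 4 Latin and 3 Cyrillic characters); they are written out as a literal here so that the sketch does not depend on characterScript.

```javascript
// Counting entries as they would look after the filter step.
const scripts = [
  {name: "Han", count: 11},
  {name: "Latin", count: 4},
  {name: "Cyrillic", count: 3}
];

// reduce: add up all counts, starting from 0.
const total = scripts.reduce((n, {count}) => n + count, 0);
console.log(total);
// → 18

// map: turn each entry into a rounded percentage string.
const parts = scripts.map(({name, count}) => {
  return `${Math.round(count * 100 / total)}% ${name}`;
});
console.log(parts);
// → ["61% Han", "22% Latin", "17% Cyrillic"]

// join: combine the strings into one line.
const summary = parts.join(", ");
console.log(summary);
// → 61% Han, 22% Latin, 17% Cyrillic
```

Because each percentage is rounded separately, the parts are not guaranteed to add up to exactly 100 for other inputs.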