fasttext.wasm.js

WebAssembly version of fastText, a library for efficient text classification and representation learning, with support for Node.js and browser environments.

WebAssembly version of fastText (archived) with the compressed lid.176.ftz model (~900KB) and a TypeScript wrapper. This project focuses on being cross-platform, zero-dependency, and usable out of the box.

Features

  • Written in TypeScript
  • Supports Node, Worker, Browser, and Browser extension runtimes
  • Integrated language identification with normalized default results; supports 176 languages
  • Significantly faster and more accurate than languagedetect and franc, and superior to eld and cld

Usage

In Node.js, use the default entry point, which gives the best binding performance:

import { getLIDModel } from 'fasttext.wasm.js'
const lidModel = await getLIDModel()
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'

In other environments, use the common entry point as shown below:

import { getLIDModel } from 'fasttext.wasm.js/common'
const lidModel = await getLIDModel()
// Default paths:
// {
// wasmPath: '<globalThis.location.origin>/fastText/fastText.common.wasm',
// modelPath: '<globalThis.location.origin>/fastText/models/lid.176.ftz',
// }
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'

Remember to download /fastText/fastText.common.wasm and /fastText/models/lid.176.ftz and place them in the public root directory. You can override the default paths if necessary.
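As a minimal sketch of how the default paths and an override fit together, the helper below rebuilds the two defaults from an origin and lets either one be replaced. resolveAssetPaths and the AssetPaths interface are hypothetical names used for illustration only; they are not part of fasttext.wasm.js, and how the package accepts the resulting object is not shown in this README, so check its typings before wiring it in.

```typescript
// Hypothetical helper (not part of fasttext.wasm.js): it only builds path strings.
interface AssetPaths {
  wasmPath: string
  modelPath: string
}

function resolveAssetPaths(
  origin: string,
  overrides: Partial<AssetPaths> = {},
): AssetPaths {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const base = origin.replace(/\/+$/, '')
  return {
    // These defaults mirror the paths listed in the usage comment.
    wasmPath: overrides.wasmPath ?? `${base}/fastText/fastText.common.wasm`,
    modelPath: overrides.modelPath ?? `${base}/fastText/models/lid.176.ftz`,
  }
}

// Example: keep the default wasm location, serve the model from another host.
const paths = resolveAssetPaths('https://example.com', {
  modelPath: 'https://cdn.example.com/models/lid.176.ftz',
})
console.log(paths.wasmPath) // https://example.com/fastText/fastText.common.wasm
console.log(paths.modelPath) // https://cdn.example.com/models/lid.176.ftz
```

In a browser, globalThis.location.origin would be the natural value for the first argument, matching the defaults listed above.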

Benchmark

Accuracy results on the papluca/language-identification test split, measured in the Node.js runtime:

Name            Error Rate  Accuracy  Total
fastText        0.02        0.98      10000
cld             0.04        0.96      10000
eld             0.06        0.94      10000
languageDetect  0.24        0.76      10000
franc           0.27        0.73      10000
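In every row the error rate and accuracy sum to 1, which is consistent with scoring each sample as either right or wrong. The sketch below shows one way such figures could be computed; it assumes an exact match between the predicted and expected language codes, and the score function and Score interface are illustrative names, not the bench's actual code.

```typescript
// Illustrative scoring sketch; the repository's bench code may differ.
interface Score {
  errorRate: number
  accuracy: number
  total: number
}

function score(predicted: string[], expected: string[]): Score {
  if (predicted.length !== expected.length) {
    throw new Error('predicted and expected must have the same length')
  }
  const total = expected.length
  let wrong = 0
  for (let i = 0; i < total; i++) {
    // Assumption: a prediction counts as correct only on an exact code match.
    if (predicted[i] !== expected[i]) wrong++
  }
  const errorRate = total === 0 ? 0 : wrong / total
  return { errorRate, accuracy: 1 - errorRate, total }
}

// Four samples, one misclassified ('en' predicted for an 'es' sample).
const s = score(['en', 'fr', 'de', 'en'], ['en', 'fr', 'de', 'es'])
console.log(s.errorRate, s.accuracy, s.total) // 0.25 0.75 4
```

Read this way, an error rate of 0.02 over 10000 samples corresponds to roughly 200 misclassified samples.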

How to run the benchmark

codesandbox/fasttext.wasm.js

  • Run the Bench Test task for the accuracy test
  • Run the Bench task for the benchmark test

or

  • Clone the repo
  • pnpm i
  • pnpm run build
  • cd bench
  • pnpm run test for accuracy test
  • pnpm run bench for benchmark test

Credits

References

Build & Publish

Requirements

Note: add source ./emsdk_env.sh to your shell profile so the emsdk environment loads automatically. Setting export EMSDK_QUIET=1 suppresses the messages it prints.

  • npm run build
  • npx changeset
  • npx changeset version
  • git commit
  • npx changeset publish
  • git push --follow-tags

changeset prerelease doc

License

MIT License © 2023 Yuns