fasttext.wasm.js

A WebAssembly version of fastText, the library for efficient text classification and representation learning, that runs in Node.js and browser environments.

It packages a WebAssembly build of fastText (archived) with the compressed lid.176.ftz model (~900KB) and a TypeScript wrapper. The project aims to be cross-platform, zero-dependency, and usable out of the box.

Features

Written in TypeScript
Supports Node, Worker, Browser and Browser extension runtimes
Integrated language identification with normalized default results, supporting 176 languages
Significantly faster and more accurate than languagedetect and franc, and more accurate than eld and cld (see Benchmark).

Usage

In Node.js, import from the package root; this entry point gives the best performance from the JS bindings.

import { getLIDModel } from 'fasttext.wasm.js'

const lidModel = await getLIDModel()
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'
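
When many strings need classifying, the model can be loaded once and reused. The sketch below relies only on what the snippet above shows: that identify returns a promise resolving to an object with an alpha2 field. The LanguageIdentifier interface and the identifyAll helper are hypothetical names introduced for illustration and are not part of the library; allowing null for alpha2 is likewise an assumption.

```typescript
// Hypothetical structural type: it mirrors only the `identify(...).alpha2`
// usage shown above, not the library's real typings.
interface LanguageIdentifier {
  // `null` is allowed as an assumption, in case no two-letter code exists.
  identify(text: string): Promise<{ alpha2: string | null }>
}

// Identify each text in order, reusing one already-loaded model.
async function identifyAll(
  model: LanguageIdentifier,
  texts: string[],
): Promise<Array<string | null>> {
  const codes: Array<string | null> = []
  for (const text of texts) {
    const result = await model.identify(text)
    codes.push(result.alpha2)
  }
  return codes
}
```

With a model obtained and loaded as above, this would be called as identifyAll(lidModel, ['Hello, world!', 'Bonjour le monde']), provided the library's result type is compatible with the assumed interface.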

In other environments, import from the common entry point:

import { getLIDModel } from 'fasttext.wasm.js/common'

const lidModel = await getLIDModel()
// Default paths:
// {
//   wasmPath: '<globalThis.location.origin>/fastText/fastText.common.wasm',
//   modelPath: '<globalThis.location.origin>/fastText/models/lid.176.ftz',
// }
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'

Remember to download /fastText/fastText.common.wasm and /fastText/models/lid.176.ftz and place them in the public root directory. You can override the default paths if necessary.
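
The source does not show where overridden paths are passed to the library, so the following sketch only computes them. It rebuilds the two defaults listed in the comment above for a given origin and merges in any overrides. AssetPaths and resolveAssetPaths are hypothetical names; check the package's typings for the option that actually accepts these values.

```typescript
// Hypothetical shape, named after the keys in the default-paths comment above.
interface AssetPaths {
  wasmPath: string
  modelPath: string
}

// Build the documented defaults for an origin, then apply any overrides.
// `origin` is expected without a trailing slash, like `location.origin`.
function resolveAssetPaths(
  origin: string,
  overrides: Partial<AssetPaths> = {},
): AssetPaths {
  const defaults: AssetPaths = {
    wasmPath: `${origin}/fastText/fastText.common.wasm`,
    modelPath: `${origin}/fastText/models/lid.176.ftz`,
  }
  // Keys present in `overrides` replace the corresponding defaults.
  return { ...defaults, ...overrides }
}
```

For example, resolveAssetPaths('https://example.com', { modelPath: '/assets/lid.176.ftz' }) keeps the default wasmPath and replaces only modelPath.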

Benchmark

Accuracy results on the test split of the papluca/language-identification dataset, run in the Node.js runtime:

Name	Error Rate	Accuracy	Total
fastText	0.02	0.98	10000
cld	0.04	0.96	10000
eld	0.06	0.94	10000
languageDetect	0.24	0.76	10000
franc	0.27	0.73	10000
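
In each row the Error Rate and Accuracy columns sum to 1. As a minimal sketch of how such figures can be derived from predicted and expected labels, consider the function below. scorePredictions and the Score type are hypothetical and are not taken from the project's bench code.

```typescript
// Hypothetical result shape matching the table's columns.
interface Score {
  total: number
  accuracy: number
  errorRate: number
}

// Compare predicted labels to expected labels position by position.
function scorePredictions(predicted: string[], expected: string[]): Score {
  if (predicted.length !== expected.length) {
    throw new Error('predicted and expected must have the same length')
  }
  const total = expected.length
  let correct = 0
  for (let i = 0; i < total; i++) {
    if (predicted[i] === expected[i]) correct++
  }
  // Avoid dividing by zero on an empty dataset.
  const accuracy = total === 0 ? 0 : correct / total
  const errorRate = total === 0 ? 0 : 1 - accuracy
  return { total, accuracy, errorRate }
}
```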

How to?

codesandbox/fasttext.wasm.js

Run the Bench Test task for the accuracy test
Run the Bench task for the benchmark test

Clone the repo
pnpm i
pnpm run build
cd bench
pnpm run test (accuracy test)
pnpm run bench (benchmark test)

awesome-exhibition/fasttext.wasm.js - Next.js example.
yunsii/browser-extension-with-fasttext.wasm.js - Use fasttext.wasm.js in browser extension.

Credits

References

Build & Publish

Requirements

emsdk
- emcc@^3.1.53
xmake

Note: add source ./emsdk_env.sh to your shell profile to load the emsdk environment automatically. Setting export EMSDK_QUIET=1 suppresses the messages the script prints.

npm run build
npx changeset
npx changeset version
git commit
npx changeset publish
git push --follow-tags

changeset prerelease doc