WebAssembly version of fastText with Node.js and browser support: a library for efficient text classification and representation learning.
WebAssembly build of fastText (archived) with the compressed lid.176.ftz model (~900 KB) and a TypeScript wrapper. The project focuses on being cross-platform, zero-dependency, and working out of the box.
In Node.js, use this approach for the best performance:
```ts
import { getLIDModel } from 'fasttext.wasm.js'

const lidModel = await getLIDModel()
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'
```
In other environments, use it like this:
```ts
import { getLIDModel } from 'fasttext.wasm.js/common'

const lidModel = await getLIDModel()
// Default paths:
// {
//   wasmPath: '<globalThis.location.origin>/fastText/fastText.common.wasm',
//   modelPath: '<globalThis.location.origin>/fastText/models/lid.176.ftz',
// }
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'
```
Do not forget to download /fastText/fastText.common.wasm and /fastText/models/lid.176.ftz and place them in your public root directory. You can override the default paths if necessary.
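For example, if the assets are served from a different directory, the defaults can be overridden when creating the model. The sketch below assumes getLIDModel accepts an options object with wasmPath and modelPath (the option names mirror the defaults shown above; check the package's type definitions for the exact shape), and the '/assets/...' locations are purely illustrative:

```ts
import { getLIDModel } from 'fasttext.wasm.js/common'

// Assumption: getLIDModel accepts { wasmPath, modelPath } overrides.
// The '/assets/...' paths below are examples, not required locations.
const lidModel = await getLIDModel({
  wasmPath: '/assets/fastText/fastText.common.wasm',
  modelPath: '/assets/fastText/models/lid.176.ftz',
})
await lidModel.load()
```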
Accuracy test results on the papluca/language-identification test split, run in the Node.js runtime:
| Library | Error Rate | Accuracy | Samples |
|---|---|---|---|
| fastText | 0.02 | 0.98 | 10000 |
| cld | 0.04 | 0.96 | 10000 |
| eld | 0.06 | 0.94 | 10000 |
| languageDetect | 0.24 | 0.76 | 10000 |
| franc | 0.27 | 0.73 | 10000 |
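For context, an accuracy test of this kind amounts to comparing the predicted language code with the dataset label for each row. The following is a minimal sketch, not the actual bench harness; the samples array and its { text, label } shape are hypothetical stand-ins for the dataset rows:

```ts
import { getLIDModel } from 'fasttext.wasm.js'

// Hypothetical test data: one { text, label } pair per dataset row,
// where label is the expected two-letter language code.
const samples: { text: string; label: string }[] = [
  { text: 'Hello, world!', label: 'en' },
]

const lidModel = await getLIDModel()
await lidModel.load()

let errors = 0
for (const { text, label } of samples) {
  const result = await lidModel.identify(text)
  if (result.alpha2 !== label) errors++
}
console.log('error rate:', errors / samples.length)
```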
Bench: run the `test` task for the accuracy test and the `bench` task for the benchmark test, or:

```sh
pnpm i
pnpm run build
cd bench
pnpm run test   # accuracy test
pnpm run bench  # benchmark test
```

Note: add `source ./emsdk_env.sh` to your shell profile to automatically load the emsdk environment; `export EMSDK_QUIET=1` can be used to suppress its messages.
To publish a new release:

```sh
npm run build
npx changeset
npx changeset version
git commit
npx changeset publish
git push --follow-tags
```