docs: update text-processing-rs description with multilingual support (#365)

## Summary

- Update text-processing-rs description in README and PostProcessing
docs to reflect current capabilities
- Mention ITN + TN support across 7 languages (EN, DE, ES, FR, HI,
JA, ZH)
- Add a note on 100% NeMo test compatibility (3,011 tests passing)

## Test plan

- [x] Docs-only change, no code affected
Alex
2026-03-12 22:44:22 -04:00
committed by GitHub
parent ac4df1536e
commit d83c9e2587
2 changed files with 33 additions and 12 deletions
+32 -11
@@ -1,8 +1,12 @@
# Post-Processing ASR Output
# Text Processing
## Overview
**[text-processing-rs](https://github.com/FluidInference/text-processing-rs)** provides both Inverse Text Normalization (ITN) and Text Normalization (TN) across 7 languages (EN, DE, ES, FR, HI, JA, ZH). It passes all 3,011 NeMo tests (100% compatibility). It is a Rust port of [NVIDIA NeMo Text Processing](https://github.com/NVIDIA/NeMo-text-processing) with a Swift wrapper.
## Inverse Text Normalization (ITN)
Inverse Text Normalization converts spoken-form ASR output to written form:
ITN converts spoken-form ASR output to written form — useful for post-processing ASR transcriptions:
| Input (spoken) | Output (written) |
|----------------|------------------|
@@ -12,17 +16,23 @@ Inverse Text Normalization converts spoken-form ASR output to written form:
| "two thirty pm" | "2:30 p.m." |
| "test at gmail dot com" | "test@gmail.com" |
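To make the spoken-to-written direction concrete, here is a toy rule for the e-mail row of the table above. It is only an illustration: `toyEmailITN` is a hypothetical helper, not part of text-processing-rs or FluidAudio, and it assumes the input is a single address spoken as space-separated words.

```swift
/// Toy ITN rule for spoken e-mail addresses (illustration only).
/// Assumption: `toyEmailITN` is a made-up name and the input is one
/// address spoken as words separated by spaces.
func toyEmailITN(_ spoken: String) -> String {
    let tokens = spoken.split(separator: " ").map(String.init)
    // Only treat the phrase as an address if it contains both cue words.
    guard tokens.contains("at"), tokens.contains("dot") else { return spoken }
    // Map the cue words to symbols and glue everything together.
    return tokens.map { token -> String in
        switch token {
        case "at": return "@"
        case "dot": return "."
        default: return token
        }
    }.joined()
}

print(toyEmailITN("test at gmail dot com"))  // test@gmail.com
print(toyEmailITN("two thirty pm"))          // unchanged: no e-mail cue words
```

A real ITN system has to resolve ambiguity (for example, "at" inside an ordinary sentence), which this sketch deliberately ignores.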
## Post-Processing Tools
## Text Normalization (TN)
| Tool | Description | Language |
|------|-------------|----------|
| **[text-processing-rs](https://github.com/FluidInference/text-processing-rs)** | Inverse Text Normalization (ITN) - converts spoken-form ASR output to written form. Rust port of [NVIDIA NeMo Text Processing](https://github.com/NVIDIA/NeMo-text-processing) with Swift wrapper. | Rust, Swift |
TN converts written-form text to spoken form — useful for TTS preprocessing:
## Using ITN with FluidAudio
| Input (written) | Output (spoken) |
|-----------------|-----------------|
| "123" | "one hundred twenty three" |
| "$5.50" | "five dollars and fifty cents" |
| "January 5, 2025" | "january fifth twenty twenty five" |
| "2:30 PM" | "two thirty p m" |
| "1st" | "first" |
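The written-to-spoken direction can be sketched with a toy cardinal verbalizer that reproduces the "123" row above. This is a hand-rolled illustration limited to 0 through 999; `toyCardinalTN` and `toyTN` are hypothetical names and say nothing about how text-processing-rs is implemented.

```swift
/// Toy TN for cardinals 0...999 (illustration only; hypothetical helper).
func toyCardinalTN(_ n: Int) -> String? {
    guard (0...999).contains(n) else { return nil }
    let ones = ["zero", "one", "two", "three", "four", "five", "six", "seven",
                "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
                "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
    let tens = ["", "", "twenty", "thirty", "forty", "fifty",
                "sixty", "seventy", "eighty", "ninety"]
    if n < 20 { return ones[n] }
    if n < 100 {
        let rest = n % 10
        return rest == 0 ? tens[n / 10] : "\(tens[n / 10]) \(ones[rest])"
    }
    // Hundreds: verbalize the leading digit, then recurse on the remainder.
    let head = "\(ones[n / 100]) hundred"
    let rest = n % 100
    guard rest != 0, let tail = toyCardinalTN(rest) else { return head }
    return "\(head) \(tail)"
}

/// Verbalize a plain digit string; any other input is returned unchanged.
func toyTN(_ written: String) -> String {
    guard let n = Int(written), let words = toyCardinalTN(n) else { return written }
    return words
}

print(toyTN("123"))  // one hundred twenty three
print(toyTN("1st"))  // unchanged: ordinals are out of scope for this sketch
```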
FluidAudio includes optional support for text-processing-rs through the `TextNormalizer` class. The library uses dynamic loading, so it's completely optional - if not linked, `normalize()` returns the input unchanged.
## Using with FluidAudio
### Basic Usage
FluidAudio includes optional support for text-processing-rs through the `TextNormalizer` class. The library uses dynamic loading, so it's completely optional — if not linked, `normalize()` returns the input unchanged.
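The "optional via dynamic loading" behavior described above can be sketched with `dlopen`/`dlsym`. This is a minimal sketch of the general pattern, not FluidAudio's actual implementation: the `OptionalNormalizer` type, the symbol name passed in, and the C signature are all assumptions made for illustration.

```swift
import Foundation

/// Sketch of an optional native dependency resolved at runtime.
/// Assumption: the type, the symbol name, and the C signature are hypothetical.
struct OptionalNormalizer {
    // Hypothetical C signature: takes a C string, returns a new C string or NULL.
    private typealias NormalizeFn =
        @convention(c) (UnsafePointer<CChar>?) -> UnsafeMutablePointer<CChar>?

    private let nativeNormalize: NormalizeFn?

    init(symbolName: String) {
        // dlopen(nil, ...) returns a handle for the running program, so dlsym
        // finds the symbol only if the native library was actually linked in.
        if let handle = dlopen(nil, RTLD_NOW), let symbol = dlsym(handle, symbolName) {
            nativeNormalize = unsafeBitCast(symbol, to: NormalizeFn.self)
        } else {
            nativeNormalize = nil
        }
    }

    var isNativeAvailable: Bool { nativeNormalize != nil }

    func normalize(_ text: String) -> String {
        // Fall back to the unchanged input when the symbol is missing
        // or the native call returns NULL.
        guard let nativeNormalize = nativeNormalize,
              let output = text.withCString({ nativeNormalize($0) })
        else { return text }
        // Copy into a Swift String. Releasing `output` is left out because it
        // depends on the native library's (unknown here) ownership contract.
        return String(cString: output)
    }
}

let normalizer = OptionalNormalizer(symbolName: "hypothetical_normalize_symbol")
print(normalizer.isNativeAvailable)
print(normalizer.normalize("two hundred dollars"))
```

Resolving the symbol at runtime rather than at link time is what lets the same binary run with or without the native library; the cost is that a missing or mismatched symbol only shows up as the silent fallback.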
### ITN (Spoken to Written)
```swift
import FluidAudio
@@ -39,20 +49,31 @@ let result = normalizer.normalize("two hundred dollars")
// Returns "$200" (with native library) or "two hundred dollars" (without)
```
### TN (Written to Spoken)
```swift
// Convert written text to spoken form for TTS
let spoken = normalizer.tnNormalize("$5.50")
// Returns "five dollars and fifty cents"
let spokenDate = normalizer.tnNormalize("January 5, 2025")
// Returns "january fifth twenty twenty five"
```
### With ASR Results
```swift
// Transcribe audio
let asrResult = try await asrManager.transcribe(samples, source: .system)
// Normalize the result
// Normalize the result (ITN: spoken → written)
let normalizedResult = normalizer.normalize(result: asrResult)
print(normalizedResult.text) // Written form
```
### Linking the Native Library
To enable ITN support, link your app against `libnemo_text_processing`:
To enable text processing support, link your app against `libnemo_text_processing`:
1. Build text-processing-rs for your target platform
2. Add the library to your Xcode project's linker settings
+1 -1
@@ -123,7 +123,7 @@ Enhance ASR output with post-processing:
| Tool | Description | Language |
|------|-------------|----------|
| **[text-processing-rs](https://github.com/FluidInference/text-processing-rs)** | Inverse Text Normalization (ITN) - converts spoken-form ASR output to written form ("two hundred" → "200", "five dollars" → "$5"). Rust port of [NVIDIA NeMo Text Processing](https://github.com/NVIDIA/NeMo-text-processing) with Swift wrapper. | Rust, Swift |
| **[text-processing-rs](https://github.com/FluidInference/text-processing-rs)** | Inverse Text Normalization (ITN) and Text Normalization (TN) across 7 languages (EN, DE, ES, FR, HI, JA, ZH), with 100% NeMo test compatibility (3,011 tests). ITN converts spoken-form ASR output to written form ("two hundred" → "200", "five dollars" → "$5"); TN converts written text to spoken form for TTS. Rust port of [NVIDIA NeMo Text Processing](https://github.com/NVIDIA/NeMo-text-processing) with Swift wrapper. | Rust, Swift |
## Configuration