feat(tts/kokoro-ane/zh): include g2pw.mlmodelc in requiredModelsZh

Wire the g2pW CoreML bundle into the bulk `ensureModels(.mandarin)`
grab so the polyphone-disambiguation path is on by default for any
fresh Mandarin checkout. Without this entry, the model was fetchable
only via the lazy fallback, which this PR also documents as
nil-on-failure, so users fell back to the dict-only path even when
their network worked fine.
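
A minimal sketch of why a nil-on-failure lazy loader degrades silently. Every name here (`G2pwModel`, `loadG2pwLazily`, `pronounce`) is a hypothetical stand-in, not the repository's API; the point is only that a `nil` model routes callers to the dictionary path without surfacing an error.

```swift
// Hypothetical stand-in for the CoreML-backed disambiguator.
struct G2pwModel {
    func pronounce(_ char: Character) -> String { "model:\(char)" }
}

/// Lazy loader that swallows any error and returns nil ("nil-on-failure").
func loadG2pwLazily(fetch: () throws -> G2pwModel) -> G2pwModel? {
    try? fetch()
}

/// Uses the model when present, otherwise falls back to the dict-only path.
func pronounce(_ char: Character, model: G2pwModel?) -> String {
    guard let model = model else { return "dict:\(char)" }
    return model.pronounce(char)
}

struct FetchError: Error {}

// A failed lazy fetch yields nil, so the caller lands on the dict path.
let lazyModel = loadG2pwLazily { throw FetchError() }
print(pronounce("行", model: lazyModel))
```

Under these assumptions, including the bundle in the bulk grab means a failed fetch is handled by the bulk download path instead of being silently reduced to `nil` by the lazy helper.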

The two auxiliary text files (`vocab.txt`, `POLYPHONIC_CHARS.txt`)
stay on the lazy `ensureMandarinG2pw` helper:
`DownloadUtils.downloadRepo`'s subPath matcher does not whitelist
`.txt` (only `.json`/`.model`/`.bin`), so adding them to
`requiredModelsZh` would trigger an infinite re-download loop on
each startup. The manual fetch already handles them correctly.
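
The whitelist problem can be sketched as follows. This is an illustration under stated assumptions, not the actual `DownloadUtils.downloadRepo` logic: the helper names, the file names inside the `.mlmodelc` bundle, and the location of the `.txt` files under `g2pw/` are all hypothetical. Only the three whitelisted extensions come from the message above.

```swift
// Extensions the bulk matcher accepts, per the commit message.
let whitelistedExtensions: Set<String> = ["json", "model", "bin"]

/// Extension of the last path component, or "" if it has none.
func fileExtension(of path: String) -> String {
    guard let name = path.split(separator: "/").last,
          let dot = name.lastIndex(of: ".") else { return "" }
    return String(name[name.index(after: dot)...])
}

/// Hypothetical bulk download: keeps only files with a whitelisted extension.
func bulkDownload(_ repoFiles: [String]) -> Set<String> {
    Set(repoFiles.filter { whitelistedExtensions.contains(fileExtension(of: $0)) })
}

/// Required entries with no matching file (or file under that directory) on disk.
func missing(required: Set<String>, onDisk: Set<String>) -> Set<String> {
    required.filter { entry in
        !onDisk.contains { $0 == entry || $0.hasPrefix(entry + "/") }
    }
}

// Assumed repo listing; bundle contents and .txt locations are illustrative.
let repoFiles = [
    "g2pw/g2pw.mlmodelc/coremldata.bin",
    "g2pw/g2pw.mlmodelc/metadata.json",
    "g2pw/vocab.txt",
    "g2pw/POLYPHONIC_CHARS.txt",
]
let onDisk = bulkDownload(repoFiles)

// The bundle is satisfied by its whitelisted inner files...
print(missing(required: ["g2pw/g2pw.mlmodelc"], onDisk: onDisk))
// ...but a required .txt entry stays missing after every bulk download.
print(missing(required: ["g2pw/g2pw.mlmodelc", "g2pw/vocab.txt"], onDisk: onDisk))
```

In this model a required `.txt` entry is never fetched by the bulk path, so the "is everything present?" check fails again on the next startup, which is the behavior the message describes.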
Alex-Wengg
2026-05-04 00:31:06 -04:00
parent f909bfca4b
commit 770af29a65
+15 -2
@@ -1006,6 +1006,17 @@ public enum ModelNames {
 /// `<repoDir>/voices/zf_001.bin`.
 public static let defaultVoiceFileZh = "voices/zf_001.bin"
+/// Mandarin g2pW polyphone-disambiguator CoreML bundle. Lives under
+/// `<repoDir>/g2pw/` and is included in `requiredModelsZh` so the bulk
+/// `ensureModels(.mandarin)` grab pulls it without an extra round
+/// trip. The two auxiliary text files (`vocab.txt`,
+/// `POLYPHONIC_CHARS.txt`) ship via the lazy
+/// `KokoroAneResourceDownloader.ensureMandarinG2pw` helper because
+/// `DownloadUtils.downloadRepo` does not whitelist `.txt` for
+/// subPath repos and a manual fetch keeps the bulk-grab matcher
+/// idempotent.
+public static let g2pwModelZh = "g2pw/g2pw.mlmodelc"
 /// All seven .mlmodelc bundles.
 public static let requiredCoreMLModels: Set<String> = [
 albert, postAlbert, alignment, prosody, noise, vocoder, tail,
@@ -1017,9 +1028,11 @@ public enum ModelNames {
 }
 /// CoreML bundles + the vocab JSON + the Mandarin default voice .bin
-/// (which lives under `voices/`).
+/// (under `voices/`) + the g2pW CoreML bundle (under `g2pw/`).
 public static var requiredModelsZh: Set<String> {
-requiredCoreMLModels.union([vocab, defaultVoiceFileZh])
+requiredCoreMLModels.union([
+vocab, defaultVoiceFileZh, g2pwModelZh,
+])
 }
 }
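
A regression check for the resulting set could look like the sketch below. `ModelNamesSketch` is a trimmed stand-in for the real `ModelNames` enum: the seven bundle names and the vocab file name are placeholders (assumptions), and only the voice and g2pW paths are taken from the diff.

```swift
// Sketch only: a trimmed stand-in for the real `ModelNames` enum.
enum ModelNamesSketch {
    // Placeholder bundle names (assumptions, not the repository's values).
    static let coreMLBundles: Set<String> = [
        "albert.mlmodelc", "post_albert.mlmodelc", "alignment.mlmodelc",
        "prosody.mlmodelc", "noise.mlmodelc", "vocoder.mlmodelc", "tail.mlmodelc",
    ]
    static let vocab = "vocab.json"                      // placeholder name
    static let defaultVoiceFileZh = "voices/zf_001.bin"  // from the diff
    static let g2pwModelZh = "g2pw/g2pw.mlmodelc"        // from the diff

    // Mirrors the new union: bundles + vocab + default voice + g2pW bundle.
    static var requiredModelsZh: Set<String> {
        coreMLBundles.union([vocab, defaultVoiceFileZh, g2pwModelZh])
    }
}

let required = ModelNamesSketch.requiredModelsZh

// The g2pW bundle is part of the bulk set.
print(required.contains(ModelNamesSketch.g2pwModelZh))
// No .txt entries leak into the bulk set, matching the commit message.
print(required.filter { $0.hasSuffix(".txt") }.isEmpty)
```

Guarding against `.txt` entries in the set is one way to catch an accidental addition of the auxiliary files, which the commit message says the bulk matcher cannot fetch.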