FluidAudio

mirror of https://github.com/FluidInference/FluidAudio.git synced 2026-05-12 20:20:36 +00:00

Author	SHA1	Message	Date
Alex	13990a0fcb	docs: update README with current versions and product names (#348) ## Summary - Swift badge 5.9+ → 6.0+ (matches `swift-tools-version: 6.0`) - SPM dependency `0.7.9` → `0.12.2` - CocoaPods `0.7.8` → `0.12.2` - `FluidAudioTTS` → `FluidAudioEspeak` (renamed in #302) - Streaming ASR: "Coming soon" → describes `StreamingAsrManager` - Citation: version 0.7.0/2024 → 0.12.2/2025 ## Test plan - [x] No code changes, documentation only	2026-03-06 09:43:19 -05:00
Evan Rosenfeld	bd0467fdc4	fix: MachTaskSelfWrapper module import and remove unused preprocessorFile from requiredModels (#248 ) Fix MachTaskSelfWrapper module import and remove unused preprocessorFile from requiredModels This patch fixes two issues in FluidAudio 0.9.1: 1. MachTaskSelfWrapper module import error Swift code cannot `import MachTaskSelfWrapper` because the C library lacks a module.modulemap file. This adds the modulemap and configures it in the podspec. 2. preprocessorFile in ParakeetEOU.requiredModels The preprocessor model is no longer used (replaced by native Swift NeMoMelSpectrogram in StreamingEouAsrManager), but it's still listed in requiredModels. This causes model download validation to fail since the file doesn't exist in the HuggingFace repo.	2026-01-04 15:59:35 -05:00
Alex	73fb84aa9d	feat: Migrate to Swift 6 with strict concurrency (#233) - Update Package.swift to swift-tools-version: 6.0 - Add @preconcurrency import CoreML/AVFoundation throughout codebase - Make structs Sendable (AppLogger, DownloadConfig, etc.) - Use nonisolated(unsafe) for static mutable state - Fix AudioStream by removing @unchecked Sendable, making AsyncCallback @Sendable - Fix Task closures with proper explicit captures - Convert concurrent tests to sequential where types aren't Sendable - Add @MainActor to test methods using waitForExpectations 🤖 Generated with [Claude Code](https://claude.com/claude-code) ### Why is this change needed? resolve #231	2025-12-31 14:57:14 -05:00
Alex	d540f0092d	Fix: Move ESpeakNG.xcframework to top-level Frameworks directory (#205 ) This PR moves `ESpeakNG.xcframework` from `Sources/FluidAudio/Frameworks` to a top-level `Frameworks` directory. This resolves a runtime error `Library not loaded: @rpath/ESpeakNG.framework/ESpeakNG` experienced by users when integrating FluidAudio, likely due to Xcode/SPM not correctly embedding the framework when it's nested deep within the source structure. --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-11-29 02:59:58 -05:00
Alex	f5f30a4940	optionalize TTS via FluidAudioTTS target (#186 ) - Make TTS optional to avoid GPL by default; enable with `FLUIDAUDIO_ENABLE_TTS=1.` - Split TTS into FluidAudioTTS target; CLI imports it only when enabled. - Expose MLModel.compatPrediction as public for cross‑module use. --------- Co-authored-by: Brandon Weng <18161326+BrandonWeng@users.noreply.github.com>	2025-11-24 00:55:57 -05:00
Alex-Wengg	0c77595ea5	Revert "chore: sync release version 0.0.0-test" This reverts commit `d0f43fe3bb`.	2025-11-15 14:29:37 -05:00
eddy	d0f43fe3bb	chore: sync release version 0.0.0-test	2025-11-15 19:10:53 +00:00
Brandon Weng	8ef20fdca6	0.7.8	2025-11-04 09:56:02 -05:00
Brandon Weng	c3906db456	0.7.7 update	2025-10-30 13:58:21 -04:00
Brandon Weng	fa95133be2	0.7.6 release	2025-10-28 12:19:07 -04:00
Brandon Weng	dff1b5d158	Bump version from 0.7.4 to 0.7.5	2025-10-28 10:49:53 -04:00
Brandon Weng	7fd5ac5446	pyannote community-1 model for offline speaker diarization pipeline (#150) ### Why is this change needed? Keeping the streaming one around as the VBx and AHC clustering gets pretty expensive after 30mins of audio and running it constantly gets expensive. Its still possible to support clustering between files but will save that for another PR. Pyannote's Bench mark is around 11% - i increased steps to 0.2s instead of 0.1 to double the speed but also selective fp16 results in more operations to run on ANE but also means that we lose some precision. ``` Average DER: 14.95% \| Median DER: 10.89% \| Average JER: 39.27% \| Median JER: 40.74% (collar=0.25s, ignoreOverlap=True) Average RTFx: 139.63 (from 232 clips) Metrics summary saved to: /Users/brandonweng/FluidAudioDatasets/voxconverse/metrics/test_metrics_release.json Completed. New results: 232, Skipped existing: 0, Total attempted: 232 ``` See benchmark.md for more info but compared to Pytorch model, we are 100x faster than the CPU version and ~6x faster compared to the mps backend on mb pro 4 --------- Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com> Co-authored-by: Brandon Weng <BrandonWeng@users.noreply.github.com> Co-authored-by: Alex <36247722+Alex-Wengg@users.noreply.github.com> Co-authored-by: Alex-Wengg <hanweng9@gmail.com>	2025-10-22 15:11:57 -04:00
Brandon Weng	977f99611c	0.7.4 release	2025-10-20 19:22:29 -04:00
Brandon Weng	9e8e7457f2	Bump to 0.7.2	2025-10-19 16:54:32 -04:00
Brandon Weng	a8f3bc7a3b	Update ESpeak and hard fail if missing (#148) ### Why is this change needed? Rebuilt ESpeak to support macos 14 + iOS17+, using our forked version here https://github.com/FluidInference/espeak-ng/releases/tag/xcframework%2F1.52.0 - this makes it easier to maintain. Also removed x86 (intel chips) from the support framework. we dont need it - Removed the fallbacks in the path and hard fail when the bundling fails	2025-10-19 01:56:45 -04:00
Brandon Weng	3a7402877e	BUumnp to v0.7.0	2025-10-16 22:47:28 -04:00
Alex	93bd9cf49a	Kokoro Text-to-Speech (#112 )	2025-10-06 17:53:30 -04:00
Brandon Weng	43de42a9a7	Bump docs to 0.6.1	2025-09-28 14:22:14 -04:00
Brandon Weng	14d90a9d87	Bump cocoapod to 0.6.0	2025-09-25 20:17:21 -04:00
Brandon Weng	038c9229ac	0.5.2 for cocoapods	2025-09-20 23:21:29 -05:00
Brandon Weng	be8e1d6c60	v0.5.1 bump (#109)	2025-09-16 00:53:38 -04:00
Brandon Weng	b22a6ee31f	v0.5.0 bump	2025-09-14 15:29:39 -04:00
Brandon Weng	e97e4e15ef	Migrate ANE optimizer away from singleton (#107) ### Why is this change needed? Preparing for Swift 6 migration, singletons are generally not recommended	2025-09-14 12:13:46 -04:00
Brandon Weng	261276ea94	Minimal Pod Spec (#102) ### Why is this change needed? We don't have any external deps so its not hard to support podspec, see the issue for more details. https://github.com/FluidInference/FluidAudio/issues/100	2025-09-11 20:41:25 +00:00

24 Commits