Commit Graph

24 Commits

Author SHA1 Message Date
Alex 13990a0fcb docs: update README with current versions and product names (#348)
## Summary

- Swift badge 5.9+ → 6.0+ (matches `swift-tools-version: 6.0`)
- SPM dependency `0.7.9` → `0.12.2`
- CocoaPods `0.7.8` → `0.12.2`
- `FluidAudioTTS` → `FluidAudioEspeak` (renamed in #302)
- Streaming ASR: "Coming soon" → describes `StreamingAsrManager`
- Citation: version 0.7.0/2024 → 0.12.2/2025

## Test plan

- [x] No code changes, documentation only
2026-03-06 09:43:19 -05:00
Evan Rosenfeld bd0467fdc4 fix: MachTaskSelfWrapper module import and remove unused preprocessorFile from requiredModels (#248)
This patch fixes two issues in FluidAudio 0.9.1:

1. MachTaskSelfWrapper module import error
Swift code cannot `import MachTaskSelfWrapper` because the C library lacks a
module.modulemap file. This adds the modulemap and configures it in the
podspec.

2. preprocessorFile in ParakeetEOU.requiredModels
The preprocessor model is no longer used (replaced by native Swift
NeMoMelSpectrogram in StreamingEouAsrManager), but it's still listed in
requiredModels. This causes model download validation to fail since the file
doesn't exist in the HuggingFace repo.
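
The second issue can be illustrated with a minimal sketch of a required-files check. The helper `missingModels`, the file names, and the repository listing are all hypothetical and are not FluidAudio's actual API; the sketch only shows why a stale entry makes validation fail.

```swift
import Foundation

// Hypothetical names throughout: `missingModels`, the file names, and the
// repository listing are illustrative, not FluidAudio's actual API.

/// Returns the required model files that are absent from the repository listing.
func missingModels(required: Set<String>, available: Set<String>) -> [String] {
    required.subtracting(available).sorted()
}

let available: Set<String> = ["encoder.mlmodelc", "decoder.mlmodelc"]

// Stale list: still names a preprocessor file that is no longer published.
let staleRequired: Set<String> = ["preprocessor.mlmodelc", "encoder.mlmodelc", "decoder.mlmodelc"]
// Fixed list: the unused entry is removed.
let fixedRequired: Set<String> = ["encoder.mlmodelc", "decoder.mlmodelc"]

print(missingModels(required: staleRequired, available: available))  // ["preprocessor.mlmodelc"]
print(missingModels(required: fixedRequired, available: available))  // []
```

Any non-empty result would cause a strict validator to reject the download, which matches the failure described above.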
2026-01-04 15:59:35 -05:00
Alex 73fb84aa9d feat: Migrate to Swift 6 with strict concurrency (#233)
- Update Package.swift to swift-tools-version: 6.0
- Add @preconcurrency import CoreML/AVFoundation throughout codebase
- Make structs Sendable (AppLogger, DownloadConfig, etc.)
- Use nonisolated(unsafe) for static mutable state
- Fix AudioStream by removing @unchecked Sendable, making AsyncCallback @Sendable
- Fix Task closures with proper explicit captures
- Convert concurrent tests to sequential where types aren't Sendable
- Add @MainActor to test methods using waitForExpectations
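
Several of the techniques listed above can be sketched together. This is a minimal illustration, assuming Swift 5.10 or later; `DownloadSettings`, `ChunkCallback`, `Registry`, and `process` are hypothetical names and do not come from the FluidAudio codebase.

```swift
import Foundation

// All type and function names below are hypothetical illustrations.

// A struct whose stored properties are all Sendable can adopt Sendable directly.
struct DownloadSettings: Sendable {
    let retryCount: Int
    let timeout: TimeInterval
}

// A callback that may cross concurrency domains is marked @Sendable.
typealias ChunkCallback = @Sendable ([Float]) async -> Void

enum Registry {
    // nonisolated(unsafe) opts this static out of compiler checking, so the
    // author becomes responsible for synchronizing access (here, with a lock).
    nonisolated(unsafe) static var counter = 0
    static let lock = NSLock()

    static func increment() -> Int {
        lock.lock()
        defer { lock.unlock() }
        counter += 1
        return counter
    }
}

func process(settings: DownloadSettings, callback: @escaping ChunkCallback) async {
    // Explicit capture list: only Sendable values are captured by the task.
    let task = Task { [settings, callback] in
        await callback([Float](repeating: 0, count: settings.retryCount))
    }
    await task.value
}
```

Note that `nonisolated(unsafe)` silences the diagnostic without adding any safety; the lock in `Registry` is what actually protects the counter.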

🤖 Generated with [Claude Code](https://claude.com/claude-code)

### Why is this change needed?

Resolves #231
2025-12-31 14:57:14 -05:00
Alex d540f0092d Fix: Move ESpeakNG.xcframework to top-level Frameworks directory (#205)
This PR moves `ESpeakNG.xcframework` from
`Sources/FluidAudio/Frameworks` to a top-level `Frameworks` directory.
This resolves a runtime error `Library not loaded:
@rpath/ESpeakNG.framework/ESpeakNG` experienced by users when
integrating FluidAudio, likely due to Xcode/SPM not correctly embedding
the framework when it's nested deep within the source structure.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-29 02:59:58 -05:00
Alex f5f30a4940 optionalize TTS via FluidAudioTTS target (#186)
- Make TTS optional to avoid GPL by default; enable with
`FLUIDAUDIO_ENABLE_TTS=1`.
- Split TTS into FluidAudioTTS target; CLI imports it only when enabled.
- Expose MLModel.compatPrediction as public for cross‑module use.
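
An environment-variable gate of this kind could look roughly like the sketch below. The helper `enabledTargets` and the way targets are listed are assumptions made for illustration; only the variable name and the `FluidAudioTTS` target name come from the commit message, and the real manifest may be structured differently.

```swift
import Foundation

// Hypothetical sketch: `enabledTargets` is an illustrative helper, not part
// of the actual Package.swift. The base target name is assumed.
func enabledTargets(environment: [String: String]) -> [String] {
    var targets = ["FluidAudio"]
    // The optional TTS target is included only when the flag is exactly "1".
    if environment["FLUIDAUDIO_ENABLE_TTS"] == "1" {
        targets.append("FluidAudioTTS")
    }
    return targets
}

print(enabledTargets(environment: ProcessInfo.processInfo.environment))
```

With the flag unset, the GPL-licensed component is never part of the build graph, which is the point of making it opt-in.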

---------

Co-authored-by: Brandon Weng <18161326+BrandonWeng@users.noreply.github.com>
2025-11-24 00:55:57 -05:00
Alex-Wengg 0c77595ea5 Revert "chore: sync release version 0.0.0-test"
This reverts commit d0f43fe3bb.
2025-11-15 14:29:37 -05:00
eddy d0f43fe3bb chore: sync release version 0.0.0-test 2025-11-15 19:10:53 +00:00
Brandon Weng 8ef20fdca6 0.7.8 2025-11-04 09:56:02 -05:00
Brandon Weng c3906db456 0.7.7 update 2025-10-30 13:58:21 -04:00
Brandon Weng fa95133be2 0.7.6 release 2025-10-28 12:19:07 -04:00
Brandon Weng dff1b5d158 Bump version from 0.7.4 to 0.7.5 2025-10-28 10:49:53 -04:00
Brandon Weng 7fd5ac5446 pyannote community-1 model for offline speaker diarization pipeline (#150)
### Why is this change needed?

Keeping the streaming pipeline around, since VBx and AHC clustering get
pretty expensive after 30 minutes of audio and running them constantly is
costly. It's still possible to support clustering between files, but that
is saved for another PR.

Pyannote's benchmark is around 11%. I increased the step to 0.2s from
0.1s to double the speed. Selective fp16 lets more operations run on the
ANE, but it also means we lose some precision.

```
Average DER: 14.95% | Median DER: 10.89% | Average JER: 39.27% | Median JER: 40.74% (collar=0.25s, ignoreOverlap=True)
Average RTFx: 139.63 (from 232 clips)
Metrics summary saved to: /Users/brandonweng/FluidAudioDatasets/voxconverse/metrics/test_metrics_release.json
Completed. New results: 232, Skipped existing: 0, Total attempted: 232
```
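
The step-size trade-off mentioned above comes down to simple arithmetic, sketched here. The 10-second window length is an assumption (the commit message gives only the step sizes), and `windowCount` and `rtfx` are hypothetical helpers. RTFx is computed as audio duration divided by processing time.

```swift
// Assumption: a 10-second analysis window; only the step sizes are stated
// in the commit message. Integer milliseconds avoid floating-point rounding.
func windowCount(durationMs: Int, windowMs: Int, stepMs: Int) -> Int {
    guard durationMs >= windowMs, stepMs > 0 else { return 0 }
    return (durationMs - windowMs) / stepMs + 1
}

/// Inverse real-time factor: seconds of audio processed per second of compute.
func rtfx(audioSeconds: Double, processingSeconds: Double) -> Double {
    audioSeconds / processingSeconds
}

let thirtyMinutesMs = 30 * 60 * 1000
let fineSteps = windowCount(durationMs: thirtyMinutesMs, windowMs: 10_000, stepMs: 100)
let coarseSteps = windowCount(durationMs: thirtyMinutesMs, windowMs: 10_000, stepMs: 200)
print(fineSteps, coarseSteps)  // 17901 8951
```

Doubling the step roughly halves the number of windows to run through the model, which is where the speed-up comes from.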

See benchmark.md for more info. Compared to the PyTorch model, we are
100x faster than the CPU version and ~6x faster than the MPS
backend on mb pro 4.

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Brandon Weng <BrandonWeng@users.noreply.github.com>
Co-authored-by: Alex <36247722+Alex-Wengg@users.noreply.github.com>
Co-authored-by: Alex-Wengg <hanweng9@gmail.com>
2025-10-22 15:11:57 -04:00
Brandon Weng 977f99611c 0.7.4 release 2025-10-20 19:22:29 -04:00
Brandon Weng 9e8e7457f2 Bump to 0.7.2 2025-10-19 16:54:32 -04:00
Brandon Weng a8f3bc7a3b Update ESpeak and hard fail if missing (#148)
### Why is this change needed?

Rebuilt ESpeak to support macOS 14+ and iOS 17+, using our forked version
here:

https://github.com/FluidInference/espeak-ng/releases/tag/xcframework%2F1.52.0

- This makes it easier to maintain. Also removed x86 (Intel chips) from
the supported framework, since we don't need it.
- Removed the fallbacks in the path; bundling failures now hard fail.
2025-10-19 01:56:45 -04:00
Brandon Weng 3a7402877e Bump to v0.7.0 2025-10-16 22:47:28 -04:00
Alex 93bd9cf49a Kokoro Text-to-Speech (#112) 2025-10-06 17:53:30 -04:00
Brandon Weng 43de42a9a7 Bump docs to 0.6.1 2025-09-28 14:22:14 -04:00
Brandon Weng 14d90a9d87 Bump cocoapod to 0.6.0 2025-09-25 20:17:21 -04:00
Brandon Weng 038c9229ac 0.5.2 for cocoapods 2025-09-20 23:21:29 -05:00
Brandon Weng be8e1d6c60 v0.5.1 bump (#109)
2025-09-16 00:53:38 -04:00
Brandon Weng b22a6ee31f v0.5.0 bump 2025-09-14 15:29:39 -04:00
Brandon Weng e97e4e15ef Migrate ANE optimizer away from singleton (#107)
### Why is this change needed?

This prepares for the Swift 6 migration, where singletons are generally
not recommended.
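
As a rough sketch of what moving away from a singleton can look like, the example below replaces a shared global with an injected value. `BufferOptimizer` and `Pipeline` are hypothetical types invented for illustration; they are not the actual ANE optimizer from this commit.

```swift
// Hypothetical types: `BufferOptimizer` and `Pipeline` are illustrative only.

// Before (shown as a comment): a shared singleton that every caller reaches
// through a global, which Swift 6 treats as shared state to be checked.
// final class BufferOptimizer { static let shared = BufferOptimizer() }

// After: a Sendable value created by the caller and passed in explicitly.
struct BufferOptimizer: Sendable {
    let alignment: Int

    /// Rounds `count` up to the next multiple of `alignment`.
    func alignedCount(_ count: Int) -> Int {
        ((count + alignment - 1) / alignment) * alignment
    }
}

struct Pipeline: Sendable {
    let optimizer: BufferOptimizer  // injected dependency, no global lookup

    func bufferSize(for samples: Int) -> Int {
        optimizer.alignedCount(samples)
    }
}

let pipeline = Pipeline(optimizer: BufferOptimizer(alignment: 64))
print(pipeline.bufferSize(for: 100))  // 128
```

Passing the dependency in makes ownership explicit and lets tests construct isolated instances.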
2025-09-14 12:13:46 -04:00
Brandon Weng 261276ea94 Minimal Pod Spec (#102)
### Why is this change needed?

We don't have any external deps, so it's not hard to support a podspec. See
the issue for more details:

https://github.com/FluidInference/FluidAudio/issues/100
2025-09-11 20:41:25 +00:00