Compare commits
44 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | e00c1448cb | | |
| | f04ad3bfeb | | |
| | eb11f90cab | | |
| | 2c612ee669 | | |
| | dfe92f9d0e | | |
| | c59e7b42ab | | |
| | 948705a87b | | |
| | 6b0b5c1799 | | |
| | 9553691e88 | | |
| | 997456631e | | |
| | 104165198b | | |
| | f405b827f4 | | |
| | 089568cf9f | | |
| | a92d88ff8f | | |
| | c60030b48f | | |
| | b400d651f6 | | |
| | 2dfa9aa782 | | |
| | bd2c065415 | | |
| | 1b95144b05 | | |
| | 3f60936471 | | |
| | 9a3c2f8a3f | | |
| | 1fb75f24d7 | | |
| | 6c0187e606 | | |
| | 16262ec896 | | |
| | 9d1303ed21 | | |
| | a12e537110 | | |
| | 63fb3cf046 | | |
| | f5e6666931 | | |
| | 44a9c84bc7 | | |
| | 9a076936c5 | | |
| | 102b29ecf6 | | |
| | 51455bfd97 | | |
| | 5bf6086164 | | |
| | 08ebaf0914 | | |
| | 4b3f26c12f | | |
| | 707c31e4d3 | | |
| | 509d47ea0f | | |
| | 1d2b08df6e | | |
| | 0608443482 | | |
| | 9a1a41385c | | |
| | 2d8095f0e1 | | |
| | 8f3c1c5d61 | | |
| | deb742d768 | | |
| | fa24588ea4 | | |
+3
-1
@@ -172,4 +172,6 @@
|
||||
!rhasspy/train/*.py
|
||||
!rhasspy/train/jsgf2fst/*.py
|
||||
!*.py
|
||||
!VERSION
|
||||
!VERSION
|
||||
|
||||
!pip
|
||||
|
||||
@@ -0,0 +1,77 @@
|
||||
## [2.4.18] - 2020 Feb 07
|
||||
|
||||
### Added
|
||||
|
||||
- /api/listen-for-wake accepts "on" and "off" as POST data to enable/disable wake word
|
||||
- /api/events/wake websocket endpoint reports wake up events
|
||||
- /api/events/text websocket endpoint reports transcription events
|
||||
- Rhasspy logo changes in web UI when wake word is detected
|
||||
- espeak arguments list for text to speech
|
||||
|
||||
### Changed
|
||||
|
||||
- STT output casing is fixed outside of HTTP API calls
|
||||
- All voice commands show up in web UI test page
|
||||
- Play last voice command button in web UI works for any command
|
||||
- Fixed commas in numbers with thousand separators
|
||||
- Words from Pocketsphinx wake keyphrase are added to dictionary
|
||||
- Pocketsphinx wake word keyphrase casing is fixed
|
||||
|
||||
## [2.4.17] - 2020 Jan 21
|
||||
|
||||
### Added
|
||||
|
||||
- Button to web UI to play last recorded voice command
|
||||
- RHASSPY_LOG_LEVEL environment variable
|
||||
- Web UI feedback during download
|
||||
- Add "asoundrc" config option to Hass.IO add-on
|
||||
|
||||
### Changed
|
||||
|
||||
- Moved $profile/kaldi/custom_words.txt to $profile/kaldi_custom_words.txt
|
||||
- Slot substitution casing is kept during training/recognition
|
||||
- Fixed fuzzywuzzy and other intent recognizer training after addition of converters
|
||||
- Fix thread max count issue
|
||||
- Hide web UI alerts after 10 seconds
|
||||
- Delete partially downloaded profile files
|
||||
- Force slot programs to run each training cycle
|
||||
- Fix _raw_text in Hass event being same as _text
|
||||
|
||||
### Removed
|
||||
|
||||
- Flair intent recognizer
|
||||
|
||||
## [2.4.16] - 2020 Jan 5
|
||||
|
||||
### Added
|
||||
|
||||
- Number ranges (0..100)
|
||||
- Converters for transforming JSON values in intents (!int)
|
||||
- Slot programs for generating slot values
|
||||
- $rhasspy/days and $rhasspy/months built-in slots
|
||||
|
||||
## [2.4.15] - 2019 Dec 27
|
||||
|
||||
### Added
|
||||
|
||||
- Preliminary support for Raspberry Pi Zero (no Kaldi)
|
||||
- Play error sound when intent not recognized
|
||||
- _text and _raw_text to Home Assistant events
|
||||
|
||||
### Changed
|
||||
|
||||
- Disable wake word when TTS is speaking
|
||||
- Use json5 library to parse profile
|
||||
- Remove picotts pop sound
|
||||
- Don't open/close microphone after wake-up
|
||||
|
||||
## [2.4.14] - 2019 Dec 19
|
||||
|
||||
### Added
|
||||
|
||||
- Ability to split sentences across multiple .ini files in intents directory
|
||||
- Support (future) /api/intent for Home Assistant
|
||||
- Support for Home Assistant TTS system
|
||||
- Emulate MaryTTS /process API in web API
|
||||
- Include wakeId/siteId in JSON intent (MQTT/Websocket)
|
||||
- ?voice and ?language query parameters to /api/text-to-speech
|
||||
@@ -1,6 +1,6 @@
|
||||

|
||||
|
||||
Rhasspy (pronounced RAH-SPEE) is an offline, [multilingual](#supported-languages) voice assistant toolkit inspired by [Jasper](https://jasperproject.github.io/) that works well with [Home Assistant](https://www.home-assistant.io/), [Hass.io](https://www.home-assistant.io/hassio/), and [Node-RED](https://nodered.org).
|
||||
Rhasspy (pronounced RAH-SPEE) is an offline voice assistant toolkit inspired by [Jasper](https://jasperproject.github.io/) that [supports many languages](#supported-languages). It works well with [Home Assistant](https://www.home-assistant.io/), [Hass.io](https://www.home-assistant.io/hassio/), and [Node-RED](https://nodered.org).
|
||||
|
||||
* [Documentation](https://rhasspy.readthedocs.io/)
|
||||
* [Discussion](https://community.rhasspy.org)
|
||||
@@ -58,7 +58,7 @@ The table below summarizes language support across the various supporting techno
|
||||
| | [rasaNLU](https://rhasspy.readthedocs.io/en/latest/intent-recognition/#rasanlu) | *needs extra software* | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
|
||||
| **Text to Speech** | [espeak](https://rhasspy.readthedocs.io/en/latest/text-to-speech/#espeak) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
|
||||
| | [flite](https://rhasspy.readthedocs.io/en/latest/text-to-speech/#flite) | ✓ | ✓ | | | | | | | | ✓ | | | | | |
|
||||
| | [picotts](https://rhasspy.readthedocs.io/en/latest/text-to-speech/#picotts) | ✓ | ✓ | | | | | | | | | | | | | |
|
||||
| | [picotts](https://rhasspy.readthedocs.io/en/latest/text-to-speech/#picotts) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | | | | | |
|
||||
| | [marytts](https://rhasspy.readthedocs.io/en/latest/text-to-speech/#marytts) | ✓ | ✓ | ✓ | | ✓ | ✓ | | ✓ | | | | | | | |
|
||||
| | [wavenet](https://rhasspy.readthedocs.io/en/latest/text-to-speech/#google-wavenet) | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | | ✓ | ✓ | |
|
||||
|
||||
|
||||
@@ -7,9 +7,11 @@ import json
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import shutil
|
||||
import time
|
||||
from functools import wraps
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, List, Tuple, Union
|
||||
from typing import Any, Dict, List, Optional, Set, Tuple, Union
|
||||
from uuid import uuid4
|
||||
|
||||
import attr
|
||||
@@ -29,7 +31,13 @@ from swagger_ui import quart_api_doc
|
||||
|
||||
from rhasspy.actor import ActorSystem, ConfigureEvent, RhasspyActor
|
||||
from rhasspy.core import RhasspyCore
|
||||
from rhasspy.events import IntentRecognized, ProfileTrainingFailed
|
||||
from rhasspy.events import (
|
||||
IntentRecognized,
|
||||
ProfileTrainingFailed,
|
||||
VoiceCommand,
|
||||
WakeWordDetected,
|
||||
WavTranscription,
|
||||
)
|
||||
from rhasspy.utils import (
|
||||
FunctionLoggingHandler,
|
||||
buffer_to_wav,
|
||||
@@ -53,6 +61,10 @@ app = Quart("rhasspy")
|
||||
app.secret_key = str(uuid4())
|
||||
app = cors(app)
|
||||
|
||||
# WAV data from last voice command
|
||||
last_voice_wav: Optional[bytes] = None
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# Parse Arguments
|
||||
# -----------------------------------------------------------------------------
|
||||
@@ -91,8 +103,12 @@ parser.add_argument("--log-level", default="DEBUG", help="Set logging level")
|
||||
args = parser.parse_args()
|
||||
|
||||
# Set log level
|
||||
log_level = getattr(logging, args.log_level.upper())
|
||||
logging.basicConfig(level=log_level)
|
||||
if "RHASSPY_LOG_LEVEL" in os.environ:
|
||||
log_level = os.environ["RHASSPY_LOG_LEVEL"]
|
||||
else:
|
||||
log_level = args.log_level
|
||||
|
||||
logging.basicConfig(level=getattr(logging, log_level.upper()))
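The precedence introduced in this hunk (the `RHASSPY_LOG_LEVEL` environment variable first, then the `--log-level` argument) can be sketched as a small helper. The names `resolve_log_level` and `environ` are hypothetical and used only for illustration; they are not part of Rhasspy.

```python
import logging
import os
from typing import Mapping, Optional


def resolve_log_level(
    cli_level: str, environ: Optional[Mapping[str, str]] = None
) -> int:
    """Return a logging level, preferring RHASSPY_LOG_LEVEL over the CLI value."""
    env = os.environ if environ is None else environ
    level_name = env.get("RHASSPY_LOG_LEVEL", cli_level)
    # getattr raises AttributeError for an unknown level name,
    # which matches the behavior of the lookup in the hunk above.
    return getattr(logging, level_name.upper())
```

Passing a plain dictionary as `environ` makes the precedence easy to check without touching the real process environment.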
|
||||
|
||||
|
||||
logger.debug(args)
|
||||
@@ -206,6 +222,14 @@ async def api_download_profile() -> str:
|
||||
return "OK"
|
||||
|
||||
|
||||
@app.route("/api/download-status", methods=["GET"])
|
||||
async def api_download_status() -> str:
|
||||
"""Get status of profile download"""
|
||||
assert core is not None
|
||||
|
||||
return "\n".join(core.download_status)
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
|
||||
@@ -256,8 +280,11 @@ async def api_speakers() -> Response:
|
||||
async def api_listen_for_wake() -> str:
|
||||
"""Make Rhasspy listen for a wake word"""
|
||||
assert core is not None
|
||||
core.listen_for_wake()
|
||||
return "OK"
|
||||
enabled_str = (await request.data).decode().strip().lower()
|
||||
enabled = enabled_str not in ["false", "off"]
|
||||
core.listen_for_wake(enabled)
|
||||
|
||||
return str(enabled)
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
@@ -278,6 +305,10 @@ async def api_listen_for_command() -> Response:
|
||||
entity = request.args.get("entity")
|
||||
value = request.args.get("value")
|
||||
|
||||
# Emulate wake
|
||||
wake_json = json.dumps({"wakewordId": "default", "siteId": core.siteId})
|
||||
await add_ws_event("wake", wake_json)
|
||||
|
||||
return jsonify(
|
||||
await core.listen_for_command(
|
||||
handle=(not no_hass), timeout=timeout, entity=entity, value=value
|
||||
@@ -369,7 +400,7 @@ async def api_pronounce() -> Union[Response, str]:
|
||||
|
||||
if download:
|
||||
# Return WAV
|
||||
return Response(wav_data) # , mimetype="audio/wav")
|
||||
return Response(wav_data, mimetype="audio/wav")
|
||||
|
||||
# Play through speakers
|
||||
core.play_wav_data(wav_data)
|
||||
@@ -524,6 +555,26 @@ async def api_custom_words():
|
||||
assert core is not None
|
||||
speech_system = core.profile.get("speech_to_text.system", "pocketsphinx")
|
||||
|
||||
# Temporary fix for kaldi/custom_words -> kaldi_custom_words.txt
|
||||
old_kaldi_words_path = Path(core.profile.read_path("kaldi/custom_words.txt"))
|
||||
if old_kaldi_words_path.is_file():
|
||||
new_kaldi_words_path = Path(
|
||||
core.profile.write_path(
|
||||
core.profile.get(
|
||||
"speech_to_text.kaldi.custom_words", "custom_words.txt"
|
||||
)
|
||||
)
|
||||
)
|
||||
|
||||
if (
|
||||
new_kaldi_words_path != old_kaldi_words_path
|
||||
and not new_kaldi_words_path.is_file()
|
||||
):
|
||||
logger.warning(
|
||||
"Moving %s to %s", str(old_kaldi_words_path), str(new_kaldi_words_path)
|
||||
)
|
||||
shutil.move(old_kaldi_words_path, new_kaldi_words_path)
|
||||
|
||||
if request.method == "POST":
|
||||
custom_words_path = Path(
|
||||
core.profile.write_path(
|
||||
@@ -618,6 +669,7 @@ async def api_restart() -> str:
|
||||
@app.route("/api/speech-to-text", methods=["POST"])
|
||||
async def api_speech_to_text() -> str:
|
||||
"""Transcribe speech from WAV file."""
|
||||
global last_voice_wav
|
||||
no_header = request.args.get("noheader", "false").lower() == "true"
|
||||
assert core is not None
|
||||
|
||||
@@ -627,10 +679,20 @@ async def api_speech_to_text() -> str:
|
||||
# Wrap in WAV
|
||||
wav_data = buffer_to_wav(wav_data)
|
||||
|
||||
last_voice_wav = wav_data
|
||||
|
||||
start_time = time.perf_counter()
|
||||
result = await core.transcribe_wav(wav_data)
|
||||
end_time = time.perf_counter()
|
||||
|
||||
# Send to websocket
|
||||
await add_ws_event(
|
||||
"transcription",
|
||||
json.dumps(
|
||||
{"text": result.text, "wakewordId": "default", "siteId": core.siteId}
|
||||
),
|
||||
)
|
||||
|
||||
if prefers_json():
|
||||
return jsonify(
|
||||
{
|
||||
@@ -665,7 +727,7 @@ async def api_text_to_intent():
|
||||
|
||||
intent_json = json.dumps(intent)
|
||||
logger.debug(intent_json)
|
||||
await add_ws_event(WS_EVENT_INTENT, intent_json)
|
||||
await add_ws_event("intent", intent_json)
|
||||
|
||||
if not no_hass:
|
||||
# Send intent to Home Assistant
|
||||
@@ -680,11 +742,13 @@ async def api_text_to_intent():
|
||||
@app.route("/api/speech-to-intent", methods=["POST"])
|
||||
async def api_speech_to_intent() -> Response:
|
||||
"""Transcribe speech, recognize intent, and optionally handle."""
|
||||
global last_voice_wav
|
||||
assert core is not None
|
||||
no_hass = request.args.get("nohass", "false").lower() == "true"
|
||||
|
||||
# Prefer 16-bit 16 kHz mono, but will convert with sox if needed
|
||||
wav_data = await request.data
|
||||
last_voice_wav = wav_data
|
||||
|
||||
# speech -> text
|
||||
start_time = time.time()
|
||||
@@ -692,6 +756,12 @@ async def api_speech_to_intent() -> Response:
|
||||
text = transcription.text
|
||||
logger.debug(text)
|
||||
|
||||
# Send to websocket
|
||||
await add_ws_event(
|
||||
"transcription",
|
||||
json.dumps({"text": text, "wakewordId": "default", "siteId": core.siteId}),
|
||||
)
|
||||
|
||||
# text -> intent
|
||||
intent = (await core.recognize_intent(text)).intent
|
||||
intent["speech_confidence"] = transcription.confidence
|
||||
@@ -701,7 +771,7 @@ async def api_speech_to_intent() -> Response:
|
||||
|
||||
intent_json = json.dumps(intent)
|
||||
logger.debug(intent_json)
|
||||
await add_ws_event(WS_EVENT_INTENT, intent_json)
|
||||
await add_ws_event("intent", intent_json)
|
||||
|
||||
if not no_hass:
|
||||
# Send intent to Home Assistant
|
||||
@@ -726,6 +796,7 @@ async def api_start_recording() -> str:
|
||||
@app.route("/api/stop-recording", methods=["POST"])
|
||||
async def api_stop_recording() -> Response:
|
||||
"""End recording voice command. Transcribe and handle."""
|
||||
global last_voice_wav
|
||||
assert core is not None
|
||||
no_hass = request.args.get("nohass", "false").lower() == "true"
|
||||
|
||||
@@ -739,20 +810,43 @@ async def api_stop_recording() -> Response:
|
||||
text = transcription.text
|
||||
logger.debug(text)
|
||||
|
||||
# Send to websocket
|
||||
await add_ws_event(
|
||||
"transcription",
|
||||
json.dumps({"text": text, "wakewordId": "default", "siteId": core.siteId}),
|
||||
)
|
||||
|
||||
intent = (await core.recognize_intent(text)).intent
|
||||
intent["speech_confidence"] = transcription.confidence
|
||||
|
||||
intent_json = json.dumps(intent)
|
||||
logger.debug(intent_json)
|
||||
await add_ws_event(WS_EVENT_INTENT, intent_json)
|
||||
await add_ws_event("intent", intent_json)
|
||||
|
||||
if not no_hass:
|
||||
# Send intent to Home Assistant
|
||||
intent = (await core.handle_intent(intent)).intent
|
||||
|
||||
# Save last voice command WAV data
|
||||
last_voice_wav = wav_data
|
||||
|
||||
return jsonify(intent)
|
||||
|
||||
|
||||
@app.route("/api/play-recording", methods=["POST"])
|
||||
async def api_play_recording() -> str:
|
||||
"""Play last recorded voice command through the configured audio output system"""
|
||||
global last_voice_wav
|
||||
assert core is not None
|
||||
|
||||
if last_voice_wav:
|
||||
# Play through speakers
|
||||
logger.debug("Playing %s byte(s)", len(last_voice_wav))
|
||||
core.play_wav_data(last_voice_wav)
|
||||
|
||||
return "OK"
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
|
||||
@@ -806,7 +900,7 @@ async def api_text_to_speech() -> Union[bytes, str]:
|
||||
|
||||
if not play:
|
||||
# Return WAV data instead of speaking
|
||||
return result.wav_data
|
||||
return Response(result.wav_data, mimetype="audio/wav")
|
||||
|
||||
return sentence
|
||||
|
||||
@@ -823,16 +917,6 @@ async def api_slots() -> Union[str, Response]:
|
||||
overwrite_all = request.args.get("overwrite_all", "false").lower() == "true"
|
||||
new_slot_values = json5.loads(await request.data)
|
||||
|
||||
word_casing = core.profile.get(
|
||||
"speech_to_text.dictionary_casing", "ignore"
|
||||
).lower()
|
||||
word_transform = lambda s: s
|
||||
|
||||
if word_casing == "lower":
|
||||
word_transform = str.lower
|
||||
elif word_casing == "upper":
|
||||
word_transform = str.upper
|
||||
|
||||
slots_dir = Path(
|
||||
core.profile.write_path(
|
||||
core.profile.get("speech_to_text.slots_dir", "slots")
|
||||
@@ -859,11 +943,10 @@ async def api_slots() -> Union[str, Response]:
|
||||
slots_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Merge with existing values
|
||||
values = {word_transform(v.strip()) for v in values}
|
||||
values = {v.strip() for v in values}
|
||||
if slots_path.is_file():
|
||||
values.update(
|
||||
word_transform(line.strip())
|
||||
for line in slots_path.read_text().splitlines()
|
||||
line.strip() for line in slots_path.read_text().splitlines()
|
||||
)
|
||||
|
||||
# Write merged values
|
||||
@@ -989,7 +1072,7 @@ def api_intents():
|
||||
|
||||
|
||||
@app.route("/process", methods=["GET"])
|
||||
async def marytts_process():
|
||||
async def marytts_process() -> Response:
|
||||
"""Emulate MaryTTS /process API"""
|
||||
global last_sentence
|
||||
|
||||
@@ -1001,7 +1084,7 @@ async def marytts_process():
|
||||
sentence, play=False, voice=voice, language=locale
|
||||
)
|
||||
|
||||
return spoken.wav_data
|
||||
return Response(spoken.wav_data, mimetype="audio/wav")
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
@@ -1073,26 +1156,26 @@ async def swagger_yaml() -> Response:
|
||||
# WebSocket API
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
WS_EVENT_INTENT = 0
|
||||
WS_EVENT_LOG = 1
|
||||
|
||||
ws_queues: List[List[asyncio.Queue]] = [[], []]
|
||||
ws_locks: List[asyncio.Lock] = [asyncio.Lock(), asyncio.Lock()]
|
||||
user_queues: Set[asyncio.Queue] = set()
|
||||
logging_queues: Set[asyncio.Queue] = set()
|
||||
|
||||
|
||||
async def add_ws_event(event_type: int, text: str):
|
||||
"""Send text out to all websockets for a specific event."""
|
||||
async with ws_locks[event_type]:
|
||||
for q in ws_queues[event_type]:
|
||||
await q.put(text)
|
||||
async def add_ws_event(message_type: str, text: str):
|
||||
"""Send text out to all user websockets for a specific event."""
|
||||
for q in user_queues:
|
||||
await q.put((message_type, text))
|
||||
|
||||
|
||||
async def log_ws_event(text: str):
|
||||
"""Send logging message out to websockets."""
|
||||
for q in logging_queues:
|
||||
await q.put(text)
|
||||
|
||||
|
||||
# Send logging messages out to websocket
|
||||
logging.root.addHandler(
|
||||
FunctionLoggingHandler(
|
||||
lambda msg: asyncio.run_coroutine_threadsafe(
|
||||
add_ws_event(WS_EVENT_LOG, msg), loop
|
||||
)
|
||||
lambda msg: asyncio.run_coroutine_threadsafe(log_ws_event(msg), loop)
|
||||
)
|
||||
)
|
||||
|
||||
@@ -1102,6 +1185,8 @@ class WebSocketObserver(RhasspyActor):
|
||||
|
||||
def in_started(self, message: Any, sender: RhasspyActor) -> None:
|
||||
"""Handle messages in started state."""
|
||||
global last_voice_wav
|
||||
|
||||
if isinstance(message, IntentRecognized):
|
||||
# Add slots
|
||||
intent_slots = {}
|
||||
@@ -1113,29 +1198,75 @@ class WebSocketObserver(RhasspyActor):
|
||||
# Convert to JSON
|
||||
intent_json = json.dumps(message.intent)
|
||||
self._logger.debug(intent_json)
|
||||
asyncio.run_coroutine_threadsafe(
|
||||
add_ws_event(WS_EVENT_INTENT, intent_json), loop
|
||||
asyncio.run_coroutine_threadsafe(add_ws_event("intent", intent_json), loop)
|
||||
elif isinstance(message, WakeWordDetected):
|
||||
assert core is not None
|
||||
wake_json = json.dumps({"wakewordId": message.name, "siteId": core.siteId})
|
||||
asyncio.run_coroutine_threadsafe(add_ws_event("wake", wake_json), loop)
|
||||
elif isinstance(message, WavTranscription):
|
||||
assert core is not None
|
||||
transcription_json = json.dumps(
|
||||
{
|
||||
"text": message.text,
|
||||
"wakewordId": message.wakewordId,
|
||||
"siteId": core.siteId,
|
||||
}
|
||||
)
|
||||
asyncio.run_coroutine_threadsafe(
|
||||
add_ws_event("transcription", transcription_json), loop
|
||||
)
|
||||
elif isinstance(message, VoiceCommand):
|
||||
# Save last voice command
|
||||
last_voice_wav = buffer_to_wav(message.data)
|
||||
|
||||
|
||||
def api_websocket(func):
|
||||
"""Wraps a websocket route to use a user websocket queue"""
|
||||
|
||||
@wraps(func)
|
||||
async def wrapper(*_args, **kwargs):
|
||||
global user_queues
|
||||
queue = asyncio.Queue()
|
||||
user_queues.add(queue)
|
||||
try:
|
||||
return await func(queue, *_args, **kwargs)
|
||||
except Exception:
|
||||
logger.exception("api_websocket")
|
||||
finally:
|
||||
user_queues.discard(queue)
|
||||
|
||||
return wrapper
|
||||
|
||||
|
||||
@app.websocket("/api/events/intent")
|
||||
async def api_events_intent() -> None:
|
||||
@api_websocket
|
||||
async def api_events_intent(queue) -> None:
|
||||
"""Websocket endpoint to receive intents as JSON."""
|
||||
# Add new queue for websocket
|
||||
q: asyncio.Queue = asyncio.Queue()
|
||||
async with ws_locks[WS_EVENT_INTENT]:
|
||||
ws_queues[WS_EVENT_INTENT].append(q)
|
||||
|
||||
try:
|
||||
while True:
|
||||
text = await q.get()
|
||||
while True:
|
||||
message_type, text = await queue.get()
|
||||
if message_type == "intent":
|
||||
await websocket.send(text)
|
||||
except Exception:
|
||||
logger.exception("api_events_intent")
|
||||
|
||||
# Remove queue
|
||||
async with ws_locks[WS_EVENT_INTENT]:
|
||||
ws_queues[WS_EVENT_INTENT].remove(q)
|
||||
|
||||
@app.websocket("/api/events/text")
|
||||
@api_websocket
|
||||
async def api_events_text(queue) -> None:
|
||||
"""Websocket endpoint for transcriptions."""
|
||||
while True:
|
||||
message_type, text = await queue.get()
|
||||
if message_type == "transcription":
|
||||
await websocket.send(text)
|
||||
|
||||
|
||||
@app.websocket("/api/events/wake")
|
||||
@api_websocket
|
||||
async def api_events_wake(queue) -> None:
|
||||
"""Websocket endpoint to report wake up."""
|
||||
while True:
|
||||
message_type, text = await queue.get()
|
||||
if message_type == "wake":
|
||||
await websocket.send(text)
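The fan-out pattern used by these endpoints (one queue per connected client, every event tagged with a message type, each endpoint forwarding only the type it cares about) can be sketched without Quart. This is a minimal illustration under assumed names: `subscriber_queues`, `publish`, `collect`, and `demo` are hypothetical and do not appear in Rhasspy.

```python
import asyncio
from typing import List, Set, Tuple

# One queue per subscriber, mirroring the role of user_queues in the diff
subscriber_queues: Set[asyncio.Queue] = set()


async def publish(message_type: str, text: str) -> None:
    """Put a (type, text) pair on every subscriber queue."""
    for queue in subscriber_queues:
        await queue.put((message_type, text))


async def collect(queue: asyncio.Queue, wanted_type: str, count: int) -> List[str]:
    """Read from a queue until `count` messages of the wanted type arrive."""
    received: List[str] = []
    while len(received) < count:
        message_type, text = await queue.get()
        if message_type == wanted_type:
            received.append(text)  # other message types are silently skipped
    return received


async def demo() -> Tuple[List[str], List[str]]:
    """Register two subscribers, publish two events, and filter per subscriber."""
    wake_queue: asyncio.Queue = asyncio.Queue()
    text_queue: asyncio.Queue = asyncio.Queue()
    subscriber_queues.update({wake_queue, text_queue})
    try:
        await publish("wake", '{"wakewordId": "default"}')
        await publish("transcription", '{"text": "turn on the light"}')
        wake = await collect(wake_queue, "wake", 1)
        text = await collect(text_queue, "transcription", 1)
        return wake, text
    finally:
        # Always unregister, as the api_websocket wrapper does in its finally block
        subscriber_queues.discard(wake_queue)
        subscriber_queues.discard(text_queue)
```

Because every subscriber receives every event, the message type string used when publishing has to match the string each endpoint filters on exactly; a mismatch means the event is dropped by all listeners.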
|
||||
|
||||
|
||||
@app.websocket("/api/events/log")
|
||||
@@ -1143,8 +1274,7 @@ async def api_events_log() -> None:
|
||||
"""Websocket endpoint to receive logging messages as text."""
|
||||
# Add new queue for websocket
|
||||
q: asyncio.Queue = asyncio.Queue()
|
||||
async with ws_locks[WS_EVENT_LOG]:
|
||||
ws_queues[WS_EVENT_LOG].append(q)
|
||||
logging_queues.add(q)
|
||||
|
||||
try:
|
||||
while True:
|
||||
@@ -1152,12 +1282,9 @@ async def api_events_log() -> None:
|
||||
await websocket.send(text)
|
||||
except concurrent.futures.CancelledError:
|
||||
pass
|
||||
except Exception:
|
||||
logger.exception("api_events_log")
|
||||
|
||||
# Remove queue
|
||||
async with ws_locks[WS_EVENT_LOG]:
|
||||
ws_queues[WS_EVENT_LOG].remove(q)
|
||||
logging_queues.discard(q)
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
@@ -1193,6 +1320,9 @@ loop.run_until_complete(start_rhasspy())
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
# Disable useless logging messages
|
||||
logging.getLogger("wsproto").setLevel(logging.CRITICAL)
|
||||
|
||||
# Start web server
|
||||
if args.ssl is not None:
|
||||
logger.debug("Using SSL with certfile, keyfile = %s", args.ssl)
|
||||
|
||||
@@ -0,0 +1,28 @@
|
||||
#!/usr/bin/env python
|
||||
|
||||
import sys
|
||||
import json
|
||||
import random
|
||||
import datetime
|
||||
|
||||
|
||||
def speech(text):
|
||||
global o
|
||||
o["speech"] = {"text": text}
|
||||
|
||||
|
||||
# get json from stdin and load into python dict
|
||||
o = json.loads(sys.stdin.read())
|
||||
|
||||
intent = o["intent"]["name"]
|
||||
|
||||
if intent == "GetTime":
|
||||
now = datetime.datetime.now()
|
||||
speech("It's %s %d %s." % (now.strftime('%I'), now.minute, now.strftime('%p')))
|
||||
|
||||
elif intent == "Hello":
|
||||
replies = ['Hi!', 'Hello!', 'Hey there!', 'Greetings.']
|
||||
speech(random.choice(replies))
|
||||
|
||||
# convert dict to json and print to stdout
|
||||
print(json.dumps(o))
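The handler above reads intent JSON from stdin, attaches a `speech` entry for known intents, and prints the result. The same logic can be written as a pure function, which is easier to exercise without piping data through stdin. This is a sketch only: `handle`, `REPLIES`, and the optional `now` parameter are hypothetical names, and the use of a 12-hour clock (`%I`) next to the AM/PM marker (`%p`) is an assumption of this sketch.

```python
import datetime
import random
from typing import Any, Dict, Optional

REPLIES = ["Hi!", "Hello!", "Hey there!", "Greetings."]


def handle(
    event: Dict[str, Any], now: Optional[datetime.datetime] = None
) -> Dict[str, Any]:
    """Return a shallow copy of the event with a speech reply for known intents."""
    result = dict(event)
    intent = event["intent"]["name"]

    if intent == "GetTime":
        now = now or datetime.datetime.now()
        # Assumption: 12-hour clock (%I) so the hour agrees with AM/PM (%p)
        text = "It's %s %d %s." % (now.strftime("%I"), now.minute, now.strftime("%p"))
        result["speech"] = {"text": text}
    elif intent == "Hello":
        result["speech"] = {"text": random.choice(REPLIES)}

    # Unknown intents are returned unchanged
    return result
```

A command-line wrapper would then only need to load JSON from stdin, call `handle`, and dump the result to stdout.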
|
||||
@@ -124,7 +124,7 @@ pip3 install -r requirements.txt
|
||||
You should also re-build the web interface:
|
||||
|
||||
1. Install [yarn](https://yarnpkg.com) on your system
|
||||
2. Run `yarn build` in the `rhasspy` directory
|
||||
2. Run `yarn install && yarn build` in the `rhasspy` directory
|
||||
3. Restart any running instances of Rhasspy
|
||||
|
||||
### Running as a Service
|
||||
|
||||
@@ -207,7 +207,7 @@ The following environment variables are available to your program:
|
||||
* `$RHASSPY_PROFILE` - name of the current profile (e.g., "en")
|
||||
* `$RHASSPY_PROFILE_DIR` - directory of the current profile (where `profile.json` is)
|
||||
|
||||
See [handle.sh](https://github.com/synesthesiam/rhasspy/blob/master/bin/mock-commands/handle.sh) for an example program.
|
||||
See [handle.sh](https://github.com/synesthesiam/rhasspy/blob/master/bin/mock-commands/handle.sh) or [handle.py](https://github.com/synesthesiam/rhasspy/blob/master/bin/mock-commands/handle.py) for example programs.
|
||||
|
||||
### Speech
|
||||
|
||||
|
||||
@@ -10,8 +10,8 @@ The following table summarizes the trade-offs of using each intent recognizer:
|
||||
| [fsticuffs](intent-recognition.md#fsticuffs) | 1M+ | very fast | very fast | ignores unknown words |
|
||||
| [fuzzywuzzy](intent-recognition.md#fuzzywuzzy) | 12-100 | fast | fast | fuzzy string matching |
|
||||
| [adapt](intent-recognition.md#mycroft-adapt) | 100-1K | moderate | fast | ignores unknown words |
|
||||
| [flair](intent-recognition.md#flair) | 1K-100K | very slow | moderate | handles unseen words |
|
||||
| [rasaNLU](intent-recognition.md#rasanlu) | 1K-100K | very slow | moderate | handles unseen words |
|
||||
| [flair](intent-recognition.md#flair) | 1K-100K | very slow | moderate | handles unseen words |
|
||||
|
||||
## Fsticuffs
|
||||
|
||||
|
||||
+2
-1
@@ -53,7 +53,8 @@ Application authors may want to use the [rhasspy-client](https://pypi.org/projec
|
||||
* `?timeout=<seconds>` - override default command timeout
|
||||
* `?entity=<entity>&value=<value>` - set custom entity/value in recognized intent
|
||||
* `/api/listen-for-wake-word`
|
||||
* POST to wake Rhasspy up and return immediately
|
||||
* POST "on" to have Rhasspy listen for a wake word
|
||||
* POST "off" to disable wake word
|
||||
* `/api/lookup`
|
||||
* POST word as plain text to look up or guess pronunciation
|
||||
* `?n=<number>` - return at most `n` guessed pronunciations
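For the wake word toggle described in this list, the handler changed in this comparison decodes the POST body, trims and lowercases it, and treats only `false` and `off` as "disable". A minimal sketch of that rule follows; `parse_wake_toggle` is a hypothetical name, not a Rhasspy function.

```python
def parse_wake_toggle(body: bytes) -> bool:
    """Interpret POST data the way the wake handler in this change does."""
    enabled_str = body.decode().strip().lower()
    # Only "false" and "off" disable listening; anything else,
    # including an empty body, enables it.
    return enabled_str not in ["false", "off"]
```

One consequence worth noting is that an empty POST still enables the wake word, so clients written against the old "POST to wake up" behavior should keep working.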
|
||||
|
||||
+16
-1
@@ -29,6 +29,19 @@ Add to your [profile](profiles.md):
|
||||
|
||||
Remove the `voice` option to have `espeak` use your profile's language automatically.
|
||||
|
||||
You may also pass additional arguments to the `espeak` command. For example,
|
||||
|
||||
```json
|
||||
"text_to_speech": {
|
||||
"system": "espeak",
|
||||
"espeak": {
|
||||
"arguments": ["-s", "80"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
will speak the sentence more slowly.
|
||||
|
||||
See `rhasspy.tts.EspeakSentenceSpeaker` for more details.
|
||||
|
||||
## Flite
|
||||
@@ -52,7 +65,9 @@ See `rhasspy.tts.FliteSentenceSpeaker` for details.
|
||||
|
||||
## PicoTTS
|
||||
|
||||
Uses SVOX's [picotts](https://en.wikipedia.org/wiki/SVOX) for text to speech. Sounds a bit better (to me) than `flite` or `espeak`, but only has a single English voice.
|
||||
Uses SVOX's [picotts](https://en.wikipedia.org/wiki/SVOX) for text to speech. Sounds a bit better (to me) than `flite` or `espeak`.
|
||||
|
||||
Included languages are `en-US`, `en-GB`, `de-DE`, `es-ES`, `fr-FR` and `it-IT`.
|
||||
|
||||
Add to your [profile](profiles.md):
|
||||
|
||||
|
||||
+1
-1
@@ -247,7 +247,7 @@ Add a file in `slot_programs` with the name of your slot, e.g. `colors`. Write a
|
||||
|
||||
```bash
|
||||
cat <<EOF > "${slot_programs}/colors"
|
||||
#/usr/bin/env bash
|
||||
#!/usr/bin/env bash
|
||||
echo 'red'
|
||||
echo 'green'
|
||||
echo 'blue'
|
||||
|
||||
@@ -2,6 +2,11 @@
|
||||
|
||||
* [RGB Light Example](#rgb-light-example)
|
||||
* [Client/Server Setup](#clientserver-setup)
|
||||
* MATRIX Labs
|
||||
* [Rhasspy Voice Assistant on MATRIX Voice and MATRIX Creator](https://www.hackster.io/matrix-labs/rhasspy-voice-assistant-on-matrix-voice-and-matrix-creator-97f92e)
|
||||
* [Adding Intents for Rhasspy Offline Voice Assistant](https://www.hackster.io/matrix-labs/adding-intents-for-rhasspy-offline-voice-assistant-faa221)
|
||||
* Rendered Obsolete
|
||||
* [Home Assistant Voice Recognition with Rhasspy](https://rendered-obsolete.github.io/2020/01/02/rhasspy.html)
|
||||
|
||||
## RGB Light Example
|
||||
|
||||
|
||||
+39
-1
@@ -142,7 +142,18 @@ More example flows are available [on Github](https://github.com/synesthesiam/rha
|
||||
|
||||
### WebSocket Events
|
||||
|
||||
Whenever a voice command is recognized, Rhasspy emits JSON events over a websocket connection available at `ws://rhasspy:12101/api/events/intent` (replace `ws://` with `wss://` if you're using [secure hosting](usage.md#secure-hosting-with-https)).
|
||||
Rhasspy supports multiple websocket event endpoints:
|
||||
|
||||
* `/api/events/intent`
|
||||
* Intent recognized or not
|
||||
* `/api/events/wake`
|
||||
* Wake word detected
|
||||
* `/api/events/text`
|
||||
* Speech transcription
|
||||
|
||||
#### WebSocket Intents
|
||||
|
||||
Whenever a voice command is recognized, Rhasspy emits JSON events over a websocket connection available at `ws://YOUR_SERVER:12101/api/events/intent` (replace `ws://` with `wss://` if you're using [secure hosting](usage.md#secure-hosting-with-https)).
|
||||
You can listen to these events in a [Node-RED](https://nodered.org) flow, and easily add offline, private voice commands to your home automation setup!
|
||||
|
||||
For the `ChangeLightState` intent from the [RGB Light Example](index.md#rgb-light-example), Rhasspy will emit a JSON event like this over the websocket:
|
||||
@@ -171,6 +182,33 @@ For the `ChangLightState` intent from the [RGB Light Example](index.md#rgb-light
|
||||
}
|
||||
```
|
||||
|
||||
#### WebSocket Wake
|
||||
|
||||
When the wake word is detected, or Rhasspy is woken up via the `/api/listen-for-command` HTTP endpoint, a JSON event is emitted at `ws://YOUR_SERVER:12101/api/events/wake` (`wss://` if using HTTPS) like:
|
||||
|
||||
```json
|
||||
{
|
||||
"wakewordId": "default",
|
||||
"siteId": "default"
|
||||
}
|
||||
```
|
||||
|
||||
The `wakewordId` is set using the model or file name of your wakeword model (e.g., `porcupine` for `porcupine.ppn`). The `siteId` comes from your `mqtt.siteId` profile setting.
|
||||
|
||||
#### WebSocket Transcriptions
|
||||
|
||||
Each time a voice command is transcribed, Rhasspy emits a JSON event at `ws://YOUR_SERVER:12101/api/events/text` (`wss://` if using HTTPS) like:
|
||||
|
||||
```json
|
||||
{
|
||||
"text": "text from voice command",
|
||||
"wakewordId": "default",
|
||||
"siteId": "default"
|
||||
}
|
||||
```
|
||||
|
||||
The transcription is contained in the `text` property. `wakewordId` is the id of the wakeword that initiated the voice command (or `default`). The `siteId` comes from your `mqtt.siteId` profile setting.
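A client that listens on the wake and transcription endpoints receives each event as a JSON string with the properties shown above. The sketch below only covers decoding those payloads; it does not open a websocket connection. `describe_event` is a hypothetical helper, and falling back to `default` when `siteId` is missing is an assumption of this sketch.

```python
import json


def describe_event(endpoint: str, payload: str) -> str:
    """Turn a wake or transcription event payload into a short description."""
    event = json.loads(payload)
    site_id = event.get("siteId", "default")  # assumed fallback value

    if endpoint == "/api/events/wake":
        return "wake word %s detected on site %s" % (event["wakewordId"], site_id)

    if endpoint == "/api/events/text":
        return "site %s heard: %s" % (site_id, event["text"])

    raise ValueError("unsupported endpoint: %s" % endpoint)
```

Keeping the decoding separate from the connection code means the same function can be reused with whichever websocket client library is in use.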
|
||||
|
||||
## MQTT and Snips
|
||||
|
||||
Rhasspy is able to interoperate with Snips.AI services using the [Hermes protocol](https://docs.snips.ai/reference/hermes) over [MQTT](http://mqtt.org). The following components are Snips/Hermes compatible:
|
||||
|
||||
@@ -10,7 +10,7 @@
|
||||
"base_language_model": "kaldi/base_language_model.txt",
|
||||
"base_language_model_fst": "kaldi/base_language_model.fst",
|
||||
"compatible": true,
|
||||
"custom_words": "kaldi/custom_words.txt",
|
||||
"custom_words": "kaldi_custom_words.txt",
|
||||
"dictionary": "kaldi/dictionary.txt",
|
||||
"graph": "graph",
|
||||
"language_model": "kaldi/language_model.txt",
|
||||
|
||||
@@ -76,7 +76,8 @@
|
||||
"rasa": {
|
||||
"examples_markdown": "intent_examples.md",
|
||||
"project_name": "rhasspy",
|
||||
"url": "http://localhost:5005/"
|
||||
"url": "http://localhost:5005/",
|
||||
"model_dir": "/app/models"
|
||||
},
|
||||
"remote": {
|
||||
"url": "http://my-server:12101/api/text-to-intent"
|
||||
|
||||
@@ -10,7 +10,7 @@
|
||||
"base_language_model": "kaldi/base_language_model.txt",
|
||||
"base_language_model_fst": "kaldi/base_language_model.fst",
|
||||
"compatible": true,
|
||||
"custom_words": "kaldi/custom_words.txt",
|
||||
"custom_words": "kaldi_custom_words.txt",
|
||||
"dictionary": "kaldi/dictionary.txt",
|
||||
"graph": "graph",
|
||||
"language_model": "kaldi/language_model.txt",
|
||||
|
||||
@@ -10,7 +10,7 @@
|
||||
"base_language_model": "kaldi/base_language_model.txt",
|
||||
"base_language_model_fst": "kaldi/base_language_model.fst",
|
||||
"compatible": true,
|
||||
"custom_words": "kaldi/custom_words.txt",
|
||||
"custom_words": "kaldi_custom_words.txt",
|
||||
"dictionary": "kaldi/dictionary.txt",
|
||||
"graph": "graph",
|
||||
"language_model": "kaldi/language_model.txt",
|
||||
|
||||
@@ -9,7 +9,7 @@
|
||||
"base_dictionary": "kaldi/base_dictionary.txt",
|
||||
"base_language_model": "kaldi/base_language_model.txt",
|
||||
"compatible": true,
|
||||
"custom_words": "kaldi/custom_words.txt",
|
||||
"custom_words": "kaldi_custom_words.txt",
|
||||
"dictionary": "kaldi/dictionary.txt",
|
||||
"graph": "graph",
|
||||
"language_model": "kaldi/language_model.txt",
|
||||
|
||||
@@ -13,6 +13,11 @@ body {
|
||||
z-index: 9999;
|
||||
}
|
||||
|
||||
#logo {
|
||||
border-color: red;
|
||||
border-width: 0;
|
||||
}
|
||||
|
||||
.response {
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
@@ -538,3 +538,15 @@ paths:
|
||||
description: intents
|
||||
schema:
|
||||
type: object
|
||||
/api/play-recording:
|
||||
post:
|
||||
summary: 'Play the last recorded voice command from web API'
|
||||
produces:
|
||||
- text/plain
|
||||
responses:
|
||||
'200':
|
||||
description: OK
|
||||
content:
|
||||
text/plain:
|
||||
schema:
|
||||
type: string
|
||||
|
||||
+2
-2
@@ -4,7 +4,7 @@ doit==0.31.1
|
||||
fuzzywuzzy[speedup]==0.17.0
|
||||
google-cloud-texttospeech==0.5.0
|
||||
html5lib==1.0.1
|
||||
json5==0.8.5
|
||||
json5==0.7.0
|
||||
multidict==4.6.1
|
||||
networkx>=2.0
|
||||
num2words==0.5.10
|
||||
@@ -15,6 +15,6 @@ pydash==4.7.6
|
||||
quart==0.6.15
|
||||
quart-cors==0.1.3
|
||||
requests==2.22.0
|
||||
rhasspy-nlu==0.1.4.1
|
||||
rhasspy-nlu==0.1.6
|
||||
swagger-ui-py==0.1.7
|
||||
webrtcvad==2.0.10
|
||||
|
||||
+1
-1
@@ -618,7 +618,7 @@ async def wav2mqtt(core: RhasspyCore, profile: Profile, args: Any) -> None:
|
||||
|
||||
async def text2wav(core: RhasspyCore, profile: Profile, args: Any) -> None:
|
||||
"""Speak a sentence and output WAV data"""
|
||||
result = await core.speak_sentence(args)
|
||||
result = await core.speak_sentence(args.sentence)
|
||||
sys.stdout.buffer.write(result.wav_data)
|
||||
|
||||
|
||||
|
||||
+44
-9
@@ -39,6 +39,7 @@ from rhasspy.events import (
|
||||
SentenceSpoken,
|
||||
SpeakSentence,
|
||||
SpeakWord,
|
||||
StopListeningForWakeWord,
|
||||
StartRecordingToBuffer,
|
||||
StopRecordingToBuffer,
|
||||
TestMicrophones,
|
||||
@@ -88,6 +89,8 @@ class RhasspyCore:
|
||||
self._session: Optional[aiohttp.ClientSession] = aiohttp.ClientSession()
|
||||
self.dialogue_manager: Optional[RhasspyActor] = None
|
||||
|
||||
self.download_status: List[str] = []
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
@property
|
||||
@@ -96,6 +99,14 @@ class RhasspyCore:
|
||||
assert self._session is not None
|
||||
return self._session
|
||||
|
||||
@property
|
||||
def siteId(self) -> str:
|
||||
"""Get default MQTT siteId"""
|
||||
try:
|
||||
return self.profile.get("mqtt.siteId", "default").split(",")[0]
|
||||
except Exception:
|
||||
return "default"
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
async def start(
|
||||
@@ -160,10 +171,14 @@ class RhasspyCore:
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
def listen_for_wake(self) -> None:
|
||||
def listen_for_wake(self, enabled: bool = True) -> None:
"""Tell Rhasspy to start (enabled=True) or stop (enabled=False) listening for a wake word."""
|
||||
assert self.actor_system is not None
|
||||
self.actor_system.tell(self.dialogue_manager, ListenForWakeWord())
|
||||
|
||||
if enabled:
|
||||
self.actor_system.tell(self.dialogue_manager, ListenForWakeWord())
|
||||
else:
|
||||
self.actor_system.tell(self.dialogue_manager, StopListeningForWakeWord())
|
||||
|
||||
async def listen_for_command(
|
||||
self,
|
||||
@@ -344,9 +359,10 @@ class RhasspyCore:
|
||||
"""Generate speech/intent artifacts for profile."""
|
||||
if no_cache:
|
||||
# Delete doit database
|
||||
db_path = Path(self.profile.write_path(".doit.db"))
|
||||
if db_path.is_file():
|
||||
db_path.unlink()
|
||||
profile_dir = Path(self.profile.write_path())
|
||||
for db_path in profile_dir.glob(".doit.db*"):
|
||||
if db_path.is_file():
|
||||
db_path.unlink()
|
||||
|
||||
assert self.actor_system is not None
|
||||
with self.actor_system.private() as sys:
|
||||
@@ -480,6 +496,8 @@ class RhasspyCore:
|
||||
|
||||
async def download_profile(self, delete=False, chunk_size=4096) -> None:
|
||||
"""Download all necessary profile files from the internet and extract them."""
|
||||
self.download_status = []
|
||||
|
||||
output_dir = Path(self.profile.write_path())
|
||||
download_dir = Path(
|
||||
self.profile.write_path(self.profile.get("download.cache_dir", "download"))
|
||||
@@ -500,7 +518,9 @@ class RhasspyCore:
|
||||
|
||||
async def download_file(url, filename):
|
||||
try:
|
||||
self._logger.debug("Downloading %s to %s", url, filename)
|
||||
status = f"Downloading {url} to {filename}"
|
||||
self.download_status.append(status)
|
||||
self._logger.debug(status)
|
||||
os.makedirs(os.path.dirname(filename), exist_ok=True)
|
||||
|
||||
async with self.session.get(url) as response:
|
||||
@@ -508,10 +528,21 @@ class RhasspyCore:
|
||||
async for chunk in response.content.iter_chunked(chunk_size):
|
||||
out_file.write(chunk)
|
||||
|
||||
self._logger.debug("Downloaded %s", filename)
|
||||
status = f"Downloaded {filename}"
|
||||
self.download_status.append(status)
|
||||
self._logger.debug(status)
|
||||
except Exception:
|
||||
self._logger.exception(url)
|
||||
|
||||
# Try to delete partially downloaded file
|
||||
try:
|
||||
status = f"Failed to download {filename}"
|
||||
self.download_status.append(status)
|
||||
self._logger.debug(status)
|
||||
os.unlink(filename)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Check conditions
|
||||
machine_type = platform.machine()
|
||||
download_tasks = []
|
||||
@@ -595,7 +626,9 @@ class RhasspyCore:
|
||||
os.makedirs(os.path.dirname(dest_path), exist_ok=True)
|
||||
|
||||
# Copy file/directory as is
|
||||
self._logger.debug("Copying %s to %s", src_path, dest_path)
|
||||
status = f"Copying {src_path} to {dest_path}"
|
||||
self.download_status.append(status)
|
||||
self._logger.debug(status)
|
||||
if os.path.isdir(src_path):
|
||||
shutil.copytree(src_path, dest_path)
|
||||
else:
|
||||
@@ -668,7 +701,9 @@ class RhasspyCore:
|
||||
extract_path = os.path.join(temp_dir, src_extract)
|
||||
|
||||
# Copy specific file/directory
|
||||
self._logger.debug("Copying %s to %s", extract_path, dest_path)
|
||||
status = f"Copying {extract_path} to {dest_path}"
|
||||
self.download_status.append(status)
|
||||
self._logger.debug(status)
|
||||
if os.path.isdir(extract_path):
|
||||
if src_exclude:
|
||||
# Ignore some files
|
||||
|
||||
+26
-1
@@ -386,6 +386,10 @@ class DialogueManager(RhasspyActor):
|
||||
for hook_url in awake_hooks:
|
||||
self._logger.debug("POST-ing to %s", hook_url)
|
||||
requests.post(hook_url, json=hook_json)
|
||||
|
||||
# Forward to observer
|
||||
if self.observer:
|
||||
self.send(self.observer, message)
|
||||
elif isinstance(message, WakeWordNotDetected):
|
||||
self._logger.debug("Wake word NOT detected. Staying asleep.")
|
||||
self.transition("ready")
|
||||
@@ -423,6 +427,10 @@ class DialogueManager(RhasspyActor):
|
||||
wav_data = buffer_to_wav(message.data)
|
||||
self.send(self.decoder, TranscribeWav(wav_data, handle=message.handle))
|
||||
self.transition("decoding")
|
||||
|
||||
# Forward to observer
|
||||
if self.observer:
|
||||
self.send(self.observer, message)
|
||||
else:
|
||||
self.handle_any(message, sender)
|
||||
|
||||
@@ -433,6 +441,15 @@ class DialogueManager(RhasspyActor):
|
||||
def in_decoding(self, message: Any, sender: RhasspyActor) -> None:
|
||||
"""Handle messages in decoding state."""
|
||||
if isinstance(message, WavTranscription):
|
||||
message.wakewordId = self.wake_detected_name or "default"
|
||||
|
||||
# Fix casing
|
||||
dict_casing = self.profile.get("speech_to_text.dictionary_casing", "")
|
||||
if dict_casing == "lower":
|
||||
message.text = message.text.lower()
|
||||
elif dict_casing == "upper":
|
||||
message.text = message.text.upper()
|
||||
|
||||
# text -> intent
|
||||
self._logger.debug("%s (confidence=%s)", message.text, message.confidence)
|
||||
|
||||
@@ -447,7 +464,8 @@ class DialogueManager(RhasspyActor):
|
||||
"text": message.text,
|
||||
"likelihood": 1,
|
||||
"seconds": 0,
|
||||
"wakeId": self.wake_detected_name or "",
|
||||
"wakeId": message.wakewordId,
|
||||
"wakewordId": message.wakewordId,
|
||||
}
|
||||
).encode()
|
||||
|
||||
@@ -460,6 +478,10 @@ class DialogueManager(RhasspyActor):
|
||||
)
|
||||
self.send(self.mqtt, MqttPublish("hermes/asr/textCaptured", payload))
|
||||
|
||||
# Forward to observer
|
||||
if self.observer:
|
||||
self.send(self.observer, message)
|
||||
|
||||
# Pass to intent recognizer
|
||||
self.send(
|
||||
self.recognizer,
|
||||
@@ -732,6 +754,9 @@ class DialogueManager(RhasspyActor):
|
||||
elif isinstance(message, GetProblems):
|
||||
# Report problems from child actors
|
||||
self.send(sender, Problems(self.problems))
|
||||
elif isinstance(message, (ListenForWakeWord, StopListeningForWakeWord)):
|
||||
# Forward to wake actor
|
||||
self.send(self.wake, message)
|
||||
else:
|
||||
self.handle_forward(message, sender)
|
||||
|
||||
|
||||
+8
-1
@@ -390,10 +390,17 @@ class TranscribeWav:
|
||||
class WavTranscription:
|
||||
"""Response to TranscribeWav."""
|
||||
|
||||
def __init__(self, text: str, handle: bool = True, confidence: float = 1) -> None:
|
||||
def __init__(
|
||||
self,
|
||||
text: str,
|
||||
handle: bool = True,
|
||||
confidence: float = 1,
|
||||
wakewordId: str = "default",
|
||||
) -> None:
|
||||
self.text = text
|
||||
self.confidence = confidence
|
||||
self.handle = handle
|
||||
self.wakewordId = wakewordId
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
+5
-153
@@ -30,7 +30,6 @@ def get_recognizer_class(system: str) -> Type[RhasspyActor]:
|
||||
"adapt",
|
||||
"rasa",
|
||||
"remote",
|
||||
"flair",
|
||||
"conversation",
|
||||
"command",
|
||||
], f"Invalid intent system: {system}"
|
||||
@@ -54,10 +53,6 @@ def get_recognizer_class(system: str) -> Type[RhasspyActor]:
|
||||
# Use remote rhasspy server
|
||||
return RemoteRecognizer
|
||||
|
||||
if system == "flair":
|
||||
# Use flair locally
|
||||
return FlairRecognizer
|
||||
|
||||
if system == "conversation":
|
||||
# Use HA conversation
|
||||
return HomeAssistantConversationRecognizer
|
||||
@@ -293,8 +288,8 @@ class FuzzyWuzzyRecognizer(RhasspyActor):
|
||||
self._logger.exception("in_loaded")
|
||||
intent = empty_intent()
|
||||
intent["text"] = message.text
|
||||
intent["raw_text"] = message.text
|
||||
|
||||
intent["raw_text"] = message.text
|
||||
intent["speech_confidence"] = message.confidence
|
||||
self.send(
|
||||
message.receiver or sender,
|
||||
@@ -409,13 +404,14 @@ class RasaIntentRecognizer(RhasspyActor):
|
||||
if isinstance(message, RecognizeIntent):
|
||||
try:
|
||||
intent = self.recognize(message.text)
|
||||
intent["intent"]["name"] = intent["intent"]["name"] or ""
|
||||
logging.debug(repr(intent))
|
||||
except Exception:
|
||||
self._logger.exception("in_started")
|
||||
intent = empty_intent()
|
||||
intent["text"] = message.text
|
||||
intent["raw_text"] = message.text
|
||||
|
||||
intent["raw_text"] = message.text
|
||||
self.send(
|
||||
message.receiver or sender,
|
||||
IntentRecognized(intent, handle=message.handle),
|
||||
@@ -476,8 +472,8 @@ class AdaptIntentRecognizer(RhasspyActor):
|
||||
self._logger.exception("in_loaded")
|
||||
intent = empty_intent()
|
||||
intent["text"] = message.text
|
||||
intent["raw_text"] = message.text
|
||||
|
||||
intent["raw_text"] = message.text
|
||||
intent["speech_confidence"] = message.confidence
|
||||
self.send(
|
||||
message.receiver or sender,
|
||||
@@ -558,150 +554,6 @@ class AdaptIntentRecognizer(RhasspyActor):
|
||||
self._logger.debug("Loaded engine from config file %s", config_path)
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# Flair Intent Recognizer
|
||||
# https://github.com/zalandoresearch/flair
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
|
||||
class FlairRecognizer(RhasspyActor):
|
||||
"""Flair based recognizer"""
|
||||
|
||||
def __init__(self) -> None:
|
||||
RhasspyActor.__init__(self)
|
||||
|
||||
try:
|
||||
# pylint: disable=E0401
|
||||
from flair.models import TextClassifier, SequenceTagger
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
self.class_model: Optional[TextClassifier] = None
|
||||
self.ner_models: Optional[Dict[str, SequenceTagger]] = None
|
||||
self.intent_map: Optional[Dict[str, str]] = None
|
||||
self.preload = False
|
||||
|
||||
def to_started(self, from_state: str) -> None:
|
||||
"""Transition to started state."""
|
||||
self.preload = self.config.get("preload", False)
|
||||
if self.preload:
|
||||
try:
|
||||
# Pre-load models
|
||||
self.load_models()
|
||||
except Exception as e:
|
||||
self._logger.warning("preload: %s", e)
|
||||
|
||||
def in_started(self, message: Any, sender: RhasspyActor) -> None:
|
||||
"""Handle messages in started state."""
|
||||
if isinstance(message, RecognizeIntent):
|
||||
try:
|
||||
self.load_models()
|
||||
intent = self.recognize(message.text)
|
||||
except Exception:
|
||||
self._logger.exception("in_started")
|
||||
intent = empty_intent()
|
||||
intent["text"] = message.text
|
||||
intent["raw_text"] = message.text
|
||||
|
||||
intent["speech_confidence"] = message.confidence
|
||||
self.send(
|
||||
message.receiver or sender,
|
||||
IntentRecognized(intent, handle=message.handle),
|
||||
)
|
||||
|
||||
def recognize(self, text: str) -> Dict[str, Any]:
|
||||
"""Run intent classifier and then named-entity recognizer."""
|
||||
# pylint: disable=E0401
|
||||
from flair.data import Sentence
|
||||
|
||||
intent = empty_intent()
|
||||
sentence = Sentence(text)
|
||||
|
||||
assert self.intent_map is not None
|
||||
if self.class_model is not None:
|
||||
self.class_model.predict(sentence)
|
||||
assert sentence.labels, "No intent predicted"
|
||||
|
||||
label = sentence.labels[0]
|
||||
intent_id = label.value
|
||||
intent["intent"]["confidence"] = label.score
|
||||
else:
|
||||
# Assume first intent
|
||||
intent_id = next(iter(self.intent_map))
|
||||
intent["intent"]["confidence"] = 1
|
||||
|
||||
intent["intent"]["name"] = self.intent_map[intent_id]
|
||||
|
||||
assert self.ner_models is not None
|
||||
if intent_id in self.ner_models:
|
||||
# Predict entities
|
||||
self.ner_models[intent_id].predict(sentence)
|
||||
ner_dict = sentence.to_dict(tag_type="ner")
|
||||
for named_entity in ner_dict["entities"]:
|
||||
intent["entities"].append(
|
||||
{
|
||||
"entity": named_entity["type"],
|
||||
"value": named_entity["text"],
|
||||
"start": named_entity["start_pos"],
|
||||
"end": named_entity["end_pos"],
|
||||
"confidence": named_entity["confidence"],
|
||||
}
|
||||
)
|
||||
|
||||
return intent
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
def load_models(self) -> None:
|
||||
"""Load intent classifier and named entity recognizers."""
|
||||
# pylint: disable=E0401
|
||||
from flair.models import TextClassifier, SequenceTagger
|
||||
|
||||
# Load mapping from intent id to user intent name
|
||||
if self.intent_map is None:
|
||||
intent_map_path = self.profile.read_path(
|
||||
self.profile.get("training.intent.intent_map", "intent_map.json")
|
||||
)
|
||||
|
||||
with open(intent_map_path, "r") as intent_map_file:
|
||||
self.intent_map = json.load(intent_map_file)
|
||||
|
||||
data_dir = self.profile.read_path(
|
||||
self.profile.get("intent.flair.data_dir", "flair_data")
|
||||
)
|
||||
|
||||
# Only load intent classifier if there is more than one intent
|
||||
if (self.class_model is None) and (len(self.intent_map) > 1):
|
||||
class_model_path = os.path.join(
|
||||
data_dir, "classification", "final-model.pt"
|
||||
)
|
||||
self._logger.debug("Loading classification model from %s", class_model_path)
|
||||
self.class_model = TextClassifier.load_from_file(class_model_path)
|
||||
self._logger.debug("Loaded classification model")
|
||||
|
||||
if self.ner_models is None:
|
||||
ner_models = {}
|
||||
ner_data_dir = os.path.join(data_dir, "ner")
|
||||
for file_name in os.listdir(ner_data_dir):
|
||||
ner_model_dir = os.path.join(ner_data_dir, file_name)
|
||||
if os.path.isdir(ner_model_dir):
|
||||
# Assume directory is intent name
|
||||
intent_name = file_name
|
||||
if intent_name not in self.intent_map:
|
||||
self._logger.warning(
|
||||
"%s was not found in intent map", intent_name
|
||||
)
|
||||
|
||||
ner_model_path = os.path.join(ner_model_dir, "final-model.pt")
|
||||
self._logger.debug("Loading NER model from %s", ner_model_path)
|
||||
ner_models[intent_name] = SequenceTagger.load_from_file(
|
||||
ner_model_path
|
||||
)
|
||||
|
||||
self._logger.debug("Loaded NER model(s)")
|
||||
self.ner_models = ner_models
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# Home Assistant Conversation
|
||||
# https://www.home-assistant.io/integrations/conversation
|
||||
@@ -807,8 +659,8 @@ class CommandRecognizer(RhasspyActor):
|
||||
self._logger.exception("in_started")
|
||||
intent = empty_intent()
|
||||
intent["text"] = message.text
|
||||
intent["raw_text"] = message.text
|
||||
|
||||
intent["raw_text"] = message.text
|
||||
intent["speech_confidence"] = message.confidence
|
||||
self.send(
|
||||
message.receiver or sender,
|
||||
|
||||
+81
-325
@@ -1,12 +1,8 @@
|
||||
"""Training for intent recognizers."""
|
||||
import json
|
||||
import os
|
||||
import random
|
||||
import re
|
||||
import shutil
|
||||
import subprocess
|
||||
import tempfile
|
||||
import time
|
||||
from collections import Counter, defaultdict
|
||||
from io import StringIO
|
||||
from typing import Any, Callable, Dict, List, Set, Type
|
||||
@@ -14,7 +10,7 @@ from urllib.parse import urljoin
|
||||
|
||||
from rhasspy.actor import RhasspyActor
|
||||
from rhasspy.events import IntentTrainingComplete, IntentTrainingFailed, TrainIntent
|
||||
from rhasspy.utils import lcm, make_sentences_by_intent, load_converters
|
||||
from rhasspy.utils import make_sentences_by_intent, load_converters
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
@@ -30,7 +26,6 @@ def get_intent_trainer_class(
|
||||
"fuzzywuzzy",
|
||||
"adapt",
|
||||
"rasa",
|
||||
"flair",
|
||||
"auto",
|
||||
"command",
|
||||
], f"Invalid intent training system: {trainer_system}"
|
||||
@@ -46,9 +41,6 @@ def get_intent_trainer_class(
|
||||
if recognizer_system == "adapt":
|
||||
# Use Mycroft Adapt locally
|
||||
return AdaptIntentTrainer
|
||||
if recognizer_system == "flair":
|
||||
# Use flair locally
|
||||
return FlairIntentTrainer
|
||||
if recognizer_system == "rasa":
|
||||
# Use Rasa NLU remotely
|
||||
return RasaIntentTrainer
|
||||
@@ -67,9 +59,6 @@ def get_intent_trainer_class(
|
||||
if trainer_system == "rasa":
|
||||
# Use Rasa NLU remotely
|
||||
return RasaIntentTrainer
|
||||
if trainer_system == "flair":
|
||||
# Use flair RNN locally
|
||||
return FlairIntentTrainer
|
||||
if trainer_system == "command":
|
||||
# Use command-line intent trainer
|
||||
return CommandIntentTrainer
|
||||
@@ -96,7 +85,7 @@ class DummyIntentTrainer(RhasspyActor):
|
||||
|
||||
|
||||
class FsticuffsIntentTrainer(DummyIntentTrainer):
|
||||
"""No training needed. Intent FST will be used directly during recognition."""
|
||||
"""No training needed. Intent graph will be used directly during recognition."""
|
||||
|
||||
pass
|
||||
|
||||
@@ -114,6 +103,10 @@ class FuzzyWuzzyIntentTrainer(RhasspyActor):
|
||||
RhasspyActor.__init__(self)
|
||||
self.converters: Dict[str, Callable[..., Any]] = {}
|
||||
|
||||
def to_started(self, from_state: str) -> None:
|
||||
# Load user-defined converters
|
||||
self.converters = load_converters(self.profile)
|
||||
|
||||
def in_started(self, message: Any, sender: RhasspyActor) -> None:
|
||||
"""Handle messages in started state."""
|
||||
if isinstance(message, TrainIntent):
|
||||
@@ -130,9 +123,8 @@ class FuzzyWuzzyIntentTrainer(RhasspyActor):
|
||||
self.profile.get("intent.fuzzywuzzy.examples_json")
|
||||
)
|
||||
|
||||
converters = load_converters(self.profile)
|
||||
sentences_by_intent = make_sentences_by_intent(
|
||||
intent_graph, extra_converters=converters
|
||||
intent_graph, extra_converters=self.converters
|
||||
)
|
||||
with open(examples_path, "w") as examples_file:
|
||||
json.dump(sentences_by_intent, examples_file, indent=4)
|
||||
@@ -153,11 +145,15 @@ class RasaIntentTrainer(RhasspyActor):
|
||||
RhasspyActor.__init__(self)
|
||||
self.converters: Dict[str, Callable[..., Any]] = {}
|
||||
|
||||
def to_started(self, from_state: str) -> None:
|
||||
# Load user-defined converters
|
||||
self.converters = load_converters(self.profile)
|
||||
|
||||
def in_started(self, message: Any, sender: RhasspyActor) -> None:
|
||||
"""Handle messages in started state."""
|
||||
if isinstance(message, TrainIntent):
|
||||
try:
|
||||
self.train(message.intent_fst)
|
||||
self.train(message.intent_graph)
|
||||
self.send(message.receiver or sender, IntentTrainingComplete())
|
||||
except Exception as e:
|
||||
self._logger.exception("train")
|
||||
@@ -165,9 +161,8 @@ class RasaIntentTrainer(RhasspyActor):
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
def train(self, intent_fst) -> None:
|
||||
def train(self, intent_graph) -> None:
|
||||
"""Convert examples to Markdown and POST to RasaNLU server."""
|
||||
from rhasspy.train.jsgf2fst import fstprintall
|
||||
import requests
|
||||
|
||||
# Load settings
|
||||
@@ -183,39 +178,58 @@ class RasaIntentTrainer(RhasspyActor):
|
||||
)
|
||||
|
||||
# Build Markdown sentences
|
||||
sentences_by_intent: Dict[str, Any] = defaultdict(list)
|
||||
for symbols in fstprintall(intent_fst, exclude_meta=False):
|
||||
intent_name = ""
|
||||
strings = []
|
||||
for sym in symbols:
|
||||
if sym.startswith("<"):
|
||||
continue # <eps>
|
||||
sentences_by_intent = make_sentences_by_intent(
|
||||
intent_graph, extra_converters=self.converters
|
||||
)
|
||||
|
||||
if sym.startswith("__label__"):
|
||||
intent_name = sym[9:]
|
||||
elif sym.startswith("__begin__"):
|
||||
strings.append("[")
|
||||
elif sym.startswith("__end__"):
|
||||
strings[-1] = strings[-1].strip()
|
||||
tag = sym[7:]
|
||||
strings.append(f"]({tag})")
|
||||
strings.append(" ")
|
||||
else:
|
||||
strings.append(sym)
|
||||
strings.append(" ")
|
||||
|
||||
sentence = "".join(strings).strip()
|
||||
sentences_by_intent[intent_name].append(sentence)
|
||||
|
||||
# Write to YAML file
|
||||
# Write to YAML/Markdown file
|
||||
with open(examples_md_path, "w") as examples_md_file:
|
||||
for intent_name, intent_sents in sentences_by_intent.items():
|
||||
# Rasa Markdown training format
|
||||
print(f"## intent:{intent_name}", file=examples_md_file)
|
||||
for intent_sent in intent_sents:
|
||||
print("-", intent_sent, file=examples_md_file)
|
||||
raw_index = 0
|
||||
index_entity = {e["raw_start"]: e for e in intent_sent["entities"]}
|
||||
entity = None
|
||||
sentence_tokens = []
|
||||
entity_tokens = []
|
||||
for token in intent_sent["raw_tokens"]:
|
||||
if entity and (raw_index >= entity["raw_end"]):
|
||||
# Finish current entity
|
||||
last_token = entity_tokens[-1]
|
||||
entity_tokens[-1] = f"{last_token}]({entity['entity']})"
|
||||
sentence_tokens.extend(entity_tokens)
|
||||
entity = None
|
||||
entity_tokens = []
|
||||
|
||||
print("", file=examples_md_file)
|
||||
new_entity = index_entity.get(raw_index)

# Advance by the raw token length here, before a "[" prefix can be added below
raw_index += len(token) + 1

if new_entity:
# Begin new entity
assert entity is None, "Unclosed entity"
entity = new_entity
entity_tokens = []
token = f"[{token}"

if entity:
# Add to current entity
entity_tokens.append(token)
else:
# Add directly to sentence
sentence_tokens.append(token)
|
||||
|
||||
if entity:
|
||||
# Finish final entity
|
||||
last_token = entity_tokens[-1]
|
||||
entity_tokens[-1] = f"{last_token}]({entity['entity']})"
|
||||
sentence_tokens.extend(entity_tokens)
|
||||
|
||||
# Print single example
|
||||
print("-", " ".join(sentence_tokens), file=examples_md_file)
|
||||
|
||||
# Newline between intents
|
||||
print("", file=examples_md_file)
|
||||
|
||||
# Create training YAML file
|
||||
with tempfile.NamedTemporaryFile(
|
||||
@@ -263,6 +277,14 @@ class RasaIntentTrainer(RhasspyActor):
|
||||
|
||||
try:
|
||||
response.raise_for_status()
|
||||
|
||||
model_dir = rasa_config.get("model_dir", "")
|
||||
model_file = os.path.join(model_dir, response.headers["filename"])
|
||||
self._logger.debug("Received model %s", model_file)
|
||||
|
||||
# Replace model
|
||||
model_url = urljoin(url, "model")
|
||||
requests.put(model_url, json={"model_file": model_file})
|
||||
except Exception:
|
||||
# Rasa gives quite helpful error messages, so extract them from the response.
|
||||
raise Exception(
|
||||
@@ -291,7 +313,7 @@ class AdaptIntentTrainer(RhasspyActor):
|
||||
"""Handle messages in started state."""
|
||||
if isinstance(message, TrainIntent):
|
||||
try:
|
||||
self.train(message.intent_fst)
|
||||
self.train(message.intent_graph)
|
||||
self.send(message.receiver or sender, IntentTrainingComplete())
|
||||
except Exception as e:
|
||||
self._logger.exception("train")
|
||||
@@ -299,7 +321,7 @@ class AdaptIntentTrainer(RhasspyActor):
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
def train(self, intent_fst) -> None:
|
||||
def train(self, intent_graph) -> None:
|
||||
"""Create intents, entities, and keywords."""
|
||||
# Load "stop" words (common words that are excluded from training)
|
||||
stop_words: Set[str] = set()
|
||||
@@ -309,7 +331,9 @@ class AdaptIntentTrainer(RhasspyActor):
|
||||
stop_words = {line.strip() for line in stop_words_file if line.strip()}
|
||||
|
||||
# { intent: [ { 'text': ..., 'entities': { ... } }, ... ] }
|
||||
sentences_by_intent: Dict[str, Any] = make_sentences_by_intent(intent_fst)
|
||||
sentences_by_intent = make_sentences_by_intent(
|
||||
intent_graph, extra_converters=self.converters
|
||||
)
|
||||
|
||||
# Generate intent configuration
|
||||
entities: Dict[str, Set[str]] = {}
|
||||
@@ -328,17 +352,12 @@ class AdaptIntentTrainer(RhasspyActor):
|
||||
|
||||
# Process sentences for this intent
|
||||
for intent_sent in intent_sents:
|
||||
_, slots, word_tokens = (
|
||||
intent_sent.get("raw_text", intent_sent["text"]),
|
||||
intent_sent["entities"],
|
||||
intent_sent["tokens"],
|
||||
)
|
||||
entity_tokens: Set[str] = set()
|
||||
|
||||
# Group slot values by entity
|
||||
slot_entities: Dict[str, List[str]] = defaultdict(list)
|
||||
for sent_ent in slots:
|
||||
slot_entities[sent_ent["entity"]].append(sent_ent["value"])
|
||||
for sent_ent in intent_sent["entities"]:
|
||||
slot_entities[sent_ent["entity"]].append(sent_ent["raw_value"])
|
||||
|
||||
# Add entities
|
||||
for entity_name, entity_values in slot_entities.items():
|
||||
@@ -352,10 +371,10 @@ class AdaptIntentTrainer(RhasspyActor):
|
||||
|
||||
# Split entity values by whitespace
|
||||
for value in entity_values:
|
||||
entity_tokens.update(re.split(r"\s", value))
|
||||
entity_tokens.update(value.split())
|
||||
|
||||
# Get all non-stop words that are not part of entity values
|
||||
words = set(word_tokens) - entity_tokens - stop_words
|
||||
words = set(intent_sent["raw_tokens"]) - entity_tokens - stop_words
|
||||
|
||||
# Increment count for words
|
||||
for word in words:
|
||||
@@ -415,273 +434,6 @@ class AdaptIntentTrainer(RhasspyActor):
|
||||
self._logger.debug("Wrote adapt configuration to %s", config_path)
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# Flair Intent Trainer
|
||||
# https://github.com/zalandoresearch/flair
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
|
||||
class FlairIntentTrainer(RhasspyActor):
|
||||
"""Trains a classification and NER model using flair"""
|
||||
|
||||
def __init__(self):
|
||||
RhasspyActor.__init__(self)
|
||||
self.embeddings = []
|
||||
self.converters: Dict[str, Callable[..., Any]] = {}
|
||||
|
||||
def to_started(self, from_state: str) -> None:
|
||||
# Load user-defined converters
|
||||
self.converters = load_converters(self.profile)
|
||||
|
||||
def in_started(self, message: Any, sender: RhasspyActor) -> None:
|
||||
"""Handle messages in started state."""
|
||||
if isinstance(message, TrainIntent):
|
||||
try:
|
||||
self.train(message.intent_fst)
|
||||
self.send(message.receiver or sender, IntentTrainingComplete())
|
||||
except Exception as e:
|
||||
self._logger.exception("train")
|
||||
self.send(message.receiver or sender, IntentTrainingFailed(repr(e)))
|
||||
|
||||
def train(self, intent_fst) -> None:
|
||||
"""Train intent classifier and named entity recognizers."""
|
||||
# pylint: disable=E0401
|
||||
from flair.data import Sentence, Token
|
||||
|
||||
# pylint: disable=E0401
|
||||
from flair.models import SequenceTagger, TextClassifier
|
||||
|
||||
# pylint: disable=E0401
|
||||
from flair.embeddings import (
|
||||
FlairEmbeddings,
|
||||
StackedEmbeddings,
|
||||
DocumentRNNEmbeddings,
|
||||
)
|
||||
|
||||
# pylint: disable=E0401
|
||||
from flair.data import TaggedCorpus
|
||||
|
||||
# pylint: disable=E0401
|
||||
from flair.trainers import ModelTrainer
|
||||
|
||||
# Directory to look for downloaded embeddings
|
||||
cache_dir = self.profile.read_path(
|
||||
self.profile.get("intent.flair.cache_dir", "flair/cache")
|
||||
)
|
||||
|
||||
os.makedirs(cache_dir, exist_ok=True)
|
||||
|
||||
# Directory to store generated models
|
||||
data_dir = self.profile.write_path(
|
||||
self.profile.get("intent.flair.data_dir", "flair/data")
|
||||
)
|
||||
|
||||
if os.path.exists(data_dir):
|
||||
shutil.rmtree(data_dir)
|
||||
|
||||
self.embeddings = self.profile.get("intent.flair.embeddings", [])
|
||||
assert self.embeddings, "No word embeddings"
|
||||
|
||||
# Create directories to write training data to
|
||||
class_data_dir = os.path.join(data_dir, "classification")
|
||||
ner_data_dir = os.path.join(data_dir, "ner")
|
||||
os.makedirs(class_data_dir, exist_ok=True)
|
||||
os.makedirs(ner_data_dir, exist_ok=True)
|
||||
|
||||
# Convert FST to training data
|
||||
# ----------------------------
|
||||
|
||||
# { intent: [ { 'text': ..., 'entities': { ... } }, ... ] }
|
||||
sentences_by_intent: Dict[str, Any] = {}
|
||||
|
||||
# Get sentences for training
|
||||
do_sampling = self.profile.get("intent.flair.do_sampling", True)
|
||||
start_time = time.time()
|
||||
|
||||
if do_sampling:
|
||||
# Sample from each intent FST
|
||||
num_samples = int(self.profile.get("intent.flair.num_samples", 10000))
|
||||
intent_map_path = self.profile.read_path(
|
||||
self.profile.get("training.intent.intent_map", "intent_map.json")
|
||||
)
|
||||
|
||||
with open(intent_map_path, "r") as intent_map_file:
|
||||
intent_map = json.load(intent_map_file)
|
||||
|
||||
# Gather FSTs for all known intents
|
||||
fsts_dir = self.profile.write_dir(
|
||||
self.profile.get("speech_to_text.fsts_dir")
|
||||
)
|
||||
|
||||
intent_fst_paths = {
|
||||
intent_id: os.path.join(fsts_dir, f"{intent_id}.fst")
|
||||
for intent_id in intent_map
|
||||
}
|
||||
|
||||
# Generate samples
|
||||
self._logger.debug(
|
||||
"Generating %s sample(s) from %s intent(s)",
|
||||
num_samples,
|
||||
len(intent_fst_paths),
|
||||
)
|
||||
|
||||
sentences_by_intent = sample_sentences_by_intent(
|
||||
intent_fst_paths, num_samples
|
||||
)
|
||||
else:
|
||||
# Exhaustively generate all sentences
|
||||
self._logger.debug(
|
||||
"Generating all possible sentences (may take a long time)"
|
||||
)
|
||||
sentences_by_intent = make_sentences_by_intent(intent_fst)
|
||||
|
||||
sentence_time = time.time() - start_time
|
||||
self._logger.debug("Generated sentences in %s second(s)", sentence_time)
|
||||
|
||||
# Get least common multiple in order to balance sentences by intent
|
||||
lcm_sentences = lcm(*(len(sents) for sents in sentences_by_intent.values()))
|
||||
|
||||
# Generate examples
|
||||
class_sentences = []
|
||||
ner_sentences: Dict[str, List[Sentence]] = defaultdict(list)
|
||||
for intent_name, intent_sents in sentences_by_intent.items():
|
||||
num_repeats = max(1, lcm_sentences // len(intent_sents))
|
||||
for intent_sent in intent_sents:
|
||||
# Only train an intent classifier if there's more than one intent
|
||||
if len(sentences_by_intent) > 1:
|
||||
# Add balanced copies
|
||||
for _ in range(num_repeats):
|
||||
class_sent = Sentence(labels=[intent_name])
|
||||
for word in intent_sent["tokens"]:
|
||||
class_sent.add_token(Token(word))
|
||||
|
||||
class_sentences.append(class_sent)
|
||||
|
||||
if not intent_sent["entities"]:
|
||||
continue # no entities, no sequence tagger
|
||||
|
||||
# Named entity recognition (NER) example
|
||||
token_idx = 0
|
||||
entity_start = {ev["start"]: ev for ev in intent_sent["entities"]}
|
||||
entity_end = {ev["end"]: ev for ev in intent_sent["entities"]}
|
||||
entity = None
|
||||
|
||||
word_tags = []
|
||||
for word in intent_sent["tokens"]:
|
||||
# Determine tag label
|
||||
tag = "O" if not entity else f"I-{entity}"
|
||||
if token_idx in entity_start:
|
||||
entity = entity_start[token_idx]["entity"]
|
||||
tag = f"B-{entity}"
|
||||
|
||||
word_tags.append((word, tag))
|
||||
|
||||
# Advance the character offset past this word and the following space
|
||||
token_idx += len(word) + 1
|
||||
|
||||
if (token_idx - 1) in entity_end:
|
||||
entity = None
|
||||
|
||||
# Add balanced copies
|
||||
for _ in range(num_repeats):
|
||||
ner_sent = Sentence()
|
||||
for word, tag in word_tags:
|
||||
token = Token(word)
|
||||
token.add_tag("ner", tag)
|
||||
ner_sent.add_token(token)
|
||||
|
||||
ner_sentences[intent_name].append(ner_sent)
|
||||
|
||||
# Start training
|
||||
max_epochs = int(self.profile.get("intent.flair.max_epochs", 100))
|
||||
|
||||
# Load word embeddings
|
||||
self._logger.debug("Loading word embeddings from %s", cache_dir)
|
||||
word_embeddings = [
|
||||
FlairEmbeddings(os.path.join(cache_dir, "embeddings", e))
|
||||
for e in self.embeddings
|
||||
]
|
||||
|
||||
if class_sentences:
|
||||
self._logger.debug("Training intent classifier")
|
||||
|
||||
# Random 80/10/10 split
|
||||
class_train, class_dev, class_test = self._split_data(class_sentences)
|
||||
class_corpus = TaggedCorpus(class_train, class_dev, class_test)
|
||||
|
||||
# Intent classification
|
||||
doc_embeddings = DocumentRNNEmbeddings(
|
||||
word_embeddings,
|
||||
hidden_size=512,
|
||||
reproject_words=True,
|
||||
reproject_words_dimension=256,
|
||||
)
|
||||
|
||||
classifier = TextClassifier(
|
||||
doc_embeddings,
|
||||
label_dictionary=class_corpus.make_label_dictionary(),
|
||||
multi_label=False,
|
||||
)
|
||||
|
||||
self._logger.debug(
|
||||
"Intent classifier has %s example(s)", len(class_sentences)
|
||||
)
|
||||
trainer = ModelTrainer(classifier, class_corpus)
|
||||
trainer.train(class_data_dir, max_epochs=max_epochs)
|
||||
else:
|
||||
self._logger.info("Skipping intent classifier training")
|
||||
|
||||
if ner_sentences:
|
||||
self._logger.debug("Training %s NER sequence tagger(s)", len(ner_sentences))
|
||||
|
||||
# Named entity recognition
|
||||
stacked_embeddings = StackedEmbeddings(word_embeddings)
|
||||
|
||||
for intent_name, intent_ner_sents in ner_sentences.items():
|
||||
ner_train, ner_dev, ner_test = self._split_data(intent_ner_sents)
|
||||
ner_corpus = TaggedCorpus(ner_train, ner_dev, ner_test)
|
||||
|
||||
tagger = SequenceTagger(
|
||||
hidden_size=256,
|
||||
embeddings=stacked_embeddings,
|
||||
tag_dictionary=ner_corpus.make_tag_dictionary(tag_type="ner"),
|
||||
tag_type="ner",
|
||||
use_crf=True,
|
||||
)
|
||||
|
||||
ner_intent_dir = os.path.join(ner_data_dir, intent_name)
|
||||
os.makedirs(ner_intent_dir, exist_ok=True)
|
||||
|
||||
self._logger.debug(
|
||||
"NER tagger for %s has %s example(s)",
|
||||
intent_name,
|
||||
len(intent_ner_sents),
|
||||
)
|
||||
trainer = ModelTrainer(tagger, ner_corpus)
|
||||
trainer.train(ner_intent_dir, max_epochs=max_epochs)
|
||||
else:
|
||||
self._logger.info("Skipping NER sequence tagger training")
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
    def _split_data(self, data, split=0.1):
        """Randomly splits a data set into train, dev, and test sets"""

        random.shuffle(data)
        split_index = int(len(data) * split)

        # 1 - (2*split)
        train = data[(split_index * 2) :]

        # split
        dev = data[:split_index]

        # split
        test = data[split_index : (split_index * 2)]

        return train, dev, test
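The 80/10/10 split performed by `_split_data` can be tried in isolation. The following is a minimal sketch, not the project's code: `split_data` and its `seed` parameter are hypothetical names, and unlike the method above it shuffles a copy instead of the caller's list.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_data(
    data: Sequence[T], split: float = 0.1, seed: int = 0
) -> Tuple[List[T], List[T], List[T]]:
    """Return (train, dev, test); dev and test each get `split` of the data."""
    shuffled = list(data)  # copy so the caller's sequence is left untouched
    random.Random(seed).shuffle(shuffled)  # hypothetical fixed seed for repeatability
    split_index = int(len(shuffled) * split)

    dev = shuffled[:split_index]
    test = shuffled[split_index : split_index * 2]
    train = shuffled[split_index * 2 :]
    return train, dev, test


train, dev, test = split_data(range(100))
print(len(train), len(dev), len(test))
```

With very small data sets `split_index` becomes 0, so dev and test come back empty; a caller would probably want to check for that before training.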
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# Command-line Based Intent Trainer
|
||||
# -----------------------------------------------------------------------------
|
||||
@@ -726,10 +478,14 @@ class CommandIntentTrainer(RhasspyActor):
     self._logger.debug(self.command)

     # { intent: [ { 'text': ..., 'entities': { ... } }, ... ] }
-    sentences_by_intent: Dict[str, Any] = make_sentences_by_intent(intent_fst)
+    sentences_by_intent = make_sentences_by_intent(intent_fst)
+    json_sentences = {
+        intent: [r.asdict() for r in sentences_by_intent[intent]]
+        for intent in sentences_by_intent
+    }

     # JSON -> STDIN
-    json_input = json.dumps({sentences_by_intent}).encode()
+    json_input = json.dumps(json_sentences).encode()

     subprocess.run(self.command, input=json_input, check=True)
 except Exception:
|
||||
|
||||
+29
-17
@@ -257,7 +257,7 @@ def train_profile(profile_dir: Path, profile: Profile) -> Tuple[int, List[str]]:
 n = int(match.group(1))

 # 75 -> (seventy five):75!int
-number_text = num2words(n, lang=language).replace("-", " ").strip()
+number_text = re.sub(r"[-,]\s*", " ", num2words(n, lang=language)).strip()
 assert number_text, f"Empty num2words result for {n}"
 number_words = number_text.split()
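The effect of the regular expression in this hunk can be checked without installing `num2words`. This is a small sketch under that assumption: `normalize_number_text` is a hypothetical helper, and the input strings are hard-coded stand-ins for what `num2words` might return for English.

```python
import re


def normalize_number_text(number_text: str) -> str:
    """Replace hyphens and thousands-separator commas with single spaces."""
    return re.sub(r"[-,]\s*", " ", number_text).strip()


# Hard-coded strings stand in for num2words output.
print(normalize_number_text("seventy-five"))
print(normalize_number_text("one thousand, two hundred and thirty-four"))
```

Because the pattern also consumes whitespace after the separator, a comma followed by a space collapses to one space, so a later `split()` yields clean words with no stray commas attached.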
|
||||
|
||||
@@ -321,6 +321,19 @@ def train_profile(profile_dir: Path, profile: Profile) -> Tuple[int, List[str]]:
|
||||
def __setitem__(self, key, value):
|
||||
self.values[key] = value
|
||||
|
||||
# Determine whether word casing has to be fixed
|
||||
word_transform = None
|
||||
if word_casing == "upper":
|
||||
word_transform = str.upper
|
||||
elif word_casing == "lower":
|
||||
word_transform = str.lower
|
||||
|
||||
def fix_word_case(word):
|
||||
if isinstance(word, jsgf.Word):
|
||||
word.text = word_transform(word.text)
|
||||
|
||||
return word
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
def do_intents_to_graph(sentences, slot_names, replacements, targets):
|
||||
@@ -331,25 +344,11 @@ def train_profile(profile_dir: Path, profile: Profile) -> Tuple[int, List[str]]:
|
||||
for sentence in intent_sentences:
|
||||
jsgf.walk_expression(sentence, number_transform, replacements)
|
||||
|
||||
# Determine whether word casing has to be fixed
|
||||
transform = None
|
||||
if word_casing == "upper":
|
||||
transform = str.upper
|
||||
elif word_casing == "lower":
|
||||
transform = str.lower
|
||||
|
||||
if transform:
|
||||
|
||||
def fix_case(word):
|
||||
if isinstance(word, jsgf.Word):
|
||||
word.text = transform(word.text)
|
||||
|
||||
return word
|
||||
|
||||
if word_transform:
|
||||
# Fix casing
|
||||
for intent_sentences in sentences.values():
|
||||
for sentence in intent_sentences:
|
||||
jsgf.walk_expression(sentence, fix_case, replacements)
|
||||
jsgf.walk_expression(sentence, fix_word_case, replacements)
|
||||
|
||||
# Convert to directed graph
|
||||
graph = intents_to_graph(sentences, replacements)
|
||||
@@ -377,6 +376,7 @@ def train_profile(profile_dir: Path, profile: Profile) -> Tuple[int, List[str]]:
|
||||
slot_names.add(slot_name)
|
||||
|
||||
# Load slot values
|
||||
has_slot_program = False
|
||||
for slot_key in slot_names:
|
||||
slot_info = find_slot(slot_key)
|
||||
|
||||
@@ -388,9 +388,13 @@ def train_profile(profile_dir: Path, profile: Profile) -> Tuple[int, List[str]]:
|
||||
line = line.strip()
|
||||
if line:
|
||||
sentence = jsgf.Sentence.parse(line)
|
||||
if word_transform:
|
||||
jsgf.walk_expression(sentence, fix_word_case)
|
||||
|
||||
slot_values.append(sentence)
|
||||
elif isinstance(slot_info, SlotProgramInfo):
|
||||
# Program that will generate values
|
||||
has_slot_program = True
|
||||
slot_values = SlotProgram(slot_info.path, command_args=slot_info.args)
|
||||
|
||||
# Replace $slot with sentences
|
||||
@@ -408,6 +412,7 @@ def train_profile(profile_dir: Path, profile: Profile) -> Tuple[int, List[str]]:
|
||||
"file_dep": ini_paths + deps,
|
||||
"targets": [intent_graph],
|
||||
"actions": [(do_intents_to_graph, [sentences, slot_names, replacements])],
|
||||
"uptodate": [False if has_slot_program else None],
|
||||
}
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
@@ -521,6 +526,13 @@ def train_profile(profile_dir: Path, profile: Profile) -> Tuple[int, List[str]]:
|
||||
for word in read_dict(dict_file):
|
||||
print(word, file=vocab_file)
|
||||
|
||||
if profile.get("wake.system", "dummy") == "pocketsphinx":
|
||||
# Add words from Pocketsphinx wake keyphrase
|
||||
keyphrase = profile.get("wake.pocketsphinx.keyphrase", "")
|
||||
if keyphrase:
|
||||
for word in re.split(r"\s+", keyphrase):
|
||||
print(word, file=vocab_file)
|
||||
|
||||
@create_after(executed="language_model")
|
||||
def task_vocab():
|
||||
"""Writes all vocabulary words to a file from intent.fst."""
|
||||
|
||||
@@ -91,7 +91,7 @@ def make_dict(
 if (i < 1) or no_number:
     print(word, pronounce, file=dictionary_file)
 else:
-    print(f"{word, i + 1}({pronounce})", file=dictionary_file)
+    print(f"{word}({i + 1})", pronounce, file=dictionary_file)

 words_in_dict.add(word)
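The corrected line writes alternate pronunciations as `word(N) phonemes`. A minimal sketch of that formatting rule follows; `format_dict_entry` is a hypothetical helper that returns the line instead of printing it, and the phoneme strings are illustrative only.

```python
def format_dict_entry(
    word: str, pronounce: str, index: int, no_number: bool = False
) -> str:
    """Format one dictionary line; index is the 0-based pronunciation index."""
    if (index < 1) or no_number:
        # First pronunciation (or numbering disabled): plain "word phonemes"
        return f"{word} {pronounce}"

    # Later pronunciations get a 1-based counter in parentheses
    return f"{word}({index + 1}) {pronounce}"


for i, phonemes in enumerate(["R EH D", "R IY D"]):
    print(format_dict_entry("read", phonemes, i))
```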
|
||||
|
||||
|
||||
+4
-2
@@ -94,6 +94,7 @@ class EspeakSentenceSpeaker(RhasspyActor):
|
||||
self.disable_wake = True
|
||||
self.enable_wake = False
|
||||
self.wake: Optional[RhasspyActor] = None
|
||||
self.espeak_args: List[str] = []
|
||||
|
||||
def to_started(self, from_state: str) -> None:
|
||||
"""Transition to started state."""
|
||||
@@ -104,6 +105,7 @@ class EspeakSentenceSpeaker(RhasspyActor):
|
||||
self.wake = self.config.get("wake")
|
||||
self.wake_on_start = self.profile.get("rhasspy.listen_on_start", False)
|
||||
self.disable_wake = self.profile.get("text_to_speech.disable_wake", True)
|
||||
self.espeak_args = list(self.profile.get("text_to_speech.espeak.arguments", []))
|
||||
self.transition("ready")
|
||||
|
||||
def in_ready(self, message: Any, sender: RhasspyActor) -> None:
|
||||
@@ -143,7 +145,7 @@ class EspeakSentenceSpeaker(RhasspyActor):
 def speak(self, sentence: str, voice: Optional[str] = None) -> bytes:
     """Get WAV buffer for sentence."""
     try:
-        espeak_cmd = ["espeak"]
+        espeak_cmd = ["espeak"] + self.espeak_args
         if voice:
             espeak_cmd.extend(["-v", str(voice)])
|
||||
|
||||
@@ -896,7 +898,7 @@ class HomeAssistantSentenceSpeaker(RhasspyActor):
|
||||
|
||||
# Convert to WAV
|
||||
if audio_url.endswith(".mp3"):
|
||||
lame_command = ["lame", "--decode", "-", "-"]
|
||||
lame_command = ["lame", "--decode", "--mp3input", "-", "-"]
|
||||
self._logger.debug(lame_command)
|
||||
|
||||
return subprocess.run(
|
||||
|
||||
+1
-1
@@ -407,7 +407,7 @@ def numbers_to_words(sentence: str, language: Optional[str] = None) -> str:
     number = float(word)

     # 75 -> seventy-five -> seventy five
-    words[i] = num2words(number, lang=language).replace("-", " ")
+    words[i] = re.sub(r"[-,]\s*", " ", num2words(number, lang=language))
     changed = True
 except ValueError:
     pass  # not a number
|
||||
|
||||
+10
-7
@@ -227,18 +227,19 @@ class PocketsphinxWakeListener(RhasspyActor):
|
||||
self.keyphrase = self.profile.get("wake.pocketsphinx.keyphrase", "")
|
||||
assert self.keyphrase, "No wake keyphrase"
|
||||
|
||||
# Fix casing
|
||||
dict_casing = self.profile.get("speech_to_text.dictionary_casing", "")
|
||||
if dict_casing == "lower":
|
||||
self.keyphrase = self.keyphrase.lower()
|
||||
elif dict_casing == "upper":
|
||||
self.keyphrase = self.keyphrase.upper()
|
||||
|
||||
# Verify that keyphrase words are in dictionary
|
||||
keyphrase_words = re.split(r"\s+", self.keyphrase)
|
||||
with open(dict_path, "r") as dict_file:
|
||||
word_dict = read_dict(dict_file)
|
||||
|
||||
dict_upper = self.profile.get("speech_to_text.dictionary_upper", False)
|
||||
for word in keyphrase_words:
|
||||
if dict_upper:
|
||||
word = word.upper()
|
||||
else:
|
||||
word = word.lower()
|
||||
|
||||
if word not in word_dict:
|
||||
self._logger.warning("%s not in dictionary", word)
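The keyphrase handling in this hunk (fix the casing once, then check each word against the dictionary) can be sketched separately. The helper names below are hypothetical, and a plain set stands in for the result of `read_dict`.

```python
import re
from typing import Iterable, List


def fix_keyphrase_casing(keyphrase: str, dict_casing: str = "") -> str:
    """Match the keyphrase casing to the dictionary casing, if one is set."""
    if dict_casing == "lower":
        return keyphrase.lower()
    if dict_casing == "upper":
        return keyphrase.upper()
    return keyphrase  # no casing configured: leave as-is


def missing_keyphrase_words(keyphrase: str, known_words: Iterable[str]) -> List[str]:
    """Return keyphrase words that are absent from the dictionary."""
    known = set(known_words)
    return [w for w in re.split(r"\s+", keyphrase.strip()) if w and w not in known]


keyphrase = fix_keyphrase_casing("Okay Rhasspy", "lower")
print(keyphrase, missing_keyphrase_words(keyphrase, {"okay"}))
```

Normalizing the keyphrase up front means the dictionary check and the value used afterwards share the same casing, which seems to be the intent of the change.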
|
||||
|
||||
@@ -570,7 +571,9 @@ class PreciseWakeListener(RhasspyActor):
|
||||
self.prediction_sem = threading.Semaphore()
|
||||
for _ in range(num_chunks):
|
||||
chunk = self.audio_buffer[: self.chunk_size]
|
||||
self.stream.write(chunk)
|
||||
if chunk:
|
||||
self.stream.write(chunk)
|
||||
|
||||
self.audio_buffer = self.audio_buffer[self.chunk_size :]
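The added `if chunk:` guard skips writes once the buffer runs dry. A minimal sketch of the same loop outside the actor, assuming a hypothetical `write_chunks` function and a list-backed stand-in for the stream:

```python
from typing import List


class RecordingStream:
    """Hypothetical stand-in for the stream; records each write call."""

    def __init__(self) -> None:
        self.writes: List[bytes] = []

    def write(self, data: bytes) -> None:
        self.writes.append(data)


def write_chunks(
    audio_buffer: bytes, chunk_size: int, num_chunks: int, stream: RecordingStream
) -> bytes:
    """Write up to num_chunks chunks, skipping empty ones; return the leftover buffer."""
    for _ in range(num_chunks):
        chunk = audio_buffer[:chunk_size]
        if chunk:
            stream.write(chunk)  # only non-empty chunks reach the stream

        audio_buffer = audio_buffer[chunk_size:]

    return audio_buffer


stream = RecordingStream()
remaining = write_chunks(b"abcdef", chunk_size=4, num_chunks=3, stream=stream)
print(stream.writes, remaining)
```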
|
||||
|
||||
if self.send_not_detected:
|
||||
|
||||
+47
-2
@@ -3,7 +3,7 @@
|
||||
<!-- Top Bar -->
|
||||
<nav class="navbar navbar-expand-sm navbar-dark bg-dark fixed-top">
|
||||
<a href="/">
|
||||
<img class="navbar-brand" v-bind:class="spinnerClass" src="/img/logo.png">
|
||||
<img id="logo" class="navbar-brand" v-bind:class="spinnerClass" src="/img/logo.png">
|
||||
</a>
|
||||
|
||||
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
|
||||
@@ -119,6 +119,9 @@
|
||||
Rhasspy will not work correctly until these files are downloaded.
|
||||
</p>
|
||||
<tree-view :data="missingFiles" :options="{ rootObjectKey: 'missing'}"></tree-view>
|
||||
<br>
|
||||
<label for="downloadStatus">Status:</label>
|
||||
<textarea id="downloadStatus" v-model="this.downloadStatus" style="width: 100%;" rows="3"></textarea>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button type="button" class="btn btn-secondary" data-dismiss="modal">Cancel</button>
|
||||
@@ -186,7 +189,11 @@
|
||||
|
||||
missingFiles: {},
|
||||
|
||||
version: ''
|
||||
version: '',
|
||||
|
||||
downloadStatus: '',
|
||||
|
||||
wakeSocket: null
|
||||
}
|
||||
},
|
||||
|
||||
@@ -209,6 +216,9 @@
|
||||
this.hasAlert = true
|
||||
this.alertText = text
|
||||
this.alertClass = 'alert-' + level
|
||||
|
||||
// Hide alert after 20 seconds
|
||||
setTimeout(this.clearAlert, 20000)
|
||||
},
|
||||
|
||||
beginAsync: function() {
|
||||
@@ -334,6 +344,8 @@
|
||||
downloadProfile: function() {
|
||||
this.beginAsync()
|
||||
this.downloading = true
|
||||
this.downloadStatus = ''
|
||||
setTimeout(this.updateDownloadStatus, 1000)
|
||||
ProfileService.downloadProfile()
|
||||
.then(() => {
|
||||
alert("Download is complete. Rhasspy will now restart. Make sure to train before using your profile!")
|
||||
@@ -344,6 +356,38 @@
|
||||
this.downloading = false
|
||||
this.endAsync()
|
||||
})
|
||||
},
|
||||
|
||||
updateDownloadStatus: function() {
|
||||
ProfileService.downloadStatus()
|
||||
.then((request) => {
|
||||
this.downloadStatus = request.data
|
||||
})
|
||||
|
||||
if (this.downloading) {
|
||||
setTimeout(this.updateDownloadStatus, 1000)
|
||||
}
|
||||
},
|
||||
|
||||
connectWakeSocket: function() {
|
||||
// Connect to /api/events/wake websocket
|
||||
var wsProtocol = 'ws://'
|
||||
if (window.location.protocol == 'https:') {
|
||||
wsProtocol = 'wss://'
|
||||
}
|
||||
|
||||
var wsURL = wsProtocol + window.location.host + '/api/events/wake'
|
||||
this.wakeSocket = new WebSocket(wsURL)
|
||||
this.wakeSocket.onmessage = (evt) => {
|
||||
$('#logo').css('filter', 'invert()')
|
||||
setTimeout(() => {
|
||||
$('#logo').css('filter', 'initial')
|
||||
}, 2000)
|
||||
}
|
||||
this.wakeSocket.onclose = () => {
|
||||
// Try to reconnect
|
||||
setTimeout(this.connectWakeSocket, 1000)
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
@@ -355,6 +399,7 @@
|
||||
this.getCustomWords()
|
||||
this.getUnknownWords()
|
||||
this.getProblems()
|
||||
this.connectWakeSocket()
|
||||
this.$options.sockets.onmessage = function(event) {
|
||||
this.rhasspyLog = event.data + '\n' + this.rhasspyLog
|
||||
}
|
||||
|
||||
@@ -12,7 +12,7 @@
|
||||
<div class="col-auto">
|
||||
<button type="submit" class="btn btn-success"
|
||||
v-if="sentences"
|
||||
:disabled="sentences[newKey] || newKey.length == 0">Add File</button>
|
||||
:disabled="sentences[newKey] || newKey.length == 0">New File</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
@@ -20,6 +20,10 @@
|
||||
title="Record a voice command while held, interpret when released"
|
||||
:disabled="interpreting || (holdRecording && !tapRecording)">{{ tapRecording ? 'Tap to Stop' : 'Tap to Record' }}</button>
|
||||
</div>
|
||||
<div class="col-auto">
|
||||
<button type="button" class="btn btn-success" @click="this.playLastVoiceCommand"
|
||||
title="Play last voice command"><i class="fas fa-play"></i></button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
@@ -132,7 +136,9 @@
|
||||
audioContext: null,
|
||||
recorder: null,
|
||||
|
||||
sendHass: true
|
||||
sendHass: true,
|
||||
|
||||
intentSocket: null
|
||||
}
|
||||
},
|
||||
|
||||
@@ -267,7 +273,35 @@
|
||||
event.preventDefault()
|
||||
PronounceService.saySentence(this.sentence)
|
||||
.catch(err => this.$parent.error(err))
|
||||
},
|
||||
|
||||
playLastVoiceCommand: function(event) {
|
||||
TranscribeService.playRecording()
|
||||
.catch(err => this.$parent.error(err))
|
||||
},
|
||||
|
||||
connectIntentSocket: function() {
|
||||
// Connect to /api/events/intent websocket
|
||||
var wsProtocol = 'ws://'
|
||||
if (window.location.protocol == 'https:') {
|
||||
wsProtocol = 'wss://'
|
||||
}
|
||||
|
||||
var wsURL = wsProtocol + window.location.host + '/api/events/intent'
|
||||
this.intentSocket = new WebSocket(wsURL)
|
||||
this.intentSocket.onmessage = (evt) => {
|
||||
this.jsonSource = JSON.parse(evt.data)
|
||||
this.sentence = this.jsonSource.raw_text
|
||||
}
|
||||
this.intentSocket.onclose = () => {
|
||||
// Try to reconnect
|
||||
setTimeout(this.connectIntentSocket, 1000)
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
mounted: function() {
|
||||
this.connectIntentSocket()
|
||||
}
|
||||
}
|
||||
</script>
|
||||
|
||||
@@ -108,7 +108,6 @@
|
||||
},
|
||||
data: function () {
|
||||
return {
|
||||
device: '',
|
||||
speakers: {}
|
||||
}
|
||||
},
|
||||
@@ -124,20 +123,21 @@
|
||||
},
|
||||
|
||||
computed: {
|
||||
devicePath: function() {
|
||||
return 'sounds.' + this.profile.sounds.system + '.device'
|
||||
device: {
|
||||
get: function() {
|
||||
if(this.profile.sounds[this.profile.sounds.system]) {
|
||||
return this.profile.sounds[this.profile.sounds.system].device;
|
||||
}
|
||||
return "";
|
||||
},
|
||||
set: function(newValue) {
|
||||
this.profile.sounds[this.profile.sounds.system].device = newValue;
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
mounted: function() {
|
||||
this.getSpeakers()
|
||||
this.device = this._.get(this.profile, this.devicePath, '')
|
||||
},
|
||||
|
||||
watch: {
|
||||
device: function() {
|
||||
this._.set(this.profile, this.devicePath, this.device)
|
||||
}
|
||||
this.getSpeakers();
|
||||
}
|
||||
}
|
||||
</script>
|
||||
|
||||
@@ -173,7 +173,6 @@
|
||||
},
|
||||
data: function () {
|
||||
return {
|
||||
device: '',
|
||||
microphones: {},
|
||||
testing: false
|
||||
}
|
||||
@@ -217,20 +216,21 @@
|
||||
},
|
||||
|
||||
computed: {
|
||||
devicePath: function() {
|
||||
return 'microphone.' + this.profile.microphone.system + '.device'
|
||||
device: {
|
||||
get: function() {
|
||||
if(this.profile.microphone[this.profile.microphone.system]) {
|
||||
return this.profile.microphone[this.profile.microphone.system].device;
|
||||
}
|
||||
return "";
|
||||
},
|
||||
set: function(newValue) {
|
||||
this.profile.microphone[this.profile.microphone.system].device = newValue;
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
mounted: function() {
|
||||
this.getMicrophones()
|
||||
this.device = this._.get(this.profile, this.devicePath, '')
|
||||
},
|
||||
|
||||
watch: {
|
||||
device: function() {
|
||||
this._.set(this.profile, this.devicePath, this.device)
|
||||
}
|
||||
this.getMicrophones();
|
||||
}
|
||||
}
|
||||
</script>
|
||||
|
||||
@@ -70,5 +70,9 @@ export default {
|
||||
|
||||
return Api().post('/api/download-profile', '',
|
||||
{ 'params': params })
|
||||
},
|
||||
|
||||
downloadStatus() {
|
||||
return Api().get('/api/download-status')
|
||||
}
|
||||
}
|
||||
|
||||
@@ -37,6 +37,10 @@ export default {
|
||||
{ params: params })
|
||||
},
|
||||
|
||||
playRecording() {
|
||||
return Api().post('/api/play-recording', '')
|
||||
},
|
||||
|
||||
wakeup() {
|
||||
return Api().post('/api/listen-for-command')
|
||||
}
|
||||
|
||||