Offline voice agent
Nick Thomas f8e1ddc5cb
Initial commit
2018-02-20 01:23:27 +00:00

Pardner

An offline voice agent designed to run on Linux desktops.

If I can, I'll make this a thin configuration around Mycroft. If I can't, I'll build it using lots of Mozilla goodness.

Targets

Here are a few sentences I'd like to work while sat at my computer.

  • "I'm starting work now"
  • "I've finished work now"
  • "Read me the news"
  • "Tell me the weather"
  • "What's the time?"
  • "Install the updates"
  • "Go to sleep"
  • "Wake up" (yeah, right :p)
  • "Shutdown"
  • "Yes officer" / "You are under arrest" -> rm -rf ~/secrets ;)
  • "This is police brutality" -> webcam-on
  • "I weigh 75kg today"
  • "I ran 5 miles today"
  • "I ate too much yesterday"
  • "What's my fortune?"
  • "Who's online?"
  • "Is X online?"
  • "Play some Enya"
  • "Play some HARD ROCK"
  • ...
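
One way to turn sentences like these into actions is a small table of patterns checked in order. The sketch below is only an illustration: the intent names, the `INTENTS` table and the `match_intent` helper are all made up here, and it assumes the recogniser hands over text roughly as written above (a real STT engine may well spell out numbers as words).

```python
import re

# Hypothetical intent table: (intent name, pattern). None of these names
# come from the project; they only illustrate the idea.
INTENTS = [
    ("start_work", re.compile(r"^i'm starting work now$")),
    ("finish_work", re.compile(r"^i've finished work now$")),
    ("tell_time", re.compile(r"^what's the time$")),
    ("log_weight", re.compile(r"^i weigh (?P<kg>\d+(?:\.\d+)?) ?kg today$")),
    ("log_run", re.compile(r"^i ran (?P<miles>\d+(?:\.\d+)?) miles? today$")),
    ("play_music", re.compile(r"^play some (?P<artist>.+)$")),
    ("is_online", re.compile(r"^is (?P<who>.+) online$")),
]


def normalise(utterance):
    """Lower-case, drop trailing punctuation and collapse whitespace."""
    text = utterance.strip().lower().rstrip("?!.")
    return re.sub(r"\s+", " ", text)


def match_intent(utterance):
    """Return (intent name, captured slots) for the first match, else None."""
    text = normalise(utterance)
    for name, pattern in INTENTS:
        match = pattern.match(text)
        if match:
            return name, match.groupdict()
    return None
```

With this table, `match_intent("I weigh 75kg today")` gives the `log_weight` intent with the weight captured as a string, and anything unrecognised gives `None`.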

A few work-related ideas

  • "Open a new issue"
  • "Close issue X"
  • (Other GL chat integrations)

Jasper's idea of notification modules is also interesting. It would be nice for the computer to talk to me when there's an aurora coming in, or disk space is getting low.
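
A notification module for the disk-space case could be as small as the sketch below. The function names and the 10% threshold are assumptions for illustration, not anything Jasper or this project defines; the returned sentence would be handed to whatever TTS ends up being used.

```python
import shutil


def low_space_message(path, total, free, min_free_fraction=0.10):
    """Return a sentence to speak if free space is low, else None."""
    free_fraction = free / total
    if free_fraction >= min_free_fraction:
        return None
    percent = round(free_fraction * 100)
    return f"Disk space is getting low on {path}: only {percent} percent free."


def check_disk(path="/"):
    """Ask the OS for the current usage of `path` and build the message."""
    usage = shutil.disk_usage(path)
    return low_space_message(path, usage.total, usage.free)
```

Keeping the message logic separate from the `shutil.disk_usage` call makes it easy to try out thresholds without filling a disk.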

It would be good to replace GNOME's notification sounds with TTS of the notification text.

Requirements

Pocketsphinx

  • sphinxbase
    • build-essential
    • swig
    • libpulse-dev
  • pocketsphinx
    • sphinxbase
    • libpulse-dev

Building

TODO:

First, run make deps. It will download the necessary dependencies into the deps/ folder, compile them, and install the output into target/usr. If you already have an up-to-date pocketsphinx installed (I'm targeting 5-prealpha for now), you can skip this step.
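
To decide whether that step can be skipped, something like the following sketch could look for an existing binary. The `find_pocketsphinx` helper is hypothetical, and it only checks that a binary exists, first under target/usr and then on the PATH; it does not check the version.

```python
import os
import shutil


def find_pocketsphinx(name="pocketsphinx_continuous", local_prefix="target/usr"):
    """Return the path of a usable binary, or None if none is found."""
    # Prefer the copy that `make deps` installs under target/usr.
    local = os.path.join(local_prefix, "bin", name)
    if os.path.isfile(local) and os.access(local, os.X_OK):
        return local
    # Otherwise fall back to whatever is installed system-wide.
    return shutil.which(name)
```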

Running

Nothing to run yet!

Playground

target/usr/bin/pocketsphinx_continuous -inmic yes 2>/dev/null

(We'll probably want to use the shared libraries directly, eventually)
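
Until then, the command above can be wrapped from Python. This is a rough sketch, assuming pocketsphinx_continuous prints one hypothesis per line on stdout and sends its logging to stderr (which is what the 2>/dev/null above suggests); the `listen` and `utterances` names are made up for illustration.

```python
import subprocess

# Same command as the playground line; the path assumes `make deps`
# installed into target/usr.
COMMAND = ["target/usr/bin/pocketsphinx_continuous", "-inmic", "yes"]


def utterances(lines):
    """Yield non-empty, stripped lines; each is treated as one hypothesis."""
    for line in lines:
        text = line.strip()
        if text:
            yield text


def listen(command=COMMAND):
    """Run the recogniser and yield hypotheses as they are printed."""
    with subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # same effect as 2>/dev/null
        text=True,
    ) as proc:
        yield from utterances(proc.stdout)
```

Looping over `listen()` then gives one string per recognised utterance, which could be fed to whatever does the sentence matching.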

Recognition is atrocious, for me at least.

Similar in concept to Jasper or Mycroft, except:

  • No support for online, third-party STT operators \o/ \o/ \o/ . o O ( what about accuracy? )
  • Simple to install and run on desktop \o/ \o/ \o/ . o O ( Why would you want to do that? )

Bibliography

Speech-to-text engines

  1. CMUSphinx
  2. DeepSpeech

Open datasets

  1. Common Voice
  2. Voxforge

Other FOSS voice agents

  1. Jasper Project
  2. Mycroft
  3. Simon