Train All the Things: Version 0.1
Sat Mar 28, 2020

My first commit to on-air is dated March 3, 2020. In the weeks leading up to that commit I spent some time reading through the TF Lite documentation, playing with Cloudflare Workers K/V, and getting my first esp-idf setup squared away. After that it was off to the races. I outlined my original goal in the planning post, and I didn't quite get there. The project currently doesn't have voice activity detection (VAD) to handle the scenario where I forget to activate the display before starting a call or hangout. Additionally, I wasn't able to train a custom keyword, as highlighted in the custom model post. I was, however, able to get a functional implementation of the concept. I can hang the display up, and then in my lab with the ESP-EYE plugged in I can say the wake word "visual" followed by "on" or "off" to toggle the display status.

Voice Display Demo · Voice Display Demo Two

While it's not quite what I had planned, it's a foundation. I've got a lot more tools and knowledge under my belt. Round 2 will probably involve Skainet, just due to the limitations in voice data that's readily available. Keep an eye out for a couple more posts highlighting some bumps along the way and final takeaways.

The code, docs, images, etc. for the project can be found here, and I'll be posting any further updates to HackadayIO. For anybody who might be interested in building this, the instructions below provide a brief outline. Updated versions will be hosted in the repo. If you have any questions or ideas, reach out.

Required Hardware:

  1. ESP-EYE
  2. Optional ESP-EYE case
  3. PyPortal
  4. Optional PyPortal case
  5. Two 3.3v USB to outlet adapters and two USB to USB mini cables

OR

  1. Two 3.3v micro usb wall outlet chargers

Build Steps:

  1. Clone the on-air repo.

Cloudflare Worker:

  1. Set up Cloudflare DNS records for your domain and endpoint, or set up a new domain with Cloudflare if you don't have one to resolve the endpoint.
  2. Set up a Cloudflare Workers account with Workers K/V.
  3. Set up the Wrangler CLI tool.
  4. cd into the on-air/sighandler directory.
  5. Update [wrangler.toml](https://git.sr.ht/~n0mn0m/on-air/tree/master/sighandler/wrangler.toml) with your account and domain settings.
  6. Run wrangler preview
  7. Run wrangler publish
  8. Update the [Makefile](https://git.sr.ht/~n0mn0m/on-air/tree/master/sighandler/Makefile) with your domain and test calling the endpoint.
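
To make the role of the worker concrete, here is a minimal in-memory sketch of the status flow: the ESP-EYE writes "on" or "off", and the PyPortal reads the current value. It is a model under stated assumptions, not the code in sighandler. `StatusStore`, `STATUS_KEY`, `handle_request`, the single-key layout and the default of "off" are all hypothetical names and choices made for illustration.

```python
# Hypothetical model of a status endpoint backed by a key/value store.
# The real worker in sighandler/ may use different routes, keys and methods.

STATUS_KEY = "status"      # assumed key name
VALID_STATES = {"on", "off"}


class StatusStore:
    """Stands in for a Workers K/V namespace holding a single key."""

    def __init__(self):
        self._kv = {}

    def put(self, key, value):
        self._kv[key] = value

    def get(self, key, default=None):
        return self._kv.get(key, default)


def handle_request(store, method, body=None):
    """Return (http_status, text) the way a tiny status endpoint might."""
    if method == "GET":
        # The display polls this; assume "off" until something is written.
        return 200, store.get(STATUS_KEY, "off")
    if method == "POST":
        # The voice device writes the new state.
        if body not in VALID_STATES:
            return 400, "expected 'on' or 'off'"
        store.put(STATUS_KEY, body)
        return 200, body
    return 405, "method not allowed"


store = StatusStore()
print(handle_request(store, "GET"))          # before any write
print(handle_request(store, "POST", "on"))   # voice device turns the sign on
print(handle_request(store, "GET"))          # display sees the new state
```

Keeping the state in one key means the display only ever needs a single read per poll, which suits a small microcontroller client.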

PyPortal:

  1. Set up CircuitPython 5.x on the PyPortal. If you're new to CircuitPython you should read this first.

  2. Go to the directory where you cloned on-air.

  3. cd into display.

  4. Update [secrets.py](https://git.sr.ht/~n0mn0m/on-air/tree/master/display/secrets.py) with your wifi information and status URL endpoint.

  5. Copy code.py, secrets.py and the bitmap files in screens/ to the root of the PyPortal.

  6. The display is now good to go.
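
For readers who haven't seen one before, a secrets.py on a CircuitPython board is commonly a module exposing a single `secrets` dictionary, following the usual Adafruit convention. The sketch below shows that shape under assumptions: the `status_url` key and the `missing_keys` helper are hypothetical, so check the secrets.py in the repo for the key names that code.py actually reads.

```python
# secrets.py (sketch). Keep the real file out of version control.
secrets = {
    "ssid": "your-wifi-name",
    "password": "your-wifi-password",
    # Hypothetical key name; the repo's secrets.py defines the real one.
    "status_url": "https://example.com/status",
}


def missing_keys(cfg, required=("ssid", "password", "status_url")):
    """Return the required keys that are absent or empty in cfg."""
    return [key for key in required if not cfg.get(key)]


# Quick sanity check before copying the file to the board.
print(missing_keys(secrets))
```

Running a check like this on your workstation catches an empty or misspelled entry before the board fails to join the network.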

ESP-EYE:

  1. Set up [esp-idf](https://docs.espressif.com/projects/esp-idf/en/latest/esp32/get-started/) using the 4.1 release branch.

  2. Install espeak and sox.

  3. Set up a Python 3.7 virtual environment and install TensorFlow 1.15.

  4. cd into on-air/voice-assistant/train

  5. Run chmod +x orchestrate.sh and then ./orchestrate.sh

  6. Once training completes cd ../smalltalk

  7. Activate the esp-idf tooling so that $IDF_PATH is set correctly and all requirements are met.

  8. Run idf.py menuconfig and set your wifi settings.

  9. Update the URL in [toggle\_status.cc](https://git.sr.ht/~n0mn0m/on-air/tree/master/voice-assistant/smalltalk/main/http/toggle_status.cc). This should match the host and endpoint you deployed the Cloudflare worker to above.

  10. Run idf.py build

  11. Run idf.py --port \<device port\> flash monitor

  12. You should see the device start, attach to WiFi and begin listening for the wake word "visual" followed by "on" or "off".
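
The wake word followed by a command can be pictured as a two-state machine. The sketch below is an illustration in Python, not the firmware's logic, which is written in C++ in smalltalk and may add timeouts or score thresholds. `CommandListener` and `hear` are hypothetical names, and cancelling on any unrelated word is an assumption.

```python
class CommandListener:
    """Remembers whether the wake word was heard and maps the next command to a status."""

    WAKE_WORD = "visual"
    COMMANDS = {"on", "off"}

    def __init__(self):
        self.awake = False

    def hear(self, word):
        """Feed one recognized word; return 'on' or 'off' when a command completes, else None."""
        word = word.lower()
        if word == self.WAKE_WORD:
            self.awake = True
            return None
        if self.awake and word in self.COMMANDS:
            self.awake = False
            return word
        # Assumption: any other word cancels a pending wake word.
        self.awake = False
        return None


listener = CommandListener()
for heard in ["on", "visual", "off", "visual", "hello", "on"]:
    status = listener.hear(heard)
    if status is not None:
        print("display status ->", status)
```

In this run only the "visual", "off" pair produces a status change: the leading "on" arrives before any wake word, and "hello" cancels the second wake word before the final "on".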

