Get started
564 commits | Last commit ≈ 17 months ago
Platalea contains deep neural network architectures for modeling spoken language from multiple sources at once: audio, and simultaneously images, text or video. This can be used to learn about what different context sources add to the language acquisition process.
An alternative approach for intelligent systems to understand human speech