Platalea
Platalea contains deep neural network architectures for modeling spoken language from multiple sources at once: audio, and simultaneously images, text or video. This can be used to learn about what different context sources add to the language acquisition process.