mexca

Malte Lüken

doi:10.5281/zenodo.6976414

Capture emotion expressions from video, audio, and text with a single pipeline.

586 commits, last commit ≈ 15 months ago, 37 stars, 7 forks

Description

mexca is an open-source Python package that aims to capture human emotion expressions from videos in a single pipeline. It implements the customizable yet easy-to-use Multimodal Emotion eXpression Capture Amsterdam (MEXCA) pipeline for extracting emotion expression features from videos. The package provides building blocks that extract features for individual modalities (facial expressions, voice, and dialogue/spoken text), and these blocks can be combined into a single pipeline that extracts features from all modalities at once. In addition to extracting features, mexca can identify the speakers shown in a video by clustering speaker and face representations, which lets users compare emotion expressions across speakers, time, and contexts.

The package contains five components that can be used to build the MEXCA pipeline:

FaceExtractor: Detects faces, encodes them into an embedding space, clusters the embeddings to link recurring faces, and extracts facial landmarks and action units.
SpeakerIdentifier: Performs speaker diarization, that is, detects speech and speech segments, encodes speakers into an embedding space, and clusters the embeddings. Attempts to answer the question “Who speaks when?”
VoiceExtractor: Extracts voice features, such as pitch, associated with emotion expressions.
AudioTranscriber: Transcribes detected speech segments to text.
SentimentExtractor: Predicts sentiment scores for the transcribed text.
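
The idea of clustering embeddings to link recurring faces or speakers can be illustrated with a small, self-contained sketch. This is not mexca's API or its actual clustering method: the function names (cosine_similarity, link_embeddings), the greedy threshold-based grouping, the threshold value, and the toy two-dimensional embeddings are all assumptions chosen for illustration. It assumes every embedding is a non-zero vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def link_embeddings(embeddings, threshold=0.8):
    """Greedily assign each embedding to the most similar existing cluster.

    Hypothetical helper: an embedding joins the cluster whose running-mean
    centroid is most similar to it, provided the similarity reaches
    `threshold`; otherwise it starts a new cluster.
    """
    centroids = []  # running mean embedding per cluster
    counts = []     # number of embeddings per cluster
    labels = []     # cluster label per input embedding
    for emb in embeddings:
        best_label, best_sim = None, threshold
        for label, centroid in enumerate(centroids):
            sim = cosine_similarity(emb, centroid)
            if sim >= best_sim:
                best_label, best_sim = label, sim
        if best_label is None:
            # No sufficiently similar cluster: start a new one.
            centroids.append(list(emb))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the running mean of the matched cluster.
            n = counts[best_label]
            centroids[best_label] = [
                (c * n + x) / (n + 1)
                for c, x in zip(centroids[best_label], emb)
            ]
            counts[best_label] = n + 1
            labels.append(best_label)
    return labels


# Toy embeddings for four detected faces (made-up values).
frames = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.0, 1.0]]
labels = link_embeddings(frames)
print(labels)  # [0, 1, 0, 1]
```

Faces that receive the same label are treated as the same person across frames, which is what makes comparisons across speakers and time possible. A production pipeline would typically use learned high-dimensional embeddings and a more robust clustering algorithm than this greedy pass.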