IA-for-AI

IA-for-AI : Intelligence Augmentation for Artificial Intelligence

Emotions form an integral part of human interactions. The Intelligence Augmentation for AI Hackathon 2021 paves the way toward more empathetic AI systems by challenging participants to build systems that recognize emotions from audio. The best entry from our team, Prompt Engineers, is a system that leverages not only the audio features but also the semantics of the spoken words, fusing the two intertwined modalities to reach the runner-up position on the leaderboard with 61.38% accuracy. We further improve the latency of the approach by more than 42% via feature reuse, weight sharing, and multi-task learning, at the cost of only a 0.2% drop in accuracy.

Our best-performing model is a phono-linguistic model that leverages both the semantics of the spoken words and the speech features. We obtain speech features from HuBERT, a pretrained speech transformer model, and language features from BERT, a language model run over the transcribed speech. Classifying on BERT features over the transcribed speech alone achieves 55.77% accuracy, whereas classifying on only the HuBERT speech features yields 58.98% accuracy. Fusing the features from the two modalities gives the best performance, 61.38% accuracy.
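To make the fusion step concrete, here is a minimal PyTorch sketch. It assumes fusion by concatenating mean-pooled embeddings, which is one common choice and not necessarily the exact scheme used in our model. The class name `FusionClassifier`, the hidden sizes, and the number of emotion classes are hypothetical, and random tensors stand in for the HuBERT and BERT outputs.

```python
import torch
import torch.nn as nn


class FusionClassifier(nn.Module):
    """Hypothetical late-fusion head: concatenate pooled speech and text
    embeddings, then classify the emotion."""

    def __init__(self, speech_dim=768, text_dim=768, hidden_dim=256, num_emotions=4):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(speech_dim + text_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_emotions),
        )

    def forward(self, speech_feats, text_feats):
        # speech_feats: (batch, frames, speech_dim), e.g. speech encoder hidden states
        # text_feats:   (batch, tokens, text_dim), e.g. language model hidden states
        speech_vec = speech_feats.mean(dim=1)  # mean-pool over audio frames
        text_vec = text_feats.mean(dim=1)      # mean-pool over text tokens
        fused = torch.cat([speech_vec, text_vec], dim=-1)
        return self.classifier(fused)


# Random stand-ins for the two modalities (assumed shapes, not real features).
torch.manual_seed(0)
model = FusionClassifier().eval()
speech = torch.randn(2, 50, 768)
text = torch.randn(2, 12, 768)
with torch.no_grad():
    logits = model(speech, text)
print(logits.shape)
```

Dropping either input from the concatenation gives the corresponding single-modality classifier, which is how the speech-only and text-only baselines above can be compared against the fused model.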

We improve latency by training HuBERT in a multi-task fashion to produce both the audio features and the speech transcription (ASR). This results in 42% fewer model parameters with only a 0.2% drop in performance.
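The idea of sharing one speech encoder between the two tasks can be sketched as follows. This is an illustration under stated assumptions, not our training code: a small GRU stands in for HuBERT, and the class name `SharedEncoderMultiTask`, the head layout, and all dimensions are hypothetical.

```python
import torch
import torch.nn as nn


class SharedEncoderMultiTask(nn.Module):
    """Hypothetical multi-task model: one encoder, two task heads."""

    def __init__(self, feat_dim=80, hidden_dim=128, vocab_size=32, num_emotions=4):
        super().__init__()
        # Stand-in for the shared speech encoder (a GRU here, purely for illustration).
        self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # ASR head: per-frame token logits, as a CTC-style decoder would consume.
        self.asr_head = nn.Linear(hidden_dim, vocab_size)
        # Emotion head: one prediction per utterance.
        self.emotion_head = nn.Linear(hidden_dim, num_emotions)

    def forward(self, frames):
        hidden, _ = self.encoder(frames)  # encoder runs once, reused by both heads
        asr_logits = self.asr_head(hidden)
        emotion_logits = self.emotion_head(hidden.mean(dim=1))
        return asr_logits, emotion_logits


def count_params(module):
    """Total number of parameters in a module."""
    return sum(p.numel() for p in module.parameters())


torch.manual_seed(0)
model = SharedEncoderMultiTask().eval()
frames = torch.randn(2, 50, 80)  # (batch, frames, feat_dim), random stand-in input
with torch.no_grad():
    asr_logits, emotion_logits = model(frames)

# Two single-task models would each carry their own copy of the encoder.
separate_total = count_params(model) + count_params(model.encoder)
print(asr_logits.shape, emotion_logits.shape)
print(count_params(model), "shared vs", separate_total, "separate")
```

Because the encoder output is computed once and consumed by both heads, the second encoder pass and its weights are no longer needed, which is where the savings in parameters and latency come from.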

Approach

Qries

Dependencies:

We exported our conda environments for training the models and running the app.

Please note that our app environment (app_env) was run on a macOS 11.2 machine with an Intel processor, whereas training (train_env) was done on a Linux machine with NVIDIA GPUs. The same conda environments may not work on other machines. Instead, you may download the packages individually.

You may also download the following dependencies individually as an alternate means to create the environment:

Additional dependencies for running the webapp: streamlit, plotly

If you are training the AST model, then also download the following dependencies: matplotlib, numba, timm, zipp, wget, llvmlite
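If you install the packages individually, a quick check like the one below can confirm that the optional dependencies listed above are importable. It is a small helper sketch and not part of the repository; the function name `missing_packages` is hypothetical, and it assumes each package's import name matches the name listed above.

```python
from importlib.util import find_spec

# Package names taken from the dependency lists above.
WEBAPP_DEPS = ["streamlit", "plotly"]
AST_DEPS = ["matplotlib", "numba", "timm", "zipp", "wget", "llvmlite"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


print("Missing webapp dependencies:", missing_packages(WEBAPP_DEPS))
print("Missing AST training dependencies:", missing_packages(AST_DEPS))
```

An empty list means every package in that group was found in the active environment.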

Running the webapp

WebApp

Training model

Demo Video


Miscellaneous

Made with ❤️