
AVESA (Audio-Visual Event Sentiment Analysis)


Check Our Previous Work:

B. Karakaya, E.B. Boztepe, and B. Karasulu, “Development of a Deep Learning Based Model for Recognizing the Environmental Sounds in Videos,” in The SETSCI Conference Proceedings Book, vol. 5, no. 1, pp. 53-58, 2022.

Link

Backends (Used Frameworks and Tools)

| Framework & Tool | Used For |
| --- | --- |
| Keras | Deep learning API |
| Tensorflow-io & Opencv-Python | Extracting spectrograms and applying data augmentation techniques such as multiple masking, CLAHE, etc. |
| h5py | Saving the trained deep learning model |
| Numpy & Pandas & Matplotlib | General purpose |
| Moviepy & Pydub | Applying operations to frames and videos |
| NLTK & Zeyrek & Jellyfish | Computing the similarity score |
| NLTK & Spacy | Applying NLP techniques |
| Vosk | Speech-to-text transcription from the video |
| BERT | Pre-trained models for sentiment analysis |
| Transformers | Applying BERT models to the text for sentiment analysis |
| Librosa | Audio processing, Gaussian noise, and other data augmentation techniques |
| Gradio | Building the GUI ("Apache License 2.0") |
| Torch | Backend for the BERT and Vosk models |
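As a minimal illustration of the Gaussian-noise audio augmentation listed above, the following NumPy-only sketch adds white Gaussian noise to a waveform at a chosen signal-to-noise ratio. The function name and the SNR parameter are illustrative assumptions for this sketch, not AVESA's actual implementation:

```python
import numpy as np

def add_gaussian_noise(signal, snr_db=20.0, rng=None):
    """Add white Gaussian noise so the result has roughly `snr_db` dB SNR.

    Hypothetical helper for illustration; AVESA's own augmentation may differ.
    """
    rng = rng or np.random.default_rng(0)
    signal_power = np.mean(signal ** 2)
    # Scale the noise power to hit the requested signal-to-noise ratio
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Example: augment a 1-second 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
clean = np.sin(2 * np.pi * 440 * t)
noisy = add_gaussian_noise(clean, snr_db=20.0)
```

The augmented waveform can then be converted to a spectrogram (e.g. with Librosa's `melspectrogram`) before being fed to the model.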

Abid, A., Abdalla, A., Abid, A., Khan, D., Alfozan, A., and Zou, J., "Gradio: Hassle-Free Sharing and Testing of ML Models in the Wild," ICML HILL 2019, arXiv preprint arXiv:1906.02569, 2019.

arXiv

Methodological Flow Diagram of the AVESA System

How to Use AVESA:

Download

Known Issues and Fixing the Possible Errors

Some Notes:

Enjoy AVESA!

To learn more about the model, see the journal article about AVESA here: http://saucis.sakarya.edu.tr/tr/pub/issue/72246/1139765