Models for MeMAD language identification pipeline

Description

A collection of models for the MeMAD spoken language identification pipeline. The zip archive contains four models: an x-vector embedding model trained on 67 languages using the lidbox toolkit; a scikit-learn StandardScaler for standardizing the embedding model output before Naïve Bayes classification; a probabilistic linear discriminant analysis (PLDA) model for reducing the dimensions of the embedding vectors; and a scikit-learn Naïve Bayes model for classifying embedding vectors into six categories: de, en, fi, fr, sv, x-nolang.
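The back-end stages described above (standardize, reduce dimensions, classify) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the released pipeline: the embeddings are synthetic random vectors rather than x-vector outputs, the 64-dimensional embedding size is hypothetical, scikit-learn's LinearDiscriminantAnalysis stands in for the PLDA model (scikit-learn has no PLDA), GaussianNB is assumed as the Naïve Bayes variant, and the stage order is a guess. Only the six category labels come from the description.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# The six output categories named in the description.
LABELS = ["de", "en", "fi", "fr", "sv", "x-nolang"]

rng = np.random.default_rng(0)
EMBED_DIM = 64     # hypothetical embedding size, not taken from the dataset
N_PER_CLASS = 50   # hypothetical number of synthetic utterances per label

# Synthetic stand-ins for x-vector embeddings: one Gaussian cluster per label.
centers = rng.normal(scale=5.0, size=(len(LABELS), EMBED_DIM))
X = np.vstack([c + rng.normal(size=(N_PER_CLASS, EMBED_DIM)) for c in centers])
y = np.repeat(LABELS, N_PER_CLASS)

# Assumed order: standardize, reduce dimensions, classify.
# LinearDiscriminantAnalysis is a stand-in for PLDA, not the same model.
backend = make_pipeline(
    StandardScaler(),
    LinearDiscriminantAnalysis(n_components=len(LABELS) - 1),
    GaussianNB(),
)
backend.fit(X, y)

predicted = backend.predict(X[:3])          # one label per embedding
posteriors = backend.predict_proba(X[:3])   # one probability per category
```

Each row of `posteriors` holds one probability per category, in the order given by the classifier's `classes_` attribute, so a downstream consumer could threshold on confidence instead of taking the top label.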

Publication year

2021

Type of data

Authors

Aalto University

Department of Signal Processing and Acoustics

Anja Virkkunen - Author

Matias Lindgren - Author

Zenodo - Publisher

Project

Other information

Fields of science

Computer and information sciences

Language

Availability

Open

License

Creative Commons Attribution 4.0 International (CC BY 4.0)

Keywords

Subject headings

Temporal coverage