Genre classification with Neural Networks

Neural networks are computing systems, inspired by biological neural networks, that learn to perform a task from examples. They improve at that task progressively, without being given any prior knowledge about it.

As our neural network model we used the multi-layer perceptron (MLPClassifier) from sklearn.
This model has many parameters, such as the solver, the activation function and the hidden layer sizes, and changing them can alter the results quite drastically.
As before, we used k-fold cross-validation to train and evaluate the model, with the same settings we used for logistic regression.

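from sklearn import model_selection
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier

# x: feature matrix, y: genre labels, n_folds: same value used for logistic regression
mlp = MLPClassifier(solver='lbfgs', hidden_layer_sizes=(13, 35, 35))
cv = KFold(n_splits=n_folds)
scores = model_selection.cross_val_score(mlp, x, y, cv=cv)    # accuracy of each fold
y_pred = model_selection.cross_val_predict(mlp, x, y, cv=cv)  # out-of-fold prediction per sample

accuracy = scores.mean()  # mean accuracy over the folds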
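
To try the cross-validation step without the audio features, the same calls can be run on synthetic data. This is only a minimal sketch: the data from make_classification, the value n_folds = 5, max_iter and random_state are assumptions made for the example and are not the settings used in our experiments.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_predict, cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the real features: 300 samples, 13 features, 3 classes.
x, y = make_classification(n_samples=300, n_features=13, n_informative=8,
                           n_classes=3, random_state=0)

n_folds = 5  # assumed value, chosen only for this example
mlp = MLPClassifier(solver='lbfgs', hidden_layer_sizes=(13, 35, 35),
                    max_iter=200, random_state=0)
cv = KFold(n_splits=n_folds)

# One accuracy value per fold.
scores = cross_val_score(mlp, x, y, cv=cv)
# One prediction per sample, each made by a model that did not see that sample.
y_pred = cross_val_predict(mlp, x, y, cv=cv)

accuracy = scores.mean()
print(f"mean accuracy over {n_folds} folds: {accuracy:.2f}")
```

Note that cross_val_score and cross_val_predict each refit the model on every fold, so the total running time is roughly twice that of a single cross-validation pass.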

Now that we have the model predictions, it is easy to compute precision, recall and other useful metrics, or to plot a confusion matrix to assess the quality of the predictions.

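from sklearn.metrics import classification_report, confusion_matrix

# per-genre precision, recall and f1-score
print(classification_report(y, y_pred))

# rows: true genres, columns: predicted genres
conf_matr = confusion_matrix(y, y_pred)
# plot conf_matr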
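
The relation between the confusion matrix and the figures in the classification report can be sketched with a small worked example. The 3-class matrix below is hypothetical and is not taken from our results; it only follows the sklearn convention that rows are true labels and columns are predicted labels.

```python
import numpy as np

# Hypothetical confusion matrix: rows are true labels, columns are predictions.
conf_matr = np.array([[9, 1, 0],
                      [2, 6, 2],
                      [1, 3, 6]])

true_positives = np.diag(conf_matr)

# Recall: correct predictions divided by all samples of that class (row sums).
recall = true_positives / conf_matr.sum(axis=1)
# Precision: correct predictions divided by all predictions of that class (column sums).
precision = true_positives / conf_matr.sum(axis=0)
# Accuracy: all correct predictions divided by all samples.
accuracy = true_positives.sum() / conf_matr.sum()

print("recall:   ", recall)
print("precision:", precision)
print("accuracy: ", accuracy)
```

Here the first class has recall 0.9 (9 of its 10 samples are found) but precision 0.75 (only 9 of the 12 samples predicted as that class are correct), which is the kind of imbalance a confusion matrix plot makes visible.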

Performance with FFT

With the FFT features, training takes about 20 seconds and the results are really poor. In this case the solver was 'adam', the activation function was relu, and there was a single hidden layer of 100 neurons. These parameters gave the best results over many tests.
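
One possible way to automate this kind of parameter comparison is a grid search with cross-validation. The sketch below is an assumption about how such tests could be organised, not a record of how ours were run: the synthetic data, the candidate grid, the 3 folds and max_iter are all chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the real features.
x, y = make_classification(n_samples=200, n_features=13, n_informative=8,
                           n_classes=3, random_state=0)

# Hypothetical grid: two solvers and two hidden layer layouts (4 candidates).
param_grid = {
    'solver': ['adam', 'lbfgs'],
    'hidden_layer_sizes': [(100,), (13, 35, 35)],
}

base_model = MLPClassifier(activation='relu', max_iter=300, random_state=0)
search = GridSearchCV(base_model, param_grid, cv=KFold(n_splits=3),
                      scoring='accuracy')
search.fit(x, y)

# Best parameter combination and its mean cross-validated accuracy.
print(search.best_params_)
print(f"best mean accuracy: {search.best_score_:.2f}")
```

The combination solver='adam', activation='relu' and hidden_layer_sizes=(100,) is also the default configuration of MLPClassifier in sklearn.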

Genre            Precision  Recall  F1-score
Blues                 0.35    0.30      0.32
Classical             0.45    0.35      0.40
Country               0.08    0.16      0.11
Disco                 0.07    0.07      0.07
Jazz                  0.26    0.33      0.29
Metal                 0.22    0.19      0.20
Pop                   0.13    0.10      0.11
Reggae                0.10    0.10      0.10
Rock                  0.14    0.08      0.10
Average / total       0.20    0.19      0.19

Performance with MFCC

With the MFCC features, training takes about 1.5 minutes but, as with the other classifiers, the results are better. Overall performance improves considerably, although some genres (Classical, Metal and Pop) are classified much better than others. Here we used 'lbfgs' as the solver, relu as the activation function, and 3 hidden layers with 13, 35 and 35 neurons respectively.
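
To make the architecture concrete, the sketch below fits a network with these hidden layers on random data and prints the shape of each weight matrix. The input width of 13 features and the random data are assumptions made for the example; the 9 output classes match the number of genres in the tables.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_features, n_genres = 13, 9               # assumed input width; 9 genres
x = rng.normal(size=(180, n_features))     # random stand-in for MFCC features
y = np.arange(180) % n_genres              # every genre appears 20 times

mlp = MLPClassifier(solver='lbfgs', activation='relu',
                    hidden_layer_sizes=(13, 35, 35),
                    max_iter=50, random_state=0)
mlp.fit(x, y)

# One weight matrix per pair of consecutive layers:
# input -> 13 -> 35 -> 35 -> output
layer_shapes = [w.shape for w in mlp.coefs_]
print(layer_shapes)
```

With these assumptions the network has four weight matrices, of shapes (13, 13), (13, 35), (35, 35) and (35, 9). The fit may emit a convergence warning because max_iter is kept small to make the example quick.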

Genre            Precision  Recall  F1-score
Blues                 0.35    0.27      0.31
Classical             0.76    0.75      0.75
Country               0.33    0.29      0.31
Disco                 0.31    0.34      0.33
Jazz                  0.23    0.20      0.21
Metal                 0.67    0.86      0.75
Pop                   0.66    0.77      0.71
Reggae                0.30    0.35      0.32
Rock                  0.14    0.11      0.12
Average / total       0.42    0.44      0.42
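
The "Average / total" row can be checked by hand. The sketch below takes the unweighted mean of each column of the MFCC table, which assumes that every genre has the same number of tracks (so that a support-weighted average and a plain mean coincide).

```python
# Per-genre (precision, recall, f1-score) values copied from the MFCC table.
mfcc_scores = {
    'Blues':     (0.35, 0.27, 0.31),
    'Classical': (0.76, 0.75, 0.75),
    'Country':   (0.33, 0.29, 0.31),
    'Disco':     (0.31, 0.34, 0.33),
    'Jazz':      (0.23, 0.20, 0.21),
    'Metal':     (0.67, 0.86, 0.75),
    'Pop':       (0.66, 0.77, 0.71),
    'Reggae':    (0.30, 0.35, 0.32),
    'Rock':      (0.14, 0.11, 0.12),
}

n_genres = len(mfcc_scores)

# Unweighted mean of each column, rounded to two decimals like the table.
averages = [
    round(sum(row[i] for row in mfcc_scores.values()) / n_genres, 2)
    for i in range(3)
]
print(averages)  # [precision, recall, f1-score]
```

Under that assumption the result is 0.42, 0.44 and 0.42, which agrees with the last row of the table.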
