Jul 21, 2009

A Paper from ISMIR 2008

In a paper published at ISMIR 2008, the authors present their work on using NMF (Non-negative Matrix Factorization) to analyze semantic topics in song lyrics. In Section 3.4, they write:

"We decide to use NMF for automatic topic detection as it is a clustering technique that results in additive representation of items (e.g., song X is represented as 10% topic A, 30% topic B and 60% topic C), a property that distinguishes it from most other clustering techniques."
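To make the "additive representation" in the quote concrete, here is a minimal sketch in Python. It is not the paper's implementation: it assumes the standard multiplicative-update form of NMF for the Frobenius objective, and the names `nmf` and `topic_proportions` as well as the toy song-by-term matrix are made up for illustration.

```python
import numpy as np

def nmf(V, k, n_iter=300, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (songs x terms) as W @ H using
    multiplicative updates for the squared-error objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps   # song-to-topic weights
    H = rng.random((k, m)) + eps   # topic-to-term weights
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def topic_proportions(W, eps=1e-12):
    """Normalize each row of W so a song reads as 'x% topic A, y% topic B'."""
    return W / (W.sum(axis=1, keepdims=True) + eps)

# Hypothetical toy data: 4 songs, 6 terms, raw term counts.
V = np.array([
    [3, 2, 4, 0, 0, 1],
    [2, 3, 3, 0, 1, 0],
    [0, 0, 1, 4, 3, 2],
    [1, 0, 0, 3, 4, 3],
], dtype=float)

W, H = nmf(V, k=2)
P = topic_proportions(W)
for song, row in enumerate(P):
    print(song, " ".join(f"{100 * p:.0f}%" for p in row))
```

Each row of `P` is non-negative and sums to one, which is exactly the kind of "10% topic A, 30% topic B" reading the authors describe. A pLSA or LDA document-topic distribution has the same form.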

However, pLSA, LDA and Mix-Noisy-OR models, which fall among the "most other techniques", all have the "distinguishing property" stated by the authors. In addition, the equivalence between NMF and pLSA has been well studied in the literature.

The authors also claim that LSA cannot process large sparse matrices. However, LSA simply applies SVD to the term-document matrix (TDM), and there are many SVD algorithms that can decompose large sparse matrices.
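As a rough illustration of the last point, the sketch below runs a truncated SVD directly on a sparse matrix with SciPy's `scipy.sparse.linalg.svds`, which never forms a dense copy of the input. The random matrix stands in for a real TDM, and its size, its density and the choice of `k` are arbitrary assumptions made only for this example.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Hypothetical stand-in for a term-document matrix: 500 terms x 300 documents,
# about 2% of the entries non-zero, stored in compressed sparse row format.
tdm = sparse_random(500, 300, density=0.02, format="csr", random_state=0)

k = 10  # number of latent dimensions to keep (arbitrary for this sketch)
U, s, Vt = svds(tdm, k=k)

# Sort explicitly so the largest singular value comes first,
# rather than relying on the order in which svds returns them.
order = np.argsort(s)[::-1]
U, s, Vt = U[:, order], s[order], Vt[order, :]

# LSA-style document representation: one k-dimensional vector per document.
doc_vectors = Vt.T * s
print(doc_vectors.shape)
```

Only the `k` leading singular triplets are computed, so the cost depends mainly on the number of non-zero entries rather than on the full matrix dimensions.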
