Mini-Symposium Topic: Solving Some Large Matrix Application Problems

Organizer: Qiang Ye (University of Manitoba)

Large-scale SVD and Information Retrieval

O. Marques (1), H. Simon (1) and H. Zha (2)

(1) Lawrence Berkeley National Laboratory, Berkeley, California, U.S.A.

(2) Pennsylvania State University, University Park, Pennsylvania, U.S.A.

We present a theoretical foundation based on subspaces for latent semantic indexing (LSI) in information retrieval. We show that our model leads to a low-rank-plus-shift structure that is approximately satisfied by the cross-product of the term-document matrices. This structure can be exploited in the computation of the partial singular value decomposition (SVD) of the large sparse term-document matrices used in LSI. We also discuss several parallel implementation issues and present empirical numerical results on a Cray T3E using text collections with millions of documents.
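
For readers unfamiliar with the partial SVD used in LSI, the following is a minimal sketch in Python, not the authors' parallel implementation. It assumes SciPy's ARPACK-based `svds` routine as a stand-in solver, and the toy matrix, its dimensions, the rank `k = 6`, and the helper names `partial_svd` and `project_query` are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds


def partial_svd(A, k):
    """Return the k largest singular triplets of sparse A, largest first."""
    U, s, Vt = svds(A, k=k)
    # Sort explicitly, since the order returned by svds should not be relied on.
    order = np.argsort(s)[::-1]
    return U[:, order], s[order], Vt[order, :]


def project_query(q, U, s):
    """Map a term-space vector q to k-dimensional LSI coordinates."""
    return (U.T @ q) / s


# Hypothetical toy term-document matrix: 200 terms x 120 documents, 5% nonzero.
A = sparse_random(200, 120, density=0.05, format="csr", random_state=0)
U, s, Vt = partial_svd(A, k=6)

# Treat the first document as a query; its coordinates should match
# the first column of Vt, because U^T A = diag(s) V^T on these triplets.
q = A[:, 0].toarray().ravel()
coords = project_query(q, U, s)
```

Only the `k` leading singular triplets are computed, and the solver needs only matrix-vector products with `A` and its transpose, which is what makes the approach feasible for large sparse matrices.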


Saturday, 5:00 p.m. - 5:30 p.m. Room 1520