Loaded SVD model with only_unknowns=True, so need create_matrix(), but it finds 0 tuples #9

jmadison222 · 2014-12-11T20:36:43Z

When I run this code:

svd.recommend(x, n=3, is_row=False, only_unknowns=True)

I get this error:

ValueError: Matrix is empty! If you loaded an SVD model you can't use only_unknowns=True, unless svd.create_matrix() is called

And the message is right: I'm loading the model with this code:

svd = SVD(filename=sFileTarget) # Loading already computed SVD model

Which I had previously generated with this code:

svd = SVD()
svd.load_data(filename=sFileSource, sep=',', format=dictFormat)
k = 5 # Number of dimensions (latent factors) to keep
svd.compute(
    k=k, min_values=3, pre_normalize=None, 
    mean_center=False,
    post_normalize=True, savefile=sFileTarget
)

But when I use create_matrix(), I get this:

>>> svd = SVD(filename=sFileTarget) # Loading already computed SVD model
>>> svd.create_matrix()
Creating matrix (0 tuples)
Matrix density is: None%

And nothing works from there, of course.

What might the solution be?

Thanks!


ocelma · 2014-12-12T05:15:03Z

Once you've created the model and stored it on disk, you can load the SVD model again, but (unfortunately) this does not load the original input matrix, as that would take a long time.

The only way for you to do so would be to load the data again from the raw file. Something like:

svd = SVD(filename=sFileTarget) # Loading already computed SVD model
svd.load_data(filename=sFileSource, sep=',', format=dictFormat) # Re-load the matrix each time!
svd.create_matrix() # And (re)create it
(...)
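
The reload sequence above could be wrapped in a small helper so that the three steps always run together. This is only a sketch: `load_model_with_matrix` is a hypothetical name, not part of python-recsys, and the SVD class is passed in as an argument so the snippet relies only on the calls already shown in this thread (`SVD(filename=...)`, `load_data()`, `create_matrix()`).

```python
def load_model_with_matrix(svd_class, model_file, data_file, sep, data_format):
    """Load a saved SVD model and rebuild the input matrix it was computed from.

    Hypothetical helper, not part of python-recsys. `svd_class` is assumed to
    be recsys.algorithm.factorize.SVD, or any class offering the same calls.
    """
    # Loading the saved model restores the factorization, not the input matrix.
    svd = svd_class(filename=model_file)
    # Re-read the raw tuples from the original data file.
    svd.load_data(filename=data_file, sep=sep, format=data_format)
    # Rebuild the matrix that only_unknowns=True depends on.
    svd.create_matrix()
    return svd
```

With the names used in this thread, the call would look like `load_model_with_matrix(SVD, sFileTarget, sFileSource, ',', dictFormat)`.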

jmadison222 · 2014-12-15T13:40:57Z

Great, thanks for the reply. Makes sense. And that's still a big lift, since compute() is the more time-consuming step. But now I get this error:

/appl/build/anaconda/lib/python2.7/site-packages/csc_utils/ordered_set.pyc in __getitem__(self, index)
     50         else:
     51             # assume it's a fancy index list
---> 52             return OrderedSet([self.items[i] for i in index])
     53 
     54     def copy(self):

IndexError: list index out of range

Where the code I have is this:

from recsys.algorithm.factorize import SVD
svd = SVD(filename=sFileTarget) # Loading already computed SVD model
svd.load_data(filename=sFileSource, sep=',', format=dictFormat)
svd.create_matrix()
setCustIDs = set([x[2] for x in svd.get_data().get()[0:100]]) # Get some sample customer IDs
print(setCustIDs)
[(x, svd.recommend(x, n=3, is_row=False, only_unknowns=True)) for x in setCustIDs]

The last line works fine if I do the full compute. That is, this code works:

sFileSource = '/appl/cwa/data/cov_100k.out'
sFileTarget = '/appl/cwa/data/coverage.model'
# dictFormat = {'col':0, 'row':1, 'value':2, 'ids': int}
dictFormat = {'col':0, 'row':1, 'value':2}

import recsys.algorithm
recsys.algorithm.VERBOSE = True
from recsys.algorithm.factorize import SVD
svd = SVD()
svd.load_data(filename=sFileSource, sep=',', format=dictFormat)
k = 5 # Number of dimensions (latent factors) to keep
svd.compute(
    k=k, min_values=3, pre_normalize=None, 
    mean_center=False,
    post_normalize=True, savefile=sFileTarget
)
print(svd.get_data().get()[0:5]) # Look at data for sanity check.
setCustIDs = set([x[2] for x in svd.get_data().get()[0:100]]) # Get some sample customer IDs
print(setCustIDs) 
[(x, svd.recommend(x, n=3, is_row=False, only_unknowns=True)) for x in setCustIDs]

Both the working and the non-working code print the same customer IDs:

set([33, 195, 262, 198, 266, 285, 144, 254, 222, 215, 218, 123, 61, 126, 63])

Thoughts?
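
To see whether every customer ID triggers the IndexError or only some of them, the list comprehension could be replaced with a loop that catches the exception per ID. A minimal sketch follows; `split_by_index_error` is a hypothetical helper, not part of python-recsys, and it only separates the failing IDs from the working ones without saying anything about the underlying cause.

```python
def split_by_index_error(ids, recommend):
    """Call recommend(i) for each id and separate results from IndexError failures.

    Hypothetical debugging helper. `recommend` is any one-argument callable,
    for example a lambda wrapping svd.recommend(...).
    """
    ok = {}
    failed = []
    for i in ids:
        try:
            ok[i] = recommend(i)
        except IndexError:
            # Record the id so the failing cases can be inspected afterwards.
            failed.append(i)
    return ok, failed
```

With the names from this thread, it might be called as `ok, failed = split_by_index_error(setCustIDs, lambda x: svd.recommend(x, n=3, is_row=False, only_unknowns=True))`.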

jmadison222 changed the title ~~Loaded SVD model with only_unknowns=True, so need create_matrix(), but can't find it~~ Loaded SVD model with only_unknowns=True, so need create_matrix(), but it finds 0 tuples Dec 11, 2014
