An expansion of the second task in that list.
Here’s something that has just caused me to run out of memory. I have been building my matrices for natural-language texts by reading adjacency information into a graph and then casting it as a matrix, but memory usage is getting out of control for larger data sets. If someone could show me a more efficient way to build these matrices, that would help.
Here is the code I use now:
import networkx as nx  # graph library holding the adjacency information

G = nx.Graph()
for line in file.readlines():
    sentence = line.strip().split(" ")
    for i, source in enumerate(sentence):
        for dest in sentence[i + 1:]:
            # print "Adding edge between source:", source, "and dest:", dest
            G.add_edge(source, dest)
C = to_numpy_matrix(G)
Where to_numpy_matrix is a function I found online which does the conversion.
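I suspect the dense conversion itself is what kills me: a dense matrix of 8-byte floats grows quadratically with vocabulary size. A back-of-the-envelope check (the function name is just mine for illustration):

```python
def dense_matrix_bytes(n_words, bytes_per_entry=8):
    """Rough memory needed for a dense n x n matrix of 8-byte floats."""
    return n_words * n_words * bytes_per_entry

# A 50,000-word vocabulary already needs about 20 GB:
gb = dense_matrix_bytes(50_000) / 1e9
```

So even a modest vocabulary blows past the RAM I have, regardless of how the graph itself is built.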
There must be better ways of doing this.
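For instance, I am wondering whether something like the following, which skips the intermediate graph and goes straight to a scipy sparse matrix, would avoid the dense allocation. This is only a sketch I have not tried in my pipeline (it counts repeated co-occurrences, where my current graph only records 0/1 adjacency):

```python
from collections import Counter
from itertools import combinations

from scipy.sparse import coo_matrix

def cooccurrence_matrix(sentences):
    """Build a sparse co-occurrence matrix plus a word -> row/column index."""
    index = {}          # word -> row/column number, assigned on first sight
    counts = Counter()  # (row, col) -> number of co-occurrences
    for sentence in sentences:
        ids = [index.setdefault(word, len(index)) for word in sentence]
        for a, b in combinations(ids, 2):
            counts[a, b] += 1
            counts[b, a] += 1  # keep the matrix symmetric
    keys = list(counts)
    rows = [r for r, _ in keys]
    cols = [c for _, c in keys]
    data = [counts[k] for k in keys]
    n = len(index)
    return coo_matrix((data, (rows, cols)), shape=(n, n)), index

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
m, index = cooccurrence_matrix(sentences)
```

Memory then scales with the number of distinct word pairs rather than with the square of the vocabulary, which seems much more reasonable for my data.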
I also want to keep a dictionary mapping words to matrix rows and columns, for interpretation later on. At the moment I build it from the graph before the conversion.
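In case it matters, this is roughly how I make that dictionary now. I am assuming the conversion keeps the graph's node ordering, which networkx's converter does when no explicit nodelist is passed (I use to_numpy_array here, the current name for the old to_numpy_matrix):

```python
import networkx as nx

G = nx.Graph()
G.add_edge("cat", "sat")
G.add_edge("sat", "mat")

# Row/column i of the converted matrix corresponds to the i-th node
# returned by G.nodes(), so record that ordering before converting.
index = {word: i for i, word in enumerate(G.nodes())}
C = nx.to_numpy_array(G)
```

Any solution that keeps this word-to-row mapping intact would be ideal.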