This is the lassie Reference Manual, version 0.0.1, generated automatically by Declt version 4.0 beta 2 "William Riker" on Sun Sep 15 05:38:42 2024 GMT+0.
The main system appears first, followed by any subsystem dependency.
lassie
Library for Latent Semantic Indexing.
Gabor Melis
MIT
0.0.1
fsvd
(system).
package.lisp
(file).
indexer.lisp
(file).
normalizer.lisp
(file).
mapper.lisp
(file).
assemble.lisp
(file).
lsa.lisp
(file).
lsa-extra.lisp
(file).
Files are sorted by type and then listed depth-first from the systems' component trees.
lassie/lassie.asd
lassie/package.lisp
lassie/indexer.lisp
lassie/normalizer.lisp
lassie/mapper.lisp
lassie/assemble.lisp
lassie/lsa.lisp
lassie/lsa-extra.lisp
lassie/indexer.lisp
package.lisp
(file).
lassie
(system).
->index
(generic function).
<-index
(generic function).
make-counting-indexer
(function).
make-hashing-indexer
(function).
make-random-indexer
(function).
print-object
(method).
%make-hashing-indexer
(function).
%make-random-indexer
(function).
alist->hashing-indexer
(function).
copy-counting-indexer
(function).
copy-hashing-indexer
(function).
copy-random-indexer
(function).
counting-indexer
(structure).
counting-indexer-count
(reader).
(setf counting-indexer-count)
(writer).
counting-indexer-p
(function).
hashing-indexer
(structure).
hashing-indexer->alist
(function).
hashing-indexer-index->object
(reader).
(setf hashing-indexer-index->object)
(writer).
hashing-indexer-next-index
(reader).
(setf hashing-indexer-next-index)
(writer).
hashing-indexer-object->index
(reader).
(setf hashing-indexer-object->index)
(writer).
hashing-indexer-p
(function).
make-index-vector
(function).
random-indexer
(structure).
random-indexer-length
(reader).
(setf random-indexer-length)
(writer).
random-indexer-n
(reader).
(setf random-indexer-n)
(writer).
random-indexer-object->index
(reader).
(setf random-indexer-object->index)
(writer).
random-indexer-p
(function).
test-hashing-indexer
(function).
lassie/normalizer.lisp
indexer.lisp
(file).
lassie
(system).
column-power-normalizer
(class).
normalize-document-vector
(generic function).
normalize-matrix
(generic function).
normalize-term-vector
(generic function).
null-normalizer
(class).
pmi-normalizer
(class).
print-object
(method).
print-object
(method).
print-object
(method).
row-centering-normalizer
(class).
sign-normalizer
(class).
tf-idf-normalizer
(class).
class-counts
(reader method).
column-norm
(function).
document-class-fn
(reader method).
hash-table=
(function).
idfs
(reader method).
inverse-document-frequency
(function).
make-column-power-normalizer
(function).
make-null-normalizer
(function).
make-tf-idf-normalizer
(function).
n-columns
(function).
n-documents
(reader method).
n-documents-with-term
(function).
n-rows
(function).
norm
(function).
normalize-column
(function).
normalize-vector
(function).
power
(reader method).
row-average
(function).
row-averages
(reader method).
sum-column
(function).
sum-matrix
(function).
sum-row
(function).
sum-vector
(function).
term-counts
(reader method).
term-counts-per-class
(reader method).
term-frequency
(function).
term-total
(reader method).
term-total-per-class
(reader method).
test-pmi-normalizer
(function).
test-tf-idf-normalizer
(function).
lassie/mapper.lisp
normalizer.lisp
(file).
lassie
(system).
compose-mappers
(function).
concatente-mappers
(function).
curry-mapper
(function).
encode-mapper
(function).
make-encoded-term-document-lister
(function).
make-encoded-term-document-mapper
(function).
make-mapper
(function).
map-lines
(generic function).
null-mapper
(function).
test-compose-mappers
(function).
lassie/assemble.lisp
mapper.lisp
(file).
lassie
(system).
assemble-co-occurrence-matrix
(generic function).
assemble-document-vector
(generic function).
assemble-term-vector
(generic function).
lsa-assembler
(class).
print-object
(method).
print-object
(method).
ri-term-assembler
(class).
assemble-occurence-vector
(function).
incf-and-maybe-grow
(function).
make-lsa-assembler
(function).
make-ri-term-assembler
(function).
n-documents
(reader method).
n-terms
(reader method).
lassie/lsa.lisp
assemble.lisp
(file).
lassie
(system).
construct-document-vector
(function).
construct-term-vector
(function).
document->vector
(function).
document-features
(function).
document-indexer
(reader method).
document-mapper
(reader method).
document-vector-features
(function).
load-lsa
(function).
lsa
(function).
lsa
(class).
normalizer
(reader method).
save-lsa
(function).
svd
(reader method).
term->vector
(function).
term-features
(function).
term-indexer
(reader method).
term-mapper
(reader method).
term-vector-features
(function).
assembler
(reader method).
construct-lsa-vector
(function).
coordinate
(function).
extract-lsa-features
(function).
extract-svd-features
(function).
inner*
(function).
single-float-vector
(type).
lassie/lsa-extra.lisp
lsa.lisp
(file).
lassie
(system).
cosine-similarity
(function).
most-similar-documents
(function).
insert-into-sorted-vector
(function).
Packages are listed by definition order.
lassie
The core functionality of Lassie.
common-lisp
.
lassie.assembler
.
lassie.indexer
.
construct-document-vector
(function).
construct-term-vector
(function).
cosine-similarity
(function).
document->vector
(function).
document-features
(function).
document-indexer
(generic reader).
document-mapper
(generic reader).
document-vector-features
(function).
load-lsa
(function).
lsa
(function).
lsa
(class).
most-similar-documents
(function).
normalizer
(generic reader).
save-lsa
(function).
svd
(generic reader).
term->vector
(function).
term-features
(function).
term-indexer
(generic reader).
term-mapper
(generic reader).
term-vector-features
(function).
assembler
(generic reader).
compose-mappers
(function).
concatente-mappers
(function).
construct-lsa-vector
(function).
coordinate
(function).
curry-mapper
(function).
encode-mapper
(function).
extract-lsa-features
(function).
extract-svd-features
(function).
inner*
(function).
insert-into-sorted-vector
(function).
make-encoded-term-document-lister
(function).
make-encoded-term-document-mapper
(function).
make-mapper
(function).
map-lines
(generic function).
null-mapper
(function).
single-float-vector
(type).
test-compose-mappers
(function).
lassie.assembler
Different assemblers and normalizers that one can plug
into Lassie. Assemblers construct a co-occurrence matrix or document
vector from a corpus, and normalizers perform post-processing on
the assembled data. Normalizers can be printed and read readably.
common-lisp
.
assemble-co-occurrence-matrix
(generic function).
assemble-document-vector
(generic function).
assemble-term-vector
(generic function).
column-power-normalizer
(class).
lsa-assembler
(class).
normalize-document-vector
(generic function).
normalize-matrix
(generic function).
normalize-term-vector
(generic function).
null-normalizer
(class).
pmi-normalizer
(class).
ri-term-assembler
(class).
row-centering-normalizer
(class).
sign-normalizer
(class).
tf-idf-normalizer
(class).
assemble-occurence-vector
(function).
class-counts
(generic reader).
column-norm
(function).
document-class-fn
(generic reader).
hash-table=
(function).
idfs
(generic reader).
incf-and-maybe-grow
(function).
inverse-document-frequency
(function).
make-column-power-normalizer
(function).
make-lsa-assembler
(function).
make-null-normalizer
(function).
make-ri-term-assembler
(function).
make-tf-idf-normalizer
(function).
n-columns
(function).
n-documents
(generic reader).
n-documents-with-term
(function).
n-rows
(function).
n-terms
(generic reader).
norm
(function).
normalize-column
(function).
normalize-vector
(function).
power
(generic reader).
row-average
(function).
row-averages
(generic reader).
sum-column
(function).
sum-matrix
(function).
sum-row
(function).
sum-vector
(function).
term-counts
(generic reader).
term-counts-per-class
(generic reader).
term-frequency
(function).
term-total
(generic reader).
term-total-per-class
(generic reader).
test-pmi-normalizer
(function).
test-tf-idf-normalizer
(function).
lassie.indexer
Indexers provide a - sometimes reversible - mapping
from objects to indices. The word ‘index’ is used here in a very
general sense: random indexers, for instance, map to a set of indices.
Within Lassie they are used in conjunction with assemblers that know
how to change the co-occurrence matrix when encountering a given
index. They can be printed and read readably.
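As a rough illustration of the reversible mapping described above, here is a minimal sketch of a hash-table-backed indexer. The names SKETCH-INDEXER, SKETCH->INDEX and SKETCH<-INDEX are hypothetical and are not part of Lassie; the ALLOCATE-NEW-INDEX-P keyword only mirrors the name that appears in the ->INDEX lambda list, and its behavior here is an assumption.

```lisp
;;; Hypothetical sketch, not Lassie's implementation.
(defstruct sketch-indexer
  (object->index (make-hash-table :test #'equal))
  (index->object (make-hash-table))
  (next-index 0))

(defun sketch->index (indexer object &key allocate-new-index-p)
  "Return the index of OBJECT, or NIL if OBJECT is unknown and no new
index may be allocated (assumed semantics of ALLOCATE-NEW-INDEX-P)."
  (let ((table (sketch-indexer-object->index indexer)))
    (or (gethash object table)
        (when allocate-new-index-p
          (let ((index (sketch-indexer-next-index indexer)))
            ;; Record the mapping in both directions so it stays reversible.
            (setf (gethash object table) index
                  (gethash index (sketch-indexer-index->object indexer)) object)
            (incf (sketch-indexer-next-index indexer))
            index)))))

(defun sketch<-index (indexer index)
  "Return the object that was encoded to INDEX, or NIL if there is none."
  (values (gethash index (sketch-indexer-index->object indexer))))
```

A counting or random indexer would not keep the reverse table, which is why the mapping is only sometimes reversible.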
common-lisp
.
->index
(generic function).
<-index
(generic function).
make-counting-indexer
(function).
make-hashing-indexer
(function).
make-random-indexer
(function).
%make-hashing-indexer
(function).
%make-random-indexer
(function).
alist->hashing-indexer
(function).
copy-counting-indexer
(function).
copy-hashing-indexer
(function).
copy-random-indexer
(function).
counting-indexer
(structure).
counting-indexer-count
(reader).
(setf counting-indexer-count)
(writer).
counting-indexer-p
(function).
hashing-indexer
(structure).
hashing-indexer->alist
(function).
hashing-indexer-index->object
(reader).
(setf hashing-indexer-index->object)
(writer).
hashing-indexer-next-index
(reader).
(setf hashing-indexer-next-index)
(writer).
hashing-indexer-object->index
(reader).
(setf hashing-indexer-object->index)
(writer).
hashing-indexer-p
(function).
make-index-vector
(function).
random-indexer
(structure).
random-indexer-length
(reader).
(setf random-indexer-length)
(writer).
random-indexer-n
(reader).
(setf random-indexer-n)
(writer).
random-indexer-object->index
(reader).
(setf random-indexer-object->index)
(writer).
random-indexer-p
(function).
test-hashing-indexer
(function).
Definitions are sorted by export status, category, package, and then by lexicographic order.
Construct a document vector from FEATURES. Inverse of DOCUMENT-VECTOR-FEATURES.
Construct a term vector from FEATURES. Inverse of TERM-VECTOR-FEATURES.
Turn DOCUMENT into a document vector.
Convenience function that returns the features of DOCUMENT after turning it into a vector with LSA.
Return the feature vector for the document given by document VECTOR or INDEX.
Return the lsa loaded from FILENAME and SVD-FILENAME.
Perform LSA and return the lsa object that contains the SVD and
remembers the mappers, indexers, ASSEMBLER and NORMALIZER for easy
querying later by, for example, DOCUMENT-FEATURES.
This fat function assembles the co-occurrence matrix by iterating over
all terms by TERM-LISTER and all documents by DOCUMENT-LISTER (either
may be NIL). If DOCUMENT-LISTER is provided then DOCUMENT-MAPPER is
employed to iterate over the terms of each document. Similarly
TERM-MAPPER complements TERM-LISTER. TERM-INDEXER and DOCUMENT-INDEXER
provide a - sometimes invertible - mapping from terms/documents to
indices.
After the initial construction the mappers and indexers are stored in
the LSA instance because they are needed to assemble term/document
vectors later.
Finally the co-occurrence matrix is decomposed into singular vector
pairs that define the feature spaces.
SUPERVISOR is a FSVD supervisor on which FSVD:SUPERVISE-SVD is invoked to control iteration (see FSVD:SVD). The lsa instance being constructed is passed as the :LSA argument to allow inspecting, saving, etc.
Return a vector of index and similarity pairs of the - at most N - documents whose features are most similar to DOCUMENT-FEATURES according to the similarity MEASURE.
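The ranking described above can be sketched as follows. This is an assumption-laden illustration, not Lassie's code: SKETCH-COSINE and SKETCH-MOST-SIMILAR are hypothetical names, the candidates are passed as a plain list of feature vectors, and cosine similarity is used as the default measure only because the manual lists a COSINE-SIMILARITY function.

```lisp
;;; Hypothetical sketch, not Lassie's implementation.
(defun sketch-cosine (x y)
  "Cosine of the angle between the numeric vectors X and Y (0 if either is all zeros)."
  (let ((dot (reduce #'+ (map 'vector #'* x y)))
        (norm-x (sqrt (reduce #'+ (map 'vector #'* x x))))
        (norm-y (sqrt (reduce #'+ (map 'vector #'* y y)))))
    (if (or (zerop norm-x) (zerop norm-y))
        0
        (/ dot (* norm-x norm-y)))))

(defun sketch-most-similar (query candidates n &key (measure #'sketch-cosine))
  "Return a vector of (INDEX . SIMILARITY) pairs for the at most N
CANDIDATES most similar to QUERY, best match first."
  (let ((pairs (loop for features in candidates
                     for index from 0
                     collect (cons index (funcall measure query features)))))
    ;; PAIRS is freshly consed, so the destructive SORT is safe here.
    (setf pairs (sort pairs #'> :key #'cdr))
    (coerce (subseq pairs 0 (min n (length pairs))) 'vector)))
```

Sorting every candidate is the simplest approach; keeping a bounded sorted vector while scanning would avoid holding all pairs at once.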
Save LSA to FILENAME and its svd to SVD-FILENAME.
Turn TERM into a term vector.
Convenience function that returns the features of TERM after turning it into a vector with LSA.
Return the feature vector for the term given by term VECTOR or INDEX.
Return an index representing OBJECT.
Methods (specializer, then the rest of the lambda list):
random-indexer (object &key allocate-new-index-p)
hashing-indexer (object &key allocate-new-index-p)
counting-indexer (object &key allocate-new-index-p)
Return the object that is encoded to INDEX.
Methods (specializer, then the rest of the lambda list):
hashing-indexer (index)
counting-indexer (object)
Assemble MATRIX and remember how to perform the
same kind of activity on subsequent calls to ASSEMBLE-TERM-VECTOR and
ASSEMBLE-DOCUMENT-VECTOR.
Methods (specializer, then the rest of the lambda list):
ri-term-assembler (lister)
lsa-assembler (lister)
Iterate over terms of LISTER and assemble a
document vector in the same way as the matrix was assembled
previously.
Methods (specializer, then the rest of the lambda list):
ri-term-assembler (lister)
lsa-assembler (lister)
Iterate over documents of LISTER and assemble a
term vector in the same way as the matrix was assembled previously.
Methods (specializer, then the rest of the lambda list):
lsa-assembler (lister)
Return the normalized DOCUMENT-VECTOR. Possibly destructive.
Methods (specializer, then the rest of the lambda list):
column-power-normalizer (document-vector document)
row-centering-normalizer (document-vector document)
null-normalizer (document-vector document)
sign-normalizer (document-vector document)
pmi-normalizer (document-vector document)
tf-idf-normalizer (document-vector document)
Return the normalized MATRIX, possibly destructively,
and remember how to perform the same kind of normalization on subsequent
calls to NORMALIZE-TERM-VECTOR and NORMALIZE-DOCUMENT-VECTOR.
Methods (specializer, then the rest of the lambda list):
column-power-normalizer (matrix)
row-centering-normalizer (matrix)
null-normalizer (matrix)
sign-normalizer (matrix)
pmi-normalizer (matrix)
tf-idf-normalizer (matrix)
Return the normalized TERM-VECTOR. Possibly destructive.
PRINT-OBJECT methods (specializer, then the rest of the lambda list):
tf-idf-normalizer (stream)
lsa-assembler (stream)
column-power-normalizer (stream)
null-normalizer (stream)
ri-term-assembler (stream)
hashing-indexer (stream)
This is not much more than a convenience class that
remembers how the SVD was produced to be able to extract features
later, or just to know what a given row or column corresponds to.
A mapper over all documents in which a given term occurs.
:term-mapper
This slot is read-only.
A mapper over all terms that occur in a given document.
:document-mapper
This slot is read-only.
Term indexer.
:term-indexer
This slot is read-only.
Document indexer.
:document-indexer
This slot is read-only.
Turns co-occurrences into a matrix, term and document vectors.
:assembler
This slot is read-only.
Performs some last minute transformations on the assembled matrix.
:normalizer
This slot is read-only.
The standard assembler that adds ...
:n-documents
This slot is read-only.
(error "document-class-fn is required.")
:document-class-fn
This slot is read-only.
:term-total
This slot is read-only.
:term-counts
This slot is read-only.
:class-counts
This slot is read-only.
:term-counts-per-class
This slot is read-only.
:term-total-per-class
This slot is read-only.
:n-documents
This slot is read-only.
Terms are random indexed, documents are not.
This slot is read-only.
Return a vector of SIZE whose elements represent the frequency with which their indices were listed by LISTER.
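A minimal sketch of the counting described above, assuming that a lister is a function that accepts a callback and calls it once per listed index. That calling convention, the argument order, and the name SKETCH-OCCURRENCE-VECTOR are assumptions made for illustration; they are not taken from Lassie.

```lisp
;;; Hypothetical sketch, not Lassie's implementation.
(defun sketch-occurrence-vector (lister size)
  "Return a vector of SIZE counts; element I is the number of times
LISTER passed the index I to its callback."
  (let ((counts (make-array size :initial-element 0)))
    (funcall lister (lambda (index) (incf (aref counts index))))
    counts))

;; A lister that lists the indices 0, 2, 2 and 3.
(defun sketch-example-lister (fn)
  (dolist (index '(0 2 2 3))
    (funcall fn index)))
```

Calling (sketch-occurrence-vector #'sketch-example-lister 4) counts index 2 twice and index 1 not at all.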
Return a mapper that maps from the same set as the first of MAPPERS maps from and maps to what the last of MAPPERS maps to, composing them in a chain. If MAPPERS is NIL, #'FUNCALL, the identity mapper, is returned.
Return a mapper that is the concatenation of MAPPERS.
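To make the chaining concrete, here is a small sketch of mapper composition. It assumes a mapper is a function of two arguments, a callback and the object to map from, that calls the callback on every element the object maps to. The names SKETCH-COMPOSE-MAPPERS and SKETCH-SEQUENCE-MAPPER are hypothetical, and Lassie's mappers may accept more arguments than this.

```lisp
;;; Hypothetical sketch, not Lassie's implementation.
(defun sketch-compose-mappers (&rest mappers)
  "Chain MAPPERS left to right. With no mappers, return #'FUNCALL,
which calls the callback on the object itself (the identity mapper)."
  (if (endp mappers)
      #'funcall
      (let ((head (first mappers))
            (tail (apply #'sketch-compose-mappers (rest mappers))))
        (lambda (fn x)
          ;; Map X with HEAD and feed each result to the rest of the chain.
          (funcall head (lambda (y) (funcall tail fn y)) x)))))

(defun sketch-sequence-mapper (fn sequence)
  "A mapper from a sequence to its elements."
  (map nil fn sequence))
```

Composing SKETCH-SEQUENCE-MAPPER with itself yields a mapper from a sequence of sequences to the innermost elements, visited in order.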
Return the length of the projection of VECTOR to BASIS.
What makes a mapper is that its first parameter is a function that is somehow applied to arguments. Currying a mapper leaves the function parameter alone and curries the rest of the parameters.
Translate MAPPER by encoding its sole argument with ENCODER.
Insert ITEM into VECTOR while keeping it sorted by TEST. Extend the vector if needed while respecting MAX-LENGTH.
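One way such a bounded sorted insert could work is sketched below. It assumes VECTOR is adjustable and has a fill pointer, that TEST is a strict ordering predicate such as #'>, and that an item falling beyond MAX-LENGTH is silently dropped; the name SKETCH-INSERT-SORTED and the keyword argument style are hypothetical.

```lisp
;;; Hypothetical sketch, not Lassie's implementation.
(defun sketch-insert-sorted (item vector test &key max-length)
  "Insert ITEM into the adjustable, fill-pointered VECTOR, keeping it
sorted by TEST and never letting it grow beyond MAX-LENGTH (if given)."
  (let ((position (or (position-if (lambda (x) (funcall test item x)) vector)
                      (length vector))))
    (when (or (null max-length) (< position max-length))
      ;; Grow by one slot unless the vector is already full; when it is
      ;; full, the shift below overwrites (drops) the last element.
      (when (or (null max-length) (< (length vector) max-length))
        (vector-push-extend item vector))
      ;; Shift the tail one slot to the right, starting from the end.
      (loop for i from (1- (length vector)) above position
            do (setf (aref vector i) (aref vector (1- i))))
      (setf (aref vector position) item))
    vector))
```

With TEST #'> and :MAX-LENGTH N, repeated calls keep only the N largest items seen so far, which is the shape of bookkeeping a top-N similarity search needs.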
Relative importance of TERM across MATRIX.
Return a lister that maps to encoded terms and documents. If DOCUMENT-LISTER is not NIL get its documents, encode them and list their terms with DOCUMENT-MAPPER. Act similarly with TERM-LISTER and TERM-MAPPER.
Return a mapper that applies to a document and calls its function argument with two parameters: the encoded term and the encoded document.
Create a random index vector of LENGTH with N 1s and N -1s. It is stored as a sparse vector (only the indices of the non-zero elements, where the first N are +1 and the rest are -1).
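The sparse representation described above can be sketched like this. The sketch assumes that the 2N indices are distinct and drawn uniformly with RANDOM; both points, along with the names SKETCH-MAKE-INDEX-VECTOR and SKETCH-EXPAND-INDEX-VECTOR, are assumptions for illustration rather than facts about Lassie.

```lisp
;;; Hypothetical sketch, not Lassie's implementation.
(defun sketch-make-index-vector (length n)
  "Return a vector of 2N distinct random indices below LENGTH. The
first N stand for +1 entries, the remaining N for -1 entries."
  (assert (<= (* 2 n) length))
  (let ((indices (make-array (* 2 n)))
        (seen (make-hash-table)))
    (loop with filled = 0
          while (< filled (* 2 n))
          do (let ((candidate (random length)))
               ;; Retry on collisions so that all indices are distinct.
               (unless (gethash candidate seen)
                 (setf (gethash candidate seen) t
                       (aref indices filled) candidate)
                 (incf filled))))
    indices))

(defun sketch-expand-index-vector (indices length)
  "Expand the sparse form into a dense vector of 0, +1 and -1."
  (let ((dense (make-array length :initial-element 0))
        (n (floor (length indices) 2)))
    (loop for index across indices
          for k from 0
          do (setf (aref dense index) (if (< k n) 1 -1)))
    dense))
```

Storing only the non-zero positions keeps each term's random vector at 2N integers regardless of LENGTH.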
Return a mapper that maps from SEQUENCES to elements of SEQUENCES.
n
.
A normalized measure of how often TERM appears in DOCUMENT.
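The manual does not spell out the formulas behind the term frequency and inverse document frequency docstrings, so the following sketch uses the common textbook definitions as an assumption: tf is the count of the term divided by the total count in the document, and idf is the logarithm of the number of documents over the number of documents containing the term. It also assumes a dense 2D array with terms as rows and documents as columns. All function names are hypothetical.

```lisp
;;; Hypothetical sketch using textbook tf-idf, not Lassie's implementation.
(defun sketch-term-frequency (matrix term document)
  "Count of TERM in DOCUMENT divided by the total count of all terms in DOCUMENT."
  (let ((total (loop for row below (array-dimension matrix 0)
                     sum (aref matrix row document))))
    (if (zerop total)
        0
        (/ (aref matrix term document) total))))

(defun sketch-inverse-document-frequency (matrix term)
  "Log of the number of documents over the number of documents containing TERM."
  (let* ((n-documents (array-dimension matrix 1))
         (n-with-term (loop for column below n-documents
                            count (plusp (aref matrix term column)))))
    (if (zerop n-with-term)
        0
        (log (/ n-documents n-with-term)))))

(defun sketch-tf-idf (matrix term document)
  "Weight of TERM in DOCUMENT: term frequency times inverse document frequency."
  (* (sketch-term-frequency matrix term document)
     (sketch-inverse-document-frequency matrix term)))
```

Under these definitions a term that occurs in every document gets weight zero, since its inverse document frequency is the logarithm of 1.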
class-counts (pmi-normalizer): automatically generated reader method
document-class-fn (pmi-normalizer): automatically generated reader method
idfs (tf-idf-normalizer): The inverse document frequencies in the originally assembled matrix.
idfs
.
n-documents (lsa-assembler): automatically generated reader method
n-documents (pmi-normalizer): automatically generated reader method
n-terms (lsa-assembler): automatically generated reader method
power (column-power-normalizer): automatically generated reader method
row-averages (row-centering-normalizer): automatically generated reader method
term-counts (pmi-normalizer): automatically generated reader method
term-counts-per-class (pmi-normalizer): automatically generated reader method
term-total (pmi-normalizer): automatically generated reader method
term-total-per-class (pmi-normalizer): automatically generated reader method
Simply assigns a new index to every object.
structure-object
.
common-lisp
.
0