- exceptions.Exception(exceptions.BaseException)
  - FastSSException
- FastSSManager
class FastSSManager

A FastSSManager is associated with a fastss index (not to be confused
with the sqlite database indices that it also uses internally; see
create_index).
Example of usage:
>>> import fastss
>>> manager = fastss.FastSSManager('idx')
>>> manager.create_index(False)
# Insert lemmas in the index, using 20% of their length as the depth:
>>> manager.update_index([u'Mike', u'Johnny', u'johnny'], lambda s: int(len(s)*.2))
# Search:
>>> for match in manager.search('Mike', 2): print match
(u'Mike', 0)
>>> for match in manager.search('johnny', 1): print match
(u'johnny', 0)
(u'Johnny', 1)
>>> for match in manager.search('johnny', 1, nocase=True): print match
(u'Johnny', 0)
(u'johnny', 0)
>>> for match in manager.search('johny', 2, nocase=True): print match
(u'johnny', 1)
(u'Johnny', 1)

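The distances in the example output above can be checked by hand. The following is a minimal sketch, assuming the reported distance is the plain Levenshtein edit distance and that nocase=True behaves like lowercasing both strings before comparing; neither assumption is confirmed by this documentation. The edit_distance helper is hypothetical and is not part of the fastss module.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from '' to prefixes of b
    for i, ca in enumerate(a, 1):
        current = [i]  # distance from a[:i] to ''
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


# Case-sensitive: 'johnny' and 'Johnny' differ by one substitution.
case_sensitive = edit_distance(u'johnny', u'Johnny')
# Case-insensitive (assumed to mean lowercasing both sides).
case_insensitive = edit_distance(u'johnny'.lower(), u'Johnny'.lower())
# The misspelling 'johny' is one insertion away from 'johnny'.
misspelled = edit_distance(u'johny', u'johnny')
```

Under these assumptions the three values agree with the distances shown in the example: 1, 0 and 1 respectively.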
Methods defined here:
- __init__(self, filename)
- filename refers to the index to be used. If the file doesn't exist,
it is not created on the filesystem; only the filename is remembered,
so that the index proper can be created later with the create_index
function.
- accelerate_case(self)
- Accelerate case-sensitive searches.
- accelerate_deletes(self)
- Accelerate deletes from the index.
- accelerate_nocase(self)
- Accelerate case-insensitive searches.
- analyze(self)
- Reanalyze the index to enable faster searching. Calling this function
is only useful when a significant number (on the order of tens of
thousands) of lemmas have been added to and/or deleted from the index.
- create_index(self, include_user_field=False, accelerate_nocase=True, accelerate_deletes=True, accelerate_case=True)
- Create the internal structure of the index. For information on the
include_user_field parameter, see the update_index function. The final
three parameters specify whether to accelerate searches and deletions
by using sql indices. Each "acceleration" slightly reduces index
update performance. Note that accelerate_case and accelerate_nocase
should not both be disabled at the same time, as searches would then
be extremely slow.
- decelerate_case(self)
- Decelerate case-sensitive searches.
- decelerate_deletes(self)
- Decelerate deletes from the index.
- decelerate_nocase(self)
- Decelerate case-insensitive searches.
- delete_from_index(self, lemmas)
- Delete the contents of the lemmas sequence from the index.
- remove_index(self)
- Remove the index from the filesystem.
- search(self, query, max_distance, nocase=False, show_user_fields=False)
- Search the index for the query string, returning all lemmas whose
edit distance from the query is less than or equal to max_distance. If
nocase is True, the search is case-insensitive. If show_user_fields is
True, the user_field associated with each lemma is also returned.
- update_index(self, lemmas, depth_callback, user_fields=None)
- Update the index with the contents of the lemmas sequence, which
should be unicode strings. Please don't use non-unicode strings unless
you like subtle bugs.
depth_callback is a function that accepts a string as a parameter and
returns an integer. It is applied to each lemma to determine the depth
to which that lemma will be indexed.
If the index has been created with user fields (see the
include_user_field parameter of the create_index function), then
user_fields can optionally be a sequence of the same length as lemmas.
Each user_field is associated with its respective lemma and is
returned with that lemma when it is a search result.
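As an illustration of the depth_callback parameter of update_index, here is a minimal sketch of two callback factories. The names proportional_depth and fixed_depth, and the optional cap argument, are hypothetical helpers and not part of the fastss module; the only requirement taken from the documentation is that the callback accepts a string and returns an integer.

```python
def proportional_depth(fraction, cap=None):
    """Return a callback giving int(len(lemma) * fraction),
    optionally limited to at most `cap`."""
    def callback(lemma):
        depth = int(len(lemma) * fraction)
        if cap is not None:
            depth = min(depth, cap)
        return depth
    return callback


def fixed_depth(depth):
    """Return a callback that ignores the lemma and always gives `depth`."""
    return lambda lemma: depth


# Same rule as the lambda in the usage example: 20% of the lemma length.
twenty_percent = proportional_depth(0.2)
depths = [twenty_percent(lemma) for lemma in (u'Mike', u'Johnny')]
```

With the 20% rule, a four-letter lemma such as u'Mike' gets depth 0 and a six-letter lemma such as u'Johnny' gets depth 1, because int() truncates toward zero. Either callback could be passed as the second argument to update_index.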
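The accelerate_* and decelerate_* methods are described above as adding or removing sql indices. The sketch below shows what that trade-off could look like using the standard sqlite3 module. The table name, column names, index names and helper functions are all assumptions made for illustration; the actual schema used by fastss is not documented here.

```python
import sqlite3

# Hypothetical schema: one row per (lookup key, lemma) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (key TEXT, lemma TEXT)")


def add_case_index(conn):
    # Default (binary) collation: speeds up case-sensitive lookups on key.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_case ON entries (key)")


def add_nocase_index(conn):
    # NOCASE collation: speeds up case-insensitive lookups on key.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_nocase "
        "ON entries (key COLLATE NOCASE)")


def drop_index(conn, index_name):
    # index_name must come from trusted code: SQL identifiers
    # cannot be passed as bound parameters.
    conn.execute("DROP INDEX IF EXISTS " + index_name)


def index_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index'")
    return sorted(name for (name,) in rows)


add_case_index(conn)
add_nocase_index(conn)
after_accelerate = index_names(conn)

drop_index(conn, "idx_case")
after_decelerate = index_names(conn)
```

Every index has to be maintained on each insert and delete, which is consistent with the note in create_index that each acceleration slightly reduces update performance.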