
I use Whoosh for full-text search, and I want to know how to get all the index data that has been added.

This is my main.py:

import cgi, os

from google.appengine.ext import webapp
from google.appengine.ext.webapp import template
from google.appengine.ext.webapp.util import run_wsgi_app

from whoosh import store
from whoosh.fields import Schema, STORED, ID, KEYWORD, TEXT
from whoosh.index import getdatastoreindex
from whoosh.qparser import QueryParser, MultifieldParser
import logging

SEARCHSCHEMA = Schema(content=TEXT(stored=True))

class BaseRequestHandler(webapp.RequestHandler):
  def render_template(self, filename, template_args=None):
    if not template_args:
      template_args = {}
    path = os.path.join(os.path.dirname(__file__), 'templates', filename)
    self.response.out.write(template.render(path, template_args))

class MainPage(BaseRequestHandler):
  def get(self):
    self.render_template('index.html')

class SearchPage(BaseRequestHandler):
  def get(self):  
    ix = getdatastoreindex("hello", schema=SEARCHSCHEMA)
    parser = QueryParser("content", schema = ix.schema)
    q = parser.parse(self.request.get('query'))
    results = ix.searcher().search(q)
    a=''
    for result in results:
      a+=('<blockquote>%s</blockquote>' %
                              cgi.escape(result['content']))
    all=ix.schema
    self.render_template('index.html',{'results':a,'all':all})

class Guestbook(BaseRequestHandler):
  def post(self):
    ix = getdatastoreindex("hello", schema=SEARCHSCHEMA)
    writer = ix.writer()
    writer.add_document(content=u"%s" %  self.request.get('content'))
    writer.commit()
    self.redirect('/')

application = webapp.WSGIApplication(
                                     [('/', MainPage),
                                      ('/search', SearchPage),
                                      ('/sign', Guestbook)],
                                     debug=True)

def main():
  run_wsgi_app(application)

if __name__ == "__main__":
  main()

And my index.html is:

<form action="/search" method="get">
 <div><input name="query" type="text" value=""><input type="submit" value="Search"></div>
</form>


<form action="/sign" method="post">
 <div><textarea name="content" rows="3" cols="60"></textarea></div>
 <div><input type="submit" value="Sign Guestbook"></div>
</form>

{{results}}

all data:


 {{all}}
{% for i in all%}
 {{i}}
{%endfor%}
Assem
zjm1126
  • You will probably get more, better, and quicker answers by asking the Whoosh mailing list at http://groups.google.com/group/whoosh. Also, are you sure this isn't covered in the docs? http://packages.python.org/Whoosh/ – Jason Hall Jun 23 '10 at 19:45
  • What are you actually trying to achieve? Trying to obtain _all_ the index data in a request is an odd thing to do. – Nick Johnson Jun 24 '10 at 08:25
  • The following link may help: http://stackoverflow.com/questions/2395675/whoosh-index-viewer – Steve Harrison Jul 19 '14 at 22:02

1 Answer


This solution was tested on Whoosh 2.7 but may work in earlier versions as well.

You can list all indexed documents (as their stored fields) with:

all_docs = ix.searcher().documents() 

In the template, you can iterate through them like:

{% for doc in all_docs %}
    {{ doc.content }} <!-- or any other doc.field that is stored in your schema -->
{% endfor %}
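If you would rather build the markup in the handler, as the question does for search results, the same loop can be written in Python. The sketch below is only an illustration: `render_all_docs` is a hypothetical helper, it assumes each item from `documents()` is a dict of stored fields, and it uses Python 3's `html.escape` where the question's Python 2 code uses `cgi.escape`. Plain dicts stand in for the index so the snippet runs without Whoosh.

```python
from html import escape  # Python 3; the question's Python 2 code uses cgi.escape


def render_all_docs(docs, field="content"):
    """Return one escaped <blockquote> per document that has `field` stored.

    `docs` is any iterable of dicts mapping field names to stored values,
    which is the shape ix.searcher().documents() is assumed to yield.
    (Hypothetical helper, not part of Whoosh.)
    """
    parts = []
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # this document has no stored value for the field
        parts.append("<blockquote>%s</blockquote>" % escape(value))
    return "".join(parts)


# Stand-in data; in the handler this would be ix.searcher().documents().
sample_docs = [{"content": "hello <b>world</b>"}, {"content": "second entry"}]
html_out = render_all_docs(sample_docs)
print(html_out)
```

In the handler, the returned string could be passed to `render_template` in place of `ix.schema`, which is what the question currently sends to the template as `all`.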
Assem