Ticket #13 (new enhancement)

Opened 2 years ago

Last modified 2 years ago

More search functionality

Reported by: Randy Selzler <RSelzler@…> Owned by: sheep
Priority: Low Milestone:
Component: Hatta Wiki Version:
Keywords: search Cc:

Description

More search functionality would be useful.
No doubt you have your own wish list that's just waiting
for time and resources to be implemented.

  • Quoted strings: For example, a search for "foo bar" table might require an exact (or case-insensitive) match on "Foo bar". This could be used to find that particular table and/or references to it. Similarly, a search for "fast/car" example might find pages that link to "fast/car" and contain the word 'example'.
  • Refined searches: For example, the page containing search results might show hundreds of hits. If another search was initiated from that result page, the second search might be limited to within those results.
  • Search expressions: Search expressions that support grouping and boolean operators. fast AND sexy (car OR women) might be a popular search ;) Special keywords such as 'AND' and 'OR' might be more intuitive to non-geeks than '&' and '|'.

Change History

comment:1 Changed 2 years ago by sheep

The MoinMoin wiki engine has that, and I take care of several wikis using that engine -- real wikis used by real human communities. I have access to the logs from those wikis. In the five years they have existed, nobody has ever used a search more complicated than three words. Most of the searches are for a single word. The only searches that use regular expressions and/or any kind of meta-language are the canned searches saved on the pages as macros -- and the users never even see their query. This finding is consistent with the findings of other researchers on the topic, like Jakob Nielsen.

You can already narrow down your search results simply by adding the extra words you want to your old query. Hatta requires all the words to appear on the page to show it in the search results. I might make the old query re-appear in the search box on the results page, but I'm pretty sure that most of the time you'd want to search for something completely different when your first search doesn't give you what you want.
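A minimal sketch of why adding words narrows the results, assuming an index that maps each word to the pages it appears on along with a count. The names `build_index` and `search_all` and the sample pages are hypothetical and are not taken from Hatta's code:

```python
import re
from collections import Counter, defaultdict


def build_index(pages):
    """Map each lowercase word to {page title: occurrence count}."""
    index = defaultdict(dict)
    for title, text in pages.items():
        words = re.findall(r"\w+", text.lower())
        for word, count in Counter(words).items():
            index[word][title] = count
    return index


def search_all(index, query):
    """Return the titles of pages that contain every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], {}))
    for word in words[1:]:
        # Intersecting keeps only pages that also contain this word.
        results &= set(index.get(word, {}))
    return results


# Hypothetical sample pages, for illustration only.
pages = {
    "Cars": "fast car reviews and car prices",
    "Trains": "fast train timetables",
    "Garage": "car repair",
}
index = build_index(pages)
```

With these sample pages, `search_all(index, "fast")` matches both "Cars" and "Trains", while `search_all(index, "fast car")` matches only "Cars", since every extra word can only shrink the result set.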

The meta-language that you propose is simply not possible to implement with how Hatta does searching: Hatta only knows which words appear on which pages (and how many times), but has no idea of the order in which they appear. I may add some meta-language to the search, but it would be much simpler:

  • Enclose a (single) word in quotes to prevent matching as a part of a larger word. For example, if you search for cat, you will also see pages that have catapult in them. If you search for "cat", you won't see them. This may be useful when looking for short words.
  • Precede a word with a minus sign to exclude all pages that contain that word from search results.
  • Precede a page name with some special character (haven't decided yet) to only include pages that link to that page. I'm not entirely sure about this; I might have a special search box on the backlinks page instead, or just drop the idea completely.
  • If I ever add tags, you'd use a similar mechanism for searching for tags.
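The first two rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not Hatta's implementation: the index is assumed to be a plain mapping from word to a set of page titles, unquoted words are assumed to match as substrings of indexed words, and the names `parse_query`, `pages_matching` and `search` are hypothetical.

```python
def parse_query(query):
    """Split a query into (term, whole_word, excluded) triples."""
    terms = []
    for token in query.lower().split():
        excluded = token.startswith("-")
        if excluded:
            token = token[1:]
        whole = len(token) >= 2 and token[0] == '"' and token[-1] == '"'
        if whole:
            token = token[1:-1]
        if token:
            terms.append((token, whole, excluded))
    return terms


def pages_matching(index, term, whole_word):
    """Pages containing the term as a whole word or, if unquoted, as a part of any word."""
    if whole_word:
        return set(index.get(term, ()))
    found = set()
    for word, titles in index.items():
        if term in word:
            found |= set(titles)
    return found


def search(index, query):
    """All non-excluded terms must match; pages matching an excluded term are dropped."""
    required, banned = None, set()
    for term, whole, excluded in parse_query(query):
        hits = pages_matching(index, term, whole)
        if excluded:
            banned |= hits
        elif required is None:
            required = hits
        else:
            required &= hits
    return (required or set()) - banned


# Hypothetical index, for illustration only.
index = {
    "cat": {"Pets", "Siege"},
    "catapult": {"Siege", "Castles"},
    "dog": {"Pets"},
}
```

Here `search(index, "cat")` also returns "Castles" because of catapult, `search(index, '"cat"')` does not, and `search(index, '"cat" -dog')` additionally drops "Pets". Note that nothing in this sketch needs to know the order of words on a page.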

This is what Hatta's indexer allows, with some modifications. I'm not going to make it any more complicated or inefficient just to include features that are never used. And remember that each line of code added means extra maintenance costs and more opportunities for introducing bugs, even when working on some distant part of code.

Maybe when Hatta has plugins someone will write a plugin that integrates it with Xapian? You'd get everything you describe for free then.

Again, sorry for ranting, but it's too cold to code and I have to let off some steam ;) -- Radomir Dopieralski


Your post was more insightful than a rant (no worry there!)
and I mostly agree with its points (especially the
merits of simplicity, which is a big plus for Hatta).

I don't know enough about Hatta internals to restrict my
"new feature requests" to things that are easy and reasonable.
It wasn't meant to be a detailed, specific proposal, but I
did want to avoid vague generalities by using examples.
Take all of this as my guessing, as a newbie.

As you suggested:

  • Support for quotes and whole-word could help reduce false matches, especially for short words... Nice.
  • Excluding pages that contain '-' words would be useful... Nice.

Support for "pages that contain a link to..." may have
problems, if the page name contains a blank
(or other special character that implicitly delimits "words",
such as '/' as used in docs tree page names... my thingy).
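To illustrate the concern, here is a small sketch that assumes the search splits text into words on any non-word character. That splitting rule and the name `split_words` are assumptions made for the example, not a description of how Hatta actually tokenizes:

```python
import re


def split_words(text):
    """Split text into lowercase words on any non-word character (assumed rule)."""
    return re.findall(r"\w+", text.lower())


# A page name containing '/' and a blank does not survive as a single term.
parts = split_words("docs/fast car")
```

Under that assumption, the page name "docs/fast car" breaks into three separate words, so a "links to" operator would need some way to mark where the page name begins and ends.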

Question: You mentioned "add tags" (support?)... could you
describe that functionality or what you mean?

"Tags" seems similar to "backlinks" and I'm using backlinks
to implement "Categories" (mentioned somewhere else in our
past dialogue). Backlink tags have quickly become an
indispensable part of my Wiki (see my site http://72.192.119.192:8090/ for example
usage, "Topic" on the Menu).

How cold is it??? We had record cold last week here in
Oklahoma (0 degrees F, whooopie). I suspect you've got
me beat ;)

-- Randy

comment:2 Changed 2 years ago by sheep

  • Priority changed from Normal to Low