Post by Andrew Cohen
CHARSET ISO-8859-1 BODY "diría" -> no matches
CHARSET UTF-8 BODY "diría" -> no matches
CHARSET ISO-8859-1 BODY "que" -> fine, lot of matches
CHARSET UTF-8 BODY "que" -> fine, lot of matches
Eric> Are you doing the above in a telnet session, or where? IMAP
Eric> uses a variant of UTF-7 as its internal coding, maybe try with
Eric> that and see what happens?
Eric> FWIW, it doesn't look like nnir/IMAP work for me here, either,
Eric> searching for Chinese characters.
Eric> Eric
This isn't currently supported in nnir directly, but can be done with a
raw imap query. As I recall the trick is you can't use a quoted string
but have to use a literal; literals need to include the number of octets
(including CR and LF) in braces just before the search term (this is
what I understand from reading the imap RFC).
Here's a patch that (mostly) makes this work. It's probably not ready
for application, but I'm fairly confident that the basic approach is
sound. I had to make changes to nnir-run-imap, because literals have to
be fed to the server on separate lines.
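
To make the literal framing concrete, here is a rough sketch in Python. It is not the patch itself (that is Emacs Lisp), just an illustration of what ends up on the wire. The function name build_literal_search and the tag A001 are made up for the example, and it assumes a server without LITERAL+, so the client has to wait for a "+" continuation before sending the second line.

```python
def build_literal_search(term, criterion="BODY", charset="UTF-8", tag="A001"):
    """Return the two lines a client would send for a SEARCH using a literal.

    Hypothetical helper for illustration only; the name and tag are assumptions.
    """
    payload = term.encode(charset)
    # The number in braces counts octets of the encoded term, not characters.
    first = (
        f"{tag} UID SEARCH CHARSET {charset} {criterion} {{{len(payload)}}}\r\n"
    ).encode("ascii")
    # This line is sent only after the server replies with a "+" continuation.
    second = payload + b"\r\n"
    return first, second


first, second = build_literal_search("diría")
print(first)
print(second)
```

Note that the same term gives a different octet count depending on the charset, which is why the count has to be computed after encoding.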
Things that bear examination:
1. Raw imap queries are still not touched. You couldn't put non-ascii
search terms in raw queries anyway, because, as mentioned above, they
have to be sent as separate lines to the server, and the current search
routine can't do that.
2. The 'coding' let-var that gets sent to the search command (the
argument to CHARSET) is a total guess on my part. It should be easy to
fix if my guess is wrong, though. Do we need to care about the coding
system used in the server's message storage?
3. I've only tested this with my local dovecot (2.2.13). Imap servers
are weird, and edge cases abound. It would be nice if someone could test
this directly with Gmail and Exchange, at least.
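
On point 2, as far as I can tell from RFC 3501 the CHARSET argument names the charset of the search strings themselves, not of the stored messages, and a server that does not support it is supposed to answer NO with a BADCHARSET response code. Under that reading, the choice could be sketched as below (Python, purely illustrative). The helper name plan_search_term and the decision to always fall back to UTF-8 are assumptions, not what the patch does.

```python
def plan_search_term(term):
    """Decide how a search term could be sent (hypothetical helper).

    Returns (charset, form, payload). charset is None when no CHARSET
    argument is needed; form is "quoted" or "literal".
    """
    try:
        payload = term.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII term: declare the charset of the search string
        # and send the encoded octets as a literal.
        return "UTF-8", "literal", term.encode("utf-8")
    # Quoted strings cannot contain CR or LF, so those need a literal too.
    if b"\r" in payload or b"\n" in payload:
        return None, "literal", payload
    # Inside a quoted string, backslash and double quote must be escaped.
    escaped = payload.replace(b"\\", b"\\\\").replace(b'"', b'\\"')
    return None, "quoted", b'"' + escaped + b'"'


print(plan_search_term("que"))
print(plan_search_term("diría"))
```

Plain ASCII terms keep going through the existing quoted-string path, so only non-ASCII searches would depend on the server's charset support.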
What are the future plans for nnir's mini-imap search language? Right
now it doesn't seem like a whole lot is gained by the
nnir-imap-search-arguments/default-search-key setup. It seems like it
would be simpler and more flexible to allow all valid imap search
criteria as lower-cased keys followed by a colon, which would make the
language look a little bit more like notmuch or what have you. Unknown
keys would be considered header values. Or something else like that --
mostly it just seems very limiting that we can specify at most one
criterion to search on.
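
A rough sketch of what such a "key:value" language might look like, written in Python only to show the idea. Everything here is hypothetical: the function name parse_query, the short list of recognised keys, and the choice of TEXT as the fallback for bare words are assumptions, and none of it reflects existing nnir code.

```python
import shlex

# A subset of IMAP search keys that take a single string argument.
STRING_KEYS = {"bcc", "body", "cc", "from", "subject", "text", "to"}


def parse_query(query, default_key="text"):
    """Translate 'key:value' tokens into a list of IMAP search criteria.

    Hypothetical sketch: unknown keys become HEADER searches, and bare
    words fall back to default_key. IMAP ANDs multiple criteria together.
    """
    criteria = []
    for token in shlex.split(query):
        key, sep, value = token.partition(":")
        if not sep:
            # No colon at all: treat the token as a plain search word.
            criteria.append((default_key.upper(), token))
        elif key in STRING_KEYS:
            criteria.append((key.upper(), value))
        else:
            # Unknown key: treat it as a header field name.
            criteria.append(("HEADER", key, value))
    return criteria


print(parse_query('from:eric subject:"imap search" x-mailer:gnus patch'))
```

This handles several criteria in one query, which is the main thing the current setup lacks. Bare words containing a colon (URLs, for instance) would be misread as header searches, so a real implementation would need some quoting rule for those.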
Anyway, just curious what the plan is.
Eric