MsgFiler is primarily an AppleScript Studio application. To perform the search on mailboxes, MsgFiler calls several home-grown PHP scripts. These scripts, however, don’t handle Unicode text all that well. As a result, when users search for mailboxes with accented characters, they invariably run into problems with MsgFiler.
I’ve added some custom subclasses and functions to the MsgFiler application that I can access from within the AppleScript Studio app. My goal is to rewrite the search algorithm using Cocoa/Objective-C. I’m eager to get some pointers on how to successfully search Unicode strings under Mac OS X. Any tips from the development community on where to start?
This probably isn’t much specific help, as you’ve probably already been through it, but PHP uses PCRE (Perl Compatible Regular Expressions) for regular expressions, which sounds like it supports Unicode fairly well (although I’ve never tested its Unicode support). Of course, PCRE can also be used in Cocoa through AGRegex (a wrapper for PCRE).
I also bumped into the Unicode Consortium’s document describing how to adapt regular expressions to handle Unicode, if that’s helpful.
HTH,
Morgan
Morgan:
I am using PHP’s preg_match_all function to perform the string matching. Perhaps my implementation of it needs reworking:
Here $mboxes is a string containing all of the mailboxes to search. PCRE patterns accept a /u modifier, but adding it to the pattern I pass to preg_match_all doesn’t seem to work. Any additional thoughts?
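For what it's worth, here is a minimal sketch of how a /u search could be set up. It is not MsgFiler's actual code: the function name findMailboxes is hypothetical, and it assumes $mboxes holds one mailbox name per line. Two things commonly trip up /u. First, PCRE refuses to match at all if the pattern or subject is not valid UTF-8. Second, if the mailbox names come from the Mac OS X file system they may use decomposed accents (e + combining accent) while the typed search text uses precomposed ones, so the sketch normalizes both sides when PHP's intl extension happens to be available. Whether either of these is the actual cause here is only a guess.

```php
<?php
// Hypothetical helper for illustration; not taken from MsgFiler.
// Assumes $mboxes contains one mailbox name per line.
function findMailboxes(string $needle, string $mboxes): array
{
    // With the /u modifier, PCRE requires valid UTF-8. Matching an empty
    // pattern with /u is a cheap validity check: it returns 1 for valid
    // input and false for invalid input.
    if (@preg_match('//u', $needle) !== 1 || @preg_match('//u', $mboxes) !== 1) {
        return [];
    }

    // Normalize both strings to NFC so that precomposed and decomposed
    // accents compare equal. Normalizer comes from the optional intl
    // extension, so this step is skipped when it is not installed.
    if (class_exists('Normalizer')) {
        $needle = Normalizer::normalize($needle, Normalizer::FORM_C);
        $mboxes = Normalizer::normalize($mboxes, Normalizer::FORM_C);
    }

    // i = case-insensitive, m = ^ and $ match at line boundaries,
    // u = treat pattern and subject as UTF-8.
    $pattern = '/^.*' . preg_quote($needle, '/') . '.*$/imu';

    $count = preg_match_all($pattern, $mboxes, $matches);

    // preg_match_all returns false on error and 0 when nothing matches.
    return $count ? $matches[0] : [];
}

// Example call: the third line spells "Café" with a combining accent.
$mboxes = "Inbox\nArchive/Café\nArchive/Cafe\u{0301} Notes\nWork";
print_r(findMailboxes('café', $mboxes));
```

If the validity check fails, the data is probably in some other encoding (MacRoman, for instance) and would need converting to UTF-8 before /u can do anything useful.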