• Hi,

    I have a problem in the function ‘relevanssi_create_excerpt’ and I don’t know why.

    (FYI, I’m calling it in my own code to create an excerpt for some post metas.)

    Some of my text is returned with “?” characters in it. I was able to fix the problem by replacing

    $content = preg_replace('/\s+/', ' ', $content);

    by

    $content = preg_replace('/\s+/u', ' ', $content);

    What’s weird is that sometimes it works even without the /u modifier.

    I suspect a PHP bug here, but I’m not really sure, because it only happens some of the time, with the exact same text!

    As all my texts are Unicode (UTF-8) encoded, I can go with the /u modifier, but that doesn’t feel right to me when the behavior is so weird.
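One possible explanation for the intermittent behavior, offered as an assumption and not something confirmed in this thread: without /u, PCRE matches byte by byte, and depending on the locale in effect its character tables may count byte 0xA0 as whitespace. In UTF-8, 0xA0 is also the second byte of “à” (0xC3 0xA0), so the replacement can cut a character in half and leave a byte that displays as “?”. The sketch below simulates that case by adding \xA0 to the character class explicitly, then compares it with the /u version.

```php
<?php
// "voilà  tout" written with explicit bytes: "à" is 0xC3 0xA0 in UTF-8.
$content = "voil\xC3\xA0  tout";

// Simulates a PCRE table in which byte 0xA0 counts as whitespace.
// This is an assumption about the cause, not a confirmed diagnosis.
$byte_mode = preg_replace('/[\s\xA0]+/', ' ', $content);

// With /u the subject is read as UTF-8 code points, so "à" stays whole.
$utf8_mode = preg_replace('/\s+/u', ' ', $content);

// preg_match('//u', ...) returns 1 only when the subject is valid UTF-8.
$byte_mode_valid = preg_match('//u', $byte_mode) === 1;
$utf8_mode_valid = preg_match('//u', $utf8_mode) === 1;
```

If this is what happens, the result would depend on the server's locale settings and not on the text itself, which would fit the “sometimes it works” symptom.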

    My setup is WordPress 4.4.2 and Relevanssi 3.4.2

    PHP 5.5.12
    Apache 2.4.9
    MySQL 5.6.17

    https://www.ads-software.com/plugins/relevanssi/

Viewing 7 replies - 1 through 7 (of 7 total)
  • Thread Starter leup

    (@leup)

    In the same function, I have another problem.

    There is this line (around line 234):
    $term = " $term";

    I do understand that you are searching for whole words and not parts of words (not fuzzy), but this causes a problem for words that follow an apostrophe.

    Example: query => “afrique”. If the text is “L’afrique”, the excerpt will fail to find the term “ afrique”.

    Also, if I understand this correctly, if the function “mb_stripos” does not exist, you do:

    $titlecased = mb_strtoupper(mb_substr($term, 0, 1)) . mb_substr($term, 1);

    and as the term always starts with a blank space, the uppercasing is applied to that space, so the search never tries the term with an uppercase first letter.
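To make the uppercase issue concrete, here is a minimal sketch. The helper name `titlecase_keep_leading_space` is hypothetical and not part of Relevanssi; it only shows one way to uppercase the first real character while keeping the leading space. It assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper (not part of Relevanssi): uppercase the first
// non-space character and keep any leading whitespace untouched.
function titlecase_keep_leading_space($term) {
    $trimmed = ltrim($term);
    if ($trimmed === '') {
        return $term; // nothing to uppercase
    }
    // Leading whitespace is ASCII here, so byte functions are safe for it.
    $lead = substr($term, 0, strlen($term) - strlen($trimmed));
    return $lead
        . mb_strtoupper(mb_substr($trimmed, 0, 1, 'UTF-8'), 'UTF-8')
        . mb_substr($trimmed, 1, null, 'UTF-8');
}

$term = ' afrique';

// The line quoted above: it uppercases the space, so nothing changes.
$naive = mb_strtoupper(mb_substr($term, 0, 1)) . mb_substr($term, 1);

// The sketch: the space is kept and the "a" is uppercased.
$fixed = titlecase_keep_leading_space($term);
```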

    Plugin Author Mikko Saari

    (@msaari)

    Yes, that’s a bug with the first uppercase character. Also, adding the space is a bit complicated as well, as it makes sense in some situations and not so much in others.

    I think adding the /u modifier makes sense, since WP content is pretty much always UTF-8. I’ll have to see about the added space: something needs to be done with that, I’m just not quite sure what.

    In general the whole excerpt-building is far from being the most brilliant bit of programming in Relevanssi =)

    Thread Starter leup

    (@leup)

    Hi! Thanks for your answer!

    I removed the leading space character, as that suits my needs better, and added the /u modifier.

    I understand why you add the leading space, but indeed it is far from perfect in every case. Maybe using regular expressions would be better? Well, that would not give you the position of the occurrence in the text… complex indeed. I will check what solutions exist on the internet ^^
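On the position question: a regular expression can report where it matched if `preg_match` is called with `PREG_OFFSET_CAPTURE`, although the offset is in bytes and has to be converted for multibyte text. The sketch below is only one possible approach, not what Relevanssi does; `find_word_start` is a hypothetical name, and the lookbehind rule (no letter or digit right before the term) is an assumption about what “start of a word” should mean.

```php
<?php
// Hypothetical helper: character offset of $term in $text when the term
// is not glued to a preceding letter or digit, or -1 when nothing matches.
function find_word_start($text, $term) {
    // (?<![\p{L}\p{N}]) rejects matches preceded by a letter or a digit,
    // so an apostrophe, a space or the start of the text are all accepted.
    $pattern = '/(?<![\p{L}\p{N}])' . preg_quote($term, '/') . '/iu';
    if (preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE) !== 1) {
        return -1;
    }
    $byte_offset = $m[0][1]; // PCRE reports offsets in bytes
    // Convert the byte offset into a character offset.
    return mb_strlen(substr($text, 0, $byte_offset), 'UTF-8');
}

// "L’afrique du Sud" with the curly apostrophe written as explicit bytes.
$in_elision = find_word_start("L\xE2\x80\x99afrique du Sud", 'afrique');
$inside_word = find_word_start('panafrique', 'afrique');
```

Note that this only checks the left side of the term, so “afrique” would also match the start of “afriques”; whether that is wanted depends on how fuzzy the search is meant to be.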

    Thread Starter leup

    (@leup)

    I did a quick search

    Google

    I think these links could be useful

    Drupal 7

    Stackoverflow

    WordPress plugin for search excerpts

    Plugin Author Mikko Saari

    (@msaari)

    Thanks, those should help.

    Plugin Author Mikko Saari

    (@msaari)

    Leup, I’m working on a better excerpt-building mechanism. If you’re interested in testing it, please drop me an email at mikko @ mikkosaari.fi.

    Thread Starter leup

    (@leup)

    Hi Mikko,

    Sorry for the delay. It would definitely be interesting, but I don’t have much time right now to do any testing.

  • The topic ‘Problem with relevanssi_create_excerpt and unicode ?’ is closed to new replies.