• Hello,
    We have pages whose URLs contain non-English characters – Odia/Oriya (India) in this case. When we export or list the URLs using a plugin (export all urls or list urls) or a PHP script, the non-English characters are replaced with long strings of “%” signs and English characters. The same thing happens if we copy the URL of a displayed page and paste it somewhere else.
    For example: [https://odiabibhaba.in/ଆମକଥା/] becomes [https://odiabibhaba.in/%e0%ac%86%e0%ac%ae%e0%ac%95%e0%ac%a5%e0%ac%be/].
    Is there some way we can get the page URLs as they appear in their native form, i.e. including the non-English characters?
    Thanks,
    Nikhil


Viewing 1 replies (of 1 total)
  • Moderator bcworkz

    (@bcworkz)

    Running the encoded URL through the PHP function urldecode() will restore the proper Odia glyphs. How to do this on your page with a list of URLs depends on how the list is generated. Assuming your page is UTF-8 based, decode only the link text portion and leave the actual href attribute encoded. This is necessary for good security.
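
    As a rough illustration of the above, here is a minimal sketch that keeps the href encoded and decodes only the visible link text. The helper name readable_link() is hypothetical, and htmlspecialchars() is used only to keep the snippet self-contained; how you would wire this into your own list depends on how that list is generated.

```php
<?php
// Hypothetical helper (not part of WordPress or any plugin):
// builds an anchor whose href stays percent-encoded while the
// visible link text shows the decoded UTF-8 characters.
function readable_link(string $encoded_url): string {
    // Decode only the copy that will be shown as text.
    $text = urldecode($encoded_url);

    // Escape both values for HTML output; the href keeps its %xx form.
    return '<a href="' . htmlspecialchars($encoded_url, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</a>';
}

// The example URL from the question above.
$url = 'https://odiabibhaba.in/%e0%ac%86%e0%ac%ae%e0%ac%95%e0%ac%a5%e0%ac%be/';
echo readable_link($url), "\n";
```

    Note that urldecode() also turns a “+” into a space. If your URLs can contain a literal “+”, rawurldecode() leaves it alone and may be the safer choice for the path part.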

    It doesn’t really matter much, but to be completely pedantic, the characters a-z are properly referred to as Latin characters, not English. There are many languages that use these besides English ;). Furthermore, the characters after % in encoded URLs are not only Latin characters, they are specifically hexadecimal numerals: 0-9 and a-f only. Disregard if you wish, we know what you mean either way.
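
    To make the hexadecimal point concrete, here is a small sketch (assuming PHP 7 or later for the "\u{...}" escape) showing that each %xx group is simply one byte of the character's UTF-8 encoding written in hex. The variable name $letter is only for illustration.

```php
<?php
// First character of the example slug: U+0B06, ODIA LETTER AA.
$letter = "\u{0B06}";

// Its UTF-8 encoding is three bytes, shown here as hex digits.
echo bin2hex($letter), "\n";       // e0ac86

// Percent-encoding writes each of those bytes as % plus two hex digits.
// PHP emits uppercase hex; URLs treat %e0 and %E0 as equivalent.
echo rawurlencode($letter), "\n";  // %E0%AC%86
```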

  • The topic ‘Export urls with non-English characters’ is closed to new replies.