• ivanblago1000

    (@ivanblago1000)


    Dear all,

    As already known, Cyrillic links are seen in WordPress as:

    https://cir.filmskasecanja.com/%d1%82%d0%b5%d1%81%d1%82%d0%b8%d1%80%d0%b0%d1%9a%d0%b5-%d0%b8%d1%81%d0%bf%d1%80%d0%b0%d0%b2%d0%bd%d0%b5-%d1%82%d1%80%d0%b0%d0%bd%d1%81%d0%bb%d0%b8%d1%82%d0%b5%d1%80%d0%b0%d1%86%d0%b8%d1%98%d0%b5/

    but it should look like this:

    https://cir.filmskasecanja.com/тестирање-исправне-транслитерације/

    Where is the problem?

    1. WordPress truncates Cyrillic links so that they fit within a 200-character limit. Of course, cir.filmskasecanja.com/тестирање-исправне-транслитерације/ does not look like 200 characters, but the encoding turns it into a string of symbols (…d1%98%d0%b5…), and that string does reach 200 characters.

    2. If I copy that link from Chrome and share that link on Facebook, these symbols are also seen and it’s ugly …

    3. If I copy that link from Safari and share that link on Facebook, the link is seen correctly as it should

    4. Where the link is displayed correctly, it is often truncated so that the words in the link are cut in half (see paragraph 1.)
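
    To make point 1 concrete: the "string of symbols" is the percent-encoding of the link's UTF-8 bytes. The sketch below uses Python's standard `urllib.parse.quote` purely for illustration (WordPress itself is written in PHP) and counts the characters before and after encoding. The variable names are made up for the example.

```python
from urllib.parse import quote

# The slug from the example link above.
slug = "тестирање-исправне-транслитерације"

# Each Cyrillic letter takes 2 bytes in UTF-8, and each byte is written
# as "%XX" (3 characters), so one visible letter costs 6 characters.
encoded = quote(slug, safe="-")

print(len(slug))     # number of visible characters
print(len(encoded))  # number of characters after percent-encoding
print(encoded.lower())
```

    So 34 visible characters already occupy 194 characters in encoded form, which is why a limit of about 200 is reached so quickly with Cyrillic titles.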

    So, at the moment I’m not sure where this problem occurs .. In WordPress? In Google Chrome?

    It seems to me the problem could be solved if WordPress could see Cyrillic letters the same as Latin letters.

    Has anyone had any experience or tried to solve this problem?

    It seems to me that WordPress implementation of Cyrillic link support would solve this problem globally.

Viewing 12 replies - 1 through 12 (of 12 total)
  • salmanmig

    (@salmanmig)

    My website is CrimeTak but meta description not showing in google search engine. Please help me.

    Juha Metsäkallas

    (@juhametsakallas)

    Hm, a couple of days ago I wrote a post on my site with the permalink

    https://finnababilejo.fi/afiŝoj/juha_metsakallas/2020/02/鬼滅の刃/

    Note that there is an Esperanto letter ŝ in the path and the post has a title in Japanese. All I had to do was make sure that PHP and MySQL had full UTF-8 support. You might want to check those settings first.

    Thread Starter ivanblago1000

    (@ivanblago1000)

    Hi @juhametsakallas

    Thank you for your reply, I will check these settings.

    Thread Starter ivanblago1000

    (@ivanblago1000)

    @juhametsakallas Settings are fine.

    I see this problem exists with your link, too.

    If I paste your link into Chrome, I see this: https://finnababilejo.fi/afi%C5%9Doj/juha_metsakallas/2020/02/%E9%AC%BC%E6%BB%85%E3%81%AE%E5%88%83/

    But if I copy that link from Safari and paste it into Chrome, I see it correctly: https://finnababilejo.fi/afiŝoj/juha_metsakallas/2020/02/鬼滅の刃/

    It is possible that this is only a difference in how Chrome and Safari handle the link.

    Certainly, the problem remains that WordPress shortens links due to character limits.
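
    For what it's worth, the two forms of the link above are the same URL written two ways. A small Python sketch (standard library only, used here just for illustration) decodes the escaped form and encodes it again:

```python
from urllib.parse import quote, unquote

# The escaped form, as pasted from Chrome in the post above.
encoded = ("https://finnababilejo.fi/afi%C5%9Doj/juha_metsakallas/2020/02/"
           "%E9%AC%BC%E6%BB%85%E3%81%AE%E5%88%83/")

# Decoding the %XX escapes as UTF-8 gives the readable form.
readable = unquote(encoded)
print(readable)

# Encoding the readable form again (leaving ":" and "/" alone)
# gives back exactly the escaped form.
print(quote(readable, safe=":/") == encoded)
```

    Which form you see depends on the program doing the copying or displaying, not on the address itself.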

    Juha Metsäkallas

    (@juhametsakallas)

    Interesting. I’m currently on an MS Windows 10 Enterprise N machine, and my link

    • MS IE 11: displays correctly
    • MS Edge: displays correctly
    • Google Chrome: displays correctly
    • Opera: displays correctly
    • Firefox: displays correctly (but took a noticeable moment to change from MIME encoding to real)

    All web browsers are up to date.

    For your site I see the same as above, i.e. тестирање-исправне-транслитерације displays correctly. Can you test on another computer?

    There is a limit of 256 in some versions of MS IE (a MS “feature”), but you’re not reaching it (227 characters in that MIME-coded URL). Another limit is 1024 characters (it comes from how parameters in page requests are passed), but it’s even further away.

    Thread Starter ivanblago1000

    (@ivanblago1000)

    @juhametsakallas

    Let’s forget for a moment how the link appears in the browser.

    The problem is that WordPress shortens the links at about 200 characters.

    This is fine when it comes to Latin letters.

    The problem is with Cyrillic letters, where WordPress cuts the link off much earlier because it does not count the letters one by one, but rather the encoded symbols.

    For example:

    – this link has 68 characters: https://cir.filmskasecanja.com/дугачки-линкови-за-тестирање-трансли/

    – but WordPress sees many more characters here after it converts it to: https://cir.filmskasecanja.com/%D0%B4%D1%83%D0%B3%D0%B0%D1%87%D0%BA%D0%B8-%D0%BB%D0%B8%D0%BD%D0%BA%D0%BE%D0%B2%D0%B8-%D0%B7%D0%B0-%D1%82%D0%B5%D1%81%D1%82%D0%B8%D1%80%D0%B0%D1%9A%D0%B5-%D1%82%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8/

    Therefore, not all of the Cyrillic words can fit into the link. The last word was cut in half: https://cir.filmskasecanja.com/дугачки-линкови-за-тестирање-трансли/

    …And two more words from the title are missing
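
    The cut in the middle of a word is what you would expect if the 200-character limit is applied to the encoded string. The Python sketch below only illustrates that idea; it is not WordPress code. As far as I know WordPress does something similar in PHP when it builds the slug, but treat that, the helper name `truncate_encoded`, and the full slug used as input (reconstructed from the post title) as assumptions.

```python
from urllib.parse import quote, unquote

def truncate_encoded(slug: str, limit: int = 200) -> str:
    """Percent-encode a slug one character at a time and stop
    before the encoded string would exceed the limit."""
    out = ""
    for ch in slug:
        piece = quote(ch, safe="")  # "-" stays as is, a Cyrillic letter becomes 6 chars
        if len(out) + len(piece) > limit:
            break
        out += piece
    return out

# Assumed full slug, built from the title of the test post.
full = "дугачки-линкови-за-тестирање-транслитерације-у-вордпрес"

cut = truncate_encoded(full)
print(len(cut))      # length of the encoded, truncated slug
print(unquote(cut))  # what remains readable after the cut
```

    With this input the encoded slug stops at 196 characters, because the next letter would need 6 more, and the readable remainder is дугачки-линкови-за-тестирање-трансли, the same as in the link above.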

    Juha Metsäkallas

    (@juhametsakallas)

    @ivanblago1000 ,

    I’m not sure whether I understand you correctly. I made a test on my WP installation, the one I mentioned above. If I copy and paste your link https://cir.filmskasecanja.com/дугачки-линкови-за-тестирање-трансли/ using the Classic editor or the Block editor into a private test page, the result is the same in both cases: I see a text

    дугачки линкови за тестирање транслитерације у вордпрес

    with the WP logo and text “My Blog”.

    The Russian text is a link to https://cir.filmskasecanja.com/дугачки-линкови-за-тестирање-трансли/.

    The logo and English text is a link to https://cir.filmskasecanja.com/.

    There is no MIME-coding nor truncation anywhere.

    Where do you see a MIME-coded and truncated link?

    Thread Starter ivanblago1000

    (@ivanblago1000)

    @juhametsakallas

    Let’s try it this way, please:

    Try publishing a new article with this title: Тестирање дугачких линкова и транслитерације ћирилице у Вордпресу и како ће се то приказати

    This title contains 91 characters. Just let me know whether WordPress was able to put all the words in the link, or whether the link contains only the beginning of the title.

    Then send me that link. WordPress probably won’t be able to put all the words in the link …

    Juha Metsäkallas

    (@juhametsakallas)

    The link gets shortened to

    https://finnababilejo.fi/afiŝoj/juha_metsakallas/2020/03/тестирање-дугачких-линкова-и-трансли/

    but remains functional, i.e. it displays in the address bar as above and gets you to the page. I’ll leave the page up until tomorrow afternoon so that you can see it yourself.

    Thread Starter ivanblago1000

    (@ivanblago1000)

    @juhametsakallas

    Thank you for your engagement.

    This shortening can be a problem because of SEO.

    Also, it is ugly when such a link is shared on the internet, screenshot: https://ibb.co/tPDcpN6

    It would be nice if WordPress could treat Cyrillic links the same way it treats Latin links.

    Do you have any suggestions?

    Juha Metsäkallas

    (@juhametsakallas)

    (Ok, the text was Serbian, my bad.)

    Yes, I can see why it can be problematic for SEO.

    When one dives deeply into how web pages work, there are two ways to send page requests: GET and POST. IIRC GET has a limit of 1024 characters imposed by the standard, and while POST can handle more, it has a limit too (albeit so large that you seldom bump into it). But those limits are set by the internet specifications; different web site platforms (Drupal, Joomla, SharePoint, WordPress and so on) often have much lower limits. (I work with SharePoint, which loves to use GUIDs as identifiers while at the same time having quite a low limit on how long query strings can be. That combination asks for trouble.)

    If I had to bet, I would say that the shortening is a remnant from the past. Either the original designers failed to grasp the differences between writing systems, or expensive memory didn’t allow the processing of longer non-English strings, or both.

    Luckily WordPress permits the URL and the title of a page to be totally different things. I think you have to settle for that. Overall I prefer to see a link this way: тестирање исправне транслитерације ћириличних линкова instead of the raw URL https://cir.filmskasecanja.com/тестирање-исправне-транслитерације/, ofc YMMV.

    Thread Starter ivanblago1000

    (@ivanblago1000)

    Dear @juhametsakallas, thank you for all you have done.

    Yes, the thing is delicate and probably rooted in the past.

    I’ll try to find some solution for better WordPress support for Cyrillic letters and links.
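
    One direction such a solution could take is converting the Cyrillic title to Latin letters before it becomes a link, so that every letter costs one character instead of six. The Python sketch below is only an illustration of the idea, not a WordPress feature or API: the table `SR_TO_ASCII` and the helper `latin_slug` are hypothetical names, and the mapping is a simplified, lossy one for Serbian (for example ч, ћ and ц all become "c").

```python
# Simplified Serbian Cyrillic to plain ASCII table (assumption: a lossy
# mapping is acceptable in a link; diacritics are dropped on purpose).
SR_TO_ASCII = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "dj", "е": "e",
    "ж": "z", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "c", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "c",
    "џ": "dz", "ш": "s",
}

def latin_slug(title: str) -> str:
    """Lowercase the title, transliterate letter by letter,
    and join the words with hyphens."""
    words = title.lower().split()
    return "-".join(
        "".join(SR_TO_ASCII.get(ch, ch) for ch in word) for word in words
    )

title = ("Тестирање дугачких линкова и транслитерације ћирилице "
         "у Вордпресу и како ће се то приказати")
slug = latin_slug(title)
print(slug)
print(len(slug))
```

    For the 91-character test title this gives a 92-character link with every word intact, well under the limit, at the price of no longer showing Cyrillic in the address.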

  • The topic ‘Correct transliteration for Cyrillic links’ is closed to new replies.