• In my opinion Michael took the only right approach: Concentrating only on the semantics works great and ensures clean page output.

    Now it’s easy to write even long blog posts in Word, using its spelling and grammar checker, and import the text into WordPress with just a click. Awesome!

    There’s only one feature request: Mammoth already recognizes different (non-“normal”) paragraph formats (shown as warnings in the messages section after an import). It would be great if Mammoth could add the names of unknown paragraph formats as CSS class names to the resulting p tags. That way it would even be possible to create enhanced (semantic) formats from within Word, e.g. teaser paragraphs, warning boxes and so on.

Viewing 9 replies - 1 through 9 (of 9 total)
  • Plugin Author Michael Williamson

    (@michaelwilliamson)

    Many thanks for the kind words. If you’ve used styles to format your document, you can define a style map to describe how styles in Word should be converted to class names. For instance, if you want each Word paragraph that has the style name “Highlight” to be converted to a paragraph tag with the class name “highlight”, then you can write:

    p[style-name='highlight'] => p.highlight:fresh

    The :fresh part tells Mammoth that you want each Word paragraph to be a separate paragraph in the output.
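
For anyone with many custom styles, rules of this shape can be generated rather than typed by hand. The following is only a sketch: the helper names `css_class_for` and `style_map_for` are made up for illustration, and it assumes that style names contain no single quotes and that the derived class names are acceptable to Mammoth's style-map parser.

```python
import re


def css_class_for(style_name: str) -> str:
    """Derive a lowercase, hyphen-separated class name from a Word style name."""
    slug = re.sub(r"[^a-z0-9]+", "-", style_name.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a class name from {style_name!r}")
    return slug


def style_map_for(style_names) -> str:
    """Build one 'p[style-name=...] => p.class:fresh' rule per style name."""
    rules = []
    for name in style_names:
        # Assumption: quoting/escaping inside style names is not handled here.
        if "'" in name:
            raise ValueError(f"style name contains a single quote: {name!r}")
        rules.append(f"p[style-name='{name}'] => p.{css_class_for(name)}:fresh")
    return "\n".join(rules)


if __name__ == "__main__":
    # Hypothetical style names, matching the examples mentioned in the thread.
    print(style_map_for(["Teaser", "Warning Box"]))
```

The resulting text can then be used as the style map in the same way as a hand-written one.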

    A full description of how style maps work can be found here:

    https://github.com/mwilliamson/mammoth.js#writing-style-maps

    Once you have a style map, you can embed it in a Word document using this page:

    https://mike.zwobble.org/projects/mammoth/embed-style-map/

    If you take the resulting document from that page and upload it through the plugin, it should use the style map to create the appropriate HTML. Support in the plugin was only recently added, so do shout if anything seems amiss.

    Thread Starter sv3n.spandau

    (@sv3nspandau)

    Sounds awesome and exactly like what I was looking for. Will try this out.

    Many thanks again for your great time-saving work!

    Thread Starter sv3n.spandau

    (@sv3nspandau)

    Great idea. It is working … mainly.

    Just have a few glitches and suggestions…

    1) I had hoped I could create a Word document (maybe a template) with the stylemap embedded and could then create other documents based on this one.

    Unfortunately, after I embedded the stylemap using your web tool, MS Word no longer opens the document and reports that it is defective. This would mean that I have to embed the stylemap manually each time before I upload a document.

    I really would prefer an option to specify the stylemap within the WordPress settings.

    2) I have the following mapping: p[style-name='Quote'] => blockquote

    According to your definition of freshness I would expect two paragraphs using the Quote paragraph style to be merged into one blockquote, but Mammoth generates two blockquotes. Have I misunderstood the concept?

    3) I have defined a paragraph style named Source and mapped it to a simple div. My idea was to have an option to directly embed source code (e.g. WP shortcodes) that will be inserted verbatim into the created article. Unfortunately the straight quotation marks of my shortcode are imported as &quot; entities instead of quotation marks. Any ideas for this?

    4) It would be great to have the possibility to embed the <!--more--> tag within my Word document to split the article into excerpt and content. Any ideas whether this is possible?

    Plugin Author Michael Williamson

    (@michaelwilliamson)

    Adding an option within WordPress to control the style map is unlikely since it requires quite a bit of thought around where to store style maps, whether they’re shared between users, and so on, which would require time I don’t have. Hopefully storing the style map within the document is sufficient.

    As for Word reporting the file is defective: it looks like the tool I linked to was using an out-of-date version. If you try it again, Word should be happier opening the resulting file.

    Your usage of the mapping seems correct. It’s a bit tricky for me to say why two blockquotes are being created without seeing the original source document. Can you create a minimal document reproducing the problem?

    For (3), that’s because Mammoth will HTML-escape any text in the document. It might be possible to avoid escaping quotes outside of attributes — I’ll check to see how straightforward that is.
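
The difference between escaping everything and escaping only what text content needs can be shown with Python's standard library. This is just an illustration of the general idea, not Mammoth's own code, and the shortcode is a sample value:

```python
from html import escape

# Sample shortcode, used purely for illustration.
shortcode = '[gallery ids="1,2" columns="3"]'

# Default behaviour: quotes are escaped as well, which mangles the shortcode.
fully_escaped = escape(shortcode)

# quote=False escapes only &, < and >. That is enough for text between tags,
# but not for text placed inside an attribute value.
text_escaped = escape(shortcode, quote=False)

if __name__ == "__main__":
    print(fully_escaped)
    print(text_escaped)
    print(escape('1 < 2 & "quoted"', quote=False))
```

With quote escaping switched off, the shortcode passes through unchanged, while `<`, `>` and `&` are still escaped.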

    As for (4), I don’t think there’s currently a good way of embedding the more tag.

    Thread Starter sv3n.spandau

    (@sv3nspandau)

    1) The new version of the embedder generates Word documents that Word opens without any complaints. Thumbs up!

    Templates do not work: documents created from the template do not contain the stylemap. But I am okay with creating an empty Word document with the stylemap embedded and copying it to create new posts, which seems to work fine.

    I hadn’t thought of the multi-user implications. I think a global stylemap which could be overridden by an embedded stylemap would be enough, but I can live well with the current approach.

    2) I want to merge multiple consecutive quote paragraphs in Word into a single blockquote containing multiple p elements.

    To achieve this I adjusted the mapping to p[style-name='Quote'] => blockquote > p and created a Word document which contains just four paragraphs:

    1. normal
    2. quote
    3. quote
    4. normal

    Importing this document results in two separate blockquote elements, each containing a p element. You can download the minimal document here.

    3) Or what about a :raw postfix (similar to the :fresh postfix) that could tell Mammoth to take the original text without any escaping? This way it would even be possible to include some raw HTML.

    4) The suggestion from 3) might solve this too.

    5) Just a question: Is it currently possible to omit a particular paragraph style completely while importing? Background: I like to include titles at the beginning of my Word documents, but they map to the post’s title and should not appear in the content. Definitely not essential.

    ===

    Please don’t get the points above wrong: I love the plug-in as it is and it is really a huge time saver. All the requests above point in one direction: I would prefer to maintain the full article, including shortcodes and the like, within Word. Without the possibility to include shortcodes or HTML, import becomes a one-shot solution for just the first import, and every change afterwards needs to be done in the WordPress editor, which unfortunately is quite error-prone due to its strange newline and paragraph handling.

    Thanks a lot for all your efforts…

    Plugin Author Michael Williamson

    (@michaelwilliamson)

    For (2), I think you’ll want to adjust the mapping slightly so that the p is always fresh i.e.

    p[style-name='Quote'] => blockquote > p:fresh
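
Why this gives one blockquote with several paragraphs can be seen in a small toy model. It is a deliberately simplified sketch of the fresh/non-fresh idea, not Mammoth's actual implementation; `Node`, `render` and `convert` are names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)  # Node instances or text


def render(node) -> str:
    """Serialise a Node (or a plain string) to HTML-like text."""
    if isinstance(node, str):
        return node
    inner = "".join(render(child) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


def convert(paragraphs, style_map) -> str:
    """paragraphs: (style, text) pairs; style_map: style -> [(tag, fresh), ...]."""
    root = Node("root")
    for style, text in paragraphs:
        parent = root
        for tag, fresh in style_map[style]:
            last = parent.children[-1] if parent.children else None
            if not fresh and isinstance(last, Node) and last.tag == tag:
                parent = last  # non-fresh: reuse the element just before
            else:
                node = Node(tag)  # fresh (or nothing to reuse): new element
                parent.children.append(node)
                parent = node
        parent.children.append(text)
    return "".join(render(child) for child in root.children)


# The four-paragraph test document described above.
DOC = [("Normal", "one"), ("Quote", "two"), ("Quote", "three"), ("Normal", "four")]

# Corresponds to: p[style-name='Quote'] => blockquote > p:fresh
SHARED = {"Normal": [("p", True)], "Quote": [("blockquote", False), ("p", True)]}

# Corresponds to: p[style-name='Quote'] => blockquote:fresh > p:fresh
SEPARATE = {"Normal": [("p", True)], "Quote": [("blockquote", True), ("p", True)]}

if __name__ == "__main__":
    print(convert(DOC, SHARED))
    print(convert(DOC, SEPARATE))
```

In this model, a non-fresh blockquote is shared by adjacent quote paragraphs while each fresh p stays separate; marking the blockquote fresh as well would yield one blockquote per paragraph.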

    I’ve also just updated the plugin. This allows elements to be collapsed more consistently, as well as no longer escaping double quotes outside of attributes (which should solve (3)).

    For (4): something like :raw sounds plausible. I’ll have a bit more of a think about it when I get a bit more time.

    For (5): there currently isn’t a way to omit paragraphs. Adding something to the style map to allow this seems sensible though. Again, it’s something I’ll think about the next time I have a bit more spare time.

    Thread Starter sv3n.spandau

    (@sv3nspandau)

    For (2): That now works as expected. Thanks a lot!

    Unfortunately the last update of the plug-in introduced a new bug: while importing the docx, Mammoth now inserts a line break before and after each tag. If I have, for example, italic text inside a paragraph, Mammoth will generate a line break before the opening <em> and after the closing </em>. Due to the WordPress editor’s “magic” (and annoying) line break handling, these line breaks make it into the output.

    Besides this I still love the plug-in.

    Plugin Author Michael Williamson

    (@michaelwilliamson)

    I suspect the bug isn’t actually new, but only shows up when the WordPress editor is in text mode rather than visual mode. In any case, the latest version should hopefully fix that. Let me know if that’s not the case.

    Thread Starter sv3n.spandau

    (@sv3nspandau)

    Hi Michael, thanks again for the quick fix. I can confirm that the issue no longer appears. Great job!

  • The topic ‘The only right approach!’ is closed to new replies.