• I’ve noticed unexplained HTML generated by WordPress on my blog. A representative article can be found at “https://citizenopines.org/2010/08/19/public-policy/what-is-marriage-perry-vs-schwarzenegger/”. One discrepancy is that midway through the article the font changes to Times New Roman even though the theme specifies Arial. I can see nothing about the text that should cause any kind of change. Another discrepancy is an unintended change in the font size.

    In this case the post was originally authored in OpenOffice Writer. At the point where the font face changes there is some italicized text. The text following the italicized text is explicitly set to “small”. As a result of the way the HTML is generated, explicit style attributes are written into every tag. This makes it extremely difficult to correct such problems by editing the HTML.

    This leads me to want to know more about how the HTML generation is performed and what effect different authoring tools may have on the results. Can anyone refer me to applicable reference material?

Viewing 15 replies - 1 through 15 (of 20 total)
  • Thread Starter aajax

    (@aajax)

    Correction – I erroneously said that the font specified by the theme is Arial. What is correct is that the body uses sans-serif fonts, such as Arial, rather than serif fonts such as Times New Roman. The point is that the font suddenly and conspicuously changes, for no apparent reason, to something that isn’t consistent with the theme. Furthermore, the generated HTML prevents the theme from determining which font is used.

    That post has inline styles that are defining the font family for the paragraphs.

    Edit your post, and remove any unneeded, inline styles.
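    Removing the inline styles by hand is tedious on a long post, so here is a minimal sketch of doing it with a script, using only Python's standard library. The function name `strip_inline_styles` and the lists of attributes and tags to drop are assumptions chosen for illustration; they are not part of WordPress or OpenOffice. HTML comments and doctype declarations are discarded by this sketch.

```python
from html import escape
from html.parser import HTMLParser

# Attributes that word processors commonly use to carry presentation
# (an assumed list; adjust it to what the pasted markup actually contains).
PRESENTATIONAL_ATTRS = {"style", "class", "face", "size", "color", "align"}
# Wrapper tags that exist only to carry presentation; their text is kept.
DROPPED_TAGS = {"font", "span"}


class InlineStyleStripper(HTMLParser):
    """Re-emit HTML without presentational attributes or wrapper tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _open_tag(self, tag, attrs, closer=">"):
        # Keep only the attributes that are not purely presentational.
        kept = [(k, v) for k, v in attrs if k not in PRESENTATIONAL_ATTRS]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        return f"<{tag}{rendered}{closer}"

    def handle_starttag(self, tag, attrs):
        if tag not in DROPPED_TAGS:
            self.parts.append(self._open_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag not in DROPPED_TAGS:
            self.parts.append(self._open_tag(tag, attrs, " />"))

    def handle_endtag(self, tag):
        if tag not in DROPPED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.parts.append(escape(data, quote=False))


def strip_inline_styles(html_text):
    """Return html_text with inline styling removed (hypothetical helper)."""
    parser = InlineStyleStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

    The cleaned markup could then be pasted into the HTML editor, leaving the theme's stylesheet to decide the fonts. Italics, links and paragraph tags survive because only the listed attributes and wrapper tags are removed.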

    Thread Starter aajax

    (@aajax)

    YES, I know. My question is, “why”? I didn’t do it. WordPress did it. The HTML WordPress generated is a classic example of what shouldn’t be done, because changing it is very tedious and error-prone. Furthermore, I have no reason to believe that WordPress won’t put it back the way it is, because I have no idea why WordPress did it.

    WordPress doesn’t ‘generate’ HTML. The editor in WordPress is based on TinyMCE, which surrounds text with HTML tags in response to users clicking the ‘styling’ buttons.

    Where did the text come from? It is possible that if it was pasted into the Visual Editor it took some embedded styles with it, but that is a function of the output of the original source, not the WordPress editor.

    Thread Starter aajax

    (@aajax)

    The article was authored in OpenOffice (OO) Writer, which is similar in function and capability to MS Word. It was copied and pasted into the editor used by WordPress. This article did use the styling capability of OO Writer. For example, text was italicized and indented using the features of OO Writer. This article also uses footnotes and endnotes which were recognized by whatever software generated the HTML. I neither entered HTML nor used the style buttons, to which you refer, when posting this article. I only did copy and paste.

    Don’t paste content from word processing software into WordPress as the pasted text will also contain the software’s own formatting. As you’ve now seen, this formatting will stop your pages from being displayed correctly. In short, you entered the HTML/CSS – you just didn’t know you were doing it.

    The article was authored in OpenOffice (OO) Writer, which is similar in function and capability to MS Word.

    You just answered your own question. WordPress didn’t add that formatting. You added that formatting, when you pasted directly from Writer.

    Note: I believe the Visual editor has a “Paste from Word” button. If you must paste directly from a rich text editor, try using this button.

    Thread Starter aajax

    (@aajax)

    The reason I mentioned the software used for authoring in my original post is that I did think this could be significant.

    While OO Writer and MS Word are both able to produce HTML/CSS, I hadn’t requested that kind of formatting. I had also thought that this kind of software has its own product-specific method of encoding formatting controls, which is not HTML/CSS. Therefore, I had imagined that the software into which such content is pasted has to deal with the formatting controls. This is actually a very normal situation for any application that supports paste operations. I must confess that this is not something I know much about, but it occurs to me that the interface must provide a way to recognize the text without having to understand the related metadata.

    The idea that pasting from word processors into WordPress causes problems is a reasonable hypothesis. However, it doesn’t explain why WordPress displays the article partially using its theme but then switches fonts partway through. I assure you that the OO Writer version uses the same font throughout. However, that font is Times New Roman.

    If it is recommended that word processors are not used for authoring then what is recommended? The WordPress editor is nice from the point of view that it provides a ready to use facility for making comments but it is not really a very useful authoring tool. It has occurred to me that using an HTML editor might be wise, at least when special formatting is desired, but I would expect that to require a more in-depth knowledge of WordPress theme design than I presently possess.

    If it is recommended that word processors are not used for authoring then what is recommended?

    The Visual editor. It can handle most post-authoring needs. And if you need anything more advanced or complex, you really should be using the HTML editor.

    The WordPress editor is nice from the point of view that it provides a ready to use facility for making comments but it is not really a very useful authoring tool.

    I must admit that I don’t understand this comment at all. What do you mean by “provides a ready to use facility for making comments”? And why don’t you find it to be “really a very useful authoring tool”?

    If it is recommended that word processors are not used for authoring then what is recommended?

    A plain-text editor such as TextEdit on Mac or Notepad on PC. Be sure the program is set to plain-text mode, not rich text. If you don’t want to compose in your plain-text app directly, you can also select “save as plain text/txt” in your rich-text program, then open the txt document in your plain-text editor, and paste from there.

    p.s. pretty much all content management systems and newsletter systems that use a WYSIWYG editor have this same issue – it’s not WordPress-specific.

    Thread Starter aajax

    (@aajax)

    By ready to use I meant that if someone can find your blog they have some way to comment or post on it without having to install or find other particular software. While not feature-rich, it is easy to use and intuitive, which is appropriate when considering that many users may be new to your site. One cannot expect everyone to have the same knowledge and experience as oneself.

    I suppose everyone is entitled to their own opinion about what constitutes usefulness, so I may have erred if my remark sounds like this is universally accepted fact. My WordPress blog is not the only thing that I need to author, so I like a tool that has some versatility. For example, how do I make a printable PDF file using the Visual Editor? I also like keeping a copy, the master if you will, on my own system. Word processors support more features. For example, Visual Editor doesn’t seem to integrate with the theme meaning what you see is not what you are going to see on the live post. While more sophisticated features may be what got me into trouble here, that doesn’t mean you never want those features. I’d say that Visual Editor has similar functionality to MS WordPad. WordPad is fine for some things but I’d also accuse it of not being a really useful authoring tool. Finally, part of the problem may be that I just haven’t used the Visual Editor enough to appreciate what it can do.

    There is mention herein of TinyMCE, which is new to me. Is it possible that it can be used by itself in some kind of standalone manner? That is, could I use it to author on my own computer?

    Thread Starter aajax

    (@aajax)

    I conducted a little experiment to test my theory about what to expect when using paste.

    I took the article that is the subject of this thread and opened it in OO Writer. Then I copied the whole thing to the clipboard and pasted it into Notepad. What showed up was just the text, absent any formatting. That is, no HTML/CSS or product-specific metadata. Since I know the formatting is preserved if I paste the same thing back into OO Writer, I believe that the format controls (i.e., metadata) are present and can be used should the application want to. However, an application that doesn’t understand the metadata can still obtain the text and ignore the metadata.

    Of course we see simple examples of this all the time without thinking about it. For example, sometimes the font is preserved and sometimes it isn’t.
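    The behaviour seen in this experiment is consistent with a clipboard that offers the same copied content in several formats at once, with each pasting application taking the richest format it understands. The toy model below only illustrates that idea. The dictionary, the format labels and the `paste` function are hypothetical and do not correspond to any real clipboard API, and the sample HTML is invented rather than taken from the article.

```python
# Toy model: one copy operation offers the same content in several formats.
clipboard = {
    "text/plain": "What is marriage?",
    "text/html": '<p style="font-family: Times New Roman"><i>What</i> is marriage?</p>',
}


def paste(clipboard, accepted_formats):
    """Return (format, content) for the first format the target accepts.

    accepted_formats is ordered from most to least preferred.
    """
    for fmt in accepted_formats:
        if fmt in clipboard:
            return fmt, clipboard[fmt]
    raise LookupError("no compatible format on the clipboard")


# A plain-text editor only understands plain text.
plain_editor = ["text/plain"]
# A rich-text editor prefers HTML and falls back to plain text.
rich_editor = ["text/html", "text/plain"]

print(paste(clipboard, plain_editor))
print(paste(clipboard, rich_editor))
```

    Under this model a plain-text editor receives bare text, while a rich-text editor receives the styled markup, inline font declarations included.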

    For example, how do I make a printable PDF file using the Visual Editor?

    You wouldn’t. You would upload and attach a PDF to a Post.

    I also like keeping a copy, the master if you will, on my own system. Word processors support more features.

    I’ve never tried it, but perhaps a desktop application, such as LiveWriter, would be useful for this?

    For example, Visual Editor doesn’t seem to integrate with the theme meaning what you see is not what you are going to see on the live post.

    Actually, that’s not true. Since version 3.0, WordPress has supported custom styles for the Visual Editor. If your Theme supports it, what you see in the Visual editor will be exactly what you see in your published post.

    Since I know the formatting is preserved if I paste the same thing back into OO Writer, I believe that the format controls (i.e., metadata) are present and can be used should the application want to. However, an application that doesn’t understand the metadata can still obtain the text and ignore the metadata.

    Did you try using the “Paste From Word” button (it’s the one with the “W” overlaid on a clipboard) in the Visual Editor?

    Your experiment may be overlooking the empirical evidence.

    I can see nothing about the text which should cause any kind of change.

    Simply viewing the source code for that article reveals the formatting issues immediately.
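    For a long post, the same check can be done with a short script instead of reading the source by eye. The following is a rough sketch using Python's standard library. The names `FontAudit` and `audit_fonts` are made up for this example, and the style parsing is deliberately naive (it simply splits on semicolons and colons).

```python
from collections import Counter
from html.parser import HTMLParser


class FontAudit(HTMLParser):
    """Count inline font declarations in style attributes and <font> tags."""

    def __init__(self):
        super().__init__()
        self.found = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "style":
                # Naively split "a: b; c: d" into individual declarations.
                for declaration in value.split(";"):
                    prop, _, setting = declaration.partition(":")
                    prop = prop.strip().lower()
                    if prop.startswith("font"):
                        self.found[(tag, prop, setting.strip())] += 1
            elif tag == "font" and name in ("face", "size"):
                self.found[(tag, name, value)] += 1


def audit_fonts(html_text):
    """Return a Counter keyed by (tag, property, value) (hypothetical helper)."""
    parser = FontAudit()
    parser.feed(html_text)
    parser.close()
    return parser.found
```

    Running `audit_fonts` on the post's HTML and printing the result should list each inline font family and size along with how often it occurs, which makes a mid-article font switch easy to spot.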

    The article was authored in OpenOffice (OO) Writer, which is similar in function and capability to MS Word. It was copied and pasted into the editor

    A pretty common and verifiable issue when pasting directly from word processing software.

    You might research this a little if you’re an open-source, do-it-yourself, willing-to-experiment, “I’ll give that a try and just make it work” kind of person.
    “Integrated Content Environment” WordPress “how to” ICE

    Or more specifically, Blog from OpenOffice.org

    Otherwise, as previously mentioned, it might be best to avoid pasting from applications that function primarily as word processors or “print media” applications, and stick to plain text based applications or desktop software designed to publish directly to blogging platforms. Just more information for you.

    Although I always use a plain text editor or the HTML editor in WordPress, I personally have no feelings about it either way, really. Good luck with it!

  • The topic ‘Faulty HTML is being generated’ is closed to new replies.