• Resolved markolympic

    (@markolympic)


    I’m having trouble understanding what needs to be done to import two pieces of metadata from the PDFs I’m uploading to the Media Library. Each PDF includes a title and an author entry; an example from one PDF is shown below:

    pdf:Title => The Role of Politics in the Appointment of Board Members at Sydney Organizing Committee of the Olympic Games (SOCOG)

    pdf:Author => Stylianos Daskalakis, Dimitrios Gargalianos & Evangelos Albanidis

    When I upload a PDF, the TITLE field, by default, contains the file name, which is useless to users when they execute a search. I want this TITLE field to be replaced by the pdf:Title.

    I would also like the new “AUTHOR” field (created using the Custom Fields plugin) to contain the data from the pdf:Author metadata.

    I’m assuming that I somehow have to map the IPTC fields so MLA knows what metadata to harvest from the PDFs I upload to my Media Library, but I’m at a loss to know how/where to introduce that mapping and what the code should look like.

    Thanks for helping.

Viewing 2 replies - 16 through 17 (of 17 total)
  • Thread Starter markolympic

    (@markolympic)

    Hi David. Actually I went ahead this morning and “fixed” the two files manually. They still show the same minimal metadata as before. Here’s what is odd: the project involves uploading about 1,000 PDFs, each of which contains the metadata. The files were grouped into 27 folders. In each of the two folders I was having trouble with yesterday, nearly half of the files had the same problem: MLA did not read the metadata. That’s when I raised the issue with you. Then yesterday, I decided to just continue with the remaining 10 or so folders, even if I had to manually add the PDF titles and authors. In those folders, the metadata in nearly every file was read perfectly by MLA! Perhaps it was just a coincidence that the two folders I was having problems with were the only two among the 27 with issues. Who knows. I’ll be interested to hear if you solve the issue, as I would like to try this same procedure with another website.

    Plugin Author David Lingren

    (@dglingren)

    Thanks for your update with the additional information.

    I have uploaded a new MLA Development Version dated 20220312 that improves XMP extraction in PDF documents that don’t strictly conform to the standard. You can find step-by-step instructions for using the Development Version in this earlier topic:

    PHP Warning on media upload with Polylang

    Once the Development Version is installed you can use the Media/Edit Media “Attachment File Metadata” text box to see the additional values in the pdf: section as well as the xmp: metadata.

    This fix will be part of my next MLA version, but in the interim it would be great if you could install the Development Version and let me know if it works for you. Thanks for alerting me to this MLA defect.
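For readers following along, the mapping requested in the original question (pdf:Title into the Title field, pdf:Author into the custom Author field) can be sketched roughly as below. This is not MLA code and does not use MLA's settings or API; it is a minimal Python illustration of the logic only. The function name `map_pdf_fields`, the `xmp:Title` and `xmp:Author` fallback keys, and the sample file name are all assumptions made for the example.

```python
from pathlib import Path


def map_pdf_fields(filename, metadata):
    """Pick Title and Author values for one uploaded PDF.

    `metadata` is a plain dict standing in for the values extracted
    from the file. The key names are assumptions for illustration.
    """

    def first_value(*keys):
        # Return the first non-empty value among the given keys.
        for key in keys:
            value = str(metadata.get(key) or "").strip()
            if value:
                return value
        return ""

    # Prefer the pdf: value, then an (assumed) xmp: value, and only
    # fall back to the bare file name when neither is present.
    title = first_value("pdf:Title", "xmp:Title") or Path(filename).stem
    author = first_value("pdf:Author", "xmp:Author")
    return {"title": title, "author": author}


# Hypothetical file name; metadata values taken from the question above.
example = map_pdf_fields(
    "socog-board.pdf",
    {
        "pdf:Title": "The Role of Politics in the Appointment of Board "
                     "Members at Sydney Organizing Committee of the "
                     "Olympic Games (SOCOG)",
        "pdf:Author": "Stylianos Daskalakis, Dimitrios Gargalianos & "
                      "Evangelos Albanidis",
    },
)
```

The file-name fallback mirrors the default behavior described in the question, so a PDF whose metadata cannot be read still ends up with a usable, if unhelpful, title.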

  • The topic ‘Mapping PDF fields’ is closed to new replies.