Generally the plugin works for me, however I have found these issues:
When I attempt to import this PDF with the import setting set to Smart XML, only the images are imported.
https://resources.billofrightsinstitute.org/wp-content/uploads/2014/12/BAA-004-HandoutsAgg.pdf
If I change the import setting to Text, it imports all the text up to page 14. For some reason it ignores the text on the pages after that.
When I upload just the part that is page 15, found here:
https://resources.billofrightsinstitute.org/wp-content/uploads/2014/12/BAA-004-HandoutD.pdf
It works fine.
So I conclude that the plugin is capable of importing everything in the PDF mentioned earlier, but for some reason only when it is uploaded in parts.
Help is appreciated. Thanks!
]]>Hi,
I want to use your plugin but I can’t. Can anyone please help me out?
My website is on a Cloudways server (https://platform.cloudways.com/login). I tried to install the PDFMiner and pdfimages packages but was not able to. Can anyone help me install them on my hosting server?
I am getting these errors (see the attached screenshot):
https://drive.google.com/file/d/1LkezAvQITQmK9KLtdD121FeUKDvU_1yV/view?usp=sharing
Thanks
]]>Hello,
I have the most common problem: installing PDFMiner and pdfimages on my shared hosting.
What is the starting point? How do I install PDFMiner and pdfimages on Linux shared hosting?
I have an app in cPanel called Setup Python App, and I installed PDFMiner from there, but still nothing happens.
Regards
]]>Hello friends. I am currently having problems installing the following. I hope you can help. Thank you!
MacBook-Pro:~ vuvankhai$ brew install python-pdfminer
Error: No available formula with the name "python-pdfminer"
==> Searching for a previously deleted formula (in the last month)...
Warning: homebrew/core is shallow clone. To get complete history run:
git -C "$(brew --repo homebrew/core)" fetch --unshallow
Error: No previously deleted formula found.
==> Searching for similarly named formulae…
Error: No similarly named formulae found.
==> Searching taps…
==> Searching taps on GitHub…
Error: No formulae found in taps.
Hi there,
Is there a guide on how to install pdfminer? I’ve looked at everything available and am still not sure how to install or run it.
Thanks
]]>Hi, I would like to use this plugin but my shared host server does not have PDFMiner or pdfimages. I am informed by my shared hosting provider that they do not install libraries for users.
Given that most blogs will be on shared cPanel-type hosts, is there a way to embed the functionality of these libraries within the plugin, or otherwise make it available on a local, per-site installation rather than on a server-wide basis?
]]>Is there a test environment where I can try this plugin? I tried installing the plugin but it says I need PDFMiner and pdfimages on my server, and I have no idea how to solve that. I’d appreciate a way to see how the plugin works.
]]>Good Evening,
We have gone through the process of creating an altinstall of Python 3.6.3 and installed pip, wheel and setuptools. We have also installed poppler-utils for pdfimages but are still getting the following from the plugin:
The following libraries NEED to be installed on your server :
Linux secure.guardian88.com 3.10.0-962.3.2.lve1.5.24.9.el7.x86_64 #1 SMP Wed Feb 13 08:24:50 EST 2019 x86_64
ZipArchive Installed
PDFMiner
Not installed! please see https://pypi.org/project/pdfminer/
Python
Installed: 2.7.5 (default, Oct 30 2018, 05:13:07)
pdfimages
Not installed! please see https://en.wikipedia.org/wiki/Pdfimages
Please advise.
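A report like this (Python 3.6.3 installed as an altinstall, yet the plugin showing 2.7.5) can mean that the web server process resolves different executables than a login shell does. As a rough check, the sketch below lists what is reachable on the current PATH and whether the running interpreter can import pdfminer. The tool names `pdf2txt.py` and `pdfimages` are taken from the commands the plugin prints in other reports; how the plugin actually performs its detection is an assumption here, so treat the result as a hint only. It needs Python 3.

```python
import importlib.util
import shutil
import sys


def find_tools(names=("python", "pdf2txt.py", "pdfimages")):
    """Map each executable name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


def has_module(module_name="pdfminer"):
    """Return True if the interpreter running this script can import the module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Interpreter running this script:", sys.executable, sys.version.split()[0])
    for name, path in find_tools().items():
        print(f"{name:12} {path or 'not found on PATH'}")
    print("pdfminer importable here:", has_module())
```

Running it with the same user and environment as the web server, rather than from an interactive shell, gives the more relevant answer.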
]]>Hi, I would like to follow your project. Are you on GitHub? If so, please add the repository link to the plugin description. Very nice work.
Some ideas for the future. Feel free to ignore me.
——————————————–
An option to have a dedicated post type for these imports,
a “PDFpost” type for example.
This would let us have custom fields for meta information, perhaps extracting existing metadata from the PDF into those custom fields,
e.g. Keywords, Author, etc.
If on import a new post type is created, we can:
Attach the original PDF to the post in a view/template.
Extract original filename and record into meta data.
Get checksum of PDF.
Rename PDF file to checksum before storage to avoid file name conflicts.
Perhaps customizable with a prefix/suffix naming convention, e.g. PDF2POST_CHECKSUM.PDF
5c7863bba9209e80c2675c75a53ba840.pdf
Images should be named after the checksum-named document.
5c7863bba9209e80c2675c75a53ba840_image_01.jpg
If a PDF with an identical checksum is uploaded, do not duplicate it; write the duplicate PDF’s filename (if different) to the post metadata as “Alternate Filename(s)”.
Download original pdf (as any of registered filenames)
Download modified pdf with any post data modifications.
For example, I plan on putting some translated excerpts into the post, might be nice to download a pdf with that metadata as an option.
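The checksum-based naming idea above could be sketched roughly as follows. This is Python for illustration only (the plugin itself is PHP). MD5 is an assumption based on the 32-character example name, and `register_upload` with its dict-based index is a hypothetical stand-in for post metadata.

```python
import hashlib
from pathlib import Path


def pdf_checksum(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stored_names(checksum, image_count, prefix="", suffix=""):
    """Build checksum-based names for the PDF and its extracted images."""
    pdf_name = f"{prefix}{checksum}{suffix}.pdf"
    image_names = [f"{checksum}_image_{i:02d}.jpg" for i in range(1, image_count + 1)]
    return pdf_name, image_names


def register_upload(index, path):
    """Record an upload; a repeated checksum only adds an alternate filename.

    `index` is a plain dict standing in for post metadata (hypothetical).
    Returns (checksum, is_duplicate).
    """
    checksum = pdf_checksum(path)
    original = Path(path).name
    entry = index.get(checksum)
    if entry is None:
        index[checksum] = {"filename": original, "alternate_filenames": []}
        return checksum, False
    if original != entry["filename"] and original not in entry["alternate_filenames"]:
        entry["alternate_filenames"].append(original)
    return checksum, True
```

Keying everything on the checksum means two uploads of the same file under different names end up as one stored PDF with both names recorded.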
Hi there, really cool plugin, has a lot of potential.
Some installation issues
Debian 9
Ziparchive – apt install php-zip – OK (Was already installed)
pdfimages – apt install poppler-utils – OK (This was the simplest package to get the util)
pdfminer – This is where the trouble starts.
Debian provides a python-pdfminer package, but your plugin could NOT find it once installed. Trying to install pdfminer from pip would also fail while the Debian package (python-pdfminer) was installed. I think the detection routine could use some work.
Regardless, after installing from pip (not my preferred method) the plugin worked fine.
Even though the plugin works fine, I am receiving this message at the top of my screen in the PDF 2 Post area:
The security check failed
I don’t know what that means.
]]>Hi, I’ve tried setting up the plugin both on a shared server through my web host and on a local AMPPS installation, and I can’t get any PDF to process. I’ve tried multiple files. Here is what gets printed out on the page after I hit upload:
File uploaded successfully! C:\Program Files (x86)\Ampps\www\wp/wp-content/uploads/2019/02/POMONA-3.pdf
pdfimages -png C:\Program Files (x86)\Ampps\www\wp/wp-content/uploads/2019/02/POMONA-3.pdf C:\Program Files (x86)\Ampps\www\wp/wp-content/uploads/2019/02/POMONA-3_out/_images
Ok
pdf2txt.py -o C:\Program Files (x86)\Ampps\www\wp/wp-content/uploads/2019/02/POMONA-3_out/POMONA-3.txt C:\Program Files (x86)\Ampps\www\wp/wp-content/uploads/2019/02/POMONA-3.pdf
Ok
( ! ) Warning: file_get_contents(C:\Program Files (x86)\Ampps\www\wp/wp-content/uploads/2019/02/POMONA-3_out/POMONA-3.txt): failed to open stream: No such file or directory in C:\Program Files (x86)\Ampps\www\wp\wp-content\plugins\pdf2post\pdf2post.php on line 516
Call Stack
# Time Memory Function Location
1 0.0027 226496 {main}( ) …\edit.php:0
2 0.0045 271448 require_once( ‘C:\Program Files (x86)\Ampps\www\wp\wp-admin\admin.php’ ) …\edit.php:10
3 0.6093 20963232 do_action( ) …\admin.php:224
4 0.6093 20963672 WP_Hook->do_action( ) …\plugin.php:453
5 0.6093 20963720 WP_Hook->apply_filters( ) …\class-wp-hook.php:310
6 0.6093 20964192 call_user_func_array:{C:\Program Files (x86)\Ampps\www\wp\wp-includes\class-wp-hook.php:286} ( ) …\class-wp-hook.php:286
7 0.6093 20964440 PDF2Post->pdf2post_upload_page( ) …\class-wp-hook.php:286
8 0.6093 20964656 PDF2Post->handle_pdf( ) …\pdf2post.php:48
9 0.6507 20967928 PDF2Post->pdf2postFromAbsFile( ) …\pdf2post.php:367
10 1.0311 20971296 file_get_contents ( )
And here is the bit about libraries:
The following libraries NEED to be installed on your server :
ZipArchive Ok
PDFMiner
Python
Not installed!
pdfimages
I’ve got python running and I’ve downloaded xpdf tools and added the path to it but it still doesn’t seem to want to find pdfimages. At least that’s what I assume part of the issue is since the other things appear to be installed (even though it only says OK next to ZipArchive).
I’m not sure how to proceed but as yours is the only plugin that I’ve found that converts PDF to posts I really would like to find a way to make it work.
Thanks for your help!
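Judging only from the commands printed above, one thing worth checking is that the paths contain spaces (`C:\Program Files (x86)\...`) and appear unquoted, so a shell would split each path into several arguments. Whether that is the actual cause here is an assumption. The sketch below, in Python rather than the plugin's PHP, shows the effect and one way of quoting the arguments.

```python
import subprocess

# Path copied from the plugin output quoted above.
pdf_path = r"C:\Program Files (x86)\Ampps\www\wp/wp-content/uploads/2019/02/POMONA-3.pdf"

# Naive whitespace splitting, roughly what happens to an unquoted command line:
# the single path falls apart into three separate arguments.
unquoted = f"pdfimages -png {pdf_path}"
print(unquoted.split())

# Building the command from a list keeps each path as one argument.
# list2cmdline shows the quoting that Windows conventions call for.
quoted = subprocess.list2cmdline(["pdfimages", "-png", pdf_path])
print(quoted)
```

Installing the stack under a path without spaces (for example a folder directly under `C:\`) would be a quick way to test this theory without touching the plugin code.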
]]>Hello!
I installed PDF 2 Post. I’m guessing I did something wrong, as I get this error when I try to use the New Post From PDF function:
File upload successful! attachementy ID : 5CANNOT OPEN zip here: unknown error
Is there a log file I can look at to help troubleshoot?
Thanks!
Rob
]]>Every PDF I try to import gets uploaded, but then I get this error:
Warning: file_get_contents(/home/accinj32/public_html/transform401k.com/wp-content/uploads/2017/11/Weekly_Market_Commentary_08072017-1_images/Weekly_Market_Commentary_08072017-1.txt): failed to open stream: No such file or directory in /home/accinj32/public_html/transform401k.com/wp-content/plugins/pdf2post/pdf2post.php on line 409
I hope this plugin will be the awesome solution I hoped it would. Thanks.
David Lamoureux
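The warning says the `.txt` file the plugin expects was never written, which suggests the text-extraction step failed before `file_get_contents` ran. A minimal sketch of the kind of check that would surface the underlying error is below, written in Python for illustration. `run_and_read` is a hypothetical helper, not part of the plugin.

```python
import subprocess
from pathlib import Path


def run_and_read(command, output_path):
    """Run a converter command, then read the file it should have produced.

    Raises RuntimeError carrying the tool's stderr if the command fails or
    the expected output file is missing. A missing executable surfaces as
    FileNotFoundError from subprocess.run itself.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    output = Path(output_path)
    if result.returncode != 0 or not output.is_file():
        raise RuntimeError(
            f"{command[0]} exited with {result.returncode}, "
            f"output present: {output.is_file()}, stderr: {result.stderr.strip()}"
        )
    return output.read_text(encoding="utf-8", errors="replace")
```

Called as `run_and_read(["pdf2txt.py", "-o", "out.txt", "in.pdf"], "out.txt")` from an SSH session, with placeholder file names, it would show whether pdf2txt.py is missing, crashing, or simply writing nothing.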
]]>