Exporting Wiki content to PDF
I'm opening this page for a preliminary discussion on how to generate PDFs from Wiki content.
An appropriate implementation of this feature would probably be a /pdf handler allowing a user to generate a PDF from a given page.
(More generally, an /export handler could let the user choose the preferred export format.) There are several GPL-licensed Java and PHP solutions for generating PDFs on the fly.
Some references
- PHP
- FPDF
a PHP class that generates PDF files without using the PDFlib library.
- PC4P
a PDF class for PHP.
- PDF-PHP
Create pdf documents using PHP, but without installing any modules. Comprises a base class which performs all the pdf creation, and an extension class (ezPdf) to allow simple document creation.
- R&OS pdf class
a PHP class which will allow the easy production of simple pdf documents.
- html2pdf
a PHP package that converts web pages to pdf, including css and images.
- Java
- FOP (Formatting Objects Processor)
(GPL) is the world's first print formatter driven by XSL formatting objects (XSL-FO) and the world's first output independent formatter. It is a Java application that reads a formatting object (FO) tree and renders the resulting pages to a specified output. Output formats currently supported include PDF, PCL, PS, SVG, XML (area tree representation), Print, AWT, MIF and TXT. The primary output target is PDF.
- XWiki PDF Export
An implementation of FOP for XWiki.
- Mike, I tried the R&OS pdf class and it is fairly easy to set up. The example in the readme file for database tables works straight out of the box. Just change the database settings and the SQL used to query the database and you're almost there. (A SQL query like select tag,body from main_wikka_pages where tag="HomePage" and latest="y"; gives you output like this.) Of course, it shows that we need to handle the hyperlink formatting too. --JamesMcl
FPDF
Hi James. Any progress on generating a PDF file output from wikka? -- Mike Bowen (GmBowen)
- Sorry Mike, no progress as yet as I've got other things to do at the moment. My lack of programming skills doesn't help, although I have had success using the class without wikka. It shouldn't be too difficult though. Basically you have a file which includes the FPDF class and queries the database. The problem is restricting the results to the latest page and presumably reversing the wikka markup to show bold etc. I could e-mail you an example file if you like, but I'd need your e-mail address.
- Have a look at this example Mike.
As you can see, it picks up the page name and content from the database all right. The script needs a little refinement to format the page layout correctly and, as I said earlier, to restrict the result set from the SQL query. The code is below.
<?php
define('FPDF_FONTPATH','font/');
require('mysql_table.php');

class PDF extends PDF_MySQL_Table
{
    function Header()
    {
        //Title
        $this->SetFont('Arial','',18);
        $this->Cell(0,6,'Wikka Output To PDF',0,1,'C');
        $this->Ln(10);
        //Ensure table header is output
        parent::Header();
    }
}

//Connect to database
mysql_connect('localhost','user','password');
mysql_select_db('dbname');

$pdf=new PDF();
$pdf->Open();
$pdf->AddPage();
//First table: put all columns automatically
$pdf->Table('select tag,body from main_wikka_pages');
$pdf->AddPage();
//Second table: specify 2 columns
$pdf->AddCol('tag',50,'tag','C');
$pdf->AddCol('body',50,'body','C');
$prop=array('HeaderColor'=>array(255,150,100),
            'color1'=>array(210,245,255),
            'color2'=>array(255,255,210),
            'padding'=>2);
//$pdf->Table('select tag,body from main_wikka_pages order by tag limit 0,10',$prop);
//File name and destination must be quoted strings
$pdf->Output('wikka','I');
?>
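The remaining problem James mentions, restricting the result set to the latest revision of a single page, could be handled by building the query in a small helper. The following is only a sketch: the function name latest_page_query and the $prefix parameter are hypothetical, and it assumes the tag, body and latest columns quoted on this page. It also assumes page tags are plain alphanumeric names, so anything else is rejected rather than escaped.

```php
<?php
// Hypothetical helper: builds the query for the latest revision of one page.
// Assumes the tag/body/latest columns quoted on this page; $prefix is the
// table prefix of your installation (e.g. 'main_wikka_').
function latest_page_query($prefix, $tag)
{
    // Accept only plain alphanumeric names so nothing needs escaping
    // (an assumption; relax the patterns if your names differ).
    if (!preg_match('/\A[A-Za-z0-9_]+\z/', $prefix)
        || !preg_match('/\A[A-Za-z0-9]+\z/', $tag)) {
        throw new InvalidArgumentException('Unexpected table prefix or page tag');
    }
    return "SELECT tag, body FROM {$prefix}pages"
         . " WHERE tag = '{$tag}' AND latest = 'y' LIMIT 1";
}

echo latest_page_query('main_wikka_', 'HomePage'), "\n";
// prints: SELECT tag, body FROM main_wikka_pages WHERE tag = 'HomePage' AND latest = 'y' LIMIT 1
```

The returned string could then be passed to the Table() call in the script above in place of the unrestricted query.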
Thanks James. I've got an undergrad programmer hired for a few weeks in May....and I found the pdf generating code for wikini as well (which is now only in a google cache).....so I'm probably going to have him try and get pdf output working (using a handler). I found a "footnoting/endnoting" action for wikini too, so I'll get him working on that also. When either are working I'll post them up (um, I hope this is okay and you don't feel like your efforts are being hi-jacked.....some people are really touchy about such things, but I really do need PDF output from wikka.). Cheers. --GmBowen
- Mike, have a look at FPDF at http://www.fpdf.org. Download the class and one of the database examples from the examples page. I'm sure your undergrad will have it working in no time. For what it's worth, I would like to place a link on each page or footer that allowed for the content to be saved as a pdf file. A further option to e-mail the page to another wikka user would also be good. I hope to hear about your success with this, soon. Cheers Jim.
Concept
- make a concept for ExportToPdf
- I sure hope somebody grabbed this code as it's not there any more. If you have it, please let me know. I have a php coder working for me the next 2 months who may be able to adapt it for wikka. --GmBowen
- Looks like there is no code yet - as far as I can deduce from the wiki2pdf page the project is still in the conceptual stage? http://www.thierrybazzanella.com/ only redirects to another (Flash-only) site. --JavaWoman
- I've looked at the page but forgot to grab the code (I vaguely remember there being code on the page) --NilsLindenberg
R & OS pdf class
- Mike, I just tried the R & OS pdf class. It works well straight out of the box. Change the database settings and the SQL query to something like select tag,body from wikka_pages where tag="HomePage" and latest="y" and you get this. Obviously, the hyperlink formatting will need to be addressed too. --JamesMcl
- Having a second look, some of the text seems to be missing from the page. I don't know why yet; it may be something to do with the page layout. --JamesMcl
- Maybe it's expecting an HTML document? It looks like lots of text is just cut off - but it's unformatted wiki source, so there really is no "page layout" - just a stream of plain text. I'd try running the wiki page through the formatter before feeding it to the PDF converter. BTW, what does "R & OS" stand for? --JavaWoman
- JavaWoman, I am not sure how you run a wiki page through the formatter. I don't know what "R & OS" stands for, as the site that the class is on, http://www.ros.co.nz/, doesn't say on the web page. By the way DarTar, the PDF-PHP class is the same as the "R & OS" class. One is on the SourceForge site, the other on http://www.ros.co.nz/ --JamesMcl
- Looking at the readme file on the http://www.ros.co.nz/ site, it seems it does not expect any HTML at all; you need to do all the formatting yourself from raw data. Not so easy to do if what you need is a PDF equivalent of a rendered HTML page... Looks like FPDF is easier to implement. That said, you can run a wiki page (or any bit of code) through the formatter by just passing the code as a parameter to the formatter:
$output = $this->Format($input); --JavaWoman
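Running the page through the formatter yields HTML, which a class that only draws text would still print tags and all. One way to bridge the two is to flatten the formatter output to plain text first. This is a rough sketch only: the function name html_to_pdf_text is hypothetical, it assumes the formatter returns ordinary HTML, and it throws away bold, links and other styling rather than translating them.

```php
<?php
// Hypothetical helper: flattens formatter output (HTML) to plain text that
// could be handed to a text-drawing call such as FPDF's MultiCell().
function html_to_pdf_text($html)
{
    // Turn line breaks and block ends into newlines before tags are removed.
    $text = preg_replace('#<br\s*/?>#i', "\n", $html);
    $text = preg_replace('#</(p|div|h[1-6]|li)>#i', "\n", $text);
    // Drop the remaining tags, then decode entities such as &amp;.
    $text = strip_tags($text);
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of spaces/tabs and of blank lines.
    $text = preg_replace('/[ \t]+/', ' ', $text);
    $text = preg_replace('/\n{3,}/', "\n\n", $text);
    return trim($text);
}

// Inside a Wikka handler the input would come from $this->Format($input);
// a literal string is used here so the sketch runs on its own.
echo html_to_pdf_text('<h2>Home</h2><p>Some <strong>bold</strong> &amp; plain text.<br />Next line</p>'), "\n";
```

Keeping the styling would mean mapping individual tags to font changes instead of stripping them, which is a larger job than this sketch attempts.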
Html2pdf
I think a better route to investigate would be the html2pdf package. It seems to handle CSS and images pretty well. It uses Ghostscript as a back end to convert the page to PostScript and then to PDF. The downside is that it seems to be quite CPU-intensive while generating the PDF. You can see a demo of it at http://pdf.shroom.net. I envision this as an action that can be added to a float box, with the PDF returned when a user clicks on a PDF icon or something along those lines.
I also think that some code should be added to cache a generated pdf, and serve the cached pdf if the page has not changed since the pdf was generated.
--AdamCrews
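The caching idea above boils down to one comparison: serve the stored PDF only if it was written after the page was last edited. A minimal sketch of that check, assuming the cached PDF lives in a file and the page's last edit time is available as a Unix timestamp; the function name cached_pdf_is_fresh and both parameters are hypothetical:

```php
<?php
// Hypothetical helper: decides whether a cached PDF can be served.
// $cacheFile is where the generated PDF would be stored; $pageModified is
// the Unix timestamp of the page's last edit.
function cached_pdf_is_fresh($cacheFile, $pageModified)
{
    // No cache file yet: the PDF has to be generated.
    if (!is_file($cacheFile)) {
        return false;
    }
    // Fresh only if the PDF was written at or after the last page edit.
    return filemtime($cacheFile) >= $pageModified;
}

// Demonstration with a temporary file standing in for a cached PDF.
$cacheFile = tempnam(sys_get_temp_dir(), 'pdf');
file_put_contents($cacheFile, '%PDF-1.3 placeholder');
touch($cacheFile, 1000);  // pretend the PDF was generated at t=1000

var_dump(cached_pdf_is_fresh($cacheFile, 900));   // page edited before: bool(true)
var_dump(cached_pdf_is_fresh($cacheFile, 1100));  // page edited after: bool(false)
unlink($cacheFile);
```

A handler would call this before doing the CPU-intensive conversion and regenerate the file only when it returns false.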
CategoryDevelopmentHandlers CategoryDevelopmentArchitecture