Sphene Community Tools

Community

Copyright © 2007-2018 by Herbert Poul

PDF Generation

I thought this would be a cool feature to add onto the wiki I've implemented so I defined the SPH_SETTINGS variable in my settings.py file like so:

import os

SPH_SETTINGS = {
    'wiki_pdf_generation': True,
    # fop is on the system PATH, so no leading directory is needed here.
    'wiki_pdf_generation_command': (
        'fop -xml %(srcfile)s -xsl '
        + os.path.join(SCT_ROOT_PATH, 'sphenecoll', 'sphene', 'contrib',
                       'misc', 'xsl', 'xhtml2fo.xsl')
        + ' -pdf %(destfile)s '
    ),
    'wiki_pdf_generation_cachedir': PATH + '/static/pdfs/',
}


The settings are being picked up, because I can now see the Download PDF button. They are copied from the PDF generation documentation, except that I shortened the command to just 'fop ...', dropping the leading directory path, because I've added the fop directory to my PATH variable. However, when I try to use the button, I get the following somewhat unhelpful debug trace:

Environment:

Request Method: GET
Request URL: http://<ip_address/../project/18/wiki/pdf/Start/
Django Version: 1.0-final-SVN-8450
Python Version: 2.5.1
Installed Applications:
['django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.sites',
 'iaproto.ideaarcade',
 'iaproto.registration',
 'voting',
 'django_evolution',
 'sphene.community',
 'sphene.sphboard',
 'sphene.sphwiki']
Installed Middleware:
('sphene.community.middleware.ThreadLocals',
 'sphene.community.middleware.GroupMiddleware',
 'sphene.community.middleware.PermissionDeniedMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.middleware.doc.XViewMiddleware')


Traceback:
File "c:\python25\lib\site-packages\django\core\handlers\base.py" in get_response
  86.                 response = callback(request, *callback_args, **callback_kwargs)
File "C:\Python25\Lib\site-packages\sphene\communitytools\sphenecoll\sphene\sphwiki\views.py" in generatePDF
  112.         raise e

Exception Type: Exception at /../project/18/wiki/pdf/Start/
Exception Value: Error while generating PDF.


So I figured it was probably failing when it tried to run the fop command. I looked in the cache directory I defined and found a generated xhtml file, Start_5.pdf.xhtml. I ran fop by hand on this file with the xsl file at sphene/contrib/misc/xsl/xhtml2fo.xsl, and it gave an error when attempting to generate the pdf:

(command line input)

>fop -xml \static\pdfs\Start_5.pdf.xhtml -xsl C:\Python25\Lib\site-packages\sphene\communitytools\sphenecoll\sphene\contrib\misc\xsl\xhtml2fo.xsl -pdf \static\pdfs\test.pdf

(command line output)
Aug 25, 2009 4:06:40 PM org.apache.fop.cli.Main startFOP
SEVERE: Exception
javax.xml.transform.TransformerException: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
        at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:217)
        at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:125)
        at org.apache.fop.cli.Main.startFOP(Main.java:166)
        at org.apache.fop.cli.Main.main(Main.java:197)

---------

javax.xml.transform.TransformerException: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
        at org.apache.xalan.transformer.TransformerImpl.fatalError(TransformerImpl.java:782)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:756)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1284)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1262)
        at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:214)
        at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:125)
        at org.apache.fop.cli.Main.startFOP(Main.java:166)
        at org.apache.fop.cli.Main.main(Main.java:197)
Caused by: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1152)
        at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
        at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
        at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
        at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at org.apache.xml.dtm.ref.DTMManagerDefault.getDTM(DTMManagerDefault.java:437)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:699)
        ... 6 more
---------
java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1152)
        at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
        at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
        at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
        at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at org.apache.xml.dtm.ref.DTMManagerDefault.getDTM(DTMManagerDefault.java:437)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:699)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1284)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1262)
        at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:214)
        at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:125)
        at org.apache.fop.cli.Main.startFOP(Main.java:166)
        at org.apache.fop.cli.Main.main(Main.java:197)



Is there a particular version of FOP we need to be using? I simply downloaded the bin file here: http://mirror.uoregon.edu/apache/xmlgraphics/fop/binaries/

I downloaded the version titled: fop-0.95-bin.zip

After a bit of research, it appears that w3.org now blocks automated requests for these DTD files, and FOP (or the XML parser reading the xhtml file) tries to download the DTD named in the file anyway. Is there a way around this, or perhaps other software to use instead?


--- Last Edited by Magus Bond at 2009-08-25 21:22:25 ---

--- Last Edited by Magus Bond at 2009-08-25 21:40:11 ---
yeah, i encountered a similar problem..
w3.org simply does not allow downloading of the .dtd any longer - probably depending on the user agent it simply rejects any requests coming from known automatic sources. maybe one could reconfigure FOP to use another user agent or something like it.
unfortunately i haven't had enough time yet to investigate it any further.
Hey, we have Signatures !!! Great, isn't it ? ;)
I've tried using these external dtd files which I found here: http://xmlgraphics.apache.org/fop/fo.html#fo-validate by adding the link in the xhtml file and replacing the w3 dtd file. However I get the following error:

Aug 25, 2009 5:14:58 PM org.apache.fop.cli.Main startFOP
SEVERE: Exception
javax.xml.transform.TransformerException: The markup declarations contained or pointed to by the document type declaration must be well-formed.
        at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:217)
        at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:125)
        at org.apache.fop.cli.Main.startFOP(Main.java:166)
        at org.apache.fop.cli.Main.main(Main.java:197)

---------

; SystemID: http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/foschema/fop.xsd?view=co; Line#: 19; Column#: 2
javax.xml.transform.TransformerException: The markup declarations contained or pointed to by the document type declaration must be well-formed.
        at org.apache.xalan.transformer.TransformerImpl.fatalError(TransformerImpl.java:780)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:756)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1284)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1262)
        at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:214)
        at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:125)
        at org.apache.fop.cli.Main.startFOP(Main.java:166)
        at org.apache.fop.cli.Main.main(Main.java:197)


Any idea why it says the markup is not well-formed? Or is it simply because the file it's validating against isn't the right dtd file? I've also tried converting the xhtml file with other programs such as html2fo, but the generated pdf cannot be opened in Adobe; it says the file is not a supported format or was damaged.

I think the easiest solution would be to fetch the dtd file from w3 manually, host it on my own machine, and trick the request into pointing back there. I have no idea how I'd do this though. It's a cheap fix, but I think it may work.

Perhaps the easiest solution could be for you to serve the file here? I'm sure once the file is accessible the pdf generation would work.
I've fixed it by simply hosting the files on my own server and editing the sphwiki/models.py file to correctly reference this location when creating the header of the xhtml file.
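
For anyone attempting the same workaround, here is a minimal sketch of the idea in Python 3 (the thread itself uses Python 2.5). It is not the actual change made to sphwiki/models.py. The local URL and the function name use_local_dtd are assumptions made up for illustration. Note that xhtml1-strict.dtd itself refers to three entity files (xhtml-lat1.ent, xhtml-symbol.ent, xhtml-special.ent) by relative path, so those probably need to be hosted next to it.

```python
import re

# The system identifier that FOP's XML parser tries to download.
W3_DTD_URL = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"

# Hypothetical address of a locally hosted copy of the DTD (an assumption).
LOCAL_DTD_URL = "http://localhost/static/dtd/xhtml1-strict.dtd"

DOCTYPE = re.compile(r"<!DOCTYPE[^>]*>")


def use_local_dtd(xhtml, local_url=LOCAL_DTD_URL):
    """Point the DOCTYPE's system identifier at local_url instead of w3.org."""

    def swap(match):
        return match.group(0).replace(W3_DTD_URL, local_url)

    # Only the first DOCTYPE declaration is rewritten; the body is untouched.
    return DOCTYPE.sub(swap, xhtml, count=1)


sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Start</p></body></html>'
)
print(use_local_dtd(sample))
```
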

However, the generation button is still broken. It generates the xhtml file, but does not generate the pdf afterward. I have captured the command it is attempting to run, and it appears to run fine when I run it in a python shell with os.system(command).

From within the code, though, it raises Exception( 'Error while generating PDF.' ) instead of producing the pdf after the xhtml. So the command works from the command line but not when the view calls it. If you have an idea what could be causing this, I could get it working.
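
One way to narrow this down is to run the expanded command from the same process that serves the site and look at its exit code and output. A web server process can have a different PATH and working directory than an interactive shell, which is one possible (unconfirmed) reason for the difference. The sketch below is Python 3 and is not how sphwiki/views.py runs the command; run_pdf_command is a hypothetical helper, and the echo template only stands in for the real fop command.

```python
import subprocess


def run_pdf_command(template, srcfile, destfile):
    """Expand the %(srcfile)s / %(destfile)s template, run it, and
    return (exit_code, combined stdout and stderr)."""
    command = template % {"srcfile": srcfile, "destfile": destfile}
    result = subprocess.run(
        command,
        shell=True,                  # the setting is a single shell command string
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,    # fop reports its errors on stderr
        universal_newlines=True,
    )
    return result.returncode, result.stdout


# Stand-in template; replace it with the value of 'wiki_pdf_generation_command'.
code, output = run_pdf_command(
    "echo %(srcfile)s %(destfile)s", "Start_5.pdf.xhtml", "Start_5.pdf"
)
print("exit code:", code)
print(output)
```

A non-zero exit code or a "command not found" style message in the output would suggest that the environment of the server process, rather than the command itself, is the problem.
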

Also, links within the generated PDF files are relative and do not work, e.g. /static/images/file.
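
A possible workaround for the relative links is to rewrite them to absolute URLs in the xhtml before it is converted. The following is only a sketch in Python 3 using a regular expression rather than a real XML parser; absolutize_links and the example.com base URL are assumptions for illustration, not part of SCT.

```python
import re
from urllib.parse import urljoin

# Hypothetical public address of the wiki page being exported (an assumption).
BASE_URL = "http://example.com/project/18/wiki/"

LINK_ATTR = re.compile(r'\b(href|src)="([^"]*)"')


def absolutize_links(xhtml, base_url=BASE_URL):
    """Rewrite relative href/src values so they resolve outside the site."""

    def fix(match):
        name, value = match.group(1), match.group(2)
        if value.startswith("#"):
            # Leave in-document anchors alone.
            return match.group(0)
        return '%s="%s"' % (name, urljoin(base_url, value))

    return LINK_ATTR.sub(fix, xhtml)


print(absolutize_links('<img src="/static/images/file" />'))
```
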
fyi: theoretically the latest (trunk) version of fop supports the option '-catalog' which should use a local version of the xhtml dtd ...
unfortunately .. i was not able to use it .. i always got strange IndexOutOfBound exceptions .. anyway .. i have now played around with http://www.xhtml2pdf.com/ which produces quite good results IMHO (see attachment)

the command i have used was:

SPH_SETTINGS['wiki_pdf_generation_command'] = '''cat %(srcfile)s | perl -n -e 's#<h1>#<h1 class="first"># and $h1 = 1 unless defined $h1 ; s#<a href="(.*?)">(.+?)</a>#<a href="\\1">\\2</a> <span class="linktext">[\\1]</span>#g ; print' | xhtml2pdf --css=/var/tmp/sct_pdf/pisa.css - %(destfile)s'''


i have also attached the .css which produces the PDF output. (it is a modified version of the default xhtml2pdf CSS)
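
For readers who would rather not depend on perl, the filter step in the command above could be sketched in Python 3 roughly as follows. This is an illustration of what the one-liner does, not code from SCT, and prepare_for_pdf is a made-up name: it adds class="first" to the first <h1> only, and appends the target of every link in brackets so the URL is still visible on paper.

```python
import re

# Same pattern as in the perl one-liner: a link whose tag ends right after href.
LINK = re.compile(r'<a href="(.*?)">(.+?)</a>')


def prepare_for_pdf(xhtml):
    """Mimic the perl filter that runs before xhtml2pdf."""
    # Only the first <h1> in the whole document gets the extra class.
    xhtml = xhtml.replace("<h1>", '<h1 class="first">', 1)
    # Repeat every link target as visible text after the link itself.
    return LINK.sub(
        r'<a href="\1">\2</a> <span class="linktext">[\1]</span>', xhtml
    )


page = '<h1>Start</h1>\n<h1>Two</h1>\n<p><a href="/wiki/Foo">Foo</a></p>'
print(prepare_for_pdf(page))
```
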