[Athen] Creating an accessible PDF from a web page

Deborah Armstrong armstrongdeborah at fhda.edu
Wed Jun 30 13:22:28 PDT 2021


Thanks to you both. Very helpful.



From: athen-list <athen-list-bounces at mailman12.u.washington.edu> On Behalf Of Joseph Polizzotto MA
Sent: Thursday, June 24, 2021 11:28 AM
To: Access Technology Higher Education Network <athen-list at u.washington.edu>
Subject: Re: [Athen] Creating an accessible PDF from a web page

Hi Debee:

If you are more comfortable using the command line, you can download the HTML from the URL (e.g., you can type google-chrome --headless --dump-dom 'URL' > ./PATH/TO/FILE.html), convert it to Markdown text, and then use free utilities to convert that (accessible Markdown) TXT version to PDF.

You could chain together these commands into a script to make it a simple process. For instance, some possible tools in the script could be HTML2Text <https://pypi.org/project/html2text/>, Pandoc <https://pandoc.org/>, or OfficeToPDF <https://github.com/cognidox/OfficeToPDF>.

Of course, with OfficeToPDF there may still be things that you need to check in Acrobat, but the images + alt text, lists, and headings will all be there...

HTML2Text is a great tool in that it would allow you to optionally exclude things that you might not need or want from the web page. For instance, if the images are not so important to the user who just needs alt text, you can use the --images-to-alt option.
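
A minimal sketch of how these commands might be chained into one script follows. It makes several assumptions that are not stated in this thread: the order of the steps (HTML to Markdown with html2text, Markdown to DOCX with Pandoc, DOCX to PDF with OfficeToPDF), that all four executables are on the PATH under the names shown (the Chrome binary name differs by platform, and OfficeToPDF needs Windows with Microsoft Office installed), and the helper names build_commands and run_pipeline, which are hypothetical. The resulting PDF should still be checked in Acrobat.

```python
import subprocess


def build_commands(url, stem, images_to_alt=False, chrome="google-chrome"):
    """Return the pipeline as a list of (argv, stdout_path) steps.

    stdout_path names the file that should receive the command's standard
    output, or is None when the command writes its own output file.
    All executable names are assumptions; adjust them for your system.
    """
    html, md, docx, pdf = (f"{stem}.{ext}" for ext in ("html", "md", "docx", "pdf"))

    # html2text (the PyPI package) prints Markdown to standard output.
    to_markdown = ["html2text"]
    if images_to_alt:
        # Replace each image with its alt text instead of an image link.
        to_markdown.append("--images-to-alt")
    to_markdown.append(html)

    return [
        # 1. Dump the rendered DOM of the page to an HTML file.
        ([chrome, "--headless", "--dump-dom", url], html),
        # 2. Convert the HTML to Markdown text.
        (to_markdown, md),
        # 3. Convert the Markdown to a Word document (assumed intermediate step).
        (["pandoc", md, "-o", docx], None),
        # 4. Convert the Word document to PDF (assumed: input file, then output file).
        (["OfficeToPDF", docx, pdf], None),
    ]


def run_pipeline(url, stem, images_to_alt=False):
    """Run each step in order, stopping at the first command that fails."""
    for argv, stdout_path in build_commands(url, stem, images_to_alt):
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w", encoding="utf-8") as out:
                subprocess.run(argv, check=True, stdout=out)
```

For example, run_pipeline("https://example.com/page", "page", images_to_alt=True) would write page.html, page.md, page.docx, and page.pdf to the current directory, provided the tools are installed.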

HTH,

Joseph

On Thu, Jun 24, 2021 at 9:02 AM Karen McCall <K4mccall at outlook.com> wrote:
Whenever you choose “Print > Adobe PDF”, you will create an inaccessible PDF.

You will need Acrobat to create a tagged PDF from a webpage.

Choose File > Create > PDF from Webpage and copy the URL into the dialog.

The dialog has a Settings button where you can check the checkbox to create Bookmarks and another to add PDF tags.

You can also choose how many pages/layers of the website you want to convert to tagged PDF.

Cheers, Karen



From: athen-list <athen-list-bounces at mailman12.u.washington.edu> On Behalf Of Deborah Armstrong
Sent: Thursday, June 24, 2021 11:45 AM
To: Access Technology Higher Education Network <athen-list at u.washington.edu>
Subject: [Athen] Creating an accessible PDF from a web page

When I print a PDF of a web page from Edge or Chrome in Windows 10, I always get an inaccessible PDF.

I haven’t needed to do this for a student yet, but I don’t understand why this happens. The text of the web page is already available to whatever default driver is printing my PDF. I’m using whatever the Windows default is, though I’ve tried other solutions.

Before I need to do this for real, does anyone know how to easily print an accessible PDF from an accessible web page?

--Debee

_______________________________________________
athen-list mailing list
athen-list at mailman12.u.washington.edu
http://mailman12.u.washington.edu/mailman/listinfo/athen-list


--
Alternate Media Supervisor
Disabled Students' Program
University of California, Berkeley
https://dsp.berkeley.edu/
(510) 642-0329
