[Athen] Inline Images and PDF Tags
Philip Kiff
phil at d4k.ca
Fri Apr 30 19:26:21 PDT 2021
I tried a couple things in the files you sent but I couldn't get the tag
order to generate correctly either: not from the Adobe Acrobat Pro DC
generator nor from the built-in Microsoft Word 365 generator.
Just as an FYI, when I used the axesPDF for Word plugin, the order *did*
come out correctly. And I would bet that it would also come out
correctly using the CommonLook Office plugin (though I don't have that
one to test).
The Microsoft built-in PDF generator has been getting better and better
these past few years, but generally speaking, both the axesPDF and
CommonLook products do a much better job producing an accessible,
correctly formatted PDF directly from a well-formatted Word source.
Phil.
Philip Kiff
D4K Communications
On 2021-04-30 21:12, Steve Green wrote:
>
> Welcome to the weird world of accessible PDFs. I see the same issue as
> you, and I have no idea how to fix it in the Word document.
>
> It’s strange that Acrobat does a better job of tagging. In my
> experience, that’s rarely the case. But with accessible PDFs, there’s
> an exception to every rule.
>
> Steve
>
> *From:*athen-list <athen-list-bounces at mailman12.u.washington.edu> *On
> Behalf Of *Joseph Polizzotto MA
> *Sent:* 01 May 2021 00:53
> *To:* Access Technology Higher Education Network
> <athen-list at u.washington.edu>
> *Subject:* Re: [Athen] Inline Images and PDF Tags
>
> Hi Steve:
>
> Thanks for your response and for letting me know about the missing
> image. That image displayed the following screenshot of text in a DOCX
> file:
>
> Boyle's law (where P is the pressure and V is the volume)
>
> Well, that's really interesting and good to know about general issues
> with the Adobe Acrobat Pro plug-in for MS Word. I will keep that in mind.
>
> FWIW, I have tried just using the Save As > PDF route in MS Word to
> circumvent the issue that I described but it does not resolve it.
>
> Using the Save As > PDF method, the tags are created differently from
> the MS Word elements, but the <Figure> tags for the inline images are
> still not in the correct place. With the Save As PDF method they also
> come after the associated <P> tag, but they are not nested within the
> <P> tag, as they are with the Adobe Acrobat Pro conversion method. In
> both cases, remediation would be time-consuming.
>
> What's interesting is that with both conversion methods, if I delete
> the root tag and then add tags to the document in Adobe Acrobat Pro,
> the <Figure> elements are in the right place related to the <P> tag,
> just that the alternative text is missing. It's as if Adobe Acrobat
> Pro's "add tags" feature does a better job of tagging from within
> than Acrobat does when it first ingests the DOCX through the MS Word
> conversion suite.
>
> I am not sure why that is and would even be willing to delete all the
> tags, only to add them back, if it also meant that the alternative
> text would automagically reappear. :-)
>
> Joseph
>
> On Fri, Apr 30, 2021 at 4:13 PM Steve Green
> <steve.green at testpartners.co.uk> wrote:
>
> It’s strange that you should post this message now, because
> literally one minute ago I sent a lengthy email to all our staff
> explaining why they must not use Adobe Acrobat Pro's plug-in for
> MS Word for creating PDFs.
>
> One of the reasons is that it does not seem to recognise the “Mark
> as decorative” checkbox in Word’s Alt Text pane, and it adds
> bizarre Alternate Text such as P1070TB2#y1.
>
> Another issue relates to the use of simple text boxes in Word.
> Although we discourage their use, there are times that you want or
> need to use them. You can put images in the textbox, which can
> potentially cause a problem because you can add Alt Text to both
> the image and the textbox. If you do that, the Acrobat
> Accessibility Checker reports a failure due to nested Alternate
> Text, which is perfectly reasonable – you can’t have Alt Text
> inside other Alt Text.
>
> The “solution” is to mark the text box as decorative and only add
> Alt Text to the image. If you “Save as PDF”, this does exactly
> what you would expect. The image is in the Tags panel with its
> Alternate Text. The text box is artifacted.
>
> You might expect the Adobe Acrobat Pro plug-in to do the same, or
> at least do something intelligent, but it doesn’t do either of
> those. It puts both the image and text box in the Tags panel. As
> discussed above, it adds random Alternate Text to the text box. It
> then deletes the Alt Text you added to the image, and to add
> insult to injury, the Acrobat Accessibility Checker fails because
> of the missing Alternate Text!
>
> I have also noticed the sort of issues you reported.
>
> For the time being, my recommendation is to use Word’s “Save as
> PDF” feature instead.
>
> BTW, all the images were missing from your email.
>
> Steve Green
>
> Managing Director
>
> Test Partners Ltd
>
> *From:*athen-list <athen-list-bounces at mailman12.u.washington.edu>
> *On Behalf Of *Joseph Polizzotto MA
> *Sent:* 30 April 2021 23:53
> *To:* Access Technology Higher Education Network
> <athen-list at u.washington.edu>
> *Subject:* [Athen] Inline Images and PDF Tags
>
> Hi Everyone:
>
> I encountered a problem when converting an MS Word document with
> inline images using Adobe Acrobat Pro's plug-in for MS Word.
>
> The problem is that the inline images appear in the incorrect
> place within the PDF tags panel. Specifically, the <Figure> tags
> for the inline images are located /after/ the entire <P> tag with
> which they are associated.
>
> Instead of the MS Word paragraph being broken up into separate
> chunks of content within the <P> PDF tag, the paragraph is
> contained as one block of text inside the <P> tag and the inline
> images are represented as <Figure> tags after that block.
>
> For example, in the following snippet of an MS Word document,
> where P and V are inline images in the sentence:
>
> alt=Boyle's law (where P is the pressure and V is the volume)
>
> I find the following structure in the PDF tags panel:
>
> <P>
>
> Boyle's law (where is the pressure and is the volume)
>
> <Figure>
>
> P
>
> <Figure>
>
> V
>
> If I remove the tags and add them back using Adobe Acrobat Pro,
> the <Figure> tags will be in the correct place, in between the
> correct blocks of text, but the alternative text for the images
> will be lost.
>
> This is the desired tag structure:
>
> <P>
>
> Boyle's law (where
>
> <Figure>
>
> P
>
> is the pressure and
>
> <Figure>
>
> V
>
> is the volume)
>
> What have you done to address this issue? Is there a way to avoid
> having to remediate the PDF tags for this issue and get the
> correct tag order for inline images when using Adobe Acrobat Pro's
> plug-in for MS Word?
>
> Note: I am using the MS Word 365 (subscription) version with the
> continuous release version of Adobe Acrobat Pro (2021). I have
> attached the documents as well.
>
> Thanks for your help,
>
> Joseph
>
> --
>
> *Alternate Media Supervisor*
>
> Disabled Students' Program
>
> University of California, Berkeley
>
> https://dsp.berkeley.edu/
>
> (510) 642-0329
>
>
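
For readers following the thread, the two tag orders in Joseph's original message can be modelled in a few lines of Python to show why the order matters to assistive technology. This is only an illustrative sketch: the `Tag` class, the `Document` root, and the `reading_order` helper are hypothetical names invented here, not part of Acrobat, Word, or any PDF library, and the sketch assumes a screen reader announces a <Figure> by its alternate text.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Tag:
    """Hypothetical stand-in for one node in a PDF tag tree."""
    name: str                                    # e.g. "P" or "Figure"
    children: List[Union["Tag", str]] = field(default_factory=list)
    alt: str = ""                                # alternate text, if any


def reading_order(node: Union[Tag, str]) -> List[str]:
    """Walk the tree depth-first and collect what would be read aloud."""
    if isinstance(node, str):
        return [node]
    if node.name == "Figure":
        # Assumption: a figure is announced by its alternate text.
        return [node.alt]
    parts: List[str] = []
    for child in node.children:
        parts.extend(reading_order(child))
    return parts


# Order reported in the thread: the paragraph text is one block and the
# <Figure> tags follow it. (Whether the figures sit inside or outside
# the <P> does not change the reading order shown here.)
observed = Tag("Document", [
    Tag("P", ["Boyle's law (where is the pressure and is the volume)"]),
    Tag("Figure", alt="P"),
    Tag("Figure", alt="V"),
])

# Desired order: each <Figure> sits between the text runs it belongs to.
desired = Tag("P", [
    "Boyle's law (where",
    Tag("Figure", alt="P"),
    "is the pressure and",
    Tag("Figure", alt="V"),
    "is the volume)",
])

print(" ".join(reading_order(observed)))
# Boyle's law (where is the pressure and is the volume) P V
print(" ".join(reading_order(desired)))
# Boyle's law (where P is the pressure and V is the volume)
```

With the trailing <Figure> tags, the sentence is read with gaps and the symbols arrive only after it has finished, which is why the tag order has to be remediated rather than left as generated.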