[Athen] Inline Images and PDF Tags

Joseph Polizzotto MA jpolizzotto at berkeley.edu
Fri Apr 30 16:52:31 PDT 2021


Hi Steve:

Thanks for your response and for letting me know about the missing image.
That image displayed the following screenshot of text in a DOCX file:

Boyle's law (where P is the pressure and V is the volume)

Well, that's really interesting and good to know about general issues with
the Adobe Acrobat Pro plug-in for MS Word. I will keep that in mind.

FWIW, I have tried just using the Save As > PDF route in MS Word to
circumvent the issue that I described but it does not resolve it.

With the Save As > PDF method, the tags are created differently from the MS
Word elements, but the <Figure> tags for the inline images are still not in
the correct place. They still come after the associated <P> tag, but they
are not nested within it, as they are with the Adobe Acrobat Pro conversion
method. In both cases, remediation would be time-consuming.

What's interesting is that with both conversion methods, if I delete the
root tag and then add tags to the document in Adobe Acrobat Pro, the
<Figure> elements are in the right place relative to the <P> tag, except
that the alternative text is missing. It's as if Adobe Acrobat Pro's "add
tags" feature does a better job of tagging from within than when it first
ingests the DOCX through the MS Word conversion suite.

I am not sure why that is, and I would even be willing to delete all the
tags, only to add them back, if it also meant that the alternative text
would automagically reappear. :-)
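
To make the three outcomes described above concrete, here is a minimal
sketch in Python. It does not read a real PDF. The tuple-based tree, the
function names classify_paragraph and figures_missing_alt, and the category
labels are all assumptions invented for illustration, and the "plug-in" tree
reflects one reading of the description above (Figures nested in the <P>
but after its text).

```python
# Hypothetical model of a tag tree: a node is (tag, children); a child is
# either a string (text content) or another node. A Figure's alt text is
# modelled as its string child; an empty child list means no alt text.

def classify_paragraph(siblings, index):
    """Report where the Figure tags for the <P> at siblings[index] ended up."""
    tag, children = siblings[index]
    if tag != "P":
        raise ValueError("expected a <P> node")
    kinds = ["fig" if isinstance(c, tuple) else "text" for c in children]
    if "fig" in kinds:
        # Interleaved means some paragraph text still follows the first Figure.
        first_fig = kinds.index("fig")
        if "text" in kinds[first_fig + 1:]:
            return "interleaved"
        return "nested-after-text"
    # No Figure inside the <P>: see whether one directly follows it.
    following = siblings[index + 1:index + 2]
    if following and following[0][0] == "Figure":
        return "sibling-after-p"
    return "no-figures"


def figures_missing_alt(node):
    """Count Figure nodes under `node` that carry no alt text."""
    tag, children = node
    missing = 1 if tag == "Figure" and not children else 0
    return missing + sum(
        figures_missing_alt(c) for c in children if isinstance(c, tuple)
    )


text = "Boyle's law (where is the pressure and is the volume)"

# Acrobat plug-in, as read from the description: Figures nested, after the text.
plugin = [("P", [text, ("Figure", ["P"]), ("Figure", ["V"])])]

# Save As > PDF: Figures follow the <P> as siblings.
save_as = [("P", [text]), ("Figure", ["P"]), ("Figure", ["V"])]

# Tags deleted and re-added in Acrobat: right place, alt text gone.
retagged = [("P", ["Boyle's law (where ", ("Figure", []),
                   " is the pressure and ", ("Figure", []),
                   " is the volume)"])]

for name, tree in [("plugin", plugin), ("save_as", save_as),
                   ("retagged", retagged)]:
    print(name, classify_paragraph(tree, 0),
          sum(figures_missing_alt(n) for n in tree))
```

Only the "retagged" tree comes out as interleaved, and it is also the only
one where both Figures lack alt text, which is the trade-off described above.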

Joseph


On Fri, Apr 30, 2021 at 4:13 PM Steve Green <steve.green at testpartners.co.uk>
wrote:


> It’s strange that you should post this message now, because literally one
> minute ago I sent a lengthy email to all our staff explaining why they must
> not use Adobe Acrobat Pro's plug-in for MS Word for creating PDFs.
>
> One of the reasons is that it does not seem to recognise the “Mark as
> decorative” checkbox in Word’s Alt Text pane, and it adds bizarre Alternate
> Text such as P1070TB2#y1.
>
> Another issue relates to the use of simple text boxes in Word. Although we
> discourage their use, there are times that you want or need to use them.
> You can put images in the textbox, which can potentially cause a problem
> because you can add Alt Text to both the image and the textbox. If you do
> that, the Acrobat Accessibility Checker reports a failure due to nested
> Alternate Text, which is perfectly reasonable – you can’t have Alt Text
> inside other Alt Text.
>
> The “solution” is to mark the text box as decorative and only add Alt Text
> to the image. If you “Save as PDF”, this does exactly what you would
> expect. The image is in the Tags panel with its Alternate Text. The text
> box is artifacted.
>
> You might expect the Adobe Acrobat Pro plug-in to do the same, or at least
> do something intelligent, but it doesn’t do either of those. It puts both
> the image and text box in the Tags panel. As discussed above, it adds
> random Alternate Text to the text box. It then deletes the Alt Text you
> added to the image, and to add insult to injury, the Acrobat Accessibility
> Checker fails because of the missing Alternate Text!
>
> I have also noticed the sort of issues you reported.
>
> For the time being, my recommendation is to use Word’s “Save as PDF”
> feature instead.
>
> BTW, all the images were missing from your email.
>
> Steve Green
> Managing Director
> Test Partners Ltd
>
> *From:* athen-list <athen-list-bounces at mailman12.u.washington.edu> *On Behalf Of* Joseph Polizzotto MA
> *Sent:* 30 April 2021 23:53
> *To:* Access Technology Higher Education Network <athen-list at u.washington.edu>
> *Subject:* [Athen] Inline Images and PDF Tags
>
> Hi Everyone:
>
> I encountered a problem when converting an MS Word document with inline
> images using Adobe Acrobat Pro's plug-in for MS Word.
>
> The problem is that the inline images appear in the incorrect place within
> the PDF tags panel. Specifically, the <Figure> tags for the inline images
> are located *after* the entire <P> tag with which they are associated.
>
> Instead of the MS Word paragraph being broken up into separate chunks of
> content within the <P> PDF tag, the paragraph is contained as one block of
> text inside the <P> tag and the inline images are represented as <Figure>
> tags after that block.
>
> For example, in the following snippet of an MS Word document, where P and
> V are inline images in the sentence:
>
> alt=Boyle's law (where P is the pressure and V is the volume)
>
> I find the following structure in the PDF tags panel:
>
> <P>
> Boyle's law (where is the pressure and is the volume)
> <Figure>
> P
> <Figure>
> V
>
> If I remove the tags and add them back using Adobe Acrobat Pro, the
> <Figure> tags will be in the correct place, in between the correct blocks
> of text, but the alternative text for the images will be lost.
>
> This is the desired tag structure:
>
> <P>
> Boyle's law (where
> <Figure>
> P
> is the pressure and
> <Figure>
> V
> is the volume
>
> What have you done to address this issue? Is there a way to avoid having
> to remediate the PDF tags for this issue and get the correct tag order for
> inline images when using Adobe Acrobat Pro's plug-in for MS Word?
>
> Note: I am using my MS Word's 365 (subscription) version with the
> continuous release version of Adobe Acrobat Pro (2021). I have attached the
> documents as well.
>
> Thanks for your help,
>
> Joseph
>
> --
> *Alternate Media Supervisor*
> Disabled Students' Program
> University of California, Berkeley
> https://dsp.berkeley.edu/
> (510) 642-0329
>
> _______________________________________________
> athen-list mailing list
> athen-list at mailman12.u.washington.edu
> http://mailman12.u.washington.edu/mailman/listinfo/athen-list


--
*Alternate Media Supervisor*
Disabled Students' Program
University of California, Berkeley
https://dsp.berkeley.edu/
(510) 642-0329