[Athen] Inline Images and PDF Tags

Philip Kiff phil at d4k.ca
Fri Apr 30 19:26:21 PDT 2021


I tried a couple things in the files you sent but I couldn't get the tag
order to generate correctly either: not from the Adobe Acrobat Pro DC
generator nor from the built-in Microsoft Word 365 generator.

Just as an FYI, when I used the axesPDF for Word plugin, the order *did*
come out correctly. And I would bet that it would also come out
correctly using the CommonLook Office plugin (though I don't have that
one to test).

The Microsoft built-in PDF generator has been getting better and better
these past few years, but generally speaking, both the axesPDF and
CommonLook products do a much better job producing an accessible,
correctly formatted PDF directly from a well-formatted Word source.

Phil.

Philip Kiff
D4K Communications

On 2021-04-30 21:12, Steve Green wrote:

>

> Welcome to the weird world of accessible PDFs. I see the same issue as

> you, and I have no idea how to fix it in the Word document.

>

> It’s strange that Acrobat does a better job of tagging. In my

> experience, that’s rarely the case. But with accessible PDFs, there’s

> an exception to every rule.

>

> Steve

>

> *From:* athen-list <athen-list-bounces at mailman12.u.washington.edu> *On

> Behalf Of* Joseph Polizzotto MA

> *Sent:* 01 May 2021 00:53

> *To:* Access Technology Higher Education Network

> <athen-list at u.washington.edu>

> *Subject:* Re: [Athen] Inline Images and PDF Tags

>

> Hi Steve:

>

> Thanks for your response and for letting me know about the missing

> image. That image displayed the following screenshot of text in a DOCX

> file:

>

> Boyle's law (where P is the pressure and V is the volume)

>

> Well, that's really interesting and good to know about general issues

> with the Adobe Acrobat Pro plug-in for MS Word. I will keep that in mind.

>

> FWIW, I have tried just using the Save As > PDF route in MS Word to

> circumvent the issue that I described but it does not resolve it.

>

> Using the Save As > PDF method, there is a difference in how the tags

> are created from the MS Word elements but the <Figure> tags for the

> inline images are still not in the correct place; with the Save As PDF

> method, in fact, they also come after the associated <P> tag but are

> not nested within the <P> tag, which is the case with the Adobe

> Acrobat Pro conversion method. In both cases, remediation would be

> time-consuming.

>

> What's interesting is that with both conversion methods, if I delete

> the root tag and then add tags to the document in Adobe Acrobat Pro,

> the <Figure> elements are in the right place related to the <P> tag,

> just that the alternative text is missing. It's as if the Adobe

> Acrobat Pro's "add tags" feature does a better job of tagging from

> within than when it first ingests the DOCX through the MS Word

> conversion suite.

>

> I am not sure why that is and would be even willing to delete all the

> tags, only to add them back, if it also meant that the alternative

> text would automagically reappear again. :-)

>

> Joseph

>

> On Fri, Apr 30, 2021 at 4:13 PM Steve Green

> <steve.green at testpartners.co.uk> wrote:

>

> It’s strange that you should post this message now, because

> literally one minute ago I sent a lengthy email to all our staff

> explaining why they must not use Adobe Acrobat Pro's plug-in for

> MS Word for creating PDFs.

>

> One of the reasons is that it does not seem to recognise the “Mark

> as decorative” checkbox in Word’s Alt Text pane, and it adds

> bizarre Alternate Text such as P1070TB2#y1.

>

> Another issue relates to the use of simple text boxes in Word.

> Although we discourage their use, there are times that you want or

> need to use them. You can put images in the textbox, which can

> potentially cause a problem because you can add Alt Text to both

> the image and the textbox. If you do that, the Acrobat

> Accessibility Checker reports a failure due to nested Alternate

> Text, which is perfectly reasonable – you can’t have Alt Text

> inside other Alt Text.

>

> The “solution” is to mark the text box as decorative and only add

> Alt Text to the image. If you “Save as PDF”, this does exactly

> what you would expect. The image is in the Tags panel with its

> Alternate Text. The text box is artifacted.

>

> You might expect the Adobe Acrobat Pro plug-in to do the same, or

> at least do something intelligent, but it doesn’t do either of

> those. It puts both the image and text box in the Tags panel. As

> discussed above, it adds random Alternate Text to the text box. It

> then deletes the Alt Text you added to the image, and to add

> insult to injury, the Acrobat Accessibility Checker fails because

> of the missing Alternate Text!

>

> I have also noticed the sort of issues you reported.

>

> For the time being, my recommendation is to use Word’s “Save as

> PDF” feature instead.

>

> BTW, all the images were missing from your email.

>

> Steve Green

>

> Managing Director

>

> Test Partners Ltd

>

> *From:* athen-list <athen-list-bounces at mailman12.u.washington.edu> *On

> Behalf Of* Joseph Polizzotto MA

> *Sent:* 30 April 2021 23:53

> *To:* Access Technology Higher Education Network

> <athen-list at u.washington.edu>

> *Subject:* [Athen] Inline Images and PDF Tags

>

> Hi Everyone:

>

> I encountered a problem when converting an MS Word document with

> inline images using Adobe Acrobat Pro's plug-in for MS Word.

>

> The problem is that the inline images appear in the incorrect

> place within the PDF tags panel. Specifically, the <Figure> tags

> for the inline images are located /after/ the entire <P> tag with

> which they are associated.

>

> Instead of the MS Word paragraph being broken up into separate

> chunks of content within the <P> PDF tag, the paragraph is

> contained as one block of text inside the <P> tag and the inline

> images are represented as <Figure> tags after that block.

>

> For example, in the following snippet of an MS Word document,

> where P and V are inline images in the sentence:

>

> alt=Boyle's law (where P is the pressure and V is the volume)

>

> I find the following structure in the PDF tags panel:

>

> <P>
> Boyle's law (where is the pressure and is the volume)
> <Figure>
> P
> <Figure>
> V

>

> If I remove the tags and add them back using Adobe Acrobat Pro,

> the <Figure> tags will be in the correct place, in between the

> correct blocks of text, but the alternative text for the images

> will be lost.

>

> This is the desired tag structure:

>

> <P>
> Boyle's law (where
> <Figure>
> P
> is the pressure and
> <Figure>
> V
> is the volume

>

> What have you done to address this issue? Is there a way to avoid

> having to remediate the PDF tags for this issue and get the

> correct tag order for inline images when using Adobe Acrobat Pro's

> plug-in for MS Word?

>

> Note: I am using the MS Word 365 (subscription) version with the

> continuous release version of Adobe Acrobat Pro (2021). I have

> attached the documents as well.

>

> Thanks for your help,

>

> Joseph

>

> --

>

> *Alternate Media Supervisor*

>

> Disabled Students' Program

>

> University of California, Berkeley

>

> https://dsp.berkeley.edu/


>

> (510) 642-0329

>

>
