The Right Place at the Right Time: When to Use PDFs
By now, it has become common knowledge that PDFs are a problematic format for displaying web content. PDFs may be helpful in presenting standardized versions of forms, reports, and other documents intended for printed use, but their online utility ends there.
In addition to often being difficult for web crawlers and screen readers to interpret, they take too long to load and interrupt a user’s browsing flow.
Some web developers continue to rely on PDFs as a way to communicate information to their users, rather than creating an HTML version of the PDF content. While this approach is a quick and dirty fix, it may end up alienating users.
Organizations often post content via PDF that is more appropriate for HTML. For example, some organizations post a PDF on their website about the proper way to complete or submit paperwork. Instead, an HTML version of the instructions would fit easily into the visual framework of the site and meet the user’s information needs without requiring the hassle of a download. There is no imminent need to print the instructions when you can read them from your computer, so the PDF is rendered unnecessary.
While PDFs are helpful for standardizing document views or creating printed deliverables, they are often not appropriate for displaying web content.
Focus on the User
Government-run websites commonly use PDFs online to inform users about anything from registering to vote, to renewing a driver license, locating a state park, learning about a retirement plan, understanding zoning laws for a neighborhood, finding certified mechanics for a car inspection, or viewing new laws about safety belts.
The reason government agencies rely so heavily on PDFs is simple: the PDF already exists for internal business use, so why not put it online? Doing so saves the time it would take to create a new HTML page for the information, and it saves money because there is no need to pay a developer to code the HTML content.
As a result, some government websites are virtual archives; the website becomes the keeper of informational brochures that probably no longer exist in hard copy. Although the internal team may be accessing information found in PDFs, it’s not always likely that users are as well.
Relying on PDFs doesn’t only hurt government agencies from a usability and accessibility standpoint; it also widens the gap between citizens and government. The average user already feels alienated by government websites due to the use of institutionalized language, the commonly sterile visual design, and a general mistrust of government.
Although removing PDFs from a website sounds technical and non-emotional, making information easily accessible to users conveys a level of transparency that fosters trust and respect. Rather than using a PDF created for internal use to present information to the public, creating an HTML version demonstrates a commitment to the user’s experience and their search for information.
More than One Way to Fix a PDF
While there are several effective ways of converting PDFs to HTML automatically, these automated services lack the finesse of a human touch. Adobe, Google, Dynamic PDF and countless other companies boast the ability to pull PDF content into HTML.
However, the process is often ad-hoc: images are separated from text, columns are not always aligned, and foreign characters are not recognized. Additionally, although the pages are readable in HTML, they are still visually rendered for printing rather than web publishing.
For example, the difference between the PDF version of the document below and its HTML counterpart is visually indiscernible. In this case, the visual transfer has been achieved perfectly. However, creating an HTML document that mirrors a PDF does not necessarily make the content more appropriate for online readers.
Transferring the information to HTML makes it accessible for a web crawler or screen reader and removes the hurdle of downloading the PDF for the user. However, both versions are written in a four-column newspaper layout, which is not compatible with web reading. The use of headings is limited, there is little white space on the page, and when viewed in its full size, the bottom of each column disappears below the fold. To make information truly accessible, it needs to be presented in a digestible way.
Regardless of the software programs that exist to ease the transfer of PDF to HTML, nothing can replace the human touch. The problem comes in finding the human, financial, and technological resources necessary to make a smooth transition. Considering resources are especially strapped in government agencies, the transfer of PDFs becomes an even larger challenge.
The most effective way to make the leap from PDF to HTML is to prioritize the information that is most needed online, convert those PDFs into text or Word documents, and revise the written content to align with best practices for web writing.
Most government agencies that post informational PDFs on their website don’t stop at only one or two; PDF usage is generally widespread, spanning hundreds of links to various brochures, guidelines, instructions, and reports. The very thought of converting such a massive amount of information into HTML is daunting. However, the process doesn’t have to be done overnight, and it doesn’t have to tie up unreasonable resources.
Prioritizing the content provides a structured and need-driven system to begin the process. Examining site metrics for the most commonly downloaded PDFs makes it clear which information should be transferred to HTML. If the most used content consists of forms that require printing, manual completion, and postage, then the PDF is working as a useful printable tool.
However, if the most common downloads are guidelines for completing a form or a brochure explaining a certain project or service, it makes more sense to render that information in an HTML format.
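As a rough illustration of this prioritization step, the sketch below counts PDF requests and lists the most downloaded files. It assumes the site metrics can be exported as a simple list of requested paths; the function name `rank_pdf_downloads` and the sample paths are hypothetical, and a real analytics export will likely need its own parsing.

```python
from collections import Counter


def rank_pdf_downloads(requested_paths, top_n=10):
    """Return the top_n most requested PDF paths as (path, count) pairs.

    Assumes requested_paths is an iterable of URL paths taken from site
    metrics. Query strings are ignored so that tracking parameters do not
    split the counts for one file.
    """
    counts = Counter()
    for path in requested_paths:
        clean_path = path.split("?", 1)[0]
        if clean_path.lower().endswith(".pdf"):
            counts[clean_path] += 1
    return counts.most_common(top_n)


# Hypothetical sample of requested paths, for illustration only
sample_paths = [
    "/forms/voter-registration.pdf",
    "/guides/renewal-instructions.pdf",
    "/guides/renewal-instructions.pdf?src=home",
    "/index.html",
    "/guides/renewal-instructions.pdf",
    "/forms/voter-registration.pdf",
    "/brochures/state-parks.pdf",
]

for pdf_path, downloads in rank_pdf_downloads(sample_paths, top_n=3):
    print(f"{downloads:>3}  {pdf_path}")
```

A ranked list like this only shows what is downloaded most. Deciding whether each file is a printable form that should stay a PDF, or a set of instructions better suited to HTML, still takes a human reviewer.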
Once the top PDFs have been identified, the time comes to do a quick file conversion. Simply save the PDF to a machine and select “Save As Text” from the File drop-down menu. The text version that appears can be used for creating your HTML page. The information can also be saved in a Word document for more extensive revision, including the addition of headers, shorter sentences, and a limited use of columns.
While the HTML coding will need to be included, along with images and branding attributes, most of the heavy lifting is done.
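For a sense of how little coding the first pass requires, here is a minimal sketch that wraps saved text in HTML paragraph tags. It assumes the text export separates paragraphs with blank lines, which is not guaranteed for every PDF; the function name `text_to_html` is hypothetical, and headings, images, and branding would still be added by hand.

```python
import re
from html import escape


def text_to_html(text):
    """Wrap each blank-line-separated block of plain text in a <p> tag.

    Assumes paragraphs in the exported text are separated by blank lines.
    Line breaks inside a paragraph are joined with spaces, and special
    characters such as & and < are escaped for safe display.
    """
    blocks = re.split(r"\n\s*\n", text)
    paragraphs = []
    for block in blocks:
        joined = " ".join(line.strip() for line in block.splitlines() if line.strip())
        if joined:
            paragraphs.append(f"<p>{escape(joined)}</p>")
    return "\n".join(paragraphs)


# Hypothetical text saved from a PDF, for illustration only
exported_text = "How to renew\nyour license\n\nFees & forms"
print(text_to_html(exported_text))
```

The output is a starting point for revision, not a finished page: the paragraphs still carry the length and wording of the printed original.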
Creating information that is easily readable online is where automated HTML generators fall short. In addition to layout and context, the language used in most printed deliverables is much different than best practices for web writing. Simply making the information of a PDF recognizable by a web crawler doesn’t mean it will be recognized by a web reader.
By double-checking for effective use of headers, short sentences, and an overall reduction in content, it’s possible to create an HTML page that will speak to the reader. Rather than lying buried within a dense screen of text, the information will flow easily to the user.
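Part of this review can be supported with a simple check. The sketch below flags sentences that exceed a word limit so an editor knows where to start trimming. The 25-word threshold and the function name `find_long_sentences` are assumptions made for illustration, not an established standard, and the sentence splitting is deliberately naive.

```python
import re


def find_long_sentences(text, max_words=25):
    """Return the sentences in text that contain more than max_words words.

    Sentences are split naively after ., !, or ? followed by whitespace,
    so abbreviations such as "e.g." may cause extra splits. The default
    limit of 25 words is an illustrative assumption.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s.split()) > max_words]


# Hypothetical draft text, for illustration only
draft = (
    "Renew online. "
    "To renew a license by mail, applicants must first obtain the renewal "
    "form from the nearest office, complete every section in black ink, "
    "attach a copy of their current license, and send the packet with payment "
    "to the address printed on the back of the form."
)

for sentence in find_long_sentences(draft):
    print(f"{len(sentence.split())} words: {sentence[:60]}...")
```

A check like this only points to candidates for revision; deciding how to split or shorten each sentence remains an editorial judgment.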
Making public information as accessible and transparent as possible not only improves the overall experience for the user but also improves trust in an organization. As easy as it is to overlook the usability barriers presented by PDFs, fixing this problem presents an opportunity to build upon the organization’s relationship with its users, especially where government agencies are concerned.