Don't overthink file naming conventions

This is a quick post on the basic file naming conventions we worked out as part of this project. At a recent program-wide meeting with all our office managers to discuss the details of the project, one of the memos we distributed said the following:

If you take a look at the files in your advocate staff directories, you are likely to see individualized albeit typical patterns in the file names. There is a discernible Darwinism to the conventions individual users adopt: They set up directory structures and name files in a way that makes the files “findable” for them later, if not for others. (OK, basically “unfindable” by others, in a lot of instances.) In almost all instances, common naming patterns include at least a generic name describing the type of document (e.g., petition or complaint or writ), plus other descriptive elements that help the user locate it later, such as a client or project name, the date of the document, and/or whether the file is a draft, a final, or a version copy.

There are any number of ways one can go with file naming conventions, as well illustrated in the article at CompuJurist, Are there any recognized “best” practices for file naming conventions? Akin to what is discussed in that article, we’ve adopted the following template for use by advocates to name their files:

[draft/final]  [document type]  [party/case]  [subject]  [date] [file extension]
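
As a rough illustration of how the template above could be applied, here is a minimal Python sketch. The function name `build_filename`, the compact `YYYYMMDD` date format, and the sample party and subject are all assumptions made for the example; the memo does not prescribe any of them. The fields are joined with underscores rather than spaces.

```python
import re
from datetime import date


def build_filename(status, doc_type, party, subject, doc_date, extension):
    """Join the template fields with underscores (hypothetical helper)."""
    if status not in ("draft", "final"):
        raise ValueError("status must be 'draft' or 'final'")
    # Assumption: dates are written as YYYYMMDD; the template only says [date].
    fields = [status, doc_type, party, subject, doc_date.strftime("%Y%m%d")]
    # Replace any run of whitespace inside a field with a single underscore.
    cleaned = [re.sub(r"\s+", "_", field.strip()) for field in fields]
    return "_".join(cleaned) + "." + extension.lstrip(".")


# Made-up example values, not a real case.
name = build_filename(
    "draft", "petition", "Smith v Jones", "custody", date(2008, 3, 14), "wpd"
)
print(name)  # draft_petition_Smith_v_Jones_custody_20080314.wpd
```

Nobody needs a script to name a file, of course; the sketch is only meant to make the order of the fields concrete.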

Yes, there are other longstanding concerns about how files are named, beyond what may concern your attorneys and other advocates. These include naming conventions driven by the demands of particular operating systems, technical project requirements that are all too often non-intuitive, and the recommendations of 800-pound gorillas like Google (which favors dashes over underscores).

All that said, if you look at the memo linked above, you may notice that the examples promote the use of underscores, rather than spaces, between words in the filenames.

Our thinking is this:

  • As a practical matter, using spaces between words in file names creates file transfer problems when moving files from one server to another. Not a good thing. Especially when you are relocating files that number in the hundreds of thousands.
  • Using spaces creates readability problems when viewing the path of the file in a GSA search result, because the GSA normalizes the URL by replacing every space in a file or directory name with the characters %20. Even if you didn’t know what to call it, you’ve seen this phenomenon. Here’s a real world example in a GSA search result from one of our test bed sites:

    /Ukiah%20Office/Former%20Staff/Kan's%20Transfer%20File/letter%20to%20jake.wpd

    Look familiar?
  • File names with underscores are easier to read than file names with dashes. They just are, OK? To be fair, not everyone is going to agree with that proposition. But we did some admittedly unscientific user testing (hey, Glenn, you get what you pay for), where we asked staff to read the same file name listed three ways: with spaces, with underscores, and with dashes. Without exception, our crack team of testers said they found the file names easiest to read with spaces (duh!), less easy to read with underscores, and least easy to read with dashes.
  • A usability corollary: If you use underscores, a linked file name in a search result is easier to read. As a link, the file name appears underlined, so the underscores blend into the underline and the words look as if they are separated by spaces, which is the easiest of the three formats to read (see test results, above).
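
To see the difference for yourself, here is a small Python sketch using the standard library's `urllib.parse.quote`. It is only an approximation: we are assuming the GSA's URL normalization behaves like ordinary percent-encoding, and the helper names `as_url_path` and `underscored` are made up for this example. The apostrophe is left unescaped purely to mirror the search result shown above.

```python
from urllib.parse import quote


def as_url_path(path):
    """Percent-encode a path, leaving "/" and "'" untouched (assumption)."""
    return quote(path, safe="/'")


def underscored(path):
    """Swap every space in the path for an underscore."""
    return path.replace(" ", "_")


spaced = "/Ukiah Office/Former Staff/Kan's Transfer File/letter to jake.wpd"

print(as_url_path(spaced))
# /Ukiah%20Office/Former%20Staff/Kan's%20Transfer%20File/letter%20to%20jake.wpd

print(as_url_path(underscored(spaced)))
# /Ukiah_Office/Former_Staff/Kan's_Transfer_File/letter_to_jake.wpd
```

The underscored path comes through unchanged because an underscore never needs escaping in a URL, while every space turns into three characters.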

This is fairly prosaic stuff. It bears some thought, but it is not worth overthinking, or much of an enforcement regime. The project goal is not to have nice, neat, compliant file names. That’s an objective. It is not the goal. We are not investing a lot of time worrying about those who paint outside the lines. The point of the project is to get the files targeted properly so that users can find the content they contain.

We are confident that, as users see file names displayed in the GSA search results, it will sink in why it matters how one names the files, and they will adapt.

One Response to “Don't overthink file naming conventions”

  1. Tony White Says:

    Very concise “non-rules”. For what it is worth, other programs that are used with content management systems such as Plone have trouble with hyphens, cutting against the Google norm, once again. My own preference is to UseCapitalizationToBreakUpFilenames rather than any special character. Subjective, of course, although any file name that has no spaces fares better when it is displayed, used as a live link (from within Outlook, as one example), etc.