The NeverEnding Task: Organizing Files

When I speak with scientists about data management, I often ask how they organize the work files on their computers. This turns out to be a deeply personal question: people are often highly defensive of their system while simultaneously being frustrated with its structure.

Organizing files on your computer might sometimes feel like the NeverEnding task. You can spend two hours on Monday re-working the structure of your file system, only to find on Thursday that you are disappointed with the outcome and start over again. Or perhaps you are quite happy with the structure, but a new project starts up (perhaps related to work in existing folders) and it no longer seems logical to keep the current organization scheme. Here are a few thoughts (and tips) that might help:

  1. Plan for the types of files you will be generating: spend 30 minutes brainstorming the files you anticipate generating in the course of the project, then determine the most logical links between those files. Document the system with a flow chart or text file, post it in your workspace, and stick with it until you have a legitimate reason to change things.
  2. Use the same file structure consistently: on all of your computers, in your Dropbox, and in your Google Docs. Also use similar naming strategies for different types of files, like scripts, data sets, metadata files, and images. Example: Species_Site_Date_FileType.FileExtension might be the base structure, and files might be:
      • Eaffinis_nanaimo_20100901_FieldCounts.xls
      • Eaffinis_nanaimo_20100901_ANOVAcode.R
      • Eaffinis_nanaimo_20100901_adult232.tiff
  3. Consider including dates in your file names (I use YYYYMMDD, so I can always sort by most recent).
  4. There are tools available for renaming files in bulk and reorganizing them, like Bulk Rename Utility, Renamer, and File Buddy.
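
If you are comfortable with a little scripting, the naming scheme in tip 2 can also be generated and checked in code. What follows is only a rough sketch in Python, assuming the Species_Site_Date_FileType.FileExtension structure from the example above. The function names (build_name, parse_name, newest_first) and the exact pattern are my own illustrative choices, not part of any of the tools mentioned, and the pattern assumes that none of the individual parts contains an underscore.

```python
import re
from datetime import date

# Hypothetical pattern for Species_Site_Date_FileType.FileExtension.
# Assumes each part contains no underscores and the date is YYYYMMDD.
NAME_PATTERN = re.compile(
    r"^(?P<species>[A-Za-z]+)"
    r"_(?P<site>[A-Za-z]+)"
    r"_(?P<date>\d{8})"
    r"_(?P<file_type>[A-Za-z0-9]+)"
    r"\.(?P<extension>[A-Za-z0-9]+)$"
)


def build_name(species, site, when, file_type, extension):
    """Assemble a file name that follows the convention."""
    return f"{species}_{site}_{when:%Y%m%d}_{file_type}.{extension}"


def parse_name(filename):
    """Return the parts of a conforming name as a dict, or None if it does not match."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


def newest_first(filenames):
    """Sort conforming names by their date part, most recent first."""
    conforming = [name for name in filenames if parse_name(name) is not None]
    return sorted(conforming, key=lambda name: parse_name(name)["date"], reverse=True)


if __name__ == "__main__":
    sampled = date(2010, 9, 1)
    print(build_name("Eaffinis", "nanaimo", sampled, "FieldCounts", "xls"))
    print(parse_name("Eaffinis_nanaimo_20100901_ANOVAcode.R"))
    print(parse_name("final_version2.xls"))  # does not follow the convention
```

A check like parse_name can be run over a folder listing to spot files that have drifted from the convention before they pile up. Because the date is written as YYYYMMDD, comparing the date strings gives the same order as comparing the dates themselves, which is why the sort needs no date parsing.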

There is no one-size-fits-all solution for organizing your files. The system that works best for you depends heavily on the types of files you generate, how often you need access to them, and how they relate to one another.

