A few years back, Microsoft Excel started automatically saving my spreadsheet files with the extension .xlsx. I first noticed it when I got a new laptop for my postdoc at the University of Alberta. Suddenly, I had to remember that if I left Excel to its own devices, the spreadsheets I generated would not be readable on my home computer, which had an older version of Excel.
First, let’s cover exactly what that extra “x” is for. The additional “x” in Excel file extensions stands for XML. XML (Extensible Markup Language) is a markup language for describing data, and it is widely used in databases and data-related applications. An .xlsx file is a set of XML documents bundled together with ZIP compression to reduce file size. Here’s a succinct summary from mrexcel.com:
If you’ve ever looked at the “View Source” view of a webpage in Notepad, you are familiar with the structure of XML. While HTML allows for certain tags, like TABLE, BODY, TR, TD, XML allows for any tags. You can make up any sort of a tag to describe your data.
You can also check out Microsoft’s description of XML in Excel. What all of this means is that .xlsx files are more generalized and easier to use with web-based applications. It’s a good thing!
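To make the “XML plus ZIP” idea concrete, here is a minimal Python sketch that treats a workbook as an ordinary ZIP archive and peeks at the XML inside it. The function names (`list_parts`, `root_tag`, `make_stand_in`) are my own, made up for illustration. The stand-in archive only mimics two of the part names found in real .xlsx files; it is not a valid workbook that Excel would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_parts(source):
    """Return the names of the files stored inside a ZIP-based workbook."""
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


def root_tag(source, part_name):
    """Parse one XML part from the archive and return its root tag name."""
    with zipfile.ZipFile(source) as archive:
        with archive.open(part_name) as part:
            tag = ET.parse(part).getroot().tag
    # ElementTree writes namespaced tags as "{namespace}name"; keep only the name.
    return tag.rsplit("}", 1)[-1]


def make_stand_in():
    """Build a tiny in-memory ZIP that mimics the layout (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("xl/workbook.xml", "<workbook><sheets/></workbook>")
    buffer.seek(0)
    return buffer
```

Both functions also accept a file path, so if you have a real spreadsheet on hand (say, a hypothetical `data.xlsx`), calling `list_parts("data.xlsx")` should show a handful of XML files rather than one opaque binary blob. An old-style .xls file, by contrast, is not a ZIP archive, and `zipfile` will refuse to open it.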
You might be asking yourself why I’m writing about .xlsx. Isn’t this an old issue that folks have figured out by now? The answer is yes and no. Many of the scientists I have spoken with over the last few months are entrenched in their current Excel version and have major complaints about moving to newer versions. Excel 2003 (2004 for Mac), which predates the .xlsx file type, is still heavily used among some groups. Other scientists have moved on to later versions of Excel, but still have colleagues, advisors, or collaborators who use older versions and therefore cannot open .xlsx files. So while many scientists can tell you they have noticed the new extension on their Excel files, they don’t understand the underlying changes.
Of course, you can tell Excel to generate and save files in the old .xls format by going to “Excel Options… Save” and changing your settings so that files are saved as .xls: