Taming the noise of whitespace with the right dot files

June 25th 2015 Ian Buchanan in Editors, Git

While I prefer Sublime Text, my colleague, Nicola, prefers Vim. While we both have Macs, I sometimes work on my Windows desktop. Sharing work across these environments sometimes creates whitespace conflicts. Nicola's Vim indents a shell script with tabs instead of spaces. Or my Sublime Text on Windows puts extra control characters at the end of every line of Python. And, unless you are programming with whitespace, you probably know how painful this can be. Fortunately, there are a couple of tools you can use to avoid the most common whitespace problems.

Tabs and spaces

Tabs Spaces or Both

From emacswiki licensed under Creative Commons ShareAlike.

Unfortunately, tabs and spaces don't get along. In some cases, the debate has been settled by an authority. For example, Python says spaces. In other cases, the debate has been settled by the team. For example, Apache says spaces too. It's not my aim to reopen the debate. The problem is that, once people have settled on a convention, tools don't always respect it.

That's why EditorConfig is wonderful. In the product's own words:

EditorConfig helps developers define and maintain consistent coding styles between different editors and IDEs. The EditorConfig project consists of a file format for defining coding styles and a collection of text editor plugins that enable editors to read the file format and adhere to defined styles. EditorConfig files are easily readable and they work nicely with version control systems.

Here is what might be in the .editorconfig file for a Python project:

[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true

[*.{py,rst,ini}]
indent_style = space
indent_size = 4

It uses the common INI style. Section names in brackets are file path globs. In the example above, we start by matching all files. We want UTF-8 and Unix-style (LF) line endings. We want a newline at the end of every file. We want whitespace removed from the end of lines. The next section matches the files typical of a Python project. In addition to the settings for all files, we want the tab key to insert 4 spaces.
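To make the settings for all files concrete, here is a minimal Python sketch of roughly what a compliant editor does to a file's text on save. The function name normalize is my own invention, not part of EditorConfig, and real plugins handle more than this (charset, for instance).

```python
def normalize(text: str) -> str:
    """Roughly mimic three of the [*] settings on a file's text.

    Hypothetical helper for illustration only; not an EditorConfig API.
    """
    # end_of_line = lf: turn CRLF and lone CR into LF
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    # trim_trailing_whitespace = true: drop spaces and tabs at line ends
    lines = [line.rstrip(" \t") for line in lines]
    body = "\n".join(lines)
    # insert_final_newline = true: a non-empty file ends with a newline
    if body and not body.endswith("\n"):
        body += "\n"
    return body


if __name__ == "__main__":
    messy = "x = 1  \r\ny = 2\t"
    print(repr(normalize(messy)))
```

Note that the indent settings are different in kind: they control what the tab key inserts as you type, so they are not modeled here.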

Once the .editorconfig is checked into version control, anyone who uses a compatible editor will automatically comply with the whitespace conventions. Since EditorConfig is not new, there are many compatible editors. For Nicola, there is a Vim plugin. For me, there is a Sublime Text plugin. The notable exception is Eclipse but, where there's a will, there's a way.

Windows newlines

While mixing spaces and tabs can be annoying, mixing newline styles can break things. I'm looking at you, Bash. Fortunately, Git solves this problem quite well with core.autocrlf. Windows users even have a sensible default that turns it on. Trouble still emerges when people don't use the defaults.
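If you suspect a file already has mixed newlines, a few lines of Python can confirm it. This is only a sketch: the function names count_line_endings and has_mixed_endings are hypothetical, and it reads raw bytes so that Python's own newline translation doesn't hide the problem.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), Unix (LF), and old Mac (CR) line endings."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains one LF and one CR, so subtract those out
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


def has_mixed_endings(data: bytes) -> bool:
    """True when more than one newline style appears in the data."""
    counts = count_line_endings(data)
    return sum(1 for n in counts.values() if n > 0) > 1


if __name__ == "__main__":
    script = b"#!/bin/bash\r\necho hi\n"
    print(count_line_endings(script), has_mixed_endings(script))
```

To check a real file, pass in the result of open(path, "rb").read().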

If you work on a team where some people use Windows and others don't, then you don't need to rely on defaults. You can specify the line endings with .gitattributes. In the simplest case, where Git can just manage all line endings, your .gitattributes can be:

* text=auto

Or, you can get specific per file extension with a .gitattributes like this:

*.txt       text
*.vcproj    eol=crlf
*.sh        eol=lf
*.jpg       -text
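As a mental model for the eol attribute, here is a simplified Python sketch of the conversion Git performs for a text file. It is not Git's actual implementation: the names to_repo and to_worktree are hypothetical, and it assumes the content stored in the repository has already been normalized to LF.

```python
def to_repo(worktree: bytes) -> bytes:
    """On check-in, a text file's CRLF endings are normalized to LF."""
    return worktree.replace(b"\r\n", b"\n")


def to_worktree(repo: bytes, eol: str) -> bytes:
    """On checkout, eol=crlf turns LF into CRLF; eol=lf leaves LF alone.

    Assumes `repo` holds LF-only content (already normalized).
    """
    if eol == "crlf":
        return repo.replace(b"\n", b"\r\n")
    return repo


if __name__ == "__main__":
    stored = to_repo(b"line one\r\nline two\r\n")
    print(stored)                       # LF only in the repository
    print(to_worktree(stored, "crlf"))  # e.g. what *.vcproj would get
    print(to_worktree(stored, "lf"))    # e.g. what *.sh would get
```

Files marked -text, like the *.jpg line, skip this conversion entirely.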

Having been bitten by line endings before, I like to create .gitattributes at the start of a new repo. Most people won't know they need one until after a mix of newlines has gotten into the code. Once you have a proper .gitattributes file, you can apply the settings retroactively to normalize all the files. From the Git documentation about line endings:

$ rm .git/index     # Remove the index to force Git to
$ git reset         # re-scan the working directory
$ git status        # Show files that will be normalized
$ git add -u
$ git add .gitattributes
$ git commit -m "Introduce end-of-line normalization"

Avoid the noise

If you are working on an open source project, I hope the value of these files is obvious. You want to encourage participation without having to scold collaborators for not following your whitespace conventions. If you are working on proprietary source code, you might think there is no need. Please consider the future developers on this project. Will they know how you used whitespace? Better to give some hints in the form of these files.