Win32 path canonicalization refactoring#4852
Conversation
tiennou
left a comment
There was a problem hiding this comment.
Just a stylistic point (which could be ignored IMHO). The rename sure makes sense, so I'm 👍.
I'm left with a confusing urge to learn more about those strange Windows pathnames. Or flee, I'm not sure which 😜.
src/win32/path_w32.c
Outdated
| break; | ||
|
|
||
| /* Don't trim backslashes from drive letter paths, which | ||
| * are 3 characters long and of the form C:\, D:\, etc. */ |
There was a problem hiding this comment.
Minor: given the previous hunks, you might want to indent 😉.
There was a problem hiding this comment.
Thanks - Visual Studio is not great at formatting pasted hunks. Fixed.
The internal API `git_win32__canonicalize_path` is far, far too easily confused with the internal API `git_win32_path_canonicalize`. The former removes the namespace prefix from a path (eg, given `\\?\C:\Temp\foo`, it returns `C:\Temp\foo`, and given `\\?\UNC\server\share`, it returns `\\server\share`). As such, rename it to `git_win32_path_remove_namespace`. `git_win32_path_canonicalize` remains unchanged.
Update `git_win32_path_remove_namespace` to disambiguate the prefix being removed versus the prefix being added. Now we remove the "namespace", and (may) add a "prefix" in its place. Eg, we remove the `\\?\` namespace. We remove the `\\?\UNC\` namespace, and replace it with the `\\` prefix. This aids readability somewhat. Additionally, use pointer arithmetic instead of offsets, which seems to also help readability.
50677bf to
a34f5b0
Compare
They're a bit odd at first, but they're not actually that bad. The paths you typically see (eg, We moved over to use the namespaced paths explicitly, converting "standard" paths like Similarly when DOS introduced "long filenames" (ie, not 8.3) it supported a mapping from long filename to an 8.3 name for people stuck on old versions of DOS. So if you create Anyway. The easiest way for us to feel confident avoiding these attacks is to skip the DOS compatibility layer entirely. We can do this by using namespaced paths. If we open As best as I understand from authoritative sources, this is the only way to avoid this path manipulation. There's good news here, though. It also removes the 260 character We also examine Note that there is still a translation layer, to get directly to the devices. Although it's uncommon, you can actually mount devices throughout the path space in NT. Recall that in DOS, you had drives named with letters, eg So ultimately, all your paths get translated into a proc-like filesystem semantics. You could actually open |
PR #4825 fixes obviously incorrect behavior in our path canonicalization, taking namespace-formatted paths and formatting them in something that an end-user might want to see.
Reviewing that, I noticed that there was some rather confusing naming going on that could be clarified. For example, we had both a
git_win32__canonicalize_pathfunction and agit_win32_path_canonicalizefunction? Ouch.I took #4825 and added some refactoring on top of that:
git_win32__canonicalize_pathtogit_win32_path_remove_namespaceto reflect better what it actually does. (It removes the\\?\,\??\or\\?\UNC\prefixes.)git_win32_path_remove_namespaceto talk about the namespace prefixes being removed asnamespaces, and the UNC prefix being added back (\\, if there is one) as a prefix to disambiguate.I opened a PR for visibility, but I want to merge this quickly to alleviate the pain that this bug is causing.