More specifically, a file path from which we want to remove the .. and . components.
So, if you have “projects/vectorization/../gitstuff/./../../something/index.html”, canonicalizing the string would reduce it to “something/index.html”.
We have a couple of options to do this.
Option 1 – use std::filesystem
Since C++17, you can take a path and canonicalize it using std::filesystem::canonical. The path has to exist on disk, and the result is an absolute path.
For paths that do not exist, or if you just want to mess around, you can use std::filesystem::weakly_canonical to remove .. and . instead.
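To make the difference concrete, here is a minimal sketch of what happens when the strict call meets a path that is not on disk. The helper name try_canonical is made up for this illustration; it uses the std::error_code overload of std::filesystem::canonical so that a missing path is reported as a failure instead of an exception.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of the standard library):
// returns true and fills `out` only when `path` exists and can be resolved.
bool try_canonical(const std::string& path, fs::path& out)
{
    std::error_code ec;
    // The error_code overload does not throw; it sets `ec` on failure
    out = fs::canonical(path, ec);
    return !ec;
}
```

Calling try_canonical(".", out) should succeed and give an absolute path, while a made-up path such as "no/such/dir/../file.txt" should fail, which is the case weakly_canonical is meant for.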
std::filesystem::weakly_canonical is really simple to work with, and here’s an example of how to use it:
#include <filesystem>
#include <iostream>
#include <string>

namespace fs = std::filesystem;

void option_one(const std::string& path)
{
    auto fs_path = fs::path(path);
    auto canonical = fs::weakly_canonical(fs_path);
    std::cout << canonical << '\n';
}
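If you only want the string cleaned up and do not care what is on disk, std::filesystem::path also has a lexically_normal member that works purely on the text. This is a small sketch of that alternative, not something the example above does; the function name normalize_lexically is just a placeholder.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: remove `.` and `..` without touching the filesystem
std::string normalize_lexically(const std::string& path)
{
    // lexically_normal never checks whether the path exists;
    // generic_string returns the result with `/` separators on every platform
    return fs::path(path).lexically_normal().generic_string();
}
```

With the path from the introduction, normalize_lexically should give back "something/index.html". Unlike the canonical functions, it never turns a relative path into an absolute one.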
Option 2 – use a stack
We can use a stack to hold the path components that we find while walking through the string.
This is a lot more involved than Option 1, but it works if C++17 is not available in your project:
/**********************************************************/
#include <iostream>
#include <stack>
#include <string>

std::string option_two(const std::string& path)
{
    // Don't work on an empty path
    if (path.empty())
        return "";

    // Remember a leading slash so an absolute path stays absolute
    const bool is_absolute = path.front() == '/';

    // Use a stack to hold each path component
    auto path_components = std::stack<std::string>();

    // Apply one component to the stack: `..` pops, while `.` and empty
    // components (from `//` or a leading/trailing slash) are ignored
    const auto apply = [&path_components](const std::string& item)
    {
        if (item == "..")
        {
            // With nothing left to pop, the `..` is dropped rather than stored
            if (!path_components.empty())
                path_components.pop();
        }
        else if (item != "." && !item.empty())
            path_components.push(item);
    };

    // Use 2 variables to hold the beginning and the end of a path component string,
    // initialized to the beginning of the string, and the first slash found
    std::size_t beginning = 0;
    auto end = path.find('/', beginning);

    // Now, walk through the string, gathering each path component
    while (end != std::string::npos)
    {
        apply(path.substr(beginning, end - beginning));

        // Set our variables to just past the current slash, and the next found slash
        beginning = end + 1;
        end = path.find('/', beginning);
    }

    // Handle the last path component (it may be `..`, `.` or empty too)
    apply(path.substr(beginning));

    // Reverse the stack so the components come out in their original order
    std::stack<std::string> rpath_components;
    while (!path_components.empty())
    {
        rpath_components.push(path_components.top());
        path_components.pop();
    }

    // Append the path components to our string, delimited with `/`
    std::string canonical = is_absolute ? "/" : "";
    while (!rpath_components.empty())
    {
        canonical += rpath_components.top();
        rpath_components.pop();
        if (!rpath_components.empty())
            canonical += "/";
    }

    std::cout << canonical << "\n";
    return canonical;
}
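The reversing step can be avoided if something other than std::stack plays the role of the stack. The following is a rough sketch of that variation, assuming a std::vector for the components and std::getline for the splitting. The name canonicalize_with_vector is invented here, and dropping a `..` that has nothing to pop is a choice made for this sketch.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: same idea as option_two, but a vector acts as the stack
std::string canonicalize_with_vector(const std::string& path)
{
    std::vector<std::string> components;
    std::istringstream stream(path);
    std::string item;

    // std::getline with '/' as the delimiter yields one component per call
    while (std::getline(stream, item, '/'))
    {
        if (item == "..")
        {
            // Step back one level; a `..` with nothing to pop is dropped
            if (!components.empty())
                components.pop_back();
        }
        else if (!item.empty() && item != ".")
        {
            components.push_back(item);
        }
    }

    // Join front to back, keeping a leading slash for absolute paths
    std::string canonical = (!path.empty() && path[0] == '/') ? "/" : "";
    for (std::size_t i = 0; i < components.size(); ++i)
    {
        if (i > 0)
            canonical += '/';
        canonical += components[i];
    }
    return canonical;
}
```

Because a vector can be read from the front, the components are already in the right order when it is time to join them. This version only needs C++11, so it still fits projects where std::filesystem is not an option.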
So, you can see there are at least 2 ways of canonicalizing a path, and lots more if you put your mind to it.
Happy coding!