Directory tree example¶
This example illustrates the tree system with a Windows-like directory
tree structure. The System
root node consists of one or more drives,
each of which has a drive letter and a root directory with named files
and subdirectories. To make things a little more interesting, a
symlink-like “mount” is added as a file type, which links to another
directory.
Using this tree, this example should teach you the basics of using
tree-gen trees. The tutorial runs you through the C++ code of
main.cpp
and Python code of main.py
, but be sure to check out
directory.tree
, primitives.hpp
, primitives.hpp
, and (if you
want to reproduce it in your own project) CMakeLists.txt
copied to
the bottom of this page as well. You will also find the complete
main.cpp
and main.py
there.
Let’s start by making the root node using tree::base::make()
. This
works somewhat like std::make_shared()
, but instead of returning a
shared_ptr
it returns a One
edge. This might come off as a bit
odd, considering trees in graph theory start with a node rather than an
edge. This is just a choice, though; a side effect of how the internal
``shared_ptr``s work and how Link/OptLink edges work (if you’d store the
tree as the root structure directly, the root node wouldn’t always be
stored in the same place on the heap, breaking link nodes).
auto system = tree::base::make<directory::System>();
At all times, you can use the dump()
method on a node to see what it
looks like for debugging purposes. It looks like this:
system->dump();
↓
System(
drives: !MISSING
)
While this system node exists as a tree in memory and tree-gen seems happy, our system tree is at this time not “well-formed”. A tree node (or an edge to one) is only considered well-formed if all of the following are true:
- All
One
edges in the tree connect to exactly one node.- All
Many
edges in the tree connect to at least one node.- All
Link
and non-emptyOptLink
nodes refer to a node reachable from the root node.- Each node in the tree is only used by a non-link node once.
Currently, the second requirement is not met.
ASSERT(!system.is_well_formed());
You can get slightly more information using check_well_formed()
;
instead or returning a boolean, it will throw a NotWellFormed
exception with an error message.
ASSERT_RAISES(tree::base::NotWellFormed, system.check_well_formed());
↓
tree::base::NotWellFormed exception message: 'Many' edge of type N9directory5DriveE is empty
Note that the name of the type is “mangled”. The exact output will vary from compiler to compiler, or even from project to project. But hopefully it’ll be readable enough to make sense of in general.
To fix that, let’s add a default drive node to the tree. This should get
drive letter A
, because primitives::initialize() is specialized to
return that for ``Letter``s (see primitives.hpp).
system->drives.add(tree::base::make<directory::Drive>());
Drive
has a One
edge that is no empty, though, so the tree still
isn’t well-formed. Let’s add one of those as well.
system->drives[0]->root_dir = tree::base::make<directory::Directory>();
Now we have a well-formed tree. Let’s have a look:
system.check_well_formed();
system->dump();
↓
System(
drives: [
Drive(
letter: A
root_dir: <
Directory(
entries: []
name:
)
>
)
]
)
We can just change the drive letter by assignment, as you would expect.
system->drives[0]->letter = 'C';
Before we add files and directories, let’s make a shorthand variable for the root directory. Because root_dir is an edge to another node rather than the node itself, and thus acts like a pointer or reference to it, we can just copy it into a variable and modify the variable to update the tree. Note that you’d normally just write “auto” for the type for brevity; the type is shown here to make explicit what it turns into.
directory::One<directory::Directory> root = system->drives[0]->root_dir;
Let’s make a “Program Files” subdirectory and add it to the root.
auto programs = tree::base::make<directory::Directory>(
tree::base::Any<directory::Entry>{},
"Program Files");
root->entries.add(programs);
system.check_well_formed();
That’s quite verbose. But in most cases it can be written way shorter. Here’s the same with the less versatile but also less verbose emplace() method (which avoids the tree::make() call, but doesn’t allow an insertion index to be specified) and with a “using namespace” for the tree. emplace() can also be chained, allowing multiple files and directories to be added at once in this case.
{
using namespace directory;
root->entries.emplace<Directory>(Any<Entry>{}, "Windows")
.emplace<Directory>(Any<Entry>{}, "Users")
.emplace<File>("lots of hibernation data", "hiberfil.sys")
.emplace<File>("lots of page file data", "pagefile.sys")
.emplace<File>("lots of swap data", "swapfile.sys");
}
system.check_well_formed();
In order to look for a file in a directory, you’ll have to make your own function to iterate over them. After all, tree-gen doesn’t know that the name field is a key; it has no concept of a key-value store. This is simple enough to make, but to prevent this example from getting out of hand we’ll just use indices for now.
Let’s try to read the “lots of data” string from pagefile.sys.
ASSERT(root->entries[4]->name == "pagefile.sys");
// ASSERT(root->entries[4]->contents == "lots of page file data");
// '-> No member named 'contents' in 'directory::Entry'
Hm, that didn’t work so well, because C++ doesn’t know that entry 4 happens to be a file rather than a directory or a mount. We have to tell it to cast to a file first (which throws a std::bad_cast if it’s not actually a file). The easiest way to do that is like this:
ASSERT(root->entries[4]->as_file()->contents == "lots of page file data");
No verbose casts required; tree-gen will make member functions for all the possible subtypes.
While it’s possible to put the same node in a tree twice (without using a link), this is not allowed. This isn’t checked until a well-formedness check is performed, however (and in fact can’t be without having access to the root node).
root->entries.add(root->entries[0]);
ASSERT_RAISES(tree::base::NotWellFormed, system.check_well_formed());
↓
tree::base::NotWellFormed exception message: Duplicate node of type N9directory5EntryEat address 0x5588663ab440 found in tree
Note that we can index nodes Python-style with negative integers for add() and remove(), so remove(-1) will remove the broken node we just added. Note that the -1 is not actually necessary, though, as it is the default.
root->entries.remove(-1);
system.check_well_formed();
We can, of course, add copies of nodes. That’s what copy (shallow) and clone (deep) are for. Usually you’ll want a deep copy, but in this case shallow is fine, because a File has no child nodes. Note that, even for a deep copy, links are not modified; it is intended to copy a subtree to another part of the same tree. To make a complete copy of a tree that maintains link semantics (i.e. does not link back to the original tree) the best way would be to go through serialize/deserialize.
root->entries.add(root->entries[0]->copy());
system.check_well_formed();
Note that the generated classes don’t care that there are now two directories named Program Files in the root. As far as they’re concerned, they’re two different directories with the same name. Let’s remove it again, though.
root->entries.remove();
Something we haven’t looked at yet are links. Links are edges in the tree that, well, turn it into something that isn’t strictly a tree anymore. While One/Maybe/Any/Many require that nodes are unique, Link/OptLink require that they are not unique, and are present elsewhere in the tree. Let’s make a new directory, and mount it in the Users directory.
auto user_dir = tree::base::make<directory::Directory>(
tree::base::Any<directory::Entry>{},
"");
root->entries.emplace<directory::Mount>(user_dir, "SomeUser");
Note that user_dir is not yet part of the tree. emplace works simply because it doesn’t check whether the directory is in the tree yet. But the tree is no longer well-formed now.
ASSERT_RAISES(tree::base::NotWellFormed, system.check_well_formed());
↓
tree::base::NotWellFormed exception message: Link to node of type N9directory9DirectoryE at address 0x5588663abcc0 not found in tree
To make it work again, we can add it as a root directory to a second drive.
system->drives.emplace<directory::Drive>('D', user_dir);
system.check_well_formed();
A good way to confuse a filesystem is to make loops. tree-gen is okay with this, though.
system->drives[1]->root_dir->entries.emplace<directory::Mount>(root, "evil link to C:");
system.check_well_formed();
The only place where it matters is in the dump function, which only goes one level deep. After that, it’ll just print an ellipsis.
system->dump();
↓
System(
drives: [
Drive(
letter: C
root_dir: <
Directory(
entries: [
Directory(
entries: []
name: Program Files
)
Directory(
entries: []
name: Windows
)
Directory(
entries: []
name: Users
)
File(
contents: lots of hibernation data
name: hiberfil.sys
)
File(
contents: lots of page file data
name: pagefile.sys
)
File(
contents: lots of swap data
name: swapfile.sys
)
Mount(
target --> <
Directory(
entries: [
Mount(
target --> <
...
>
name: evil link to C:
)
]
name:
)
>
name: SomeUser
)
]
name:
)
>
)
Drive(
letter: D
root_dir: <
Directory(
entries: [
Mount(
target --> <
Directory(
entries: [
Directory(
entries: []
name: Program Files
)
Directory(
entries: []
name: Windows
)
Directory(
entries: []
name: Users
)
File(
contents: lots of hibernation data
name: hiberfil.sys
)
File(
contents: lots of page file data
name: pagefile.sys
)
File(
contents: lots of swap data
name: swapfile.sys
)
Mount(
target --> <
...
>
name: SomeUser
)
]
name:
)
>
name: evil link to C:
)
]
name:
)
>
)
]
)
Now that we have a nice tree, let’s try the serialization and deserialization functionality. Serializing is easy:
std::string cbor = tree::base::serialize(system);
Let’s write it to a file; we’ll load this in Python later.
{
std::ofstream cbor_output;
cbor_output.open("tree.cbor", std::ios::out | std::ios::trunc | std::ios::binary);
cbor_output << cbor;
}
Let’s also have a look at a hexdump of that.
int count = 0;
for (char c : cbor) {
if (count == 16) {
std::printf("\n");
count = 0;
} else if (count > 0 && count % 4 == 0) {
std::printf(" ");
}
std::printf("%02X ", (uint8_t)c);
count++;
}
std::printf("\n");
↓
BF 62 40 54 61 3F 62 40 69 00 62 40 74 66 53 79
73 74 65 6D 66 64 72 69 76 65 73 BF 62 40 54 61
2B 62 40 64 9F BF 62 40 54 61 31 62 40 69 01 62
40 74 65 44 72 69 76 65 66 6C 65 74 74 65 72 BF
63 76 61 6C 18 43 FF 68 72 6F 6F 74 5F 64 69 72
BF 62 40 54 61 31 62 40 69 02 62 40 74 69 44 69
72 65 63 74 6F 72 79 67 65 6E 74 72 69 65 73 BF
62 40 54 61 2A 62 40 64 9F BF 62 40 54 61 31 62
40 69 03 62 40 74 69 44 69 72 65 63 74 6F 72 79
67 65 6E 74 72 69 65 73 BF 62 40 54 61 2A 62 40
64 9F FF FF 64 6E 61 6D 65 BF 63 76 61 6C 6D 50
72 6F 67 72 61 6D 20 46 69 6C 65 73 FF FF BF 62
40 54 61 31 62 40 69 04 62 40 74 69 44 69 72 65
63 74 6F 72 79 67 65 6E 74 72 69 65 73 BF 62 40
54 61 2A 62 40 64 9F FF FF 64 6E 61 6D 65 BF 63
76 61 6C 67 57 69 6E 64 6F 77 73 FF FF BF 62 40
54 61 31 62 40 69 05 62 40 74 69 44 69 72 65 63
74 6F 72 79 67 65 6E 74 72 69 65 73 BF 62 40 54
61 2A 62 40 64 9F FF FF 64 6E 61 6D 65 BF 63 76
61 6C 65 55 73 65 72 73 FF FF BF 62 40 54 61 31
62 40 69 06 62 40 74 64 46 69 6C 65 68 63 6F 6E
74 65 6E 74 73 BF 63 76 61 6C 78 18 6C 6F 74 73
20 6F 66 20 68 69 62 65 72 6E 61 74 69 6F 6E 20
64 61 74 61 FF 64 6E 61 6D 65 BF 63 76 61 6C 6C
68 69 62 65 72 66 69 6C 2E 73 79 73 FF FF BF 62
40 54 61 31 62 40 69 07 62 40 74 64 46 69 6C 65
68 63 6F 6E 74 65 6E 74 73 BF 63 76 61 6C 76 6C
6F 74 73 20 6F 66 20 70 61 67 65 20 66 69 6C 65
20 64 61 74 61 FF 64 6E 61 6D 65 BF 63 76 61 6C
6C 70 61 67 65 66 69 6C 65 2E 73 79 73 FF FF BF
62 40 54 61 31 62 40 69 08 62 40 74 64 46 69 6C
65 68 63 6F 6E 74 65 6E 74 73 BF 63 76 61 6C 71
6C 6F 74 73 20 6F 66 20 73 77 61 70 20 64 61 74
61 FF 64 6E 61 6D 65 BF 63 76 61 6C 6C 73 77 61
70 66 69 6C 65 2E 73 79 73 FF FF BF 62 40 54 61
31 62 40 69 09 62 40 74 65 4D 6F 75 6E 74 66 74
61 72 67 65 74 BF 62 40 54 61 24 62 40 6C 0B FF
64 6E 61 6D 65 BF 63 76 61 6C 68 53 6F 6D 65 55
73 65 72 FF FF FF FF 64 6E 61 6D 65 BF 63 76 61
6C 60 FF FF FF BF 62 40 54 61 31 62 40 69 0A 62
40 74 65 44 72 69 76 65 66 6C 65 74 74 65 72 BF
63 76 61 6C 18 44 FF 68 72 6F 6F 74 5F 64 69 72
BF 62 40 54 61 31 62 40 69 0B 62 40 74 69 44 69
72 65 63 74 6F 72 79 67 65 6E 74 72 69 65 73 BF
62 40 54 61 2A 62 40 64 9F BF 62 40 54 61 31 62
40 69 0C 62 40 74 65 4D 6F 75 6E 74 66 74 61 72
67 65 74 BF 62 40 54 61 24 62 40 6C 02 FF 64 6E
61 6D 65 BF 63 76 61 6C 6F 65 76 69 6C 20 6C 69
6E 6B 20 74 6F 20 43 3A FF FF FF FF 64 6E 61 6D
65 BF 63 76 61 6C 60 FF FF FF FF FF FF
You can pull that through http://cbor.me and https://jsonformatter.org to inspect the output, if you like.
We can deserialize it into a complete copy of the tree.
auto system2 = tree::base::deserialize<directory::System>(cbor);
system2->dump();
↓
System(
drives: [
Drive(
letter: C
root_dir: <
Directory(
entries: [
Directory(
entries: []
name: Program Files
)
Directory(
entries: []
name: Windows
)
Directory(
entries: []
name: Users
)
File(
contents: lots of hibernation data
name: hiberfil.sys
)
File(
contents: lots of page file data
name: pagefile.sys
)
File(
contents: lots of swap data
name: swapfile.sys
)
Mount(
target --> <
Directory(
entries: [
Mount(
target --> <
...
>
name: evil link to C:
)
]
name:
)
>
name: SomeUser
)
]
name:
)
>
)
Drive(
letter: D
root_dir: <
Directory(
entries: [
Mount(
target --> <
Directory(
entries: [
Directory(
entries: []
name: Program Files
)
Directory(
entries: []
name: Windows
)
Directory(
entries: []
name: Users
)
File(
contents: lots of hibernation data
name: hiberfil.sys
)
File(
contents: lots of page file data
name: pagefile.sys
)
File(
contents: lots of swap data
name: swapfile.sys
)
Mount(
target --> <
...
>
name: SomeUser
)
]
name:
)
>
name: evil link to C:
)
]
name:
)
>
)
]
)
Note that equality for two link edges is satisfied only if they point to the exact same node. That’s not the case for the links in our two entirely separate trees, so the two trees register as unequal.
ASSERT(!system2.equals(system));
To be sure no data was lost, we’ll have to check the CBOR and debug dumps instead.
ASSERT(tree::base::serialize(system) == tree::base::serialize(system2));
std::ostringstream ss1{};
system->dump(ss1);
std::ostringstream ss2{};
system2->dump(ss2);
ASSERT(ss1.str() == ss2.str());
Python usage¶
Let us now turn our attention to tree-gen’s Python support. As you may remember from the introduction, tree-gen does not provide a direct interface between Python and C++ using a tool like SWIG; rather, it provides conversion between a CBOR tree representation and the in-memory representations of both languages. It then becomes trivial for users of the tree-gen structure to provide an API to users that converts from one to the other, since only a single string of bytes has to be transferred.
Recall that we wrote the CBOR representation of the tree we constructed in the C++ example to a file. Let’s try loading this file with Python.
from directory import *
with open(os.path.join(TEST_DIR, 'tree.cbor'), 'rb') as f:
tree = System.deserialize(f.read())
tree.check_well_formed()
print(tree)
↓
System(
drives: [
Drive(
letter: C
root_dir: <
Directory(
entries: [
Directory(
entries: -
name: Program Files
)
Directory(
entries: -
name: Windows
)
Directory(
entries: -
name: Users
)
File(
contents: lots of hibernation data
name: hiberfil.sys
)
File(
contents: lots of page file data
name: pagefile.sys
)
File(
contents: lots of swap data
name: swapfile.sys
)
Mount(
target --> <
Directory(
entries: [
Mount(
target --> <
...
>
name: evil link to C:
)
]
name:
)
>
name: SomeUser
)
]
name:
)
>
)
Drive(
letter: D
root_dir: <
Directory(
entries: [
Mount(
target --> <
Directory(
entries: [
Directory(
entries: -
name: Program Files
)
Directory(
entries: -
name: Windows
)
Directory(
entries: -
name: Users
)
File(
contents: lots of hibernation data
name: hiberfil.sys
)
File(
contents: lots of page file data
name: pagefile.sys
)
File(
contents: lots of swap data
name: swapfile.sys
)
Mount(
target --> <
...
>
name: SomeUser
)
]
name:
)
>
name: evil link to C:
)
]
name:
)
>
)
]
)
And there we go; the same structure in pure Python.
As you might expect from a pure-Python structure, field access is much less verbose than it is in C++.
print(tree.drives[0].letter)
print(tree.drives[0].root_dir.entries[0].name)
↓
C
Program Files
Perhaps contrary to Python intuition however, the structure provides type-safety. You can’t assign values to fields that don’t exist…
tree.color_of_a_banana = 'yellow'
↓
Traceback (most recent call last):
File "main.py", line 48, in <module>
tree.color_of_a_banana = 'yellow'
AttributeError: 'System' object has no attribute 'color_of_a_banana'
… or assign values of incorrect types to ones that do:
tree.drives[0] = 'not-a-drive'
↓
Traceback (most recent call last):
File "main.py", line 55, in <module>
tree.drives[0] = 'not-a-drive'
File "/home/docs/checkouts/readthedocs.org/user_builds/tree-gen/checkouts/latest/cbuild/examples/directory/directory.py", line 490, in __setitem__
.format(val, idx, self._T))
TypeError: object 'not-a-drive' is not an instance of 0
Note that the setters will however try to “typecast” objects they don’t know about (to be more specific, anything that isn’t an instance of Node). So if we try to assign a string directly to tree.drives, you get something you might not have expected:
tree.drives = 'not-a-set-of-drives'
↓
Traceback (most recent call last):
File "main.py", line 65, in <module>
tree.drives = 'not-a-set-of-drives'
File "/home/docs/checkouts/readthedocs.org/user_builds/tree-gen/checkouts/latest/cbuild/examples/directory/directory.py", line 1576, in drives
val = MultiDrive(val)
File "/home/docs/checkouts/readthedocs.org/user_builds/tree-gen/checkouts/latest/cbuild/examples/directory/directory.py", line 472, in __init__
.format(val, idx, self._T))
TypeError: object 'n' at index 0 is not an instance of <class 'directory.Drive'>
The setter for the drives property is trying to cast the string to a MultiDrive, which interprets its string input as any other iterable, and ultimately tries to cast the first letter to a Drive. There’s not much that can be done about this; it’s just how duck typing works.
Manipulation of trees in Python works just as you might expect it to in Python:
tree.drives.append(Drive('E'))
tree.drives[-1].root_dir = Directory([
File('test', 'test-contents'),
File('test2', 'test2-contents'),
])
del tree.drives[1]
assert not tree.is_well_formed()
tree.drives[0].root_dir.entries[-1].target = tree.drives[1].root_dir
tree.check_well_formed()
print(tree)
↓
System(
drives: [
Drive(
letter: C
root_dir: <
Directory(
entries: [
Directory(
entries: -
name: Program Files
)
Directory(
entries: -
name: Windows
)
Directory(
entries: -
name: Users
)
File(
contents: lots of hibernation data
name: hiberfil.sys
)
File(
contents: lots of page file data
name: pagefile.sys
)
File(
contents: lots of swap data
name: swapfile.sys
)
Mount(
target --> <
Directory(
entries: [
File(
contents: test
name: test-contents
)
File(
contents: test2
name: test2-contents
)
]
name:
)
>
name: SomeUser
)
]
name:
)
>
)
Drive(
letter: E
root_dir: <
Directory(
entries: [
File(
contents: test
name: test-contents
)
File(
contents: test2
name: test2-contents
)
]
name:
)
>
)
]
)
Once the Python code is done manipulating the structure, it can be serialized again.
cbor = tree.serialize()
assert cbor == System.deserialize(cbor).serialize()
count = 0
for c in cbor:
if count == 16:
print()
count = 0
elif count > 0 and count % 4 == 0:
print(' ', end='')
print('%02X ' % c, end='')
count += 1
print()
↓
A3 62 40 69 00 62 40 74 66 53 79 73 74 65 6D 66
64 72 69 76 65 73 A2 62 40 54 61 2B 62 40 64 82
A5 62 40 54 61 31 62 40 69 01 62 40 74 65 44 72
69 76 65 66 6C 65 74 74 65 72 A1 63 76 61 6C 18
43 68 72 6F 6F 74 5F 64 69 72 A5 62 40 54 61 31
62 40 69 02 62 40 74 69 44 69 72 65 63 74 6F 72
79 67 65 6E 74 72 69 65 73 A2 62 40 54 61 2A 62
40 64 87 A5 62 40 54 61 31 62 40 69 03 62 40 74
69 44 69 72 65 63 74 6F 72 79 67 65 6E 74 72 69
65 73 A2 62 40 54 61 2A 62 40 64 80 64 6E 61 6D
65 A1 63 76 61 6C 6D 50 72 6F 67 72 61 6D 20 46
69 6C 65 73 A5 62 40 54 61 31 62 40 69 04 62 40
74 69 44 69 72 65 63 74 6F 72 79 67 65 6E 74 72
69 65 73 A2 62 40 54 61 2A 62 40 64 80 64 6E 61
6D 65 A1 63 76 61 6C 67 57 69 6E 64 6F 77 73 A5
62 40 54 61 31 62 40 69 05 62 40 74 69 44 69 72
65 63 74 6F 72 79 67 65 6E 74 72 69 65 73 A2 62
40 54 61 2A 62 40 64 80 64 6E 61 6D 65 A1 63 76
61 6C 65 55 73 65 72 73 A5 62 40 54 61 31 62 40
69 06 62 40 74 64 46 69 6C 65 68 63 6F 6E 74 65
6E 74 73 A1 63 76 61 6C 78 18 6C 6F 74 73 20 6F
66 20 68 69 62 65 72 6E 61 74 69 6F 6E 20 64 61
74 61 64 6E 61 6D 65 A1 63 76 61 6C 6C 68 69 62
65 72 66 69 6C 2E 73 79 73 A5 62 40 54 61 31 62
40 69 07 62 40 74 64 46 69 6C 65 68 63 6F 6E 74
65 6E 74 73 A1 63 76 61 6C 76 6C 6F 74 73 20 6F
66 20 70 61 67 65 20 66 69 6C 65 20 64 61 74 61
64 6E 61 6D 65 A1 63 76 61 6C 6C 70 61 67 65 66
69 6C 65 2E 73 79 73 A5 62 40 54 61 31 62 40 69
08 62 40 74 64 46 69 6C 65 68 63 6F 6E 74 65 6E
74 73 A1 63 76 61 6C 71 6C 6F 74 73 20 6F 66 20
73 77 61 70 20 64 61 74 61 64 6E 61 6D 65 A1 63
76 61 6C 6C 73 77 61 70 66 69 6C 65 2E 73 79 73
A5 62 40 54 61 31 62 40 69 09 62 40 74 65 4D 6F
75 6E 74 64 6E 61 6D 65 A1 63 76 61 6C 68 53 6F
6D 65 55 73 65 72 66 74 61 72 67 65 74 A2 62 40
54 61 24 62 40 6C 0B 64 6E 61 6D 65 A1 63 76 61
6C 60 A5 62 40 54 61 31 62 40 69 0A 62 40 74 65
44 72 69 76 65 66 6C 65 74 74 65 72 A1 63 76 61
6C 18 45 68 72 6F 6F 74 5F 64 69 72 A5 62 40 54
61 31 62 40 69 0B 62 40 74 69 44 69 72 65 63 74
6F 72 79 67 65 6E 74 72 69 65 73 A2 62 40 54 61
2A 62 40 64 82 A5 62 40 54 61 31 62 40 69 0C 62
40 74 64 46 69 6C 65 68 63 6F 6E 74 65 6E 74 73
A1 63 76 61 6C 64 74 65 73 74 64 6E 61 6D 65 A1
63 76 61 6C 6D 74 65 73 74 2D 63 6F 6E 74 65 6E
74 73 A5 62 40 54 61 31 62 40 69 0D 62 40 74 64
46 69 6C 65 68 63 6F 6E 74 65 6E 74 73 A1 63 76
61 6C 65 74 65 73 74 32 64 6E 61 6D 65 A1 63 76
61 6C 6E 74 65 73 74 32 2D 63 6F 6E 74 65 6E 74
73 64 6E 61 6D 65 A1 63 76 61 6C 60
Complete file listings¶
directory.tree¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 | // Attach \file docstrings to the generated files for Doxygen. # Implementation for classes representing a Windows directory tree. source # Header for classes representing a Windows directory tree. header // Include tree base classes. include "tree-base.hpp" tree_namespace tree::base // Include primitive types. include "primitives.hpp" import primitives // Initialization function to use to construct default values for the tree base // classes and primitives. initialize_function primitives::initialize serdes_functions primitives::serialize primitives::deserialize // Set the namespace for the generated classes and attach a docstring. # Namespace for classes representing a Windows directory tree. namespace directory # Root node, containing the drives and associated directory trees. system { # The drives available on the system. There must be at least one. drives: Many<drive>; } # Represents a drive. drive { # The drive letter used to identify it. letter: primitives::Letter; # Root directory. root_dir: One<directory>; } # Represents a directory entry. entry { # Name of the directory entry. name: primitives::String; # Represents a regular file. file { # The file contents. contents: primitives::String; } # Represents a (sub)directory. directory { # The directory contents. Note that directories can be empty. entries: Any<entry>; } # Represents a link to another directory. mount { # The directory linked to. target: Link<directory>; } } |
primitives.hpp¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 | /** \file * Defines primitives used in the generated directory tree structure. */ #pragma once #include <string> /** * Namespace with primitives used in the generated directory tree structure. */ namespace primitives { /** * Letter primitive, used to represent drive letters. */ using Letter = char; /** * Strings, used to represent filenames and file contents. */ using String = std::string; /** * Initialization function. This must be specialized for any types used as * primitives in a tree that are actual C primitives (int, char, bool, etc), * as these are not initialized by the T() construct. */ template <class T> T initialize() { return T(); }; /** * Declare the default initializer for drive letters. It's declared inline * to avoid having to make a cpp file just for this. */ template <> inline Letter initialize<Letter>() { return 'A'; } /** * Serialization function. This must be specialized for any types used as * primitives in a tree. The default implementation doesn't do anything. */ template <typename T> void serialize(const T &obj, tree::cbor::MapWriter &map) { } /** * Serialization function for Letter. */ template <> inline void serialize<Letter>(const Letter &obj, tree::cbor::MapWriter &map) { map.append_int("val", obj); } /** * Serialization function for String. */ template <> inline void serialize<String>(const String &obj, tree::cbor::MapWriter &map) { map.append_string("val", obj); } /** * Deserialization function. This must be specialized for any types used as * primitives in a tree. The default implementation doesn't do anything. */ template <typename T> T deserialize(const tree::cbor::MapReader &map) { return initialize<T>(); } /** * Deserialization function for Letter. */ template <> inline Letter deserialize<Letter>(const tree::cbor::MapReader &map) { return map.at("val").as_int(); } /** * Deserialization function for String. */ template <> inline String deserialize<String>(const tree::cbor::MapReader &map) { return map.at("val").as_string(); } } // namespace primitives |
primitives.py¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 | class Letter(str): """Letter primitive, used to represent drive letters.""" def __new__(cls, s='A'): return str.__new__(cls, s) class String(str): """Strings, used to represent filenames and file contents.""" pass def serialize(typ, val): """Serialization function.""" # Serialization formats of primitives. if typ is Letter: return {'val': ord(val)} if typ is String: return {'val': val} # Serialization formats of annotations. if isinstance(typ, str): return val # Some unknown type. raise TypeError('no known serialization of type %r, value %r' % (typ, val)) def deserialize(typ, val): """Deserialization function.""" # Serialization formats of primitives. if typ is Letter: return Letter(chr(val['val'])) if typ is String: return String(val['val']) # Serialization formats of annotations. if isinstance(typ, str): return val # Some unknown type. raise TypeError('no known deserialization for type %r, value %r' % (typ, val)) |
CMakeLists.txt¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | cmake_minimum_required(VERSION 3.12 FATAL_ERROR) project(directory-example CXX) # Add the tree-gen repository root directory. Normally, your project would be # located outside of tree-gen's root, which means you can omit the second # argument. add_subdirectory(../.. tree-gen) # Generates the files for the directory tree. generate_tree_py( "${CMAKE_CURRENT_SOURCE_DIR}/directory.tree" "${CMAKE_CURRENT_BINARY_DIR}/directory.hpp" "${CMAKE_CURRENT_BINARY_DIR}/directory.cpp" "${CMAKE_CURRENT_BINARY_DIR}/directory.py" ) add_executable( directory-example "${CMAKE_CURRENT_BINARY_DIR}/directory.cpp" "${CMAKE_CURRENT_SOURCE_DIR}/main.cpp" ) target_include_directories( directory-example # This directory for primitives.hpp: PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}" # Binary directory for directory.hpp: PRIVATE "${CMAKE_CURRENT_BINARY_DIR}" ) target_link_libraries(directory-example tree-lib) # The following lines only serve to register the example as a test, so you can # run it using `make test` or `ctest` as well. You don't need them as such in # your own project. enable_testing() add_test( NAME directory-example COMMAND directory-example WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} ) # Only add the Python test if CMake is new enough for us to not have to bother # with FindPythonInterp. if(NOT ${CMAKE_VERSION} VERSION_LESS "3.12") find_package(Python3 COMPONENTS Interpreter) if(${Python3_FOUND}) add_test( NAME directory-example-py COMMAND ${Python3_EXECUTABLE} main.py ${CMAKE_CURRENT_BINARY_DIR} WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR} ) endif() endif() |
main.cpp¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 | #include <iostream> #include <cstdio> #include <stdexcept> #include "../utils.hpp" // Include the generated file. #include "directory.hpp" // Note: the // comment contents of main(), together with the MARKER lines and // the output of the program, are used to automatically turn this into a // restructured-text page for ReadTheDocs. int main() { // ********************** // Directory tree example // ********************** // // This example illustrates the tree system with a Windows-like directory // tree structure. The ``System`` root node consists of one or more drives, // each of which has a drive letter and a root directory with named files // and subdirectories. To make things a little more interesting, a // symlink-like "mount" is added as a file type, which links to another // directory. // // Using this tree, this example should teach you the basics of using // tree-gen trees. The tutorial runs you through the C++ code of // ``main.cpp`` and Python code of ``main.py``, but be sure to check out // ``directory.tree``, ``primitives.hpp``, ``primitives.hpp``, and (if you // want to reproduce it in your own project) ``CMakeLists.txt`` copied to // the bottom of this page as well. You will also find the complete // ``main.cpp`` and ``main.py`` there. // // Let's start by making the root node using ``tree::base::make()``. This // works somewhat like ``std::make_shared()``, but instead of returning a // ``shared_ptr`` it returns a ``One`` edge. This might come off as a bit // odd, considering trees in graph theory start with a node rather than an // edge. This is just a choice, though; a side effect of how the internal // ``shared_ptr``s work and how Link/OptLink edges work (if you'd store the // tree as the root structure directly, the root node wouldn't always be // stored in the same place on the heap, breaking link nodes). auto system = tree::base::make<directory::System>(); MARKER // At all times, you can use the ``dump()`` method on a node to see what it // looks like for debugging purposes. It looks like this: system->dump(); MARKER // While this system node exists as a tree in memory and tree-gen seems // happy, our system tree is at this time not "well-formed". A tree node (or // an edge to one) is only considered well-formed if all of the following // are true: // // - All ``One`` edges in the tree connect to exactly one node. // - All ``Many`` edges in the tree connect to at least one node. // - All ``Link`` and non-empty ``OptLink`` nodes refer to a node reachable // from the root node. // - Each node in the tree is only used by a non-link node once. // // Currently, the second requirement is not met. ASSERT(!system.is_well_formed()); MARKER // You can get slightly more information using ``check_well_formed()``; // instead or returning a boolean, it will throw a ``NotWellFormed`` // exception with an error message. ASSERT_RAISES(tree::base::NotWellFormed, system.check_well_formed()); MARKER // Note that the name of the type is // `"mangled" <https://en.wikipedia.org/wiki/Name_mangling#C++>`_. The exact // output will vary from compiler to compiler, or even from project to // project. But hopefully it'll be readable enough to make sense of in // general. // To fix that, let's add a default drive node to the tree. This should get // drive letter ``A``, because primitives::initialize() is specialized to // return that for ``Letter``s (see primitives.hpp). system->drives.add(tree::base::make<directory::Drive>()); MARKER // ``Drive`` has a ``One`` edge that is no empty, though, so the tree still // isn't well-formed. Let's add one of those as well. system->drives[0]->root_dir = tree::base::make<directory::Directory>(); MARKER // Now we have a well-formed tree. Let's have a look: system.check_well_formed(); system->dump(); MARKER // We can just change the drive letter by assignment, as you would expect. system->drives[0]->letter = 'C'; MARKER // Before we add files and directories, let's make a shorthand variable for // the root directory. Because root_dir is an edge to another node rather // than the node itself, and thus acts like a pointer or reference to it, // we can just copy it into a variable and modify the variable to update the // tree. Note that you'd normally just write "auto" for the type for // brevity; the type is shown here to make explicit what it turns into. directory::One<directory::Directory> root = system->drives[0]->root_dir; MARKER // Let's make a "Program Files" subdirectory and add it to the root. auto programs = tree::base::make<directory::Directory>( tree::base::Any<directory::Entry>{}, "Program Files"); root->entries.add(programs); system.check_well_formed(); MARKER // That's quite verbose. But in most cases it can be written way shorter. // Here's the same with the less versatile but also less verbose emplace() // method (which avoids the tree::make() call, but doesn't allow an // insertion index to be specified) and with a "using namespace" for the // tree. emplace() can also be chained, allowing multiple files and // directories to be added at once in this case. { using namespace directory; root->entries.emplace<Directory>(Any<Entry>{}, "Windows") .emplace<Directory>(Any<Entry>{}, "Users") .emplace<File>("lots of hibernation data", "hiberfil.sys") .emplace<File>("lots of page file data", "pagefile.sys") .emplace<File>("lots of swap data", "swapfile.sys"); } system.check_well_formed(); MARKER // In order to look for a file in a directory, you'll have to make your own // function to iterate over them. After all, tree-gen doesn't know that the // name field is a key; it has no concept of a key-value store. This is // simple enough to make, but to prevent this example from getting out of // hand we'll just use indices for now. // Let's try to read the "lots of data" string from pagefile.sys. ASSERT(root->entries[4]->name == "pagefile.sys"); // ASSERT(root->entries[4]->contents == "lots of page file data"); // '-> No member named 'contents' in 'directory::Entry' MARKER // Hm, that didn't work so well, because C++ doesn't know that entry 4 // happens to be a file rather than a directory or a mount. We have to tell // it to cast to a file first (which throws a std::bad_cast if it's not // actually a file). The easiest way to do that is like this: ASSERT(root->entries[4]->as_file()->contents == "lots of page file data"); MARKER // No verbose casts required; tree-gen will make member functions for all // the possible subtypes. // While it's possible to put the same node in a tree twice (without using // a link), this is not allowed. This isn't checked until a well-formedness // check is performed, however (and in fact can't be without having access // to the root node). root->entries.add(root->entries[0]); ASSERT_RAISES(tree::base::NotWellFormed, system.check_well_formed()); MARKER // Note that we can index nodes Python-style with negative integers for // add() and remove(), so remove(-1) will remove the broken node we just // added. Note that the -1 is not actually necessary, though, as it is the // default. root->entries.remove(-1); system.check_well_formed(); MARKER // We *can*, of course, add copies of nodes. That's what copy (shallow) and // clone (deep) are for. Usually you'll want a deep copy, but in this case // shallow is fine, because a File has no child nodes. Note that, even for // a deep copy, links are not modified; it is intended to copy a subtree to // another part of the same tree. To make a complete copy of a tree that // maintains link semantics (i.e. does not link back to the original tree) // the best way would be to go through serialize/deserialize. root->entries.add(root->entries[0]->copy()); system.check_well_formed(); MARKER // Note that the generated classes don't care that there are now two // directories named Program Files in the root. As far as they're concerned, // they're two different directories with the same name. Let's remove it // again, though. root->entries.remove(); MARKER // Something we haven't looked at yet are links. Links are edges in the tree // that, well, turn it into something that isn't strictly a tree anymore. // While One/Maybe/Any/Many require that nodes are unique, Link/OptLink // require that they are *not* unique, and are present elsewhere in the // tree. Let's make a new directory, and mount it in the Users directory. auto user_dir = tree::base::make<directory::Directory>( tree::base::Any<directory::Entry>{}, ""); root->entries.emplace<directory::Mount>(user_dir, "SomeUser"); MARKER // Note that user_dir is not yet part of the tree. emplace works simply // because it doesn't check whether the directory is in the tree yet. But // the tree is no longer well-formed now. ASSERT_RAISES(tree::base::NotWellFormed, system.check_well_formed()); MARKER // To make it work again, we can add it as a root directory to a second // drive. system->drives.emplace<directory::Drive>('D', user_dir); system.check_well_formed(); MARKER // A good way to confuse a filesystem is to make loops. tree-gen is okay // with this, though. system->drives[1]->root_dir->entries.emplace<directory::Mount>(root, "evil link to C:"); system.check_well_formed(); MARKER // The only place where it matters is in the dump function, which only goes // one level deep. After that, it'll just print an ellipsis. system->dump(); MARKER // Now that we have a nice tree, let's try the serialization and // deserialization functionality. Serializing is easy: std::string cbor = tree::base::serialize(system); MARKER // Let's write it to a file; we'll load this in Python later. { std::ofstream cbor_output; cbor_output.open("tree.cbor", std::ios::out | std::ios::trunc | std::ios::binary); cbor_output << cbor; } MARKER // Let's also have a look at a hexdump of that. int count = 0; for (char c : cbor) { if (count == 16) { std::printf("\n"); count = 0; } else if (count > 0 && count % 4 == 0) { std::printf(" "); } std::printf("%02X ", (uint8_t)c); count++; } std::printf("\n"); MARKER // You can pull that through http://cbor.me and https://jsonformatter.org // to inspect the output, if you like. // We can deserialize it into a complete copy of the tree. auto system2 = tree::base::deserialize<directory::System>(cbor); system2->dump(); MARKER // Note that equality for two link edges is satisfied only if they point to // the exact same node. That's not the case for the links in our two // entirely separate trees, so the two trees register as unequal. ASSERT(!system2.equals(system)); MARKER // To be sure no data was lost, we'll have to check the CBOR and debug dumps // instead. ASSERT(tree::base::serialize(system) == tree::base::serialize(system2)); std::ostringstream ss1{}; system->dump(ss1); std::ostringstream ss2{}; system2->dump(ss2); ASSERT(ss1.str() == ss2.str()); MARKER return 0; } |
main.py¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 | import sys, os, traceback TEST_DIR = os.path.realpath(sys.argv[1]) sys.path.append(TEST_DIR) def marker(): print('###MARKER###') # Note: the # | comment contents of this file, together with the marker() lines # and the output of the program, are used to automatically turn this into a # restructured-text page for ReadTheDocs. The output is appended to the C++ # output. # | Python usage # | ============ # | Let us now turn our attention to tree-gen's Python support. As you may # | remember from the introduction, tree-gen does not provide a direct interface # | between Python and C++ using a tool like SWIG; rather, it provides # | conversion between a CBOR tree representation and the in-memory # | representations of both languages. It then becomes trivial for users of the # | tree-gen structure to provide an API to users that converts from one to the # | other, since only a single string of bytes has to be transferred. # | Recall that we wrote the CBOR representation of the tree we constructed in # | the C++ example to a file. Let's try loading this file with Python. from directory import * with open(os.path.join(TEST_DIR, 'tree.cbor'), 'rb') as f: tree = System.deserialize(f.read()) tree.check_well_formed() print(tree) marker() # | And there we go; the same structure in pure Python. # | As you might expect from a pure-Python structure, field access is much less # | verbose than it is in C++. print(tree.drives[0].letter) print(tree.drives[0].root_dir.entries[0].name) marker() # | Perhaps contrary to Python intuition however, the structure provides # | type-safety. You can't assign values to fields that don't exist... try: tree.color_of_a_banana = 'yellow' except AttributeError: traceback.print_exc(file=sys.stdout) marker() # | ... or assign values of incorrect types to ones that do: try: tree.drives[0] = 'not-a-drive' except TypeError: traceback.print_exc(file=sys.stdout) marker() # | Note that the setters will however try to "typecast" objects they don't know # | about (to be more specific, anything that isn't an instance of Node). So if # | we try to assign a string directly to tree.drives, you get something you # | might not have expected: try: tree.drives = 'not-a-set-of-drives' except TypeError: traceback.print_exc(file=sys.stdout) marker() # | The setter for the drives property is trying to cast the string to a # | MultiDrive, which interprets its string input as any other iterable, and # | ultimately tries to cast the first letter to a Drive. There's not much that # | can be done about this; it's just how duck typing works. # | Manipulation of trees in Python works just as you might expect it to in # | Python: tree.drives.append(Drive('E')) tree.drives[-1].root_dir = Directory([ File('test', 'test-contents'), File('test2', 'test2-contents'), ]) del tree.drives[1] assert not tree.is_well_formed() tree.drives[0].root_dir.entries[-1].target = tree.drives[1].root_dir tree.check_well_formed() print(tree) marker() # | Once the Python code is done manipulating the structure, it can be # | serialized again. cbor = tree.serialize() assert cbor == System.deserialize(cbor).serialize() count = 0 for c in cbor: if count == 16: print() count = 0 elif count > 0 and count % 4 == 0: print(' ', end='') print('%02X ' % c, end='') count += 1 print() marker() |