hammer2 - update documentation, begin working on callback I/O
[dragonfly.git] / sys / vfs / hammer2 / TODO

* Currently the check code (bref.methods / crc, sha, etc) is being checked
  every single blasted time a chain is locked, even if the underlying buffer
  was previously checked for that chain. This needs an optimization to
  (significantly) improve performance.

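  One possible shape for the check code optimization above is to remember
  a successful check in a per-chain flag and only redo it after the buffer
  changes. This is a minimal userland sketch, not the hammer2 code: struct
  demo_chain, DEMO_CHAIN_CHECKED, demo_checksum() and demo_chain_lock() are
  hypothetical names, and the FNV-1a hash merely stands in for whatever
  method bref.methods selects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CHAIN_CHECKED 0x0001u   /* hypothetical flag: buffer already verified */

struct demo_chain {
    uint32_t       flags;
    uint32_t       check;     /* expected check value (stand-in for the bref check field) */
    const uint8_t *data;      /* underlying buffer */
    size_t         bytes;
    unsigned       nverify;   /* instrumentation: how often we really verified */
};

/* Toy check code (FNV-1a); an assumption, not one of the real check methods. */
static uint32_t demo_checksum(const uint8_t *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Lock path: verify only if this chain has not been verified yet.
 * Returns 0 on success, -1 on a check code mismatch. */
static int demo_chain_lock(struct demo_chain *c)
{
    assert(c != NULL);
    if ((c->flags & DEMO_CHAIN_CHECKED) == 0) {
        c->nverify++;
        if (demo_checksum(c->data, c->bytes) != c->check)
            return -1;
        c->flags |= DEMO_CHAIN_CHECKED;
    }
    return 0;
}

/* Must be called whenever the buffer is replaced or modified. */
static void demo_chain_invalidate(struct demo_chain *c)
{
    c->flags &= ~DEMO_CHAIN_CHECKED;
}
```

  The flag only helps if every path that swaps or dirties the buffer clears
  it, which is the part that needs care in the real code.
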
* flush synchronization boundary crossing check and current flush chain
  interlock needed.

* snapshot creation must allocate and separately pass a new pmp for the pfs
  degenerate 'cluster' representing the snapshot. This theoretically will
  also allow a snapshot to be generated inside a cluster of more than one
  node.

* snapshot copy currently also copies uuids and can confuse cluster code

* hidden dir or other dirs/files/modifications made to PFS before
  additional cluster entries added.

* transaction on cluster - multiple trans structures, subtrans

* inode always contains target cluster/chain, not hardlink

* chain refs in cluster, cluster refs

* check inode shared lock ... can end up in endless loop if following
  hardlink because ip->chain is not updated in the exclusive lock cycle
  when following hardlink.

cpdup /build/boomdata/jails/bleeding-edge/usr/share/man/man4 /mnt/x3

    * The block freeing code. At the very least a bulk scan is needed
      to implement freeing blocks.

    * Crash stability. Right now the allocation table on-media is not
      properly synchronized with the flush. This needs to be adjusted
      such that H2 can do an incremental scan on mount to fix up
      allocations as part of its crash recovery mechanism.

    * We actually have to start checking and acting upon the CRCs being
      generated.

    * Remaining known hardlink issues need to be addressed.

    * Core 'copies' mechanism needs to be implemented to support multiple
      copies on the same media.

    * Core clustering mechanism needs to be implemented to support
      mirroring and basic multi-master operation from a single host
      (multi-host requires additional network protocols and won't
      be as easy).

* make sure we aren't using a shared lock during RB_SCANs?

* overwrite in write_file case w/compression - if device block size changes
  the block has to be deleted and reallocated. See hammer2_assign_physical()
  in vnops.

* freemap / clustering. Set block size on 2MB boundary so the cluster code
  can be used for reading.

* need API layer for shared buffers (unfortunately).

* add magic number to inode header, add parent inode number too, to
  help with brute-force recovery.

* modifications past our flush point do not adjust vchain.
  need to make vchain dynamic so we can (see flush_scan2).??

* MINIOSIZE/RADIX set to 1KB for now to avoid buffer cache deadlocks
  on multiple locked inodes. Fix so we can use LBUFSIZE! Or,
  alternatively, allow a smaller I/O size based on the sector size
  (not optimal though).

* When making a snapshot, do not allow the snapshot to be mounted until
  the in-memory chain has been freed in order to break the shared core.

* Snapshotting a sub-directory does not snapshot any
  parent-directory-spanning hardlinks.

* Snapshot / flush-synchronization point. Remodified data that crosses
  the synchronization boundary is not currently reallocated. See
  hammer2_chain_modify(), explicit check (requires logical buffer cache
  buffer handling).

* On fresh mount with multiple hardlinks present, separate lookups will
  result in separate vnodes pointing to separate inodes pointing to a
  common chain (the hardlink target).

  When the hardlink target consolidates upward only one vp/ip will be
  adjusted. We need code to fix up the other chains (probably put in
  inode_lock_*()) which will be pointing to an older deleted hardlink
  target.

* Filesystem must ensure that modify_tid is not too large relative to
  the iterator in the volume header, on load, or flush sequencing will
  not work properly. We should be able to just override it, but we
  should complain if it happens.

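  The load-time check described above might look roughly like the sketch
  below. It assumes "too large" means strictly greater than the volume
  header iterator and that the override clamps modify_tid down to the
  iterator; both are guesses. demo_clamp_modify_tid() is a hypothetical
  helper, not an existing hammer2 function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns the modify_tid to use after load.  If the stored value is ahead
 * of the volume header iterator it is overridden and a complaint is
 * printed; *complained reports whether that happened. */
static uint64_t demo_clamp_modify_tid(uint64_t modify_tid,
                                      uint64_t vol_iterator,
                                      int *complained)
{
    assert(complained != NULL);
    *complained = 0;
    if (modify_tid > vol_iterator) {
        fprintf(stderr,
                "demo: modify_tid %016llx exceeds iterator %016llx, "
                "overriding\n",
                (unsigned long long)modify_tid,
                (unsigned long long)vol_iterator);
        *complained = 1;
        return vol_iterator;
    }
    return modify_tid;
}
```
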
* Kernel-side needs to clean up transaction queues and make appropriate
  callbacks.

* Userland side needs to do the same for any initiated transactions.

* Nesting problems in the flusher.

* Inefficient vfsync due to thousands of file buffers, one per-vnode.
  (need to aggregate using a device buffer?)

* Use bp->b_dep to interlock the buffer with the chain structure so the
  strategy code can calculate the crc and assert that the chain is marked
  modified (not yet flushed).

* Deleted inode not reachable via tree for volume flush but still reachable
  via fsync/inactive/reclaim. Its tree can be destroyed at that point.

* The direct write code needs to invalidate any underlying physical buffers.
  Direct write needs to be implemented.

* Make sure a resized block (hammer2_chain_resize()) calculates a new
  hash code in the parent bref

* The freemap allocator needs to getblk/clrbuf/bdwrite any partial
  block allocations (less than 64KB) that allocate out of a new 64K
  block, to avoid causing a read-before-write I/O.

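  The condition in the freemap item above can be sketched as a small
  predicate. This is a userland illustration only: it assumes the
  allocator carves a new 64KB block starting at its base offset, and
  demo_prepare_block() uses memset() as a stand-in for the
  getblk/clrbuf/bdwrite sequence. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLOCK_BYTES 65536u   /* 64KB allocation block */

/* Nonzero when the allocation is a partial one (less than 64KB) that
 * begins a fresh 64KB block, i.e. nothing in that block is live yet.
 * Assumes new blocks are always carved from their base offset. */
static int demo_is_first_partial(uint64_t off, uint32_t bytes)
{
    assert(bytes > 0 && bytes <= DEMO_BLOCK_BYTES);
    return (off & (DEMO_BLOCK_BYTES - 1)) == 0 && bytes < DEMO_BLOCK_BYTES;
}

/* Stand-in for getblk + clrbuf + bdwrite: zero the whole 64KB image so a
 * later partial write does not have to read the block from media first. */
static void demo_prepare_block(uint8_t *block_image, uint64_t off,
                               uint32_t bytes)
{
    if (demo_is_first_partial(off, bytes))
        memset(block_image, 0, DEMO_BLOCK_BYTES);
}
```
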
* Check flush race upward recursion setting SUBMODIFIED vs downward
  recursion checking SUBMODIFIED then locking (must clear before the
  recursion and might need additional synchronization)

* There is definitely a flush race in the hardlink implementation between
  the forwarding entries and the actual (hidden) hardlink inode.

  This will require us to associate a small hard-link-adjust structure
  with the chain whenever we create or delete hardlinks, on top of
  adjusting the hardlink inode itself. Any actual flush to the media
  has to synchronize the correct nlinks value based on whether related
  created or deleted hardlinks were also flushed.

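  A hard-link-adjust structure of the kind described above could be as
  small as a delta plus a flushed flag. The sketch below is purely
  illustrative: struct demo_hlink_adj and demo_nlinks_to_sync() are
  hypothetical, and it assumes the nlinks value written to media is the
  last flushed value plus only those adjustments whose forwarding entry
  also made it to media.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-chain record of one hardlink creation or deletion. */
struct demo_hlink_adj {
    int delta;     /* +1 forwarding entry created, -1 deleted */
    int flushed;   /* nonzero once that forwarding entry reached media */
};

/* Compute the nlinks value that is safe to write to media: start from
 * the value last flushed and apply only adjustments whose related
 * forwarding entries were flushed as well. */
static int64_t demo_nlinks_to_sync(int64_t media_nlinks,
                                   const struct demo_hlink_adj *adj,
                                   size_t n)
{
    int64_t nlinks = media_nlinks;

    assert(adj != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (adj[i].flushed)
            nlinks += adj[i].delta;
    }
    return nlinks;
}
```

  The in-memory nlinks would still reflect every adjustment; only the
  on-media value lags behind until the matching entries are flushed.
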
* When a directory entry is created and also if an indirect block is
  created and entries moved into it, the directory seek position can
  potentially become incorrect during a scan.

* When a directory entry is deleted a directory seek position depending
  on that key can cause readdir to skip entries.

* TWO PHASE COMMIT - store two data offsets in the chain, and
  hammer2_chain_delete() needs to leave the chain intact if MODIFIED2 is
  set on its buffer until the flusher gets to it?


        OPTIMIZATIONS

* If a file is unlinked but its descriptor is left open and used, we
  should allow data blocks on-media to be reused since there is no
  topology left to point at them.