* Need backend synchronization / serialization when the frontend detaches
  an XOP. modify_tid tests won't be enough, the backend may wind up executing
  the XOP out of order after the detach.

* xop_start - only start synchronized elements

* See if we can remove hammer2_inode_repoint()

* FIXME - logical buffer associated with write-in-progress on backend
  disappears once the cluster validates, even if more backend nodes
  are in progress.

* FIXME - backend ops need per-node transactions using spmp to protect
  against flush.

* FIXME - modifying backend ops are not currently validating the cluster.
  That probably needs to be done by the frontend in hammer2_xop_start()

* modify_tid handling probably broken w/ the XOP code for the moment.

* embedded transactions in XOPs - interlock early completion

* remove current incarnation of EAGAIN

* mtx locks should not track td_locks count? They can be acquired by one
  thread and released by another. Need API function for exclusive locks.

* Convert xops and hammer2_update_spans() from cluster back into chain calls

* syncthr leaves inode locks held for the entire sync, which is wrong.

* recovery scan vs unmount. At the moment an unmount does its flushes,
  and if successful the freemap will be fully up-to-date, but the mount
  code doesn't know that and the last flush batch will probably match
  the PFS root mirror_tid. If it was a large cpdup the (unnecessary)
  recovery pass at mount time can be extensive. Add a CLEAN flag to the
  volume header to optimize out the unnecessary recovery pass.

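  A minimal sketch of how the CLEAN flag idea above might look, only to pin
  down the intended ordering. Everything here is hypothetical: the struct,
  the flag name VOLHDR_FLAG_CLEAN and both function names are made up for
  illustration and are not existing hammer2 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit, not an existing volume header definition. */
#define VOLHDR_FLAG_CLEAN	0x00000001u

/* Hypothetical, heavily simplified stand-in for the volume header. */
struct volhdr_sketch {
	uint32_t flags;
};

/*
 * Unmount path: mark the volume clean only if every flush succeeded,
 * i.e. the freemap is known to be fully up-to-date.
 */
void
volhdr_mark_unmounted(struct volhdr_sketch *vh, bool flushes_ok)
{
	assert(vh != NULL);
	if (flushes_ok)
		vh->flags |= VOLHDR_FLAG_CLEAN;
	else
		vh->flags &= ~VOLHDR_FLAG_CLEAN;
}

/*
 * Mount path: returns true if the recovery scan has to run.  The flag
 * is cleared unconditionally so that a crash after this mount forces
 * the recovery pass next time.  (A real implementation would also have
 * to get the cleared flag onto the media before allowing modifications.)
 */
bool
volhdr_mount_needs_recovery(struct volhdr_sketch *vh)
{
	bool was_clean;

	assert(vh != NULL);
	was_clean = (vh->flags & VOLHDR_FLAG_CLEAN) != 0;
	vh->flags &= ~VOLHDR_FLAG_CLEAN;
	return !was_clean;
}
```

  The point of clearing the flag at mount time is that CLEAN then only ever
  describes the state left behind by a successful unmount.
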
* More complex transaction sequencing and flush merging. Right now it is
  all serialized against flushes.

* adding new pfs - freeze and force remaster

* removing a pfs - freeze and force remaster

* bulkfree - sync between passes and enforce serialization of operation

* bulkfree - signal check, allow interrupt

* bulkfree - sub-passes when kernel memory block isn't large enough

* bulkfree - limit kernel memory allocation for bmap space

* bulkfree - must include any detached vnodes in scan so open unlinked files
  are not ripped out from under the system.

* bulkfree - must include all volume headers in scan so they can be used
  for recovery or automatic snapshot retrieval.

* bulkfree - snapshot duplicate sub-tree cache and tests needed to reduce
  unnecessary re-scans.

* Currently the check code (bref.methods / crc, sha, etc) is being checked
  every single blasted time a chain is locked, even if the underlying buffer
  was previously checked for that chain. This needs an optimization to
  (significantly) improve performance.

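  One possible shape for the optimization above, as a sketch only: remember
  on the chain that its buffer already passed the check and skip the test on
  later locks until the buffer changes. The struct, the flag name
  CHAIN_FLAG_TESTEDGOOD and the function names are assumptions made for this
  example, and a 32-bit FNV-1a hash stands in for the real check methods.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHAIN_FLAG_TESTEDGOOD	0x1u	/* hypothetical flag name */

/* Hypothetical, simplified stand-in for a chain and its buffer. */
struct chain_sketch {
	uint32_t flags;
	uint32_t check;		/* expected check value */
	const uint8_t *data;	/* underlying buffer */
	size_t bytes;
	unsigned tests_run;	/* instrumentation for the example only */
};

/* FNV-1a, used here only as a stand-in for crc/sha check methods. */
static uint32_t
sketch_checksum(const uint8_t *data, size_t bytes)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < bytes; ++i) {
		h ^= data[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Called when the chain is locked.  The expensive test runs only if
 * this chain's buffer has not already been tested good.
 */
bool
chain_check_on_lock(struct chain_sketch *chain)
{
	assert(chain != NULL);
	if (chain->flags & CHAIN_FLAG_TESTEDGOOD)
		return true;
	++chain->tests_run;
	if (sketch_checksum(chain->data, chain->bytes) != chain->check)
		return false;
	chain->flags |= CHAIN_FLAG_TESTEDGOOD;
	return true;
}

/* Must be called whenever the underlying buffer is replaced or modified. */
void
chain_buffer_replaced(struct chain_sketch *chain)
{
	assert(chain != NULL);
	chain->flags &= ~CHAIN_FLAG_TESTEDGOOD;
}
```

  The hard part in the real code is the invalidation side, i.e. making sure
  every path that changes the buffer clears the cached result.
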
* flush synchronization boundary crossing check and current flush chain
  interlock needed.

* snapshot creation must allocate and separately pass a new pmp for the pfs
  degenerate 'cluster' representing the snapshot. This theoretically will
  also allow a snapshot to be generated inside a cluster of more than one
  node.

* snapshot copy currently also copies uuids and can confuse cluster code

* hidden dir or other dirs/files/modifications made to PFS before
  additional cluster entries added.

* transaction on cluster - multiple trans structures, subtrans

* inode always contains target cluster/chain, not hardlink

* chain refs in cluster, cluster refs

* check inode shared lock ... can end up in endless loop if following
  hardlink because ip->chain is not updated in the exclusive lock cycle
  when following hardlink.

cpdup /build/boomdata/jails/bleeding-edge/usr/share/man/man4 /mnt/x3

    * The block freeing code. At the very least a bulk scan is needed
      to implement freeing blocks.

    * Crash stability. Right now the allocation table on-media is not
      properly synchronized with the flush. This needs to be adjusted
      such that H2 can do an incremental scan on mount to fix up
      allocations as part of its crash recovery mechanism.

    * We actually have to start checking and acting upon the CRCs being
      generated.

    * Remaining known hardlink issues need to be addressed.

    * Core 'copies' mechanism needs to be implemented to support multiple
      copies on the same media.

    * Core clustering mechanism needs to be implemented to support
      mirroring and basic multi-master operation from a single host
      (multi-host requires additional network protocols and won't
      be as easy).

* make sure we aren't using a shared lock during RB_SCAN's?

* overwrite in write_file case w/compression - if device block size changes
  the block has to be deleted and reallocated. See hammer2_assign_physical()
  in vnops.

* freemap / clustering. Set block size on 2MB boundary so the cluster code
  can be used for reading.

* need API layer for shared buffers (unfortunately).

* add magic number to inode header, add parent inode number too, to
  help with brute-force recovery.

* modifications past our flush point do not adjust vchain.
  need to make vchain dynamic so we can (see flush_scan2).??

* MINIOSIZE/RADIX set to 1KB for now to avoid buffer cache deadlocks
  on multiple locked inodes. Fix so we can use LBUFSIZE! Or,
  alternatively, allow a smaller I/O size based on the sector size
  (not optimal though).

* When making a snapshot, do not allow the snapshot to be mounted until
  the in-memory chain has been freed in order to break the shared core.

* Snapshotting a sub-directory does not snapshot any
  parent-directory-spanning hardlinks.

* Snapshot / flush-synchronization point. remodified data that crosses
  the synchronization boundary is not currently reallocated. see
  hammer2_chain_modify(), explicit check (requires logical buffer cache
  buffer handling).

* on fresh mount with multiple hardlinks present separate lookups will
  result in separate vnodes pointing to separate inodes pointing to a
  common chain (the hardlink target).

  When the hardlink target consolidates upward only one vp/ip will be
  adjusted. We need code to fixup the other chains (probably put in
  inode_lock_*()) which will be pointing to an older deleted hardlink
  target.

* Filesystem must ensure that modify_tid is not too large relative to
  the iterator in the volume header, on load, or flush sequencing will
  not work properly. We should be able to just override it, but we
  should complain if it happens.

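  A sketch of the load-time check described above. It assumes that "too
  large" means greater than the volume header iterator and that "override"
  means clamping modify_tid down to the iterator. Both are readings of the
  note, not confirmed behavior, and the function name is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Returns the modify_tid to use after load.  If the loaded value is
 * ahead of the volume header's iterator (assumed meaning of "too
 * large"), complain and override it with the iterator value.
 */
uint64_t
sketch_load_modify_tid(uint64_t modify_tid, uint64_t vol_tid_iter,
		       bool *overridden)
{
	assert(overridden != NULL);
	*overridden = false;
	if (modify_tid > vol_tid_iter) {
		fprintf(stderr,
			"sketch: modify_tid %016" PRIx64
			" exceeds volume iterator %016" PRIx64
			", overriding\n",
			modify_tid, vol_tid_iter);
		*overridden = true;
		return vol_tid_iter;
	}
	return modify_tid;
}
```
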
* Kernel-side needs to clean up transaction queues and make appropriate
  callbacks.

* Userland side needs to do the same for any initiated transactions.

* Nesting problems in the flusher.

* Inefficient vfsync due to thousands of file buffers, one per-vnode.
  (need to aggregate using a device buffer?)

* Use bp->b_dep to interlock the buffer with the chain structure so the
  strategy code can calculate the crc and assert that the chain is marked
  modified (not yet flushed).

* Deleted inode not reachable via tree for volume flush but still reachable
  via fsync/inactive/reclaim. Its tree can be destroyed at that point.

* The direct write code needs to invalidate any underlying physical buffers.
  Direct write needs to be implemented.

* Make sure a resized block (hammer2_chain_resize()) calculates a new
  hash code in the parent bref

* The freemap allocator needs to getblk/clrbuf/bdwrite any partial
  block allocations (less than 64KB) that allocate out of a new 64K
  block, to avoid causing a read-before-write I/O.

* Check flush race upward recursion setting SUBMODIFIED vs downward
  recursion checking SUBMODIFIED then locking (must clear before the
  recursion and might need additional synchronization)

* There is definitely a flush race in the hardlink implementation between
  the forwarding entries and the actual (hidden) hardlink inode.

  This will require us to associate a small hard-link-adjust structure
  with the chain whenever we create or delete hardlinks, on top of
  adjusting the hardlink inode itself. Any actual flush to the media
  has to synchronize the correct nlinks value based on whether related
  created or deleted hardlinks were also flushed.

* When a directory entry is created and also if an indirect block is
  created and entries moved into it, the directory seek position can
  potentially become incorrect during a scan.

* When a directory entry is deleted a directory seek position depending
  on that key can cause readdir to skip entries.

* TWO PHASE COMMIT - store two data offsets in the chain, and
  hammer2_chain_delete() needs to leave the chain intact if MODIFIED2 is
  set on its buffer until the flusher gets to it?


    OPTIMIZATIONS

* If a file is unlinked but its descriptor is left open and used, we
  should allow data blocks on-media to be reused since there is no
  topology left to point at them.