Commit | Line | Data |
---|---|---|
21a90458 | 1 | |
e2163f5b MD |
2 | * Need backend synchronization / serialization when the frontend detaches |
3 | an XOP. modify_tid tests won't be enough, the backend may wind up executing | |
4 | the XOP out of order after the detach. | |
159c3ca2 | 5 | |
8cd26e36 MD |
6 | * xop_start - only start synchronized elements |
7 | ||
12ff971c MD |
8 | * See if we can remove hammer2_inode_repoint() |
9 | ||
d34788ef MD |
10 | * FIXME - logical buffer associated with write-in-progress on backend |
11 | disappears once the cluster validates, even if more backend nodes | |
12 | are in progress. | |
13 | ||
14 | * FIXME - backend ops need per-node transactions using spmp to protect | |
15 | against flush. | |
16 | ||
17 | * FIXME - modifying backend ops are not currently validating the cluster. | |
18 | That probably needs to be done by the frontend in hammer2_xop_start(). | |
19 | ||
20 | * modify_tid handling probably broken w/ the XOP code for the moment. | |
21 | ||
c603b86b MD |
22 | * embedded transactions in XOPs - interlock early completion |
23 | ||
24 | * remove current incarnation of EAGAIN | |
25 | ||
c847e838 MD |
26 | * mtx locks should not track td_locks count? They can be acquired by one |
27 | thread and released by another. Need API function for exclusive locks. | |
28 | ||
159c3ca2 MD |
29 | * Convert xops and hammer2_update_spans() from cluster back into chain calls |
30 | ||
e513e77e MD |
31 | * syncthr leaves inode locks for entire sync, which is wrong. |
32 | ||
5ceaaa82 MD |
33 | * recovery scan vs unmount. At the moment an unmount does its flushes, |
34 | and if successful the freemap will be fully up-to-date, but the mount | |
35 | code doesn't know that and the last flush batch will probably match | |
36 | the PFS root mirror_tid. If it was a large cpdup the (unnecessary) | |
37 | recovery pass at mount time can be extensive. Add a CLEAN flag to the | |
38 | volume header to optimize out the unnecessary recovery pass. | |
39 | ||
e513e77e MD |
40 | * More complex transaction sequencing and flush merging. Right now it is |
41 | all serialized against flushes. | |
42 | ||
5ceaaa82 MD |
43 | * adding new pfs - freeze and force remaster |
44 | ||
45 | * removing a pfs - freeze and force remaster | |
46 | ||
464659a3 MD |
47 | * bulkfree - sync between passes and enforce serialization of operation |
48 | ||
49 | * bulkfree - signal check, allow interrupt | |
50 | ||
51 | * bulkfree - sub-passes when kernel memory block isn't large enough | |
52 | ||
53 | * bulkfree - limit kernel memory allocation for bmap space | |
54 | ||
55 | * bulkfree - must include any detached vnodes in scan so open unlinked files | |
56 | are not ripped out from under the system. | |
57 | ||
58 | * bulkfree - must include all volume headers in scan so they can be used | |
59 | for recovery or automatic snapshot retrieval. | |
60 | ||
61 | * bulkfree - snapshot duplicate sub-tree cache and tests needed to reduce | |
62 | unnecessary re-scans. | |
63 | ||
e07becf8 MD |
64 | * Currently the check code (bref.methods / crc, sha, etc) is being checked |
65 | every single blasted time a chain is locked, even if the underlying buffer | |
66 | was previously checked for that chain. This needs an optimization to | |
67 | (significantly) improve performance. | |
68 | ||
da6f36f4 MD |
69 | * flush synchronization boundary crossing check and current flush chain |
70 | interlock needed. | |
50456506 MD |
71 | |
72 | * snapshot creation must allocate and separately pass a new pmp for the pfs | |
73 | degenerate 'cluster' representing the snapshot. This theoretically will | |
74 | also allow a snapshot to be generated inside a cluster of more than one | |
75 | node. | |
76 | ||
77 | * snapshot copy currently also copies uuids and can confuse cluster code | |
78 | ||
58e43599 MD |
79 | * hidden dir or other dirs/files/modifications made to PFS before |
80 | additional cluster entries added. | |
81 | ||
278ab2b2 MD |
82 | * transaction on cluster - multiple trans structures, subtrans |
83 | ||
84 | * inode always contains target cluster/chain, not hardlink | |
85 | ||
278ab2b2 MD |
86 | * chain refs in cluster, cluster refs |
87 | ||
72ebfa75 MD |
88 | * check inode shared lock ... can end up in endless loop if following |
89 | hardlink because ip->chain is not updated in the exclusive lock cycle | |
90 | when following hardlink. | |
91 | ||
0924b3f8 MD |
92 | cpdup /build/boomdata/jails/bleeding-edge/usr/share/man/man4 /mnt/x3 |
93 | ||
623d43d4 MD |
94 | |
95 | * The block freeing code. At the very least a bulk scan is needed | |
96 | to implement freeing blocks. | |
97 | ||
98 | * Crash stability. Right now the allocation table on-media is not | |
99 | properly synchronized with the flush. This needs to be adjusted | |
100 | such that H2 can do an incremental scan on mount to fixup | |
101 | allocations as part of its crash recovery mechanism. | |
102 | ||
103 | * We actually have to start checking and acting upon the CRCs being | |
104 | generated. | |
105 | ||
106 | * Remaining known hardlink issues need to be addressed. | |
107 | ||
108 | * Core 'copies' mechanism needs to be implemented to support multiple | |
109 | copies on the same media. | |
110 | ||
111 | * Core clustering mechanism needs to be implemented to support | |
112 | mirroring and basic multi-master operation from a single host | |
113 | (multi-host requires additional network protocols and won't | |
114 | be as easy). | |
115 | ||
fdf62707 MD |
116 | * make sure we aren't using a shared lock during RB_SCAN's? |
117 | ||
91abd410 MD |
118 | * overwrite in write_file case w/compression - if device block size changes |
119 | the block has to be deleted and reallocated. See hammer2_assign_physical() | |
120 | in vnops. | |
121 | ||
1a7cfe5a MD |
122 | * freemap / clustering. Set block size on 2MB boundary so the cluster code |
123 | can be used for reading. | |
124 | ||
125 | * need API layer for shared buffers (unfortunately). | |
126 | ||
731b2a84 MD |
127 | * add magic number to inode header, add parent inode number too, to |
128 | help with brute-force recovery. | |
129 | ||
130 | * modifications past our flush point do not adjust vchain. | |
131 | need to make vchain dynamic so we can (see flush_scan2).?? | |
132 | ||
1a7cfe5a MD |
133 | * MINIOSIZE/RADIX set to 1KB for now to avoid buffer cache deadlocks |
134 | on multiple locked inodes. Fix so we can use LBUFSIZE! Or, | |
135 | alternatively, allow a smaller I/O size based on the sector size | |
136 | (not optimal though). | |
137 | ||
a864c5d9 MD |
138 | * When making a snapshot, do not allow the snapshot to be mounted until |
139 | the in-memory chain has been freed in order to break the shared core. | |
140 | ||
141 | * Snapshotting a sub-directory does not snapshot any | |
142 | parent-directory-spanning hardlinks. | |
143 | ||
731b2a84 MD |
144 | * Snapshot / flush-synchronization point. remodified data that crosses |
145 | the synchronization boundary is not currently reallocated. see | |
146 | hammer2_chain_modify(), explicit check (requires logical buffer cache | |
147 | buffer handling). | |
148 | ||
51bf8e9b MD |
149 | * on fresh mount with multiple hardlinks present separate lookups will |
150 | result in separate vnodes pointing to separate inodes pointing to a | |
151 | common chain (the hardlink target). | |
152 | ||
153 | When the hardlink target consolidates upward only one vp/ip will be | |
154 | adjusted. We need code to fixup the other chains (probably put in | |
155 | inode_lock_*()) which will be pointing to an older deleted hardlink | |
156 | target. | |
157 | ||
32b800e6 MD |
158 | * Filesystem must ensure that modify_tid is not too large relative to |
159 | the iterator in the volume header, on load, or flush sequencing will | |
160 | not work properly. We should be able to just override it, but we | |
161 | should complain if it happens. | |
162 | ||
8c280d5d MD |
163 | * Kernel-side needs to clean up transaction queues and make appropriate |
164 | callbacks. | |
165 | ||
166 | * Userland side needs to do the same for any initiated transactions. | |
167 | ||
222d9e22 MD |
168 | * Nesting problems in the flusher. |
169 | ||
01eabad4 MD |
170 | * Inefficient vfsync due to thousands of file buffers, one per-vnode. |
171 | (need to aggregate using a device buffer?) | |
172 | ||
8cce658d MD |
173 | * Use bp->b_dep to interlock the buffer with the chain structure so the |
174 | strategy code can calculate the crc and assert that the chain is marked | |
175 | modified (not yet flushed). | |
176 | ||
177 | * Deleted inode not reachable via tree for volume flush but still reachable | |
178 | via fsync/inactive/reclaim. Its tree can be destroyed at that point. | |
179 | ||
866d5273 MD |
180 | * The direct write code needs to invalidate any underlying physical buffers. |
181 | Direct write needs to be implemented. | |
182 | ||
183 | * Make sure a resized block (hammer2_chain_resize()) calculates a new | |
222d9e22 | 184 | hash code in the parent bref |
866d5273 | 185 | |
995e78dc MD |
186 | * The freemap allocator needs to getblk/clrbuf/bdwrite any partial |
187 | block allocations (less than 64KB) that allocate out of a new 64K | |
188 | block, to avoid causing a read-before-write I/O. | |
189 | ||
190 | * Check flush race upward recursion setting SUBMODIFIED vs downward | |
191 | recursion checking SUBMODIFIED then locking (must clear before the | |
192 | recursion and might need additional synchronization) | |
193 | ||
db0c2eb3 MD |
194 | * There is definitely a flush race in the hardlink implementation between |
195 | the forwarding entries and the actual (hidden) hardlink inode. | |
196 | ||
197 | This will require us to associate a small hard-link-adjust structure | |
198 | with the chain whenever we create or delete hardlinks, on top of | |
199 | adjusting the hardlink inode itself. Any actual flush to the media | |
200 | has to synchronize the correct nlinks value based on whether related | |
201 | created or deleted hardlinks were also flushed. | |
202 | ||
995e78dc MD |
203 | * When a directory entry is created and also if an indirect block is |
204 | created and entries moved into it, the directory seek position can | |
205 | potentially become incorrect during a scan. | |
206 | ||
207 | * When a directory entry is deleted a directory seek position depending | |
208 | on that key can cause readdir to skip entries. | |
db0c2eb3 | 209 | |
73e441b9 MD |
210 | * TWO PHASE COMMIT - store two data offsets in the chain, and |
211 | hammer2_chain_delete() needs to leave the chain intact if MODIFIED2 is | |
212 | set on its buffer until the flusher gets to it? | |
213 | ||
db0c2eb3 MD |
214 | |
215 | OPTIMIZATIONS | |
216 | ||
217 | * If a file is unlinked but its descriptor is left open and used, we | |
218 | should allow data blocks on-media to be reused since there is no | |
219 | topology left to point at them. |
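
One item earlier in this list notes that the check code (bref.methods / crc, sha, etc) is re-verified every time a chain is locked, even when the underlying buffer was already checked. A minimal sketch of one possible optimization follows, assuming a per-chain "already tested good" flag that is cleared whenever the chain is modified. Every name here (sketch_chain, SKETCH_CHAIN_TESTEDGOOD, sketch_checksum, and so on) is hypothetical and for illustration only; none of it is taken from the HAMMER2 sources, and the toy checksum merely stands in for the real crc/sha methods.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: the buffer behind this chain already passed its check. */
#define SKETCH_CHAIN_TESTEDGOOD 0x01

/* Hypothetical stand-in for a chain; not the real hammer2_chain layout. */
struct sketch_chain {
    const uint8_t *data;      /* underlying buffer contents */
    size_t         bytes;     /* size of the buffer */
    uint32_t       check;     /* expected check value (stand-in for the bref check) */
    unsigned       flags;     /* SKETCH_CHAIN_* flags */
    unsigned       nverified; /* instrumentation: number of full verifications */
};

/* Toy checksum standing in for crc/sha; not the real algorithm. */
static uint32_t
sketch_checksum(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;

    while (n--)
        sum = sum * 31 + *p++;
    return sum;
}

/*
 * Lock-time validation.  The expensive check runs only when the cached
 * flag is clear.  Returns 0 on success, -1 on a check mismatch.
 */
static int
sketch_chain_lock(struct sketch_chain *c)
{
    assert(c != NULL);
    if (c->flags & SKETCH_CHAIN_TESTEDGOOD)
        return 0;                       /* already verified, skip the work */
    c->nverified++;
    if (sketch_checksum(c->data, c->bytes) != c->check)
        return -1;                      /* leave the flag clear on failure */
    c->flags |= SKETCH_CHAIN_TESTEDGOOD;
    return 0;
}

/* Any modification must invalidate the cached result. */
static void
sketch_chain_modified(struct sketch_chain *c)
{
    assert(c != NULL);
    c->flags &= ~SKETCH_CHAIN_TESTEDGOOD;
}
```

In this sketch, repeated locks of an unmodified chain pay for the check once. Whether a single flag is sufficient in the real code depends on how buffers are shared between chains and when they are reloaded from media, which the sketch does not model.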