[[!meta title="Google Summer of Code Project List"]]

[[!toc levels=0]]

Have a look at our SoC pages from [[2008|/docs/developer/GoogleSoC2008/]], [[2009|/docs/developer/gsoc2009/]], [[2010|/docs/developer/gsoc2010/]] and [[2011|/docs/developer/gsoc2011/]] to get an overview of prior years' projects.

For more details on Google's Summer of Code: [Google's SoC page](

Alternate project links: [[Projects page|/docs/developer/ProjectsPage/]], [[Research Projects|/docs/developer/researchprojectspage/]]

Note to prospective students: These project proposals are meant to be a first approximation; we're looking forward to your own suggestions (even for completely new directions) and will try to integrate your ideas to make the GSoC project more interesting to all parties. Even when a proposal is very specific about the goals that must be achieved and the path that should be taken, these are always negotiable. Keep in mind that we have tried to limit the proposals on this page to those that (based on our past experience) are appropriate for the GSoC program. This is by no means a comprehensive list; original ideas or proposals based on project ideas found on other pages are very welcome.

Note to everyone else: These proposals are by no means Summer of Code specific; anyone is welcome and encouraged to adopt any of these projects at any time (just please let us know, or make a note on this page).

* Prerequisites: knowledge that the student should have before starting the project. It may be possible to acquire the knowledge in the course of the project, but the estimated difficulty would increase substantially. On the bright side, you can expect to have a much deeper understanding of these fields (and gain some real-world experience) after you successfully complete the respective project.
* Difficulty: Estimated difficulty of the project, taking into account the complexity of the task and the time constraints of the GSoC program.
* Contact point: The person you should contact for any further information or clarifications. If the primary contact for a project does not respond in a reasonable amount of time (2-3 days), you should contact the appropriate DragonFly BSD mailing list, usually kernel@.

#### Project ideas

---

##### Implement amd64 Linux compatibility for x86_64 64-bit kernel

* Add a syscall table which translates 64-bit Linux system calls to DragonFly ones.
* Add support for ELF binary detection.

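
To make the ELF detection step concrete, here is a minimal userland sketch in Python of the kind of header check involved. It is only an illustration, not how the kernel image activator works: the function names `classify_elf` and `make_header` are hypothetical, and treating `ELFOSABI_NONE` as "unbranded" assumes that further checks (ELF notes, interpreter path) would be needed to decide whether such a binary is a Linux one.

```python
import struct

# Constants from the ELF specification.
ELF_MAGIC = b"\x7fELF"
ELFCLASS64 = 2        # e_ident[4]: 64-bit objects
ELFDATA2LSB = 1       # e_ident[5]: little-endian
ELFOSABI_NONE = 0     # e_ident[7]: no OS-specific branding
ELFOSABI_LINUX = 3    # e_ident[7]: Linux/GNU branding
EM_X86_64 = 62        # e_machine for amd64


def classify_elf(header):
    """Classify the first 20 bytes of a file (hypothetical helper)."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        return "not-elf"
    if header[4] != ELFCLASS64 or header[5] != ELFDATA2LSB:
        return "not-amd64"
    # e_machine is a 16-bit field at offset 18, after e_ident and e_type.
    (e_machine,) = struct.unpack_from("<H", header, 18)
    if e_machine != EM_X86_64:
        return "not-amd64"
    if header[7] == ELFOSABI_LINUX:
        return "linux"
    if header[7] == ELFOSABI_NONE:
        return "unbranded"  # assumption: needs further checks
    return "other-os"


def make_header(osabi, machine=EM_X86_64):
    """Build a synthetic 20-byte ELF64 header prefix for experiments."""
    e_ident = ELF_MAGIC + bytes([ELFCLASS64, ELFDATA2LSB, 1, osabi]) + b"\0" * 8
    return e_ident + struct.pack("<HH", 2, machine)  # e_type=ET_EXEC, e_machine
```

A loader would run a check like this before choosing which syscall table to attach to the new process.
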
DragonFly/i386 supports the execution of 32-bit Linux binaries; it
is only natural to implement the same kind of binary compatibility
for 64-bit systems.

Some of the other *BSD systems may already have implemented such a mechanism.

Meta information:

* Prerequisites: C, amd64 architecture knowledge
* Difficulty: Moderate to difficult
* Contact point:

##### Implement ARC algorithm extension for the vnode free list

* Vnode recycling is LRU and can't efficiently handle data sets which
  exceed the maxvnode limit. When the maxvnode limit is reached the kernel
  starts throwing away cached vnodes along with their VM objects (and thus
  all related cached file data).

* What we would like to do is implement an ARC algorithm for the free
  vnodes to determine which ones to throw away and potentially combine
  this with further caching of the related VM object even after the vnode
  is thrown away by associating it with a mount point and inode number,
  until memory pressure forces all of its pages out.

* For this project the student can choose to just implement the VM object
  retention portion and not try to implement an ARC algorithm (which can
  be considerably more complex).

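
The LRU weakness described above can be reproduced with a small simulation. The sketch below is a toy model in Python, not kernel code: `LRUCache` and `scan` are hypothetical names, and the capacity merely stands in for the maxvnode limit. It shows that repeatedly scanning a data set just one entry larger than the cache yields no hits at all under plain LRU.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache; capacity plays the role of the maxvnode limit."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return
        self.misses += 1
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[key] = True


def scan(cache, nfiles, passes):
    """Touch files 0..nfiles-1 in order, several times over."""
    for _ in range(passes):
        for vnode_id in range(nfiles):
            cache.access(vnode_id)
    return cache.hits, cache.misses
```

With a capacity of 100, `scan(LRUCache(100), 100, 5)` hits on every pass after the first, while `scan(LRUCache(100), 101, 5)` never hits, because each entry is evicted just before it is needed again. An ARC-style policy is meant to degrade more gracefully in that situation.
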
Meta information:

* Prerequisites: C, OS internals
* Difficulty: Modest without ARC (Very difficult with ARC)
* Contact point: dillon

##### Make DragonFly NUMA-aware

* Parse related ACPI tables
* NUMA-aware memory allocation
* References:
[ACPI SLIT parser](
[ACPI SRAT parser](
[NetBSD NUMA diff](
[NetBSD NUMA x86 diff]( (These patches are now in the NetBSD tree)

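
As a rough illustration of what NUMA-aware allocation means once the ACPI SLIT distances are available, the Python sketch below picks the closest node that can satisfy a request. Everything here is an assumption for illustration: `pick_node`, the `free_pages` list, and the two-node distance matrix are hypothetical, and a real allocator would work on kernel page queues rather than lists. The only fact taken from ACPI is that SLIT entries are relative distances with 10 meaning "local".

```python
def pick_node(distances, free_pages, local_node, npages):
    """Return the nearest node with at least npages free, or None.

    distances  -- SLIT-style matrix, distances[i][j] is the cost from i to j
    free_pages -- free page count per node (hypothetical bookkeeping)
    """
    # Try nodes in order of increasing distance from the requesting node;
    # ties are broken by node number to keep the choice deterministic.
    candidates = sorted(
        range(len(distances)),
        key=lambda node: (distances[local_node][node], node),
    )
    for node in candidates:
        if free_pages[node] >= npages:
            return node
    return None
```

For example, with `distances = [[10, 20], [20, 10]]` and `free_pages = [0, 8]`, a four-page request from node 0 falls back to node 1 because the local node has no free pages.
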
Meta information:

* Prerequisites: C, introductory computer architecture
* Difficulty: Easy
* Contact point:

##### Port valgrind to DragonFly BSD

Valgrind is a very useful tool on a system like DragonFly that's under heavy development. Since valgrind is very target specific, a student doing the port will have to get acquainted with many low-level details of the system libraries and the user<->kernel interface (system calls, signal delivery, threading...). This is a project that should appeal to aspiring systems programmers. Ideally, we would want the port to be usable with vkernel processes, thus enabling complex checking of the core kernel code.

The goal of this project is to port valgrind to the DragonFly BSD platform so that at least the memcheck tool runs sufficiently well to be useful. This is in itself a challenging task. If time remains, the student should try to get at least a trivial valgrind tool to work on a vkernel process.

Meta information:

* Prerequisites: C, x86 assembly, low-level OS internals
* Difficulty: Hard
* Contact point: Aggelos Economopoulos <>

##### Make vkernels checkpointable (2011 Project)

* See checkpt(1).
* Implement save and restore of segment registers so that threaded applications may be checkpointed. The segment registers support TLS. There are potential security concerns here.
* Teach the checkpt system call how to checkpoint multiple vmspaces.
* Add code to the vkernel which gets triggered upon reception of a SIGCKPT signal to dump/load e.g. the current state of network drivers.
* This would allow us to save and restore or even migrate a complete DragonFly operating system running on the vkernel platform.
  This could be especially handy on laptops (if we'd get X11 operating in vkernels).
* See also:

Meta information:

* Prerequisites: C, OS internals
* Difficulty: Medium
* Contact point: Michael Neumann <>
* References: [1]( [2](

##### HAMMER compression

* Compress blocks as they get written to disk.
* Only file data (rec_type == DATA) should be compressed, not meta-data.
* The CRC should be that of the uncompressed data.
* Ideally you'd need to associate the uncompressed data with the buffer cache buffer somehow, so that decompression is only performed once.
* Compression could be turned on on a per-file or per-pfs basis.
* gzip compression would be just fine at first.

Doing compression would require flagging the data record as being compressed and would also require double-buffering, since
the buffer cache buffer associated with the uncompressed data might have holes in it or be otherwise referenced by user
programs, and so cannot serve as a buffer for in-place compression or decompression.

The direct read / direct write mechanic would almost certainly have to be disabled for compressed buffers and the
small-data zone would probably have to be used (the large-data zone is designed only for use with 16K or 64K buffers).

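
The record-level rules above (flag the record, CRC over the uncompressed data, separate buffers for each direction) can be sketched in a few lines of Python. This is a userland illustration only: `DataRecord`, `write_record` and `read_record` are hypothetical names, `zlib.crc32` merely stands in for whatever CRC HAMMER uses, and storing incompressible data uncompressed is an assumption rather than something the proposal specifies.

```python
import zlib
from dataclasses import dataclass


@dataclass
class DataRecord:
    payload: bytes      # bytes as they would be stored on disk
    crc: int            # CRC of the *uncompressed* data
    compressed: bool    # flag marking the record as compressed
    length: int         # uncompressed length


def write_record(data):
    """Compress into a separate buffer; keep the original untouched."""
    crc = zlib.crc32(data)
    packed = zlib.compress(data)  # deflate, the algorithm gzip uses
    if len(packed) < len(data):
        return DataRecord(packed, crc, True, len(data))
    # Assumption: incompressible data is stored as-is.
    return DataRecord(data, crc, False, len(data))


def read_record(rec):
    """Decompress into a new buffer and verify against the stored CRC."""
    data = zlib.decompress(rec.payload) if rec.compressed else rec.payload
    if len(data) != rec.length or zlib.crc32(data) != rec.crc:
        raise ValueError("data record failed CRC or length check")
    return data
```

Because the CRC covers the uncompressed data, the same check works whether or not a given record ended up compressed.
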
Meta information:

* Prerequisites: C, filesystem internals
* Difficulty: Difficult
* Contact point: Michael Neumann <>

##### Userland System V Shared Memory / Semaphore / Message Queue implementation

* Implement some or all of these subsystems in their entirety, or as completely as possible in userland using a daemon, mmap and the DragonFly umtx_sleep(2)/umtx_wakeup(2) or other userland facilities.
* Any security or other major hurdles to this approach that would likely have to be implemented in-kernel should be noted in the student's application.
* Test and benchmark the new facilities with heavy SysV consumers such as PostgreSQL.
* Identify performance tradeoffs made in the userland implementation versus the existing kernel implementation. If time permits, identify and apply solutions to these tradeoffs so that the userland implementation performs on par with or better than the kernel implementation.

Meta information:

* Prerequisites: C, x86 assembly
* Difficulty: Moderate
* Contact point: Samuel J. Greear <>

##### DragonFly history access for Gnome/KDE

* Write a Dolphin (KDE) plugin or Gnome file manager plugin that creates a 'time slider' when working with HAMMER filesystems.
* If time remains, investigate additional features and/or methods of display and possibly a HAMMER configuration utility for managing history retention, etc.

Meta information:

* Prerequisites: C, Gnome or KDE familiarity
* Difficulty: Hard
* Contact point:
* References: [A similar idea for ZFS](

##### Create a Samba VFS plugin to expose HAMMER history

* Give access to HAMMER snapshots/fine-grained history to anyone able to access the HAMMER volume over Samba.
* This would involve writing a Samba3 VFS module to expose historical versions of files as "shadow copies". VFS module implementations supporting more traditional snapshot hierarchies do already exist.

Meta information:

* Prerequisites: C
* Difficulty: Moderate
* Contact point:

##### Port Hyper-V Linux Integration components to DragonFly

* Microsoft released a dual BSD/GPL version of their para-virtualized drivers (SCSI and Networking) for Linux.
* This work would require porting the Linux VMBus (Microsoft's equivalent to XenBus) and the corresponding SCSI (StorVSC) and networking (NetVSC) drivers to DragonFly.
* References: [Sources]( [Architecture Overview](

Meta information:

* Prerequisites: C, OS internals
* Difficulty: Hard
* Contact point:

---

##### Implement more dm targets

* Since we now have dm (device mapper) in DragonFly, it would be nice to make better use of it. Currently we have a relatively small number of useful targets (crypt, linear and striped).
* Other targets should be implemented; in particular the mirror target would be of interest. Other ideas are welcome, too. Before applying for this please discuss the target of interest on the mailing list or with me directly.
* There is a start of a journalled mirror target, if you want to attack soft mirroring; the problem is a lot more difficult than it seems at first, so talking on the mailing list or on IRC would definitely be worthwhile!

Meta information:

* Prerequisites: C, OS internals
* Difficulty: Medium
* Contact point: , Alex Hornung <>, Venkatesh Srinivas <>

---

##### Implement a new unionfs

* unionfs is a particularly useful pseudo-fs which allows an upper and a lower filesystem to share a single mountpoint. The upper filesystem is mostly transparent, so that the lower filesystem remains accessible.
* A typical use case is mounting a tmpfs filesystem as the upper and a read-only FS as the lower mountpoint. This way files can be edited transparently even on a read-only filesystem without actually modifying it.
* The current unionfs is completely broken as it relies on the whiteout VFS technique which is not supported by HAMMER. A new unionfs implementation should not rely on archaic methods such as whiteout.

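
The upper/lower behaviour described above can be modelled with two dictionaries. The Python sketch below is a toy model, not a VFS design: `UnionView` and its methods are hypothetical, paths map directly to file contents, and it deliberately leaves out deletion of lower-layer files, which is exactly the case where whiteouts (or a replacement for them) come into play.

```python
from types import MappingProxyType


class UnionView:
    """Toy model of a union mount: writable upper over read-only lower."""

    def __init__(self, upper, lower):
        self.upper = upper                      # e.g. a tmpfs
        self.lower = MappingProxyType(lower)    # read-only view of the lower FS

    def read(self, path):
        # The upper layer shadows the lower one when both hold the path.
        if path in self.upper:
            return self.upper[path]
        return self.lower[path]                 # KeyError if absent in both

    def write(self, path, data):
        # All modifications land in the upper layer only.
        self.upper[path] = data

    def listdir(self):
        # The visible namespace is the union of both layers.
        return sorted(set(self.upper) | set(self.lower))
```

Writing to a path that exists only in the lower layer leaves the lower copy untouched while readers of the union see the edited version.
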

Meta information:

* Prerequisites: C, OS internals, ideally some knowledge of the FreeBSD/DragonFly VFS
* Difficulty: Medium
* Contact point:

---

##### Improve compatibility of libdevattr with Linux' libudev

* Our libdevattr has an API which is mostly compatible with Linux' libudev, but it is doubtful that any Linux application making use of libudev would run out of the box on DragonFly with libdevattr.
* The aim of this project is to identify the shortcomings of libdevattr and fix them so that some common libudev applications work with our libdevattr.
* This might involve some kernel hacking to improve our kern_udev and definitely includes some grunt work of "tagging" subsystems with the kern_udev API.
* Most of the work will be in userland, though, working on udevd and libdevattr.

Meta information:

* Prerequisites: C, familiarity with Linux' libudev would be a plus
* Difficulty: Medium
* Contact point: , Alex Hornung <>

---

##### Implement further dsched disk scheduling policies (2011 Project: BFQ)

* dsched is a highly flexible disk scheduling framework which greatly minimizes the effort of writing disk scheduling policies.
* Currently only dsched_fq, a fairly simple fair-queuing policy, and noop policies are implemented.
* The aim of this project would be to implement at least one more useful disk scheduling policy, preferably one that improves interactivity.
* Other ideas are welcome.
* This is a great opportunity for CS students interested in scheduling problems to apply their theoretical knowledge.

Meta information:

* Prerequisites: C, OS internals, familiarity with disk scheduling
* Difficulty: Medium
* Contact point: , Alex Hornung <>

---

##### Implement hardware nested page table support for vkernels

* Various modern hardware supports virtualization extensions, including nested pagetables.
* The DragonFly BSD vmspaces API, used to support vkernels, is effectively a software implementation of nested pagetables.
* The goal of this project would be to add support for detection of the hardware features on AMD and Intel CPUs and alter the vmspace implementation to use hardware support when available.

Meta information:

* Prerequisites: C, x86 assembly, OS internals
* Difficulty: Hard
* Contact point:

##### Access to ktr(4) buffers via shared memory

Our event tracing system, ktr(4), records interesting events in per-cpu buffers that are printed out with ktrdump(8). Currently, ktrdump uses libkvm to access these buffers, which is suboptimal. One can allow a sufficiently-privileged userspace process to map those buffers read-only and access them directly. For bonus points, design an extensible, discoverable (think reflection) mechanism that provides fast access via shared memory to data structures that the kernel chooses to expose to userland.

Meta information:

* Prerequisites: C, OS internals
* Difficulty: Medium
* Contact point:, Aggelos Economopoulos <>

##### nmalloc (libc malloc) measurements and performance work

nmalloc is our libc memory allocator; it is a slab-like allocator. It recently had some work done to add per-thread caches, but there is much more work that could be done. A project on this might characterize fragmentation, try out a number of techniques to improve per-thread caching and reduce the number of total syscalls, and see if any are worth applying.

Possible things to work on:

(thread caches)

* The per-thread caches are fixed-size; at larger object sizes (say 4K), this can result in a lot of memory tied up. Perhaps they should scale their max size inversely to the object size.

* The per-thread caches are filled one-at-a-time from free(). Perhaps the per-thread caches should be burst-filled.

* Perhaps the per-thread caches should age items out.

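
One way to read "scale their max size inversely to the object size" is to give every per-thread cache the same byte budget instead of the same item count. The Python sketch below illustrates that idea only; `magazine_capacity`, the 64 KB budget and the item bounds are all hypothetical values chosen for the example and would need measurement before being applied to nmalloc.

```python
def magazine_capacity(object_size, byte_budget=64 * 1024,
                      min_items=2, max_items=256):
    """Items a per-thread cache may hold for a given object size.

    The cache is limited by a byte budget, so large objects get a
    short cache and small objects a long one (within fixed bounds).
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    by_budget = byte_budget // object_size
    return max(min_items, min(max_items, by_budget))
```

Under these example parameters a 16-byte size class may cache up to 256 objects, while a 4K size class is limited to 16, so the memory tied up per size class stays bounded at roughly the byte budget.
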
(slab zone allocation)

* zone_alloc() currently burst-allocates slab zones with the zone magazine held across a spinlock.

* zone_free() holds the zone magazine lock around bzero()ing a slab zone header.

* zone_free() madvise()s one slab at a time; it'd be nice to madvise() runs of contiguous slabs.

* zone_free() madvise()s very readily (for every slab freed). Perhaps it should only madvise slabs that are idle for some time.

* zone_free() burst-frees slabs. It's not clear whether this is a good idea.

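
The idea of calling madvise() on runs of contiguous slabs rather than on each slab separately comes down to coalescing adjacent addresses first. The Python sketch below shows only that grouping step; `contiguous_runs` is a hypothetical helper, the addresses are plain integers, and no actual madvise() call is made.

```python
def contiguous_runs(free_slab_addrs, slab_size):
    """Group free slab addresses into (start, length) runs.

    Each run could then be handed to a single madvise() call instead
    of one call per slab.
    """
    runs = []
    for addr in sorted(free_slab_addrs):
        if runs and runs[-1][0] + runs[-1][1] == addr:
            runs[-1][1] += slab_size       # extends the previous run
        else:
            runs.append([addr, slab_size])  # starts a new run
    return [(start, length) for start, length in runs]
```

For instance, free slabs at 0x1000, 0x2000 and 0x3000 plus an isolated one at 0x8000 (with a slab size of 0x1000) collapse into two runs, so two calls would replace four.
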
* Currently, allocations larger than either 4k or 8k are forced directly to mmap(); this means that idle memory from free slabs cannot be used to service those allocations and that we do no caching for allocations larger than that size. This is almost certainly a mistake.

* We could use a small (embeddable) data structure that allows:

1. efficient coalescing of adjacent mmap space for madvise
2. efficient queries for vmem_alloc() (w/ alignment!)
3. compact and doesn't use any space in the zone header (dirty/cold!)
4. allows traversal in address order to fight fragmentation
5. keep two such data structures (one for dirty pages, one for cold pages)

* These are just ideas; there are many more things possible and many of these things need a lot of measurement to evaluate them. It'd be interesting to see if any of these are appropriate for it.

A description of the Sun Solaris work on which the DragonFly allocator is based; use this as an overview, but do not take it as gospel for how the DFly allocator works.

* (Jason Evans tech talk about jemalloc, 1/2011)

jemalloc is FreeBSD's and Firefox's (and NetBSD and GNASH and ...)'s malloc; in this tech talk, Jason Evans reviews how jemalloc works, how it has changed recently, and how it avoids fragmentation.

* (Ayelet Wasik's thesis 'Features of a Multi-Threaded Memory Allocator')

This thesis is an excellent overview of many techniques to reduce contention and the effects these techniques have on fragmentation.

* Prerequisites: C, a taste of data structures
* Difficulty: Moderate
* Contact point: Venkatesh Srinivas <>


##### Create a filesystem indexing service

Currently, to locate an arbitrary file on a DragonFly system you would use the locate(1), which(1) or whereis(1) tools. These are a bit clunky, paint in broad strokes, and the accuracy of the database is often suspect. The first part of this project would involve implementing the Linux inotify interface in the DragonFly kernel. The second part would be to write a daemon that can (optionally) operate as an indexing service; if the weekly 310.locate periodic job sees that the locate database is being maintained by the daemon, it can skip running locate.updatedb(8). A third part of this project might involve extending the current database to a binary format with information about file types, what bits are set, etc. This could enable the user to have the locate tool paint in narrower strokes by specifying only files of type "ASCII text" or only files that are suid root or have the execute bit set.

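
To illustrate the third part (a database that records mode bits so queries can be narrowed), here is a small Python sketch. It is an assumption-laden illustration, not a proposed format: `build_index`, `query` and `demo` are hypothetical names, the index is an in-memory list rather than a binary file, and it is populated by a one-off directory walk instead of inotify events.

```python
import os
import stat
import tempfile


def build_index(root):
    """Walk root and record (path, mode, uid) for every regular file."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished between listing and stat
            if stat.S_ISREG(st.st_mode):
                index.append((path, st.st_mode, st.st_uid))
    return index


def query(index, executable=False, suid_root=False):
    """Return sorted paths whose recorded bits match the filters."""
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    matches = []
    for path, mode, uid in index:
        if executable and not mode & any_exec:
            continue
        if suid_root and not (mode & stat.S_ISUID and uid == 0):
            continue
        matches.append(path)
    return sorted(matches)


def demo():
    """Index a scratch directory and list only the executable files."""
    with tempfile.TemporaryDirectory() as root:
        for name, mode in (("run.sh", 0o755), ("notes.txt", 0o644)):
            path = os.path.join(root, name)
            with open(path, "w") as f:
                f.write("example\n")
            os.chmod(path, mode)
        index = build_index(root)
        return [os.path.basename(p) for p in query(index, executable=True)]
```

On a POSIX system `demo()` lists only `run.sh`, since `notes.txt` has no execute bit recorded in the index.
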

Meta information:

* Prerequisites: C, OS internals, binary file formats
* Difficulty: Easy/Moderate
* Contact point: Samuel J. Greear <>

##### Make DragonFly multiboot capable

Adjust the DragonFly kernel to be multiboot (the specification) capable. In addition, add necessary code to grub2 to understand our disklabel64 and anything else we need to be able to use grub2 to multiboot DragonFly without any chainloading involved.

Meta information:

* Prerequisites: C, OS internals
* Difficulty: Easy/Moderate
* Contact point: Alex Hornung <>

##### Complete installer rewrite

Completely rewrite the installer to be much simpler to maintain. It will still have to be an ncurses-based installer written in C, or in Python (but with C bindings for every single library that will be created - see below). A text interface UI library (e.g. newt [see examples on] - which seems very easy and handy) should be used to make the handling of the graphical part as easy as possible.

As part of rewriting the installer, several functions scattered around in other base utils should be factored out into libraries that both the installer and the util it comes from can use, e.g.:

* partitioning (both GPT and MBR) should be factored out into two libraries that the fdisk and gpt tools use, but that the installer can make use of, too.
* disklabel32/64 functionality
* adduser (and other user/group management)

The new installer should then make use of all these new libraries and other ones that are already available (libcryptsetup, libluks, liblvm, libtcplay) to offer more advanced features.

NOTE: The new installer should maintain most if not all of the functionality of the old installer in addition to adding features taking advantage of the aforementioned libraries.

Meta information:

* Prerequisites: C
* Difficulty: Moderate
* Contact point:, Alex Hornung <>

---

##### Extend dsched framework to support jails

Extend/modify the dsched framework to take into account jails, etc. instead of always allocating a 'tdio'. This would allow different process groupings (such as all processes in a jail) to be scheduled together. A new jail-specific policy would have to be written to support this, or an existing policy modified.

Meta information:

* Prerequisites: C, OS internals
* Difficulty: Moderate
* Contact point:, Samuel J. Greear <>, Alex Hornung <>

##### Implement NFS version 4

* NFSv4 is more than a simple version increase; it is an adaptation of NFS to Internet and WAN networks, with an expectation of high latency and firewalled data transfers and a non-naive security framework layer.
* NFSv4 servers export a single Pseudo File System (which has nothing to do with HAMMER(5) PFSes besides the name) merging all local filesystems in a unique namespace.
* We already have some kernel code which could be used as a starting point (WebNFS).
* FreeBSD has an NFSv4 implementation which could be ported or serve as a reference.
* Given the complexity of the NFSv4 protocol, it may be best to implement this project in userspace.

Meta information:

* Prerequisites: C, OS internals, ideally some knowledge of the VFS and namecache layers
* Difficulty: Medium
* Contact point:

(please add)

## Old not-so-useful project ideas, don't look here

##### Implement i386 32-bit ABI for x86_64 64-bit kernel

* Add a 32-bit syscall table which translates 32-bit
  system calls to 64-bit.
* Add support for 32-bit compatibility mode operation
  and ELF binary detection.

The idea here is to support the execution of 32-bit DragonFly binaries in 64-bit DragonFly environments, something numerous other operating systems have done. Several things must be done to support this. First, the appropriate control bits must be set to execute in 32-bit compatibility mode while in usermode instead of 64-bit mode. Second, when a system call is made from 32-bit mode a translation layer is needed to translate the system call into the 64-bit equivalent within the kernel. Third, the signal handler and trampoline code needs to operate on the 32-bit signal frame. Fourth, the 32-bit and 64-bit ELF loaders both have to be in the kernel at the same time, which may require some messing around with procedure names and include files since originally the source was designed to be one or the other.

There are several hundred system calls, which translates to a great deal of 'grunt work' when it comes time to actually do all the translations.

Meta information:

* Prerequisites: C
* Difficulty: Difficult (lots of moving parts, particularly the trapframes)
* Contact point: dillon

DragonFly/x86_64 has been available for a few years, and is now
the most used DragonFly architecture.
There has never been an obvious need to use i386 DragonFly binaries
with it; all available DragonFly/i386 software can be rebuilt from source code. (comment added on 2013-02-21)

##### Adapt pkgsrc to create a package system with dependency independence.

* Create a set of tools that modifies how the pkgsrc packages are installed, allowing for the ability to upgrade individual packages, without stopping applications that depend on said packages from working. One method of achieving this is detailed at but other methods may be possible. PC-BSD have written a tool called PBI Builder which modifies FreeBSD ports for their dependency independence PBI system; this could be used as a starting point for the DragonFly BSD tools.

Meta information:

* Prerequisites: C
* Difficulty: ?
* Contact point:

A new dports/pkg packaging system based on FreeBSD ports and pkgng has
been implemented and is far superior to pkgsrc for all practical purposes.
Pkgsrc may not be the best base to start such a project and the time
needed to implement it will be far greater than the regular GSoC timeframe. More like one year than two months. (comment added on 2013-02-21)

##### Ability to execute Mach-O (OS X) binaries

This is a project for a student with something to prove; executing a binary touches a huge number of moving parts of a modern kernel. This project would entail adding or porting support for Mach-O binaries to the DragonFly BSD kernel. It would also involve adding an additional system call vector, like the Linux vector used for Linux binary emulation. This is quite a large and complicated task and any proposal will be expected to be well-researched to reflect that. The ability to execute non-GUI binaries that make use of shared libraries should be the minimum to which such a project should aspire. OpenDarwin is available as a reference or to port relevant code from.

Meta information:

* Prerequisites: C, OS internals, binary file formats
* Difficulty: Hard
* Contact point: Samuel J. Greear <>

This project will only allow us to execute a few command-line utilities, most of which are already present in all Unix-like systems.
Being able to run Mac OS X graphical applications will be a multi-year undertaking on top of it. Wine has been trying to reimplement Microsoft Windows APIs for 20 years already. (comment added on 2013-02-21)

---