/* $NetBSD: TODO.modules,v 1.24 2021/08/09 20:49:08 andvar Exp $ */ Some notes on the limitations of our current (as of 7.99.35) module subsystem. This list was triggered by an Email exchange between christos and pgoyette. 1. Builtin drivers can't depend on modularized drivers (the modularized drivers are attempted to load as builtins). The assumption is that dependencies are loaded before those modules which depend on them. At load time, a module's undefined global symbols are resolved; if any symbols can't be resolved, the load fails. Similarly, if a module is included in (built-into) the kernel, all of its symbols must be resolvable by the linker, otherwise the link fails. There are ways around this (such as, having the parent module's initialization command recursively call the module load code), but they're often gross hacks. Another alternative (which is used by ppp) is to provide a "registration" mechanism for the "child" modules, and then when the need for a specific child module is encountered, use module_autoload() to load the child module. Of course, this requires that the parent module know about all potentially loadable children. 2. Currently, config(1) has no way to "no define" drivers XXX: I don't think this is true anymore. I think we can undefine drivers now, see MODULAR in amd64, which does no ath* and no select sppp* 3. It is not always obvious by their names which drivers/options correspond to which modules. 4. Right now critical drivers that would need to be pre-loaded (ffs, exec_elf64) are still built-in so that we don't need to alter the boot blocks to boot. This was a conscious decision by core@ some years ago. It is not a requirement that ffs or exec_* be built-in. The only requirement is that the root file-system's module must be available when the module subsystem is initialized, in order to load other modules. This can be accomplished by having the boot loader "push" the module at boot time. (It used to do this in all cases; currently the "push" only occurs if the booted filesystem is not ffs.) 5. Not all parent bus drivers are capable of rescan, so some drivers just have to be built-in. 6. Many (most?) drivers are not yet modularized 7. There's currently no provisions for autoconfig to figure out which modules are needed, and thus to load the required modules. In the "normal" built-in world, autoconfigure can only ask existing drivers if they're willing to manage (ie, attach) a device. Removing the built-in drivers tends to limit the availability of possible managers. There's currently no mechanism for identifying and loading drivers based on what devices might be found. 8. Even for existing modules, there are "surprise" dependencies with code that has not yet been modularized. For example, even though the bpf code has been modularized, there is some shared code in bpf_filter.c which is needed by both ipfilter and ppp. ipf is already modularized, but ppp is not. Thus, even though bpf_filter is modular, it MUST be included as a built-in module if you also have ppp in your configuration. Another example is sysmon_taskq module. It is required by other parts of the sysmon subsystem, including the "sysmon_power" module. Unfortunately, even though the sysmon_power code is modularized, it is referenced by the acpi code which has not been modularized. Therefore, if your configuration has acpi, then you must include the "sysmon_power" module built-in the kernel. And therefore you also need to have "sysmon_taskq" and "sysmon" built-in since "sysmon_power" rerefences them. 9. As a corollary to #8 above, having dependencies on modules from code which has not been modularized makes it extremely difficult to test the module code adequately. Testing of module code should include both testing-as-a-built-in module and testing-as-a-loaded-module, and all dependencies need to be identified. 10. The current /stand/$ARCH/$VERSION/modules/ hierarchy won't scale as we get more and more modules. There are hundreds of potential device driver modules. 11. There currently isn't any good way to handle attachment-specific modules. The build infrastructure (ie, sys/modules/Makefile) doesn't readily lend itself to bus-specific modules irrespective of $ARCH, and maintaining distrib/sets/lists/modules/* is awkward at best. Furthermore, devices such as ld(4), which can attach to a large set of parent devices, need to be modified. The parent devices need to provide a common attribute (for example, ld_bus), and the ld driver should attach to that attribute rather than to each parent. But currently, config(1) doesn't handle this - it doesn't allow an attribute to be used as the device tree's pseudo-root. The current directory structure where driver foo is split between ic/foo.c and bus1/foo_bus1.c ... busn/foo_busn.c is annoying. It would be better to switch to the FreeBSD model which puts all the driver files in one directory. 12. Item #11 gets even murkier when a particular parent can provide more than one attribute. 13. It seems that we might want some additional sets-lists "attributes" to control contents of distributions. As an example, many of our architectures have PCI bus capabilities, but not all. It is rather painful to need to maintain individual architectures' modules/md_* sets lists, especially when we already have to conditionalize the build of the modules based on architecture. If we had a single "attribute" for PCI-bus-capable, the same attribute could be used to select which modules to build and which modules from modules/mi to include in the release. (This is not limited to PCI; recently we encounter similar issues with spkr aka spkr_synth module.) 14. As has been pointed out more than once, the current method of storing modules in a version-specific subdirectory of /stand is sub-optimal and leads to much difficulty and/or confusion. A better mechanism of associating a kernel and its modules needs to be developed. Some have suggested having a top-level directory (say, /netbsd) with a kernel and its modules at /netbsd/kernel and /netbsd/modules/... Whatever new mechanism we arrive at will probably require changes to installation procedures and bootstrap code, and will need to handle both the new and old mechanisms for compatibility. One additional option mentioned is to be able to specify, at boot loader time, an alternate value for the os-release portion of the default module path, i.e. /stand/$MACHINE/$ALT-RELEASE/modules/ The following statement regarding this issue was previously issued by the "core" group: Date: Fri, 27 Jul 2012 08:02:56 +0200 From: To: Subject: Core statement on directory naming for kernel modules The core group would also like to see the following changes in the near future: Implementation of the scheme described by Luke Mewburn in to allow a kernel and its modules to be kept together. Changes to config(1) to extend the existing notion of whether or not an option is built-in to the kernel, to three states: built-in, not built-in but loadable as a module, entirely excluded and not even loadable as a module. 15. The existing config(5) framework provides an excellent mechanism for managing the content of kernels. Unfortunately, this mechanism does not apply for modules, and instead we need to manually manage a list of files to include in the module, the set of compiler definitions with which to build those files, and also the set of other modules on which a module depends. We really need a common mechanism to define and build modules, whether they are included as "built-in" modules or as separately-loadable modules. (From John Nemeth) Some sort of mechanism for a (driver) module to declare the list of vendor/product/other tuples that it can handle would be nice. Perhaps this would go in the module's .plist file? (See #17 below.) Then drivers that scan for children might be able to search the modules directory for an "appropriate" module for each child, and auto-load. 16. PR kern/52821 exposes another limitation of config(1) WRT modules. Here, an explicit device attachment is required, because we cannot rely on all kernel configs to contain the attribute at which the modular driver wants to attach. Unfortunately, the explicit attachment causes conflicts with built-in drivers. (See the PR for more details.) 17. (From John Nemeth) It would be potentially useful if a "push" from the bootloader could also load-and-push a module's .plist (if it exists. 18. (From John Nemeth) Some sort of schema for a module to declare the options (or other things?) that the module understands. This could result in a module-options editor to manipulate the .plist 19. (From John Nemeth) Currently, the order of module initialization is based on module classes and declared dependencies. It might be useful to have additional classes (or sub-classes) with additional invocations of module_class_init(), and it might be useful to have a non-dependency mechanism to provide "IF module-A and module-B are BOTH present, module-A needs to be initialized before module-B". 20. (Long-ago memory rises to the surface) Note that currently there is nothing that requires a module's name to correspond in any way with the name of file from which the module is loaded. Thus, it is possible to attempt to access device /dev/x, discover that there is no such device so we autoload /stand/.../x/x.kmod and initialize the module loaded, even if the loaded module is for some other device entirely! 21. We currently do not support "weak" symbols in the in-kernel linker. It would take some serious thought to get such support right. For example, consider module A with a weak reference to symbol S which is defined in module B. If module B is loaded first, and then module A, the symbol gets resolved. But if module A is loaded first, the symbol won't be resolved. If we subsequently load module B, we would have to "go back" and re-run the linker for module A. Additional difficulties arise when the module which defines the weak symbol gets unloaded. Then, you would need to re-run the linker and _unresolve_ the weak symbol which is no longer defined. 22. A fairly large number of modules still require a maximum warning level of WARNS=3 due to signed-vs-unsigned integer comparisons. We really ought to clean these up. (I haven't looked at them in any detail, but I have to wonder how code that compiles cleanly in a normal kernel has these issues when compiled in a module, when both are done with WARNS=5). 23. The current process of "load all the emulation/exec modules in case one of them might handle the image currently being exec'd" isn't really cool. (See sys/kern/kern_exec.c?) It ends up auto-loading a whole bunch of modules, involving file-system access, just to have most of the modules getting unloaded a few seconds later. We don't have any way to identify which module is needed for which image (ie, we can't determine that an image needs compat_linux vs some other module). 24. Details are no longer remembered, but there are some issues with building xen-variant modules (on amd4, and likely i386). In some cases, wrong headers are included (because a XEN-related #define is missing), but even if you add the definition some headers get included in the wrong order. One particular fallout from this is the inability to have a compat version of x86_64 cpu-microcode module. PR port-xen/53130 This is likely to be fixed by Chuck Silvers on 2020-07-04 which removed the differences between the xen and non-xen module ABIs. As of 2021-05-28 the cpu-microcode functionality has once again been enabled for i386 and amd64 compat_60 modules.