What’s New in OpenBSD 4.3

Networking
I read that dhcpd(8) was working by luck, using overflow buffers to store options… would you like to tell us more?
Kenneth Westerback: In October 2007, shortly before 4.2 was released, Nahuel Riva and Gera Richart discovered that a carefully crafted client request could cause dhcpd to crash. A fix was developed by millert@ for 4.3 and led to the first errata for the about-to-be-released 4.2. In essence, a client could request a specific size for the response generated by dhcpd, which violated assumptions within the code and resulted in stack corruption.
In January Peter Hessler discovered that another carefully crafted request could cause the option storing logic to write data in memory it shouldn’t be using and thus crash dhcpd. As with the October bug, the fault was straightforward to discover once someone encountered a failure.
In between these two discoveries, the option processing logic in dhcpd got a thorough going-over as I tried to make the logic clearer and more understandable. As a result, several bugs in handling option storage into the two overflow buffers were fixed, e.g. actually using the second of the overflow buffers! Safer initialization was introduced and more care was taken to ensure all option buffers were correctly utilized.
The end result is a much more robust dhcpd for 4.3, which can return more options in each response by correctly utilizing all the option space available to it.
None of these changes should introduce interoperability issues. If anything, they should reduce them by implementing the standards more correctly.
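To make the "overflow buffers" concrete: RFC 2132 lets a DHCP server spill options that do not fit in the main options area into the BOOTP `file` (128 bytes) and `sname` (64 bytes) fields, and signal this with option 52 (Option Overload). The following is a minimal sketch of that idea, not dhcpd's actual code. The function names and the `main_size` parameter (standing in for the response size the client asked for) are hypothetical, and the packing strategy is a simple in-order greedy fill.

```python
OPT_OVERLOAD, OPT_END = 52, 255      # option codes from RFC 2132
FILE_LEN, SNAME_LEN = 128, 64        # sizes of the BOOTP 'file' and 'sname' fields


def encode_option(code, data):
    """Encode one option as code, length, data."""
    if len(data) > 255:
        raise ValueError("option data too long")
    return bytes([code, len(data)]) + data


def pack_options(options, main_size):
    """Pack (code, data) pairs into (main, file, sname) byte strings.

    Hypothetical helper: main_size is the room available in the main
    options area. Unused overflow fields are returned as b"".
    """
    encoded = [encode_option(code, data) for code, data in options]
    if sum(len(e) for e in encoded) + 1 <= main_size:
        return b"".join(encoded) + bytes([OPT_END]), b"", b""

    # Reserve 3 bytes for the overload option and 1 for END in the main
    # area, and 1 byte for END in each overflow field.
    limits = [main_size - 4, FILE_LEN - 1, SNAME_LEN - 1]
    buffers = [bytearray(), bytearray(), bytearray()]
    idx = 0
    for e in encoded:
        while idx < 3 and len(buffers[idx]) + len(e) > limits[idx]:
            idx += 1                 # current buffer is full, move on
        if idx == 3:
            raise ValueError("options do not fit in the response")
        buffers[idx] += e

    # Overload value: 1 = 'file' used, 2 = 'sname' used, 3 = both.
    overload = (1 if buffers[1] else 0) | (2 if buffers[2] else 0)
    main = bytes(buffers[0]) + encode_option(OPT_OVERLOAD, bytes([overload]))
    main += bytes([OPT_END])

    def finish(buf, size):
        # Terminate a used field with END and pad it with zero (PAD) bytes.
        if not buf:
            return b""
        return (bytes(buf) + bytes([OPT_END])).ljust(size, b"\x00")

    return main, finish(buffers[1], FILE_LEN), finish(buffers[2], SNAME_LEN)
```

The bugs described above are exactly the kind this bookkeeping invites: forgetting to move on to the second overflow field, or miscounting the bytes reserved for the overload and END markers.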
What changed in the way TCP responses to highly fragmented packets are constructed?
Markus Friedl: It’s just a bug fix. When creating a response to a TCP packet, the stack made some assumptions about the layout of the original mbuf chain. After some IPv6-related changes these assumptions were no longer true, so now we create a packet from scratch when sending TCP responses (usually for TCP resets).
You developed snmpd(8), an implementation of the Simple Network Management Protocol, and snmpctl(8), its control tool. What is the status of the implementation?
Reyk Floeter: I started working on snmpd(8) because I needed an alternative to net-snmp which is more secure, less complicated, reduced to basic functionality, and designed for OpenBSD. Many people picked it up and started using it even when it was still in a very early development stage. It has been tested with net-snmp, Nagios and some other free and commercial SNMP implementations or Network Monitoring Systems (NMS).
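As an illustration of what "creating the response from scratch" involves for a reset, here is a small sketch of the RFC 793 rules for choosing the sequence and acknowledgment numbers of a RST, built as a fresh object rather than by modifying the incoming segment. It is not the OpenBSD kernel code; the `Segment` class and `build_reset` function are hypothetical names, and only the header fields needed for the rule are modeled.

```python
from dataclasses import dataclass
from typing import Optional

TH_FIN, TH_SYN, TH_RST, TH_ACK = 0x01, 0x02, 0x04, 0x10  # TCP flag bits


@dataclass
class Segment:
    src_port: int
    dst_port: int
    seq: int
    ack: int
    flags: int
    payload_len: int = 0


def build_reset(seg: Segment) -> Optional[Segment]:
    """Build a new RST for an unacceptable segment (RFC 793 rules)."""
    if seg.flags & TH_RST:
        return None                      # never answer a reset with a reset
    if seg.flags & TH_ACK:
        # The reset takes its sequence number from the incoming ACK field.
        return Segment(seg.dst_port, seg.src_port,
                       seq=seg.ack, ack=0, flags=TH_RST)
    # Otherwise: sequence number 0, and acknowledge everything received.
    # SYN and FIN each occupy one unit of sequence space.
    seg_len = seg.payload_len
    seg_len += 1 if seg.flags & TH_SYN else 0
    seg_len += 1 if seg.flags & TH_FIN else 0
    return Segment(seg.dst_port, seg.src_port, seq=0,
                   ack=(seg.seq + seg_len) & 0xFFFFFFFF,
                   flags=TH_RST | TH_ACK)
```

Because the reply is assembled only from values copied out of the original header, it does not depend on how the original packet happened to be laid out in memory.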
I would also like to thank the user community and some developers for the very good feedback with usable bug reports, code review, and testing. We also tested it against the PROTOS test suite, which helped to find some remaining issues; snmpd(8) is running very stably now.
For the future I neither plan to implement every existing MIB nor any exotic SNMP extensions like Agent-X, but there will be further work to add more MIBs related to TCP/IP networking and OpenBSD monitoring. It currently supports most of the SNMPv1/v2c MIBs, IP-MIB, BRIDGE-MIB, HOST-RESOURCES-MIB, IF-MIB, and the OPENBSD-SENSORS-MIB. It is also possible to send SNMPv2 traps via snmpctl(8) or from relayd(8).
hoststated(8)/hoststatectl(8) were renamed to relayd(8)/relayctl(8)… What is the new scope of the tool and what features have you added in this release?
Reyk Floeter: hoststated(8) was started as a daemon for health checks on load-balanced hosts – it was the “Host State Daemon”, a helper to extend pf’s load balancing capabilities. The layer 7 relaying code I wrote extended the daemon in a significant way, and the old name was a little bit misleading.
relayd(8) is a full-featured TCP/IP relay, or Application Layer Gateway (ALG), where the health checking of hosts is just a part of the functionality. It currently supports TCP, HTTP, and DNS relaying, SSL “acceleration” or termination, and the traditional layer 3 redirections. The grammar of the new relayd.conf(5) configuration file has been redesigned, which will need some attention when migrating from hoststated. The grammar is more obvious, “services” became “redirections” because they’re using the rdr functionality in pf, and tables look more like those in the pf.conf(5) grammar.
relayd(8) is now also able to send SNMP traps via snmpd(8) when the state of a monitored host changes. This is a very nice feature for monitoring load balancers in an existing NMS. I also like the interface that makes this happen; external daemons can open /var/run/snmpd.sock and send TLV-based IMSG to snmpd(8) – there is no need to link relayd(8) against an SNMP library or to handle any ASN.1/BER encoding outside of snmpd(8) itself.
I heard you simplified the configuration of carp(4) load balancing. Please tell us more…
Marco Pfatschbacher: It has always been a minor inconvenience to set up CARP load balancing. You’d have to create multiple interfaces with the same address, manage those hostname.carp* files, and make sure to get the advskew and the link flags just right. To get rid of the need for multiple interfaces, I had to factor out the virtual host portion of carp into a separate struct that is kept in a list per carp interface. This allows one carp interface to contain up to 32 virtual host instances. Rather than creating multiple interfaces with the same address, we can now just create a single carp interface and assign it multiple carpnodes with their respective advskews. This is a time-saver and should ease troubleshooting across CARP members.
Furthermore, I replaced the link flags with more descriptive ifconfig balancing options. Setting up an IP balanced cluster with two hosts now becomes as simple as:
host-A# ifconfig carp0 192.168.1.10 carpnodes 1:0,2:100 balancing ip
host-B# ifconfig carp0 192.168.1.10 carpnodes 1:100,2:0 balancing ip
The resulting state on host-A should be something like:
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:00:5e:00:01:01
        carp: carpdev sis0 advbase 1 balancing ip
                state MASTER vhid 1 advskew 0
                state BACKUP vhid 2 advskew 100
        groups: carp
        inet 192.168.1.10 netmask 0xffffff00 broadcast 192.168.1.255
To have things more consistent there’s now also an “ARP balancing” equivalent for IPv6: “NDP balancing”.
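The carpnodes argument in the commands above is a comma-separated list of vhid:advskew pairs. The sketch below parses such a list and, for a set of hosts, works out which host would be expected to be MASTER for each vhid, on the simplified assumption that all hosts are up, share the same advbase, and the lowest advskew wins. The function names are hypothetical and the value ranges (vhid 1 to 255, advskew 0 to 254) are those documented for ifconfig(8); this is an aid for sanity-checking a configuration, not a model of the actual CARP election.

```python
MAX_CARPNODES = 32   # virtual host instances per carp interface, per the text


def parse_carpnodes(spec):
    """Parse 'vhid:advskew,vhid:advskew,...' into a list of (vhid, advskew)."""
    nodes = []
    for item in spec.split(","):
        vhid_text, advskew_text = item.split(":")
        vhid, advskew = int(vhid_text), int(advskew_text)
        if not 1 <= vhid <= 255:
            raise ValueError(f"vhid out of range: {vhid}")
        if not 0 <= advskew <= 254:
            raise ValueError(f"advskew out of range: {advskew}")
        nodes.append((vhid, advskew))
    if len(nodes) > MAX_CARPNODES:
        raise ValueError("too many carpnodes for one interface")
    if len({vhid for vhid, _ in nodes}) != len(nodes):
        raise ValueError("duplicate vhid")
    return nodes


def expected_masters(configs):
    """Map each vhid to the host with the lowest advskew for it.

    configs maps a host name to its carpnodes string. Simplification:
    every host is assumed to be up and to use the same advbase.
    """
    best = {}
    for host, spec in configs.items():
        for vhid, advskew in parse_carpnodes(spec):
            if vhid not in best or advskew < best[vhid][0]:
                best[vhid] = (advskew, host)
    return {vhid: host for vhid, (_, host) in best.items()}
```

For the two-host example, `expected_masters({"host-A": "1:0,2:100", "host-B": "1:100,2:0"})` assigns vhid 1 to host-A and vhid 2 to host-B, which agrees with the MASTER/BACKUP states in the host-A output.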
Would you like to talk about the new wireless drivers you worked on?
Damien Bergamini: There will be four new drivers for 802.11 wireless devices in 4.3:
- bwi(4) for Broadcom AirForce devices was written for DragonFlyBSD by Sepherosa Ziehau and ported to OpenBSD by Jonathan Gray and Marcus Glocker.
- upgt(4) for Conexant PrismGT USB devices was written by Marcus Glocker thanks to the reverse engineering done by the people at prism54.org.
- iwn(4) for Intel Wireless WiFi Link 4965AGN devices was written by me from the reverse engineering of the Intel Linux iwlwifi driver.
- ral(4) RT2860 was written by me based on documentation provided by Ralink Technology (documentation that has been obtained thanks to Theo de Raadt.)
Special thanks go to Jonathan Gray and Theo de Raadt for their efforts to free the firmware for the ral(4) RT2860 devices. Thanks to their determination, we were able to ship the RT2860 firmware under the MIT license just in time for the 4.3 release! bwi(4), upgt(4) and iwn(4) require non-free firmware to operate.
Although iwn(4) and RT2860 are 802.11n devices, they only work in 802.11g mode for the moment. More work needs to be done in our generic 802.11 layer (net80211) to fully support these devices. I plan to work on 802.11n support after WPA support is integrated, which is something I’m actively working on.
Storage
Is it true that you improved the speed of flash drives? How?
Stuart Henderson: We noticed that Sandisk CompactFlash cards were a lot faster than most others and wondered why. Naddy pointed out that the cards which performed slowly all needed single-sector I/O and that DMA transfer was being disabled for these. I looked at other operating systems and found a simple change made in NetBSD that looked promising, ported it across and did some testing – it didn’t cause any negative effects, and improved performance a great deal in some cases, so into the snapshots it went. After a little while with no reported problems it seemed pretty safe, so it’s now committed.
Looking at the dmesgs submitted since then, I noticed that some of the new machines with solid-state hard drives (like the Eee PC) are also affected, so it turns out it was a really good time to make this change.
What limits does this release have when dealing with storage?
Otto Moerbeek: We finished very large parts of large disk support. Large disks are disks that have more than 2TB capacity; the sector count of such disks overflows a 32-bit integer variable. We now use 64-bit integers to address disk blocks in all layers of disk-related code (disklabel, buffer cache, drivers, ffs and ffs2). The disklabel format has also changed to support large disks and partitions (up to 128PB, though the current code limits it to 64PB).
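The 2TB boundary follows from simple arithmetic, sketched below. The 512-byte sector size is an assumption (the interview does not state it), and the helper names are hypothetical. The sketch also notes that a 48-bit sector count with the same sector size would give 128PB; that is only a consistency check against the quoted figure, since the interview does not say how wide the disklabel field is.

```python
SECTOR_SIZE = 512            # assumption: traditional 512-byte sectors
TIB = 2 ** 40
PIB = 2 ** 50


def max_addressable_bytes(bits, sector_size=SECTOR_SIZE):
    """Bytes addressable when the sector number is a 'bits'-wide integer."""
    return (1 << bits) * sector_size


def wrap_u32(value):
    """What an unsigned 32-bit variable would actually hold."""
    return value & 0xFFFFFFFF


# A 32-bit sector count tops out at 2 TiB.
limit_32 = max_addressable_bytes(32)

# A 3 TiB disk has more sectors than fit in 32 bits, so the count wraps
# and the disk would appear to be much smaller than it is.
sectors_3tib = 3 * TIB // SECTOR_SIZE
wrapped = wrap_u32(sectors_3tib)

print(limit_32 // TIB, "TiB limit with 32-bit sector numbers")
print(sectors_3tib, "sectors wrap to", wrapped)
print(max_addressable_bytes(48) // PIB, "PiB with 48-bit sector numbers")
```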
The actual largest file you can create did not change: with the default block size, the maximum file size FFS can store is 1PB. But the kernel limits file sizes to a maximum of 2^31 pages, so it ends up being 8TB on most platforms. You’ll probably need to create a sparse file to actually create one like that.
The original version of FFS supports filesystems of up to 1TB; due to its larger inode, FFS2 can support up to the largest partition we can handle. There are important issues though: a filesystem check of a large filesystem takes a lot of memory. That is, it may need more memory than a user process can allocate. To be able to actually use a large filesystem you’ll need to newfs(8) it with larger fragment and block sizes than the defaults normally used. Solving this is high on the wanted features list, of course. Also, using FFS2 for any filesystem used in the install or upgrade process is not supported.
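The 8TB figure can be reproduced as 2^31 pages times the page size, assuming 4KB pages (an assumption; page size varies by platform, which is presumably why the limit applies to "most" platforms). The sketch below also shows one common way to create a sparse file: extending an empty file to a logical size without writing data. The function names are hypothetical, and whether the file really occupies little disk space depends on the filesystem.

```python
import os

PAGE_SIZE = 4096             # assumption: 4 KB pages
MAX_PAGES = 2 ** 31          # kernel limit quoted in the text


def kernel_file_limit(page_size=PAGE_SIZE):
    """Largest file size in bytes under the 2^31-page limit."""
    return MAX_PAGES * page_size


def make_sparse_file(path, size):
    """Create a file with logical length 'size' without writing its data.

    Extending an empty file with truncate() leaves a hole on filesystems
    that support sparse files. Returns the resulting logical size.
    """
    with open(path, "wb") as f:
        f.truncate(size)
    return os.stat(path).st_size
```

With these assumptions `kernel_file_limit()` is 2^43 bytes, i.e. 8 TiB. Calling `make_sparse_file("big.img", kernel_file_limit())` would attempt a file at that limit; try a much smaller size first.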
As for other filesystems we support, I do not know the limits of either the format itself or our implementation.
Federico Biancuzzi manages the BSD section of the Italian magazine Linux&C. As a freelancer, he writes for ONLamp, LinuxDevCenter, and SecurityFocus.