Saturday, July 12, 2008

Linux Kernel Development Stats from Greg Kroah-Hartman

Linux kernel hacker Greg Kroah-Hartman's June 5, 2008 talk at Google, titled "The Linux Kernel", was chock-full of details about kernel development. I noted down some of the things he said. Please note that all stats mentioned by GKH are relative to the date of the talk.
  • Code changes per day in 2007-2008 so far
    • 4,300 lines added
    • 1,800 lines removed
    • 1,500 lines modified
    These changes don't include moving files around. It works out to about 3.69 changes per hour, 24x7. It's not just the drivers that are changing at this rate; it's the entire kernel.
  • The kernel itself is about 9.2 million lines, and has been increasing in size by about 10% every year since 2.6.0 (when GKH started tracking it). The drivers make up about 55% of the code, while architecture-specific code is second in terms of LOC. The core of the kernel is about 5%.
  • Linux supports more processors and devices than any other OS in history.
  • One consequence of the scorching pace of kernel development is that patches are better off inside the kernel than maintained separately. GKH pointed out that it is difficult to keep pace with Linux kernel development and gave the example of Xen: the Xen folks apparently never played nice with the kernel developers and are now trying to get their patches into the kernel. From his tone it sounded as though it was a tough uphill climb for the Xen guys, while KVM is already in the kernel.
  • The kernel development hierarchy (it has one!) has developers feeding patches to the maintainers, who in turn feed them to subsystem maintainers (pci, security, usb, etc). Subsystem maintainers maintain their own trees. Stephen Rothwell (IBM, Australia) pulls changes into the "Next" tree every night and does daily builds, while Andrew Morton pulls changes into his tree once a week or so. When Linus says the merge window is open, all the subsystem maintainers hit Linus. There is also the stable release tree, which is not maintained indefinitely.
  • He likened the patch submission process to a lossy network routing algorithm: it can handle people dropping out. There is no other way to develop at such high speed. If you own a file or subsystem, you have to accept the fact that other people are going to be changing it. Maintainers can always revert changes if they don't like them.
  • There is no good way to test the kernel except to run it. The hundreds of permutations of devices and interactions make it impossible to test comprehensively. The only way out is for the developers to test the rc releases.
  • There are no stable and unstable releases anymore. For the last four years, they have been replaced by a release every 2 3/4 months.
  • There have been 2399 independent contributors to the Linux kernel in the last year. 50% of the contributors submitted only one patch; half of the other half contributed two patches. The top of the curve is getting flatter: the top 30 developers do only 30% of the work. The number of individual contributors is going up.
  • Top developers by quantity (for the last one and a half years)
    1. Adrian Bunk 754
    2. Al Viro 698
    3. Thomas Gleixner 656
    4. David S. Miller 655
    5. Bart Zolnierkiewicz 637
    6. Paul Mundt 610
    7. Ralf Baechle 604
    8. Ingo Molnar 596
    9. Patrick McHardy 554
    10. Tejun Heo 530
  • Top contributors by sign-offs (shows who the gatekeepers are). Note that Linus doesn't sign off on patches from subsystem maintainers.
    1. Andrew Morton 9086
    2. Linus Torvalds 8960
    3. David S. Miller 4926
    4. Jeff Garzik 2960
    5. Ingo Molnar 2489
    6. Greg Kroah-Hartman 2098
    7. Thomas Gleixner 1098
    8. Mauro Carvalho Chehab 1822
    9. Paul Mackerras 1675
    10. John Linville 1461
  • Who's funding Linux kernel development?
    1. Amateurs 18.5%
    2. Red Hat 11.6%
    3. IBM 7.5%
    4. Novell 6.6%
    5. Unknown individuals 5.5%
    6. Intel 4.1%
    7. Oracle 2.2%
    8. Consultants 2.2%
    9. Academia 1.5%
    10. Renesas Technology 1.5%
    Google is at number 13 with a 1.4% contribution. Without Andrew Morton's contributions, Google would be at the fortieth spot.
  • Canonical had about 6 changes in the past 5 years; they are in the 300th position. GKH was very emphatic that 'Canonical does not give back to the community'.
  • 75% of Linux kernel work is paid for.
  • Linus jokes that the kernel is not intelligent design, it is evolution. GKH also added that "We react to stimuli that's happening in the world" and "We don't over plan things".
  • As GKH put it, 'We broke all software development rules and we are continuing to break it'.
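
For anyone who wants to sanity-check the numbers above, here is a rough back-of-the-envelope sketch in Python. The constants are the figures quoted in this post; the function and variable names are my own. It assumes that the added, removed and modified line counts do not overlap, and that a "month" is simply one twelfth of a year, so treat the derived values as approximations rather than anything GKH stated.

```python
# Figures quoted in the post (as of June 5, 2008). Names are illustrative only.
LINES_ADDED_PER_DAY = 4300
LINES_REMOVED_PER_DAY = 1800
LINES_MODIFIED_PER_DAY = 1500
CHANGES_PER_HOUR = 3.69
KERNEL_LINES = 9_200_000
DRIVER_SHARE = 0.55          # "about 55%" of the code
CORE_SHARE = 0.05            # "about 5%" of the code
MONTHS_PER_RELEASE = 2.75    # a release every 2 3/4 months


def lines_touched_per_day() -> int:
    """Total lines added, removed and modified per day (assumed disjoint)."""
    return LINES_ADDED_PER_DAY + LINES_REMOVED_PER_DAY + LINES_MODIFIED_PER_DAY


def changes_per_day() -> float:
    """Scale the quoted hourly change rate to a full 24-hour day."""
    return CHANGES_PER_HOUR * 24


def releases_per_year() -> float:
    """Approximate number of kernel releases in a 12-month year."""
    return 12 / MONTHS_PER_RELEASE


def approx_lines(share: float) -> int:
    """Approximate line count for a given share of the kernel."""
    return round(KERNEL_LINES * share)


if __name__ == "__main__":
    print(f"Lines touched per day: {lines_touched_per_day()}")
    print(f"Changes per day:       {changes_per_day():.1f}")
    print(f"Releases per year:     {releases_per_year():.1f}")
    print(f"Driver code (approx):  {approx_lines(DRIVER_SHARE):,} lines")
    print(f"Core code (approx):    {approx_lines(CORE_SHARE):,} lines")
```

By this reckoning roughly 7,600 lines are touched per day across just under 90 changes, and a bit more than four releases ship per year.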