Tools for Reading Sources

by admin

Hacking is a good way to learn, and the first step in hacking is reading the source code. You may have had this experience: faced with a large body of source code (often dozens of files), you don’t know where to start, or even how to read it. A good tool helps a lot here. It makes the code easier to understand and helps you find what you are looking for quickly. In this post I will introduce some excellent tools for browsing source code.

For me, the must-have features of a source browsing tool are:

  1. Syntax highlighting
  2. Friendly user interface
  3. Identifier and function search

If you are a Windows user, you might have heard of Source Insight, which is non-free software. I haven’t used it, so I am not going to comment on it.

For Linux users, the first tool I want to mention is Source Navigator. It is developed and supported by Red Hat. It supports C, C++, Java, Tcl, FORTRAN and COBOL, and it is very powerful. But if you just want to browse sources, I don’t recommend it. Its GUI is written in Tcl, an ancient programming language, so the user interface is not very friendly. I bet you won’t like it.

Another wonderful piece of software is Doxygen. Although it aims to generate documentation for various programming languages, including C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D, you can treat it as a source browsing tool. It can generate documentation in HTML format, so you can read it in Firefox. If the source code is well commented in JavaDoc style, Doxygen generates attractive documentation describing the classes, variables and functions. If the source code is not commented in JavaDoc style, the documentation can still be generated, but it contains only the source code, without the detailed descriptions of classes, variables and functions. Doxygen can also generate class graphs, call graphs and similar diagrams in UML style, which help you understand the source.
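To give an idea of what “commented in JavaDoc style” means for C code, here is a small sketch. The function `sum_array` and its parameters are made up for illustration; the `/** ... */` blocks and the `@file`, `@brief`, `@param` and `@return` commands are the parts Doxygen picks up. The `@file` block is there because Doxygen, with its default settings, only documents global C functions in files that are themselves documented.

```c
/**
 * @file
 * @brief Tiny example of JavaDoc-style comments that Doxygen can read.
 */
#include <stddef.h>

/**
 * @brief Sums the elements of an integer array.
 *
 * @param values Pointer to the first element; may be NULL only if count is 0.
 * @param count  Number of elements in the array.
 * @return The sum of all elements, or 0 for an empty array.
 */
long sum_array(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];  /* accumulate in a long to reduce overflow risk */
    }
    return total;
}
```

Without such comments Doxygen can still list `sum_array`, but the generated page has no description of what the parameters mean.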

The disadvantage of Doxygen is that its search results are rather coarse-grained. It cannot distinguish variables from functions in the results, and it cannot differentiate between definitions, declarations and references. So, if you are dealing with a huge project, you may find it uncomfortable to read the source through Doxygen’s output.
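If the difference between a declaration, a definition and a reference is not clear, the following C sketch may help. The functions `clamp` and `clamp_percent` are invented for this example. In a real project the three occurrences usually live in different files, which is why a search that lumps them together gets tiring.

```c
/* Declaration: announces the name and signature, but has no body.
 * In a real project this line would typically sit in a header file. */
int clamp(int value, int low, int high);

/* Definition: the one place where the body of the function is written. */
int clamp(int value, int low, int high)
{
    if (value < low) {
        return low;
    }
    if (value > high) {
        return high;
    }
    return value;
}

/* Reference: any place that uses the identifier, here a call to clamp(). */
int clamp_percent(int value)
{
    return clamp(value, 0, 100);
}
```

A search for `clamp` finds all three occurrences; the useful question is usually “where is it defined?” or “who calls it?”, and answering that needs a tool that keeps the three kinds apart.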

The ultimate tool, in my view, is LXR, which stands for Linux Cross Referencer. It may be the best choice for reading the Linux source code in particular. Although, as the name suggests, LXR targets the Linux source code, it is also suitable for other projects written in C/C++, COBOL, Java or Perl. It is a web-based tool, so you can read the source code in Firefox. It uses a database to manage cross references, so the search results are very accurate, and it can differentiate between definitions, declarations and references.

LXR is notorious for being difficult to set up and configure, and its user interface cannot be called “beautiful”. Luckily, the next generation, LXRng (demo), is available now. LXRng has a very beautiful and friendly user interface. Moreover, setup and configuration are much simpler for LXRng than for LXR. However, it seems that LXRng currently supports only C. If you need it for C++, a slight hack is required:

Go to your LXRng installation directory, then open this file:

lib/LXRng/Lang/C.pm

Find the following code in this file:

sub pathexp {
    return qr/\.[ch]$/;
}

And change it to:

sub pathexp {
    return qr/\.[ch]$|\.cpp$/;
}

After that, LXRng should support C++ files as well.
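Since the pattern is matched against file paths, it is worth checking which names it accepts. The sketch below is only an illustration: it uses the POSIX `regcomp`/`regexec` functions as a stand-in for Perl’s regex engine (for a pattern this simple I expect the two to agree), and the names `path_matches`, `CPP_PATTERN` and `LOOSE_PATTERN` are made up. It also shows why the dots should be written as `\.`: an unescaped `.` matches any character, not just a literal dot.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* The modified pattern with literal dots. Backslashes are doubled
 * because this is a C string literal. */
#define CPP_PATTERN   "\\.[ch]$|\\.cpp$"

/* The same pattern with unescaped dots, for comparison. */
#define LOOSE_PATTERN ".[ch]$|.cpp$"

/* Returns true if `name` matches the POSIX extended regex `pattern`.
 * An invalid pattern is treated as "no match". */
bool path_matches(const char *pattern, const char *name)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return false;
    }
    bool matched = (regexec(&re, name, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

With these definitions, `path_matches(CPP_PATTERN, "main.cpp")` and `path_matches(CPP_PATTERN, "util.c")` are true, while `path_matches(CPP_PATTERN, "arch")` is false. With `LOOSE_PATTERN`, a file named `arch` matches, because the `.` accepts the `c` before the final `h`. Note that `CPP_PATTERN` does not accept other common C++ extensions such as `.cc` or `.hpp`; they would need their own alternatives in the pattern.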

Doxygen and LXRng should satisfy most of your requirements. Reading source code in Firefox may feel strange and unfamiliar at first, but I think you will like it once you give it a try. If you have no idea how to use them, just go to their websites, where you can find plenty of documentation and tutorials.