+++ /dev/null
-Compiling and Linking
-----
-
-Assume we have:
-
- - one or more source files, *.go, perhaps in different directories
- - a compiler, C; it takes one .go file and generates a .o file.
- - a linker, L; it takes one or more .o files and generates a go.out (!) file.
-
-There is a question about how to name the output files. Let's avoid that
-problem for now and state that if the input is X.go, the output of
-the compiler is X.o, ignoring the package declaration in the file.
-This is not current behavior and probably not correct behavior, but
-it keeps the exposition simpler.
-
-Let's also assume that the linker knows about the run time and we
-don't have to specify bootstrap and runtime linkage explicitly.
-
-
-Basics
-----
-
-Given a single file, main.go, with no dependencies, we do:
-
- C main.go # compile
- L main.o # link
- go.out # run
-
-Now let's say that main.go contains
-
- import "fmt"
-
-and that fmt.go contains
-
- import "sys"
-
-Then to build, we must compile in dependency order:
-
- C sys.go
- C fmt.go
- C main.go
-
-and then link
-
- L main.o fmt.o sys.o
-
-To the linker itself, the order of arguments is unimportant.
-
-When we compile fmt.go, we need to know the details of the functions
-(etc.) exported by sys.go and used by fmt.go. When we run
-
- C fmt.go
-
-it discovers the import of sys, and must then read sys.o to discover
-the details. We must therefore compile the exporting source file before we
-can compile the importing source. Moreover, if there is a mismatch
-between export and import, we can discover it during compilation
-of the importing source.
-
-To be explicit, then, what we say is, in effect
-
- C sys.go
- C fmt.go sys.o
- C main.go fmt.o sys.o
- L main.o fmt.o sys.o
-
-
-The contents of .o files (I)
-----
-
-It's necessary to include in fmt.o the information for linking
-against the functions etc. in sys.o. It's also possible to identify
-sys.o explicitly inside fmt.o, so we need to say only
-
- L main.o fmt.o
-
-with sys.o discovered automatically. Iterating again, it's easy
-to reduce the link step to
-
- L main.o
-
-with L discovering automatically the .o files it needs to process
-to create the final go.out.
-
-
-Automation of dependencies (I)
-----
-
-It should be possible to automate discovery of the dependencies of
-main.go and therefore the order necessary to compile. Since the
-source files contain explicit import statements, it is possible,
-given a source file, to discover the dependency tree automatically.
-(This will require rules and/or conventions about where to find
-things; for now assume everything is in the same directory.)
-
-The program that does this might be a variant of the
-compiler, since it must parse import statements at least, but for
-clarity let's call it D for dependency. It can be a little like
-make, but let's not call it make because that brings along properties
-we don't want. In particular, it reads the sources to discover the
-dependencies; it doesn't need a separate description such as a
-Makefile.
-
-In a directory with the source files above, including main.go, but
-with no .o files, we say:
-
- D main.go
-
-D reads main.go, finds the import for fmt, and in effect descends,
-automatically running
-
- D fmt.go
-
-which in turn invokes
-
- D sys.go
-
-The file sys.go has no dependencies, so it can be compiled; D
-therefore says in effect
-
- "compile sys.go"
-
-and returns; then we have what we need for fmt.go since the exports
-in sys.go are known (or at least the recipe to discover them is
-known). So the next level says
-
- "compile fmt.go"
-
-and pops up, whereupon the top D says
-
- "compile main.go"
-
-The output of D could therefore be described as a script to run to
-compile the source.
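
The descent described above can be sketched as a depth-first walk that emits compile commands in dependency order. This is only an illustration: the `imports` table is a hypothetical stand-in for parsing the import statements of real source files, `plan` is an invented name, and the sketch assumes there are no import cycles.

```go
package main

import (
	"fmt"
	"strings"
)

// imports is a hypothetical stand-in for what D would learn by parsing
// the import statements in main.go, fmt.go, and sys.go.
var imports = map[string][]string{
	"main": {"fmt"},
	"fmt":  {"sys"},
	"sys":  nil,
}

// plan returns the compile commands needed to build root, with every
// dependency listed before the file that imports it.
// It assumes the import graph has no cycles.
func plan(root string) []string {
	var script []string
	done := map[string]bool{}
	var visit func(name string)
	visit = func(name string) {
		if done[name] {
			return
		}
		done[name] = true
		// Descend into the dependencies first ("D fmt.go", "D sys.go").
		for _, dep := range imports[name] {
			visit(dep)
		}
		// All dependencies are handled, so this file can be compiled.
		script = append(script, "C "+name+".go")
	}
	visit(root)
	return script
}

func main() {
	fmt.Println(strings.Join(plan("main"), "\n"))
}
```

With the table above, the emitted script lists sys.go, then fmt.go, then main.go, matching the order D announces in the text.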
-
-We could imagine that instead, D actually runs the compiler.
-(Conversely, we could imagine that C uses D to make sure the
-dependencies are built, but that has the danger of causing unnecessary
-dependency checking and compilation; more on that later.)
-
-To build, therefore, all we need to say is:
-
- D -c main.go # -c means 'run the compiler'
- L main.o
-
-Obviously, D at this stage could just run L. Therefore, we can
-simplify further by having it do so, whereupon
-
- D -c main.go
-
-can automate the complete compilation and linking process.
-
-Automation of dependencies (II)
-----
-
-Let's say we now edit main.go without changing its imports. To
-recompile, we have two options. First, we could be explicit:
-
- C main.go
-
-Or we could use D to automate running the compiler, as described
-in the previous section:
-
- D -c main.go
-
-The D command will discover the import of fmt, but can see that fmt.o
-already exists. Assuming its existence implies its currency, it need
-go no further; it can invoke C to compile main.go and link as usual.
-Whether it should make this assumption might be controlled by a flag.
-For the purpose of discussion, let's say it makes the assumption if
-the -c flag is set.
-
-There are two implications of this scheme. First, running D when D
-is going to turn around and run C anyway implies we could just run
-C directly and save one command invocation. (We could decide
-independently whether C should automatically invoke the linker.)
-
-The other implication is more interesting. If we stop traversing
-the dependency hierarchy as soon as we discover a .o file, then we
-may not realize that fmt.o is out of date and link against a stale
-binary. To fix this problem, we need to stat() or checksum the .o
-and .go files to see if they need recompilation. Doing this every
-time is expensive and gets us back into the make-like approach.
-
-The great majority of compilations do not require this full check,
-however; this is especially true when in the compile-debug-edit
-cycle. We therefore propose splitting the model into two scenarios.
-
-Scenario 1: General
-
-In this scenario, we ask D to update the full dependency tree by
-stat()-ing or checksumming files to check currency. The generated
-go.out will always be up to date but incremental compilation will
-be slower. Typically, this will be necessary only after a major
-operation like syncing or checking out code, or if there are known
-changes being made to the dependencies.
-
-Scenario 2: Fast
-
-In this scenario, we explicitly tell D -c what has changed and have
-it compile only what is required. Typically, this will mean compiling
-only the single active file or maybe a few files. If an IDE is
-present or there is some watcher tool, it's easy to avoid the common
-mistake of forgetting to compile a changed file.
-
-If an edit has caused skew between export and import, this will be
-caught by the compiler, so it should be type-safe at least. If D is
-running the compilation, it might be possible to arrange that C tells
-it there is a dependency problem and have D then try to resolve it
-by reevaluation.
-
-
-The contents of .o files (II)
-----
-
-For scenario 2, we can make things even faster if the .o files
-identify not just the files that must be imported to satisfy the
-imports, but details about the imports themselves. Let's say main.go
-uses only one function from fmt.go, called F. If the compiled main.o
-says, in effect
-
- from package fmt get F
-
-then the linker will not need to read all of fmt.o to link main.o;
-instead it can extract only the necessary function.
-
-Even better, if fmt is a package made of many files, it may be
-possible to store in main.o specific information about the exact
-files needed:
-
- from file fmtF.o get F
-
-The linker then need not even open the other .o files that
-form package fmt.
-
-The compiler should therefore be explicit and detailed within the .o
-files it generates about what elements of a package are needed by
-the program being compiled.
-
-Earlier, we said that when we run
-
- C fmt.go
-
-it discovers the import of sys, and must then read sys.o to discover
-the details. Note that if we record the information as specified here,
-when we then do
-
- C main.go
-
-and it reads fmt.o, it does not in turn need to read sys.o; the necessary
-information has already been pulled up into fmt.o when fmt.go was compiled.
-
-Thus, once the dependency information is properly constructed, to
-compile a program X.go we must read X.go plus N .o files, where N
-is the number of packages explicitly imported by X.go. The transitive
-closure need not be evaluated to compile a file, only the explicit
-imports. By this result, we hope to dramatically reduce the amount
-of I/O necessary to compile a Go source file.
-
-To put this another way, if a package P imports packages Xi, the
-existence of Xi.o files is all that is needed to compile P because the
-Xi.o files contain the export information. This is what breaks the
-transitive dependency closure.
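
To make the saving concrete, the sketch below contrasts the two sets of .o files the compiler might have to read. The `imports` table and both function names are hypothetical. It only counts files and does not model what the export information inside a .o file looks like.

```go
package main

import (
	"fmt"
	"sort"
)

// imports is a hypothetical table of explicit imports per package.
var imports = map[string][]string{
	"main": {"fmt"},
	"fmt":  {"sys"},
	"sys":  nil,
}

// directObjects lists the .o files read when each .o file already
// carries the export information of whatever it imports.
func directObjects(pkg string) []string {
	var objs []string
	for _, dep := range imports[pkg] {
		objs = append(objs, dep+".o")
	}
	return objs
}

// closureObjects lists the .o files that would be read if the compiler
// had to follow imports transitively instead.
func closureObjects(pkg string) []string {
	seen := map[string]bool{}
	var walk func(p string)
	walk = func(p string) {
		for _, dep := range imports[p] {
			if !seen[dep] {
				seen[dep] = true
				walk(dep)
			}
		}
	}
	walk(pkg)
	objs := make([]string, 0, len(seen))
	for dep := range seen {
		objs = append(objs, dep+".o")
	}
	sort.Strings(objs)
	return objs
}

func main() {
	fmt.Println("direct: ", directObjects("main"))
	fmt.Println("closure:", closureObjects("main"))
}
```

For main.go, the direct set is just fmt.o, while the transitive set also includes sys.o. The gap grows with the depth of the import graph.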