% -*- LaTeX -*-
%(*** preamble
\documentclass[9pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{palatino}
\usepackage{mathrsfs}
\usepackage{xspace}
\usepackage[T1]{fontenc}
\usepackage[english]{babel}
\usepackage[a4paper,lmargin=1cm,rmargin=1cm,tmargin=1cm,bmargin=2cm]{geometry}
\newcommand{\ocb}{\texttt{ocamlbuild}\xspace}
\newcommand{\tags}{\texttt{\_tags}\xspace}
%***)
%(*** title
\begin{document}
\title{The \ocb users manual}
\author{Berke \textsc{Durak}, Nicolas \textsc{Pouillard}}
\date{February 2007}
\maketitle
%***)
%(*** abstract
\begin{abstract}
\ocb is a tool automating the compilation of most OCaml projects with minimal
user input. Its use is not restricted to projects having a simple structure --
the extra effort needed to make it work with the more complex projects is in
reasonable proportion with their added complexity. In practice, one will use a
set of small text files, and, if needed, an OCaml compilation module that can
fine-tune the behaviour and define custom rules.
\end{abstract}
%***)
%(*** Features of ocamlbuild
\section{Features of \ocb}
{\em This section is intended to read like a sales brochure or a datasheet.}

\begin{itemize}
\item Built-in compilation rules for OCaml projects handle all the nasty cases:
native and byte-code, missing \texttt{.mli} files, preprocessor rules,
libraries, packages (\texttt{-pack}), debugging and profiling flags, C stubs.
\item Plugin mechanism for writing compilation rules and actions in a real
programming language, OCaml itself.
\item Automatic inference of dependencies.
\item Correct handling of dynamically discovered dependencies.
\item Object files and other temporary files are created in a specific
directory, leaving your main directory uncluttered.
\item Sanity checks ensure that object files are where they are supposed to be:
in the build directory.
\item Regular projects are built using a single command with no extra files.
\item Parallel compilation to speed up things on multi-core systems.
\item Sophisticated display mode to keep your screen free of boring and
repetitive compilation messages while giving you important progress information
at a glance, and correctly multiplexing the error messages.
\item Tags and flags provide a concise and convenient mechanism for automatic
selection of compilation, preprocessing and other options.
\item Extended shell-like glob patterns, which can be combined using boolean
operators, allow you to concisely define the tags that apply to a given file.
\item Mechanisms for defining the mutual visibility of subdirectories.
\item Cache mechanism avoiding unnecessary compilations where reasonably
computable.
\end{itemize}
%***)
%(*** Limitations
\section{Limitations}
{\em Not perfect nor complete yet, but already pretty damn useful.}

We were not expecting to write the ultimate compilation tool in a few
man-months; however, we believe we have a tool that solves many compilation
problems, especially our own, in a satisfactory way. There are still a lot of
missing features, incomplete options and hideous bugs lurking in \ocb, and we
hope that the OCaml community will find our first try at \ocb useful and help it
grow into a tool that satisfies most needs of most users by providing feedback,
bug reports and patches.

The plugin API may be somewhat lacking in maturity, as it has only been tested
by a few people. We believe a good API can only evolve under pressure from many
peers, and with the developers' courage to rewrite things cleanly when the time
is ripe.

Most of the important functions a user will need are encapsulated in the plugin
API, which is the \texttt{Ocamlbuild\_plugin} module pack. We intend to keep
that API backwards compatible. It may happen that intricate projects need
features not available in that module -- you may then use functions or values
directly from the core \ocb modules.
We ask you to report such usage to the authors so that we may make the necessary
changes to the API; you may also want to isolate calls to the non-API parts of
the \ocb library from the rest of your plugin to be able to keep the latter when
incompatible changes arise.

The way that \ocb handles the command-line options, the \tags file, the target
names, names of the tags, and so on, is not expected to change in incompatible
ways. We intend to keep a project that compiles without a plugin compilable
without modifications in the future.
%***)
%(*** Using ocamlbuild
\section{Using \ocb}
{\em Learn how to use \ocb with short, specific, straight-to-the-point examples.}

The amount of time and effort spent on the compilation process of a project
should be proportionate to that spent on the project itself. It should be easy
to set up a small project, maybe a little harder for a medium-sized project, and
it may take some more time, but not too much, for a big project. Ideally setting
up a big project would be as easy as setting up a small project. However, as
projects grow, modularization techniques start to be used, and the probability
of using meta programming or multiple programming languages increases, thus
making the compilation process more delicate.

\ocb is intended to be very easy to use for projects, large or small, with a
simple compilation process: typing \texttt{ocamlbuild foo.native} should be
enough to compile the native version of a program whose top module is
\texttt{foo.ml} and whose dependencies are in the same directory. As your
project gets more complex, you will gradually start to use command-line options
to specify libraries to link with, then configuration files, ultimately
culminating in a custom OCaml plugin for complex projects with arbitrary
dependencies and actions.
%(*** Hygiene *)
\subsection{Hygiene \& where is my code ?}
Your code is in the \texttt{\_build} directory, but \ocb automatically creates a
symbolic link to the executables it produces in the current directory. \ocb
copies the source files and compiles them in a separate directory which is
\texttt{\_build} by default.

For \ocb, any file that is not in the build directory is a source file. It is
not unreasonable to think that some users may have bought binary object files
they keep in their project directory. Usually binary files cluttering the
project directory are due to previous builds using other systems. \ocb has
so-called ``hygiene'' rules that state that object files (\texttt{.cmo},
\texttt{.cmi}, or \texttt{.o} files, for instance) must not appear outside of
the build directory. These rules are enforced at startup; any violations will be
reported and \ocb will exit. You must then remove these files by hand or run,
with caution, the script \texttt{sanitize.sh}, which is generated in your source
directory. This script will contain commands to remove them for you.

To disable these checks, you can use the \texttt{-no-hygiene} flag. If you have
files that must elude the hygiene squad, just tag them with \texttt{precious} or
\texttt{not\_hygienic}.
%***)
%(*** Hello, world !
\subsection{Hello, world !}
Assuming we are in a directory named \texttt{example1} containing one file
\texttt{hello.ml} whose contents are
\begin{verbatim}
let _ =
  Printf.printf "Hello, %s ! My name is %s\n"
    (if Array.length Sys.argv > 1 then Sys.argv.(1) else "stranger")
    Sys.argv.(0)
;;
\end{verbatim}
we can compile and link it into a native executable by invoking
\texttt{ocamlbuild hello.native}. Here, \texttt{hello} is the basename of the
top-level module and \texttt{native} is an extension used by \ocb to denote
native code executables.
\begin{verbatim}
% ls
hello.ml
% ocamlbuild hello.native
Finished, 4 targets (0 cached) in 00:00:00.
% ls -l
total 12
drwxrwx--- 2 linus gallium 4096 2007-01-17 16:24 _build/
-rw-rw---- 1 linus gallium   43 2007-01-17 16:23 hello.ml
lrwxrwxrwx 1 linus gallium   19 2007-01-17 16:24 hello.native -> _build/hello.native*
\end{verbatim}
What's this funny \texttt{\_build} directory ? Well that's where \ocb does its
dirty work of compiling. You usually won't have to look very often into this
directory. Source files are copied into \texttt{\_build} and this is where the
compilers will be run. Various cache files are also stored there. Its contents
may look like this:
\begin{verbatim}
% ls -l _build
total 208
-rw-rw---- 1 linus gallium    337 2007-01-17 16:24 _digests
-rw-rw---- 1 linus gallium    191 2007-01-17 16:24 hello.cmi
-rw-rw---- 1 linus gallium    262 2007-01-17 16:24 hello.cmo
-rw-rw---- 1 linus gallium    225 2007-01-17 16:24 hello.cmx
-rw-rw---- 1 linus gallium     43 2007-01-17 16:23 hello.ml
-rw-rw---- 1 linus gallium     17 2007-01-17 16:24 hello.ml.depends
-rwxrwx--- 1 linus gallium 173528 2007-01-17 16:24 hello.native*
-rw-rw---- 1 linus gallium    936 2007-01-17 16:24 hello.o
-rw-rw---- 1 linus gallium     22 2007-01-17 16:24 ocamlc.where
\end{verbatim}
%***)
%(*** Executing my code
\subsection{Executing my code}
You can execute your code the old-fashioned way (\texttt{./hello.native}).
You may also type
\begin{verbatim}
ocamlbuild hello.native -- Caesar
\end{verbatim}
and it will compile and then run \texttt{hello.native} with the arguments
following \texttt{-{}-}, which should display:
\begin{verbatim}
% ocamlbuild hello.native -- Caesar
Finished, 4 targets (0 cached) in 00:00:00.
Hello, Caesar ! My name is _build/hello.native
\end{verbatim}
%***)
%(*** The log file, verbosity and debugging
\subsection{The log file, verbosity and debugging}
By default, if you run \ocb on a terminal, it will use some ANSI escape
sequences to display a nice, one-line progress indicator. To see what commands
\ocb has actually run, you can check the contents of the
\texttt{\_build/\_log} file.
To change the name of the log file or to disable logging, use the
\texttt{-log <file>} or \texttt{-no-log} options. Note that the log file is
truncated at each execution of \ocb.

The log file contains all the external commands that \ocb ran or intended to run
along with the target name and the computed tags. With the
\texttt{-verbose <level>} option, \ocb will also write more or less useful
debugging information; a verbosity level of $1$ (which can also be specified
using the \texttt{-verbose} switch) prints generally useful information; higher
levels produce much more output.
%***)
%(*** Cleaning
\subsection{Cleaning}
\ocb may leave a \texttt{\_build} directory and symbolic links to executables in
that directory (unless \texttt{-no-links} is used). All of these can be removed
safely by hand, or by invoking \ocb with the \texttt{-clean} flag.
%***)
%(*** Where and how to run \ocb
\subsection{Where and how to run \ocb ?}
An important point is that \ocb must be invoked from the root of the project,
even if this project has multiple, nested subdirectories. This is because \ocb
likes to store the object files in a single \texttt{\_build} directory. You can
change the name of that directory with the \texttt{-build-dir} option.

\ocb can be either invoked manually from the UNIX or Windows shell, or
automatically from a build script or a Makefile. Unless run with the
\texttt{-no-hygiene} option, there is the possibility that \ocb will prompt the
user for a response. By default, on UNIX systems, if \ocb senses that the
standard output is a terminal, it will use a nice progress indicator using ANSI
codes, instrumenting the output of the processes it spawns to have a consistent
display. Under non-UNIX systems, or if the standard output is not a terminal, it
will run in classic mode where it will echo the executed commands on its
standard output. This selection can be overridden with the
\texttt{-classic-display} option.
%***)
%(*** Dependencies
\subsection{Dependencies}
{\em Dependencies are automatically discovered.}

Most of the value of \ocb lies in the fact that it often needs no extra
information to compile a project besides the name of the top-level module. \ocb
calls \texttt{ocamldep} to automatically find the dependencies of any modules it
wants to compile. These dependencies are dynamically incorporated in the
dependency graph, something \texttt{make} cannot do. For instance, let's add a
module \texttt{Greet} that implements various ways of greeting people.
\begin{verbatim}
% cat greet.ml
type how = Nicely | Badly;;

let greet how who =
  match how with Nicely -> Printf.printf "Hello, %s !\n" who
               | Badly  -> Printf.printf "Oh, here is that %s again.\n" who
;;
% cat hello.ml
open Greet

let _ =
  let name =
    if Array.length Sys.argv > 1 then
      Sys.argv.(1)
    else
      "stranger"
  in
  greet
    (if name = "Caesar" then Nicely else Badly)
    name;
  Printf.printf "My name is %s\n" Sys.argv.(0)
;;
\end{verbatim}
Then the module \texttt{Hello} depends on the module \texttt{Greet} and \ocb can
figure this out for itself -- we still only have to invoke
\texttt{\ocb hello.native}. Needless to say, this works for any number of
modules.
%***)
%(*** Native and byte code
\subsection{Native and byte-code}
If we want to compile byte-code instead of native, we just use a target name of
\texttt{hello.byte} instead of \texttt{hello.native}, i.e., we type
\texttt{\ocb hello.byte}.
%***)
%(*** Compile flags
\subsection{Compile flags}
To pass a flag to the compiler, such as the \texttt{-rectypes} option, use the
\texttt{-cflag} option as in:
\begin{verbatim}
ocamlbuild -cflag -rectypes hello.native
\end{verbatim}
You can give several \texttt{-cflag} options; they will be passed to the
compiler in the same order.
You can also give them in a comma-separated list with the \texttt{-cflags}
option (notice the plural):
\begin{verbatim}
ocamlbuild -cflags -I,+lablgtk,-rectypes hello.native
\end{verbatim}
These flags apply when compiling, that is, when producing \texttt{.cmi},
\texttt{.cmo}, \texttt{.cmx} and \texttt{.o} files from \texttt{.ml} or
\texttt{.mli} files.
%***)
%(*** Link flags
\subsection{Link flags}
Link flags apply when the various object files are collected and linked into one
executable. These will typically be include directories for libraries. They are
given using the \texttt{-lflag} and \texttt{-lflags} options, which work in the
same way as the \texttt{-cflag} and \texttt{-cflags} options.
%***)
%(*** Linking with external libraries
\subsection{Linking with external libraries}
In our third example, we use one Unix system call and functions from the
\texttt{num} library:
\begin{verbatim}
% cat epoch.ml
let _ =
  let s = Num.num_of_string (Printf.sprintf "%.0f" (Unix.gettimeofday ())) in
  let ps = Num.mult_num (Num.num_of_string "1000000000000") s in
  Printf.printf "%s picoseconds have passed since January 1st, 1970.\n"
    (Num.string_of_num ps)
;;
\end{verbatim}
This requires linking with the \texttt{unix} and \texttt{nums} libraries, which
is accomplished by using the \texttt{-lib unix} and \texttt{-lib nums} flags,
or, alternatively, \texttt{-libs nums,unix}:
\begin{verbatim}
% ocamlbuild -libs nums,unix epoch.native --
Finished, 4 targets (4 cached) in 00:00:00.
1169051647000000000000 picoseconds have passed since January 1st, 1970.
\end{verbatim}
You may need to add options such as \texttt{-cflags -I,/usr/local/lib/ocaml/}
and \texttt{-lflags -I,/usr/local/lib/ocaml/} if the libraries you wish to link
with are not in OCaml's default search path.
%***)
%(*** The _tags files
\subsection{The \tags files}
Finer control over the compiler flags applied to each source file, such as
preprocessing, debugging, profiling and linking options, can be gained using
\ocb's tagging mechanism.

Every source file has a set of tags which tells \ocb what kind of file it is and
what to do with it. A tag is simply a string, usually lowercase, for example
\texttt{ocaml} or \texttt{native}. The set of tags attached to a file is
computed by applying the tagging rules to the filename. Tagging rules are
defined in \tags files in any parent directory of a file, up to the main project
directory.

Each line in the \tags file is made of a glob pattern (see subsection
\ref{subsec:glob}) and a list of tags. More than one rule can apply to a file
and rules are applied in the order in which they appear in a file. By preceding
a tag with a minus sign, one may remove tags from one or more files.

\subsubsection{Example: the built-in \tags file}
\begin{verbatim}
<**/*.ml> or <**/*.mli> or <**/*.mlpack> or <**/*.ml.depends>: ocaml
<**/*.byte>: ocaml, byte, program
<**/*.odoc>: ocaml, doc
<**/*.native>: ocaml, native, program
<**/*.cma>: ocaml, byte, library
<**/*.cmxa>: ocaml, native, library
<**/*.cmo>: ocaml, byte
<**/*.cmi>: ocaml, byte, native
<**/*.cmx>: ocaml, native
\end{verbatim}
A special tag made from the path name of the file relative to the toplevel of
the project is automatically defined for each file. For a file
\texttt{foo/bar.ml} this tag will be \texttt{file:foo/bar.ml}.

If you do not have subdirectories, you can put \texttt{*.ml} instead of
\texttt{**/*.ml}.
%***)
%(*** Glob patterns and expressions
\subsection{Glob patterns and expressions}
\label{subsec:glob}
Glob patterns have a syntax similar to those used by UNIX shells to select path
names (like \texttt{foo\_*.ba?}). They are used in \ocb to define the files and
directories to which tags apply.
Glob expressions are glob patterns enclosed in brackets \texttt{<} and
\texttt{>} combined using the standard boolean operators \texttt{and},
\texttt{or}, \texttt{not}. This allows one to describe sets of path names in
more concise and more readable ways.

Please note that file and directory names are supposed to be made of the
following characters: $\texttt{a}$, $\dots$, $\texttt{z}$, $\texttt{A}$,
$\dots$, $\texttt{Z}$, $\texttt{0}$, $\dots$, $\texttt{9}$, $\texttt{\_}$,
$\texttt{-}$ and $\texttt{.}$. This is called the pathname alphabet $P$.

\begin{table}[h]
\begin{center}
\small
\begin{tabular}{|p{3cm}|l|p{3cm}|p{3cm}|p{5cm}|}
\hline
{\em Formal syntax} & {\em Example} & {\em Matches} & {\em Does not match} &
{\em Meaning (formal meaning)} \\
\hline
\hline
%%
{$u$ \vspace*{0.5em} A string of pathname characters} &
\texttt{foo.ml} &
\texttt{foo.ml} &
\texttt{fo.ml}, \texttt{bar/foo.ml} &
The exact string $u$ ($\{ u \}$, where $u \in P^*$) \\
\hline
%%
{\texttt{*} \vspace*{0.5em} The wild-card star}&
\texttt{*}&
$\varepsilon$, \texttt{foo}, \texttt{bar} &
\texttt{foo/bar}, \texttt{/bar} &
Any string not containing a slash ($P^*$) \\
\hline
%%
{\texttt{?} \vspace*{0.5em} The joker}&
\texttt{?}&
\texttt{a}, \texttt{b}, \texttt{z} &
\texttt{/}, \texttt{bar} &
Any one-letter string, excluding the slash \\
\hline
%%
{\texttt{**/} \vspace*{0.5em} The prefix inter-directory star}&
\texttt{**/foo.ml}&
\texttt{foo.ml}, \texttt{bar/foo.ml}, \texttt{bar/baz/foo.ml} &
\texttt{foo/bar}, \texttt{/bar} &
The empty string, or any string ending with a slash
($\varepsilon \cup P^*\mathtt{/}$) \\
\hline
%%
{\texttt{/**} \vspace*{0.5em} The suffix inter-directory star}&
\texttt{foo/**}&
\texttt{foo}, \texttt{foo/bar} &
\texttt{bar/foo} &
Any string starting with a slash, or the empty string
($\varepsilon \cup \mathtt{/}P^*$) \\
\hline
%%
{\texttt{/**/} \vspace*{0.5em} The infix inter-directory star}&
\texttt{bar/**/foo.ml}&
\texttt{bar/foo.ml}, \texttt{bar/baz/foo.ml} &
\texttt{foo.ml} &
Any string starting and ending with a slash
($\mathtt{/} \cup \mathtt{/}P^*\mathtt{/}$) \\
\hline
%%
{$\mathtt{[} r_1 r_2 \cdots r_k \mathtt{]}$ where $r_i$ is either $c$ or
$c_1-c_2$ $(1 \leq i \leq k)$ \vspace*{0.5em} The positive character class}&
\texttt{[a-fA-F0-9\_.]}&
\texttt{3}, \texttt{F}, \texttt{.} &
\texttt{z}, \texttt{bar} &
Any one-letter string made of characters from one of the ranges $r_i$
($1 \leq i \leq k$).
($\mathscr L(r_1) \cup \cdots \cup \mathscr L(r_k)$) \\
\hline
%%
{\texttt{[\char`\^}$r_1 r_2 \cdots r_k \mathtt{]}$ where $r_i$ is either $c$ or
$c_1-c_2$ $(1 \leq i \leq k)$ \vspace*{0.5em} The negative character class}&
\texttt{[\char`\^a-fA-F0-9\_.]}&
\texttt{z}, \texttt{bar} &
\texttt{3}, \texttt{F}, \texttt{.} &
Any one-letter string NOT made of characters from one of the ranges $r_i$
($1 \leq i \leq k$).
($\Sigma^* \setminus \left(\mathscr L(r_1) \cup \cdots \cup \mathscr L(r_k)\right)$) \\
\hline
%%
{$p_1 p_2$ \vspace*{0.5em} A concatenation of patterns}&
\texttt{foo*}&
\texttt{foo}, \texttt{foob}, \texttt{foobar} &
\texttt{fo}, \texttt{bar} &
Any string with a prefix matching $p_1$ and the corresponding suffix matching
$p_2$,
($\{ uv \mid u \in \mathscr L(p_1), v \in \mathscr L(p_2) \}$) \\
\hline
%%
{$\mathtt{\{} p_1 \mathtt{,} p_2 \mathtt{,} \cdots \mathtt{,} p_k \mathtt{\}}$
\vspace*{0.5em} A union of patterns}&
\texttt{toto.\{ml,mli\}}&
\texttt{toto.ml}, \texttt{toto.mli} &
\texttt{toto.} &
Any string matching one of the patterns $p_i$ for $1 \leq i \leq k$.
($\mathscr L(p_1) \cup \cdots \cup \mathscr L(p_k)$) \\
\hline
%%
\end{tabular}
\end{center}
\caption{
Syntax and semantics of glob patterns.
}
\end{table}

\begin{table}
\begin{center}
\small
\begin{tabular}{|p{2cm}|l|p{7cm}|}
\hline
{\em Formal syntax} & {\em Example} & {\em Meaning (formal meaning)} \\
\hline
\hline
{$\mathtt{<}p\mathtt{>}$} & \texttt{} &
Pathnames matching the pattern $p$ \\
\hline
{$e_1 \; \mathtt{or} \; e_2$} & \texttt{<*.ml> or } &
Pathnames matching at least one of the expressions $e_1$ and $e_2$ \\
\hline
{$e_1 \; \mathtt{and} \; e_2$} & \texttt{<*.ml> and } &
Pathnames matching both expressions $e_1$ and $e_2$ \\
\hline
{$\mathtt{not} \; e$} & \texttt{not <*.mli>} &
Pathnames not matching the expression $e$ \\
\hline
{$\mathtt{true}$} & \texttt{true} &
All pathnames \\
\hline
{$\mathtt{false}$} & \texttt{false} &
No pathnames \\
\hline
\end{tabular}
\end{center}
\caption{
Syntax and semantics of glob expressions.
}
\end{table}
%***)
%(*** Subdirectories
\subsection{Subdirectories}
If the files of your project are held in one or more subdirectories, \ocb must
be made aware of that fact using the \texttt{-I} or \texttt{-Is} options or by
adding an \texttt{include} tag. For instance, assume your project is made of
three subdirectories, \texttt{foo}, \texttt{bar} and \texttt{baz} containing
various \texttt{.ml} files, the main file being \texttt{foo/main.ml}. Then you
can either type:
\begin{verbatim}
% ocamlbuild -Is foo,bar,baz foo/main.native
\end{verbatim}
or add the following line in the \tags file
\begin{verbatim}
<foo> or <bar> or <baz>: include
\end{verbatim}
and call
\begin{verbatim}
% ocamlbuild foo/main.native
\end{verbatim}

There are then two cases. If no other modules named \texttt{Bar} or
\texttt{Baz} exist elsewhere in the project, then you are done. Just use
\texttt{Foo}, \texttt{Foo.Bar} and \texttt{Foo.Baz} in your code. Otherwise, you
will need to use the plugin mechanism and define the mutual visibility of the
subdirectories using the \texttt{Pathname.define\_context} function.
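
As a minimal sketch of that second case, a \texttt{myocamlbuild.ml} plugin
placed at the root of the project could declare the visibility as follows. The
choice of the \texttt{After\_rules} hook and the directory lists are
assumptions made for illustration only; check the documentation of
\texttt{Pathname.define\_context} in your version of \ocb before relying on it.
\begin{verbatim}
(* myocamlbuild.ml: illustrative sketch, not a definitive plugin. *)
open Ocamlbuild_plugin

let () =
  dispatch begin function
    | After_rules ->
        (* Assumed layout: files in foo/ may refer to modules found in
           foo/, bar/ and baz/, while files in bar/ and baz/ only see
           their own directory. *)
        Pathname.define_context "foo" ["foo"; "bar"; "baz"];
        Pathname.define_context "bar" ["bar"];
        Pathname.define_context "baz" ["baz"]
    | _ -> ()
  end
\end{verbatim}
With such a plugin in place, the command line stays the same as before:
\texttt{ocamlbuild foo/main.native}.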
\subsubsection{Note on subdirectory traversal}
\ocb used to traverse by default any subdirectory not explicitly excluded. This
is no longer the case. Note that you can still have fine-grained control using
your \tags file and the \texttt{traverse} tag.

The \texttt{true: traverse} tag declaration is no longer present by default. To
make \ocb recursive use one of these:
\begin{enumerate}
\item Give the \texttt{-r} flag to \ocb.
\item Have a \tags or \texttt{myocamlbuild.ml} file in your top directory.
\end{enumerate}
%***)
%(*** Grouping targets
\subsection{Grouping targets with \texttt{.itarget}}
You can create a file named \texttt{foo.itarget} containing a list of targets,
one per line, such as
\begin{verbatim}
main.native
main.byte
stuff.docdir/index.html
\end{verbatim}
Requesting the target \texttt{foo.otarget} will then build every target listed
in the file \texttt{foo.itarget}. Blank lines and lines starting with a sharp
(\texttt{\#}) are ignored.
%***)
%(*** Packing subdirectories into modules
\subsection{Packing subdirectories into modules}
OCaml's \texttt{-pack} option allows you to structure the contents of a module
in a subdirectory. For instance, assume you have a directory \texttt{foo}
containing two modules \texttt{bar.ml} and \texttt{baz.ml}. From these you want
to build a module \texttt{Foo} containing \texttt{Bar} and \texttt{Baz} as
submodules. In the case where no modules named \texttt{Bar} or \texttt{Baz}
exist outside of \texttt{Foo}, you must write a file \texttt{foo.mlpack},
preferably sitting in the same directory as the directory \texttt{foo}, and
containing the list of modules (one per line) that the pack must contain:
\begin{verbatim}
Bar
Baz
\end{verbatim}
Then, when you request the target \texttt{foo.cmo}, the package will be made
from \texttt{bar.cmo} and \texttt{baz.cmo}.
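
From the point of view of client code, the packed module behaves like a module
with submodules. The following self-contained sketch does not use \ocb at all:
it writes by hand a module that is roughly equivalent to what \texttt{-pack}
would produce for the layout above. The \texttt{name} values are hypothetical
contents invented for the illustration.
\begin{verbatim}
(* Hand-written stand-in for the packed module Foo. *)
module Foo = struct
  (* Would come from foo/bar.ml. *)
  module Bar = struct let name = "bar" end
  (* Would come from foo/baz.ml. *)
  module Baz = struct let name = "baz" end
end

(* Client code reaches the submodules through qualified names. *)
let () = Printf.printf "%s %s\n" Foo.Bar.name Foo.Baz.name
\end{verbatim}
In a real project the two inner modules live in their own files and
\texttt{foo.mlpack} plays the role of the outer \texttt{struct}.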
%***)
%(*** Making an OCaml library
\subsection{Making an OCaml library}
In the same way as for packed modules, you can make a library by listing its
contents in a file (with the \texttt{mllib} extension). For instance, assume you
have two modules \texttt{bar.ml} and \texttt{baz.ml}. From these you want to
build a library \texttt{foo.cma} or \texttt{foo.cmxa} containing the
\texttt{Bar} and \texttt{Baz} modules. To do this you must write a file
\texttt{foo.mllib} containing the list of modules (one per line) that the
library must contain:
\begin{verbatim}
Bar
Baz
\end{verbatim}
Then, when you request the target \texttt{foo.cma}, the library will be made
from \texttt{bar.cmo} and \texttt{baz.cmo}.
%***)
%(*** Making an OCaml toplevel
\subsection{Making an OCaml toplevel}
Making a toplevel is almost the same as making a packed module or a library.
Just write a file with the \texttt{mltop} extension (like \texttt{foo.mltop})
and request the toplevel using the \texttt{top} extension (\texttt{foo.top} in
this example).
%***)
%(*** Preprocessor options
\subsection{Preprocessor options and tags}
You can specify preprocessor options with \texttt{-pp} followed by the
preprocessor string, for instance \texttt{ocamlbuild -pp "camlp4o.opt -unsafe"}
would run your sources through CamlP4 with the \texttt{-unsafe} option. Another
way is to use the tags file.
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
\textbf{Tag} & \textbf{Preprocessor command} & \textbf{Remark} \\
\hline
\hline
\texttt{pp(cmd...)} & \texttt{cmd...} & Arbitrary preprocessor command\footnote{The command must not contain newlines or parentheses.} \\
\hline
\texttt{camlp4o} & \texttt{camlp4o} & Original OCaml syntax \\
\hline
\texttt{camlp4r} & \texttt{camlp4r} & Revised OCaml syntax \\
\hline
\texttt{camlp4of} & \texttt{camlp4of} & Original OCaml syntax with extensions \\
\hline
\texttt{camlp4rf} & \texttt{camlp4rf} & Revised OCaml syntax with extensions \\
\hline
\end{tabular}
\end{center}

%%%%% \subsubsection{An example, dealing with some configuration variables}
%%%%%
%%%%% It's quite common to have in your sources some files that you want to access
%%%%% when your program is running. One often uses some variables that are setup by
%%%%% the end user. Now suppose that there is only two files that use these variables
%%%%% (mylib.ml and parseopt.ml).
%%%%%
%%%%% In the \tags file:
%%%%% \begin{verbatim}
%%%%% "mylib.ml" or "parseopt.ml": pp(sed -e "s,LIBDIR,/usr/local/lib/FOO,g")
%%%%% \end{verbatim}
%%%%%
%%%%% In fact that solution is not really acceptable, since the variable is hardcoded
%%%%% in the \tags file. Trying to workaround this issue by using some shell variable
%%%%% does not work either since the -pp argument will be escaped in simple quotes.
%%%%% Note also that using some script shell that will do that sed and use \verb'$LIBDIR'
%%%%% as a shell variable is not a good idea since \ocb don't know this dependency on that
%%%%% shell script.
%%%%%
%%%%% There is in fact at least two good solutions. The first is to tell that dependency
%%%%% using the \texttt{dep} function in your plugin. The second is simpler it just consist
%%%%% on generating some OCaml file at configure time. By naming this configuration file
%%%%% \texttt{myocamlbuild_config.ml} \ocb will make it also available to your plugin.
%%%%%
%%%%% In your \texttt{myocamlbuild_config.mli} interface:
%%%%% \begin{verbotim}
%%%%% val prefix : string
%%%%% val libdir : string
%%%%% \end{verbotim}
%%%%%
%%%%% And in your \texttt{configure} script
%%%%% \begin{verbatim}
%%%%% #!/bin/sh
%%%%%
%%%%% # Setting defaults values
%%%%% PREFIX=/usr/local
%%%%% LIBDIR=$PREFIX/lib/FOO
%%%%% CONF=myocamlbuild_config.ml
%%%%%
%%%%% # ... some shell to parse option and check configuration ...
%%%%%
%%%%% # Dumping the configuration as an OCaml file.
%%%%% rm -f $CONF
%%%%% echo "let prefix = \"$PREFIX\";;" >> $CONF
%%%%% echo "let libdir = \"$LIBDIR\";;" >> $CONF
%%%%% chmod -w $CONF
%%%%% \end{verbatim}
%***)
%(*** Debugging and profiling
\subsection{Debugging byte code and profiling native code}
The preferred way of compiling code suitable for debugging with
\texttt{ocamldebug} or profiling native code with \texttt{gprof} is to use the
appropriate target extensions, \texttt{.d.byte} for debugging or
\texttt{.p.native} for profiling.

Another way is to use the \texttt{debug} or \texttt{profile} tags. Note that
these tags must be applied at the compilation and linking stages. Hence you must
either use \texttt{-tag debug} or \texttt{-tag profile} on the command line, or
add a
\begin{verbatim}
true: debug
\end{verbatim}
line to your \tags file. Please note that the byte-code profiler works in a
wholly different way and is not supported by \ocb.
%***)
%(*** Generating documentation using \texttt{ocamldoc}
\subsection{Generating documentation using \texttt{ocamldoc}}
Write the names of the modules whose interfaces will be documented in a file
whose extension is \texttt{.odocl}, for example \texttt{foo.odocl}, then invoke
\ocb on the target \texttt{foo.docdir/index.html}. This will collect all the
documentation from the interfaces (which will be built, if necessary) using
\texttt{ocamldoc} and generate a set of HTML files under the directory
\texttt{foo.docdir/}, which is actually a link to
\texttt{\_build/foo.docdir/}.
As for packing subdirectories into modules, the module names must be written one
per line, without extensions and correctly capitalized. Note that generating
documentation in formats other than HTML or from implementations is not
supported.
%***)
%(*** The display line
\subsection{The display line}
Provided \ocb runs in a terminal under a POSIX environment, it will display a
sophisticated progress-indicator line that gracefully interacts with the output
of subcommands. This line looks like this:
\begin{verbatim}
00:00:02 210 (180 ) main.cmx ONbp--il /
\end{verbatim}
Here, 00:00:02 is the elapsed time in hour:minute:second format since \ocb has
been invoked; 210 is the number of external commands, typically calls to the
compiler or the like, that may or may not have been invoked; 180 is the number
of external commands that have not been invoked since their result is already in
the build directory; \texttt{main.cmx} is the name of the last target built;
\texttt{ONbp--il} is a short string that describes the tags that have been
encountered and the slash at the end is a frame from a rotating ticker. Hence,
the display line has the following structure:
\begin{verbatim}
HH:MM:SS JOBS (CACHED) PATHNAME TAGS TICKER
\end{verbatim}

The tag string is made of 8 indicators which each monitor a tag. These tags are
\texttt{ocaml}, \texttt{native}, \texttt{byte}, \texttt{program}, \texttt{pp},
\texttt{debug}, \texttt{interf} and \texttt{link}. Initially, each indicator
displays a dash \texttt{-}. If the current target has the monitored tag, then
the indicator displays the corresponding character (see table
\ref{tab:tag-chars}) in uppercase. Otherwise, it displays that character in
lowercase. This allows you to see the set of tags that have been applied to
files in your project during the current invocation of \ocb.
Hence the tag string \texttt{ONbp--il} means that the current target
\texttt{main.cmx} has the tags \texttt{ocaml} and \texttt{native}, and that the
tags \texttt{ocaml}, \texttt{native}, \texttt{byte}, \texttt{program},
\texttt{interf} and \texttt{link} have already been seen.

\begin{table}
\begin{center}
\begin{tabular}{|l|c|}
\hline
\textbf{Tag} & \textbf{Display character} \\
\hline
\hline
ocaml & O \\
\hline
native & N \\
\hline
byte & B \\
\hline
program & P \\
\hline
pp & R \\
\hline
debug & D \\
\hline
interf & I \\
\hline
link & L \\
\hline
\end{tabular}
\end{center}
\caption{\label{tab:tag-chars} Relation between the characters displayed in the
tag string and the tags.}
\end{table}
%***)
%(*** ocamllex, ocamlyacc and menhir
\subsection{\texttt{ocamllex}, \texttt{ocamlyacc} and \texttt{menhir}}
\ocb knows how to run the standard lexer and parser generator tools
\texttt{ocamllex} and \texttt{ocamlyacc} when your files have the standard
\texttt{.mll} and \texttt{.mly} extensions. If you want to use \texttt{menhir}
instead of \texttt{ocamlyacc}, you can either launch \ocb with the
\texttt{-use-menhir} option or add a
\begin{verbatim}
true: use_menhir
\end{verbatim}
line to your \tags file. Note that there is currently no way of using
\texttt{menhir} and \texttt{ocamlyacc} in the same execution of \ocb.
%***)
%(*** Changing the compilers
\subsection{Changing the compilers or tools}
As \ocb is part of your OCaml distribution, it knows if it can call the native
compilers and tools (\texttt{ocamlc.opt}, \texttt{ocamlopt.opt}...) or not.
However you may want \ocb to use another \texttt{ocaml} compiler for different
reasons (such as cross-compiling or using a wrapper such as
\texttt{ocamlfind}).
Here is the list of relevant options:
\begin{itemize}
  \item \texttt{-ocamlc <command>}
  \item \texttt{-ocamlopt <command>}
  \item \texttt{-ocamldep <command>}
  \item \texttt{-ocamlyacc <command>}
  \item \texttt{-menhir <command>}
  \item \texttt{-ocamllex <command>}
  \item \texttt{-ocamlmktop <command>}
  \item \texttt{-ocamlrun <command>}
\end{itemize}
%***)
\subsection{Writing a \texttt{myocamlbuild.ml} plugin}
%(*** Interaction with version control systems
\subsection{Interaction with version control systems}
Here are tips for configuring your version control system to ignore the files and directories generated by \ocb. The directory \texttt{\_build} and any symbolic links pointing into \texttt{\_build} should be ignored. To do this, you must add the following ignore patterns to your version control system's ignore set:
\begin{verbatim}
_build
*.native
*.byte
*.d.native
*.p.byte
\end{verbatim}
For CVS, add the above lines to the \texttt{.cvsignore} file. For Subversion (SVN), type \texttt{svn propedit svn:ignore .} and add the above lines.
%***)
%(*** A shell script for driving it all?
\subsection{A shell script for driving it all?}
{\em To shell or to make?}
Traditionally, makefiles have two major functions. The first one is the dependency-ordering, rule-matching logic used for compiling. The second one is as a dispatcher for various actions defined using phony targets with shell script actions. These actions include cleaning, cleaning really well, archiving, uploading and so on. Their characteristic is that they rely little or not at all on the build process -- they either need the build to have been completed, or they don't need anything. As \texttt{/bin/sh} scripts have been here for three to four decades and are not going anywhere, why not replace that functionality of makefiles with a shell script?
We have thought of three bad reasons:
\begin{itemize}
  \item Typing \texttt{make} to compile is now an automatism,
  \item We need to share variable definitions between rules and actions,
  \item Escaping already way too special-character-sensitive shell code with invisible tabs and backslashes is a dangerously fun game.
\end{itemize}
We also have bad reasons for not using an OCaml script to drive everything:
\begin{itemize}
  \item \texttt{Sys.command} calls \texttt{/bin/sh} anyway,
  \item Shell scripts can execute partial commands or commands with badly formed arguments,
  \item Shell scripts are more concise for expressing... shell scripts.
\end{itemize}
Anyway, you are of course free to use a makefile or an OCaml script to call \ocb. Here is an example shell driver script:
\begin{verbatim}
#!/bin/sh

# Stop at the first failing command.
set -e

TARGET=epoch
FLAGS="-libs unix,nums"
OCAMLBUILD=ocamlbuild

# Run ocamlbuild with the common flags; $FLAGS is left unquoted on
# purpose so that it is split into separate arguments.
ocb()
{
  $OCAMLBUILD $FLAGS "$@"
}

# Dispatch one action name to the matching ocamlbuild invocation.
rule() {
  case "$1" in
    clean)  ocb -clean;;
    native) ocb "$TARGET.native";;
    byte)   ocb "$TARGET.byte";;
    all)    ocb "$TARGET.native" "$TARGET.byte";;
    depend) echo "Not needed.";;
    *)      echo "Unknown action $1";;
  esac
}

# With no argument, build everything; otherwise run each action in turn.
if [ $# -eq 0 ]; then
  rule all
else
  while [ $# -gt 0 ]; do
    rule "$1"
    shift
  done
fi
\end{verbatim}
%***)
%\subsection{Common errors}
%***)
\appendix
%(*** Motivations
\section{Motivations}
{\em This inflammatory appendix describes the frustration that led us to write \ocb.}

Many people have painfully found that the utilities of the \texttt{make} family, namely GNU Make, BSD Make, and their derivatives, fail to scale to large projects, especially when using multi-stage compilation rules, such as custom pre-processors, unless dependencies are hand-defined. But as your project gets larger, more modular, and uses more diverse pre-processing tools, it becomes increasingly difficult to correctly define dependencies by hand. Hence people tend to use language-specific tools that attempt to extract dependencies.
However, another problem then appears: \texttt{make} was designed with the idea of a static dependency graph. Dependency extracting tools, however, are typically run by a rule in \texttt{make} itself; this means that \texttt{make} has to reload the dependency information. This is the origin of the \texttt{make clean; make depend; make} mantra. This approach tends to work quite well as long as all the files sit in a single directory and there is only one stage of pre-processing. If there are two or more stages, then dependency extracting tools must be run two or more times -- and this means multiple invocations of \texttt{make}. Also, if one distributes the modules of a large project into multiple subdirectories, it becomes difficult to distribute the makefiles themselves, because the language of \texttt{make} was not conceived to be modular; the only two mechanisms permitted, inclusion of makefile fragments and invocation of other \texttt{make} instances, must be skillfully coordinated with phony target names (\texttt{depend1, depend2...}) to ensure inclusion of generated dependencies with multi-stage programming; changes in the structure of the project must be reflected by hand, and the order of variable definitions must be thought out well ahead to avoid long afternoons spent combinatorially fiddling with makefiles until they work but no one understands why.

These problems become especially apparent with OCaml: to ensure type safety and to allow a small amount of cross-unit optimization when compiling native code, interface and object files include cryptographic digests of the interfaces they are to be linked with. This means that linking is safer, but that makefile sloppiness leads to messages such as:
\begin{verbatim}
Files foo.cmo and bar.cmo make inconsistent assumptions over interface Bar
\end{verbatim}
The typical reaction is then to issue the mantra \texttt{make clean; make depend; make} and everything compiles just fine... from the beginning.
Hence on medium projects, the programmer often has to wait for minutes instead of the few seconds that would be taken if \texttt{make} could correctly guess the small number of files that really had to be recompiled.

It is not surprising that hacking a build tool such as \texttt{make} to include a programming language while retaining the original syntax and semantics gives an improvised and cumbersome macro language of dubious expressive power. For example, using GNU make, suppose you have a list of \texttt{.ml}s that you want to convert into a list including both \texttt{.cmo}s and \texttt{.cmi}s, that is, you want to transform \texttt{a.ml b.ml c.ml} into \texttt{a.cmi a.cmo b.cmi b.cmo c.cmi c.cmo} while preserving the dependency order, which must be hand specified for linking\footnote{By the way, what's the point of having a declarative language if \texttt{make} can't sort the dependencies in topological order for giving them to \texttt{gcc} or whatever?}. Unfortunately \texttt{\$(patsubst \%.ml, \%.cmi \%.cmo, a.ml b.ml c.ml)} won't work since the \%-sign in the right-hand side of a \texttt{patsubst} gets substituted only once. You then have to delve into something that is hardly lambda calculus: an intricate network of \texttt{foreach}, \texttt{eval}, \texttt{call} and \texttt{define}s may get you the job done, unless you chicken out and opt for an external \texttt{awk}, \texttt{sed} or \texttt{perl} call. People who at this point have not lost their temper or sanity usually resort to metaprogramming by writing makefile generators using a mixture of shell and m4. One such attempt gave something that is the nightmare of wannabe package maintainers: it's called \texttt{autotools}.

Note that it is also difficult to write makefiles that build object files in a separate directory. It is not impossible since the language of \texttt{make} is Turing-complete, a proof of which is left as an exercise.
Note that building things in a separate directory is not necessarily a young enthusiast's way of giving a different look and feel to his projects -- it may be a good way of telling the computer that \texttt{foo.mli} is generated by \texttt{ocamlyacc} using \texttt{foo.mly} and can thus be removed.
%***)
%(*** Default rules
\section{Summary of default rules}
The contents of this table give a summary of the most important default rules. To get the most accurate and up-to-date information, launch \ocb with the \texttt{-documentation} option.
\begin{center}
\small
\begin{tabular}{|l|l|p{5cm}|}
\hline
\textbf{Tags} & \textbf{Dependencies} & \textbf{Targets} \\
\hline
\hline
 & \%.itarget & \%.otarget \\
\hline
ocaml & \%.mli \%.mli.depends & \%.cmi \\
\hline
byte, debug, ocaml & \%.mlpack \%.cmi & \%.d.cmo \\
\hline
byte, ocaml & \%.mlpack & \%.cmo \%.cmi \\
\hline
byte, ocaml & \%.mli \%.ml \%.ml.depends \%.cmi & \%.d.cmo \\
\hline
byte, ocaml & \%.mli \%.ml \%.ml.depends \%.cmi & \%.cmo \\
\hline
native, ocaml, profile & \%.mlpack \%.cmi & \%.p.cmx \%.p.o \\
\hline
native, ocaml & \%.mlpack \%.cmi & \%.cmx \%.o \\
\hline
native, ocaml, profile & \%.ml \%.ml.depends \%.cmi & \%.p.cmx \%.p.o \\
\hline
native, ocaml & \%.ml \%.ml.depends \%.cmi & \%.cmx \%.o \\
\hline
debug, ocaml & \%.ml \%.ml.depends \%.cmi & \%.d.cmo \\
\hline
ocaml & \%.ml \%.ml.depends & \%.cmo \%.cmi \\
\hline
byte, debug, ocaml, program & \%.d.cmo & \%.d.byte \\
\hline
byte, ocaml, program & \%.cmo & \%.byte \\
\hline
native, ocaml, profile, program & \%.p.cmx \%.p.o & \%.p.native \\
\hline
native, ocaml, program & \%.cmx \%.o & \%.native \\
\hline
byte, debug, library, ocaml & \%.mllib & \%.d.cma \\
\hline
byte, library, ocaml & \%.mllib & \%.cma \\
\hline
byte, debug, library, ocaml & \%.d.cmo & \%.d.cma \\
\hline
byte, library, ocaml & \%.cmo & \%.cma \\
\hline
 & lib\%(libname).clib & lib\%(libname).a dll\%(libname).so \\
\hline
 & \%(path)/lib\%(libname).clib & \%(path)/lib\%(libname).a \%(path)/dll\%(libname).so \\
\hline
library, native, ocaml, profile & \%.mllib & \%.p.cmxa \%.p.a \\
\hline
library, native, ocaml & \%.mllib & \%.cmxa \%.a \\
\hline
library, native, ocaml, profile & \%.p.cmx \%.p.o & \%.p.cmxa \%.p.a \\
\hline
library, native, ocaml & \%.cmx \%.o & \%.cmxa \%.a \\
\hline
 & \%.ml & \%.ml.depends \\
\hline
 & \%.mli & \%.mli.depends \\
\hline
ocaml & \%.mll & \%.ml \\
\hline
doc, ocaml & \%.mli \%.mli.depends & \%.odoc \\
\hline
 & \%.odocl & \%.docdir/index.html \\
\hline
ocaml & \%.mly & \%.ml \%.mli \\
\hline
 & \%.c & \%.o \\
\hline
 & \%.ml \%.ml.depends & \%.inferred.mli \\
\hline
\end{tabular}
\end{center}
%***)
\end{document}