February 5, 2006 / gus3

Call-time Binding

Many computer programs are built using one of two function-binding techniques:

  • Link-time (or static) binding allocates all code and data into the runnable program during the build process. If helper code is later patched, all programs using it must be re-built in order to get the patch.
  • Run-time (or dynamic) binding waits until program launch to fit all the pieces together. The launch is slower, but a patch involves building only one piece, rather than the entire program. On the down side, if a patch is defective, then all programs using that code library will be affected.

I would like to propose a simple third way: call-time binding, in which a code library is not loaded until it is needed. The immediate benefit is keeping the RSS (resident set size) reasonably low for a process. The trade-off is that some support code is needed, which can actually increase the RSS for a complicated program.

I have included a rudimentary proof-of-concept code sample (CAPTCHA and 30-second delay involved for free download), to demonstrate an actual implementation.

The philosophy is simple: Don’t load code until it’s needed. Using dlopen() and dlsym() judiciously means that unnecessary code is never loaded.

First, start with some real functions. Here is add2() in real2.c:

#include <stdio.h>

int add2(int a) {
  printf("Now in real add2(a)\n");
  printf("Passed a = %d\n", a);
  printf("Now exiting real add2(a), "
         "returning %d\n", a+2);
  return a+2;
}

Next, add a function pointer in fn.h:

extern int (*add2)(int);
#define add2(a) (*add2)(a)

The add2_stub() function in stub.c is linked at build-time, but it will be called only once:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h> /* for exit() */
#include "fn.h"

/* __attribute__((constructor))
   means it runs when this library
   is loaded, at launch time */
void prep(void)
  __attribute__((constructor));

int (*add2)(int a);

int add2_stub(int a) {
  void *h;

  h = dlopen("./real2.so", RTLD_LAZY);
  if (!h) {
    fprintf(stderr, "%s\n", dlerror());
    exit(1);
  }
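  /* clear existing errors,
     then do the lookup */
  dlerror();
  add2 = (int (*)(int))dlsym(h, "add2");
  if (!add2) {
    const char *err = dlerror();
    fprintf(stderr, "%s\n",
            err ? err : "add2 is NULL");
    exit(1);
  }
  /* return the real result */
  return (*add2)(a);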
}

/* the following must appear
   after all stub functions */
void prep(void) {
  add2 = add2_stub;
}

The stub code is linked into the program, so it is present from launch time. Marking the prep() function with __attribute__((constructor)) means it will be called as soon as the stub code is loaded, before main() runs, causing the (*add2) pointer to point to add2_stub. Remember, in fn.h, the code “add2(x)” translates to “(*add2)(x)”, which at launch time will call add2_stub(x).

However, upon calling add2_stub(x), the library with the “real” add2() is loaded, and (*add2) is set to point to the actual, desired code. Later calls to (*add2) go directly to the loaded library’s add2().

The main executable includes the simple function call:

add2(5);

which, the first time, calls the stub function, which loads the library, re-points the pointer, and then invokes the real add2(5). Every later execution of the same call goes straight to the real add2(5).
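The pointer swap described above can be watched in isolation. The following is only a sketch, not the code from the download: it keeps everything in one file, so the "real" function is a local stand-in rather than something fetched with dlopen() and dlsym(). The names add2_ptr, add2_real, add2_first_call, call_add2 and resolve_count are all hypothetical, chosen for this illustration.

```c
/* Counts how often the slow path runs
   (hypothetical, for illustration only). */
static int resolve_count = 0;

/* Stand-in for the add2() that real2.so provides. */
static int add2_real(int a) {
  return a + 2;
}

static int add2_first_call(int a);

/* Starts out aimed at the stub, which is what
   prep() arranges in stub.c. */
static int (*add2_ptr)(int) = add2_first_call;

/* Slow path: resolve the target, re-point the
   pointer, then forward the call. */
static int add2_first_call(int a) {
  resolve_count++; /* dlopen()/dlsym() would go here */
  add2_ptr = add2_real;
  return add2_ptr(a);
}

/* Plays the role of the add2(a) macro in fn.h. */
static int call_add2(int a) {
  return add2_ptr(a);
}
```

Calling call_add2(5) and then call_add2(10) takes the slow path only once: resolve_count goes from 0 to 1 on the first call and stays at 1 afterwards, because the second call goes through the re-pointed pointer.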

Finally, the Makefile to tie it all together:

main: main.o stub.o real.so real2.so
	gcc -g -o main \
main.o stub.o -ldl

clean:
	rm -f main.o stub.o real.so \
real2.so main

main.o: main.c
	gcc -g -o main.o -c main.c

stub.o: stub.c
	gcc -g -o stub.o -c stub.c

real.so: real.c
	gcc -g -o real.so -shared real.c

real2.so: real2.c
	gcc -g -o real2.so -shared real2.c

The leading whitespace is a tab character, not spaces. Two notes:

  1. “real2.so” is built with the “-shared” option to “gcc”, which produces a shared library suitable for run-time binding. On many platforms, “-fPIC” is needed as well.
  2. The only object file linked statically with “main.o” is “stub.o”, containing the stub code.

The code sample contains a slightly more complicated demonstration, involving two separate functions in two different libraries. User input causes the two libraries to be loaded in different order.

Simple experiments on Linux show that the libraries are mapped using shared pages, even for different processes loading them in different order. Just as with run-time binding, a library mapped to multiple processes won’t result in significantly higher memory pressure on the system.
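One way to repeat part of such an experiment is to ask the kernel which files a process currently has mapped. The sketch below is an assumption about how that could be done, not the method used for the experiments above: it scans /proc/self/maps, which is Linux-specific and requires procfs to be mounted, and the helper name is_mapped is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if any mapping line of this process
   mentions `needle`, 0 if none does, and -1 if
   /proc/self/maps cannot be opened. */
int is_mapped(const char *needle) {
  char line[4096];
  int found = 0;
  FILE *maps = fopen("/proc/self/maps", "r");

  if (!maps)
    return -1;
  while (fgets(line, sizeof line, maps)) {
    if (strstr(line, needle)) {
      found = 1;
      break;
    }
  }
  fclose(maps);
  return found;
}
```

In the demonstration program, is_mapped("real2.so") should report 0 before the first add2() call and 1 after it, provided the library loads successfully. This only shows that a mapping exists; whether its pages are shared between processes is reported separately, in the Shared_Clean and Shared_Dirty fields of /proc/self/smaps.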

The two advantages should be obvious. If no running program needs to do complex number math, why load the code to do it? Conversely, if a program does need code for complex math, there is no need to load an entire monster library just to get that. And, when the last program needing it exits, the code will be unmapped, reducing the memory pressure.
