Linker and Libraries Guide Chapter 3 Runtime Linker

As part of the initialization and execution of a dynamic executable, an interpreter is called to complete the binding of the application to its dependencies. In the Solaris OS, this interpreter is referred to as the runtime linker.

During the link-editing of a dynamic executable, a special .interp section, together with an associated program header, is created. This section contains a path name specifying the program's interpreter. The default name supplied by the link-editor is the name of the runtime linker: /usr/lib/ld.so.1 for a 32–bit executable and /usr/lib/64/ld.so.1 for a 64–bit executable.

Note –

ld.so.1 is a special case of a shared object. Here, a version number of 1 is used. However, later Solaris OS releases might provide higher version numbers.

During the process of executing a dynamic object, the kernel loads the file and reads the program header information. See Program Header. From this information, the kernel locates the name of the required interpreter. The kernel loads, and transfers control to this interpreter, passing sufficient information to enable the interpreter to continue executing the application.

In addition to initializing an application, the runtime linker provides services that enable the application to extend its address space. This process involves loading additional objects and binding to symbols provided by these objects.

The runtime linker performs the following actions.

  • Analyzes the executable's dynamic information section (.dynamic) and determines what dependencies are required.

  • Locates and loads these dependencies, analyzing their dynamic information sections to determine if any additional dependencies are required.

  • Performs any necessary relocations to bind these objects in preparation for process execution.

  • Calls any initialization functions provided by the dependencies.

  • Passes control to the application.

  • Can be called upon during the application's execution, to perform any delayed function binding.

  • Can be called upon by the application to acquire additional objects with dlopen(3C), and bind to symbols within these objects with dlsym(3C).

Shared Object Dependencies

When the runtime linker creates the memory segments for a program, the dependencies tell what shared objects are needed to supply the program's services. By repeatedly connecting referenced shared objects and their dependencies, the runtime linker generates a complete process image.

Note –

Even when a shared object is referenced multiple times in the dependency list, the runtime linker connects the object only once to the process.

Locating Shared Object Dependencies

When linking a dynamic executable, one or more shared objects are explicitly referenced. These objects are recorded as dependencies within the dynamic executable.

The runtime linker uses this dependency information to locate, and load, the associated objects. These dependencies are processed in the same order as the dependencies were referenced during the link-edit of the executable.

Once all the dynamic executable's dependencies are loaded, each dependency is inspected, in the order the dependency is loaded, to locate any additional dependencies. This process continues until all dependencies are located and loaded. This technique results in a breadth-first ordering of all dependencies.
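The breadth-first ordering described above can be sketched in a few lines of Python. This is only an illustration, not the runtime linker's implementation: the function name load_order and the dependency graph in NEEDED are hypothetical, and each list is assumed to hold an object's dependencies in link-edit order.

```python
from collections import deque


def load_order(root, needed):
    """Return object names in the order a breadth-first walk loads them.

    `needed` maps an object name to its dependencies, listed in the
    order they were referenced during the link-edit.
    """
    order = [root]
    seen = {root}
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        for dep in needed.get(obj, []):
            if dep not in seen:  # an object is connected to the process only once
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


# Hypothetical dependency graph, for illustration only.
NEEDED = {
    "prog": ["libfoo.so.1", "libbar.so.1"],
    "libfoo.so.1": ["libbaz.so.1", "libc.so.1"],
    "libbar.so.1": ["libc.so.1"],
}

if __name__ == "__main__":
    print(load_order("prog", NEEDED))
```

In this sketch both direct dependencies of prog are loaded before libbaz.so.1, and libc.so.1 appears once even though two objects reference it.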

Directories Searched by the Runtime Linker

The runtime linker looks in two default locations for dependencies. When processing 32–bit objects, the default locations are /lib and /usr/lib. When processing 64–bit objects, the default locations are /lib/64 and /usr/lib/64. Any dependency specified as a simple file name is prefixed with these default directory names. The resulting path name is used to locate the actual file.

The dependencies of a dynamic executable or shared object can be displayed using ldd(1). For example, the file /usr/bin/cat has the following dependencies:


$ ldd /usr/bin/cat
        libc.so.1 =>     /lib/libc.so.1
        libm.so.2 =>     /lib/libm.so.2

The file /usr/bin/cat has a dependency on, or needs, the files libc.so.1 and libm.so.2.

The dependencies recorded in an object can be inspected using elfdump(1). Use this command to display the file's .dynamic section, and look for entries that have a NEEDED tag. In the following example, the dependency libm.so.2, displayed in the previous ldd(1) example, is not recorded in the file /usr/bin/cat. ldd(1) shows the total dependencies of the specified file, and libm.so.2 is actually a dependency of /lib/libc.so.1.


$ elfdump -d /usr/bin/cat
 
Dynamic Section:  .dynamic:
     index  tag                value
       [0]  NEEDED            0x211               libc.so.1
       ...

In the previous elfdump(1) example, the dependencies are expressed as simple file names. In other words, there is no `/' in the name. The use of a simple file name requires the runtime linker to generate the path name from a set of default search rules. File names that contain an embedded `/', are used as provided.

The simple file name recording is the standard, most flexible mechanism of recording dependencies. The -h option of the link-editor records a simple name within the dependency. See Naming Conventions and Recording a Shared Object Name.

Frequently, dependencies are distributed in directories other than /lib and /usr/lib, or /lib/64 and /usr/lib/64. If a dynamic executable or shared object needs to locate dependencies in another directory, the runtime linker must explicitly be told to search this directory.

You can specify an additional search path, on a per-object basis, by recording a runpath during the link-edit of an object. See Directories Searched by the Runtime Linker for details on recording this information.

A runpath recording can be displayed using elfdump(1). Reference the .dynamic entry that has the RUNPATH tag. In the following example, prog has a dependency on libfoo.so.1. The runtime linker must search directories /home/me/lib and /home/you/lib before it looks in the default location.


$ elfdump -d prog | egrep "NEEDED|RUNPATH"
       [1]  NEEDED            0x4ce               libfoo.so.1
       [3]  NEEDED            0x4f6               libc.so.1
      [21]  RUNPATH           0x210e              /home/me/lib:/home/you/lib

Another way to add to the runtime linker's search path is to set one of the LD_LIBRARY_PATH family of environment variables. This environment variable, which is analyzed once at process startup, can be set to a colon-separated list of directories. These directories are searched by the runtime linker before any runpath specification or default directory.

These environment variables are well suited to debugging purposes, such as forcing an application to bind to a local dependency. In the following example, the file prog from the previous example is bound to libfoo.so.1, found in the present working directory.


$ LD_LIBRARY_PATH=. prog

Although useful as a temporary mechanism of influencing the runtime linker's search path, the use of LD_LIBRARY_PATH is strongly discouraged in production software. Any dynamic executables that can reference this environment variable will have their search paths augmented. This augmentation can result in an overall degradation in performance. Also, as pointed out in Using an Environment Variable and Directories Searched by the Runtime Linker, LD_LIBRARY_PATH affects the link-editor.
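The search order described so far (LD_LIBRARY_PATH directories, then any runpath, then the default directories) can be summarized with a small Python sketch. The function candidate_paths and its parameters are hypothetical names chosen for illustration. The sketch simply skips empty list entries and ignores configuration files, so it should not be read as the exact behavior of ld.so.1.

```python
import posixpath

# Default directories as described in the text, keyed by ELF class.
DEFAULT_DIRS = {32: ["/lib", "/usr/lib"], 64: ["/lib/64", "/usr/lib/64"]}


def candidate_paths(name, ld_library_path="", runpath="", elf_class=32):
    """List the path names tried for dependency `name`, in search order."""
    if "/" in name:
        return [name]  # names with an embedded '/' are used as provided
    # Environment directories come first, then the recorded runpath,
    # then the defaults. Empty entries are dropped (a simplification).
    dirs = [d for d in ld_library_path.split(":") if d]
    dirs += [d for d in runpath.split(":") if d]
    dirs += DEFAULT_DIRS[elf_class]
    return [posixpath.join(d, name) for d in dirs]


if __name__ == "__main__":
    for path in candidate_paths(
        "libfoo.so.1",
        ld_library_path=".",
        runpath="/home/me/lib:/home/you/lib",
    ):
        print(path)
```

With the settings from the earlier examples, the copy of libfoo.so.1 in the present working directory is tried before the runpath directories /home/me/lib and /home/you/lib, and the default locations are tried last.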

Environmental search paths can result in a 64–bit executable searching a path that contains a 32–bit library that matches the name being looked for, or the other way around. The runtime linker rejects the mismatched 32–bit library and continues its search looking for a valid 64–bit match. If no match is found, an error message is generated. This rejection can be observed in detail by setting the LD_DEBUG environment variable to include the files token. See Debugging Library.


$ LD_LIBRARY_PATH=/lib/64 LD_DEBUG=files /usr/bin/ls
...
00283: file=libc.so.1;  needed by /usr/bin/ls
00283: 
00283: file=/lib/64/libc.so.1  rejected: ELF class mismatch: 32-bit/64-bit
00283: 
00283: file=/lib/libc.so.1  [ ELF ]; generating link map
00283:     dynamic:  0xef631180  base:  0xef580000  size:      0xb8000
00283:     entry:    0xef5a1240  phdr:  0xef580034  phnum:           3
00283:      lmid:           0x0
00283: 
00283: file=/lib/libc.so.1;  analyzing  [ RTLD_GLOBAL  RTLD_LAZY ]
...

If a dependency cannot be located, ldd(1) indicates that the object cannot be found. Any attempt to execute the application results in an appropriate error message from the runtime linker.


$ ldd prog
        libfoo.so.1 =>   (file not found)
        libc.so.1 =>     /lib/libc.so.1
        libm.so.2 =>     /lib/libm.so.2
$ prog
ld.so.1: prog: fatal: libfoo.so.1: open failed: No such file or directory

Configuring the Default Search Paths

The default search paths used by the runtime linker are /lib and /usr/lib for 32–bit applications. For 64–bit applications, the default search paths are /lib/64 and /usr/lib/64. These search paths can be administered using a runtime configuration file created by the crle(1) utility. This file is often a useful aid for establishing search paths for applications that have not been built with the correct runpaths.

A configuration file can be constructed in the default location /var/ld/ld.config, for 32–bit applications, or /var/ld/64/ld.config, for 64–bit applications. This file affects all applications of the respective type on a system. Configuration files can also be created in other locations, and the runtime linker's LD_CONFIG environment variable used to select these files. This latter method is useful for testing a configuration file before installing the file in the default location.

Dynamic String Tokens

The runtime linker allows for the expansion of various dynamic string tokens. These tokens are applicable for filter, runpath and dependency definitions.

Relocation Processing

After the runtime linker has loaded all the dependencies required by an application, the linker processes each object and performs all necessary relocations.

During the link-editing of an object, any relocation information supplied with the input relocatable objects is applied to the output file. However, when creating a dynamic executable or shared object, many of the relocations cannot be completed at link-edit time. These relocations require logical addresses that are known only when the objects are loaded into memory. In these cases, the link-editor generates new relocation records as part of the output file image. The runtime linker must then process these new relocation records.

For a more detailed description of the many relocation types, see Relocation Types (Processor-Specific). Two basic types of relocation exist.

  • Non-symbolic relocations

  • Symbolic relocations

The relocation records for an object can be displayed by using elfdump(1). In the following example, the file libbar.so.1 contains two relocation records that indicate that the global offset table, or .got section, must be updated.


$ elfdump -r libbar.so.1

Relocation Section:  .rel.got:
    type                               offset             section       symbol
  R_SPARC_RELATIVE                    0x10438             .rel.got  
  R_SPARC_GLOB_DAT                    0x1043c             .rel.got      foo

The first relocation is a simple relative relocation, as can be seen from its relocation type and the fact that no symbol is referenced. This relocation needs to use the base address at which the object is loaded into memory to update the associated .got offset.

The second relocation requires the address of the symbol foo. To complete this relocation, the runtime linker must locate this symbol from either the dynamic executable or from one of its dependencies.
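The two relocations can be illustrated with a simplified Python model. This is a sketch under stated assumptions: memory is modeled as a dictionary, the load base, the addend, and the address of foo are invented values, and the record layout is a hypothetical stand-in for the real relocation entries. Only the two offsets come from the elfdump(1) output above.

```python
def apply_relocations(memory, relocs, base, lookup):
    """Apply simplified relocation records to `memory` (address -> value)."""
    for rel in relocs:
        addr = base + rel["offset"]  # offsets are relative to the load base
        if rel["type"] == "RELATIVE":
            # Non-symbolic: only the load base is needed.
            memory[addr] = base + rel.get("addend", 0)
        elif rel["type"] == "GLOB_DAT":
            # Symbolic: the address of the symbol must be looked up.
            memory[addr] = lookup(rel["symbol"])
        else:
            raise ValueError("unsupported relocation type: " + rel["type"])
    return memory


# Offsets taken from the example output; the addend is an assumption.
RELOCS = [
    {"type": "RELATIVE", "offset": 0x10438, "addend": 0x10400},
    {"type": "GLOB_DAT", "offset": 0x1043C, "symbol": "foo"},
]

if __name__ == "__main__":
    base = 0xFF000000                # hypothetical load address
    symbols = {"foo": 0xFF2A0400}    # hypothetical symbol address
    got = apply_relocations({}, RELOCS, base, symbols.__getitem__)
    for addr, value in sorted(got.items()):
        print(hex(addr), "->", hex(value))
```

The first record is completed from the load address alone, while the second cannot be completed until a symbol lookup has found foo.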

Relocation Symbol Lookup

The runtime linker is responsible for searching for symbols that are required by objects at runtime. Typically, users become familiar with the default search model that is applied to a dynamic executable and its dependencies, and to the objects obtained through dlopen(3C). However, more complex flavors of symbol lookup can result because of the symbol attributes of an object, or through specific binding requirements.

Two attributes of an object affect symbol lookup. The first attribute is the requesting object's symbol search scope. The second attribute is the symbol visibility offered by each object within the process.

These attributes can be applied as defaults at the time the object is loaded. These attributes can also be supplied as specific modes to dlopen(3C). In some cases, these attributes can be recorded within the object at the time the object is built.

An object can define a world search scope, and/or a group search scope.

world

The object can search for symbols in any other global object within the process.

group

The object can search for symbols in any object of the same group. The dependency tree created from an object obtained with dlopen(3C), or from an object built using the link-editor's -B group option, forms a unique group.

An object can define that any of the object's exported symbols are globally visible or locally visible.

global

The object's exported symbols can be referenced from any object that has world search scope.

local

The object's exported symbols can be referenced only from other objects that make up the same group.
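The interaction of search scope and symbol visibility can be expressed as a small predicate. The following Python sketch is a simplification: the class Obj, its fields, and the function can_bind are hypothetical, and each object is assumed to belong to at most one group, with group membership standing in for group search scope.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Obj:
    name: str
    world: bool = True           # object has world search scope
    global_vis: bool = True      # exported symbols are globally visible
    group: Optional[str] = None  # group the object belongs to, if any


def can_bind(requester, provider):
    """Return True if `requester` may reference symbols from `provider`."""
    # World scope reaches any object whose symbols are globally visible.
    if requester.world and provider.global_vis:
        return True
    # Group scope reaches any object in the same group, even if local.
    return requester.group is not None and requester.group == provider.group


# Hypothetical objects: an application, plus two locally visible
# objects that make up one group.
PROG = Obj("prog")
PLUGIN = Obj("plugin.so.1", global_vis=False, group="g1")
HELPER = Obj("helper.so.1", global_vis=False, group="g1")

if __name__ == "__main__":
    print(can_bind(PLUGIN, PROG))    # plugin sees the global application
    print(can_bind(PROG, PLUGIN))    # application cannot see local symbols
    print(can_bind(PLUGIN, HELPER))  # group members see each other
```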

The simplest form of symbol lookup is outlined in the next section Default Symbol Lookup. Typically, symbol attributes are exploited by various forms of dlopen(3C). These scenarios are discussed in Symbol Lookup.

An alternative model for symbol lookup is provided when a dynamic object employs direct bindings. This model directs the runtime linker to search for a symbol directly in the object that provided the symbol at link-edit time. See Direct Bindings.

Default Symbol Lookup

A dynamic executable and all the dependencies loaded with the executable are assigned world search scope, and global symbol visibility. A default symbol lookup for a dynamic executable or for any of the dependencies loaded with the executable, results in a search of each object. The runtime linker starts with the dynamic executable, and progresses through each dependency in the same order in which the objects were loaded.

ldd(1) lists the dependencies of a dynamic executable in the order in which the dependencies are loaded. For example, suppose the dynamic executable prog specifies libfoo.so.1 and libbar.so.1 as its dependencies.


$ ldd prog
        libfoo.so.1 =>   /home/me/lib/libfoo.so.1
        libbar.so.1 =>   /home/me/lib/libbar.so.1

Should the symbol bar be required to perform a relocation, the runtime linker first looks for bar in the dynamic executable prog. If the symbol is not found, the runtime linker then searches in the shared object /home/me/lib/libfoo.so.1, and finally in the shared object /home/me/lib/libbar.so.1.
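A minimal Python sketch of this default search follows. The function name lookup and the export tables are assumptions made for illustration; only the object names and their order come from the ldd(1) output above.

```python
def lookup(symbol, load_order, exports):
    """Return the first object in load order that defines `symbol`."""
    for obj in load_order:
        if symbol in exports.get(obj, ()):
            return obj  # the first occurrence satisfies the search
    return None


LOAD_ORDER = [
    "prog",
    "/home/me/lib/libfoo.so.1",
    "/home/me/lib/libbar.so.1",
]

# Hypothetical symbol tables, for illustration only.
EXPORTS = {
    "prog": {"main"},
    "/home/me/lib/libfoo.so.1": {"foo"},
    "/home/me/lib/libbar.so.1": {"bar", "foo"},
}

if __name__ == "__main__":
    print(lookup("bar", LOAD_ORDER, EXPORTS))
    print(lookup("foo", LOAD_ORDER, EXPORTS))
```

In this model the cost of a lookup grows with the number of objects that must be inspected before a definition is found.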

Note –

Symbol lookup can be an expensive operation, especially when the size of symbol names increases and the number of dependencies increases. This aspect of performance is discussed in more detail in Performance Considerations. See Direct Bindings for an alternative lookup model.

The default relocation processing model also provides for a transition into a lazy loading environment. If a symbol cannot be found in the presently loaded objects, any pending lazy loaded objects are processed in an attempt to locate the symbol. This loading compensates for objects that have not fully defined their dependencies. However, this compensation can undermine the advantages of lazy loading.

Runtime Interposition

By default, the runtime linker searches for a symbol first in the dynamic executable and then in each dependency. With this model, the first occurrence of the required symbol satisfies the search. Therefore, if more than one instance of the same symbol exists, the first instance interposes on all others.

An overview of how symbol resolution is affected by interposition is provided in Simple Resolutions. A mechanism for changing symbol visibility, and hence reducing the chance of accidental interposition is provided in Reducing Symbol Scope.

Interposition can be enforced, on a per-object basis, if an object is explicitly identified as an interposer. Any object loaded using the environment variable LD_PRELOAD or created with the link-editor's -z interpose option, is identified as an interposer. When the runtime linker searches for a symbol, any object identified as an interposer is searched after the application, but before any other dependencies.
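The effect of an explicit interposer on the search order can be sketched as follows. The function names and the export tables are hypothetical, and the sketch assumes that the interposer is loaded before any relocation has occurred. Treating libmapmalloc.so.1 as an interposer here is an assumption made for illustration.

```python
def search_order(load_order, interposers):
    """Reorder `load_order` (application first) to honor interposers."""
    app, deps = load_order[0], load_order[1:]
    promoted = [obj for obj in deps if obj in interposers]
    others = [obj for obj in deps if obj not in interposers]
    # Interposers are searched after the application, before the rest.
    return [app] + promoted + others


def resolve(symbol, order, exports):
    """Return the first object in `order` that defines `symbol`."""
    for obj in order:
        if symbol in exports.get(obj, ()):
            return obj
    return None


LOAD_ORDER = ["prog", "libc.so.1", "libmapmalloc.so.1"]

# Hypothetical symbol tables, for illustration only.
EXPORTS = {
    "prog": {"main"},
    "libc.so.1": {"malloc", "free", "printf"},
    "libmapmalloc.so.1": {"malloc", "free"},
}

if __name__ == "__main__":
    order = search_order(LOAD_ORDER, {"libmapmalloc.so.1"})
    print(order)
    print(resolve("malloc", order, EXPORTS))
    print(resolve("printf", order, EXPORTS))
```

Without the interposer marking, the plain load order would bind malloc to libc.so.1 in this model. Symbols that the interposer does not define, such as printf, are unaffected.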

The use of all of the interfaces offered by an interposer can only be guaranteed if the interposer is loaded before any process relocation has occurred. Interposers provided using the environment variable LD_PRELOAD, or established as non-lazy loaded dependencies of the application, are loaded before relocation processing starts. Interposers that are brought into a process after relocation has started are demoted to normal dependencies. Interposers can be demoted if the interposer is lazy loaded, or loaded as a consequence of using dlopen(3C). The former category can be detected using ldd(1).


$ ldd -Lr prog
        libc.so.1 =>     /lib/libc.so.1
        foo.so.2 =>      ./foo.so.2
        libmapmalloc.so.1 =>     /usr/lib/libmapmalloc.so.1
        loading after relocation has started: interposition request \
                (DF_1_INTERPOSE) ignored: /usr/lib/libmapmalloc.so.1

Note –

If the link-editor encounters an explicitly defined interposer while processing dependencies for lazy loading, the interposer is recorded as a non-lazy loadable dependency.

Direct Bindings

An object that uses direct bindings maintains the relationship between a symbol reference and the dependency that provided the definition. The runtime linker uses this information to search directly for the symbol in the associated object, rather than carry out the default symbol search model. Direct binding information can only be established to dependencies specified with the link-edit. Therefore, use of the -z defs option is recommended.

The direct binding of a symbol reference to a symbol definition can be established with one of the following mechanisms.

  • With the -B direct option. This option establishes direct bindings between the object being built and all of the object's dependencies. This option also establishes direct bindings between any symbol reference and symbol definition within the object being built.

    The use of -B direct also enables lazy loading. This enabling is equivalent to adding the option -z lazyload to the front of the link-edit command line. See Lazy Loading of Dynamic Dependencies.

  • With the -z direct option. This option establishes direct bindings from the object being built to any dependencies that follow the option on the command line. This option can be used together with the -z nodirect option, to toggle the use of direct bindings between dependencies. This option does not establish direct bindings between any symbol reference and symbol definition within the object being built.

  • With the DIRECT mapfile keyword. This keyword provides for directly binding individual symbols. See Defining Additional Symbols with a mapfile.

Direct binding can significantly reduce the symbol lookup overhead incurred by a dynamic process that has many symbolic relocations and many dependencies. This model also enables multiple symbols of the same name to be located from different objects that have been bound to directly.

Note –

Direct bindings can be disabled at runtime by setting the environment variable LD_NODIRECT to a non-null value.

The default symbol search model allows all references to a symbol to bind to one definition. Direct binding circumvents implicit interposition, as direct bindings bypass the default search model. However, any object explicitly identified as an interposer is searched before the object that supplies the symbol definition. Explicit interposers include objects loaded using the environment variable LD_PRELOAD, or objects created with the link-editor's -z interpose option. See Runtime Interposition.
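The difference between the default search and a direct binding can be sketched in Python. All object names, the export tables, and the function resolve are hypothetical. For brevity, the sketch consults explicit interposers only on the direct binding path, and it does not model what happens when the directly bound object lacks the symbol.

```python
def resolve(symbol, load_order, exports, direct_to=None, interposers=()):
    """Resolve `symbol` by default search, or by a recorded direct binding."""
    if direct_to is None:
        candidates = list(load_order)  # default model: every object, in order
    else:
        # Explicit interposers are searched before the bound object.
        candidates = [o for o in load_order if o in interposers]
        candidates.append(direct_to)
    for obj in candidates:
        if symbol in exports.get(obj, ()):
            return obj
    return None


LOAD_ORDER = ["prog", "libA.so.1", "libB.so.1", "libtrace.so.1"]

# Hypothetical symbol tables: three objects define the same name.
EXPORTS = {
    "prog": {"main"},
    "libA.so.1": {"process"},
    "libB.so.1": {"process"},
    "libtrace.so.1": {"process"},
}

if __name__ == "__main__":
    print(resolve("process", LOAD_ORDER, EXPORTS))
    print(resolve("process", LOAD_ORDER, EXPORTS, direct_to="libB.so.1"))
    print(resolve("process", LOAD_ORDER, EXPORTS, direct_to="libB.so.1",
                  interposers={"libtrace.so.1"}))
```

The default search binds to libA.so.1 because that object is loaded first. The direct binding reaches libB.so.1 without inspecting libA.so.1, unless an explicit interposer supplies the symbol.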

Some interfaces exist to provide alternative implementations of a default technology. These interfaces expect their implementation to be the only instance of that technology within a process. An example is the malloc(3C) family. There are various malloc() family implementations, and each family expects to be the only implementation used within a process. The direct binding to an interface within such a family should be avoided, otherwise more than one instance of the technology can be referenced by the same process. For example, one dependency within a process can directly bind against libc.so.1, while another dependency directly binds against libmapmalloc.so.1. The potential for inconsistent use of two different implementations of malloc() and free() is error prone.

Objects that provide interfaces that expect to be single-instance within a process, should prevent any direct binding to their interfaces. An interface can be labelled to prevent any caller from directly binding to the interface with one of the following mechanisms.

  • With the -B nodirect option. This option prohibits the direct binding to all interfaces offered by the object.

  • With the NODIRECT mapfile keyword. This keyword provides for prohibiting the direct binding to individual symbols. See Defining Additional Symbols with a mapfile.

Non-direct labelling prevents any symbol reference from directly binding to an implementation. The symbol search to satisfy the reference uses the default symbol search model. Non-direct labelling has been employed to build the various malloc() family implementations that are provided with the Solaris OS.

Note –

The NODIRECT mapfile keyword can be combined with the command line options -B direct or -z direct. Symbols that are not explicitly defined NODIRECT follow the command line directive. Similarly, the DIRECT mapfile keyword can be combined with the command line option -B nodirect. Symbols that are not explicitly defined DIRECT follow the command line directive.

When Relocations Are Performed

Relocations can be separated into two types dependent upon when the relocation is performed. This distinction arises due to the type of reference being made to the relocated offset.

  • An immediate reference

  • A lazy reference

An immediate reference refers to a relocation that must be determined immediately when an object is loaded. These references are typically to data items used by the object code, pointers to functions, and even calls to functions made from position-dependent shared objects. These relocations cannot provide the runtime linker with knowledge of when the relocated item is referenced. Therefore, all immediate relocations must be carried out when an object is loaded, and before the application gains, or regains, control.

A lazy reference refers to a relocation that can be determined as an object executes. These references are typically calls to global functions made from position-independent shared objects, or calls to external functions made from a dynamic executable. During the compilation and link-editing of any dynamic module that provides these references, the associated function calls become calls to a procedure linkage table entry. These entries make up the .plt section. Each procedure linkage table entry becomes a lazy reference with an associated relocation.

As part of the first call to a procedure linkage table entry, control is passed to the runtime linker. The runtime linker looks up the required symbol and rewrites the entry information in the associated object. Future calls to this procedure linkage table entry go directly to the function. This mechanism enables relocations of this type to be deferred until the first instance of a function is called. This process is sometimes referred to as lazy binding.
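Lazy binding can be modeled as a callable that resolves its target on first use and then remembers the result. The following Python sketch is an analogy only: the classes PltEntry and CountingResolver are hypothetical, and a real procedure linkage table entry is machine code that the runtime linker rewrites, not an object with a cached attribute.

```python
class CountingResolver:
    """Stands in for the runtime linker's symbol lookup."""

    def __init__(self, table):
        self.table = table
        self.lookups = 0  # number of lookups performed so far

    def __call__(self, symbol):
        self.lookups += 1
        return self.table[symbol]


class PltEntry:
    """Models one procedure linkage table entry with lazy binding."""

    def __init__(self, symbol, resolver):
        self.symbol = symbol
        self._resolver = resolver
        self._target = None  # not yet bound

    def __call__(self, *args):
        if self._target is None:
            # First call: control passes to the resolver, and the entry
            # is updated so that later calls go directly to the function.
            self._target = self._resolver(self.symbol)
        return self._target(*args)


if __name__ == "__main__":
    resolver = CountingResolver({"square": lambda x: x * x})
    square = PltEntry("square", resolver)
    print(square(3), square(4), resolver.lookups)
```

Two calls are made through the entry, but the resolver is consulted only once.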

The runtime linker's default mode is to perform lazy binding whenever procedure linkage table relocations are provided. This default can be overridden by setting the environment variable LD_BIND_NOW to any non-null value. This environment variable setting causes the runtime linker to perform both immediate reference and lazy reference relocations when an object is loaded. These relocations are performed before the application gains, or regains, control. For example, all relocations within the file prog together with its dependencies are processed under the following environment variable. These relocations are processed before control is transferred to the application.


$ LD_BIND_NOW=1 prog

Objects can also be accessed with dlopen(3C) with the mode defined as RTLD_NOW. Objects can also be built using the link-editor's -z now option to indicate that the object requires complete relocation processing at the time the object is loaded. This relocation requirement is also propagated to any dependencies of the marked object at runtime.

Note –

The preceding examples of immediate references and lazy references are typical. However, the creation of procedure linkage table entries is ultimately controlled by the relocation information provided by the relocatable object files used as input to a link-edit. Relocation records such as R_SPARC_WPLT30 and R_386_PLT32 instruct the link-editor to create a procedure linkage table entry. These relocations are common for position-independent code.

However, a dynamic executable is typically created from position dependent code, which might not indicate that a procedure linkage table entry is required. Because a dynamic executable has a fixed location, the link-editor can create a procedure linkage table entry when a reference is bound to an external function definition. This procedure linkage table entry creation occurs regardless of the original relocation records.

Relocation Errors

The most common relocation error occ