22

What is a Delphi DCU file?

I believe it stands for "Delphi Compiled Unit". Am I correct in assuming it contains object code, and therefore corresponds to an ".o" file compiled from a C/C++ source code file?

magnus
  • 1
    It is in a similar class to '.o' or '.obj' files, yes. It contains object code that can be linked. – Martin James Jul 10 '14 at 00:52
  • 1
    It's essentially used as a sort of cache during compilation. Units which have not changed don't need to be re-compiled, so the DCU is re-used in that case to speed up the compilation of the project. – Jerry Dodge Jul 10 '14 at 00:59
  • 4
    @Jerry: It's compiled (binary) code, almost precisely like a .c object file. It's not a "cache" of any sort; it can be copied to another computer and used in a new project. It's an object file. Kids nowadays, that don't remember the days of object files, libraries, and linkers. Sheesh. :-) – Ken White Jul 10 '14 at 01:32
  • 1
    @Ken Yes, I was just pointing out one of the advantages of the nature of DCU files. If you only modify one of 10 units, if the other 9 already have DCU files, only the 1 will be compiled, thus speeding up the process. Unless of course you choose to do a full build. – Jerry Dodge Jul 10 '14 at 01:47
  • 1
    @Jerry: Yeah, that's the same exact behavior of any compiled object file. :-) Object files are a necessary intermediate step between the compiler and linker, in order to put stuff in a format that the linker can then process and (with the runtime and other libraries) resolve symbols to "link" together to form an executable file. – Ken White Jul 10 '14 at 01:57
  • @Jerry I understand what you're saying. Kind of like how `make` only recompiles object files from source files if the last modified timestamp on the source file is after that on the object file. So it kind of is "cached", but only insofar as the Delphi compiler caches it for you during the build process. – magnus Jul 10 '14 at 02:22
  • 1
    Calling these "cache" files is really quite a stretch. The build process has optimizations built-in that are designed to reduce needless processes. So if none of the dependencies for a given target file have been updated, and the target file is still there, the build process (e.g., make) simply skips rebuilding it. Kinda like saying that if you don't empty your trash can every night then it's acting as a "cache" for things you might want to recycle. – David Schwartz Jul 10 '14 at 04:51

3 Answers

34

I believe .dcu generally means "Delphi Compiled Unit" as opposed to a .pas file which is simply "Pascal source code".

A .dcu file is the file that the DCC compiler produces after compiling the .pas files (.dfm files are converted to binary resources, then directly processed by the linker).

It's analogous to the .o and .obj files that other compilers produce, but contains more information on the symbols (therefore you can reverse engineer the interface section of a unit from it, minus comments and compiler directives).

A .dcu file is technically not a "cache" file, although your builds will run faster if you don't delete them, because the compiler then doesn't need to recompile units that haven't changed. A .dcu file is tied to the compiler version that generated it. In that sense it is less portable than .o or .obj files (though they have their share of compatibility problems too).
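As a rough illustration of why keeping the .dcu files around speeds up builds, here is a minimal Python sketch of a make-style timestamp check. The function name `needs_recompile` and the file names are made up for this example, and it is only an assumption about the general idea: the Delphi compiler's real rules for deciding what to rebuild are not shown here and are likely more involved.

```python
import os
import tempfile


def needs_recompile(source_path, compiled_path):
    """Return True if the compiled file is missing or older than its source.

    Hypothetical helper: a plain timestamp comparison in the style of make,
    not the Delphi compiler's actual logic.
    """
    if not os.path.exists(compiled_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(compiled_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        pas = os.path.join(folder, "Unit1.pas")  # illustrative file names
        dcu = os.path.join(folder, "Unit1.dcu")
        open(pas, "w").close()
        # No compiled unit exists yet, so a rebuild is required.
        print(needs_recompile(pas, dcu))
```

A full build simply ignores this check and recompiles everything.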

Here's some history in case it adds anything.

Compilers have traditionally translated source code languages into some intermediate form. Interpreters don't do that -- they just interpret the language directly and run the application right away. BASIC is the classic example of an interpreted language. The "command line" in DOS and Windows has a language that can be written in files called "batch files" with a .bat extension, but typing things on the command line executes them directly. In *nix environments, there are a bunch of different command-line interpreters (CLIs), such as sh, csh, bash, ksh, and so on. You can create batch files for all of them -- these are usually referred to as "scripting languages". But there are a lot of other languages now that are both interpreted and compiled.

Anyway, Java and .NET, for example, compile into something called an intermediate "byte-code" representation.

Pascal was originally written as a single-pass compiler, and Turbo Pascal (originating from PolyPascal) - with different editions for CP/M, CP/M-86 and DOS - directly generated a binary executable (COM) file that ran under those operating systems.

Pascal was originally designed as a small, efficient language intended to encourage good programming practices using structured programming and data structuring; Turbo Pascal 1 was originally designed as an IDE with a very fast built-in compiler, and as an affordable competitor in the DOS and CP/M market against the long edit/compile/link cycles of that time. Turbo Pascal and Pascal had limitations similar to those of any programming environment back then: memory and disk space were measured in kilobytes, processor speeds in megahertz.

Generating an executable binary directly prevented you from linking to separately compiled units and libraries.

Before Turbo Pascal, there was the UCSD p-System operating system, which supported many languages, including Pascal (the UCSD Pascal compiler back then had already extended the Pascal language with units). It compiled into a pseudo-machine byte-code format (called p-code) that allowed linking multiple units together. It was slow, though.

Meanwhile, C evolved in VAX and Unix environments, and it compiled into .o files, which meant "object code" as opposed to "source code". Note: this is totally unrelated to anything we call "objects" today.

Turbo Pascal up to and including version 3 directly generated .com binary output files (although you could supplement those with overlay files), and as of version 4 supported separating code into units that were first compiled into .tpu files before being linked into the final executable binary. The Turbo C compiler generated .obj (object code) files rather than byte-codes, and Delphi 2 introduced .obj file generation in order to co-operate with C++ Builder.

Object files use relative addressing within each unit, and require what's called "fix-ups" (or relocation) later on to make them run. Fix-ups point to symbolic labels that are expected to exist in other object files or libraries.

There are two kinds of "fix-ups". The first is done statically by a tool called a "linker". The linker takes a bunch of object files and stitches them together into something analogous to a patchwork quilt. It then "fixes up" all of the relative references by plugging in pointers to all of the externally-defined labels.

The second kind of fix-up is done dynamically when the program is loaded to run. It's done by something called the "loader", but you never see that. When you type a command on the command line, the loader is called to load an EXE file into memory and fix up the remaining links based on where the file is loaded, and then control is transferred to the entry point of the application.
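To make the static (linker) half of this concrete, here is a toy Python sketch. Everything in it is invented for illustration: the dictionary layout with `code`, `symbols` and `fixups`, the `link` function, and the sample "object files" are assumptions, not the real .obj or .dcu formats. It only shows the idea of laying units out one after another and then patching placeholder slots with the final addresses of the labels they refer to.

```python
def link(objects):
    """Concatenate toy object files and patch their fix-ups.

    Each object is a dict (hypothetical layout) with:
      code    - list of ints; a 0 placeholder sits where an address is needed
      symbols - {label: offset inside this object's code}
      fixups  - [(offset inside this object's code, label it refers to)]
    """
    image = []
    symbol_table = {}
    pending = []

    # Pass 1: place each object after the previous one and turn its
    # unit-relative offsets into absolute addresses within the image.
    for obj in objects:
        base = len(image)
        for label, offset in obj["symbols"].items():
            if label in symbol_table:
                raise ValueError(f"duplicate symbol: {label}")
            symbol_table[label] = base + offset
        for offset, label in obj["fixups"]:
            pending.append((base + offset, label))
        image.extend(obj["code"])

    # Pass 2: plug the resolved addresses into the placeholder slots.
    for address, label in pending:
        if label not in symbol_table:
            raise KeyError(f"unresolved symbol: {label}")
        image[address] = symbol_table[label]

    return image, symbol_table


# "main" refers to a label that is defined in another unit.
main_obj = {"code": [10, 0, 99], "symbols": {"main": 0}, "fixups": [(1, "helper")]}
helper_obj = {"code": [20, 21], "symbols": {"helper": 0}, "fixups": []}

image, symbols = link([main_obj, helper_obj])
print(image)    # slot 1 now holds the final address of "helper"
print(symbols)
```

The load-time fix-ups work on the same principle, except that the loader adds the actual load address to the slots recorded in the executable.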

So .dcu files originated as .tpu files when Borland introduced units in Turbo Pascal, then changed extension with the introduction of Delphi. They are very different from .obj files, though you can link to .obj files from Turbo Pascal and Delphi.

Delphi also hid the linker entirely, so you just do a compile and a run. All of the linker settings are still there, however, in one of Delphi's options panes.

Jeroen Wiert Pluimers
David Schwartz
  • 6
    +1. However, while this is interesting "history", the actual answer to the question is "Yes, they're Delphi Compiled Units, which are almost exactly the same as C/C++ object files". That should have come first, and then the history lesson (so that people can clearly find the answer, even if they're not interested in the history). – Ken White Jul 10 '14 at 01:35
  • Got interrupted between writing my comment and actually clicking the +1. :-) Sorry about that; it's done now. Thanks for the edit - it makes things much clearer. – Ken White Jul 10 '14 at 01:58
  • +1 @DavidSchwartz, as Ken said, I was just looking for a "Yes, .DCU ~= .O/.OBJ" or "No, .DCU != .O/.OBJ", but a well written history lesson nonetheless. =) – magnus Jul 10 '14 at 02:25
  • 5
    "DCU" files were introduced as "TPU" files in Turbo Pascal 4.0. Turbo Pascal never produced or "linked" object files. Object file creation was introduced in Delphi 2 when the new 32bit compiler was delivered. – Allen Bauer Jul 10 '14 at 03:09
  • Thanks for clarifying that @AllenBauer, as I never used TurboPascal and don't recall what Delphi V1.0 actually produced. Prior to using Delphi, I had been using Borland C++ for many years, and it DID compile to .obj files. – David Schwartz Jul 10 '14 at 04:46
  • 3
    Yah, @user1420752, it is a vague and proprietary **object file** format, like OMF and COFF you are familiar with. Some reverse engineering has been done on it http://hmelnov.icc.ru/DCU/FAQ.htm – Free Consulting Jul 10 '14 at 05:00
  • 2
    It was UCSD P-System, not USCS Pascal and the original Turbo Pascal (up to version 3) generated COM files until the introduction of units in Turbo Pascal 4. – Andy_D Jul 10 '14 at 07:52
  • Oops, typo. Yes, UCSD ... but I've always heard it referred to as "UCSD Pascal". And yes, those early compilers did generate COM files rather than EXEs. I think it wasn't until linkers started being used that EXEs came on the scene, because COM files didn't require load-time fixups -- they just got loaded into memory at whatever starting address was specified and fired up. That's why TSRs were all COM files. – David Schwartz Jul 10 '14 at 09:38
  • 3
    The line "_Pascal was originally written as a single-pass compiler, and it generated an executable (COM or EXE) file directly that ran under DOS_." is somewhat flawed. Pascal predates DOS by about 10 years. – Disillusioned Jul 10 '14 at 15:08
  • True, although I was referring to PC environments. I first learned to program on a BASIC time-sharing system, and FORTRAN with punch-cards. Then in college I used big UNIVAC mainframes with punch-cards and PDP-11/05s and LSI-11s running FORTRAN. I remember TECO. I don't remember what I did to compile or link anything, but stuff ran. We were taught Algol rather than Pascal. I never really used Pascal much until Delphi came out, although I puttered around with UCSD Pascal a bit -- it was just way too slow. – David Schwartz Jul 10 '14 at 17:46
8

In addition to David Schwartz's answer, there is one case where a dcu is actually quite different from the typical obj files generated in other languages: generic type definitions. If a generic type is defined in a Delphi unit, the compiler compiles this code into a syntax tree representation rather than into machine code. This syntax tree representation is then stored in the dcu file. When the generic type is used and instantiated in another unit, the compiler takes this representation and "merges" it with the syntax tree of the unit using the generic type. You could think of this as being somewhat analogous to method inlining. This, btw, is also the reason why a unit that makes heavy use of generics takes much longer to compile, even though the generic types are "linked in" from a dcu file.

iamjoosy
  • 1
    The COFF file format allows compiler and tool writers to add custom fields to it that are ignored by the rest of the toolchain. So you could easily inject things like template (C++) or generic (.NET/Delphi) class definitions into the standard object files with no ill effects on other tools. I wonder if C++ Builder does the same thing that Delphi does in this respect? – David Schwartz Jul 10 '14 at 09:30
  • 2
    @David, there is a huge difference between C++ templates and Delphi generics. The former is actually a real source code substitution and as such requires the template source code to be available when it is used in another file, while Delphi generics are precompiled and the source code is not needed when they are used. Assuming that C++ Builder is compatible with C++ standards, I would assume that C++ Builder handles templates very differently than Delphi generics. – iamjoosy Jul 10 '14 at 09:39
1

A Delphi Compiled Unit contains object code, and pre-compiled headers, and is therefore somewhat comparable to both an obj file and a .pch / .gch file.

The 'interface' section of a Delphi source file corresponds to the header, and the 'implementation' section creates the object code.

Pre-compiled header files may significantly reduce compilation and link time. The DCU header section provides link information to other referenced units, which therefore does not have to be rediscovered.

In the Delphi / Turbo Pascal environment, pre-compiled headers support strict type checking, which would have required referencing the source code if an object file format like .coff or .obj had been used. (In C++, name mangling provides a similar but less complete function.)

david