7

Question

Are there any resources for learning how to use assembly in Delphi?

Background Information

I've found and read some general assembly and instruction set references (x86, MMX, SSE etc.), but I'm finding it difficult to apply that information in Delphi: basic things such as how to get the value of a class property.

I would like to have the option to use assembly when optimising code.

I understand:

  • It will be difficult to beat the compiler.
  • High-level optimisation techniques (such as choosing different algorithms, caching etc.) are much more likely to increase performance by several orders of magnitude than low-level assembly optimisations.
  • Profiling is vital. I'm using Sampling Profiler for real-world performance analysis and CPU cycle counts for low-level details.

I am interested in learning how to use assembly in Delphi because:

  • It can't hurt to have another tool in the toolbox.
  • It will help with understanding the compiler generated assembly output.
  • Understanding what the compiler is doing may help with writing better-performing Pascal code.
  • I'm curious.
PhiS
Shannon Matthews
  • 3
    Just writing it in assembly is not going to make it faster. Writing it in assembly in a way that is smarter or faster than what the compiler came up with might. Getting a class property most probably isn't what takes up your time; the manipulations on the value of the property are probably what you want to attempt in assembly. – PtPazuzu Aug 17 '11 at 02:05
  • 2
    I don't think there are any Delphi-plus-assembler books. I would grab values from class properties using regular Pascal code, store the result in a local variable, and then reference that local variable in my inline assembler, for instance. Related question for general books on x86 asm: http://stackoverflow.com/questions/4845/good-x86-assembly-book – Warren P Aug 17 '11 at 02:10
  • @PtPazuzu: Yep, manipulating values is the goal. But it would help to understand the basics first. Ultimately I'm looking towards SSE as a source of potential speed-ups. Delphi doesn't take advantage of SSE instructions at all, does it? – Shannon Matthews Aug 17 '11 at 03:13
  • 1
    For optimization it's worth reading the [AMD optimization guides](http://developer.amd.com/documentation/guides/pages/default.aspx) and [the one from Intel](http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html). Without knowing such low-level hardware details (pipelines, alignment, unrolling, parallelism, prefetching, caches and so on), you won't make your code (much) faster. Be aware that in x64 you'll have to rewrite your asm (at least the non-SSE2 part), or make two versions of it. – Arnaud Bouchez Aug 17 '11 at 05:27
  • @Shannon Delphi does not produce SSE instructions. You could try FPC, which does (for floating point), or wait for XE2, which will use SSE2 for floating-point math in x64 mode. For audio processing, there are already some Delphi libraries around, like http://code.google.com/p/newac or http://www.dsp-worx.de – Arnaud Bouchez Aug 17 '11 at 05:33
  • @Arnaud: SSE2 in XE2!! Awesome, I didn't know that. I'm looking forward to XE2 more and more. :) Thanks for the AMD and Intel optimisation links. – Shannon Matthews Aug 17 '11 at 05:43
  • @Shannon - up to SSE4.2 (already in versions before XE2); no AVX yet. – PhiS Jan 24 '12 at 08:06

5 Answers

12

Here is a resource that could be helpful...

www.guidogybels.eu/docs/Using%20Assembler%20in%20Delphi.pdf

(I wanted to add this as a comment for @Glenn, but I am forced to use the Answer mechanism as I am new to this forum and don't have enough reputation...)

Bob A
  • 1
    +1 for mentioning this excellent article. I hadn't expected it still to be online, since Guido doesn't seem to be using Delphi anymore. – Rudy Velthuis Aug 17 '11 at 20:01
6

Most optimization involves creating better algorithms: that is usually where the 'order of magnitude' speed improvements come from.

The x64 assembly world is a big change from the x86 assembly world. This means that with the introduction of x64 in Delphi XE2 (very soon now), you will have to write all your assembly code twice.

Getting yourself a better algorithm in Delphi relieves you of writing that assembly code at all.

The major area where assembly can help (although smartly crafted Delphi code often helps a lot too) is low-level bit/byte twiddling, for instance when doing encryption. On the other hand, FastMM (the fast memory manager for Delphi) has almost all of its code written in Delphi.

As Marco already wrote: looking at the disassembled code is often a good start. But assembly optimizations can go very far.
An example you can use as a starting point is the SynCrypto unit, which has an option for using either Delphi or assembly code.

Jeroen Wiert Pluimers
  • +1 for "Most optimization involves creating better algorithms" - this is the main rule. – Arnaud Bouchez Aug 17 '11 at 05:23
  • I understand optimisation is usually best done through high-level means, such as algorithm choices, but I don't think that negates the worth of knowing how to use assembly in Delphi. – Shannon Matthews Aug 17 '11 at 05:47
  • Perhaps the best benefit of writing assembler code doesn't have to do with the algorithm itself (for example, a bad quicksort beats a fully optimized bubble sort every time). What it does have to do with is how that algorithm is expressed. For example, you can use CPU instructions (even x86) that Delphi doesn't make use of if they are called for (ADC is a great example). Or you can code in such a way that you keep all variables in registers. And even if you don't code in assembler, having knowledge of it can be useful in making faster Delphi code. – Glenn1234 Aug 17 '11 at 09:02
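
To make the ADC remark above concrete: ADC adds two operands plus the carry flag left by the previous addition, which is exactly what multi-word arithmetic needs. The following is a minimal sketch in C rather than Delphi, and the `wide64` type and `add_with_carry` function are hypothetical names invented for illustration. It spells out by hand the carry handling that an ADD followed by an ADC performs in two instructions on 32-bit x86.

```c
#include <stdint.h>

/* Hypothetical type: a 64-bit unsigned value held as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} wide64;

/* Add two wide64 values, propagating the carry from the low half
   into the high half (the job ADC does in hardware). */
wide64 add_with_carry(wide64 a, wide64 b)
{
    wide64 r;
    r.lo = a.lo + b.lo;                    /* corresponds to ADD */
    uint32_t carry = (r.lo < a.lo);        /* unsigned wraparound means carry out */
    r.hi = a.hi + b.hi + carry;            /* corresponds to ADC */
    return r;
}
```

A compiler may or may not turn the comparison into a real ADC; in hand-written assembler the carry flag is used directly, which is the point the comment makes.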
5

The way I read your post, you aren't looking so much for assembler resources as resources to explain how Delphi declarations are structured within memory so you can access them via assembler. This is indeed a difficult thing to find, but not impossible.

Here is a good resource I've found to begin to understand how Delphi structures its declarations. Since assembler only involves itself with discrete data addresses to CPU defined data types, you'll be fine with any Delphi structure as long as you understand it and access it properly.

The only other concern is how to interact with the Delphi procedure and function headers to get the data you want (assuming you want to do your assembler using the Delphi inline facility), but that just involves understanding of the standard function calling conventions. This and this will be useful to that end in understanding those.

Now using actual assembler (linked OBJ files) as opposed to the inline assembler is another topic, which will vary depending on the assembler chosen. You can find information on that as well, but if you have an interest you can always ask that question, too.

HTH.

Community
Glenn1234
  • Thanks Glenn! This is exactly the kind of information I'm looking for! – Shannon Matthews Aug 17 '11 at 05:53
  • Are there advantages to using linked OBJ files over inline assembly? (Should this be another question?) – Shannon Matthews Aug 17 '11 at 05:55
  • @Shannon As I understand it, the main advantage is that you might get support for instructions in the "real assembler" that you might not get in the Delphi inline assembler. Oddly enough, I never had problems with inlining calls to "real" assembler functions, either. It just depends on whether you get something you need out of the assembler product that you can't get out of Delphi itself. – Glenn1234 Aug 17 '11 at 06:12
  • FWIW, an article on using NASM as external assembler for Delphi can be found here: http://www.rvelthuis.de/articles/articles-nasm.html – Rudy Velthuis Aug 17 '11 at 08:25
4

To use BASM efficiently, you need to know both (1) how Delphi does things at a low level and (2) assembly itself. Most of the time, you will not find both of these things described in one place.

However, Dennis Christensen's BASM for beginner and this Delphi3000 article go in that direction. For more specific questions, besides Stack Overflow, Embarcadero's BASM forum is also quite useful.

PhiS
2

The simplest solution is always to code it in Pascal and look at the generated assembler.

Speed-wise, assembler is usually only a plus in tight loops; in general code there is hardly any improvement, if any at all. I have only one piece of assembler in my code, and the benefit comes from recoding a floating-point vector operation in fixed-point SSE. The saturation provided by SIMD instruction sets is an additional bonus.

Even worse, much of the ill-advised assembler code floating around the web is actually slower than the Pascal equivalents on modern processors, because processor tradeoffs have changed over time.
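
For readers unfamiliar with the saturation mentioned above, here is a rough scalar sketch in C (not Delphi, and not the code from my application). It assumes 16-bit signed fixed-point samples; `sat_add_q15` and `mix_q15` are hypothetical names made up for this illustration. It shows the clamping that a saturating SIMD add such as SSE2's `paddsw` applies to eight 16-bit lanes at once.

```c
#include <stdint.h>
#include <stddef.h>

/* Add two 16-bit fixed-point samples, clamping to the int16_t range
   instead of wrapping around on overflow. */
int16_t sat_add_q15(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;   /* widen so the sum cannot overflow */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Mix src into dst one sample at a time; a SIMD version would
   process several samples per instruction. */
void mix_q15(int16_t *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = sat_add_q15(dst[i], src[i]);
}
```

With plain wrapping arithmetic, two loud samples would overflow into a large value of the opposite sign, which is audible as a click; with saturation the result simply clips. Getting that for free per instruction is the bonus.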


Update:

Then simply load the class property in a local var in the prologue of your procedure before you enter the assembler loop, or move the assembler to a different procedure. Choose your battles.

Studying the RTL/VCL source might also yield ideas on how to access certain constructs.

Btw, not all low-level optimization is done using assembler. A lot can be done at the Pascal level with some pointer knowledge, and things like cache optimization can sometimes be done at the Pascal level too (see e.g. Cache optimization of rotating bitmaps).
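
As a rough idea of what cache optimization without assembler can look like, here is a generic tiling sketch in C. It is not the method from the linked question: it transposes a 32-bit pixel buffer in small square blocks so that both the reads and the writes stay within a few cache lines at a time. The function name `transpose_tiled` and the tile size of 16 are assumptions chosen for illustration; the best tile size depends on the CPU and has to be measured.

```c
#include <stdint.h>
#include <stddef.h>

/* Transpose a w-by-h image (row-major, h rows of w pixels) into dst
   (row-major, w rows of h pixels), working tile by tile. */
void transpose_tiled(const uint32_t *src, uint32_t *dst, size_t w, size_t h)
{
    enum { TILE = 16 };  /* assumed tile edge; tune by profiling */

    for (size_t ty = 0; ty < h; ty += TILE) {
        for (size_t tx = 0; tx < w; tx += TILE) {
            /* Clip the tile at the image border. */
            size_t y_end = (ty + TILE < h) ? ty + TILE : h;
            size_t x_end = (tx + TILE < w) ? tx + TILE : w;

            for (size_t y = ty; y < y_end; ++y)
                for (size_t x = tx; x < x_end; ++x)
                    dst[x * h + y] = src[y * w + x];
        }
    }
}
```

The same loop structure translates directly to Pascal with pointers or dynamic arrays; only the order in which memory is visited changes, not the arithmetic.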

Community
Marco van de Voort
  • I'm not interested in assembler for general code. My applications process audio in real time. I often use tight loops when working with the audio data itself. Depending on the actual application, I'm aiming for less than 2-5% CPU usage on a modern CPU (in the real-time sections only; I don't care so much about CPU usage during non-real-time sections). All "optimisations" are benchmarked as well. I don't blindly apply anything. – Shannon Matthews Aug 17 '11 at 03:03
  • Thanks Marco. That cache optimization read is interesting. +1 – Shannon Matthews Aug 17 '11 at 11:03
  • Personally I think the most relevant hint for you is in the second paragraph: change floating-point manipulation to e.g. 16-bit fixed-point SIMD with SSE. I can imagine that being relevant for your intended purpose. And of course check out projects like Audacity and other open source audio software for examples. It is really hard to come up with such things on your own, at least until you get some experience. – Marco van de Voort Aug 17 '11 at 12:11