I need to parse an IDML file and save the images from it separately, in formats suitable for the Web. Can I do that with IDMLlib? If so, can you show me some examples? P.S. The documentation for that library is awful, and the examples are horrible.
2 Answers
Yes, you can do it with IDMLlib, or by writing your own IDML parser (which is what I've done).
Images in IDML can be either embedded or linked. To extract an embedded image, you need to find the `Contents` node, as Jongware has described.
Here is an example of the IDML for an image that is not embedded:
```xml
<Image ItemTransform="1 0 0 1 -32.04 -35.04" Self="uf4" Name="$ID/" Visible="true" AppliedObjectStyle="ObjectStyle/$ID/[None]" GradientFillHiliteAngle="0" GradientFillHiliteLength="0" LocalDisplaySetting="Default" GradientFillAngle="0" GradientFillLength="0" GradientFillStart="0 0" VerticalLayoutConstraints="FlexibleDimension FixedDimension FlexibleDimension" HorizontalLayoutConstraints="FlexibleDimension FixedDimension FlexibleDimension" OverriddenPageItemProps="" LastUpdatedInterfaceChangeCount="" TargetInterfaceChangeCount="" ParentInterfaceChangeCount="" ImageTypeName="$ID/JPEG" ImageRenderingIntent="UseColorSettings" EffectivePpi="300 300" ActualPpi="300 300" Space="$ID/#Links_RGB">
  <Properties>
    <Profile type="string">$ID/None</Profile>
    <GraphicBounds Right="64.08" Left="0" Bottom="70.08" Top="0"/>
  </Properties>
  <TextWrapPreference TextWrapMode="None" TextWrapSide="BothSides" ApplyToMasterPageOnly="false" Inverse="false">
    <Properties>
      <TextWrapOffset Right="0" Left="0" Bottom="0" Top="0"/>
    </Properties>
    <ContourOption ContourPathName="$ID/" IncludeInsideEdges="false" ContourType="SameAsClipping"/>
  </TextWrapPreference>
  <Link Self="uf7" LinkResourceSize="0~6561" LinkImportTime="2012-09-03T15:23:30" LinkImportModificationTime="2012-05-22T15:25:15" LinkImportStamp="file 129821703152428740 25953" ExportPolicy="NoAutoExport" ImportPolicy="NoAutoImport" CanPackage="true" CanUnembed="true" CanEmbed="true" ShowInUI="true" LinkObjectModified="false" LinkResourceModified="false" LinkClientID="257" LinkClassID="35906" StoredState="Normal" LinkResourceFormat="$ID/JPEG" LinkResourceURI="file:D:/Pictures/hkp.jpg" AssetID="$ID/" AssetURL="$ID/"/>
  <ClippingPathSettings IncludeInsideEdges="false" Index="-1" AppliedPathName="$ID/" InsetFrame="0" Tolerance="2" Threshold="25" UseHighResolutionImage="true" RestrictToFrame="false" InvertPath="false" ClippingType="None"/>
  <ImageIOPreference AlphaChannelName="$ID/" AllowAutoEmbedding="true" ApplyPhotoshopClippingPath="true"/>
</Image>
```
To find the image, you need to find the `Link` node that is a child of the `Image` node, and extract the value of the `LinkResourceURI` attribute, which is the path to the image. This is a local path, so you need to do all this on the same machine the IDML was authored on.
For an IDML document to be portable between machines, you need to embed the images using the Links panel in InDesign.
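
As a rough illustration of the lookup described above, here is a minimal Python sketch that uses only the standard library rather than IDMLlib. The function name `linked_image_uris` and the trimmed-down `SAMPLE` markup are made up for this example; the element and attribute names (`Image`, `Link`, `LinkResourceURI`, `StoredState`) are the ones shown in the IDML above. It assumes the `Link` node is a direct child of `Image`, as in that sample.

```python
import xml.etree.ElementTree as ET


def linked_image_uris(spread_xml):
    """Return the LinkResourceURI of every linked (non-embedded) <Image>."""
    root = ET.fromstring(spread_xml)
    uris = []
    for image in root.iter("Image"):
        link = image.find("Link")  # direct child of <Image>, as in the sample
        if link is None:
            continue
        # Per the thread, embedded images carry StoredState="Embedded".
        if link.get("StoredState") == "Embedded":
            continue
        uri = link.get("LinkResourceURI")
        if uri:
            uris.append(uri)
    return uris


# Hypothetical, heavily trimmed spread content based on the example above.
SAMPLE = """<Spread>
  <Rectangle Self="uf3">
    <Image Self="uf4" ImageTypeName="$ID/JPEG">
      <Link Self="uf7" StoredState="Normal"
            LinkResourceURI="file:D:/Pictures/hkp.jpg"/>
    </Image>
  </Rectangle>
</Spread>"""

print(linked_image_uris(SAMPLE))
```

The returned values are still URIs pointing at the authoring machine, so the files can only be copied if those paths are reachable from where the script runs.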

-
Can you please give an example for an embedded image? I am able to extract the CDATA from the Contents node, but I am not sure how I can convert and store this data as a jpeg/png image. Basically what I have is <![CDATA[/9j/4R ......... AEn/ AOj9T89fNaSSn//Z]]> in a string which is extracted from the Contents node. How can I write this data to a file so that it is a valid image that can be viewed directly? – TheRock Jan 08 '14 at 09:24
-
What technology are you using? In .NET it is as simple as http://stackoverflow.com/a/5400225/1014822. I expect most other development paradigms have similar tooling. – Jude Fisher Jan 08 '14 at 12:42
-
I need to do this in Java using the IDML tools library which comes along with the InDesign SDK. – TheRock Jan 08 '14 at 13:12
-
Please let me know whether my approach is correct. As I told you earlier I have extracted the entire content of the Contents node in a Java String, after that I am replacing any "<![CDATA[" or "]]>" with "". This gives me only the base64 encoded data. After this I can directly use a base64 decoder to decode this string and write the output to a file, and save it with a jpg/png extension. – TheRock Jan 08 '14 at 13:56
-
Yes, that should be right. Once you've extracted that string, it isn't really an IDML issue though: you should post a separate question tagged for Java and base64. I don't know anything at all about Java... – Jude Fisher Jan 08 '14 at 15:19
IDML files, very famously, do not 'contain' the base-64 encoded image data for linked images, only for embedded ones. For linked images only their physical locations on the original machine are stored.
Embedded images are found inside "Spread_uXX.xml" files, in an `<Image>` tag. This tag contains the image dimensions and some other meta-information, and a sub-tag `<Contents>` that lists the CDATA in Base-64. Be warned: there may be more than a single block of CDATA for each image.
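
To make that concrete, here is a minimal Python sketch using only the standard library (not IDMLlib). It relies on an IDML file being a ZIP package whose spread files sit under `Spreads/`. The helper names `iter_spreads` and `embedded_payloads` and the demo package are invented for this example, the namespace URI in the demo is only there to mimic a real spread file, and the code assumes that several CDATA blocks are consecutive pieces of one Base-64 stream.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def iter_spreads(idml):
    """Yield (name, root element) for each spread file in an IDML package.

    `idml` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(idml) as package:
        for name in package.namelist():
            if name.startswith("Spreads/") and name.endswith(".xml"):
                yield name, ET.fromstring(package.read(name))


def embedded_payloads(spread_root):
    """Yield (Self id, Base-64 text) for each <Image> that carries <Contents>."""
    for image in spread_root.iter("Image"):
        # Assumption: several blocks are consecutive pieces of one stream.
        text = "".join(node.text or "" for node in image.iter("Contents"))
        payload = "".join(text.split())  # drop line breaks and other whitespace
        if payload:
            yield image.get("Self"), payload


# Tiny in-memory package for demonstration; the payload is a placeholder,
# split over two CDATA blocks to mirror the warning above.
DEMO_SPREAD = (
    '<idPkg:Spread xmlns:idPkg="http://ns.adobe.com/AdobeInDesign/idml/1.0/packaging">'
    '<Spread Self="ub6"><Rectangle Self="uf3"><Image Self="uf4">'
    "<Properties><Contents><![CDATA[QUJD]]><![CDATA[REVG]]></Contents></Properties>"
    "</Image></Rectangle></Spread></idPkg:Spread>"
)
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as package:
    package.writestr("Spreads/Spread_ub6.xml", DEMO_SPREAD)
demo.seek(0)

for name, root in iter_spreads(demo):
    print(name, list(embedded_payloads(root)))
```

Searching with `iter("Contents")` rather than a fixed path keeps the sketch independent of how deeply the `<Contents>` tag is nested inside `<Image>`.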
The type of embedded images may or may not be the same as the original; the `Image` tag should declare the type in an attribute `ImageTypeName`. If the file format is not one you can use 'for the web', you need to convert it yourself.
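
Once the Base-64 text has been pulled out of `<Contents>`, turning it into a file is a plain decode-and-write. The Python sketch below is one possible way to do it, not part of IDMLlib: the names `web_extension` and `save_embedded_image` are hypothetical, and instead of mapping `ImageTypeName` values (only `$ID/JPEG` appears in this thread) it checks the well-known leading bytes of JPEG, PNG and GIF data to decide whether the result is already web-ready.

```python
import base64
from pathlib import Path

# Leading bytes of formats that browsers display directly.
WEB_SIGNATURES = {
    b"\xff\xd8\xff": ".jpg",
    b"\x89PNG\r\n\x1a\n": ".png",
    b"GIF87a": ".gif",
    b"GIF89a": ".gif",
}


def web_extension(data):
    """Return a file extension if `data` starts like a web format, else None."""
    for signature, extension in WEB_SIGNATURES.items():
        if data.startswith(signature):
            return extension
    return None


def save_embedded_image(payload, stem, out_dir="."):
    """Decode Base-64 `payload` and write it as <out_dir>/<stem><extension>."""
    data = base64.b64decode(payload)
    extension = web_extension(data)
    if extension is None:
        # Not a format handled here (TIFF, PSD, ...): convert it separately.
        raise ValueError("decoded data is not JPEG, PNG or GIF")
    target = Path(out_dir) / (stem + extension)
    target.write_bytes(data)
    return target
```

A call such as `save_embedded_image(payload, "uf4", "out")` would write `out/uf4.jpg` for JPEG data, provided the `out` directory already exists. Anything that raises `ValueError` still needs an image library or an external tool for the conversion.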
I don't use IDMLlib so I cannot comment on the style of its examples.

-
"IDML files, very famously, do not include linked images, only embedded ones." This is not right. IDML files can have either linked or embedded images. When an image is linked, the path is stored in the `LinkResourceURI` of the `<Link>` node within the `<Image>` node, and the `StoredState` attribute of the `<Link>` node is set to 'Normal' (as opposed to 'Embedded'). – Jude Fisher Jul 25 '13 at 14:12
-
OP can locate the images, if the operation is on the same machine. Nonetheless, the statement you made is incorrect, and SO is a valuable reference so it's important these things are noted and, where possible, corrected. – Jude Fisher Jul 25 '13 at 17:23
-
@JcFx, IDML files do not include the actual images for linked images, just links to them. So if the operation is on a different machine there is no way to access the linked images (unless the link URI is a network address). – Thayne Jan 06 '15 at 21:13
-
@Thayne Yes, I know. That's why my comment says 'if the operation is on the same machine'. Still more accurate than stating IDML files 'do not include linked images' IMO. – Jude Fisher Jan 07 '15 at 21:30
-
@JcFx: okay, I refined that statement to describe the status of linked files as well (which will not help the OP, but may clarify this for others). – Jongware Jan 08 '15 at 00:06
-
@Jongware Thanks - I wasn't trying to re-open an old debate - but was replying to Thayne's recent comment. I've proposed an edit as well - otherwise there's a chance of confusion between the IDML 'Image' object and the actual image data. – Jude Fisher Jan 09 '15 at 16:59