4

I work in a mixed Mac/Windows/Linux dev team, using Maven and Git. Some of the files and folders contain a colon (:), which is valid everywhere except on Windows. Specifically, this is the jcr:content folder used by Apache Sling / Adobe AEM.

Cloning the project with Git fails on Windows because it can't create these files/folders.

Is it possible to check all the files for characters not permitted on these platforms? I want to fail the Maven build so that the developer knows to rename the folder so that it works on all platforms.

I've searched for Maven plugins, but found nothing that might do this job. If it's possible as a Git hook, that would be a suitable alternative, but I've seen nothing viable here either.

antonyh
  • What do you use to check content out of AEM and serialize it? I've never run into this problem and I'm a Windows user. When I use [Vault](https://docs.adobe.com/docs/en/aem/6-2/develop/dev-tools/ht-vlttool.html), it simply renames all relevant files to use underscores around the prefix (`_cq_dialog.xml`, `_cq_editConfig.xml`, etc.). Also, Vault tends to serialize `jcr:content` nodes as XML elements, not folders. – toniedzwiedz Feb 20 '17 at 10:42
  • As for Maven plugins, check out the [`content-package-maven-plugin`](https://docs.adobe.com/docs/en/aem/6-2/develop/dev-tools/vlt-mavenplugin.html) or [`maven-crx-plugin`](https://github.com/Cognifide/Maven-CRX-Plugin); they both use `vlt` internally. – toniedzwiedz Feb 20 '17 at 10:45
  • 1
    The AEM content has been extracted from a content package and put into `src/test/resources/SLING-INF` for use with Prosper tests. It looks like the package manager behaves differently from `vlt` – antonyh Feb 20 '17 at 11:35

3 Answers

3

In order to fail the build when a directory contains an unwanted character, you could use the Maven Enforcer Plugin, and write a custom rule that would perform this check, since there are no dedicated rules for this.

That said, you can also use the evaluateBeanshell rule for this purpose: this rule evaluates Beanshell code and fails the build if the script returns false. In this case, the rule uses FileUtils.getDirectoryNames, which is a method that returns a list of directories recursively matching include/exclude Ant-style patterns, starting from a base directory. In the following, all directories under the src directory containing a colon : in their name are matched; that list must be empty for the build to continue.

<plugin>
  <artifactId>maven-enforcer-plugin</artifactId>
  <version>1.4.1</version>
  <executions>
    <execution>
      <id>enforce-beanshell</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <evaluateBeanshell>
            <condition>org.codehaus.plexus.util.FileUtils.getDirectoryNames(new File("src"), "**/*:*", null, false).isEmpty()</condition>
          </evaluateBeanshell>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>

Plexus Utils is already a dependency of the plugin, so you don't need to add it again, but it might be preferable to still do so in case future versions don't have it. All of the paths are relative to the project's base directory, so there is no need to include it in the directory the search starts from.

Note also that this only checks directories under the src directory; if you want to check other directories as well, you can add more conditions. Furthermore, it runs at the validate phase, so if you want to check folders that are generated during the build, you'll want to use another phase.
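
For the custom-rule route mentioned at the start, the core of the check could look roughly like the Java sketch below. The class and method names (WindowsNameCheck, hasIllegalChar, findOffenders) are hypothetical and not part of Maven or the Enforcer API, and wiring this into an actual rule is left out. It assumes the characters Windows rejects in names are < > : " \ | ? * and, unlike the Beanshell condition, it looks at files as well as directories.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Maven or the Enforcer plugin.
final class WindowsNameCheck {

    // Assumed set of characters Windows rejects in a name: < > : " \ | ? *
    private static final Pattern ILLEGAL = Pattern.compile("[<>:\"\\\\|?*]");

    /** Returns true if a single path segment contains one of the rejected characters. */
    static boolean hasIllegalChar(String name) {
        return ILLEGAL.matcher(name).find();
    }

    /** Walks the tree under root and returns every path whose last segment is rejected. */
    static List<Path> findOffenders(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(p -> p.getFileName() != null)
                    .filter(p -> hasIllegalChar(p.getFileName().toString()))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to "src", like the Beanshell condition above.
        Path root = Paths.get(args.length > 0 ? args[0] : "src");
        List<Path> offenders = findOffenders(root);
        offenders.forEach(p -> System.err.println("Invalid on Windows: " + p));
        if (!offenders.isEmpty()) {
            System.exit(1); // non-zero exit so a calling build step can fail
        }
    }
}
```

Checking only the last segment of each visited path is enough here, because the walk visits every parent directory as its own entry.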

Tunaki
  • Thanks! This worked really well, I had to change `src` to `.` because it's declared in a parent pom (no source code), and I added a `` to make the error easy to understand. – antonyh Feb 20 '17 at 11:37
  • @antonyh I want it to check src, but I don't want it to fail with "basedir src does not exist". It ignored my try-catch block. How to do? – caduceus Dec 14 '20 at 09:42
  • @caduceus it would be better to ask a question specifically for this need; this question has already been answered and is centered on checking the characters in filenames rather than the existence of a folder. Create a new question, then ping me and I'll try to help – antonyh Dec 15 '20 at 10:08
  • @antonyh pls have a look at https://stackoverflow.com/questions/65307107/how-to-catch-exception-in-a-beanshell – caduceus Dec 15 '20 at 13:46
0

It's possible to use Git Hooks to block filenames:

https://github.com/t-b/git-pre-commit-hook-windows-filenames/blob/master/pre-commit

#!/bin/bash
#
# Copyright thomas dot braun aeht virtuell minus zuhause dot de,  2013
#
# A hook script to check that the to-be-committed files are valid
# filenames on a windows platform.
# Sources:
# - http://stackoverflow.com/a/62888
# - http://msdn.microsoft.com/en-us/library/aa365247.aspx
#
# To enable this hook, rename this file to "pre-commit", move it to ".git/hooks" and make it executable.

if git rev-parse --verify HEAD >/dev/null 2>&1
then
    against=HEAD
else
    # Initial commit: diff against an empty tree object
    against=$(git hash-object -t tree /dev/null)
fi

enforcecompatiblefilenames=$(git config hooks.enforcecompatiblefilenames)

# Redirect output to stderr.
exec 1>&2

if test "$enforcecompatiblefilenames" != "true"
then
  exit 0
fi

git diff --cached --name-only --diff-filter=A -z $against | while IFS= read -r -d '' filename; do
  # Non-printable characters from ASCII range 0-31 
  nonprintablechars=$(echo -n "$filename" | LC_ALL=C tr -d '[ -~]' | wc -c)

  # Illegal characters: < > : " / \ | ? *
  # We don't test for / (forward slash) here as that is even on *nix not allowed in *filename*
  illegalchars=$(echo -n "$filename" | LC_ALL=C grep -E '(<|>|:|"|\\|\||\?|\*)' | wc -c)

  # Reserved names plus possible extension, matched against whole path components
  # (so that names such as "controller" or "icon.png" are not flagged)
  # CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9
  reservednames=$(echo -n "$filename" | LC_ALL=C grep -i -E '(^|/)(CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(\.[^/]*)?(/|$)' | wc -c)

  # No trailing period or space
  trailingperiodorspace=$(echo -n "$filename" | LC_ALL=C grep -E '(\.| )$' | wc -c)

  # File name is all periods
  filenameallperiods=$(echo -n "$filename" | LC_ALL=C grep -E '^\.+$' | wc -c)

  # Check complete path length to be smaller than 260 characters
  # This test can not be really accurate as we don't know if PWD on the windows filesystem itself is not very long 
  absolutepathtoolong=0
  if test $(echo "$filename" | wc -c) -ge 260
  then
    absolutepathtoolong=1
  fi

  # debug output
  if test -n "$GIT_TRACE"
  then
    echo "File: ${filename}"
    echo nonprintablechars=$nonprintablechars
    echo illegalchars=$illegalchars
    echo reservednames=$reservednames
    echo trailingperiodorspace=$trailingperiodorspace
    echo filenameallperiods=$filenameallperiods
    echo absolutepathtoolong=$absolutepathtoolong
  fi

  if test $nonprintablechars -ne 0 \
     || test $illegalchars -ne 0 \
     || test $reservednames -ne 0 \
     || test $trailingperiodorspace -ne 0 \
     || test $filenameallperiods -ne 0 \
     || test $absolutepathtoolong -ne 0
  then
    echo "Error: Attempt to add a file name which is incompatible to windows file systems."
    echo
    echo "If you know what you are doing you can disable this"
    echo "check using:"
    echo
    echo "git config hooks.enforcecompatiblefilenames false"
    echo
    exit 1
  fi
done

The downside of this is the need to install it locally for each developer, as unfortunately not all Git repo services support server-side hooks (looking at you, GitHub).
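
Since a local hook can't be relied on for every developer, the name-based checks from the script could in principle also run on the build side. The following is only a rough Java sketch of two of them (reserved device names and a trailing period or space), applied to a single path segment. The class and method names are made up for illustration, and it assumes that a reserved name followed by any extension is still reserved.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; the names are invented for this sketch.
final class WindowsSegmentRules {

    // CON, PRN, AUX, NUL, COM1..COM9, LPT1..LPT9
    private static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("CON", "PRN", "AUX", "NUL"));

    static {
        for (int i = 1; i <= 9; i++) {
            RESERVED.add("COM" + i);
            RESERVED.add("LPT" + i);
        }
    }

    /** True if the segment, ignoring everything from the first dot on, is a reserved device name. */
    static boolean isReservedName(String segment) {
        int dot = segment.indexOf('.');
        String base = dot >= 0 ? segment.substring(0, dot) : segment;
        return RESERVED.contains(base.toUpperCase(Locale.ROOT));
    }

    /** True if the segment ends with a period or a space. */
    static boolean hasTrailingPeriodOrSpace(String segment) {
        return segment.endsWith(".") || segment.endsWith(" ");
    }

    public static void main(String[] args) {
        // Print the verdict for each segment passed on the command line.
        for (String segment : args) {
            boolean bad = isReservedName(segment) || hasTrailingPeriodOrSpace(segment);
            System.out.println(segment + " -> " + (bad ? "rejected" : "ok"));
        }
    }
}
```

Comparing the whole base name against a set, rather than searching for a substring, avoids flagging ordinary names that merely contain "con" or "aux".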

antonyh
0

Thanks for the beanshell solution, it really helped me out. I'd like to show my answer as well, because I'm also using a regular expression and some debugging messages to help understand what's going on. This works on my machine (TM):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-enforcer-plugin</artifactId>
    <version>1.4.1</version>
    <executions>
        <execution>
            <id>migration-filename-convention</id>
            <goals>
                <goal>enforce</goal>
            </goals>
            <phase>validate</phase>
            <configuration>
                <rules>
                    <evaluateBeanshell>
                        <condition>
                                List filenames = org.codehaus.plexus.util.FileUtils.getFileNames(
                                    new File("src"),
                                    "**/*.sql",
                                    null,
                                    false);

                                for (Iterator it = filenames.iterator(); it.hasNext();) {
                                    String file = it.next();
                                    print("Found SQL file: " + file);
                                    passesValidation = java.util.regex.Pattern.matches("^.+[\\/\\\\]V[0-9]{4}([0-1][0-9])([0-3][0-9])[0-9]{6}__BDV.sql$", file);
                                    if (passesValidation) {
                                        print("Filename passes validation");
                                        it.remove();
                                    } else {
                                        print("Did not pass validation");
                                    };
                                };

                                filenames.isEmpty()</condition>
                    </evaluateBeanshell>
                </rules>
                <fail>true</fail>
            </configuration>
        </execution>
    </executions>
</plugin>

The advantage here is that the beanshell code is more readable and it prints out the files it finds along the way:

[INFO]
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (migration-filename-convention) @ inventory-microservice ---
Found SQL file: main\resources\database\V20170803113900__BDV.sql
Filename passes validation
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS

This can be used to validate that SQL scripts follow a timestamp-based filename convention.

It is also bound to the validate phase of the Maven lifecycle, so mvn validate will invoke this too.
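
To try the regular expression outside of Maven, something like the following plain Java sketch can help. The pattern is copied from the rule above; the class and method names (MigrationNameCheck, followsConvention) are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are made up for this illustration.
final class MigrationNameCheck {

    // Same pattern as in the Beanshell condition: some directory prefix, a path
    // separator (/ or \), then V + 14 digits + "__BDV.sql".
    private static final Pattern CONVENTION = Pattern.compile(
            "^.+[\\/\\\\]V[0-9]{4}([0-1][0-9])([0-3][0-9])[0-9]{6}__BDV.sql$");

    /** Returns true if the relative path matches the naming convention. */
    static boolean followsConvention(String relativePath) {
        return CONVENTION.matcher(relativePath).matches();
    }

    public static void main(String[] args) {
        // Path as printed in the build output above
        System.out.println(followsConvention(
                "main\\resources\\database\\V20170803113900__BDV.sql")); // true
        // A file that ignores the convention
        System.out.println(followsConvention(
                "main/resources/database/create_tables.sql"));           // false
    }
}
```

Note that the pattern requires at least one directory in front of the file name, so a file sitting directly in the directory being scanned would not pass. It also only checks the shape of the timestamp, not whether it is a real calendar date.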

Nikolaos Georgiou