0

I am attempting to index my movie collection and in doing so have run across an issue where at least one title is skipped in the import phase due to special characters. The code skips over "Æon Flux" due to it starting with Æ. Would anyone know how to correct this, please?

Clear-Host

# Variables:
$movie_dir = "K:\Movies"

# Because reasons...
$PSDefaultParameterValues['*:Encoding'] = 'utf8'

# Connect to the library MySQL.Data.dll
Add-Type -Path 'C:\Program Files (x86)\MySQL\Connector NET 8.0\Assemblies\v4.8\MySql.Data.dll'
 
# Create a MySQL Database connection variable that qualifies:
$Connection = [MySql.Data.MySqlClient.MySqlConnection]@{ConnectionString='server=127.0.0.1;uid=username;pwd=password;database=media'}
$Connection.Open()

# Drop the table to clear all entries.
$sql_drop_table = New-Object MySql.Data.MySqlClient.MySqlCommand
$sql_drop_table.Connection = $Connection
$sql_drop_table.CommandText = 'DROP TABLE Movies'
$sql_drop_table.ExecuteNonQuery() | Out-Null

# (Re)create the table.
$sql_create_table = New-Object MySql.Data.MySqlClient.MySqlCommand
$sql_create_table.Connection = $Connection
$sql_create_table.CommandText = 'create table Movies(movie_id INT NOT NULL AUTO_INCREMENT, movie_title VARCHAR(255) NOT NULL, movie_file_date INT, movie_IMDB_id INT, PRIMARY KEY (movie_id))'
$sql_create_table.ExecuteNonQuery() | Out-Null

$movies = Get-ChildItem $movie_dir -File -include *.mp4 -Recurse -Depth 1 |
    Select-Object -ExpandProperty FullName |
    Sort-Object |
    Get-Unique |
    where{$_ -ne ""}

foreach ($movie in $movies)
{
    # .net function to get just the filename (movie title).
    $title = [System.IO.Path]::GetFileNameWithoutExtension($movie)
    # Get the creation date of the movie and reformat it to yearmonthday.
    $add_date = (Get-ChildItem $movie).CreationTime.toString("yyyyMMdd")

    $query = "INSERT INTO Movies(movie_id, movie_title, movie_file_date) VALUES(NULL, @title, $add_date)"
    $command = $connection.CreateCommand()
    $command.CommandText = $query
    # Sanatize single quotes in filenames for input.
    $command.Parameters.AddWithValue("@title", $title) | Out-Null
    $command.ExecuteNonQuery() | Out-Null
}

# Close the MySQL connection.
$Connection.Close()

Write-Host
Write-Host("Added") $movies.Count ("movies.")
Lance U. Matthews
  • 15,725
  • 6
  • 48
  • 68
graih
  • 30
  • 3
  • You're saying "Æon Flux" isn't sorting where you expect it to or it's not getting processed at all? Is that path present in `$movies`? If you change the `Æ` to an ASCII character does it work? You haven't described the structure of `K:\Movies` or listed any input files so, presumably, you must have a non-hidden (due to lack of `-Force`) `Æon Flux.mp4` in `K:\Movies` or one directory below. Also, `ExecuteNonQuery()` returns the number of rows affected by the command; instead of unconditionally discarding it with `| Out-Null` you should check that it returns `1`. – Lance U. Matthews Apr 15 '22 at 09:38
  • The movies directory contains 27 sub-directories (0-9, and then one each for A to Z.) When I run the code without any database calls and redirect the output to a text file the problem child is output as "’on Flux (2005)" where the AE character is changed to a single quote. I'll update the output of the query to check for a return status. Thanks! – graih Apr 15 '22 at 10:18

2 Answers2

1

I don't think it is the Get-ChildItem that skips the file with that special character. More likely, you need to tell your MySql to use UTF-8.
For that, have a look at How to make MySQL handle UTF-8 properly

As for your code, I would change this:

$movies = Get-ChildItem $movie_dir -File -include *.mp4 -Recurse -Depth 1 |
    Select-Object -ExpandProperty FullName |
    Sort-Object |
    Get-Unique |
    where{$_ -ne ""}

into

$movies = Get-ChildItem -Path $movie_dir -File -Filter '*.mp4' -Recurse -Depth 1 | Sort-Object -Property FullName

and work with the FileInfo objects from there on:

foreach ($movie in $movies) {
    $title = $movie.BaseName
    # Get the creation date of the movie and reformat it to yearmonthday.
    $add_date = '{0}:yyyyMMdd}' -f $movie.CreationTime
    . . .
}
Theo
  • 57,719
  • 8
  • 24
  • 41
  • 1
    It seems it was a bit of both. The database table was a mixture of UTF-8 and Latin for some reason, so I corrected that and now all is well. I've incorporated your suggestions into the newest version, thanks again. – graih Apr 15 '22 at 10:52
0

Though Æ is not an ASCII character it is not otherwise "special", so I edited the question title and tags to reflect that.

ExecuteNonQuery() returns the number of rows affected by the command; in the case of $command, it's the number of rows inserted. You are discarding this value, however...

$command.ExecuteNonQuery() | Out-Null

...which masks the problem in the event the INSERT fails. Instead, test the result and respond appropriately...

if ($command.ExecuteNonQuery() -eq 1)
{
    Write-Host -Message "Successfully inserted movie ""$title""."
}
else
{
    Write-Warning -Message "Failed to insert movie ""$title""."
}

This will make it clear if the issue lies in interacting with the filesystem or the database.

Some other notes:

  • MySqlCommand implements the IDisposable interface and so each instance should be disposed when you're done using it...

    $query = "INSERT INTO Movies(movie_id, movie_title, movie_file_date) VALUES(NULL, @title, $add_date)"
    $command = $connection.CreateCommand()
    try
    {
        $command.CommandText = $query
        # Sanatize single quotes in filenames for input.
        $command.Parameters.AddWithValue("@title", $title) | Out-Null
        if ($command.ExecuteNonQuery() -eq 1)
        {
            Write-Host -Message "Successfully inserted movie ""$title""."
        }
        else
        {
            Write-Warning -Message "Failed to insert movie ""$title""."
        }
    }
    finally
    {
        $command.Dispose()
    }
    

    ...and the same for $sql_drop_table and $sql_create_table. The code in the finally block will run even if an error is thrown from within the try block.

  • See Difference with Parameters.Add and Parameters.AddWithValue and its links for why AddWithValue() can be problematic.

  • Instead of...

    Write-Host("Added") $movies.Count ("movies.")
    

    ...a more typical way to build this message would be with string interpolation...

    Write-Host "Added $($movies.Count) movies."
    

    ...or the format operator...

    Write-Host ('Added {0} movies.' -f $movies.Count)
    

    You can also incorporate numeric format strings, so if $movies.Count is 1234 and $PSCulture is 'en-US' then...

    Write-Host "Added $($movies.Count.ToString('N0')) movies."
    

    ...and...

    Write-Host ('Added {0:N0} movies.' -f $movies.Count)
    

    ...will both write...

    Added 1,234 movies.
    
Lance U. Matthews
  • 15,725
  • 6
  • 48
  • 68
  • Apologies for the delay in responding, I was away. Thanks again for your help and depth of insight into my "little problem." I've made the changes you've suggested and it is working much smoother and with better indication when something does go awry. I've added a comment at the top attributing both yours and @Theo's help along with a link to this page. Not that anyone except myself will see it, but still. – graih Apr 23 '22 at 08:20