In a PowerShell Provider, when do you refresh vs cache data?

Question

I am writing a PowerShell provider in C#. The provider exposes an application's domain objects through a drive-like interface. For example:

my:\Users\joe@blow.com
my:\Customers\Marty

This data ultimately comes from a database.

I have been unable to find good guidance on when to go to the database for data and when to cache it. I find that PowerShell calls methods like ItemExists and GetChildNames many times, often repeatedly for the same command. It is impractical to go to the database 5 or 6 times just because the user hit Tab for auto-complete, for example.

But at the same time, as a user at the command prompt, if I type Get-ChildItem (dir), see the list, and then do something outside PowerShell that I know changes the data, I expect the next directory listing to show those changes.

I feel that if I knew the right term to describe my problem (in PowerShell parlance) I would be able to Google the answer or find an existing duplicate question, but I'm stuck.

"There are two difficult problems in software engineering: naming, cache invalidation and off by one errors." — Richard, Aug 09 '11 at 12:05

x0n · Accepted Answer · 2011-08-09T12:31:13.787

This has very little to do with PowerShell and everything to do with your data and how important it is to keep it fresh. A simple caching scheme would be time-based: once N minutes have passed, the next request to your back end's data layer pulls a fresh copy and resets the timer. It seems you already have an idea of what your particular rules should be. I don't think two successive "dir" commands should always result in two pulls from the backing store, but you do think so for your system. So make it so.
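
As a rough illustration of the time-based scheme, here is a minimal, language-neutral sketch in Python rather than C# provider code. The names TimedCache, loader, ttl_seconds and clock are made up for this example and are not PowerShell APIs; the loader stands in for whatever call fetches data from the database.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TimedCache(Generic[T]):
    """Holds one loaded value and reloads it once it is older than ttl_seconds.

    Hypothetical helper for illustration only; not part of any PowerShell API.
    """

    def __init__(
        self,
        loader: Callable[[], T],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader          # stands in for the database query
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the timer can be tested
        self._value: Optional[T] = None
        self._loaded_at: Optional[float] = None

    def get(self) -> T:
        """Return the cached value, pulling a fresh copy if the timer expired."""
        now = self._clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self._ttl
        if expired:
            self._value = self._loader()
            self._loaded_at = now      # reset the timer
        return self._value

    def invalidate(self) -> None:
        """Force the next get() to reload, e.g. after the provider writes data."""
        self._loaded_at = None
```

With something like this, repeated helper calls inside one command all hit the same cached copy, and a data-changing method can call invalidate() so the next read goes back to the store.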

UPDATE

Perhaps a simple guiding principle might be that you should only refresh your data once per provider command issued. The list of built-in commands that operate on provider items consists of:

Clear-Item
Copy-Item
Get-Item
Invoke-Item
Move-Item
New-Item
Remove-Item
Rename-Item
Set-Item

Additionally, the list of built-in commands that operate on provider item properties consists of:

Clear-ItemProperty
Copy-ItemProperty
Get-ItemProperty
Move-ItemProperty
New-ItemProperty
Remove-ItemProperty
Rename-ItemProperty
Set-ItemProperty

And finally, for reading/writing content, we use:

Add-Content
Clear-Content
Get-Content
Set-Content

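Each of these commands has a corresponding method on the provider, and this is where you might want to refresh your data. The item commands map to methods on NavigationCmdletProvider (for hierarchical datastores) and its base classes; the property and content commands go through the IPropertyCmdletProvider, IDynamicPropertyCmdletProvider and IContentCmdletProvider interfaces. When implementing the New/Move/Rename/Remove/Set/Clear and other data-changing methods, you should use some kind of optimistic concurrency scheme, because provider instances in PowerShell are not singletons: there may be one or more instances in play at any time.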
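
One common way to do optimistic concurrency is a version number per item: a write only succeeds if the item still has the version the writer originally read. The sketch below is a minimal in-memory illustration in Python, not C# provider code; Store, Versioned and ConcurrencyError are hypothetical names, and the dictionary stands in for the database.

```python
from dataclasses import dataclass
from typing import Dict


class ConcurrencyError(Exception):
    """Raised when an item changed since the caller last read it."""


@dataclass(frozen=True)
class Versioned:
    value: str
    version: int


class Store:
    """Hypothetical backing store; the dict stands in for the database."""

    def __init__(self) -> None:
        self._rows: Dict[str, Versioned] = {}

    def create(self, path: str, value: str) -> Versioned:
        if path in self._rows:
            raise ConcurrencyError(f"{path} already exists")
        row = Versioned(value, 1)
        self._rows[path] = row
        return row

    def read(self, path: str) -> Versioned:
        return self._rows[path]

    def update(self, path: str, value: str, expected_version: int) -> Versioned:
        """Write only if nobody else has written since expected_version was read.

        Against a SQL database this check is typically done with a
        WHERE clause on the version column and a test of the affected row count.
        """
        current = self._rows[path]
        if current.version != expected_version:
            raise ConcurrencyError(
                f"{path} is at version {current.version}, "
                f"expected {expected_version}"
            )
        row = Versioned(value, current.version + 1)
        self._rows[path] = row
        return row
```

If two provider instances read the same item and both try to update it, the first update bumps the version and the second fails with ConcurrencyError instead of silently overwriting; the provider can then reload and report the conflict to the user.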

I wrote a provider that takes its implementation from script, which you may find easier to prototype with. See http://psprovider.codeplex.com/

Hope this helps.

Presently I am using a time-based approach. But it feels hackish. Surely, there is a proper indicator in PowerShell to tell the difference between when it calls a method 5 times in the execution of one command and when a new command or new pipeline is running. — Craig Celeste, Aug 09 '11 at 02:32
To offer another example: if someone wrote a .ps1 script that mixed interaction with the provider and the execution of existing tools that affect the database, there may be successive provider operations milliseconds apart, with data-altering operations in between. It seems unprofessional to suggest that those people put a sleep or delay in their script to accommodate me. And it also seems wrong to go to the database on every provider method, when those methods are called multiple times in the execution of a single command. This, I guess, is the issue I'm grappling with. — Craig Celeste, Aug 09 '11 at 02:57
Thanks x0n! I'll review your codeplex link this afternoon. Perhaps I am getting confused with which NavigationCmdletProvider methods relate to 'commands' and which are helpers, like IsValidPath, ItemExists, GetChildItems, GetChildNames, etc. — Craig Celeste, Aug 09 '11 at 12:58
Yep, and you're not alone in that confusion. Providers are tricky and API documentation is not helpful until you understand the sequence of calls. — x0n, Aug 09 '11 at 13:40
