
I'm not a specialist in IPFS or Linux, but I'll try to describe my question as best I can.
There is a text file whose lines list file descriptions and their IPFS CIDs (I suppose).
The structure of the file is as follows:

"description of a file" "IPFS CID"  
description1 CID1.....shnpnhzuquz3ho7vvabqjyide  
description2 CID2.....keg4rklbzenppnyrjk554h5ru  

There is also a description of how to do it with a Linux command, which supposedly pins each line of this file to IPFS:

cat file.txt | xargs -L1 bash -c 'ipfs pin add $1'

How can I run this command in Windows PowerShell?

I've tried the following (by analogy with this question: Convert xargs Bash command to PowerShell?):

cat file.txt | %{ipfs pin add $_}

but it yields errors:

Error: invalid path "description1 CID1.....shnpnhzuquz3ho7vvabqjyide": selected encoding not supported  
Error: invalid path "description2 CID2.....keg4rklbzenppnyrjk554h5ru": selected encoding not supported  
.....
mklement0
valen320

4 Answers


Update:

  • It turns out that the real input file has no header line, whereas this answer assumes that it does.
    I'm leaving the answer as-is, because the techniques of dealing with the header line may be of interest to future readers.

  • Your input file has a header line, which needs to be skipped (your bash command would try to execute ipfs pin add with the word "of" from the header).

  • From each line, the second space-separated token must be extracted and passed to ipfs pin add, which is what xargs -L1 in combination with bash -c '... $1' implicitly does.

cat file.txt | 
  select -Skip 1 | # skip header line
  % {              # process each line
    ipfs pin add (-split $_)[1]  # Extract 2nd whitespace-separated token
  }
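For reference, the way xargs -L1 combined with bash -c '... $1' ends up selecting the second token can be checked with a dry run on the Unix side. This is only a sketch: echo stands in for ipfs pin add, the function name second_token_per_line is made up for illustration, and the two sample lines are invented rather than taken from the real file.

```shell
# Dry run: 'echo' stands in for 'ipfs pin add', so nothing is pinned.
# xargs -L1 runs the command once per input line and appends that line's
# whitespace-separated tokens as arguments. With 'bash -c', the first
# appended argument becomes $0 and the second becomes $1.
second_token_per_line() {
  xargs -L1 bash -c 'echo "would pin: $1"'
}

# Made-up sample lines in the "description CID" layout.
printf '%s\n' 'description1 CID1' 'description2 CID2' | second_token_per_line
# Prints:
#   would pin: CID1
#   would pin: CID2
```

The description lands in $0 and is silently ignored, which is why the PowerShell version has to pick out the second token explicitly.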

Note:

  • On Windows, cat is a built-in alias for PowerShell's Get-Content cmdlet; on Unix-like platforms it refers to the standard utility, /bin/cat.

  • select is a built-in alias for Select-Object, and, with lines of text as input,
    Select-Object -Skip 1 is equivalent to tail -n +2

  • % is a built-in alias for ForEach-Object, which processes each input object via a script block ({ ... }), inside of which the automatic $_ variable refers to the input object at hand.

  • The unary form of the -split operator is used to split each line into whitespace-separated tokens, and [1] extracts the 2nd token from the resulting array.

    • An alternative is to use the .Split() .NET string method, as shown in a comment you made on your own question: in the case at hand, $_.Split(" ")[1] is equivalent to (-split $_)[1]

    • However, in general the -split operator is more versatile and PowerShell-idiomatic - see this answer for background information.

Caveat: I'm not familiar with ipfs, but the docs for ipfs pin add only talk about passing paths as arguments, not CIDs - it is ipfs pin remote add that supports CIDs.


An alternative approach is to treat your input file as a (space-)delimited file and parse it into objects with Import-Csv, which allows you to extract column values by column name:

Import-Csv -Delimiter ' ' file.txt | 
  ForEach-Object {
    ipfs pin add $_.'IPFS CID'
  }

Since ipfs pin add supports multiple arguments, you can simplify and speed up the command as follows, using only one ipfs pin add call:

ipfs pin add (Import-Csv -Delimiter ' ' file.txt).'IPFS CID'

This takes advantage of PowerShell's member-access enumeration feature and PowerShell's ability to translate an array of values into individual arguments when calling an external utility.

Note that, at least hypothetically, there's a platform-specific limit on the length of a command line that you could run into (which is something that only xargs would handle automatically, by partitioning the arguments into as few calls as necessary).
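To make the partitioning behavior of xargs concrete, here is a small dry run on the Unix side. It is a sketch under assumptions: echo stands in for the real call, the function name batch_pin_dry_run is hypothetical, the sample lines are invented and have no header line, and -n 2 is used only to force small batches so the splitting is visible. Without -n, xargs fills each call up to the system's command-line limit.

```shell
# Dry run: 'echo' stands in for the real call, so nothing is pinned.
# awk prints the 2nd whitespace-separated field of each line; xargs then
# groups those values into as few command invocations as the limits allow.
# -n 2 caps the arguments per call artificially to make batching visible.
batch_pin_dry_run() {
  awk '{ print $2 }' | xargs -n 2 echo ipfs pin add
}

# Made-up sample lines in the "description CID" layout (no header line).
printf '%s\n' 'd1 CID1' 'd2 CID2' 'd3 CID3' | batch_pin_dry_run
# Prints:
#   ipfs pin add CID1 CID2
#   ipfs pin add CID3
```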

mklement0

Instead of trying to do this in PowerShell, use PowerShell to enter WSL, set up IPFS (Kubo, the Go implementation, following the Linux guide), and use the command below to pin the files. Remove the header on the first line manually.

$ cat /mnt/pathtoyourfile/file.txt | shuf | xargs -P8 -L1 bash -c 'timeout 2m ipfs pin add $1'
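If it matters which entries hit the time limit, a variation along these lines could report them. This is a rough sketch, not part of the command above: the helper name report_timeouts is made up, it assumes GNU timeout (which exits with status 124 when the limit is reached), and it assumes each line has exactly two whitespace-separated tokens.

```shell
# Hypothetical helper: run a command once per CID with a time limit and
# print the CIDs whose command was cut off by the limit.
# Usage: report_timeouts LIMIT COMMAND [ARGS...] < file
report_timeouts() {
  limit=$1
  shift
  while read -r _description cid; do
    # </dev/null keeps the command from consuming the remaining input lines.
    timeout "$limit" "$@" "$cid" </dev/null >/dev/null 2>&1
    status=$?
    # GNU timeout returns 124 when it had to stop the command.
    if [ "$status" -eq 124 ]; then
      echo "timed out: $cid"
    fi
  done
  return 0
}

# Possible use with the real command (not executed here):
#   report_timeouts 2m ipfs pin add < /mnt/pathtoyourfile/file.txt
```

Unlike the xargs -P8 version, this runs the entries one at a time, so it is slower but makes it easier to see which lines are the problematic ones.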

One thing to note: your repo may not be in the default location, and Windows environment variables don't carry over into WSL, so add your path to .bashrc by opening it with

$ vim ~/.bashrc

Add the export line to the file (the line itself has no leading prompt character):

export IPFS_PATH=/mnt/d/yourIPFSrepo/.ipfs

Press Esc, type :wq, and press Enter to save and exit. Use

$ source ~/.bashrc

to load it into the current shell so the change takes effect immediately. Use ipfs init to check the node initialization location. Launch another bash window from PowerShell to verify that the path is set to your specified path on every launch.
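The same edit can also be made without opening an editor. The following is a minimal sketch, assuming a hypothetical helper named add_ipfs_path and a repo path without spaces; both arguments are placeholders to be replaced with your own values.

```shell
# Hypothetical helper: append an IPFS_PATH export line to an rc file,
# but only if that exact line is not already present.
# Usage: add_ipfs_path RC_FILE REPO_PATH
add_ipfs_path() {
  rc_file=$1
  repo=$2
  line="export IPFS_PATH=$repo"
  # -x matches whole lines only, -F treats the text literally, -q is silent.
  grep -qxF "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Possible use (not executed here), followed by a check in the current shell:
#   add_ipfs_path ~/.bashrc /mnt/d/yourIPFSrepo/.ipfs
#   source ~/.bashrc
#   echo "$IPFS_PATH"
```

Running the helper twice leaves a single export line in the file, so it is safe to repeat.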

  • Please [format your post properly](https://stackoverflow.com/help/formatting). (Fence your code with `\`\`\`` lines.) – mklement0 Apr 19 '23 at 11:04

It started hanging after pinning 10 because of entry #11. I deleted 5 entries before entry #11, which now makes it entry #6. The code pinned 5 entries before hanging and was hanging on 6. I'm not sure what causes it to hang on specific entries though, but this code worked perfectly in powershell:

cat file.txt | %{ipfs pin add $_.split(" ")[1]}
buddemat
  • You're speaking as if you're the author of the question, but your user account is different. Did you create a second account? Note that the information about hanging isn't even part of the question, and the sample input data wouldn't explain it. Unless you have specific information to share about the nature of a line that causes a hang, this information isn't useful to future readers. By contrast, what _is_ problematic about the sample input data in the question is the presence of a _header line_, which your command doesn't account for. – mklement0 Apr 18 '23 at 07:33
  • not author, but i know the file they were using to pin the items. since i am new here, they didnt let me comment on the comment section there. i was talking about the authors comment: "the following seem to be working: cat file.txt | %{ipfs pin add $_.split(" ")[1]} seem that pins only first 10 entries :( still need help". the header line is not accounted for, could delete that manually. also, no i dont know why it hangs, just know that the hang is because of the entry itself. hopefully someone more knowledgeable can shed light on it – anonymouspanda Apr 19 '23 at 08:04
  • I see, but without seeing the data, no one will be able to shed light on this. Turns out there was no header line, but if there were, there's no need to remove it manually, as there's a trivial programmatic solution in PowerShell; a header line could even be leveraged for a more concise, OO solution. – mklement0 Apr 19 '23 at 11:03

I need to clarify: there is no header in the file, so I edited the initial command. For now, there are two commands that both work well:
1:

Get-Content file.txt | 
  ForEach-Object {
    ipfs pin add (-split $_)[1]  # Extract 2nd whitespace-separated token
  }

and 2:

cat file.txt | %{ipfs pin add $_.split(" ")[1]}

I think the problem lies in pinning particular lines, and it just happened to be the 11th line. anonymouspanda is, I suppose, another user who faced the same problem as me.

valen320
  • I've since updated my answer to use `cat` instead of `Get-Content` and `%` instead of `ForEach-Object`, so as not to create a distraction, but I explain how they're _aliases_. Therefore, the only effective difference between the two commands - albeit resulting in the same behavior - is `(-split $_)[1]` vs. `$_.split(" ")[1]`, and I've updated my answer to explain why you may prefer the former. – mklement0 Apr 18 '23 at 09:37
  • However, I've left the header-line handling in the answer, as it may be of interest to future readers. The header-line confusion illustrates the importance of providing _representative_ sample input and [formatting posts properly](https://stackoverflow.com/help/formatting), to clearly show which parts are descriptive, and which parts are part of the data, as well as to show code with proper formatting. – mklement0 Apr 18 '23 at 09:39