I'd like to use fread in an Rscript that receives its input data through a Linux pipe. Is there an fread analog for the following?

read.csv(file = 'stdin', ...)

I'll also settle for reading stdin some other way and then using fread to parse it, as I mainly want this for fread's superior separator and header logic.

eddi
  • My understanding is that `fread` memory-maps the file, and I wouldn't think you can memory-map stdin... so my guess is "no, there isn't". – Joshua Ulrich Jul 01 '13 at 20:22
  • @Thomas something like this `cat file | myscript.r`; and then read the `stdin` pipe from the script – eddi Jul 01 '13 at 20:44
  • What's wrong with this question? Why the down-votes? Probably someone should explain...?? – Arun Jul 01 '13 at 20:48
  • @eddi, in that case, why not `Rscript myscript.r file` and use `commandArgs` inside the script? – Arun Jul 01 '13 at 20:50
  • @Arun because most of the time I actually do `head file | myscript.r` or `zcat file | myscript.r` – eddi Jul 01 '13 at 20:51
  • @eddi, looking at `fread` it seems not possible (at least not obvious to me). – Arun Jul 01 '13 at 20:56
  • @eddi, if you want to salvage this question, probably you should add some details about what you mean. I still am not getting the reason for the down-votes pouring in though. – Arun Jul 01 '13 at 21:00
  • @Arun it's because I disagreed with meta people http://meta.stackexchange.com/questions/186877/what-exactly-is-wrong-with-this-title – eddi Jul 01 '13 at 21:01
  • @eddi You've been around long enough to know what a reproducible example looks like. – GSee Jul 01 '13 at 21:08
  • @GSee, I agree with the idea, but I fail to see how this question could be made reproducible. He doesn't know how/if `fread` can be used in that manner.. What example are you asking for? An example of how it's with `read.csv`? – Arun Jul 01 '13 at 21:11
  • @Arun, for starters, there's no data! – GSee Jul 01 '13 at 21:12
  • @GSee I've already figured out the answer, but I really don't see what reproducible code you'd want for this? A script that just has `fread('stdin')` in it or `read.csv(file='stdin')`?? – eddi Jul 01 '13 at 21:21
  • @eddi, DWin managed to create a reproducible example in his answer. Perhaps you could study that. – GSee Jul 01 '13 at 21:23
  • @GSee sorry I don't get it, I don't understand what code would create a reproducible example of the above, except for creating an R file and running Rscript – eddi Jul 01 '13 at 21:27
  • @eddi, after you enter `read.csv('stdin')`, then what do you do? – GSee Jul 01 '13 at 21:27
  • @Thomas, you should read [**this**](http://stackoverflow.com/a/16499642/559784). – Arun Jul 01 '13 at 21:29
  • @GSee - it doesn't matter, but in this case, I actually literally do nothing - I just want to use `fread` to output text better in a console – eddi Jul 01 '13 at 21:30
  • You don't need `fread`, `read.csv`, or anything else to do _nothing_. – GSee Jul 01 '13 at 21:31
  • @GSee I don't think you get it - I use R to display text better than `cat` would – eddi Jul 01 '13 at 21:32
  • @eddi you've shown no such better display of text in your Question. – GSee Jul 01 '13 at 21:32
  • @Thomas, my understanding was that you *didn't know* what `stdin` argument for `read.csv` was supposed to do. So I linked to the first question I found (with a simple google search). My understanding is that it had nothing to do with `tagging` at the time you asked the question. – Arun Jul 01 '13 at 21:35
  • @GSee because that's not the question, the question is self-contained - reading `stdin` using `fread` – eddi Jul 01 '13 at 21:35
  • how I use that readout afterwards is irrelevant to the question – eddi Jul 01 '13 at 21:36
  • @Thomas, and I don't think it's linux specific. On windows, I think it's `read.csv(stdin())`.. Not sure though. Hadn't used in a while. – Arun Jul 01 '13 at 21:37
  • it deserves also a `csv` tag. – Metrics Jul 01 '13 at 21:40
  • @Metrics no it doesn't, this is a generic read issue and has very little to do with `csv` – eddi Jul 01 '13 at 21:43
  • @Thomas, it doesn't make sense to add `zcat` and `csv` to this question as it's not relevant to this question at all!!! It was just a usage example! – Arun Jul 01 '13 at 21:44
  • @Thomas that was added just as an example and is not relevant to the question, the question is about reading a file, not zcat'ing – eddi Jul 01 '13 at 21:45
  • @Thomas, read the question title. It'll be obvious what the question is actually about. You need not add every little example as a tag. – Arun Jul 01 '13 at 21:45
  • @eddi, you certainly manage to stir things up whether it be asking question or answering, here on SO or there on meta.. :) – Arun Jul 01 '13 at 21:47
  • @eddi: Thanks. It appears to be the problem with reading the csv file (as the title says) – Metrics Jul 01 '13 at 21:47
  • @Metrics, no it's a problem of reading a file in a way similar to how `read.csv` does (which is just an example from a larger family of file-reading functions); it doesn't have to be a csv file. – eddi Jul 01 '13 at 21:48
  • @Arun yeah; I'm actually slowly warming up to the idea of deleting this question and my answer - if SO community doesn't need what I consider to be a good question and answer, that's fine by me. – eddi Jul 01 '13 at 21:50
  • Thomas, I agree that's a better title, but then I had a very different one to begin with and that silly one wasn't chosen by me @RobertHarvey – eddi Jul 01 '13 at 21:52
  • @Thomas: Now the title makes sense! – Metrics Jul 01 '13 at 21:52
  • I feel like people here are viewing this question under a microscope. Maybe just me... – Arun Jul 01 '13 at 21:55
  • @eddi: The way you keep people from putting silly titles on your question is to provide a decent title to begin with. – Robert Harvey Jul 01 '13 at 21:57
  • @Arun: The OP posted a question on Meta about this question. He invited the scrutiny. – Robert Harvey Jul 01 '13 at 21:57
  • @RobertHarvey - you wanna ask the people on `r` whether the title I chose was a decent one or not? (I actually think Thomas's one is better, but that's not the question) – eddi Jul 01 '13 at 21:58
  • @RobertHarvey, I agree with your title change. But not with the tag editing and the million edits regarding getting the perfect question that's going on at the moment. Have you seen how many edits this question has had? – Arun Jul 01 '13 at 21:59
  • Actually, I'm back for one more edit to try to make this reproducible for the benefit of future readers. Also, I'll +1 it now that I get what it's about. – Thomas Jul 01 '13 at 21:59
  • @eddi: I think we made that perfectly clear on Meta. – Robert Harvey Jul 01 '13 at 21:59
  • @RobertHarvey, yes I am aware of the Meta question. I would attribute Thomas' "not understanding" the question to his apprehension. Although I agree with GSee's point on aspects to have improved the question. It certainly was not unclear (to me). – Arun Jul 01 '13 at 22:02
  • [this is](http://meta.stackexchange.com/questions/186897/meta-and-so-behavior-lowering-so-value) (surprisingly, as it shouldn't be) relevant to this question – eddi Jul 01 '13 at 23:55
  • @eddi How in the world is that relevant to this question? – GSee Jul 01 '13 at 23:59
  • @GSee in the same way as meta-related down-votes and edits are; it gives an idea of why this looks like a bad question – eddi Jul 02 '13 at 00:01
  • @Arun I'm not sure what "apprehension" means in that sentence, but I didn't understand it because it didn't make sense. It only made sense when eddi answered his own question, which let me reverse engineer the question. – Thomas Jul 02 '13 at 00:12
  • @eddi, you might find this useful: http://stackoverflow.com/questions/15784373/process-substitution/15785789#15785789 – flodel Jul 02 '13 at 01:15
  • @Thomas, the usage I was going for is "the act of understanding" or "notion or conception". Basically *you* did not understand it. The fact that Joshua commented (and DWin answered) within minutes after the post is a testimony to that, in my view. It could have been better but unclear wasn't one of the cases. – Arun Jul 02 '13 at 04:50
  • @flodel I'm a bit unclear on how to use that for `fread`, seems like what you suggested there is just an alias for `file='stdin'` in `read.*` no? (for the 'stdin' case, you have other cases covered there relevant to that question) – eddi Jul 02 '13 at 14:46

2 Answers

Turns out it's as simple as:

fread('file:///dev/stdin')

This works because, when the first 7 characters of the input are "file://" or "http://", fread uses download.file to copy the data into a temporary file and then reads that file.
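The two steps can be imitated by hand. The following is only a sketch of the idea, not fread's actual source. It assumes a Unix-like system with data.table installed, and it uses a made-up temporary file (`src`) as a stand-in for /dev/stdin so that it runs without a pipe.

```r
library(data.table)

# Stand-in source so the sketch runs anywhere; in the answer the URL
# would be "file:///dev/stdin". 'src' is a hypothetical example file.
src <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2", "3,4"), src)

# Step 1: copy the contents behind the file:// URL into a temporary file.
# (Assumes an absolute Unix-style path, so the URL becomes file:///...)
tmp <- tempfile()
download.file(paste0("file://", src), tmp, quiet = TRUE)

# Step 2: parse the temporary copy as an ordinary file.
dt <- fread(tmp)
print(dt)
```

Because the data is copied before it is parsed, this route costs an extra pass over the input.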


Update: As of version 1.8.11 one can use shell commands in fread, making another solution possible:

fread('cat /dev/stdin')
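
Wrapped up for use in a script, this might look like the sketch below. The helper name `read_stdin` is made up for illustration, and the `cmd` argument is an assumption about the installed data.table: recent releases have it, whereas on 1.8.11-era versions the command string is simply passed as the first argument, as above.

```r
library(data.table)

# read_stdin() is a hypothetical helper, not part of data.table.
# It runs a shell command and lets fread parse that command's output.
# The default command copies the script's standard input.
read_stdin <- function(command = "cat /dev/stdin") {
  fread(cmd = command)
}

# Inside a script that is fed by a pipe:
# dt <- read_stdin()
# print(dt)
```

Saved as myscript.r, it could then be fed as in the comments on the question, e.g. `head file | Rscript myscript.r` or `zcat file | Rscript myscript.r`.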
eddi
  • I'm curious about the downvotes on this answer, an explanation would be appreciated. – eddi Jul 01 '13 at 21:28
  • + 1 because I learnt something new. Sometimes I simply don't get Stack Overflow, still don't get why those downvotes, seriously. – dickoa Jul 01 '13 at 22:45
  • Downvoting *this* is plain petty. Downvoting the Q is one thing, but the Answer. *sigh*. +1 – Gavin Simpson Jul 02 '13 at 14:49
  • download.file copies the data?!?! Would want to avoid that certainly, especially if the data is "big". `fread(' cat')` also works (note: the space is necessary for fread to consider it a system command) – malcook Mar 18 '18 at 07:42
  • This is super useful. How could we read several files, like `Rscript script.R File1 File2 File3`, using this? – Eric González Feb 13 '21 at 01:33
  • That's a different problem - simply parse the command line arguments and read the files. – eddi Feb 15 '21 at 13:47

All of the read.* functions use scan under the hood. scan is fairly low-level, but it can parse lines of data into different classes.

> mat <- matrix(scan(), 4,4) # will paste in block of data 
1: 0.5 0.1428571 0.25
4: 0.5 0.1428571 0.25
7: 0.5 0.1428571 0.25
10: 0.5 0.1428571 0.25
13: 0.5 0.1428571 0.25
16: 0.5 
17:        # Terminate with two <cr>'s
Read 16 items
> mat
          [,1]      [,2]      [,3]      [,4]
[1,] 0.5000000 0.1428571 0.2500000 0.5000000
[2,] 0.1428571 0.2500000 0.5000000 0.1428571
[3,] 0.2500000 0.5000000 0.1428571 0.2500000
[4,] 0.5000000 0.1428571 0.2500000 0.5000000


> lst <- scan(what=list(double(0), "a"))
1: 4 t
2: 6 h
3:  8 l
4: 8 8
5: 
Read 4 records
> lst
[[1]]
[1] 4 6 8 8

[[2]]
[1] "t" "h" "l" "8"

You should also look at the ?connections page.
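
For the piping case, the relevant detail in R's documentation is that file("stdin") refers to the C-level standard input of the process, whereas stdin() refers to the console. A minimal base-R sketch, using an in-memory textConnection as a stand-in for the pipe so that it runs anywhere:

```r
# In a script fed by a pipe, the connection would be file("stdin").
# Here a textConnection with made-up data stands in for it.
con <- textConnection("a,b\n1,2\n3,4")

# read.csv (like the other read.* functions) accepts a connection
# in place of a file name.
demo <- read.csv(con)
close(con)

print(demo)
```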

IRTFM
  • thanks, `scan` is the first function I looked at, but I can't figure out how to combine it with `fread` – eddi Jul 01 '13 at 21:03
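
One way to combine the two ideas is to read the lines from a connection first and then hand them to fread. This is a sketch under the assumption that the installed data.table has fread's `text` argument (recent releases do). The helper name `fread_connection` is made up for illustration, and a textConnection with made-up data stands in for the pipe.

```r
library(data.table)

# fread_connection() is a hypothetical helper, not part of data.table.
# It reads every line from a connection into memory, then lets fread
# detect the separator and header from that text.
fread_connection <- function(con) {
  lines <- readLines(con)
  fread(text = lines)
}

# In a script fed by a pipe: dt <- fread_connection(file("stdin"))
con <- textConnection(c("x;y", "1;2", "3;4"))
dt <- fread_connection(con)
close(con)

print(dt)
```

Note that readLines holds the whole input in memory as a character vector before fread sees it, so this suits modest inputs better than very large ones.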