
The requirement is essentially to monitor changes on /HelloWorld.txt, so that any new line that is added can be picked up and piped to something like wget, for that single new line.

I'm still in the early research stages of how to chain these commands together (and ultimately turn this into a Linux daemon/systemd service).

From what I know already, and what I've been reading to fill in the gaps, it feels like this should be a relatively simple one-line command in a bash/shell script which does something like 'tail /HelloWorld.txt | wget example.com/HelloWorld'.

I've yet to test any of this, so I thought I'd post the question while I work away testing things, in case I'm going in the wrong direction.

UPDATE

So it looks like I'm getting close to achieving a basic working example, but I'm seeing a lot of odd behaviour. Firstly, when running the command at the command line, I'm seeing the wrong data coming through.

Secondly, when I turn this into a Linux service, I not only see the wrong data coming through, but I see it multiple times, and lots of spawned processes are left loitering that need cleaning up.

The command I'm running at the command line is:

tail -f -n 1 /myLog.txt |
grep 'info_im_looking_for' --line-buffered |
while read line; do
    xargs -d $'\n' -n 1 wget -q --delete-after --post-data "line=${line}" "https://www.example.com/HelloWorld"
done

Seems I'm 95% of the way there, just struggling to get over the line. Feels as though if I can get this to work properly at the command line, I can figure out why the Linux Service isn't quite working properly.
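For context, the service wrapper is along these lines (a minimal sketch; the unit name and script path are placeholders rather than my actual setup):

```ini
# /etc/systemd/system/helloworld-watch.service (placeholder name)
[Unit]
Description=Forward matching log lines to a URL
After=network-online.target

[Service]
# Run the pipeline via a script so the unit stays simple
ExecStart=/usr/local/bin/watch-helloworld.sh
Restart=on-failure
# control-group (the default) kills the whole process group
# (tail, grep, wget) on stop, so nothing is left loitering
KillMode=control-group

[Install]
WantedBy=multi-user.target
```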

What is odd is that when I just run the first part of the command, it works fine and outputs to the console the info I would expect to see; it's the parts afterwards that don't seem to be working correctly:

tail -f -n 1 /myLog.txt | grep 'info_im_looking_for'

Note, my current thought process is that the parameter passing (which includes double quotes) likely needs to be encoded; I'm just in the process of taking a look at options for this.

Feels like this is probably going to be easier to achieve via a .sh or .py script rather than one magic single-line piped command.
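In case it helps, here's the shape of the .sh script I'm converging on (a minimal sketch; the log path, pattern, and URL are placeholders, and I haven't tested it against a real endpoint):

```shell
#!/usr/bin/env bash
# Placeholder values -- substitute your real log, pattern and endpoint.
LOG=${1:-/HelloWorld.txt}
PATTERN=${2:-abc}
URL=${3:-https://www.example.com/HelloWorld}

# POST one matching line to the endpoint; --data-urlencode takes care
# of quotes, ampersands and other special characters in the line.
post_line() {
    curl -fsS --data-urlencode "line=$1" "$URL"
}

watch_log() {
    # -F survives log rotation; -n 0 skips lines already in the file
    tail -F -n 0 -- "$LOG" |
    grep --line-buffered -- "$PATTERN" |
    while IFS= read -r line; do
        post_line "$line"
    done
}

# watch_log   # uncomment to actually run the watcher
```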

Michael Cropper
    `tail -f ...filename... | xargs -d $'\n' -n 1 wget`, rather. But to be on-topic, a question needs to be about a _specific problem you actually face_ and concrete enough for someone who's not you to be able to prove whether or not an answer counts as a "solution" rather than a request for speculation/suggestions/advice. – Charles Duffy Dec 28 '22 at 23:42
  • BTW, for doing this kind of operation at-scale there are tools that monitor files for new content and put that content onto a message bus much more efficiently than this kind of shell pipeline; that kind of operation will work at scales where your file is having lines added faster than a single thread running `wget` in a loop will keep up, since you can add more clients pulling messages off the queue. That kind of factor where operational concerns can lead to completely different designs based on desired scale is part of why we ask folks to come in with the big picture already nailed down. – Charles Duffy Dec 28 '22 at 23:44
  • (...I wouldn't necessarily use `wget` either, if you care about performance; Python, node.js, &c will have a lot less overhead insofar as they can have a single long-running process that spins off async routines, coprocesses or threads as a new event comes in, vs needing to fork off a whole subprocess and then `exec` a new executable within it; yes, Python is slow, but `fork()`ing in a loop is slower; and once you're using a dedicated tool you can also choose to write it in a faster language like Go). – Charles Duffy Dec 28 '22 at 23:47
  • ...coming back to the question at hand, the above comments maybe exemplify the problem: There are lots of big-picture approaches available, most of which will work just fine until you're operating at a large enough scale that they won't anymore, and inviting people to speculate about which one they'd use is very open to differences of opinion and approach. If you don't need to scale to high throughput, there are _lots_ of "right" directions, and picking one is mostly a matter of opinion/preference/experience/familiarity. – Charles Duffy Dec 28 '22 at 23:50
  • @CharlesDuffy You clearly have experience with these things, so let's connect (I can't seem to find you on LinkedIn, but I'm easy to find... MichaelCropper86). For the purpose of this question though, it's a HelloWorld example, I'm not looking for performance and scaling to 10x million transactions per minute - I can figure out those challenges as they come around. – Michael Cropper Dec 29 '22 at 00:13
  • @CharlesDuffy updated original question with more info and progress to date – Michael Cropper Jan 01 '23 at 22:22
  • `10x million transactions per minute` then you shouldn't be using shell and wget in the first place; instead use C++, or _at least_ Python or Java. Your PC will die from 10x million wget processes. `which does something like 'tail /HelloWorld.txt | wget example.com/HelloWorld'` Please explain what _exactly_ you want to do instead of "something like". Do you want to POST a line of data to a website for every line, or open some websocket to stream the content of the file to a remote location? For that matter, why HTTP and not telnet or anything else? – KamilCuk Jan 01 '23 at 23:33
  • @KamilCuk Please re-read the content, you've grabbed the wrong end of the stick. I'm looking for a HelloWorld example. I want to monitor a HelloWorld.txt file on a Linux server, and for every new line that is added to that file (which meets a specific pattern) then I want to wget that line from the log file to a URL so can do all the fun magic there. – Michael Cropper Jan 01 '23 at 23:40
  • `wget that line from the log file to a URL` I do not understand that line. "wget" is a program, not a verb. You can't "wget a line to a url". You can execute a HTTP connection with a specific payload. Could you clarify? I do not understand what to "wget a line to url" means. Does it mean to execute POST HTTP connection with payload containing `line=`? Does it mean to stream the line to a remote websocket HTTP listener? Why do you use HTTP at all? – KamilCuk Jan 01 '23 at 23:45
  • Yes @KamilCuk that's right. To take the technical terminology and grammar out of the equation, I'm essentially looking for: "Every time a new line is added to the HelloWorld.log file in a Linux system, if the line contains 'abc' then send a wget/curl request to www.helloworld.com so I can easily handle this line item accordingly." – Michael Cropper Jan 01 '23 at 23:51
  • I'm confused @KamilCuk - Yes, wget and curl are programs, which implement the HTTP GET or POST (and others) HTTP Request/Response Methods, which includes a URL (i.e. www.example.com) and a URI (i.e. /HelloWorld) and Request/Query Parameters (i.e. ?something=else). - What I'm aiming to achieve is getting said line item from the HelloWorld.log file sent over to the query parameter (POST Request, not GET Request). Does that make sense? – Michael Cropper Jan 01 '23 at 23:56

1 Answer


If you want to execute an HTTP POST request for every line of input, I believe your xargs has been used incorrectly.

tail -f -n 1 /myLog.txt |
grep 'info_im_looking_for' --line-buffered |
while IFS= read -r line; do
    curl -X POST --data-urlencode "line=${line}" "https://www.example.com/HelloWorld"
done
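For illustration, here is roughly what your original pipeline was doing (with `echo` standing in for wget): `read` consumes the first line, then xargs inside the loop body drains the rest of stdin, so the data ends up split between the two and the wget calls see different lines than `${line}` holds.

```shell
# `read` takes line one; xargs then eats lines two and three,
# running its command once per remaining line.
printf 'one\ntwo\nthree\n' |
while IFS= read -r line; do
    echo "read got: $line"
    xargs -d $'\n' -n 1 echo "xargs got:"
done
```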
KamilCuk
  • I've had a very quick play with this logic, and it does appear to be working better than the original I was trying. Can you explain why the xargs stuff was used incorrectly? I was going off the recommendation from @Charles Duffy on that one, so I'm keen to understand why it wasn't working; from what I've read, xargs is a handy Linux programme for handling variables that are passed between programmes. – Michael Cropper Jan 02 '23 at 00:14
  • You can also do `.... | grep .... | xargs -n1 curl -XPOST --data-urlencode line={} https....`. It was used incorrectly because `$line` appeared inside the xargs parameters, so xargs ran its command with the first line baked in as a parameter while consuming input from the second line onwards. The first line was read by `read`; from the second line on they were read by xargs. https://stackoverflow.com/questions/2711001/how-to-apply-shell-command-to-each-line-of-a-command-output . `xargs` is like `while read`: it executes for each line of input. – KamilCuk Jan 02 '23 at 00:18