I have this content in a JSON text file:

{"cdate":"2020-09-01T23:11:46-02:00","email":"example_email@gmail.com","phone":"+66 9988 1234","firstName":"John","lastName":"Smith","orgid":"0","orgname":"","segmentio_id":"","bounced_hard":"0","bounced_soft":"0","bounced_date":"0000-00-00","ip":"1234567","ua":"","hash":"jfepfjepjfewfe87","socialdata_lastcheck":"0000-00-00 00:00:00","email_local":"","email_domain":"","sentcnt":"27","rating_tstamp":"2019-09-22","gravatar":"1","deleted":"0","anonymized":"0","adate":"2020-08-21T04:11:09-05:00","udate":"2020-02-01T21:01:21-06:00",

This text is all on one line.

I want to extract three values: 'email', 'firstName' and 'lastName'. I used cut -d ":" -f 6,8,9.

This provides: "example_email@gmail.com","phone":"John","lastName":"Smith","orgid". I can then clean this up.

The problem is that the file contains hundreds of similar entries, and the fields do not fall at the same positions in each one. So I can't simply shift the field numbers for the next use of cut by +50 (or whatever).

I've looked at grep but I cannot figure out how to achieve my goal. Ideally I want to extract:

example_email@gmail.com John Smith

I don't care if it's on one line or three separate lines.

Thanks!

Wiktor Stribiżew

1 Answer

The best approach is probably to use a JSON parser, which takes advantage of the format. But for the fun of it, this may work:

grep -o '"\(email\|firstName\|lastName\)":"[^"]*"' input_file
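The command above prints each match as a "key":"value" pair. To get only the values, as asked in the question, the pairs can be piped through sed. The following is a minimal sketch, not a robust parser: the function name extract_fields is made up for illustration, it uses grep -E so the alternation does not depend on the GNU-specific \| syntax, and it assumes the values contain no escaped double quotes.

```shell
# Hypothetical helper: print the values of "email", "firstName" and
# "lastName" found in the file given as $1, one value per line.
# Assumption: no value contains an escaped double quote (\").
extract_fields() {
    # Step 1: keep only the "key":"value" pairs for the wanted keys.
    # Step 2: strip the key and the surrounding quotes from each pair.
    grep -oE '"(email|firstName|lastName)":"[^"]*"' "$1" |
        sed -E 's/^"[^"]*":"(.*)"$/\1/'
}
```

Calling extract_fields input_file prints the three values on separate lines; appending | paste -d ' ' - - - would join each group of three onto a single line.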

A dedicated JSON parser is still worth checking out for proper tooling.

perreal