I asked a similar(ish) question here, but I feel this one is different enough (and not exclusive to R) to warrant its own question.

In short, I need a .txt file with the following format:

Column 1 Column 2

There are two specifications:

  1. Column 1 and Column 2 must be aligned. This means that the first character of every value in a column must line up vertically with the other values in that column.

  2. Only one character should separate Columns 1 and 2, regardless of the length of the value in Column 1 (which varies).

Using a text-comparison website, I found that a 'successful' file (left) and an 'unsuccessful' file (right, which separates the two columns with a variable number of 'space' characters) differ in one respect: the successful file separates the two columns with some sort of single character. The apparent width of that character varies with the length of the value in Column 1.

[Image: text compare output]

Is there any way in which I can iterate through a .txt file that separates the two columns with a variable number of 'space' characters (basically the file on the right) and replace it with the 'space maker' on the left?

xfrostyy
  • You're probably talking about a tab. But anyway, what you want is to **display** your data in a specific way, and your file contains data as it is **stored** - these are two different things. – Thierry Lathuille Jul 23 '21 at 17:34
  • So are you asking for a solution in Python or in R? – fsimonjetz Jul 23 '21 at 17:35
  • @fsimonjetz either would work. The original dataframe is in R but I am more comfortable in Python. – xfrostyy Jul 23 '21 at 17:38
  • @ThierryLathuille Yes - that is a constructive way of looking at the problem. Is there any way for me to fix the display of the data? It may be helpful to look at the solution (by njp) at the following link to see how I am currently outputting the data: https://stackoverflow.com/questions/68492720/outputting-an-r-dataframe-to-a-txt-file-align-positive-and-negative-values – xfrostyy Jul 23 '21 at 17:40

1 Answer

The site you need this for, according to your other questions, clearly specifies that "Multi-column files must be tab-delimited". So all you need to do is replace the run of spaces between the values with a single tab character (\t). Whether the columns look visually aligned doesn't matter, as Thierry Lathuille commented; that is just how your text editor renders tabs. (In most editors you can change the tab width in the settings so the columns will "look aligned".)
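To illustrate that alignment is only a matter of display, here is a small sketch using Python's built-in str.expandtabs(), which mimics how an editor renders a tab at a given tab width. The sample rows and the render() helper are made up for this example and are not taken from your file.

```python
# Hypothetical sample rows: exactly one tab between the two columns.
rows = ["A\t1.5", "LongerName\t-2.25"]


def render(line, tabsize):
    """Mimic how an editor displays a tab-delimited line at a given tab width."""
    return line.expandtabs(tabsize)


# The stored data never changes; only the rendering does.
for tabsize in (4, 16):
    print(f"tab width {tabsize}:")
    for line in rows:
        print(render(line, tabsize))
```

With a tab width of 4 the second column starts at different positions in the two rows, while with a width of 16 it lines up, even though both files contain the same single tab per row.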

This is a (deliberately verbose) solution in Python. The key step is split(), which, called without arguments, splits on any run of whitespace and so discards all the spaces between the values; each line is then re-assembled with '\t'.join().

new_data = list()

with open("data.txt") as infile:
    for row in infile:
        row = row.strip()
        
        if row: # skips empty rows, if any occur
            row = '\t'.join(row.split())
            new_data.append(row)

with open("new_data.txt", mode="w") as outfile:
    print('\n'.join(new_data), file=outfile)
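One caveat with a plain split(): it would also break apart a value that itself contains spaces. The question does not say whether that can happen, so the following is only a sketch under that assumption. The helper name to_tab_delimited and the sample rows are hypothetical; split(maxsplit=...) is the standard str.split parameter that limits the number of splits.

```python
def to_tab_delimited(line, columns=2):
    """Join the first `columns` whitespace-separated fields with single tabs.

    Limiting the number of splits keeps any spaces inside the last column.
    """
    return "\t".join(line.strip().split(maxsplit=columns - 1))


# Made-up rows with a variable number of spaces between the columns.
sample = ["A        1.5", "LongerName   -2.25"]
converted = [to_tab_delimited(row) for row in sample]

# Sanity check: every converted row contains exactly one tab.
for row in converted:
    assert row.count("\t") == 1, repr(row)
    print(repr(row))
```

Printing with repr() makes the separator visible as \t, which is a quick way to confirm that exactly one tab separates the columns.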
fsimonjetz