
I am seeing strange behavior from git clone: some of the files in a git repository hosted on Bitbucket are modified right after git clone. The problem looks similar to ones reported in other questions, but it is not quite the same.

In my case, only a couple of files out of about two thousand are modified. I detected the modification by running git status right after git clone. The modified files were originally ASCII text files, but after cloning they are detected as binary, as the file command shows:

norio@machine-original $ file -bi t_pot_2e_fft002.f90
text/plain; charset=us-ascii

norio@machine-new $ file -bi t_pot_2e_fft002.f90
application/octet-stream; charset=binary

where t_pot_2e_fft002.f90 is a file that was found to be modified.

In these modified files, a few, but not all, of the commas (,), spaces, and underscores (_) were replaced with non-ASCII characters, but I can still read most of each file with less or emacs.
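One way to pin down which bytes changed is to count and list the non-ASCII bytes in a suspect file. The following is a minimal sketch, assuming a POSIX shell with tr and wc, plus a grep that supports -a (GNU grep does). The function names count_non_ascii and show_non_ascii_lines are made up for illustration and are not part of git.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# count_non_ascii FILE: print the number of bytes >= 0x80 in FILE.
count_non_ascii() {
    # Delete every ASCII byte (octal 000-177); whatever is left is non-ASCII.
    n=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c)
    # Arithmetic expansion strips the padding some wc versions add.
    echo $((n))
}

# show_non_ascii_lines FILE: print "line-number:text" for each line that
# contains a byte outside printable ASCII and whitespace.
show_non_ascii_lines() {
    # -a forces text mode so grep does not just report "Binary file matches".
    LC_ALL=C grep -an '[^ -~[:space:]]' "$1"
}

# Usage (file name taken from the question):
#   count_non_ascii t_pot_2e_fft002.f90
#   show_non_ascii_lines t_pot_2e_fft002.f90
```

A count of 0 means the file is pure ASCII; any other value gives the number of bytes to look for in the listed lines.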

I repeated the clone 6 times into different local directories and saw the modification 2 times. The other 4 times, there was no modification. In the 2 cases where files were modified, the set of modified files differed from one case to the other.

I worked on repositories on machine-original and machine-original2 (to which I no longer have access), pushing to and fetching from a remote repository on Bitbucket. Now I am cloning this Bitbucket repository to machine-new. The version of git is 1.9.1 on machine-original and 2.14.1 on machine-new. (Edit: machine-original runs Ubuntu 14.04 and machine-new runs Ubuntu 17.10.)

I had core.filemode=true on machine-original until I noticed this problem. Then I changed it to false, but I do not know how to propagate the effect to the remote repository: I ran git push, but only got Everything up-to-date.

I do not have the .gitattributes file mentioned in an answer to one of those questions.
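Besides checking for a .gitattributes file by hand, Git can be asked directly which attributes it applies to a path, which also covers attributes set in .git/info/attributes. This is a small sketch, assuming it is run inside the work tree of the clone. The function name show_attributes is hypothetical.

```shell
#!/bin/sh
# show_attributes PATH: hypothetical helper, run inside the work tree.
show_attributes() {
    # Lists every attribute Git applies to PATH, one per line,
    # in the form "path: attribute: value". Prints nothing if none are set.
    git check-attr -a -- "$1"
}

# Usage (file name taken from the question):
#   show_attributes t_pot_2e_fft002.f90
```

Empty output means no attribute applies to that file.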

Can anyone explain why these non-reproducible modifications are made upon cloning? Is it safe to keep using a cloned repository if no modification was detected by git status right after git clone?
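Since git status relies partly on cached file metadata, a stricter check is to hash a working-tree file explicitly and compare it with the blob recorded in HEAD. The sketch below assumes it is run from the top level of the clone. The function name file_matches_head is hypothetical.

```shell
#!/bin/sh
# file_matches_head PATH: succeed (exit 0) if the working-tree file hashes to
# the same blob ID that HEAD records for PATH. PATH is relative to the
# repository root. Returns 2 if either ID cannot be computed.
file_matches_head() {
    committed=$(git rev-parse "HEAD:$1") || return 2
    # hash-object applies the same input conversion that `git add` would.
    actual=$(git hash-object -- "$1") || return 2
    [ "$committed" = "$actual" ]
}

# Usage, from the top of a fresh clone:
#   git fsck --full        # checks the validity of the stored objects
#   file_matches_head t_pot_2e_fft002.f90 && echo "matches HEAD"
```

git fsck covers the object database, while the hash comparison covers what was written to the working tree.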

norio

1 Answer


Since the output of hexdump -bc t_pot_2e_fft002.f90 differs between the two machines, the bytes in the two copies are different, which suggests the encodings differ. Perhaps the file was pushed from a machine that used one encoding, and it is written with a different encoding when you clone the repo onto another machine.
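Rather than comparing two hexdump listings by eye, the differing bytes can be listed directly. This is a minimal sketch, assuming both copies of the file have been brought to the same machine. The function name list_byte_diffs and the file names good.f90 and bad.f90 are hypothetical.

```shell
#!/bin/sh
# list_byte_diffs GOOD BAD: print one line per differing byte.
list_byte_diffs() {
    # cmp -l prints the 1-based decimal offset of each difference, followed by
    # the byte value in GOOD and the byte value in BAD, both in octal.
    # cmp exits with status 1 when the files differ, which is expected here.
    cmp -l "$1" "$2"
}

# Usage:
#   list_byte_diffs good.f90 bad.f90
```

Each output line shows where a byte changed and what it changed from and to, which makes it easier to see whether the changes follow a pattern.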

For more details about encodings, you can also refer to Dealing with inconsistent or corrupt character encodings.

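Setting core.filemode to false does not change the files themselves; it only tells Git to ignore the executable bit of files in the working tree. It is a local configuration setting, so there is nothing to push, which is why git push reported Everything up-to-date. In your situation, you may still want to set core.filemode to false: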

core.fileMode

Tells Git if the executable bit of files in the working tree is to be honored.

Some filesystems lose the executable bit when a file that is marked as executable is checked out, or checks out a non-executable file with executable bit on. git-clone(1) or git-init(1) probe the filesystem to see if it handles the executable bit correctly and this variable is automatically set as necessary.

A repository, however, may be on a filesystem that handles the filemode correctly, and this variable is set to true when created, but later may be made accessible from another environment that loses the filemode (e.g. exporting ext4 via CIFS mount, visiting a Cygwin created repository with Git for Windows or Eclipse). In such a case it may be necessary to set this variable to false. See git-update-index(1).

The default is true (when core.filemode is not specified in the config file).

For more details, you can refer to core.fileMode.
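As a rough sketch of how the setting can be inspected and changed, the helpers below wrap git config. They assume they are run inside the repository, and the function names show_filemode and set_filemode_false are hypothetical. The value is stored in the clone's own .git/config, which push and fetch do not transfer.

```shell
#!/bin/sh
# show_filemode: print the current core.fileMode value for this repository.
show_filemode() {
    # `git config --get` exits non-zero when the key is unset, in which case
    # the documented default (true) applies.
    git config --get core.fileMode || echo "true (default, not set)"
}

# set_filemode_false: tell Git to ignore executable-bit changes in this clone.
set_filemode_false() {
    # Without --global, the value is written to this repository's .git/config.
    git config core.fileMode false
}

# Usage:
#   show_filemode
#   set_filemode_false
```

Because the setting is per clone, it has to be repeated on each machine where it is wanted.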

Marina Liu