I am studying dependency parsing using the CoNLL-U format. I can find out how to work with a CoNLL-U parser or a token list, but I cannot find how to convert a plain-text sentence into CoNLL-U format.
I tried the conversion code from https://github.com/datquocnguyen/jPTDP:
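def conllConverter(path):
    # Write the output next to the input file, with a ".conllu" suffix
    writer = open(path + ".conllu", "w", encoding="utf-8")
    lines = open(path, "r", encoding="utf-8").readlines()
    for line in lines:
        tok = line.strip().split()
        if not tok:
            # Blank input line: keep it as a sentence separator
            writer.write("\n")
        else:
            # Token IDs restart at 1 for every sentence
            count = 0
            for word in tok:
                count += 1
                # Only ID and FORM are filled; the other 8 columns are "_"
                writer.write(str(count) + "\t" + word + "\t" + "\t".join(['_'] * 8) + "\n")
            writer.write("\n")
    writer.close()

if __name__ == "__main__":
    conllConverter("test")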
The "test" file, which is the input of the conllConverter(path) function, is a text file (an "_io.TextIOWrapper" object once opened) containing the sentences I want to convert to CoNLL-U, such as:
1. Completely frustrating experience .
2. Paid extra money to get the air conditioner with connectivity .
However, when I run the conllConverter(path) function defined above, the output contains the 10 columns of the CoNLL-U format, but only the token index and the raw word are filled in, without any extra information.
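To make the observed output concrete, here is a minimal sketch of what this kind of conversion produces for the first example sentence. The function name `tokens_to_conllu_rows` is hypothetical and not part of jPTDP; it only assumes whitespace tokenization. It fills the first two CoNLL-U columns (ID and FORM) and leaves the remaining eight (LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) as "_", which is why no annotation shows up.

```python
def tokens_to_conllu_rows(sentence):
    """Return one tab-separated CoNLL-U row per whitespace token.

    Hypothetical helper for illustration: only ID and FORM are filled,
    the other eight columns are "_" placeholders.
    """
    rows = []
    for idx, form in enumerate(sentence.split(), start=1):
        rows.append("\t".join([str(idx), form] + ["_"] * 8))
    return rows


rows = tokens_to_conllu_rows("Completely frustrating experience .")
print("\n".join(rows))
```

Each row has 10 tab-separated fields, so the file is structurally valid as a CoNLL-U skeleton, but the lemma, tag, and dependency columns stay empty until a tagger or parser fills them.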
In conclusion, how can I convert a text sentence into CoNLL-U format?