
I'm building a server application that implements the TFTP protocol. I'm having a hard time understanding the difference between ASCII format and binary format (netascii and octet) in TFTP, and how I should read files differently as the protocol states.

I know that an ASCII char can be represented with a single byte. So I don't understand the difference between reading in ASCII mode (1 byte per character) and binary mode (1 raw byte).

I can read the file with the ios::binary flag for binary mode (octet in TFTP) and without it for ASCII (netascii in TFTP), but I really don't understand the difference between reading files in these two ways (either way I end up with an array of bytes).
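For reference, this is roughly how I'm reading the file in the two cases (the filename is just a placeholder):

    #include <fstream>
    #include <iterator>
    #include <vector>

    int main() {
        // Octet (binary) mode: open with ios::binary and take the bytes as-is.
        std::ifstream bin("somefile", std::ios::binary);
        std::vector<char> octets((std::istreambuf_iterator<char>(bin)),
                                 std::istreambuf_iterator<char>());

        // Netascii mode: open without ios::binary.
        std::ifstream txt("somefile");
        std::vector<char> chars((std::istreambuf_iterator<char>(txt)),
                                std::istreambuf_iterator<char>());
    }

Either way I just end up with a buffer of bytes, which is why I don't see what the mode is supposed to change.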

If someone can help me understand, I'd really appreciate it.

The TFTP protocol specification: http://www.rfc-editor.org/rfc/rfc1350.txt

The part that I don't understand is this one:

Three modes of transfer are currently supported: netascii (This is ascii as defined in "USA Standard Code for Information Interchange" [1] with the modifications specified in "Telnet Protocol Specification" [3].) Note that it is 8 bit ascii. The term "netascii" will be used throughout this document to mean this particular version of ascii.); octet (This replaces the "binary" mode of previous versions of this document.) raw 8 bit bytes; mail, netascii characters sent to a user rather than a file. (The mail mode is obsolete and should not be implemented or used.) Additional modes can be defined by pairs of cooperating hosts.

Francesco Belladonna

1 Answer


There are two passages which can help clarify the purpose of netascii in RFC-1350/TFTP:

netascii (This is ascii as defined in "USA Standard Code for Information Interchange" [1] with the modifications specified in "Telnet Protocol Specification" [3].)

The "Telnet Protocol Specification" is RFC-764, and it describes the interpretation of various ASCII codes for use on the "Network Virtual Terminal". So, netascii would follow those interpretations (which include that lines must be terminated with a CR/LF sequence).

and:

A host which receives netascii mode data must translate the data to its own format.

So a host that used EBCDIC as its native encoding, for example, might be expected to translate netascii to that encoding, but would leave "octet" data alone.

If you're implementing the TFTP server on a Unix (or other) system that uses LF for line endings, you'd be expected to add the CR for netascii transfers (as well as convert actual CR characters in the file to CR/NUL sequences).
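As a rough sketch (not taken from any particular implementation), the netascii translation on the sending side of a Unix host could look something like this, assuming the file has already been read into a buffer as raw bytes:

    #include <string>

    // Translate a buffer that uses Unix (LF) line endings into netascii:
    //   LF      -> CR LF   (netascii line ending)
    //   bare CR -> CR NUL  (per the Telnet NVT rules)
    // All other bytes are copied through unchanged.
    std::string to_netascii(const std::string& native)
    {
        std::string out;
        out.reserve(native.size());
        for (char c : native) {
            if (c == '\n') {
                out += '\r';
                out += '\n';
            } else if (c == '\r') {
                out += '\r';
                out += '\0';
            } else {
                out += c;
            }
        }
        return out;
    }

A receiving host does the reverse (CR/LF becomes its native line ending, CR/NUL becomes a lone CR), and for octet transfers no translation is done at all.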

Michael Burr
  • I think your answer is good, I just don't understand why CR should be converted to CR/NUL instead of CR/LF – Francesco Belladonna Aug 20 '11 at 10:11
  • 2
    @Fire-Dragon: a CR isn't a line ending on Unix systems (it might be, for example, on pre-OS/X Macs). And the telnet spec says that a CR that's not part of a newline needs to be CR/NUL. – Michael Burr Aug 20 '11 at 17:07
  • Oh thanks a lot, these explain why I wasn't understanding it. – Francesco Belladonna Oct 12 '13 at 21:50
  • A text file on your host system might be encoded as Unicode/UTF-8. When the TFTP server receives ASCII text, it has to convert it to Unicode for storage; when the server sends a text file, it has to convert it to ASCII (e.g. by ignoring Unicode characters outside of the ASCII range or expanding umlauts into a sequence of ASCII chars; ä -> ae, ç -> c..). – Grandswiss Jun 01 '20 at 06:32