Proposal of File String Literals

Document No.: Dnnnn
Project: Programming Language C++
Author: Andrew Tomazos <[email protected]>
Date: 2016-04-21
Summary

We propose a new kind of string literal called a file string literal. A file string literal is like a raw string literal, except that the content of the string is stored in a separate, dedicated source file rather than inline in the host source file. The path of the file is specified, like that of an included header, as part of the file string literal. The content of the source file becomes the content of the string literal, verbatim.
Example

The content of datafile.txt:

    The quick brown fox jumps over the lazy dog.

The content of main.cc:

    #include <iostream>

    int main() { std::cout << F"datafile.txt"; }

The program outputs:

    The quick brown fox jumps over the lazy dog.
Wording

Modify [lex.phases]/4:

    Preprocessing directives are executed, macro invocations are expanded, file string literals are replaced, and _Pragma unary operator expressions are executed.
Add new section 16.10:
16.10 File string literals [cpp.filestrlit]

A token of the form:

    encoding-prefix_opt F " q-char-sequence " ud-suffix_opt

is a file string literal. [Note: A file string literal holds the same content as the source file identified by the q-char-sequence. -- end note]

A file string literal EF"Q"U, where E is the encoding-prefix (or an empty source character sequence if not present), Q is the q-char-sequence, and U is the ud-suffix (or an empty source character sequence if not present), is equivalent, by definition, to a preprocessing directive:

    # include "Q"

except that the contents of the included source file shall be augmented during replacement in the following manner: Let C be the would-be replaced content of the source file named by Q. Let D be a valid d-char-sequence that does not appear in C. The file string literal shall be replaced by the raw string literal ER"D(C)D"U.
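The replacement rule above can be sketched in ordinary C++. The following is only an illustration under stated assumptions, not part of the proposed wording: the function names pick_delimiter and replace_file_string_literal are hypothetical, the file content C is passed in as a string rather than read from disk, and the candidate delimiters "D0", "D1", ... are an arbitrary choice. A real implementation would perform this step inside the preprocessor during translation phase 4.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: find a delimiter D that does not appear in `content`.
// Candidates are "D0", "D1", "D2", ... A content of length L can contain at
// most L distinct candidates, so the loop ends after at most L + 1 steps.
std::string pick_delimiter(const std::string& content) {
    for (std::size_t n = 0;; ++n) {
        std::string candidate = "D" + std::to_string(n);
        // A d-char-sequence may be at most 16 characters long.
        assert(candidate.size() <= 16);
        if (content.find(candidate) == std::string::npos) {
            return candidate;
        }
    }
}

// Hypothetical helper: build the source text of the raw string literal
// E R"D(C)D" U from the encoding-prefix E, the file content C, and the
// ud-suffix U. E and U may be empty.
std::string replace_file_string_literal(const std::string& encoding_prefix,
                                        const std::string& content,
                                        const std::string& ud_suffix) {
    const std::string d = pick_delimiter(content);
    return encoding_prefix + "R\"" + d + "(" + content + ")" + d + "\"" +
           ud_suffix;
}
```

With the content of datafile.txt from the example and no prefix or suffix, this sketch yields the token R"D0(The quick brown fox jumps over the lazy dog.)D0", which is the raw string literal that would take the place of F"datafile.txt".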
Motivation

String literals are heavily used to compile-in sequences of text of some language (e.g. natural language, declarative language, programming language) into C++ programs. These sequences are then either stored in the read-only section of the program image or, more recently, further interpreted and processed during compilation using constexpr programming.

Raw string literals have become a popular kind of string literal to use when the sequence of text is multi-line. Because of their verbatim nature, the text does not need to be obfuscated by preprocessor string literal concatenation and by redundantly encoding newlines as "\n".

Sometimes it is useful to dedicate a whole source file to holding the content of a single string literal. Organizing a codebase as a filesystem of small source files is accepted good practice, as it is generally easier to navigate than scrolling through long source files. Further, having a dedicated source file for a string literal means that tools (for example language-aware editors) that know about the language of the string literal, but don't know about C++, can be used on that source file.
There is currently no standard mechanism to dedicate a source file to a single string literal in this manner. A variety of non-standard mechanisms are used instead. These techniques involve taking the original source file and translating it into a generated C++ source file. For simple examples there are tools like XPM, incbin, and "xxd -i".

These non-standard mechanisms suffer from a lot of problems. In particular, they are difficult to use and configure. The build system needs novel configuration, and the name and format of the generated header file need to be determined. This layer of indirection is unnecessary. A standard mechanism can perform the operation directly during translation in a simple, portable way.

We propose a standard mechanism to achieve this task. As can be seen in the proposed wording, the mechanism simply combines two existing standard mechanisms, source file inclusion and raw string literals, to produce file string literals. The reuse of the two existing mechanisms means the feature is very easy to understand and use, fits neatly into the ecosystem, and is highly efficient, because the translation of the dedicated source file into the string literal takes place as part of normal translation. We also get "for free" the encoding-prefix and user-defined literal mechanisms, and the encoding rules about source files.
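To make the extra layer of indirection concrete, the generation step that such non-standard tools perform can be sketched as follows. This is a minimal illustration, not the output format of any particular tool: the function name generate_array_source is hypothetical, the layout is only loosely modeled on the kind of array definition that "xxd -i" emits, and the input is assumed to be non-empty. A build system would have to run a step like this before compilation and then arrange for the generated file to be included.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical generator: translate the bytes of a data file into C++ source
// text that defines a byte array and its length. Assumes `bytes` is non-empty,
// since an empty initializer list would not give the array a valid size.
std::string generate_array_source(const std::string& variable_name,
                                  const std::string& bytes) {
    std::string out = "unsigned char " + variable_name + "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        char hex[8];
        // Format each byte as a two-digit hexadecimal literal such as 0x48.
        std::snprintf(hex, sizeof hex, "0x%02x",
                      static_cast<unsigned>(
                          static_cast<unsigned char>(bytes[i])));
        if (i != 0) {
            out += ", ";
        }
        out += hex;
    }
    out += "};\n";
    out += "unsigned int " + variable_name + "_len = " +
           std::to_string(bytes.size()) + ";\n";
    return out;
}
```

The generated text loses the readable form of the original file and introduces names (here the array and its length variable) that the host source file must know about, which is the configuration burden described above.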
References

https://groups.google.com/a/isocpp.org/forum/#!msg/std-proposals/b6ncBojU8wI/blx1GlLqUAUJ
https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/CDbPC2YgiHE