digitalmars.D.learn - Parse File at compile time, but not embedded
- Pie? (26/26) Jun 06 2016 Is it possible to parse a file at compile time without embedding
- Alex Parrill (8/35) Jun 06 2016 Most compilers, I believe, will not embed a string if it is not
- Pie? (11/55) Jun 06 2016 This doesn't seem to be the case. In a release build, even though
- Mithun Hunsur (15/73) Jun 06 2016 This is definitely possible, but it can depend on your compiler.
- Pie? (9/41) Jun 06 2016 Ok, I will assume it will be able to be removed for release. It
- Alex Parrill (5/17) Jun 07 2016 Accessing a SQL server at compile time seems like a huge abuse of
- cy (4/8) Jun 09 2016 Presumably you wouldn't be building it at all, since this seems
- Joerg Joergonson (4/18) Jun 09 2016 Lol, who says you have access to my software? You know, the
- ketmar (4/7) Jun 10 2016 oh, yeah. it suddenly reminds me about some obscure thing. other
- Joerg Joergonson (20/28) Jun 11 2016 Mines not a build system...
- Alex Parrill (7/15) Jun 10 2016 By "I" I meant "someone new coming into the project", such as a
- Joerg Joergonson (42/59) Jun 10 2016 It seems that dmd does not remove the data if it is used in any
- ketmar (20/23) Jun 11 2016 sure, it has.
- Joerg Joergonson (7/30) Jun 11 2016 This doesn't seem to be the case though in more complex examples
- ketmar (6/9) Jun 13 2016 your code is *completely* different. that's why there are no
- Adrian Matoga (4/14) Jun 10 2016 Just mount a filesystem that uses an SQL database as storage
Is it possible to parse a file at compile time without embedding it into the binary?

I have a sort of "configuration" file that defines how to create some objects. I'd like to be able to read how to create them but not have that config file stick around in the binary. E.g. (simple contrived example follows):

Config.txt:

    x, my1
    y, my1
    z, my2

Code:

    class my1 { }
    class my2 { }

    void parseConfig(A) { .... }

    void main()
    {
        parseConfig("Config.txt");
        // Effectively creates a mixin that mixes in:
        //   auto x = new my1;
        //   auto y = new my1;
        //   auto z = new my2;
    }

If parseConfig uses import("Config.txt") then Config.txt will end up in the binary, which I do not want. It would be easier to be able to use import and strip it out later if possible. Config.txt may contain secure information, which is why it doesn't belong in the binary.
Jun 06 2016
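A minimal sketch of what the question describes, under stated assumptions: the config text is held in an `enum` as a stand-in for `import("Config.txt")` (which would additionally need the `-J` switch), and `parseConfig` here is a hypothetical implementation, not the poster's. It only shows the CTFE-plus-mixin mechanics; whether the config text ends up in the executable still depends on the compiler and has to be checked on the actual binary.

```d
import std.array : split;
import std.string : splitLines, strip;

class my1 { }
class my2 { }

// Assumption: stand-in for import("Config.txt"), so the sketch is self-contained.
enum configText = "x, my1\ny, my1\nz, my2\n";

// Plain string-in, string-out function, so the compiler can run it via CTFE.
string parseConfig(string text)
{
    string code;
    foreach (line; text.splitLines())
    {
        auto fields = line.split(",");
        if (fields.length != 2)
            continue; // skip blank or malformed lines
        code ~= "auto " ~ fields[0].strip() ~ " = new " ~ fields[1].strip() ~ ";\n";
    }
    return code;
}

// static assert forces compile-time evaluation of parseConfig.
static assert(parseConfig(configText) ==
    "auto x = new my1;\nauto y = new my1;\nauto z = new my2;\n");

void main()
{
    // Declares x, y and z as if they had been typed by hand.
    mixin(parseConfig(configText));
    static assert(is(typeof(z) == my2));
    assert(x !is null && y !is null && z !is null);
}
```

Only the generated declarations are needed at run time, which is what makes it plausible for a compiler to drop the source text.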
On Monday, 6 June 2016 at 17:31:52 UTC, Pie? wrote:
> Is it possible to parse a file at compile time without embedding it into the binary? [...]
> Config.txt may contain secure information, which is why it doesn't belong in the binary.

Most compilers, I believe, will not embed a string if it is not used anywhere at runtime. DMD might not though, I'm not sure.

But reading sensitive data at compile-time strikes me as dangerous, depending on your use case. If you are reading sensitive information at compile time, you are presumably going to include that information in your binary (otherwise why would you read it?), and your binary is not secure.
Jun 06 2016
On Monday, 6 June 2016 at 21:31:32 UTC, Alex Parrill wrote:
> Most compilers, I believe, will not embed a string if it is not used anywhere at runtime. DMD might not though, I'm not sure.

This doesn't seem to be the case. In a release build, even though I never "use" the string, it is embedded. I guess this is due to not using enum, but enum seems to be much harder to work with, if not impossible.

> But reading sensitive data at compile-time strikes me as dangerous, depending on your use case. If you are reading sensitive information at compile time, you are presumably going to include that information in your binary (otherwise why would you read it?), and your binary is not secure.

Not necessarily. You chased that rabbit quite far! The data you're reading could contain sensitive information only used at compile time and not meant to be embedded. For example, the file could contain the login and password to an SQL database that you then connect to at compile time, retrieve that information, then disregard the password (it is not needed at run time).
Jun 06 2016
On Monday, 6 June 2016 at 21:57:20 UTC, Pie? wrote:
> This doesn't seem to be the case. In a release build, even though I never "use" the string, it is embedded. I guess this is due to not using enum, but enum seems to be much harder to work with, if not impossible.

This is definitely possible, but it can depend on your compiler. If you use an enum, it'll be treated as a compile-time constant - so if you never store it anywhere (e.g. `enum File = import("file.txt"); string file = File;` is a no-no at global scope), you should be fine.

If you do find yourself in the precarious situation of storing the data, then it's up to your compiler to detect that there are no runtime references to the data and elide it. LDC and GDC most likely do this, but I doubt DMD would.

For safety, you should try and reformulate your code in terms of enums and local variables; this *should* work with DMD, but it's possible it's not smart enough to catch onto the fact that the function is never used at run-time (and therefore does not need to be included in the executable).
Jun 06 2016
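The enum-versus-stored-variable distinction from the post above can be sketched as follows. This is an illustration under assumptions: the literal stands in for `import("file.txt")`, and `countChar`, `fileLength` and `equalsCount` are hypothetical names. It demonstrates the language mechanics only and makes no promise about what a particular compiler leaves in the binary.

```d
import std.stdio : writeln;

// Assumption: stand-in for import("file.txt").
enum File = "secret=hunter2";

// Helper that is only ever called during compilation in this sketch.
size_t countChar(string s, char c)
{
    size_t n;
    foreach (char ch; s)
        if (ch == c)
            ++n;
    return n;
}

// Manifest constants computed at compile time: only the resulting
// numbers are needed at run time, not the text they came from.
enum fileLength = File.length;
enum equalsCount = countChar(File, '=');

// By contrast, a module-level `string file = File;` is a run-time
// variable, so the text has to live in the binary's data.

static assert(fileLength == 14);
static assert(equalsCount == 1);

void main()
{
    writeln(fileLength, " ", equalsCount);
}
```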
On Tuesday, 7 June 2016 at 02:04:41 UTC, Mithun Hunsur wrote:
> This is definitely possible, but it can depend on your compiler. [...]

Ok, I will assume it will be able to be removed for release. It is an easy check (just search whether the binary contains the file's contents). I'm sure an easy fix could be to write 0's over the data in the binary if necessary.

If I use an enum dmd does *not* remove it in release build. I will work on parsing the file using CTFE and hopefully dmd will not try to keep it around, or it can be solved using gdc/ldc or some other method.
Jun 06 2016
If I use an enum dmd DOES remove it in release build. But I'm not sure for the general case yet.
Jun 06 2016
On Tuesday, 7 June 2016 at 04:17:05 UTC, Pie? wrote:
> Ok, I will assume it will be able to be removed for release. It is an easy check (just search whether the binary contains the file's contents). I'm sure an easy fix could be to write 0's over the data in the binary if necessary.

Binaries aren't magical beings; if your string is there you can just check for it as you would any other file:

    grep "mysecret" mybinary
    sed "s/mysecret/garbage/g" mybinary

If your string is very small you may hit a problem though. I know gcc, for example, sometimes maps little strings directly using mov instructions and the numeric value of the string chars. So if your string is very short it may be segmented into words; just adapt your search from there.
Jun 14 2016
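The same check can be sketched in D. This is an illustration, not something from the thread: `containsText` is a hypothetical helper, and the needle is taken from the command line so that the scanner does not itself carry the secret as a literal.

```d
import std.algorithm.searching : canFind;
import std.file : read;
import std.stdio : writeln;
import std.string : representation;

// True if the raw bytes of `needle` occur anywhere in `image`.
bool containsText(const(ubyte)[] image, string needle)
{
    return image.canFind(needle.representation);
}

void main(string[] args)
{
    if (args.length < 3)
    {
        writeln("usage: scan <binary> <text>");
        return;
    }
    // Read the whole executable into memory and search it byte-wise.
    auto image = cast(const(ubyte)[]) read(args[1]);
    writeln(containsText(image, args[2]) ? "found" : "not found");
}
```

A plain byte search like this misses short strings that the compiler has turned into immediate operands, as the post notes. If a match is overwritten in place, the replacement has to be exactly the same length, otherwise every offset after it shifts and the executable breaks.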
On Monday, 6 June 2016 at 21:57:20 UTC, Pie? wrote:
> On Monday, 6 June 2016 at 21:31:32 UTC, Alex Parrill wrote:
>> But reading sensitive data at compile-time strikes me as dangerous, depending on your use case. [...]
> Not necessarily. You chased that rabbit quite far! The data you're reading could contain sensitive information only used at compile time and not meant to be embedded. For example, the file could contain the login and password to an SQL database that you then connect to at compile time, retrieve that information, then disregard the password (it is not needed at run time).

Accessing a SQL server at compile time seems like a huge abuse of CTFE (and I'm pretty sure it's impossible at the moment). Why do I need to install and set up a MySQL database in order to build your software?
Jun 07 2016
On Tuesday, 7 June 2016 at 22:09:58 UTC, Alex Parrill wrote:
> Accessing a SQL server at compile time seems like a huge abuse of CTFE (and I'm pretty sure it's impossible at the moment). Why do I need to install and set up a MySQL database in order to build your software?

Presumably you wouldn't be building it at all, since this seems like a technique to provide obfuscated binaries where people aren't privy to exactly what was used to compile it.
Jun 09 2016
On Tuesday, 7 June 2016 at 22:09:58 UTC, Alex Parrill wrote:
> Accessing a SQL server at compile time seems like a huge abuse of CTFE (and I'm pretty sure it's impossible at the moment). Why do I need to install and set up a MySQL database in order to build your software?

Lol, who says you have access to my software? You know, the problem with assumptions is that they generally make no sense when you actually think about them.
Jun 09 2016
On Thursday, 9 June 2016 at 22:02:44 UTC, Joerg Joergonson wrote:
> Lol, who says you have access to my software? You know, the problem with assumptions is that they generally make no sense when you actually think about them.

oh, yeah. it suddenly reminds me about some obscure thing. other people told me that they were able to solve the same problems with something they called "build system"...
Jun 10 2016
On Friday, 10 June 2016 at 07:03:21 UTC, ketmar wrote:
> oh, yeah. it suddenly reminds me about some obscure thing. other people told me that they were able to solve the same problems with something they called "build system"...

Mine's not a build system... In any case LDC does drop the data so it is ok.

The problem with people is that they are idiots! They make assumptions about other people's stuff without having any clue what actually is going on rather than addressing the real issue. In fact, the thing I'm doing has nothing to do with SQL, security, etc. It was only an example. I just don't want crap in my EXE that shouldn't be there, simple as that. Also, since I'm the sole designer and the software is simple, I have every right to do it how I want.

What's strange, though, is that my little ole app takes 300MB and constantly uses 13% of the CPU... even though all it does is display a few images. This is with an LDC release build. Doesn't seem very efficient. I imagine a similar app in C would take about 1% and 20MB. Hopefully profiling in D isn't as much of a nightmare as setting it up.

BTW, I'm using simpledisplay... I saw that you made a commit or something on github. Are you noticing anything similar in CPU and memory usage?
Jun 11 2016
On Thursday, 9 June 2016 at 22:02:44 UTC, Joerg Joergonson wrote:
> On Tuesday, 7 June 2016 at 22:09:58 UTC, Alex Parrill wrote:
>> Accessing a SQL server at compile time seems like a huge abuse of CTFE (and I'm pretty sure it's impossible at the moment). Why do I need to install and set up a MySQL database in order to build your software?
> Lol, who says you have access to my software? You know, the problem with assumptions is that they generally make no sense when you actually think about them.

By "I" I meant "someone new coming into the project", such as a new hire or someone that will be maintaining your program while you work on other things.

In any case, this is impossible. D has no such concept as "compile-time-only" values, so any usage of a value risks embedding it into the binary.
Jun 10 2016
On Friday, 10 June 2016 at 12:48:43 UTC, Alex Parrill wrote:
> By "I" I meant "someone new coming into the project", such as a new hire or someone that will be maintaining your program while you work on other things.
>
> In any case, this is impossible. D has no such concept as "compile-time-only" values, so any usage of a value risks embedding it into the binary.

It seems that dmd does not remove the data if it is used in any way. When I started using the code, the data then appeared in the binary.

The access to the code is through the following:

    auto SetupData(string filename)()
    {
        enum d = ParseData!(filename);
        //pragma(msg, d);
        mixin(d);
        return data;
    }

The enum d does not have the data in it, as shown by the pragma. ParseData simply determines how to build data depending on external state and uses import(filename) to get the data. Since the code compiles, obviously d is a CT constant.

But after actually using "data" and doing some work with it, the imported file showed up in the binary. Of course, if I just copy the pragma output and paste it in place of the first 3 lines, the external file isn't added to the binary (since there are obviously then no references to it).

So, at least for DMD, it doesn't do a good job at removing "dangling" references. I haven't tried GDC or LDC.

You could probably use something like

    string ParseData(string filename)()
    {
        import std.string : splitLines;

        // import(filename) reads the file at compile time (needs -J)
        auto lines = import(filename).splitLines();
        if (lines.length > 0 && lines[0] == "XXXyyyZZZ33322211")
            return "int data = 3;";
        return "int data = 4;";
    }

So the idea is that if the external file contains XXXyyyZZZ33322211 we create an int with value 3, and if not, with value 4. The point is, though, that `XXXyyyZZZ33322211` should never be in the binary since ParseData is never called at run-time. At compile time, the compiler executes ParseData as CTFE and is able to generate the mixin string as if "int data = 3;" or "int data = 4;" had been typed directly instead.

The only difference between my code and the above is the generated string that is returned.

I'm going to assume it's a dmd thing for now until I'm able to check it out with another compiler.
Jun 10 2016
On Friday, 10 June 2016 at 18:47:59 UTC, Joerg Joergonson wrote:
> On Friday, 10 June 2016 at 12:48:43 UTC, Alex Parrill wrote:
>> In any case, this is impossible. D has no such concept as "compile-time-only" values, so any usage of a value risks embedding it into the binary.

sure, it has.

    template ParseData (string text) {
      private static enum Key = "XXXyyyZZZ33322211\n";
      private static enum TRet = "int data = 3;";
      private static enum FRet = "int data = 4;";
      static if (text.length >= Key.length) {
        static if (text[0..Key.length] == Key)
          enum ParseData = TRet;
        else
          enum ParseData = FRet;
      } else {
        enum ParseData = FRet;
      }
    }

    void main () {
      mixin(ParseData!(import("a")));
    }

look, ma, no traces of our secret key in binary! and no traces of `int data` declaration too!
Jun 11 2016
On Saturday, 11 June 2016 at 13:03:47 UTC, ketmar wrote:
> sure, it has. [...]
> look, ma, no traces of our secret key in binary! and no traces of `int data` declaration too!

This doesn't seem to be the case though in more complex examples ;/ enums seem to be compile-time only under certain conditions. My code is almost identical to what you have written, except ParseData generates a more complex string and I do reference parts of the "Key" in the generation of the code. It's possible DMD keeps the full code around because of this.
Jun 11 2016
On Sunday, 12 June 2016 at 01:39:11 UTC, Joerg Joergonson wrote:
> This doesn't seem to be the case though in more complex examples ;/

it is.

> My code is almost identical to what you have written

your code is *completely* different. that's why there are no traces of CTFE values in my sample. it's not that hard to find out that my code has no functions at all, so no code for 'em can be generated.
Jun 13 2016
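The difference between the two approaches can be put side by side. This is a sketch under assumptions: `fileText` stands in for `import("a")`, and `parseWithFunction` and `ParseWithTemplate` are hypothetical names. The function variant is ordinary code that the compiler may also emit for run-time use, key literal included; the template variant consists only of declarations resolved during compilation. Which bytes actually reach the executable still has to be verified per compiler.

```d
import std.algorithm.searching : startsWith;

private enum key = "XXXyyyZZZ33322211\n";

// Assumption: stand-in for import("a").
enum fileText = "XXXyyyZZZ33322211\nrest of file";

// Variant A: a normal function. It works in CTFE, but it is also a
// run-time function, so its body (which compares against `key`) may be
// compiled into the object file.
string parseWithFunction(string text)
{
    if (text.startsWith(key))
        return "int data = 3;";
    return "int data = 4;";
}

// Variant B: a template using static if. There is no function body of
// our own here, only a compile-time choice between two manifest constants.
template ParseWithTemplate(string text)
{
    static if (text.startsWith(key))
        enum ParseWithTemplate = "int data = 3;";
    else
        enum ParseWithTemplate = "int data = 4;";
}

static assert(parseWithFunction(fileText) == "int data = 3;");
static assert(ParseWithTemplate!fileText == "int data = 3;");
static assert(ParseWithTemplate!"something else" == "int data = 4;");

void main()
{
    mixin(ParseWithTemplate!fileText);
    assert(data == 3);
}
```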
On Tuesday, 7 June 2016 at 22:09:58 UTC, Alex Parrill wrote:
> Accessing a SQL server at compile time seems like a huge abuse of CTFE (and I'm pretty sure it's impossible at the moment). [...]

Just mount a filesystem that uses an SQL database as storage (query can be encoded in file path) and you have it. Whether it's a good idea is another story.
Jun 10 2016