Issue 2: *A vs *W functions for ReadString/WriteString
Back when I first supplied a patch for more types, functions were using the *A api calls. Now they use the *W calls. ReadString and WriteString do convert the PWideChar using UTF8Encode before returning them as string. According to this (https://bugs.freepascal.org/view.php?id=34876), this is wrong, since the FCL should not use UTF8. How would I be able to get Unicode values from the registry at all otherwise?
Issue 3: *W function for ReadStringList/WriteStringList
Same issue as above: ReadStringList / WriteStringList use the *W functions but do not convert from PWideChar to String, resulting in broken TStringList content with #0 inserted between each character (since these are Latin characters stored as UTF-16). My patch was to use UTF8 here as well, but since UTF8 is not wanted, should I supply a patch that converts it to plain AnsiString, losing content, or is there any other way to retrieve Unicode registry content through TRegistry?
@Remy Lebeau: I would favor a WideString version as well - but that would break the Delphi 7 compatibility.
For standard Latin characters, every second byte is #0, which for REG_MULTI_SZ is a line separator, so I get one character per list item.
My main question is how to submit updates that won't go to waste and that comply with the project's conventions.
Any idea how I can speed this up?
Issue 2: *A vs *W functions for ReadString/WriteString
Instead of:
Issue 3: *W function for ReadStringList/WriteStringList
Instead of:
use
May be better:
Data := UnicodeString(StringReplace(List.Text, LineEnding, #0, [rfReplaceAll]) + #0#0);
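For comparison, here is a minimal sketch of the same idea written as an explicit loop, so the result does not depend on how List.Text terminates its last line. ListToMultiSz is a hypothetical helper name, not part of TRegistry, and the sketch only builds the buffer; it does not touch the registry.

```pascal
program BuildMultiSz;
{$mode objfpc}{$H+}
{$assertions on}

uses
  Classes, SysUtils;

const
  NUL: WideChar = #0;

// Hypothetical helper (not part of TRegistry): appends a NUL character
// after every item, then one more NUL that closes the whole list.
function ListToMultiSz(List: TStrings): UnicodeString;
var
  i: Integer;
begin
  Result := '';
  for i := 0 to List.Count - 1 do
    Result := Result + UnicodeString(List[i]) + NUL;
  Result := Result + NUL;
end;

var
  List: TStringList;
  Data: UnicodeString;
begin
  List := TStringList.Create;
  try
    List.Add('Hello');
    List.Add('World');
    Data := ListToMultiSz(List);
    // 5 + 1 + 5 + 1 + 1 UTF-16 code units
    Assert(Length(Data) = 13);
    WriteLn('Characters: ', Length(Data),
            ', bytes: ', Length(Data) * SizeOf(WideChar));
  finally
    List.Free;
  end;
end.
```

Note that an empty item in the list would produce two consecutive NUL characters, which a reader of the value would take as the end of the list.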
Not really. Delphi 7 has WideString, and conversions between AnsiString and WideString.
That is true only if you store characters U+0000..U+00FF in a Unicode UTF-16 string. That is not true at all for ANSI/UTF-8 strings.
You misunderstand how REG_MULTI_SZ actually works. ...
List items are separated by a NULL CHARACTER, with an extra NULL CHARACTER at the end of the list. The size of each character is 1 or 2 bytes, determined by whether you use an ANSI or Unicode API to write/read the REG_MULTI_SZ data.
If you need to store string lists with embedded NULL CHARACTERS, you can always use REG_BINARY instead.
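A small sketch of that layout for the Unicode case: the buffer is split at NUL characters (16-bit code units), not at NUL bytes. MultiSzToList is a hypothetical helper used only for illustration; the data is built in memory rather than read from the registry.

```pascal
program SplitMultiSz;
{$mode objfpc}{$H+}
{$assertions on}

uses
  Classes, SysUtils;

const
  NUL: WideChar = #0;

// Hypothetical helper: splits a UTF-16 REG_MULTI_SZ buffer at NUL
// characters and stops at the empty string that marks the end of the list.
procedure MultiSzToList(const Data: UnicodeString; List: TStrings);
var
  i, Start: Integer;
begin
  List.Clear;
  Start := 1;
  for i := 1 to Length(Data) do
    if Data[i] = NUL then
    begin
      if i = Start then
        Break; // empty item: end of list
      List.Add(AnsiString(Copy(Data, Start, i - Start)));
      Start := i + 1;
    end;
end;

var
  Data: UnicodeString;
  List: TStringList;
begin
  // "Hello" NUL "World" NUL NUL, as a Unicode API would deliver it
  Data := UnicodeString('Hello') + NUL + UnicodeString('World') + NUL + NUL;
  List := TStringList.Create;
  try
    MultiSzToList(Data, List);
    Assert(List.Count = 2);
    Assert(List[0] = 'Hello');
    Assert(List[1] = 'World');
    WriteLn(List.CommaText);
  finally
    List.Free;
  end;
end.
```

The explicit AnsiString conversion keeps only what the current codepage can represent, which is exactly the content loss discussed above; the splitting itself is independent of that choice.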
Any idea how I can speed this up?
Whine about it on the forum, apparently. r41267
My apologies, in my memory it was a pre-String=WideString version and the RTL was still AnsiString back then. Should've checked first though.
You misunderstand how REG_MULTI_SZ actually works. ...
Fully understood here :) Since TRegistry was using *W, I was just ignoring the *A case.
If I post it in the bugtracker, the admins will ignore it as usual, because it contains a lot of changes :'(
Fixed: TRegistry.ReadString uses string(U) instead of Utf8Encode(U) to work with or without LCL
Why? This already works without LCL. The string returned has codepage UTF8, and assigning it to another string will trigger the conversion from UTF8 to the current codepage.
This shows wrong names with trunk, but correct ones with my patch.
The "some not ansi key" string must be defined somewhere in your sourcecode.
It's in the registry, not in code! The key may be ANSI, but if the value names are not, the error appears.
C:\Users\Bart\LazarusProjecten\bugs\Console\registry>notascii
S=""
C:\Users\Bart\LazarusProjecten\bugs\Console\registry>notascii
S="a-umlaut,e-umlaut,i-umlaut"
Apparently not, since you thought the presence of NULL BYTES in a UTF-16 string would cause items to be separated incorrectly. IT DOES NOT. NULL BYTES are fine in a UTF-16 string. NULL CHARACTERS are not fine. There is a difference!
Whine about it on the forum, apparently. r41267
Great, so I'll just stop trying to find out how to submit useful patches, and start whining more?
What happened is simple - I had a REG_MULTI_SZ with two lines, "Hello" and "World", and using ReadStringList I got items "H", "e", "l", "l", "o" .... "d".
I did NOT think the presence of NULL BYTES in a UTF-16 string was causing the problem
but that the code interpreted the UTF-16 bytes as Ansi bytes.
So, that definitely is a bug and should be reported.
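To make the symptom concrete, here is a sketch that reproduces it in memory. It is an assumption about the effect, not the actual TRegistry code: the UTF-16 buffer is copied byte for byte into an AnsiString and then split at NUL bytes.

```pascal
program MisreadMultiSz;
{$mode objfpc}{$H+}
{$assertions on}

uses
  Classes, SysUtils;

const
  NUL: WideChar = #0;

var
  Wide: UnicodeString;
  Raw: AnsiString;
  Items: TStringList;
  i, Start: Integer;
begin
  // "Hello" NUL "World" NUL NUL as UTF-16
  Wide := UnicodeString('Hello') + NUL + UnicodeString('World') + NUL + NUL;

  // Copy the raw UTF-16 bytes without any conversion (the mistake)
  SetLength(Raw, Length(Wide) * SizeOf(WideChar));
  Move(Wide[1], Raw[1], Length(Raw));

  Items := TStringList.Create;
  try
    // Split at NUL *bytes*, skipping empty pieces
    Start := 1;
    for i := 1 to Length(Raw) do
      if Raw[i] = #0 then
      begin
        if i > Start then
          Items.Add(Copy(Raw, Start, i - Start));
        Start := i + 1;
      end;
    // One item per letter instead of two lines
    Assert(Items.Count = 10);
    Assert(Items[0] = 'H');
    Assert(Items[9] = 'd');
    WriteLn(Items.CommaText);
  finally
    Items.Free;
  end;
end.
```

Every Latin letter occupies two bytes in UTF-16, one of which is zero, so a byte-wise split ends each item after a single letter.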
I've written a demonstration for the ReadStringList/WriteStringList issue because I seem to have expressed myself unclearly repeatedly :)
https://gitlab.com/ccrdude/freepascal-issue-34876-readmultistring-bug
Also IMHO we should drop the non-Windows implementation of Registry and make it a Windows-only package.
I cannot believe that any person of sound mind would use TRegistry or TRegIniFile on such a platform.
Better alternatives exist: TIniFile (a sort of native solution for *nix) or TXmlConfig.
I'm quite sure the fpc devels will not agree on that last point though.
Correct, because that would break backwards compatibility, and considering that we receive bug reports about this feature, people are using it.
Why would a *nix user expect that a TRegistry implementation exists at all?
On the other hand, as a mid- to long-term solution for software I port, I prefer the mentioned alternatives, since even on Windows, they would add the benefit of easier conversion to a portable app, for example.
Well, we could poll to see if anybody uses that at all?
We could deprecate it and remove it in 4.0?
We receive bug reports for it, so there is no need for a poll as there clearly are users using it. And as the others mentioned: it eases porting.