[Lazarus] Debugging fixed strings in UTF8 encoding
Martin
lazarus at mfriebe.de
Mon Apr 1 11:13:22 CEST 2013
On 01/04/2013 07:34, Ernest V Miller wrote:
>
> Sent log & source to <lazarus at mfriebe.de>.
>
Ok, I can see what happens.
Your GDB returns slightly different output from mine (but still correct), and
that triggers the code for UTF-8 correction.
That is why you actually see the proper UTF-8 string, while I never do: I
always get the #123#124 representation.
*** ???
The IDE cannot know whether the text is UTF-8 or not. So if it translates ANSI
#234 as if it were UTF-8, it ends up with invalid UTF-8, and that
fails to display.
It is possible to always suppress that translation. But then you never
see UTF-8, you always see #123#124.
Other than making it a user-settable option in the properties, there
is little that can be done.
It could apply a heuristic: check whether the result contains such invalid
chars, and if there is one, show everything as #123. But an ANSI string may
sometimes happen to be a valid UTF-8 string, yet the UTF-8 would map to
entirely different chars. In this case the heuristic would show UTF-8 text
without any warning that the content is wrong (well, it already does/would do
that).
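
A minimal sketch of such a heuristic, as a stand-alone Free Pascal program. It
is not the IDE's code: LooksLikeUtf8, EscapeHighBytes and DisplayText are
hypothetical names made up for illustration, the check only looks at the
lead/continuation byte structure (it does not reject every overlong or
surrogate form), and the #nnn output is a simplified version of what the IDE
shows.

```pascal
program Utf8Heuristic;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: True if S has the byte structure of UTF-8.
function LooksLikeUtf8(const S: AnsiString): Boolean;
var
  i, Len, Extra: Integer;
  b: Byte;
begin
  Result := False;
  Len := Length(S);
  i := 1;
  while i <= Len do
  begin
    b := Ord(S[i]);
    if b < $80 then
      Extra := 0                          // plain ASCII
    else if (b >= $C2) and (b <= $DF) then
      Extra := 1                          // lead byte of a 2-byte sequence
    else if (b >= $E0) and (b <= $EF) then
      Extra := 2                          // lead byte of a 3-byte sequence
    else if (b >= $F0) and (b <= $F4) then
      Extra := 3                          // lead byte of a 4-byte sequence
    else
      Exit;                               // stray continuation or illegal lead byte
    Inc(i);
    while Extra > 0 do
    begin
      if (i > Len) or ((Ord(S[i]) and $C0) <> $80) then
        Exit;                             // truncated or malformed sequence
      Inc(i);
      Dec(Extra);
    end;
  end;
  Result := True;
end;

// Hypothetical helper: show every byte >= 128 as #nnn.
function EscapeHighBytes(const S: AnsiString): AnsiString;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
    if Ord(S[i]) < $80 then
      Result := Result + S[i]
    else
      Result := Result + '#' + IntToStr(Ord(S[i]));
end;

// Keep the text if it passes the check, otherwise fall back to #nnn.
function DisplayText(const Raw: AnsiString): AnsiString;
begin
  if LooksLikeUtf8(Raw) then
    Result := Raw
  else
    Result := EscapeHighBytes(Raw);
end;

begin
  // Bytes C3 A9 pass the check and are returned unchanged.
  WriteLn(DisplayText('caf'#$C3#$A9));
  // A lone byte E9 (e.g. from an 8-bit codepage) fails the check.
  WriteLn(DisplayText('caf'#$E9));        // prints caf#233
end.
```

The first call also shows the weak spot: the bytes C3 A9 could just as well be
two characters of an 8-bit ANSI string, and the check cannot tell the
difference, so that text would be displayed as one UTF-8 char without any
warning.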
*** #123#124
That is actually a missing call to the clean-up that makes the UTF-8
readable. But if I add it, then it will garble the ANSI in string[5] to ???
------------------
It may be a while until I get to it.
I have not tested any of them, but if you want to look at it for some
kind of temporary fix, you can look at any of the following (all in
debugger\gdbmidebugger.pp):
Doing 1 or 2 may interfere with displaying the class name of an
exception, and maybe with other class name handling (detection of the real
class for "Sender: TObject").
Doing the test for invalid UTF-8 will avoid this, except where the class
name is longer than 127 chars, because a class name is sometimes returned
as first byte = length, followed by the name. A length of 128 or more will
be invalid UTF-8. But it still needs to be returned, as the caller will fix it.
1) line 11000
function TGDBMIDebuggerCommand.GetText(const AExpression: String;
  const AValues: array of const): String;
....
  if not ExecuteCommand('x/s ' + AExpression, AValues, R, [],
                        DebuggerProperties.TimeoutForEval)
  ...
  Result := ProcessGDBResultText(StripLN(R.Values));
end;
In the last line, remove "ProcessGDBResultText", so the line becomes
  Result := StripLN(R.Values);
That will always show #123, and no longer do UTF-8.
Alternatively, you can call it first, then test the result for faulty UTF-8
(there is a function for that, but I am not sure of the name), and if the test
fails, assign the non-fixed value instead.
2) line 10800
function TGDBMIDebuggerCommand.ProcessGDBResultText(S: String): String;
Do the same fix in here. Note that this may also be called when getting float
values.
3) line 12609
procedure FixUpResult(AnExpression: string; ResultInfo: TGDBType = nil);
...
  case ResultInfo.Kind of
  ...
    0, 1, 2: begin // 'char', 'character', 'ansistring'
    ...
      then
        FTextValue := copy(FTextValue, i+2, length(FTextValue) - i - 1)
      else
Here you can add the UTF-8 translation (and again add the UTF-8 test, if you
like):
      then
      begin
        FTextValue := copy(FTextValue, i+2, length(FTextValue) - i - 1);
        FTextValue := MakePrintable(ProcessGDBResultText('\t' + FTextValue));
      end
      else