Author Topic: Chemical Formulas  (Read 3881 times)

William Marshall

Chemical Formulas
« on: March 17, 2018, 11:45:31 pm »
    I am writing a program to quiz General Chemistry students on some properties of compounds, ions, and elements.  One of those properties is the formula of the substance.  As the student types the formula into a RichMemo ("rmInput" in the attached project), it is formatted automatically.  If that answer is incorrect, the correct answer is shown in another RichMemo, "rmCorrectAnswers".  Occasionally there is more than one acceptable answer to a question, so those answers are shown as a list in rmCorrectAnswers.  I've gotten most of this worked out, but there are some problems.  I've attached a small project illustrating what I'm trying to do.  (The project is not the quiz - there are no questions or evaluation.  The top RichMemo is the self-formatting answer input control; the bottom one shows the "correct" answers.  The various kinds of answers in the quiz can be seen by commenting/uncommenting the appropriate sections in the MakeAnswerList procedure.  "Hybridization" is included for completeness.  I think I've gotten all the bugs worked out for that.)
     Some formulas contain a "hydrate dot", a heavy dot centered vertically on the line, directly after which numbers are written normally (i.e., not subscripted).  The dot is entered by typing a period, which should then be replaced by the dot.  For example, typing FeCl3.6H2O should result in only the 3 and 2 being subscripted, and the period should appear as the dot.  I've tried several ways to introduce the dot, shown in the source code as comments.  Usually the replacement does not occur; when it does, the other formatting gets messed up.  Also, the formatting in the two RichMemos is sometimes different, even though they both use FormatFormula.  So:
       1. How can I incorporate the hydrate dot into the formulas? 
       2. Can I save the resulting string (the basic string, not the rtf) in a simple text file, or would I have to convert the dot back to a period first?  (And how would I do that?)
     Other problems involve only rmCorrectAnswers:
       3. The last line is always followed by an empty line.  How can I remove it?  (And why isn't it a problem in rmInput?)  It doesn't present a problem in displaying the answer(s), really.  But it feels awkward and unnecessary.
       4. How can I adjust the height of the RichMemo according to the number of answers it holds?  Because of space limitations on the form in the actual program, there should be a maximum of 4 answers showing.  (See the end of the ShowAnswers procedure.)  I've guessed at numbers for LineHeight and the small extra space I've added between the first two lines.  I can refine those guesses, but there must be a way to actually measure them.
       5.  If a line is wider than rmCorrectAnswers's width, the horizontal scrollbar appears, obscuring the last line.  Is there a way to detect when this happens and to determine the scrollbar's height so I can adjust the height of rmCorrectAnswers?  (See the "Plain text" section of MakeAnswerList.  Actually, this should happen very seldom, if ever.  But I am curious.)
       6. (Trivial) When the vertical scroll bar appears, the first click on the downward scroll arrow works as expected, but the second click has only a tiny effect.  All subsequent clicks work fine.  This certainly doesn't have to be fixed, but, again, I'm curious as to the reason.
     Finally, I apologize for asking so many questions in a single post.  If I should have separated the questions, please let me know.
     Thanks for your help.
Lazarus 1.8.0; fpc 3.0.4; Windows 10

wp

Re: Chemical Formulas
« Reply #1 on: March 18, 2018, 12:37:13 am »
1. How can I incorporate the hydrate dot into the formulas? 
You have it in your code already. Doesn't the UTF8 bullet symbol work? Uncomment the line
Code: Pascal
const HydrateDot = #$E2#$80#$A2;
BTW, I think you were involved in the discussion last year which led to the "chemtext package". I extended it to support the hydrate dot. The attached screenshot was created with the ChemLabel of this package.

2. Can I save the resulting string (the basic string, not the rtf) in a simple text file, or would I have to convert the dot back to a period first?  (And how would I do that?)
If you want to save the student's input then save it as it is, no special conversion because your program will know how to create a "nice" output.
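
If a plain-period text file is wanted even though the displayed string contains the bullet, the conversion in both directions can be sketched with StringReplace. This is only an illustration, not code from the attached project: the helper names ToStoredFormula and ToDisplayedFormula are hypothetical, and the sketch assumes the formula never contains a period that is not a hydrate dot.

```pascal
program SaveFormulaSketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  HydrateDot = #$E2#$80#$A2;  // UTF-8 bytes of the bullet, as used in the thread

// Hypothetical helper: turn the displayed formula into a plain-ASCII form
// that can be written to a simple text file.
function ToStoredFormula(const Displayed: string): string;
begin
  Result := StringReplace(Displayed, HydrateDot, '.', [rfReplaceAll]);
end;

// Hypothetical helper: the reverse direction, for showing a stored formula.
// Assumes every period in the stored text stands for a hydrate dot.
function ToDisplayedFormula(const Stored: string): string;
begin
  Result := StringReplace(Stored, '.', HydrateDot, [rfReplaceAll]);
end;

var
  Shown: string;
begin
  Shown := 'FeCl3' + HydrateDot + '6H2O';
  WriteLn(ToStoredFormula(Shown));
  // The round trip should give back the displayed string unchanged.
  WriteLn(ToDisplayedFormula(ToStoredFormula(Shown)) = Shown);
end.
```

Storing the period form keeps the file pure ASCII, so it does not matter which encoding the file is later read with.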

The other questions are too rich-edit specific for my knowledge.
« Last Edit: March 18, 2018, 12:57:17 am by wp »

engkin

Re: Chemical Formulas
« Reply #2 on: March 18, 2018, 04:27:09 am »
       3. The last line is always followed by an empty line.  How can I remove it?  (And why isn't it a problem in rmInput?)  It doesn't present a problem in displaying the answer(s), really.  But it feels awkward and unnecessary.
When I replaced
Code: Pascal
RMemo.Lines.AddStrings(CurrentAnswers);
with
Code: Pascal
RMemo.Text:=Trim(CurrentAnswers.Text);
the extra line disappeared.

       4. How can I adjust the height of the RichMemo according to the number of answers it holds?  Because of space limitations on the form in the actual program, there should be a maximum of 4 answers showing.  (See the end of the ShowAnswers procedure.)  I've guessed at numbers for LineHeight and the small extra space I've added between the first two lines.  I can refine those guesses, but there must be a way to actually measure them.
Use your guess and fine tune it with the answer below. (Not a great solution)

       5.  If a line is wider than rmCorrectAnswers's width, the horizontal scrollbar appears, obscuring the last line.  Is there a way to detect when this happens and to determine the scrollbar's height so I can adjust the height of rmCorrectAnswers?  (See the "Plain text" section of MakeAnswerList.  Actually, this should happen very seldom, if ever.  But I am curious.)
Give the application a chance to show the scroll bars by calling Application.ProcessMessages.
To find out if there is a scroll bar use GetWindowLong.
Increase the height in a loop until the vertical scroll bar disappears:
Code: Pascal
procedure TfrmQuizAnswers.btnShowAnswersClick(Sender : TObject);
var
  WndStyle: LONG;
  bVScroll, bHScroll: Boolean;
  h: Integer;
begin
  IsFormula := false;
  IsHybrid := false;
  MakeAnswerList;
  ShowAnswers(rmCorrectAnswers);
  // Grow the control one pixel at a time until the vertical scroll bar is gone.
  repeat
    h := rmCorrectAnswers.Height+1;
    rmCorrectAnswers.Height := h;
    Application.ProcessMessages;  // give the control a chance to update its scroll bars
    WndStyle := GetWindowLong(rmCorrectAnswers.Handle, GWL_STYLE);
    bVScroll := (WndStyle and WS_VSCROLL) = WS_VSCROLL;
    bHScroll := (WndStyle and WS_HSCROLL) = WS_HSCROLL;  // detected here, but not used by the loop
  until not bVScroll;
end;     { btnShowAnswersClick }

William Marshall

Re: Chemical Formulas
« Reply #3 on: March 18, 2018, 10:19:34 pm »
     wp: The UTF8 Bullet does indeed show the dot, but in rmInput it and all following characters up to the final "O" are subscripted, at least when I use  "RMemo.Text := StringReplace(RMemo.Text, '.', HydrateDot, [rfReplaceAll]);"  for the replacement (see end of FormatFormula).  Strangely, in rmCorrectAnswers the dot shows up correctly, but all subscripting is gone.  Why would the same code produce different results in different controls of the same type?
     The formatting and unformatting routines you wrote last June work well, and I thank you again for them.  I was just trying to learn a little about RichMemo.  So I'd still like to know how to incorporate the dot.

     engkin: Your suggestion for eliminating the extra line works nicely.  So RichMemo.Text is the same as the concatenation of RichMemo.Lines without the final #13?  I tried your code for setting rmCorrectAnswers's height, but it sets the height to accommodate all the lines.  I haven't taken the time yet to change it to do what I need.

     Thanks to both of you.

wp

Re: Chemical Formulas
« Reply #4 on: March 18, 2018, 11:42:12 pm »
I could imagine that this is an issue with indexing because the UTF8 bullet is three bytes wide. Not having much experience with RichMemo, I would do some UTF8 experiments first. Enter the string 'äöü' into the memo; each of these characters is two bytes wide. Try to subscript the 'ö' (the second character). Which index and which length do you have to specify in SetTextAttributes? The 'ö' is the second character, but it starts at the 3rd byte position, and it is 1 character, but 2 bytes long. From this experiment you learn whether you must specify codepoint or byte indexes and lengths in the SetTextAttributes method.

If you need codepoint indexes, I'd suggest replacing the ASCII dot by the UTF8 bullet at the end of your RichText generating routine because both are 1 code point and the replacement does not change the indexing.

If you need byte indexes, you must do the replacement at the beginning of the RichText routine because the two additional bytes of the bullet will shift the following characters by 2 bytes.

And of course, when your scanning loop finds the ASCII dot, it must reset the vertical character position.
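
The two candidate numberings in this experiment can be listed side by side with a small console program. It is only a sketch: it does not touch RichMemo, so it cannot tell which numbering SetTextAttributes expects, and Utf8SeqLen is a hypothetical helper written here so the program compiles without any Lazarus unit.

```pascal
program Utf8IndexSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: number of bytes in a UTF-8 sequence, judged from
// its lead byte. A stray continuation byte is stepped over as one byte.
function Utf8SeqLen(Lead: Byte): Integer;
begin
  if Lead < $80 then
    Result := 1
  else if (Lead and $E0) = $C0 then
    Result := 2
  else if (Lead and $F0) = $E0 then
    Result := 3
  else if (Lead and $F8) = $F0 then
    Result := 4
  else
    Result := 1;
end;

const
  // The test string 'äöü' written as explicit UTF-8 bytes, so the result
  // does not depend on the encoding of the source file.
  Sample = #$C3#$A4#$C3#$B6#$C3#$BC;

var
  s: string;
  BytePos, CodePointPos, Len: Integer;
begin
  s := Sample;
  BytePos := 1;
  CodePointPos := 1;
  while BytePos <= Length(s) do
  begin
    Len := Utf8SeqLen(Ord(s[BytePos]));
    WriteLn('code point ', CodePointPos, ': byte position ', BytePos,
            ', ', Len, ' byte(s)');
    Inc(BytePos, Len);
    Inc(CodePointPos);
  end;
end.
```

For the second character this lists byte position 3 and a length of 2 bytes, which are the two values to try against the codepoint values 2 and 1 in the RichMemo experiment.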

engkin

Re: Chemical Formulas
« Reply #5 on: March 19, 2018, 05:21:27 am »
     engkin: Your suggestion for eliminating the extra line works nicely.  So RichMemo.Text is the same as the concatenation of RichMemo.Lines without the final #13?
No, it is not. There is a call to Trim in that assignment to remove the extra white space. You could achieve the same thing if you set SkipLastLineBreak to true after creating CurrentAnswers:
Code: Pascal
CurrentAnswers := TStringList.Create;
CurrentAnswers.SkipLastLineBreak:=True;
Then you can keep your original code:
Code: Pascal
RMemo.Lines.AddStrings(CurrentAnswers);

I tried your code for setting rmCorrectAnswers's height, but it sets the height to accommodate all the lines.  I haven't taken the time yet to change it to do what I need.
Add four answers, measure and fine tune its height without scroll bar, then add the rest. Make sure your guess shows a scroll bar. Probably you need to set its visibility to false during this process, and show it at the end to eliminate flicker. This process might be needed only once at the beginning to find the suitable height, I guess.
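
Once a line height has been measured that way, capping the control at four visible answers is plain arithmetic. The following is a rough sketch only: AnswerBoxHeight is a hypothetical helper, and the numbers 18 and 6 passed to it are placeholders for illustration, not measured values.

```pascal
program AnswerHeightSketch;
{$mode objfpc}{$H+}

const
  MaxVisibleAnswers = 4;  // limit stated in the original question

// Hypothetical helper: height for a control that shows at most
// MaxVisibleAnswers lines. LineHeight and Padding are the values
// found by measuring or fine tuning; they are not known here.
function AnswerBoxHeight(AnswerCount, LineHeight, Padding: Integer): Integer;
var
  Visible: Integer;
begin
  Visible := AnswerCount;
  if Visible > MaxVisibleAnswers then
    Visible := MaxVisibleAnswers;
  if Visible < 1 then
    Visible := 1;  // keep room for one line even when the list is empty
  Result := Visible * LineHeight + Padding;
end;

begin
  WriteLn(AnswerBoxHeight(2, 18, 6));  // two answers: 2 * 18 + 6
  WriteLn(AnswerBoxHeight(7, 18, 6));  // seven answers: capped at 4 * 18 + 6
end.
```

With more than four answers the height stays at the four-line value and the vertical scroll bar takes care of the rest.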

engkin

Re: Chemical Formulas
« Reply #6 on: March 19, 2018, 11:34:52 pm »
       1. How can I incorporate the hydrate dot into the formulas? 
Two of the three constants you have in the code are not suitable for Lazarus (UTF8):
Code: Pascal
//const HydrateDot = #0149;   //   •
const HydrateDot = #$E2#$80#$A2;
//const HydrateDot = #8226;
ANSI: #0149
UTF8: #$E2#$80#$A2
UTF32: #8226

UTF8 representation is 3 bytes and it needs to be treated as a string. For instance this:
Code: Pascal
if (RawFormula[i] = '.') or (RawFormula[i] = HydrateDot) then
should be replaced with something like:
Code: Pascal
if (RawFormula[i] = '.') or SameStr(HydrateDot, Copy(RawFormula, i, Length(HydrateDot))) then
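
The same comparison can be wrapped in a small function and tried on its own. This is a sketch rather than part of the project: IsDotAt is a hypothetical name, and the bounds check is an addition that the one-line version above does not have.

```pascal
program DotMatchSketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  HydrateDot = #$E2#$80#$A2;  // three bytes, so it has to be compared as a string

// Hypothetical helper: True when byte position i of s holds an ASCII
// period or is the first byte of the UTF-8 bullet.
function IsDotAt(const s: string; i: Integer): Boolean;
begin
  Result := (i >= 1) and (i <= Length(s)) and
    ((s[i] = '.') or
     SameStr(HydrateDot, Copy(s, i, Length(HydrateDot))));
end;

var
  Formula: string;
begin
  Formula := 'FeCl3' + HydrateDot + '6H2O';
  WriteLn(Length(HydrateDot));   // byte length of the bullet
  WriteLn(IsDotAt(Formula, 6));  // first byte of the bullet
  WriteLn(IsDotAt(Formula, 7));  // a continuation byte, not a match
end.
```

Only the first byte of the bullet matches, so a scan that steps byte by byte has to skip the remaining two bytes itself.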

That was the easy part.


While moving through RawFormula code points in procedure FormatFormula you used a for loop:
Code: Pascal
for i := 1 to L do
where variable i points to a specific byte inside RawFormula. A for loop does not help here because when you hit a hydrate dot you need to skip 3 bytes (HydrateDot is a string of three bytes), that's one.

Two: RichMemo needs a different index for code points. Its index moves by one for each code point. For instance the formula A1•2B3 has:
Code: Pascal
   A,1,•,2,B,3
i: 1,2,3,6,7,8  //<--- RawFormula
c: 1,2,3,4,5,6  //<--- RichMemo

Just replace that loop with:
Code: Pascal
i := 1;  // byte index into RawFormula
c := 1;  // code point index for RichMemo
while i<=L do
  begin
    if (RawFormula[i] = '.') or SameStr(HydrateDot,Copy(RawFormula, i, Length(HydrateDot))) then
      begin
        AfterDot := true;
        FontParams.VScriptPos := vpNormal;
      end
    else if RawFormula[i] in (SubscriptChars - DigitChars) then
      begin
        AfterDot := false;
        FontParams.VScriptPos := vpNormal
      end
    else if RawFormula[i] in DigitChars then
      if AfterDot then
        FontParams.VScriptPos := vpNormal
      else FontParams.VScriptPos := vpSubScript;

    RMemo.SetTextAttributes(AnswerStart + c - 2, 1, FontParams);
    inc(i, UTF8CharacterLengthFast(@RawFormula[i]));  // skip all bytes of this code point
    inc(c);
  end;//while i

You need to add unit LazUTF8 to your uses block for UTF8CharacterLengthFast.
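
The byte-index versus code-point-index bookkeeping of that loop can be checked without a form. The console sketch below makes several simplifying assumptions: it records an 'S' (subscript) or 'n' (normal) flag per code point instead of calling SetTextAttributes, it treats every digit as a subscript candidate and every other character as resetting AfterDot (the project's SubscriptChars and DigitChars sets are not reproduced), and Utf8SeqLen is a hypothetical stand-in for UTF8CharacterLengthFast so that no Lazarus unit is needed.

```pascal
program SubscriptWalkSketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  HydrateDot = #$E2#$80#$A2;

// Hypothetical stand-in for LazUTF8's UTF8CharacterLengthFast.
function Utf8SeqLen(Lead: Byte): Integer;
begin
  if Lead < $80 then Result := 1
  else if (Lead and $E0) = $C0 then Result := 2
  else if (Lead and $F0) = $E0 then Result := 3
  else if (Lead and $F8) = $F0 then Result := 4
  else Result := 1;
end;

// Returns one flag per code point: 'S' = subscript, 'n' = normal.
// The position of a flag in the result is the code point index c.
function SubscriptMask(const RawFormula: string): string;
var
  i: Integer;  // byte index into RawFormula
  AfterDot: Boolean;
  Flag: Char;
begin
  Result := '';
  AfterDot := False;
  i := 1;
  while i <= Length(RawFormula) do
  begin
    Flag := 'n';
    if (RawFormula[i] = '.') or
       SameStr(HydrateDot, Copy(RawFormula, i, Length(HydrateDot))) then
      AfterDot := True
    else if RawFormula[i] in ['0'..'9'] then
    begin
      if not AfterDot then
        Flag := 'S';
    end
    else
      AfterDot := False;
    Result := Result + Flag;
    Inc(i, Utf8SeqLen(Ord(RawFormula[i])));  // step over the whole code point
  end;
end;

begin
  WriteLn(SubscriptMask('A1' + HydrateDot + '2B3'));
  WriteLn(SubscriptMask('FeCl3' + HydrateDot + '6H2O'));
end.
```

Because one flag is appended per loop pass, the mask has six entries for A1•2B3 even though the string is eight bytes long, which mirrors the i and c rows in the table above.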

William Marshall

Re: Chemical Formulas
« Reply #7 on: March 21, 2018, 03:57:19 pm »
     Sorry it's taken so long to get back here.  Attached is a slightly updated version of my project, using engkin's code.  Getting closer, but not quite there.  If I do the period/dot replacement early in FormatFormula (line 75), then the subscripting in both RichMemos is OK, but the period is there instead of the dot.  If the replacement is done at the end (line 123; this would seem to obviate your code), rmInput is formatted correctly, but in rmCorrectAnswers the dot appears but there is no subscripting.  The dot is in RMemo.Lines[1] at the end of FormatFormula, but then disappears.  I don't understand.  I would have thought "early replacement" was the way to go.  Why do the two RichMemos behave differently?

engkin

Re: Chemical Formulas
« Reply #8 on: March 21, 2018, 08:56:04 pm »
If I do the period/dot replacement early in FormatFormula (line 75), then the subscripting in both RichMemos is OK, but the period is there instead of the dot. 
The text in RawFormula and the text in the RichMemo should be identical. The replacement is not quite right because it is applied only to RawFormula, not to the RichMemo.

If the replacement is done at the end (line 123; this would seem to obviate your code), rmInput is formatted correctly, but in rmCorrectAnswers the dot appears but there is no subscripting.  The dot is in RMemo.Lines[1] at the end of FormatFormula, but then disappears.  I don't understand.  I would have thought "early replacement" was the way to go.  Why do the two RichMemos behave differently?
Because in one RichMemo AnswerStart is 1, which makes the index right for both RawFormula and the RichMemo, while the other RichMemo has a different AnswerStart that breaks the indexing.

Try to set AnswerStart to 1 in both cases. Or fix the indexing.

 
