Having fun parsing the HTML code on this forum


Post by dobrichev » Fri Mar 22, 2024 9:55 am

Hi,

Over the last 2 years I've spent some time trying to figure out whether it is possible to archive posts while acting as a regular user, with the aim of eventually restoring them in a similar public forum.
So far, the conclusion is that it is possible, but with some caveats.

phpBB generally has this data flow:

[Original Post Text] -> Transformation 1 -> [Database Representation] -> Transformation 2 -> [Browsable HTML]

And on Quote/Edit:

[Database Representation] -> Transformation 3 -> [HTML Form Fields]

Unfortunately, it turns out that all three transformations are many-to-many.
The causes are code evolution (upgrades, installed modules) and changes to forum settings (a post is written while the settings are in one state, then edited or displayed while they are in another).
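
To make the "many-to-many" point concrete, here is a toy sketch in Python. It is not phpBB's code: the `render` function, the `smilies_enabled` flag and the markup rules are all hypothetical stand-ins for Transformation 2, chosen only to show how one stored text can yield several HTML outputs and how several stored texts can yield the same HTML.

```python
def render(stored: str, smilies_enabled: bool) -> str:
    """Toy stand-in for Transformation 2 (stored text -> HTML).

    Hypothetical rules, not phpBB's: expand [b] tags always,
    expand the ":)" smiley only when the setting is on.
    """
    html = stored.replace("[b]", "<b>").replace("[/b]", "</b>")
    if smilies_enabled:
        html = html.replace(":)", '<img alt=":)">')
    return html


def one_to_many() -> tuple[str, str]:
    # Same stored text, displayed under two different settings.
    stored = "[b]hi[/b] :)"
    return render(stored, True), render(stored, False)


def many_to_one() -> tuple[str, str]:
    # Two different stored texts that end up as identical HTML.
    return render("[b]hi[/b]", False), render("<b>hi</b>", False)
```

In this toy model `one_to_many()` returns two different strings and `many_to_one()` returns two equal ones, so the original text cannot be recovered from the HTML alone without knowing which rules and settings were in effect.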


For those parsing the HTML code, I want to share a bug I found in the forum configuration.
The Hidden and Hidden=title BB tags are transformed to:
Code:
--- HTML text
<div style="padding: 3px; background-color: #FFFFFF; border: 1px solid #d8d8d8; font-size: 1em;">
   <div style="border-bottom: 1px solid #CCCCCC; margin-bottom: 3px; font-size: 0.9em; font-weight: bold; display: block;">
      <span onclick=" ... long script ... "/>   <!-- BUG: this span is self-closed here but closed again below. Firefox ignores the self-closure; PHP ignores the later closing tag. -->
         <b>Hidden Text: </b>
         <a href="#" onclick="return false;">Show</a>
      </span>
   </div>
   <div class="quotecontent">
      <div style="display: none;">
         <dl ...
         </dl>
      </div>
   </div>
</div>

--- Firefox parsed data
<div style="padding: 3px; background-color: #FFFFFF; border: 1px solid #d8d8d8; font-size: 1em;">
   <div style="border-bottom: 1px solid #CCCCCC; margin-bottom: 3px; font-size: 0.9em; font-weight: bold; display: block;">
      <span ...>
         <b>Hidden Text: </b>
         <a href="#" onclick="return false;">Show</a>
      </span>
   </div>
   <div class="quotecontent">
      <div style="display: none;">
         <dl ...
         </dl>
      </div>
   </div>
</div>

--- PHP parsed data
<div style="padding: 3px; background-color: #FFFFFF; border: 1px solid #d8d8d8; font-size: 1em;">
   <div style="border-bottom: 1px solid #CCCCCC; margin-bottom: 3px; font-size: 0.9em; font-weight: bold; display: block;">
      <span .../>
      <b>Hidden Text: </b>
      <a href="#" onclick="return false;">Show</a>
   </div>
   <div class="quotecontent">
      <div style="display: none;">
         <dl ...
         </dl>
      </div>
   </div>
</div>
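
One possible workaround, sketched in Python rather than PHP: strip the stray `/` from self-closed non-void tags before parsing, so the tree matches what Firefox builds (the HTML standard ignores a trailing slash on elements such as `span`). Python's standard `html.parser` reports `<span .../>` as a start tag immediately followed by an end tag, which resembles the PHP result above, so it is used here to show the difference. The names `drop_bogus_self_closing`, `Outline` and `outline` are made up for this sketch, and the regex is a heuristic: it skips tags whose attributes contain `<` or `>` and can misread unquoted attribute values that end in `/`.

```python
import re
from html.parser import HTMLParser

# Void elements (HTML standard): a trailing "/" is harmless on these.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

# <tag ... /> where the attribute part contains no "<" or ">".
SELF_CLOSED = re.compile(r"<([A-Za-z][A-Za-z0-9]*)((?:\s[^<>]*?)?)\s*/>")


def drop_bogus_self_closing(html: str) -> str:
    """Turn <span .../> into <span ...> for non-void elements only."""
    def fix(m: re.Match) -> str:
        tag, attrs = m.group(1), m.group(2)
        if tag.lower() in VOID:
            return m.group(0)  # leave <br/>, <img .../> etc. untouched
        return f"<{tag}{attrs}>"
    return SELF_CLOSED.sub(fix, html)


class Outline(HTMLParser):
    """Records (depth, tag) for every start tag to expose the nesting."""

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0
        self.rows: list[tuple[int, str]] = []

    def handle_starttag(self, tag, attrs):
        self.rows.append((self.depth, tag))
        if tag not in VOID:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID:
            self.depth = max(0, self.depth - 1)


def outline(html: str) -> list[tuple[int, str]]:
    parser = Outline()
    parser.feed(html)
    parser.close()
    return parser.rows
```

For a reduced input such as `<div><span onclick="x"/><b>Hidden Text: </b></span></div>`, `outline` on the raw text puts `b` at the same depth as `span` (as in the PHP listing), while `outline(drop_bogus_self_closing(...))` puts `b` one level deeper, inside the `span` (as in the Firefox listing).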


MD