by dobrichev » Wed Jul 02, 2025 3:05 am
Hi coloin,
The required storage is small, maybe below 1 gigabyte.
About 3 years ago I attempted to restore the forum from the latest available backup from 2019.
The backup contained the database, attached files, and some other unrelated stuff.
I found the source code for what I took to be the same version of PhpBB. Now I doubt whether it was the right version, but at least it was close.
I installed MariaDB and the Nginx web server.
From the settings record in the database I found that two "mods" had been additionally installed. One is SEO (Search Engine Optimization), which changes navigation links from numerical identifiers to some text closer to the topic. The other is something related to payments, which I ignored.
Finally I found that my source code doesn't show the latest topic in the forums list. I ignored this too after unsuccessful attempts to find the code for this patch.
SEO patching was painful. I found some source code which couldn't be installed automatically (likely not for the same version of PhpBB), so I did the patching manually, finding the right place for hundreds of lines of extra code in dozens of files. I also had to manually write the redirection configuration for the web server.
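For illustration, here is a minimal Python sketch of what such a redirection has to do: turn an SEO-style link back into the numerical form the forum software expects. The URL shape (`/<slug>-t<id>.html`) and the function name `to_numeric_url` are assumptions made for the example, not the actual format or configuration used on the clone.

```python
import re
from typing import Optional

# Hypothetical SEO-style URL shape: "/<slug>-t<topic id>.html".
# The pattern really produced by the forum's SEO mod may differ.
SEO_TOPIC = re.compile(r"^/(?:[a-z0-9-]+-)?t(?P<topic_id>\d+)\.html$")


def to_numeric_url(path: str) -> Optional[str]:
    """Map an SEO-style topic path to the numeric viewtopic.php form.

    Returns None when the path does not look like an SEO topic link.
    """
    match = SEO_TOPIC.match(path)
    if match is None:
        return None
    return "/viewtopic.php?t=" + match.group("topic_id")
```

In a real setup the same mapping would live in the web server's rewrite rules rather than in Python; the sketch only shows the transformation itself.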
An additional change was fixing the PHP code to work around some pattern matching functions that are no longer supported.
At this stage I had a semi-working clone of the forum on my home machine with data up to 2019.
Several months later I attempted to read the forum and to convert the topics and posts to database records in the form expected by the forum software. I wrote transformation code and compared my results for old posts with the existing database records. After seeing the number of differences, I gave up.
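A comparison of this kind could look roughly like the Python sketch below. The field names (`subject`, `author`, `text`) and the whitespace normalization are assumptions for the example; the real records have their own columns, and the real differences were probably not limited to whitespace.

```python
def normalize(value) -> str:
    """Collapse runs of whitespace so purely cosmetic differences are ignored."""
    return " ".join(str(value).split())


def differing_fields(scraped: dict, stored: dict,
                     fields=("subject", "author", "text")) -> list:
    """Return the names of the fields whose normalized values do not match.

    `scraped` is a record rebuilt from the forum pages, `stored` is the
    corresponding record from the backup; both are hypothetical dict layouts.
    """
    return [
        name for name in fields
        if normalize(scraped.get(name, "")) != normalize(stored.get(name, ""))
    ]
```

Running this over all posts that exist in both sources and counting the non-empty results gives a quick measure of how far the transformation is from the original data.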
Recently, after the forum began its regular crashes, I made another attempt, this time less pedantic about data quality. The whole scan took about 1 day in a single thread. My processing hit many unrecoverable errors, and after fixing them I ran a second scan. The data was stored in an intermediate format in separate tables.
Finally, I manually performed a series of operations to reformat and merge data with the original database.
Why does this forum crash?
Initially I thought the site had been attacked by hackers and/or the administrators had significantly reduced the hardware.
Within hours of my posting here the only public link to my forum clone, it was hit by bots.
In the first days these were Google (about one visit per hour from a single IP address) and Microsoft (about one visit per 5 minutes from a few similar IP addresses).
After a week or so, Microsoft lost interest, but Amazon bots from Singapore started crawling more aggressively. At the moment I have about 120 visits per hour from various IP addresses and the number is growing.
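Rates like these can be read off the web server's access log. The Python sketch below counts requests per clock hour; it assumes Nginx's default "combined" log format, and the function name `visits_per_hour` is made up for the example.

```python
import re
from collections import Counter

# Assumes the "combined" access log format, where the timestamp sits in
# square brackets, e.g. [02/Jul/2025:03:05:00 +0000].
# Group 1 captures the date plus the hour: "02/Jul/2025:03".
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def visits_per_hour(log_lines):
    """Count requests per clock hour, keyed by 'dd/Mon/yyyy:hh'."""
    counts = Counter()
    for line in log_lines:
        match = TIMESTAMP.search(line)
        if match:  # lines without a recognizable timestamp are skipped
            counts[match.group(1)] += 1
    return counts
```

Feeding it only the lines that contain a particular bot's user agent gives the per-bot hourly rate.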
If the same or similar bots are causing the crashes of this forum, a fix from the administrators is needed. For example, Amazon bots clearly identify themselves with the user agent "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214".
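Because the bot names itself, recognizing it is simple. The Python sketch below only shows the matching step; the function name `is_amazonbot` is hypothetical, and what to do with a match (rate limiting, a robots.txt rule, or an outright block in the web server) is left to the administrators.

```python
import re

# The "Amazonbot/<version>" token comes from the user agent quoted above.
# Matching is case-insensitive in case the capitalization varies.
AMAZONBOT = re.compile(r"\bAmazonbot/\d", re.IGNORECASE)


def is_amazonbot(user_agent: str) -> bool:
    """Return True when the user agent string announces itself as Amazonbot."""
    return AMAZONBOT.search(user_agent) is not None
```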