Forum clone


Postby dobrichev » Tue Jun 10, 2025 5:26 pm

After the constant crashes of this forum in recent weeks, I took some time to archive its content.

A read-only clone of the forum is temporarily available here.

The known non-cloned items include:
  • Users with no posts.
  • User details (avatars, signatures, etc).
  • Poll topics.
  • View/download counters.
  • Date of the latest edit of each post.
  • Data for already broken posts/topics, such as the topic at the bottom of the General forum.

For now, I have no plans for automated content synchronization, nor for placing the clone on a permanent site.
dobrichev
2016 Supporter
 
Posts: 1878
Joined: 24 May 2010

Re: Forum clone

Postby champagne » Tue Jun 10, 2025 5:48 pm

Hi mladen,
I don't have the skills to do that myself, so thanks on behalf of the community.
Best
champagne
2017 Supporter
 
Posts: 7659
Joined: 02 August 2007
Location: France Brittany

Re: Forum clone

Postby blue » Tue Jun 10, 2025 7:31 pm

Very comforting.
Many thanks.
blue
 
Posts: 1072
Joined: 11 March 2013

Re: Forum clone

Postby m_b_metcalf » Wed Jun 11, 2025 3:31 pm

Super!
m_b_metcalf
2017 Supporter
 
Posts: 13655
Joined: 15 May 2006
Location: Berlin

Re: Forum clone

Postby coloin » Mon Jun 30, 2025 10:15 pm

Yes, very good that you have done this, especially as the hosting appears very inconsistent these days.
Perhaps you can outline how you did it!
How much storage space would it need to host it permanently?
coloin
 
Posts: 2603
Joined: 05 May 2005
Location: Devon

Re: Forum clone

Postby dobrichev » Wed Jul 02, 2025 3:05 am

Hi coloin,

The required storage is small, maybe below 1 gigabyte.

About 3 years ago I attempted to restore the forum from the latest available backup from 2019.
The backup contained the database, attached files, and some other unrelated stuff.
I found the source code for what I took to be the same version of phpBB. I now doubt it was exactly the right version, but at least it was close.
I installed MariaDB and the Nginx web server.
From the settings record in the database I found that two "mods" had been additionally installed. One is SEO (Search Engine Optimization), which changes navigation links from numerical identifiers to some text closer to the topic. The other is something related to payments, which I ignored.
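To illustrate the kind of rewriting such an SEO mod performs, here is a minimal Python sketch. The URL format, the function name seo_topic_url, and the topic id are hypothetical and chosen only for illustration; the actual mod is PHP and may well use a different scheme.

```python
import re


def seo_topic_url(title: str, topic_id: int) -> str:
    """Build a text-based topic URL from a title and a numeric id.

    The "/<slug>-t<id>.html" layout is an assumption, not the mod's real format.
    """
    # Lowercase the title, then collapse every run of characters that are
    # not a-z or 0-9 into a single hyphen and trim hyphens at the ends.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"/{slug}-t{topic_id}.html"


# A numeric link such as /viewtopic.php?t=12345 becomes a readable one.
print(seo_topic_url("Forum clone", 12345))
```

The numeric id is kept in the rewritten URL so that the web server can still map the link back to the original topic.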
Finally, I found that my source code doesn't show the latest topic in the forum list. I ignored this too, after unsuccessful attempts to find the code for that patch.
SEO patching was painful: I found some source code which couldn't be installed automatically (likely not for the same version of phpBB), so I did the patching manually, finding the right place for hundreds of lines of extra code in dozens of files. I also had to manually write the redirection configuration for the web server.
An additional change was fixing the PHP code to work around some unsupported pattern-matching functions.

At this stage I had a semi-working clone of the forum on my home machine, with data up to 2019.

Several months later I attempted to read the forum and convert the topics and posts to database records in the form expected by the forum software. I wrote transformation code and compared my results for old posts with the existing database records. After seeing the number of differences, I gave up.

Recently, after the forum began its regular crashes, I made another attempt, this time less pedantic about data quality. The whole scan took about one day in a single thread. It exposed many unrecoverable errors in my processing, and after fixing them I did a second scan. The data was stored in an intermediate format in separate tables.
Finally, I manually performed a series of operations to reformat the data and merge it with the original database.

Why does this forum crash?

Initially I thought the site had been attacked by hackers and/or the administrators had significantly reduced the hardware.

Hours after I posted here the only public link to my forum clone, it was bitten by bots.
In the first days it was Google (about one visit per hour from a single IP address) and Microsoft (about one visit every 5 minutes from a few similar IP addresses).
After a week or so, Microsoft lost interest, but Amazon bots from Singapore started crawling more aggressively. At the moment I have about 120 visits per hour from various IP addresses, and the number is growing.

If the same or similar bots are causing the crashes of this forum, a fix from the administrators is needed. For example, Amazon bots clearly identify themselves with the user agent "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214".
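Because the crawler names itself, spotting it in request logs is straightforward. The following is a rough Python sketch, not the code used for the clone; the names KNOWN_BOTS, bot_name and count_bot_hits are made up for illustration, and only the "Amazonbot" token comes from the user agent quoted above.

```python
from collections import Counter
from typing import Iterable, Optional

# Substrings that identify a crawler. Only "Amazonbot" is taken from the
# user agent quoted in the post; extend the tuple for other crawlers.
KNOWN_BOTS = ("Amazonbot",)


def bot_name(user_agent: str, tokens: Iterable[str] = KNOWN_BOTS) -> Optional[str]:
    """Return the first known crawler token found in the user agent, else None."""
    ua = user_agent.lower()
    for token in tokens:
        if token.lower() in ua:
            return token
    return None


def count_bot_hits(user_agents: Iterable[str]) -> Counter:
    """Count requests per crawler, given one user agent string per request."""
    hits = Counter()
    for ua in user_agents:
        name = bot_name(ua)
        if name is not None:
            hits[name] += 1
    return hits


AMAZON_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) "
    "Chrome/119.0.6045.214"
)
# Two crawler requests and one ordinary browser request (made-up sample).
sample = [AMAZON_UA, "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0", AMAZON_UA]
print(count_bot_hits(sample))
```

Feeding it the user agent column of one hour of the web server access log would give a per-crawler visit count comparable to the figures above.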
dobrichev
2016 Supporter
 
Posts: 1878
Joined: 24 May 2010

Postby 1to9only » Sat Jul 05, 2025 9:33 am

dobrichev wrote:Hours after I posted here the only public link to my forum clone, it was bitten by bots.

The first 50-60 users in a fresh phpBB install are mostly bots; deleting these may stop them!

A robots.txt in (I think) the htdocs folder may also block these bots.

To block all web crawlers from all content, robots.txt should contain:

Code:
User-agent: *
Disallow: /
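
A quick way to check what these two lines mean for a well-behaved crawler is Python's standard urllib.robotparser module. This is only a sketch: the user agent names and paths passed to can_fetch are arbitrary examples, and crawlers that ignore robots.txt are not affected at all.

```python
from urllib.robotparser import RobotFileParser

# Parse the two-line robots.txt shown above.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /"])

# Every path starts with "/", so every URL is disallowed for every crawler.
print(rules.can_fetch("Amazonbot", "/viewtopic.php?t=1"))
print(rules.can_fetch("Googlebot", "/"))
```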
1to9only
 
Posts: 4204
Joined: 04 April 2018

Postby dobrichev » Sat Jul 05, 2025 12:38 pm

Hi 1to9only,

robots.txt must be accessible and located in the root directory (and in the root of each subdomain, which doesn't apply here).

In fact my clone has the same robots.txt file as this forum.

Now I realise that its effect is ruined by the SEO mod, which rewrites most of the navigation links so that they no longer match the disallowed patterns.
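
The mismatch can be reproduced with Python's standard urllib.robotparser. This is a sketch under assumptions: the Disallow rule and both example links are hypothetical and are not copied from the forum's real robots.txt or from the SEO mod's actual URL scheme.

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /viewtopic.php",  # hypothetical rule, for illustration only
])

original_link = "/viewtopic.php?t=12345"      # numeric link before rewriting
rewritten_link = "/forum-clone-t12345.html"   # assumed shape of a rewritten link

# robots.txt rules are path prefixes: the numeric link matches the rule,
# the rewritten link does not, so crawlers may fetch it.
print(rules.can_fetch("Amazonbot", original_link))
print(rules.can_fetch("Amazonbot", rewritten_link))
```

Since the rules match by path prefix, a rule written for the numeric links says nothing about the rewritten ones.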

I am changing the content of robots.txt according to your proposal. Let's see the effect. Nevertheless, the problem with this forum remains.
dobrichev
2016 Supporter
 
Posts: 1878
Joined: 24 May 2010

Re: Forum clone

Postby dobrichev » Fri Jul 11, 2025 8:06 am

Filtering by robots.txt works.
About a day after the change, most of the robots disappeared.
dobrichev
2016 Supporter
 
Posts: 1878
Joined: 24 May 2010

