Troubleshooting



A Guide to Using MediaWiki in a Hosted Environment

An instructional website by the developer of mh370wiki.net - a MediaWiki site about Malaysia Airlines Flight MH370.


Apparently nothing works perfectly. Fortunately, most things go right most of the time.

Every Systems Administrator or MediaWiki Administrator will encounter things that aren't quite right. The question 'How do I find a solution?' can be rephrased as 'Where is the problem?'

The Internet enables a user to get a page from a web server. The end-to-end process involves a hierarchy of protocols, illustrated by the TCP/IP or OSI models, and a sequence of events. By analysing the steps involved it may be possible to isolate the fault, whether it is a networking issue, a specific process, a configuration setting, or a single point of failure such as a misspelling, an empty file, or a command or style which is not correctly implemented, and then fix it.

Common issues on a hosted system include mis-configured DNS, a domain record not pointed to the correct installation folder, a missing PHP library, a MediaWiki extension causing problems, or faults as simple as a missing ; in LocalSettings.php or a missing } in MediaWiki:Common.css. Simple things like these can cause the whole website to fail, or to load with default settings only.
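A missing } is easy to overlook by eye. The following is a minimal sketch, not a CSS validator: it only counts braces outside comments, and it would be fooled by braces inside quoted strings. The function name brace_balance and the sample stylesheet are illustrative assumptions, not taken from the mh370wiki.net site.

```python
import re


def brace_balance(css: str) -> int:
    """Return the number of '{' minus the number of '}' outside CSS comments.

    0 means the braces pair up; a positive value suggests a missing '}'.
    """
    # Remove /* ... */ comments so braces mentioned in comments are ignored.
    without_comments = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)
    return without_comments.count("{") - without_comments.count("}")


# Hypothetical fragment of MediaWiki:Common.css with one missing '}'.
sample = """
/* hypothetical styles */
.infobox { border: 1px solid #aaa;
.navbox { margin: 0; }
"""

print(brace_balance(sample))  # 1: one '{' has no matching '}'
```

A non-zero result only says that something is unbalanced; finding the exact rule still means reading the stylesheet.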

The end-to-end processes are summarised below:-


Web Browser and HTTP Requests

To find information on the Internet, a user needs a web browser and an internet connection through an ISP or mobile provider, and typically uses a search engine (Google, Bing, DuckDuckGo etc.).

Assuming the topic of interest is Malaysia Airlines flight MH370, the search criteria could be just MH370 or a term like MH370 SATCOM. The first page of Google search results includes mh370wiki.net with a link to an article Communications:Satellite Ground Station Logs - Key Observations/Analysis.

The URL is https://www.mh370wiki.net/Communications:Satellite_Ground_Station_Logs_-_Key_Observations/Analysis. mh370wiki.net is a MediaWiki-based website.

  • https is the protocol, the secure version of the HyperText Transfer Protocol. Before this secure protocol can be used, the Systems Administrator must log in to the account at the Hosting Provider, use cPanel tools to navigate to the Security section, and select SSL/TLS. "The SSL/TLS Manager will allow you to generate SSL certificates, certificate signing requests, and private keys."
  • www is a conventional hostname prefix (a subdomain) standing for World-Wide Web. Its inclusion in the URL is optional. In this instance, the Systems Administrator has specified 'the protocol and server name to use in fully-qualified URLs' with this setting in LocalSettings.php for the mh370wiki.net site: $wgServer = "https://www.mh370wiki.net";
  • mh370wiki is the registered (second-level) domain name, a combination of MH370 and MediaWiki.

  • .net is a generic top-level domain. Generic means it is not country-specific.

  • Communications:Satellite_Ground_Station_Logs_-_Key_Observations/Analysis is the Path
    • Communications: is a Namespace, identifiable by the colon :
    • Satellite_Ground_Station_Logs_-_Key_Observations is the title of a parent page. The underscore characters _ are not visible in the page title.
    • /Analysis is a sub-page.
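The breakdown above can be sketched with Python's standard urllib.parse module. This is a naive illustration: it assumes the page title sits directly after the domain (as on mh370wiki.net), that the first colon marks a namespace, and that the first / marks a sub-page, which is only true where subpages are enabled. The function name split_wiki_url is hypothetical.

```python
from urllib.parse import unquote, urlsplit


def split_wiki_url(url: str) -> dict:
    """Split a short-URL style wiki address into its named parts."""
    parts = urlsplit(url)
    # The path holds the page title; drop the leading slash and decode %xx.
    title = unquote(parts.path.lstrip("/"))
    namespace, colon, rest = title.partition(":")
    if not colon:
        # No colon: the page is in the main namespace.
        namespace, rest = "", title
    parent, _, subpage = rest.partition("/")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "namespace": namespace,
        # Underscores in the URL are shown as spaces in the page title.
        "parent": parent.replace("_", " "),
        "subpage": subpage,
    }


url = (
    "https://www.mh370wiki.net/"
    "Communications:Satellite_Ground_Station_Logs_-_Key_Observations/Analysis"
)
for name, value in split_wiki_url(url).items():
    print(f"{name}: {value}")
```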

When the user selects the item in the search results, the browser sends a DNS Lookup Request to get the IP Address for the mh370wiki.net domain.

If you use a command-line tool to ping mh370wiki.net, the (current)[1] result is 151.106.103.26. If you place that address in the Browser it will open the home page of a hosting provider based in the USA. The implication is that the IP Address 151.106.103.26 is shared among several users. It is not unique to the website mh370wiki.net. A whois search for mh370wiki.net will identify the domain registrar and the location of DNS nameservers. These queries can be done for any website.

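The browser uses the IP Address 151.106.103.26 as the destination for an HTTP request to GET the resource named by the Path. The domain name still travels with the request, in its Host header, which is how a shared IP Address can serve many websites.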
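As a rough illustration of what is sent, the sketch below builds the text of a bare HTTP/1.1 GET request. It is a simplification: a real browser adds many more headers, and with https the request is encrypted inside a TLS connection. The function name build_get_request is hypothetical.

```python
def build_get_request(host: str, path: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        # The Host header carries the domain name, so a server on a shared
        # IP address can tell which website is wanted.
        f"Host: {host}",
        "Connection: close",
        "",  # a blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


request = build_get_request(
    "www.mh370wiki.net",
    "/Communications:Satellite_Ground_Station_Logs_-_Key_Observations/Analysis",
)
print(request)
```

The IP Address itself does not appear in the request text; it is only the destination of the network connection.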


HTTP Request and the Hosting Provider

The HTTP request is routed through the Internet until it reaches the destination IP Address 151.106.103.26 where the domain name mh370wiki.net is extracted and compared with a list of Account Holders. When the match is made, the HTTP request is routed internally to a Web Server and a directory within an account holder's /home/public_html directory. Let's assume a simple name has been used and the path is /home/public_html/MH370_Website.

There are two critical items in the /MH370_Website directory:-

  1. A file named .htaccess. This acts as a 'gatekeeper' with instructions to the Web Server to send all requests to /w/index.php only.
  2. The subdirectory /w, which contains the MediaWiki installation folders and files.

There are also other items in the /MH370_Website directory:-

  1. A robots.txt file which tells search engine robots where to find a Site Map, and gives instructions which allow or disallow robots' access to parts of the website. For example, some Namespaces are accessible and search is allowed; others are disallowed.
  2. A file sitemap.xml which has been created using a custom tool (in this case, A1 Site Map Generator) and which is used by the search engines.
  3. Files which identify the site as registered with Google, Bing and Yandex (verification codes).
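The effect of robots.txt rules can be checked offline with Python's standard urllib.robotparser module. The rules below are a hypothetical example written for this sketch; they are not the actual robots.txt of mh370wiki.net, and the Sitemap address is assumed from the file name given above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
Sitemap: https://www.mh370wiki.net/sitemap.xml
"""

parser = RobotFileParser()
# parse() takes the file content as a list of lines; no network access needed.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url: str, agent: str = "*") -> bool:
    """Return True if the rules let the given robot fetch the URL."""
    return parser.can_fetch(agent, url)


article = (
    "https://www.mh370wiki.net/"
    "Communications:Satellite_Ground_Station_Logs_-_Key_Observations/Analysis"
)
script = "https://www.mh370wiki.net/w/index.php?title=Main_Page&action=edit"

print(allowed(article))     # article path is not under /w/
print(allowed(script))      # script path is under the disallowed /w/
print(parser.site_maps())   # Sitemap lines found in the file
```

The same check works against a live file by calling set_url() and read() instead of parse(), if network access is available.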

The next steps in the request to open this example webpage are processes within MediaWiki.


HTTP Request and MediaWiki

The entry point for MediaWiki is the file index.php. PHP is an open-source programming (scripting) language. The name is an acronym for PHP: Hypertext Preprocessor. PHP scripts are executed by a web server.

index.php contains only a few lines of code; processing is delegated to other scripts such as WebStart.php and Setup.php. The overall process is detailed in Manual:MediaWiki architecture and summarised here:-

  • perform security checks
  • load default configuration settings - See /w/includes/MainConfigSchema.php
  • get site settings from LocalSettings.php (and override defaults)
  • evaluate the type of request - is it view, to simply view a page, or another action such as edit or delete?
  • the page is generated as templates, parser functions and variables are expanded to give the wikitext
  • the wikitext is converted to HTML and includes CSS and links to image files
  • the browser renders the HTML code, applies the styles (CSS), and includes linked files such as images
  • the final appearance depends on the skin and other user preferences, may be affected by browser settings, and depends on the screen resolution.
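The 'type of request' step can be pictured as reading the action parameter from the URL, with view as the fallback when none is given. This is only a sketch of the idea in Python, not MediaWiki's own PHP code; the function name requested_action and the page name Main_Page are assumptions.

```python
from urllib.parse import parse_qs, urlsplit


def requested_action(url: str) -> str:
    """Return the action named in the query string, or 'view' if absent."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each parameter to a list of values; take the first one.
    return query.get("action", ["view"])[0]


print(requested_action("https://www.mh370wiki.net/w/index.php?title=Main_Page&action=edit"))
print(requested_action("https://www.mh370wiki.net/Main_Page"))
```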


Some Extra Detail

If short URLs are not configured then the website URL will look like domain name/w/index.php/page name. This shows that MediaWiki is installed in a /w subdirectory and that index.php is the point of entry.

When short URLs are configured the website URL will typically look like domain name/wiki/page name. For a page to load, configuration in .htaccess and in LocalSettings.php must match.

The mh370wiki.net website URL does not have the /wiki. For pages to load, the default settings for $wgScriptPath and $wgArticlePath have to be overridden in LocalSettings.php.
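In $wgArticlePath the placeholder $1 stands for the page title. The sketch below mimics that substitution in reverse to show why the web server rules and LocalSettings.php have to agree: a request path only yields a title if it fits the configured pattern. It is written in Python purely as an illustration; the value "/$1" for mh370wiki.net is an assumption based on the URL shown earlier, and title_from_path is a hypothetical name.

```python
from typing import Optional


def title_from_path(article_path: str, request_path: str) -> Optional[str]:
    """Recover the page title from a request path, given an article path
    pattern such as "/wiki/$1". Return None if the path does not fit."""
    prefix, placeholder, _ = article_path.partition("$1")
    if not placeholder:
        raise ValueError("article path must contain the $1 placeholder")
    if not request_path.startswith(prefix):
        return None
    return request_path[len(prefix):] or None


# Typical short URL layout.
print(title_from_path("/wiki/$1", "/wiki/Main_Page"))

# Assumed layout for mh370wiki.net: the title follows the domain directly.
print(title_from_path(
    "/$1",
    "/Communications:Satellite_Ground_Station_Logs_-_Key_Observations/Analysis",
))

# Mismatch: the site expects /wiki/ but the request lacks it.
print(title_from_path("/wiki/$1", "/Main_Page"))
```

The last call returns None, which is the situation where a page fails to load because the two configurations do not match.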

The example article from mh370wiki.net is in a custom namespace, Communications, identified in configuration by the constant NS_COMMUNICATIONS. For a public user, called anonymous or * in MediaWiki, permission to read the items in this namespace is specifically granted using Extension:Lockdown:

$wgNamespacePermissionLockdown[NS_COMMUNICATIONS]['read'] = array('*');

The Path to the page to be loaded includes a sub-page. Subpages have been allowed in the NS_COMMUNICATIONS namespace:

$wgNamespacesWithSubpages[NS_COMMUNICATIONS] = true;

All the checks done from point of entry /w/index.php must confirm that permission to access the namespace is allowed, and read access to the requested page is allowed.


Notes:-

  1. The IP Address is assigned by the Hosting Provider and it can be changed without you knowing. The DNS records will be automatically updated by the Hosting Provider; access to the website will be unchanged. However, to speed up access to the mh370wiki.net website I added the IP Address to the Windows hosts file on my home PC. The operating system checks the hosts file before sending a DNS request, so a match there is quicker. And it worked well, until it didn't. Access to my own website became terribly slow but other websites loaded faster. The Hosting Provider had changed the site's IP Address. When the entry in the hosts file was removed everything returned to 'normal'. The hosts file is located at C:\Windows\System32\drivers\etc\hosts.

    A related issue can occur because DNS records are also cached locally. The cache can be cleared by opening the Windows Command Prompt screen as an administrator and typing ipconfig /flushdns
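    A stale hosts entry like the one in Note 1 can be spotted by comparing the file with what DNS currently reports. The sketch below is a minimal illustration under stated assumptions. The resolver is passed in as a function because the operating system's own lookup (for example Python's socket.gethostbyname) honours the hosts file and would simply return the pinned address; a real check needs a resolver that queries a DNS server directly. The fake resolver and the address 203.0.113.10 (from a range reserved for documentation) are stand-ins, and the names parse_hosts and stale_entries are hypothetical.

```python
from typing import Callable, Dict, Tuple


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each host name in hosts-file text to its pinned IP address."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            entries[name.lower()] = address
    return entries


def stale_entries(
    hosts_text: str, resolve: Callable[[str], str]
) -> Dict[str, Tuple[str, str]]:
    """Return {name: (pinned, current)} where the hosts file disagrees with
    the address reported by `resolve` (which must bypass the hosts file)."""
    stale = {}
    for name, pinned in parse_hosts(hosts_text).items():
        try:
            current = resolve(name)
        except OSError:
            continue  # name could not be resolved; nothing to compare
        if current != pinned:
            stale[name] = (pinned, current)
    return stale


hosts_text = """
# hypothetical hosts file content
151.106.103.26  www.mh370wiki.net  mh370wiki.net
"""

# Stand-in for a direct DNS query after the provider moved the site.
fake_dns = {"www.mh370wiki.net": "203.0.113.10", "mh370wiki.net": "203.0.113.10"}

print(stale_entries(hosts_text, fake_dns.__getitem__))
```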