Mabinogi World Wiki:Wiki Server News/2011
2011
First Quarter
January 5, 2011
- Since replacing the kernel and a reboot on Christmas Eve, the server has held solid. There are some other issues that I'm observing, for which I am locating solutions. Most notably, the network connection will sporadically freeze momentarily. The system console will still respond, however. I suspect it may be an SMP-related kernel deadlock. A newer PPA might help solve this.
- I am periodically getting messages indicating issues with storage, but the issue seems to resolve itself after a short time. The array has held solid the entire uptime thus far. The issues with the network interface and storage appear to be related, likely an SMP-related issue.
- Told PHP in FastCGI mode not to run as a supervisor to subchildren, instead making mod_fcgid manage the processes. This alone appears to have solved the runaway memory consumption issue.
- Updated MediaWiki to 1.16.1.
January 10, 2011
- Due to a power distribution issue, the server went down Friday, January 6, 2011. Unfortunately, I was unable to do anything since I was out of town until the following Sunday. The issue has since been rectified and the server brought back online.
January 13, 2011
- Updated the network driver to see if it helps with issues recently observed on the network interface.
January 17, 2011
- Installed Bugzilla so users and staff can report issues with the site as they see them. It also allows suggestions and requests. The site is located at http://www.mabinogiworld.com/bugzilla. More details here in the forums. UPDATE: Disabled due to the software's exposure of users' e-mail addresses, so as to mitigate the potential for address harvesting by spammers.
- Switched from xcache to APC for PHP opcode caching. Anecdotal reports describe xcache forcing Apache prefork + mod_php setups to restart Apache periodically, and forcing sites that run PHP as FastCGI to kill moribund or dead PHP processes. Monitoring continues.
January 19, 2011
- I have noticed that the changeout to APC has not abated the periodic network hangs. Other recent circumstances on my network are prompting me to consider other avenues of investigation. However, the site continues to run well.
- In preparation for Generation 13's release, I added a few more tweaks to the PHP startup FastCGI scripts for both the forum and wiki sites (as well as elsewhere on the server) so that PHP interpreter lifetimes should be short enough so as not to consume a lot of memory, yet long enough to be somewhat useful.
- Additionally, I rebooted the server to complete implementation of some updates released for Ubuntu. Alongside the abovementioned script tweaks, this action was also taken in advance of and in preparation for Generation 13's release.
February 5, 2011
- Spurious disk errors have persisted, so I have again tried PPA kernels based on Linux 2.6.36 and 2.6.37, which I had used in the past, to no avail. I am now using a 2.6.38-based kernel from the forthcoming release of Ubuntu, Natty Narwhal, to see if this helps.
- With recent template changes, users have reported seeing blank pages when browsing the wiki. I have put debugging code into the configuration, per a page on MediaWiki's documentation site, to try to expose any error messages which would otherwise not show.
February 6, 2011
- Effective immediately, we will not allow data exports of the site to anyone unless they are site admins or have access to the box, whether it be user or physical access. This is an effort to block wholesale copying of the website to other prospective site admins on the Internet. If you want to discuss this in detail, please contact us on the forums or on IRC (web chat portal here).
February 15, 2011
- Enacted several anti-spam countermeasures recently put in place on our sister site, the DFO World Wiki, also hosted on this server.
- Spurious disk errors continue to occur. Am continuing to explore other options.
March 3, 2011
- Now using a kernel from the forthcoming release of Ubuntu, 11.04, codenamed Natty Narwhal. This new kernel is being used along with irqbalance. As of this writing, the server has been up for nearly three days without any signs of spurious ATA errors or ethernet device hang messages.
March 5, 2011
- While not strictly server-related, I am pleased to announce that we are an affiliate with the ggFTW MMORPG community. A very big shout-out goes to NoeJeko, owner of ggFTW. He is taking his gig full time and I wish him the very best!
March 8, 2011
- Server is still running well. The only storage message I saw in dmesg was a routine md check to make sure things were working. As before, there have been no spurious disk errors, ethernet device hangs or CPU soft lockups with an uptime of eight days and nearly six hours as of this post.
March 16, 2011
- Had two recent reboots for non-trouble-related reasons, once to insert a device to measure how much power the server was using (about 7 watts when powered off; between 110 and 160 watts when powered on), then again for removing the device from the circuit. The server continues to operate trouble-free.
- Updated MediaWiki to 1.16.2.
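For a rough sense of what the power readings above mean over time, the arithmetic can be sketched as follows. This is only an illustration: the 24-hour day and 30-day month are assumptions, and `monthly_kwh` is a made-up helper name, not something that was run on the server.

```python
def monthly_kwh(watts, hours_per_day=24, days=30):
    """Convert a steady draw in watts to kilowatt-hours over `days` days.

    hours_per_day and days are assumed defaults, not measured values.
    """
    return watts * hours_per_day * days / 1000


# Readings reported in the entry: about 7 W powered off, 110-160 W powered on.
for label, watts in [("off", 7), ("on, low", 110), ("on, high", 160)]:
    print(f"{label}: {monthly_kwh(watts):.1f} kWh per 30 days")
```

Under those assumptions, the powered-on range works out to roughly 79 to 115 kWh per 30 days of continuous operation.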
March 24, 2011
- Updated the kernel to the latest Natty Narwhal release.
- Updated GRUB to the latest Natty Narwhal release.
- Server was rebooted several times early this morning, mainly to troubleshoot an issue which only affected the console and which came to light when I upgraded the kernel after that highly successful run. The issue appears to be precipitated by the newest package releases of GRUB, the program that's used to boot the system. After successfully reproducing the issue on a virtual machine, I commented out the troublesome part (all of three lines in a shell script which rebuilds GRUB's config file) and updated the GRUB config file, and the console works again.
March 29, 2011
- Replaced the FTP server with vsftpd for security reasons, notably its better overall security track record compared to what it had replaced, ProFTPd. This will not affect most users, though some administrators who have FTP access to the server might see issues. Please report any such issues to me immediately.
Second Quarter
April 5, 2011
- Upgraded the kernel to the most recent release from Natty, 2.6.38-8-server.
April 15, 2011
- Earlier this week, the system went into a high CPU usage mode for unknown reasons. One side effect was that the system went into swap and stayed there. Once Apache's configuration was tuned so that fewer PHP processes were running and the system no longer went into swap, it was discovered that PHP was at the root of the problem. I have since updated PHP to what's currently in Natty Narwhal. At the rate I am going, I may end up going fully Natty even before its final release, even if in piecemeal fashion.
April 26, 2011
- I will be putting a hold on any system upgrades for the next few days owing to G14's forthcoming release as well as the joint Nexon-Mabinogi World event that took place on IRC. I will only install upgrade packages on an as-needed basis, e.g. to fix a stability or security problem.
April 27, 2011
- Upgraded to MediaWiki 1.16.4.
May 3, 2011
- Upgraded the operating system to 11.04, codenamed "Natty Narwhal". Will continue to monitor.
May 20, 2011
- Both drives in the server's storage array received a firmware update from Seagate. Total downtime: None. The system is running a RAID 1 array which allowed the server to remain completely operational while each drive was removed from the server, its firmware updated and reinserted.
May 21, 2011
- Upgraded to MediaWiki 1.16.5.
May 22, 2011
- Overnight, I received an alarm from the server. A user was using a "web crawling" program to "spider" the site, causing the load to jump considerably. As a result, I blocked the software's user agent so that any requests from that user's software return errors instead of going through the interpreter processes and driving up the system load.
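The idea behind a user-agent block can be sketched in a few lines. This is a minimal illustration only, not the configuration actually used on the server: the pattern `ExampleCrawler` and the function `status_for` are hypothetical, since the entry does not name the crawler or the mechanism that was used.

```python
import re

# Hypothetical pattern; the entry does not name the crawler that was blocked.
BLOCKED_AGENTS = [re.compile(r"ExampleCrawler", re.IGNORECASE)]


def status_for(user_agent: str) -> int:
    """Return 403 for a blocked user agent, 200 otherwise.

    Rejecting the request this early means it never reaches the
    (comparatively expensive) interpreter processes.
    """
    if any(pattern.search(user_agent) for pattern in BLOCKED_AGENTS):
        return 403
    return 200


print(status_for("ExampleCrawler/1.0"))  # blocked agent
print(status_for("Mozilla/5.0"))         # ordinary browser
```

In practice a check like this would normally live in the web server's own configuration, so that blocked requests are turned away before any application code runs.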
June 21, 2011
- Due to a security issue in PHP 5.3.6 and prior releases, uploads have been disabled until new packages have been released by Ubuntu or I have built a PHP interpreter with the fix in place.
June 23, 2011
- Implemented a fix for CVE-2011-2202. Debian has already gone through the process of fixing it, but as far as I can tell Ubuntu has not. Reporting the issue to Ubuntu.
June 27, 2011
- Took the server down to replace four cooling fans. The issue was only that the original fans were quite loud as this server is located within living space. As such, I replaced the fans with something quieter.
- Migrated the database filesystem from XFS to ext4fs. This is a move to ensure reliability since some of the recent events of extreme resource usage have had XFS-related messages accompanying them. This will be the first of several such moves. The other moves will be planned and executed soon with announcements to the appropriate areas for notification.
June 28, 2011
- Migrating the server log partition to ext4fs. The operation is presently ongoing.
- UPDATE: The move is complete. Required a total downtime of a minute and a half to two minutes, split between before and after the operation.
- Migration is underway on the partition which contains this and other websites to ext4fs. It is presently being served from a different partition in the meantime while the rest of the partition gets populated. The site needed to be taken down briefly to destroy the underlying filesystem in favor of a new one. Total downtime was approximately ten minutes. I also migrated another filesystem during this time.
- UPDATE: The move is now complete. Required a downtime of less than two minutes to sync up the on-new-filesystem copy of the site from the on-temporary-filesystem copy.
June 29, 2011
- Had an outage for approximately an hour and a half out of a two-hour scheduled window starting at 2300 hrs on Jun 28, 2011. Procedure involved removal of one of the new cooling fans which had failed, along with migrating the remaining filesystems to ext4fs. The server is now running entirely on ext4fs. Will schedule another brief outage to install the replacement fan.
- At 1158 hrs PDT, I took the server down for ten minutes to install the replacement fan. The server now has four working, quiet cooling fans again.
Third Quarter
July 2, 2011
- Unfortunately, the reprieve I saw would not last. This time, a kernel-bound LVM2 process caused all processes subsequent to it to backlog, causing the machine to enter a high-swap-use state and stay there, driving the load very high, to 800 or more. To help deal with spurts of memory usage, I have adjusted the point at which the system starts using swap. Hopefully, this will help with sudden memory use spikes. It appears the filesystem migration to stave this issue off was for naught, but some of the features of ext4fs, including checksummed journals, made it desirable to migrate anyway. As such, the filesystem change will remain at present.
July 25, 2011
- Rebooted the server to update the kernel to fix a serious problem.
August 3, 2011
- Ninja update: Around the end of May or beginning of June, I added IPv6 support to the server and the site as a test for World IPv6 Day. The site continues to run with this support. Are you on an IPv6-capable connection? Give it a go! Please let us know of any issues. UPDATE: Rescinded for the time being due to the below.
September 21, 2011
- On Labor Day (September 5, 2011), the server had to be relocated very suddenly, leaving me literally no notice to arrange a smooth transition for web service. However, thanks to Odin, at least word got out about the state of the wiki. The server is now back at its original location in the datacenter in the northern part of the city of Fresno. Also, due to these circumstances, I have had to remove IPv6 support from the site. It should come back soonish, but there is no set timetable just yet. More information as it becomes available.
- Removed GeoIP-based bans on the site in response to recent correspondence on the matter and given recent countermeasures enacted. More information can be found here.
September 30, 2011
- On September 27, 2011, the server had to go down for about three hours due to a power bus upgrade in the cage where it is colocated. Original ETR was approximately half an hour but ended up being longer than that. However, I only had one hour's worth of notice. Despite the fact I had no easy way of getting the word out, I managed to shut the server down gracefully before the maintenance event began. The server now sits on a UPS which will supply power until a diesel generator comes online should any outage last longer than a few minutes.
Fourth Quarter
October 19, 2011
- On October 9, 2011, I took the server down for about an hour to reinstall the original fans in the machine. Since there is no need for quiet-operating fans, the original fans the machine shipped with have been reinstalled. I also took the opportunity to move the IPMI (Intelligent Platform Management Interface) port back onto the machine's primary ethernet port. That way the server occupies one switchport rather than two. The reason for the earlier split was that, while I was still in Fresno, the server was on a Cisco Catalyst switch whose ports take about half a minute to come online after a device is freshly plugged into a port or becomes active. When the machine is soft-rebooted, the system's two ethernet ports are "bounced" (taken down, then brought back up), which kills IPMI if it shares one of those ports. Since the switch the server is currently plugged into does not exhibit this behavior, I have moved the IPMI port back to the server's first ethernet port.
November 2, 2011
- I have instituted a CAPTCHA due to recent fits of spam on the site. I will likely tighten the screws as time goes on. Be advised I may institute regional blocks allowing only licensed areas to edit the site if too much spam comes in from outside licensed regions. Neither I (the server's owner) nor Mabinogi World staff have anything against your country; any block we institute is aimed only at the activity emanating from it. Right now, there is no region-based blocking on the site. If your region has access to Mabinogi and has a site similar to ours, we would rather you use and, if and where applicable, update that site. Remember, if you use any material from our site, please follow our copyright guidelines for same. Please bear in mind that this site covers North America and Oceania (Australia and New Zealand) users.
- Machine was rebooted after dealing with a memory starvation issue. The machine went non-responsive at about 2119 PST, whereupon I was informed. I checked the server and killed off the hosting processes. I had meant to reboot the server since I had updated the kernel, so this event offered me the opportunity to do so. Thankfully, it was a clean shutdown and reboot. This will likely be the final mainline update to Natty Narwhal. I will likely install Oneiric Ocelot's main kernel, Linux 3.0, to test how well the machine performs in advance of a full upgrade to Oneiric. Will call a maintenance event soon. Additional note: earlier today, I had installed a VPN server process on my other colocated machine so I can use the IPMI service on this machine. The timing could not have been better given this event. I was able to set it up and test it without the pressure or duress of this event.
November 4, 2011
- Server went into another situation where it would consume all available memory. Used this opportunity to upgrade the kernel to Oneiric Ocelot's kernel, Linux 3.0.0. Server started up and is running properly. Will upgrade the entire distribution based on performance.