Wikispooks talk:Site Backup

Site database dump shrinkage

For information: On the old server both Piwik analytics and site data were held in the same database. On the new server I've split them so that the dump contains only regular site data. The zipped file is now only 76 MB --Peter P (talk) 17:43, 17 January 2014 (GMT)

Request to improve backups

How about GZipping the daily .sql dump? This should reduce it by about 2/3 - making for a much easier download. Robin (talk) 02:00, 5 January 2014 (GMT)

Agreed. I too have been mulling over backups. I think it would also be better to provide just the SQL dump and the /images directory structure. That would obviate the need to provide the entire suite of MW and its extensions and remove any risk associated with LocalSettings.php. We would need to adjust the instructions a bit so that they include info about which extensions are mandatory, current version info, etc., but it would further reduce both backup size and content complexity. I'll have a look. We'll no doubt need to tweak your script. --Peter P (talk) 09:38, 5 January 2014 (GMT)
I've tweaked your script and tested it - backup.3-2.sh - seems to be OK. It does two backups (the images directory to Files-date.zip and ws.sql to sqldump-date.zip), copies both to -latest and deletes old zips as before. The results of the test are in the backups directory now. Another reason to change to just the SQL dump and the images directory is the complex set of exclusions, which may need to change. We can easily add the MW script and extension files to a separate backup if necessary. I'll start on revising the instructions if you're happy with this. --Peter P (talk) 15:24, 5 January 2014 (GMT)
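For reference, a minimal sketch of what a two-part backup like this might look like; the paths, database name and retention period are illustrative placeholders, not necessarily what backup.3-2.sh actually uses:

 #!/bin/bash
 # Sketch only: paths, DB name and credentials (assumed in ~/.my.cnf) are placeholders.
 DATE=$(date +%Y-%m-%d)
 BACKUP_DIR=/var/backups/wikispooks   # hypothetical destination
 WIKI_DIR=/var/www/wiki               # hypothetical MediaWiki root
 
 # 1. Dump the site database and zip it
 mysqldump --single-transaction ws_wiki > "$BACKUP_DIR/ws.sql"
 zip -j "$BACKUP_DIR/sqldump-$DATE.zip" "$BACKUP_DIR/ws.sql"
 
 # 2. Zip the images directory
 zip -r "$BACKUP_DIR/Files-$DATE.zip" "$WIKI_DIR/images"
 
 # Keep fixed-name copies for easy downloading and drop older dated zips
 cp "$BACKUP_DIR/sqldump-$DATE.zip" "$BACKUP_DIR/sqldump-latest.zip"
 cp "$BACKUP_DIR/Files-$DATE.zip" "$BACKUP_DIR/Files-latest.zip"
 find "$BACKUP_DIR" -name '*-20*.zip' -mtime +14 -delete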

Progress

I've been preoccupied with grandchildren today - will be tomorrow too. The revised shell script tested OK. We now have 3 separate sets of aging backups: the files, the images and the software. I needed to incorporate the iframeDocs and ClimateChange directories into the files backup, which is why it has grown. I should get the instructions finished tomorrow. I've also set up a cron job to push the files and database backups to another server. If you have one with shell access, just give me access credentials and I'll get them pushed there too if you like. --Peter P (talk) 16:05, 6 January 2014 (GMT)
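For the record, the push to another server can be as simple as a cron entry plus rsync over SSH; the schedule, paths and remote host below are placeholders rather than the actual setup:

 # Hypothetical crontab entry: push the latest zips to a second server at 03:30 each night
 30 3 * * * rsync -az /var/backups/wikispooks/*-latest.zip backupuser@mirror.example.org:/srv/ws-backups/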

Good work on this. How about keeping a monthly "all in one file" backup as before - for people who don't want to read but just want to get everything in one click? Robin (talk) 17:00, 7 January 2014 (GMT)

Difficulties?

It seems like you've decided against the 'unzip and go' approach, suggesting people install extensions manually. This increases the difficulty, as well as looking more daunting. Why the change? Robin (talk) 13:59, 14 January 2014 (GMT)

Lots of issues encountered trying to do it the 'easy' way:
Most relate to server dependencies that one might assume (I did) would be met simply by virtue of having the latest Linux LAMP stack distribution. Not so, and getting to the bottom of missing-dependency errors is far from straightforward. The MediaWiki install process checks the environment and enumerates any problems clearly. For example, in this brand new CentOS 6.5 test case, PHP was not compiled with xml, pdo, mysql, mbstring or gd support, all of which are essential. Neither did the distro include Memcached or APC, so existing statements in the backed-up LocalSettings.php produced non-obvious errors. There were a number of other issues too which I can't remember off-hand. Anyway, the upshot was that, rather than try to document all the pitfalls and have a would-be backup user check them all off, it would in fact be far easier to install MediaWiki + extensions conventionally and let it report problems clearly as it progresses. Doing it that way, I actually yum-installed the needed packages as the MW install progressed and it was done very quickly.
In any event, I doubt the manual extension install process would put off a prospective user, because he/she will certainly need more than a modicum of expertise to attempt it. --Peter P (talk) 15:30, 14 January 2014 (GMT)
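For anyone retracing this on a similar CentOS 6.x box, the missing pieces were roughly the following packages; exact package names vary between repos and versions, so treat this as a hint rather than a definitive list:

 # Approximate CentOS 6.x package names for the dependencies mentioned above
 yum install php-xml php-pdo php-mysql php-mbstring php-gd
 yum install memcached php-pecl-memcache
 yum install php-pecl-apc
 # then restart the web server so PHP picks up the new modules
 service httpd restart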

wikispooks.sql??

What is wikispooks.sql? Is it needed? Robin (talk) 07:14, 23 July 2014 (IST)

It is the MySQL database dump - i.e. the same as ws-latest.zip, but unzipped, and regenerated every day. I could put it in a different directory so it would not appear on the list, but why bother? --Peter P (talk) 10:13, 23 July 2014 (IST)
Oops, sorry. It is an old backup done whilst upgrading-testing those SMW issues. No longer needed. I'll delete it. --Peter P (talk) 10:25, 23 July 2014 (IST)

BitTorrent and WebTorrent and Decentralization

One way to proliferate these backup zips and ease your bandwidth burden is to share them via BitTorrent and WebTorrent. Further, I would include a simple text or basic HTML file with these wonderful "Site Backup" DIY instructions. A dedicated email announcement with every update would be great too, so I could just click and co-seed your torrents.
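As a rough illustration, turning the latest backup zip into a torrent could look something like this (the tracker URL and file names are just placeholders, and mktorrent would need to be installed):

 # Hypothetical example: create a torrent for the latest SQL backup zip
 mktorrent -a udp://tracker.example.org:1337/announce -o ws-latest.torrent ws-latest.zip
 # then seed ws-latest.torrent with any BitTorrent client and publish the magnet link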

See also: https://saidit.net/s/DecentralizeAllThings/comments/c5p/wikispooks_bundles_and_shares_its_entire_site_as/

I wrote this article a few months ago: https://en.wikipedia.org/wiki/WebTorrent

Also, consider embracing IPFS, Holochain, ZeroNet, Fediverse, or other decentralization protocols and platforms. This article was censored on Wikipedia: https://infogalactic.com/info/InterPlanetary_File_System

https://en.wikipedia.org/wiki/Everipedia claims to be a decentralized wiki but I don't know if they're legit or not, stalled or not, etc. The powers that be cannot afford to allow any free decentralized encyclopedias to exist. I want it more than anything. I'm grateful for InfoGalactic as a free fast deep alternative but it is very limiting and the admins are very unresponsive.

I hope to soon have a dedicated box for serving mirrors of SaidIt, Wikispooks, PeerTube, Mastodon, etc. Would love to hear more about your future plans. ~ JasonCarswell (talk) 19:26, 6 February 2019 (UTC)

Thanks Jason

It would be nice to get a 3rd party mirror of Wikispooks running, so any help needed with that, just ask. We might even look at configuring your database as a slave so that it auto-updates.
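Very roughly, and assuming the mirror runs its own MySQL instance, the replica setup might look like this; host, user, password and binlog coordinates are all placeholders:

 # On the master (my.cnf): enable binary logging and give the server an id
 #   [mysqld]
 #   server-id = 1
 #   log-bin = mysql-bin
 # On the replica, point it at the master and start replication:
 mysql -u root -p -e "CHANGE MASTER TO
   MASTER_HOST='wikispooks.example.org',
   MASTER_USER='repl',
   MASTER_PASSWORD='placeholder',
   MASTER_LOG_FILE='mysql-bin.000001',
   MASTER_LOG_POS=4;
 START SLAVE;"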

Robin is in the driving seat so far as site structure and development are concerned. Automated scraping of good information is an area of development. The site would also benefit from professional Linux sysop attention. We keep hitting resource-related issues. I keep things going but am very aware of my techie shortcomings. --Peter (talk) 10:13, 7 February 2019 (UTC)

Decentralized Hosting

I'm no expert, but it's my understanding that technologies such as IPFS could basically host a webpage in a decentralized manner as if it were a torrent or something. Perhaps this could be the path towards decentralization?

Again, I'm not an expert, so I'm really just looking for discussion about whether or not this is a viable path. --Sirjamesgray (talk)
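Not an endorsement, but as a sketch of how a backup zip could be published over IPFS (assuming the go-ipfs command-line client is installed; the file name is illustrative):

 # Hypothetical example: add the latest backup to IPFS and note the returned content hash
 ipfs init                # once, on first use
 ipfs add ws-latest.zip   # prints a CID that anyone can fetch or pin
 ipfs daemon              # keep running so other nodes can retrieve the file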

2021

Would it make sense to shrink the size of the backup file by reducing the resolution of some very big pictures that do not need to be that large here on WS, like this -> B Clinton -- Sunvalley (talk) 00:01, 28 July 2021 (UTC)

Yes, that would be better at more like 40k - there's no reason to waste hard disk space with such big images. If you upload a smaller version, an administrator can delete the old version. -- Robin (talk) 18:17, 28 July 2021 (UTC)
I did notice that before; I can no longer upload a new version via "Upload a new version of this file" on the page of the existing image - it gives a long error message, something about "local-backend". Should I upload new ones with a marker, like "Bill Clinton-smaller"? I would put a notification here and deal with them as I come across them - or is there a list you could create that sorts files by size from the top down? -- Sunvalley (talk) 00:27, 31 July 2021 (UTC)
Oh, I can do that myself via the File list; Terje, I see Darley and Alexander Hay Ritchie with extremely high resolution (62.5 MB). Also USN 1047895 (25475244355).jpg (29.06 MB), Wittenburg Castle.jpg (22.72 MB) and more. I think it may be better to keep the backup size down - some may not have a good enough connection, or don't want to download 10 GB at once, or some other reason. Not saying we should get overly cautious with picture uploads, but I think these are wasting space, and many more generic Wikipedia pictures at 10-15 MB do too. Also there are some recent video uploads which make me wonder why these should be here and not on archive.org or one of the plethora of video platforms that now exist. So question: do these have to be in such high resolution for a reason I currently don't see, or could we have a policy to use the smallest version of a picture that will do the job here? -- Sunvalley (talk) 22:05, 31 July 2021 (UTC)
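A possible way to find and shrink the worst offenders locally before re-uploading, assuming ImageMagick is available; the images path, size threshold and target width are arbitrary examples:

 # List files in the images directory larger than 10 MB, biggest first (path is illustrative)
 find /var/www/wiki/images -type f -size +10M -exec ls -lhS {} +
 # Resize a copy of one of them to at most 1600px wide before re-uploading
 convert "Wittenburg Castle.jpg" -resize 1600x1600\> "Wittenburg Castle-smaller.jpg"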