This repository contains the Python 3 scripts used by the Nordic Museum to upload images to Wikimedia Commons. It is based on lokal-profil/upload_batches.
This is a work in progress started in autumn 2017 and ongoing throughout spring 2018. For more details, contact Aron Ambrosiani or Alicia Fagerving. The remaining work is listed as Issues.
- Blog post (in Swedish) about how to copy images from Digitalt Museum to Wikimedia Commons using this repository
- Documentation of the Digitalt Museum API
To run it you will have to install BatchUploadTools and pywikibot using:

```
pip install -r requirements.txt
```

Note: you might have to add the `--process-dependency-links` flag to the above command if you are running a different version of pywikibot from the required one.
The script must be run from a Wikimedia Commons account with the `upload_by_url` user right. On Wikimedia Commons this is limited to users with one of the `image-reviewer`, `bot`, `gwtoolset` or `sysop` flags. Apply for bot rights.
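If you are unsure whether your account already has the right, you can ask the API through pywikibot. A quick sanity check, assuming pywikibot is already configured for Commons as described further down:

```python
import pywikibot

# Sanity check: does the logged-in account have the upload_by_url right?
site = pywikibot.Site('commons', 'commons')
site.login()
if site.has_right('upload_by_url'):
    print('OK: this account can upload by URL.')
else:
    print('Missing upload_by_url; apply for bot rights first.')
```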
Every upload batch relies upon two settings files:
- Batch-specific settings.
- Institution-specific settings. The name of this file has to correspond to the institution code in the DigitaltMuseum system, e.g. `S-NM.json` for Nordiska Museet.
Some of the settings can be provided via command-line parameters (use `-help` to see the available ones), but most of them have to be stated in the appropriate settings file. See the settings directory for examples. Command-line values take precedence over those provided by the settings file.
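As an illustration of that precedence rule, here is a minimal sketch. It is not the scripts' actual option handling (they use pywikibot-style `-flag:value` arguments), and the flag names below are placeholders:

```python
import argparse
import json

# Sketch of command-line values taking precedence over the settings file;
# the flag names are illustrative, not the scripts' real ones.
parser = argparse.ArgumentParser()
parser.add_argument('--batch-settings', default='settings/settings.json')
parser.add_argument('--cutoff', type=int)
args = parser.parse_args()

with open(args.batch_settings, encoding='utf-8') as f:
    settings = json.load(f)

# Only flags actually given on the command line override the file.
settings.update(
    {key: value for key, value in vars(args).items() if value is not None})
```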
The following settings have no default values and must always be provided (an illustrative example follows this list):
- `api_key`: your Digitalt Museum API key (as provided by KulturIT).
- `glam_code`: institution code in Digitalt Museum. List of institution codes for Swedish museums.
- `folder_id`: unique id (12 digits) or uuid (8-4-4-4-12 hexadecimal digits) of the Digitalt Museum folder used.
- `wiki_mapping_root`: root page on Wikimedia Commons of which all mapping tables are subpages (e.g. `Commons:Nordiska_museet/mapping` for the Nordic Museum).
- `default_intro_text`: default wikitext to add at the top of a mapping table page, with `{key}` as the placeholder for the mapping table type (one of `keywords`, `people` or `places`).
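Put together, a batch settings file might look roughly like this. Every value below is a placeholder (only the keys come from the list above), so substitute your own:

```python
import json

# Illustrative batch settings; all values are placeholders.
example_settings = {
    'api_key': 'your-dimu-api-key',       # provided by KulturIT
    'glam_code': 'S-NM',                  # institution code in DigitaltMuseum
    'folder_id': '123456789012',          # 12-digit id or a uuid
    'wiki_mapping_root': 'Commons:Nordiska_museet/mapping',
    'default_intro_text': 'Mapping table for {key}.\n',
}

with open('settings/settings.json', 'w', encoding='utf-8') as f:
    json.dump(example_settings, f, ensure_ascii=False, indent=4)
```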
- Modify `settings.json` to fit your project.
- If it doesn't exist yet, create an institution settings file (see above).
- Create `user-config.py` with the bot username.
- Create `user-password.py` with the bot username & password. Generate a bot password. (A minimal sketch of both files follows this list.)
(On Wikimedia Commons, prepare the templates needed & apply for bot rights including `upload_by_url` if you don't already have them.)
- Run `python importer/DiMuHarvester.py -api_key:yourDiMuAPIkey` to scrape info from the DiMu API and generate a "harvest file". Example output. (Note: if the harvest breaks, check the `harvest_log_file` to find the last UUID in the list.) If you want to re-harvest from the local cache, add the flag `-cache:True`. (A rough sketch of this kind of API call appears at the end of this page.)
- Run `python importer/DiMuMappingUpdater.py` to pull the harvest file and generate mapping files for Wikimedia Commons.
- Upload the generated mapping files in the `/connections` folder to Wikimedia Commons. Example: location of the Nordic Museum mappings.
- Perform the mapping in the mapping tables.
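Both pywikibot configuration files are plain Python that pywikibot executes with a few names (such as `usernames` and `BotPassword`) predefined. A minimal sketch, with `ExampleBot` and the bot-password name `upload` as placeholders:

```python
# user-config.py -- pywikibot executes this with `usernames` predefined.
family = 'commons'
mylang = 'commons'
usernames['commons']['commons'] = 'ExampleBot'  # placeholder bot username
password_file = 'user-password.py'
```

```python
# user-password.py -- placeholder credentials generated at Special:BotPasswords.
('ExampleBot', BotPassword('upload', 'generated-bot-password'))
```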
After uploading the mappings to Wikimedia Commons, the following commands are run from the root folder of your installation:
- Run `python importer/make_glam_info.py -batch_settings:settings/settings.json -in_file:dimu_harvest_data.json -base_name:nm_output -update_mappings:True` to pull the harvest file and mappings and prepare the batch file. Example output.
- Run `python importer/uploader.py -type:URL -in_path:nm_output.json` to perform the actual batch upload. `-cutoff:X` limits the number of files uploaded to `X` (this will override the settings).
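For orientation, the harvesting step above boils down to paged requests against the DigitaltMuseum search API. The sketch below is assumption-heavy: the endpoint URL, the `api.key` parameter and the folder query field are guesses to be checked against the API documentation linked at the top of this page, not the script's actual code.

```python
import json
import urllib.parse
import urllib.request

# Assumed Solr-style search endpoint; verify against the DiMu API docs.
API_ENDPOINT = 'https://api.dimu.org/api/solr/select'

def fetch_folder_page(api_key, folder_id, start=0, rows=100):
    """Fetch one page of records for a DigitaltMuseum folder (sketch)."""
    query = urllib.parse.urlencode({
        'q': 'artifact.folderIds:{}'.format(folder_id),  # assumed field name
        'api.key': api_key,                              # assumed parameter name
        'wt': 'json',
        'start': start,
        'rows': rows,
    })
    with urllib.request.urlopen('{}?{}'.format(API_ENDPOINT, query)) as response:
        return json.load(response)
```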