Upgrading from v4.0 to v6.0
Prerequisites
The steps in this article require an existing local installation of
InvenioRDM v4.0. If unsure, run invenio-cli install from inside the
instance directory before executing the listed steps.
Note: Do not delete the old Python virtual environment, or the database migration may complain about missing packages.
Backup
Always back up your database and files before attempting an upgrade.
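As a rough illustration of the backup step, the sketch below archives a data directory into a timestamped tarball. The function name backup_dir and the paths in the usage note are hypothetical, and the commented pg_dump line assumes a PostgreSQL database; adapt both to your own setup.

```shell
# Archive a directory into a timestamped tarball and print the archive path.
# backup_dir is a hypothetical helper, not part of invenio-cli.
backup_dir() {
    local src="$1" dest="$2"
    local stamp archive
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest/$(basename "$src")-$stamp.tar.gz"
    mkdir -p "$dest"
    # -C keeps the paths inside the archive relative to the parent of $src.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
    printf '%s\n' "$archive"
}

# Database dump, assuming PostgreSQL and a database called "my-instance"
# (both assumptions; check your instance configuration):
#   pg_dump --format=custom --file=backups/db.dump my-instance
```

For example, backup_dir ./data ./backups would write an archive such as ./backups/data-<timestamp>.tar.gz, assuming ./data is where your instance stores its files.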
Upgrade steps
Warning
Make sure you have the latest invenio-cli. For InvenioRDM v6, the release is v1.0.0.
Upgrade your instance dependencies
Bump the RDM version and rebuild the assets:
invenio-cli packages update 6.0.0
invenio-cli assets build -d
Prepare vocabularies
Before migrating your data, we have to check that it is compatible with the v6.0 schemas. Note that many changes have been made to enable the use of vocabularies.
First, upgrade the database:
pipenv run invenio alembic upgrade
Then, fix the vocabularies from the previous version by running the migration script and choosing option 1. It is very important to follow the steps in order.
pipenv run invenio shell $(find $(pipenv --venv)/lib/*/site-packages/invenio_app_rdm -name migrate_4_0_to_6_0.py)
Choose a step:
[1] Migrate old vocabularies
>> 1
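Since the same migration script is invoked for every step in this guide, it can be convenient to resolve its path once. The following is a minimal sketch; find_migration_script is a hypothetical helper name, and it only wraps the find expression already used above.

```shell
# Print the path of the migration script inside the given virtual environment.
# find_migration_script is a hypothetical helper, not part of invenio-cli.
find_migration_script() {
    local venv="$1"
    # Same lookup as in the commands of this guide; head guards against
    # more than one match.
    find "$venv"/lib/*/site-packages/invenio_app_rdm \
        -name migrate_4_0_to_6_0.py | head -n 1
}

# Usage (assumes pipenv manages the instance's virtual environment):
#   script=$(find_migration_script "$(pipenv --venv)")
#   pipenv run invenio shell "$script"
```

Storing the path in a variable avoids repeating the nested command substitution for each of the four steps.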
Two other important new vocabularies are affiliations and subjects. Each of them requires a specific check:
Subjects
pipenv run invenio shell $(find $(pipenv --venv)/lib/*/site-packages/invenio_app_rdm -name migrate_4_0_to_6_0.py)
Choose a step:
[2] Check record's subjects
>> 2
This script checks whether your subjects are free text, which requires
no action on your side, or whether they should be a custom vocabulary. In the
latter case, a YAML file (custom_subjects.yaml) will be created in your
current folder.
Move this file to the app_data folder and create a vocabularies.yaml file.
Then reference the subjects file in it, for example:
subjects:
  pid-type: sub
  schemes:
    - id: Custom
      name: Custom subjects
      data-file: custom_subjects.yaml
The content of this file will be added to your instance's subjects vocabulary
when running the fixtures command.
Affiliations
pipenv run invenio shell $(find $(pipenv --venv)/lib/*/site-packages/invenio_app_rdm -name migrate_4_0_to_6_0.py)
Choose a step:
[3] Check creators/contributors affiliations
>> 3
This script checks whether your creators' and contributors' affiliations are free text (which requires no action on your side), contain a ROR identifier, or should be a custom vocabulary.
If the affiliations contain ROR identifiers, you will need to add that vocabulary. See more details here. Otherwise, create a custom vocabulary in the same way as for the subjects above, or fix your records (remove the identifiers so only the name is preserved).
Prepare ES
Once the vocabularies have been checked for compatibility and fixed accordingly, they need to be created:
pipenv run invenio rdm-records fixtures
invenio-cli run
Note that the second command will run your instance. This is needed because
the creation of records and vocabularies is done asynchronously via Celery
tasks. To check that they have been created, you can use the RabbitMQ web UI
(user: guest, password: guest) or the rabbitmqctl command line tool:
# Get the rabbitmq container id
docker ps -a
# Use said id to connect
~ docker exec -it <CONTAINER_ID> /bin/bash
root@e1cd455eae68$ rabbitmqctl list_queues
celery 0
If you see celery 0, all the required tasks have been run. We
recommend checking the terminal where the instance is running for errors.
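The manual check above can also be scripted. The sketch below parses the output of rabbitmqctl list_queues and polls until the celery queue is empty. The function names queue_depth and wait_for_celery, the 5-second interval, and the default of 60 attempts are illustrative assumptions, not part of InvenioRDM.

```shell
# Print the message count of the named queue, reading the output of
# `rabbitmqctl list_queues` (queue name, whitespace, count) from stdin.
# Exits non-zero if the queue is not listed.
queue_depth() {
    awk -v q="$1" '$1 == q { print $2; found = 1 } END { if (!found) exit 1 }'
}

# Poll the RabbitMQ container until the celery queue reports 0 messages.
# $1: container id from `docker ps -a`; $2: maximum attempts (default 60).
wait_for_celery() {
    local container_id="$1" attempts="${2:-60}" depth
    while [ "$attempts" -gt 0 ]; do
        depth=$(docker exec "$container_id" rabbitmqctl list_queues \
            | queue_depth celery) || depth=""
        [ "$depth" = "0" ] && return 0
        attempts=$((attempts - 1))
        sleep 5
    done
    return 1
}
```

Calling wait_for_celery <CONTAINER_ID> returns successfully once the queue is drained, or fails after the attempts run out. An empty queue only means the tasks were consumed, so the terminal output should still be checked for errors.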
The final preparation step before migrating your data is to remove the old ES indices:
pipenv run invenio index destroy --yes-i-know
Migrate the data
Once the fixtures are present in your system you can migrate your records/drafts:
pipenv run invenio shell $(find $(pipenv --venv)/lib/*/site-packages/invenio_app_rdm -name migrate_4_0_to_6_0.py)
[4] Migrate records
>> 4
The records will be migrated in the database, but now they need to be indexed in Elasticsearch:
pipenv run invenio index init
pipenv run invenio rdm-records rebuild-index