Names are a specific type of vocabulary that represents a record's creator or contributor. For example, the deposit form uses it to offer search-as-you-type suggestions and auto-fill the corresponding information.
You can find background on the design and usage of the vocabulary in RFC 0054.
A Name record contains:
- The name of the author, split into family_name and given_name, or a single name attribute with the full name formatted as <family name>, <given name>. Note that if family_name and given_name are present, they will overwrite name.
- A list of identifiers, each composed of an identifier value and a scheme. The scheme can potentially be autocompleted if it is known by the idutils library (e.g. ORCiD).
- A list of affiliations, each of which can be represented by its name or, if it belongs to the Affiliations vocabulary, by its id.
    - family_name: Carberry
      given_name: Josiah
      identifiers:
        - identifier: https://orcid.org/0000-0002-1825-0097
          scheme: orcid
      affiliations:
        - name: Wesleyan University
    - name: Haak, Laurel L
      identifiers:
        - identifier: https://orcid.org/0000-0001-5109-3700
      affiliations:
        - id: 04fa4r544
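The precedence rule for the name fields can be sketched in a few lines of Python. This is only an illustration and not InvenioRDM's implementation: the helper resolve_name is hypothetical, and it assumes records are plain dictionaries shaped like the entries in the example above.

```python
def resolve_name(record):
    """Return the full name for a name record (hypothetical helper).

    Assumes `record` is a plain dict. When both `family_name` and
    `given_name` are present, they take precedence over any `name` value.
    """
    family = record.get("family_name")
    given = record.get("given_name")
    if family and given:
        # Full name is formatted as "<family name>, <given name>".
        return f"{family}, {given}"
    return record.get("name")


records = [
    {"family_name": "Carberry", "given_name": "Josiah"},
    {"name": "Haak, Laurel L"},
    # Both forms present: the split fields win over `name`.
    {"name": "J. Carberry", "family_name": "Carberry", "given_name": "Josiah"},
]

for record in records:
    print(resolve_name(record))
```

The third record shows the overwrite case: the value stored in name is ignored because both split fields are available.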
How to import and update your name records
InvenioRDM ships with an example set of names and ORCID identifiers.
To disable these from being loaded, create a blank file called
The Names vocabulary uses the new DataStreams API for processing vocabularies. You can find more information about this new API in RFC 0053.
Use of the
Name records will not be managed by the usual invenio rdm-records fixtures command, but instead by a set of invenio vocabularies ... commands.
There are several ways to import name records. The most straightforward is to use a DataStream definition file, in which you define how entries are read from a data source, whether they need any transformation, and finally where they should be written.
For a simple import you can read entries from a YAML file with raw metadata objects, skip transformations, and use a service API to write and persist the entries to the database. Here is an example of this definition file, let's call it vocabularies-future.yaml:
    names:
      readers:
        - type: yaml
          args:
            origin: "./app_data/names.yaml"
      writers:
        - type: names-service
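The read, transform, write flow that such a definition file describes can be pictured with a small stand-alone Python sketch. It does not use the DataStreams API itself: every class and function name below (ListReader, StripWhitespace, MemoryWriter, run_stream) is hypothetical, and the writer only collects entries in memory instead of calling a service.

```python
class ListReader:
    """Hypothetical reader: yields raw entries from an in-memory list."""

    def __init__(self, entries):
        self._entries = entries

    def read(self):
        yield from self._entries


class StripWhitespace:
    """Hypothetical transformer: trims string values in each entry."""

    def apply(self, entry):
        return {
            key: value.strip() if isinstance(value, str) else value
            for key, value in entry.items()
        }


class MemoryWriter:
    """Hypothetical writer: collects entries instead of persisting them."""

    def __init__(self):
        self.written = []

    def write(self, entry):
        self.written.append(entry)


def run_stream(reader, transformers, writers):
    """Read each entry, apply every transformer in order, hand it to every writer."""
    count = 0
    for entry in reader.read():
        for transformer in transformers:
            entry = transformer.apply(entry)
        for writer in writers:
            writer.write(entry)
        count += 1
    return count


writer = MemoryWriter()
total = run_stream(
    ListReader([{"name": " Haak, Laurel L "}]),
    [StripWhitespace()],
    [writer],
)
print(total, writer.written)
```

A definition with an empty transformer list, as in the simple import described here, would correspond to passing [] as the second argument.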
Finally, to run an import using this vocabularies-future.yaml file, you can call the vocabularies import command:
    invenio vocabularies import \
      --vocabulary names \
      --filepath ./vocabularies-future.yaml
If you have updated the source data file, you can also update the vocabulary records using the vocabularies update command:
    invenio vocabularies update \
      --vocabulary names \
      --filepath ./vocabularies-future.yaml
The Names vocabulary has been implemented with the ORCiD public dataset as a possible source to import entries from. This means that the functionality to read entries from this format is already available. For example, you can use the vocabularies convert command to convert this dataset into a YAML file with the appropriate names metadata format:
    invenio vocabularies convert \
      --vocabulary names \
      --origin /path/to/ORCID_2021_10_summaries.tar.gz \
      --target names.yaml
Alternatively, you can simply import it directly:
Long and blocking operation
Note that the import process is done synchronously and the ORCiD dataset is very large. Therefore, this operation can take a long time.
    invenio vocabularies import \
      --vocabulary names \
      --origin /path/to/ORCID_2021_10_summaries.tar.gz
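Because the import runs synchronously, it may help to gauge how many entries an archive holds before starting. The following is a rough sketch using only Python's standard tarfile module; it is not part of InvenioRDM. The helper names are hypothetical, and it assumes the archive stores one .xml file per record, which should be checked against the actual dataset. A tiny in-memory archive stands in for the real file so the sketch can run anywhere.

```python
import io
import tarfile


def count_xml_members(fileobj):
    """Count regular files ending in .xml in a gzip-compressed tar stream."""
    count = 0
    # "r|gz" reads the archive sequentially, without seeking or loading it whole.
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(".xml"):
                count += 1
    return count


def build_sample_archive():
    """Build a tiny in-memory archive so the sketch runs without the real dataset."""
    buffer = io.BytesIO()
    names = (
        "a/0000-0002-1825-0097.xml",
        "a/0000-0001-5109-3700.xml",
        "README.txt",
    )
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in names:
            data = b"<record/>"  # placeholder content, not real ORCiD data
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


sample_count = count_xml_members(build_sample_archive())
print(sample_count)
```

For a real archive, the same function can be called on a file opened in binary mode, for example with open(path, "rb"). Counting still requires decompressing the whole archive, so it is itself slow on a large dataset.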