Migration Repo template

This repo is a template repository for the files needed to migrate Inventory data from a source system into FOLIO.

TL;DR: Create a new private repository based on this template. Clone it, then run create_folder_structure.sh.

Supported migration tasks
FOLIO Inventory data migration process
Mapping files
Example Records
Perform a test migration

Table of contents generated with markdown-toc

Supported migration tasks

Batch Poster (BatchPoster) - Post generated objects to FOLIO
Bibs Transformer (BibsTransformer) - Transform MARC21 Bib records to FOLIO Instances and SRS records
Holdings CSV Transformer (HoldingsCsvTransformer) - Creates FOLIO holdingsrecords from a TSV or CSV File
Holdings MARC transformer (HoldingsMarcTransformer) - Transforms MARC21 MFHD records into FOLIO Holdings and SRS records
Items Transformer (ItemsTransformer) - Creates FOLIO Items from a TSV or CSV File
User Transformer (UserTransformer) - Creates FOLIO Users from a TSV or CSV File

FOLIO Inventory data migration process

This repository template plays a vital part in a process together with other repos allowing you to perform data migrations from a legacy ILS into FOLIO.

The program you will need to run the process is FOLIO Migration Tools. This is a Python program, not yet on PyPI, so you will need to clone it.

The toolkit requires you to run the transformation and data loading steps in sequence, and each step relies on previous migration steps, such as the existence of a map file with legacy system IDs and their FOLIO equivalents. The picture below shows the proposed migration steps for legacy objects into FOLIO:

Mapping files

The repo contains the following mapping files in the Mapping files folder. There is a web tool that helps you create the mapping files for certain objects, available at https://data-mapping-file-creator.folio.ebsco.com/data_mapping_creation

What file is needed for what objects?

File\Process	Bibs->Instances	Holdings (from MARC/MFHD)	Holdings (from item tsv/csv)	Items	Open Loans	Users
marc-instance-mapping-rules.json	yes	no	no	no	no	no
mfhd_rules.json	no	yes	no	no	no	no
item_mapping.json	no	no	no	yes	no	no
holdings_mapping.json	no	no	yes	no	no	no
locations.tsv	no	yes	yes	yes	no	no
temp_locations.tsv	no	no	no	optional	no	no
material_types.tsv	no	no	no	yes	no	no
loan_types.tsv	no	no	no	yes	no	no
temp_loan_types.tsv	no	no	no	optional	no	no
call_number_type_mapping.tsv	no	no	optional	optional	no	no
statcodes.tsv	no	no	optional	optional	no	no
item_statuses.tsv	no	no	no	optional	no	no
post_loan_migration_statuses.tsv	no	no	no	no	optional	no
patron_types.tsv	no	no	no	no	no	yes
user_mapping.json	no	no	no	no	no	yes
department_mapping.tsv	no	no	no	no	no	yes

marc-instance-mapping-rules.json

These are the mapping rules from MARC21 bib records to FOLIO instances. The rules are stored in the tenant, but it is good practice to keep them under version control so you can maintain the customizations as the mapping rules evolve. For more information on syntax etc., read the documentation.

mfhd_rules.json

This file is built out according to the mapping rules for bibs. The conditions are different, and not well documented at this point. Look at the example file and refer to the mapping rules documentation.

holdings_mapping.json

Just as with item_mapping.json and the user mapping files, this file is easiest to create using the data-mapping-file-creator tool. You base the mapping on the same item export as you use for the items.

item_mapping.json

This is a mapping file for the items. The process assumes you have the item data in a CSV/TSV format. The structure of the file is dependent on the column names in the TSV file. For example, if you have a file that looks like this:

...	Z30_BARCODE	Z30_CALL_NO	Z30_DESCRIPTION	...
...	123456790	Some call number	some note	...

Your map should look like this:

...
{
    "folio_field": "barcode",
    "legacy_field": "Z30_BARCODE",
    "value":"",
    "description": ""
},
{
    "folio_field": "itemLevelCallNumber",
    "legacy_field": "Z30_CALL_NO",
    "value":"",
    "description": ""
}, 
{
    "folio_field": "notes[0].itemNoteTypeId",
    "legacy_field": "Z30_DESCRIPTION",
    "value": "c7bc292c-a318-43d3-9b03-7a40dfba046a",
    "description": ""
},
{
    "folio_field": "notes[0].staffOnly",
    "legacy_field": "Z30_DESCRIPTION",
    "value": false,
    "description": ""
},
{
    "folio_field": "notes[0].note",
    "legacy_field": "Z30_DESCRIPTION",
    "value": "",
    "description": ""
},
...

The resulting FOLIO Item would look like this:

{
    ...
    "barcode": "123456790",
    "itemLevelCallNumber": "Some call number",
    "notes": [{
        "staffOnly": false,
        "note": "some note",
        "itemNoteTypeId": "c7bc292c-a318-43d3-9b03-7a40dfba046a"
    }],
    ...
}
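To make the relationship between the map and the resulting Item concrete, here is a minimal Python sketch. It is not the code FOLIO Migration Tools runs. The helpers parse_path, set_path and apply_mapping are hypothetical, and the rule that a non-empty "value" overrides the legacy column is an assumption made for illustration.

```python
import re


def parse_path(path):
    """Split a path such as "notes[0].note" into ["notes", 0, "note"]."""
    parts = []
    for key, index in re.findall(r"([^.\[\]]+)|\[(\d+)\]", path):
        parts.append(key if key else int(index))
    return parts


def set_path(target, path, value):
    """Set value at path, creating nested dicts and lists on the way."""
    parts = parse_path(path)
    node = target
    for position, part in enumerate(parts):
        is_last = position == len(parts) - 1
        if is_last:
            new_value = value
        else:
            # The next part decides whether a list or a dict is needed here.
            new_value = [] if isinstance(parts[position + 1], int) else {}
        if isinstance(part, int):
            while len(node) <= part:
                node.append(None)
            if is_last or node[part] is None:
                node[part] = new_value
        elif is_last or part not in node:
            node[part] = new_value
        node = node[part]


def apply_mapping(mapping_entries, legacy_row):
    """Assumed rule: a non-empty "value" wins, otherwise the legacy column is used."""
    item = {}
    for entry in mapping_entries:
        fixed = entry["value"]
        value = fixed if fixed != "" else legacy_row.get(entry["legacy_field"], "")
        set_path(item, entry["folio_field"], value)
    return item


mapping = [
    {"folio_field": "barcode", "legacy_field": "Z30_BARCODE", "value": ""},
    {"folio_field": "itemLevelCallNumber", "legacy_field": "Z30_CALL_NO", "value": ""},
    {"folio_field": "notes[0].itemNoteTypeId", "legacy_field": "Z30_DESCRIPTION",
     "value": "c7bc292c-a318-43d3-9b03-7a40dfba046a"},
    {"folio_field": "notes[0].staffOnly", "legacy_field": "Z30_DESCRIPTION", "value": False},
    {"folio_field": "notes[0].note", "legacy_field": "Z30_DESCRIPTION", "value": ""},
]
row = {
    "Z30_BARCODE": "123456790",
    "Z30_CALL_NO": "Some call number",
    "Z30_DESCRIPTION": "some note",
}
item = apply_mapping(mapping, row)
print(item)
```

The sample row and map mirror the example above, so the printed dictionary should have the same shape as the FOLIO Item shown.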

Fallback values in reference data mapping

The reference data mapping files (locations.tsv, material_types.tsv, loan_types.tsv, etc.) have a mechanism that allows you to add * to the legacy fields in a row and put the fallback value from FOLIO in the folio_code/folio_name column. If the mapping fails, the script will assign this value to the records created. It is good practice to use a migration-specific value as the fallback so that you can locate the affected records in FOLIO.

locations.tsv

This file maps legacy locations to FOLIO locations. These mappings allow for some complexity. The file must be structured like this:

folio_code	legacy_code	Z30_COLLECTION
AFA	AFAS	AFAS
AFA	*	*

The legacy_code column is needed for both Holdings migrations. For Item migration, the source fields can be used (Z30_COLLECTION in this case). You can add as many source fields as you like for the Items.

material_types.tsv

These mappings allow for some complexity. The first column name is fixed, since that is the target material type in FOLIO. Then you add the column names from the Item export TSV. For each column added, the values in them must match. At least one value per column must match. See loan_types.tsv for complex examples.

folio_name	Z30_MATERIAL
Audiocassette	ACASS
Audiocassette	*

loan_types.tsv

These mappings allow for some complexity. The first column name is fixed, since that is the target loan type in FOLIO. Then you add the column names from the Item export TSV. For each column added, the values in them must match. At least one value per column must match.

folio_name	Z30_SUB_LIBRARY	Z30_ITEM_STATUS
Non-circulating	UMDUB	02
Non-circulating	*	*
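As a rough illustration of how a multi-column map with a * fallback row could be evaluated, here is a small Python sketch. The matching rule (an exact match on every legacy column, otherwise the row where every legacy column is *) is an assumption, not necessarily what the migration scripts implement. The function names and the "Migrated loan type" fallback name are made up for the example.

```python
import csv
import io

# First row is taken from the example above. The fallback name in the
# second row is a hypothetical migration-specific value.
LOAN_TYPES_TSV = (
    "folio_name\tZ30_SUB_LIBRARY\tZ30_ITEM_STATUS\n"
    "Non-circulating\tUMDUB\t02\n"
    "Migrated loan type\t*\t*\n"
)


def load_rows(tsv_text):
    """Parse TSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def map_value(rows, target_column, legacy_record):
    """Return the FOLIO value for a legacy record, or the * fallback."""
    legacy_columns = [name for name in rows[0] if name != target_column]
    fallback = None
    for row in rows:
        if all(row[name] == "*" for name in legacy_columns):
            fallback = row[target_column]
        elif all(row[name] == legacy_record.get(name) for name in legacy_columns):
            return row[target_column]
    if fallback is None:
        raise ValueError("No matching row and no * fallback row in the map")
    return fallback


rows = load_rows(LOAN_TYPES_TSV)
matched = map_value(rows, "folio_name", {"Z30_SUB_LIBRARY": "UMDUB", "Z30_ITEM_STATUS": "02"})
unmatched = map_value(rows, "folio_name", {"Z30_SUB_LIBRARY": "UMDUB", "Z30_ITEM_STATUS": "05"})
print(matched, "|", unmatched)
```

Using a distinct fallback name makes it easy to search FOLIO afterwards for the records that did not match any explicit row.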

call_number_type_mapping.tsv

These mappings allow for some complexity, even though it is not needed.

folio_name	Z30_CALL_NO_TYPE
Dewey Decimal classification	8
Unmapped	*

statcodes.tsv

To map a statistical code to its FOLIO UUID, you need this map and the field mapped in item_mapping.json. These mappings allow for some complexity, even though it is not needed. This mapping does not allow for default values. Any record without the field will not get one assigned.

folio_code	Z30_STAT_CODE
married_with_children	8
happily_ever_after	9

item_statuses.tsv

The handling of Item statuses is a bit of a project of its own, since not all statuses in legacy systems will have their equivalents in FOLIO. This mapping allows you to point one legacy status to a FOLIO status. If no status map is supplied, the status will be set to Available.

legacy_code	folio_name
checked_out	Checked out
available	Available
lost	Aged to lost

post_loan_migration_statuses.tsv

This is not yet a mapping file per se, but it is used to substitute the values in the next_item_status column in the legacy open loans file. Leave the statuses you do not want the loans migration process to migrate empty and replace the legacy statuses you want to apply with the correct FOLIO ones.

Example Records

In the example records folder, you will find example source records and example results from after a transformation.

Result files

The following table outlines the result records and their use and role

File	Content	Use for
folio_holdings.json	FOLIO Holdings records in json format. One per row in the file	To be loaded into FOLIO using the batch APIs
folio_instances.json	FOLIO Instance records in json format. One per row in the file	To be loaded into FOLIO using the batch APIs
folio_items.json	FOLIO Item records in json format. One per row in the file	To be loaded into FOLIO using the batch APIs
holdings_id_map.json	A json map from legacy Holdings Id to the ID of the created FOLIO Holdings record	To be used in subsequent transformation steps
holdings_transformation_report.md	A file containing various breakdowns of the transformation. Also contains errors to be fixed by the library	Create list of cleaning tasks, mapping refinement
instance_id_map.json	A json map from legacy Bib Id to the ID of the created FOLIO Instance record. Relies on the "ILS Flavour" parameter in the main_bibs.py scripts	To be used in subsequent transformation steps
instance_transformation_report.md	A file containing various breakdowns of the transformation. Also contains errors to be fixed by the library	Create list of cleaning tasks, mapping refinement
item_id_map.json	A json map from legacy Item Id to the ID of the created FOLIO Item record	To be used in subsequent transformation steps
item_transform_errors.tsv	A TSV file with errors and data issues together with the row number or id for the Item	To be used in fixing of data issues
items_transformation_report.md	A file containing various breakdowns of the transformation. Also contains errors to be fixed by the library	Create list of cleaning tasks, mapping refinement
marc_xml_dump.xml	A MARCXML dump of the bib records, with the proper 001:s and 999 fields added	For pre-loading a Discovery system.
srs.json	FOLIO SRS records in json format. One per row in the file	To be loaded into FOLIO using the batch APIs

HRID handling

Current implementation:

Download the HRID handling settings from the tenant. If there is HRID handling in the mapping rules:

The HRID is set on the Instance
The 001 in the MARC21 record (bound for SRS) is replaced with this HRID.

If the mapping-rules specify no HRID handling or the field designated for HRID contains no value:

The HRID is constructed from the HRID settings
The number from the HRID settings is padded to a length of 11
A new 035 field is created and populated with the value from 001
The 001 in the MARC21 record (bound for SRS) is replaced with this HRID.
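The fallback branch above can be sketched in Python as follows. This is an illustration only. The record is a simplified stand-in (a dict from MARC tag to a list of plain values, not a real MARC object), and the names build_hrid and assign_hrid, as well as the "in" prefix, are assumptions for the example.

```python
def build_hrid(prefix, number, width=11):
    """Combine the prefix with the number, zero-padded to the given width."""
    return f"{prefix}{str(number).zfill(width)}"


def assign_hrid(record, hrid):
    """Copy the old 001 into a new 035, then replace the 001 with the HRID.

    record is a simplified dict of tag -> list of values. A real MARC 035
    would carry the value in a subfield, which this sketch leaves out.
    """
    old_001 = record.get("001", [None])[0]
    if old_001:
        record.setdefault("035", []).append(old_001)
    record["001"] = [hrid]
    return record


hrid = build_hrid("in", 42)  # "in" is a hypothetical prefix from the HRID settings
record = assign_hrid({"001": ["legacy-123"], "245": ["A title"]}, hrid)
print(record)
```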

Relevant FOLIO community documentation

Perform a test migration

The mapping files and example data in this repo will enable you to perform a migration against the latest FOLIO bugfest environment. Everything is configured except for the missing FOLIO user password. This step-by-step guide will take you through the steps involved. If there are no more steps, we are still working on these example records.

Before you begin

Move everything under the example_data folder into the data folder.
Set up pipenv using either the Pipfile or the requirements.txt

Transform bibs

Configuration

This configuration piece in the configuration file determines the behaviour

{
    "name": "transform_bibs",
    "migrationTaskType": "BibsTransformer",
    "useTenantMappingRules": true,
    "ilsFlavour": "tag001",
    "tags_to_delete": [
        "841",
        "852"
    ],
    "files": [
        {
            "file_name": "bibs.mrc",
            "suppressed": false
        }
    ]
}

Explanation of parameters

Parameter	Possible values	Explanation
name	Any string	The name of this task. Created files will have this as part of their names.
migrationTaskType	Any of the available migration tasks	The type of migration task you want to run
useTenantMappingRules	true	Placeholder for option to use an external rules file
ilsFlavour	any of "aleph", "voyager", "sierra", "millennium", "koha", "tag907y", "tag001", "tagf990a"	Used to point scripts to the correct legacy identifier and other ILS-specific things
tags_to_delete	any string	Tags with these names will be deleted (after transformation) and not get stored in SRS
files	Objects with filename and boolean	Filename of the MARC21 file in the data/instances folder. Suppressed tells the script to mark records as suppressedFromDiscovery

Syntax to run

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json transform_bibs --base_folder PATH_TO_migration_repo_template/

Post transformed Instances and SRS records

Configuration

These configuration pieces in the configuration file determine the behaviour

{
    "name": "post_bibs",
    "migrationTaskType": "BatchPoster",
    "objectType": "Instances",
    "batchSize": 250,
    "file": {
        "file_name": "folio_instances_test_run_transform_bibs.json"
    }
},
{
    "name": "post_srs_bibs",
    "migrationTaskType": "BatchPoster",
    "objectType": "SRS",
    "batchSize": 250,
    "file": {
        "file_name": "folio_srs_instances_test_run_transform_bibs.json"
    }
}

Explanation of parameters

Parameter	Possible values	Explanation
name	Any string	The name of this task. Created files will have this as part of their names.
migrationTaskType	Any of the available migration tasks	The type of migration task you want to run
objectType	Any of "Extradata", "Items", "Holdings", "Instances", "SRS", "Users"	Type of object to post
batchSize	integer	The number of records per batch to post. If the API does not allow batch posting, this number will be ignored
file.file_name	Any string	Name of file to post, located in the results folder
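To show what batchSize means in practice, here is a small Python sketch that reads a results file with one JSON object per row and splits the records into batches. It does not call FOLIO. The helper names and the generated sample records are made up for the example.

```python
import json


def read_json_lines(lines):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def make_batches(records, batch_size):
    """Split records into consecutive lists of at most batch_size items."""
    return [
        records[start:start + batch_size]
        for start in range(0, len(records), batch_size)
    ]


# Stand-in for the lines of a results file such as the one named in file.file_name.
sample_lines = [json.dumps({"id": str(number)}) for number in range(600)]
records = read_json_lines(sample_lines)
sizes = [len(batch) for batch in make_batches(records, 250)]
print(sizes)
```

With 600 records and a batchSize of 250, the last batch is smaller than the others.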

Syntax to run

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json post_bibs --base_folder PATH_TO_migration_repo_template/

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json post_srs_bibs --base_folder PATH_TO_migration_repo_template/
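All the commands in this guide follow the same pattern: configuration file, task name, then --base_folder. If you prefer to script the sequence, a minimal Python sketch could look like this. The build_command helper is hypothetical, and the PATH_TO_ placeholders are kept as they appear above and must be replaced with real paths.

```python
import shlex


def build_command(config_path, task_name, base_folder):
    """Return the argument list for running one migration task."""
    return [
        "pipenv", "run", "python", "main.py",
        config_path, task_name,
        "--base_folder", base_folder,
    ]


CONFIG = "PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json"
BASE_FOLDER = "PATH_TO_migration_repo_template/"
# Task names as used so far in this guide, in the order they must run.
TASKS = ["transform_bibs", "post_bibs", "post_srs_bibs"]

commands = [build_command(CONFIG, task, BASE_FOLDER) for task in TASKS]
for command in commands:
    print(shlex.join(command))
```

To execute the tasks instead of printing them, each argument list could be passed to subprocess.run(command, check=True), so that the sequence stops at the first failing step.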

Transform MFHD records to holdings and SRS holdings

Configuration

This configuration piece in the configuration file determines the behaviour

{
    "name": "transform_mfhd",
    "migrationTaskType": "HoldingsMarcTransformer",
    "legacyIdMarcPath": "001",
    "mfhdMappingFileName": "mfhd_rules.json",
    "locationMapFileName": "locations.tsv",
    "defaultCallNumberTypeName": "Library of Congress classification",
    "fallbackHoldingsTypeId": "03c9c400-b9e3-4a07-ac0e-05ab470233ed",
    "useTenantMappingRules": false,
    "hridHandling": "default",
    "createSourceRecords": true,
    "files": [
        {
            "file_name": "holding.mrc",
            "suppressed": false
        }
    ]
}

Explanation of parameters

Parameter	Possible values	Explanation
name	Any string	The name of this task. Created files will have this as part of their names.
migrationTaskType	Any of the available migration tasks	The type of migration task you want to run
legacyIdMarcPath	A MARC field followed by an optional subfield delimited by a $	Used to locate the legacy identifier for this record. Examples: "001", "951$c"
mfhdMappingFileName	Any string	location of the MFHD rules in the mapping_files folder
locationMapFileName	Any string	Location of the Location mapping file in the mapping_files folder
defaultCallNumberTypeName	Any call number name from FOLIO	Used for fallback mapping for callnumbers
fallbackHoldingsTypeId	A uuid	Fallback holdings type if mapping does not work
useTenantMappingRules	false	boolean (true/false) NOT YET IMPLEMENTED.
hridHandling	"default" or "preserve001"	If default, HRIDs will be generated according to the FOLIO settings. If preserve001, the 001s will be used as HRIDs if possible, falling back to the default settings
createSourceRecords	boolean (true/false)
files	Objects with filename and boolean	Filename of the MARC21 file in the data/instances folder. Suppressed tells the script to mark records as suppressedFromDiscovery
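The legacyIdMarcPath format (a field, optionally followed by $ and a subfield) can be read with a one-line split. The sketch below is only an illustration of the format. The function name parse_marc_path is made up and is not part of the tools.

```python
def parse_marc_path(path):
    """Split "951$c" into ("951", "c"), and "001" into ("001", None)."""
    tag, _, subfield = path.partition("$")
    return tag, (subfield or None)


control_field = parse_marc_path("001")
data_field = parse_marc_path("951$c")
print(control_field, data_field)
```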

Syntax to run

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json transform_mfhd --base_folder PATH_TO_migration_repo_template/

Post transformed MFHDs and Holdingsrecords to FOLIO

Configuration

These configuration pieces in the configuration file determine the behaviour

{
    "name": "post_holdingsrecords_from_mfhd",
    "migrationTaskType": "BatchPoster",
    "objectType": "Holdings",
    "batchSize": 250,
    "file": {
        "file_name": "folio_holdings_test_run_transform_mfhd.json"
    }
},
{
    "name": "post_srs_mfhds",
    "migrationTaskType": "BatchPoster",
    "objectType": "SRS",
    "batchSize": 250,
    "file": {
        "file_name": "folio_srs_holdings_test_run_transform_mfhd.json"
    }
}

Explanation of parameters

Parameter	Possible values	Explanation
name	Any string	The name of this task. Created files will have this as part of their names.
migrationTaskType	Any of the available migration tasks	The type of migration task you want to run
objectType	Any of "Extradata", "Items", "Holdings", "Instances", "SRS", "Users"	Type of object to post
batchSize	integer	The number of records per batch to post. If the API does not allow batch posting, this number will be ignored
file.file_name	Any string	Name of file to post, located in the results folder

Syntax to run

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json post_holdingsrecords_from_mfhd --base_folder PATH_TO_migration_repo_template/

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json post_srs_mfhds --base_folder PATH_TO_migration_repo_template/

Transform CSV/TSV files into Holdingsrecords

Configuration

This configuration piece in the configuration file determines the behaviour

{
    "name": "transform_csv_holdings",
    "migrationTaskType": "HoldingsCsvTransformer",
    "holdingsMapFileName": "holdingsrecord_mapping.json",
    "locationMapFileName": "locations.tsv",
    "defaultCallNumberTypeName": "Library of Congress classification",
    "callNumberTypeMapFileName": "call_number_type_mapping.tsv",
    "previouslyGeneratedHoldingsFiles": [
        "folio_holdings_test_run_transform_mfhd"
    ],
    "holdingsMergeCriteria": [
        "instanceId",
        "permanentLocationId",
        "callNumber"
    ],
    "fallbackHoldingsTypeId": "03c9c400-b9e3-4a07-ac0e-05ab470233ed",
    "files": [
        {
            "file_name": "csv_items.tsv"
        }
    ]
}

Explanation of parameters

Parameter	Possible values	Explanation
name	Any string	The name of this task. Created files will have this as part of their names.
migrationTaskType	Any of the available migration tasks	The type of migration task you want to run
holdingsMapFileName	Any string	location of the mapping file in the mapping_files folder
locationMapFileName	Any string	Location of the Location mapping file in the mapping_files folder
defaultCallNumberTypeName	any string	Name of callnumber in FOLIO used as a fallback
callNumberTypeMapFileName	Any string	location of the mapping file in the mapping_files folder
previouslyGeneratedHoldingsFiles
holdingsMergeCriteria	A list of strings with the names of holdingsrecord properties (on the same level)	Used to group individual rows into Holdings records. Proposed setting is ["instanceId", "permanentLocationId", "callNumber"]
fallbackHoldingsTypeId	uuid string	The fallback/default holdingstype UUID
createSourceRecords	boolean (true/false)
files	Objects with filename and boolean	Filename of the legacy CSV/TSV file. Suppressed tells the script to mark records as suppressedFromDiscovery
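The effect of holdingsMergeCriteria can be pictured as grouping rows by a composite key. The Python sketch below shows the idea under simplifying assumptions: each candidate is a plain dict that already carries the FOLIO property names, and the legacyItemId key and all sample values are hypothetical.

```python
from collections import defaultdict

MERGE_CRITERIA = ["instanceId", "permanentLocationId", "callNumber"]


def group_rows(candidates, merge_criteria):
    """Group candidate rows that share the same values for all criteria."""
    groups = defaultdict(list)
    for candidate in candidates:
        key = tuple(candidate.get(name) for name in merge_criteria)
        groups[key].append(candidate["legacyItemId"])
    return dict(groups)


candidates = [
    {"legacyItemId": "i1", "instanceId": "A", "permanentLocationId": "L1", "callNumber": "QA 76"},
    {"legacyItemId": "i2", "instanceId": "A", "permanentLocationId": "L1", "callNumber": "QA 76"},
    {"legacyItemId": "i3", "instanceId": "A", "permanentLocationId": "L2", "callNumber": "QA 76"},
]
groups = group_rows(candidates, MERGE_CRITERIA)
print(groups)
```

Rows i1 and i2 agree on all three criteria and end up in one group, which corresponds to one Holdings record. Row i3 has a different location and forms its own group.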

Syntax to run

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json transform_csv_holdings --base_folder PATH_TO_migration_repo_template/

Post transformed Holdingsrecords to FOLIO

See documentation for posting above

Transform CSV/TSV files into Items

Configuration

This configuration piece in the configuration file determines the behaviour

{
    "name": "transform_csv_items",
    "migrationTaskType": "ItemsTransformer",    
    "itemsMappingFileName": "item_mapping_for_csv_items.json",
    "locationMapFileName": "locations.tsv",
    "callNumberTypeMapFileName": "call_number_type_mapping.tsv",
    "materialTypesMapFileName": "material_types_csv.tsv",
    "loanTypesMapFileName": "loan_types_csv.tsv",
    "itemStatusesMapFileName": "item_statuses.tsv",
    "files": [
        {
            "file_name": "csv_items.tsv"
        }
    ]
}

Explanation of parameters

Parameter	Possible values	Explanation
name	Any string	The name of this task. Created files will have this as part of their names.
migrationTaskType	Any of the available migration tasks	The type of migration task you want to run
itemsMappingFileName	Any string	location of the mapping file in the mapping_files folder
locationMapFileName	Any string	Location of the Location mapping file in the mapping_files folder
callNumberTypeMapFileName	Any string	location of the mapping file in the mapping_files folder
materialTypesMapFileName	Any string	location of the mapping file in the mapping_files folder
loanTypesMapFileName	Any string	location of the mapping file in the mapping_files folder
itemStatusesMapFileName	Any string	location of the mapping file in the mapping_files folder
files	Objects with filename and boolean	Filename of the legacy CSV/TSV file. Suppressed tells the script to mark records as suppressedFromDiscovery

Syntax to run

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json transform_csv_items --base_folder PATH_TO_migration_repo_template/

Post transformed Items to FOLIO

See documentation for posting above

Transform CSV/TSV files into FOLIO users

Configuration

This configuration piece in the configuration file determines the behaviour

{
    "name": "user_transform",
    "migrationTaskType": "UserTransformer",
    "groupMapPath": "user_groups.tsv",
    "userMappingFileName": "user_mapping.json",
    "useGroupMap": true,
    "userFile": {
        "file_name": "staff.tsv"
    }
}

Explanation of parameters

Parameter	Possible values	Explanation
name	Any string	The name of this task. Created files will have this as part of their names.
migrationTaskType	Any of the available migration tasks	The type of migration task you want to run
userMappingFileName	Any string	location of the mapping file in the mapping_files folder
groupMapPath	Any string	Location of the user group mapping file in the mapping_files folder
useGroupMap	boolean	Use the above group map file or use code-to-code direct mapping
userFile.file_name	Any string	name of csv/tsv file of legacy users in the data/users folder

Syntax to run

pipenv run python main.py PATH_TO_migration_repo_template/mapping_files/exampleConfiguration.json user_transform --base_folder PATH_TO_migration_repo_template/

Post transformed users to FOLIO

See documentation for posting above
