Migrating data from BMS to EBS

In 2022, we were engaged in a project with the Excellence in Breeding (EiB) platform to accompany AfricaRice staff as they transitioned from the BMS to the EBS, through our data curation expertise and experience in supporting Francophone users.

1 Mapping BMS germplasm model to EBS model
2 Steps for migrating AfricaRice BMS germplasm records to EBS Core Breeding (CB) Germplasm
3 Mapping BMS traits to EBS traits
4 Scripts for extracting data
- 4.1 Extracting all variables
- 4.2 Extracting all variables used in all BMS studies
  - 4.2.1 Extracting all variables used for selected studies (filter on project_ids from years 2020-2022 with trait phenotyping datapoints)
5 Re-creating BMS studies in the EBS

Mapping BMS germplasm model to EBS model

AfricaRice BMS germplasm records with GIDs 8000000 or higher have been curated as EBS Seed Names, and needed to be migrated via the database backend due to the absence of a germplasm import user interface in the EBS at the time.

BMS - Germplasm Schema

EBS - Core Breeding Schema

Steps for migrating AfricaRice BMS germplasm records to EBS Core Breeding (CB) Germplasm

BMS “lot” = EBS “package”.

In the EBS, “packages” are program-specific, whereas “lots” are crop-specific (ie. shared across programs) in the BMS. There is a need to create at least one package for each EBS program where the germplasm is used in an experiment.

A DBMS tool/software like MS Access can link the Germplasm and Names table from the BMS via ODBC driver.

Install PostgreSQL to a local computer or a testing server to serve as sandbox or staging server in migrating data to EBS.

Request for the latest Core Breeding schema from the IRRI team with example data for reference and without data for data migration.

Preparation for mapping and transferring BMS germplasm to EBS Germplasm.germplasm table.

Create a Make Table query to extract the non-deleted germplasm, names, and atributs records from the BMS production database starting from GID 8,000,000.
(Starting GID range for AfricaRice BMS = 8,000,000)
Combine the germplasm and names table using the GID as a link to create a working table (ARGermplasm) having the column fields NVAL, NTYPE and NSTAT=1 merged into the germplasm.
Add the following field column in the ARGermplasm table and fill in the necessary information:
1. parentage - can be retrieved from the listdata.grpname (if available) or can be constructed by combining the preferred names of gpid1 and gpid2 for generative germplasm and parentage of gpid1 for derivative germplasm.
2. generation - germplasm filial generation (F1, F2..F5)
3. germplasm_state - fixed or not_fixed
4. germplasm_nytpe, indicate whether its cross_name, derivative_name, line_name or unnamed_cross
5. germplasm_type - fixed_line or progeny
  
  The EBS CB germplasm table serves as the main germplasm record catalog and only allows unique germplasm names or designations. Since the BMS germplasm recording allows multiple designations, the working table (ARGermplasm) needs to be filtered down to get the unique germplasm/designation.
Create a Make table query (ARGermplasm2), including all fields in the ARGemplasm table and copying only the distinct NVAL value.
Link the CB germplasm table to the MS Access file via ODBC and make a local copy of the table (EBSgermplasm_local).
Add a field column GID to the EBSgermplasm_local to store the BMS GID.
Create an Append Query mapping the columns from ARGermplasm2 to the corresponding columns in the EBSgermplasm_local table, including the GID.
After transferring the germplasm records to EBSgermplasm_local, fill in the EBSgermplasm_local.id column with sequential number (ask the IRRI team for the next available germplasm.id that can be used)
Update the ARGermplasm table by adding a field column for EBS germplasm.id and update the column with values from the EBSgemplasm_local germplasm.id link by the column designation to nval.
Append all records to the Postgres EBS germplasm.germplasm table.
Export EBS germplasm.germplasm to a CSV file for submission to the IRRI team.

Preparation for mapping and transferring BMS Names to EBS Germplasm.germplasm_name table

Link the CB germplasm.germplasm_name table to the MS Access file via ODBC and make a local copy of the table (EBSgermplasm_name_local).
Update the local BMS names table by adding a field column for EBS_germplasm_id and update the column by creating an Update query and linking the GIDs from ARGermplasm and Names table.
Remove all records from the Names table tagged as deleted (NSTAT=9).
Remove duplicate Names record with the same EBS_germplasm_id and nval values.
Map and transfer BMS Names records by mapping the column fields below;
(NAMES) EBS_germplasm_id → (EBSgermplasm_name_local) germplasm_id
nval → name_value
ntype (value of ntype in referencing to the BMS UDFLDS table) → germplasm_name_type
nstat → germplasm_name_status =”active” (if nstat=1), and “standard”.
nval → germplasm_normalized name
EBS userID (assigned by IRRI) → creator_id
”f” → is_void
After transferring the Names records to EBSgermplasm_name_local, fill in the EBSgermplasm_name_local.id column with a sequential number (ask the IRRI team for the next available germplasm_name.id that can be used).
Append all records to the Postgres EBS germplasm.germplasm_name table.
Export EBS germplasm.germplasm_name to a CSV file for submission to the IRRI team.

EBS CB Program ID for BMS Programs - CB tenant.program table
For the AfricaRice instance in EBS, the IRRI has already assigned program_id for almost all of the programs/projects in the BMS.
The program_id (id) will be needed to assign seed record entries for each breeding team to be able to germplasm list from EBS.

BMS program name	id	program_code	program_name	program_type	description
Hybrid	227	AR-HRB	AR - Hybrid Rice Breeding	breeding	Hybrid
MANGROVE	226	AR-RLB	AR - Rainfed Lowland	breeding	Rainfed Lowland
Rainfed Lowland	226	AR-RLB	AR - Rainfed Lowland	breeding	Rainfed Lowland
High Elevation	224	AR-HEB	High Elevation Breeding	breeding	High Elevation
Double Haploid	223	AR-DHB	AR -Double Haploid Breeding	breeding	Double Haploid
Rainfed Upland	228	AR-RUB	AR - Rainfed Upland Breeding	breeding	Rainfed Upland
Grain Quality	230	AR-GQ	AR - Grain Quality	breeding	Grain Quality
Breeding Task Force	187	AR	AFRICA Rice	breeding	Africa Rice Center
Plant Health	229	AR - PH	AR - Plant Health	breeding	Plant Health
Irrigated Lowland	225	AR-ILB	AR - Irrigated Lowland Breeding	breeding	Irrigated Lowland
RICE-FP4	232	AR-PHENO	AR - Phenotyping	breeding	AR - Phenotyping
Genotypage AR	222	AR-MB	AR - Molecular Breeding	breeding	MolecularBreeding

Creating records for EBS Germplasm.seed table - Record of the genetic entity of germplasm.

From the BMS Rice crop Listdata and Listnms table, extract all non-deleted records to an MS Access working database.
Add three columns “projectid”, “project_abbrev”, and “project_name” in the local Listnms table.
Fill the project_name column with the project_name values from the workbench.project table linked by the project_uuid.
Fill the listnms.projectid with the values from the CB tenant.program table.
Add a column “program_id” in the Listdata table and fill the column with value from the listnms.projectid table linked by “listid”.
Add a numeric field column in the Listdata table for “EBS_germplasm _id” and update with the values from ARgermplasm linked by GID.
Create Make table query (Listdata_seed) to get the distinct records from listdata table with the unique combination of gid, and program_id entries.
Add a column in the Listdata_seed for “seed_id” and fill it with a sequential number starting with the next available germplasm.seed.id from EBS (ask IRRI team).
Add a column in the Listdata_seed for “seed_code” and fill with the project_abbrev+GID (example AR-ILB9000026).
Link the CB germplasm.seed table to the MS Access file via ODBC and make a local copy of the table (EBSgermplasm_seed_local).
Map and transfer Listdata_seed records to the EBS_germplasm_seed_local by mapping the column fields below;
(Listdata_seed) seed_id → (EBSgermplasm_seed_local) id
seed_code → seed_code
GID → seed_name
EBS_germplasm_id → germplasm_id
program_id → program_id
EBS userID → creator_id
Append all records to the Postgres EBS germplasm.seed table.
Export EBS germplasm.seed to a CSV file for submission to the IRRI team.

Creating records for EBS Germplasm.package table - Record of physical container of seeds and its amount. Since AfricaRice has no inventory data in the BMS, we can set the package quantity and unit to 0 grams.

Link the CB germplasm.package table to the MS Access file via ODBC and make a local copy of the table (EBSgermplasm_package_local).
Map and transfer EBSgermplasm_seed_local records to the EBSgermplasm_package_local by mapping the column fields below;
(EBSgermplasm_seed_local) seed_code → (EBSgermplasm_package_local) package_code
package_label
0 → package_quantity
”g” → package_unit
”active” → package_status
id → seed_id
program_id → program_id
creator_id → creator_id
Fill the EBSgermplasm_package_local.id with a sequential number starting with the next available package.id from EBS (ask IRRI team)
Append all records to the Postgres EBS germplasm.package table
Export EBS germplasm.package to a CSV file for submission to the IRRI team

Mapping BMS traits to EBS traits

Request for the latest export table of the CB master variable, property, method, scale, and scale_value from the IRRI team.
Make an Excel worksheet and combine the variable, property, method, scale, and scale_value (file attached EBS variable.xls - worksheet - var_prop_meth_scale).
Run the AllValidStdVariables script in the AfricaRice BMS to retrieve all trait variables used in all studies.
Import the result from the AllValidStdVariable to an Excel file (file attached allvalidstdvariable.xls).
Inset a column on the allvalidstdvariable sheet for the EBS variable name (EBS_VARNAME).
Check the trait variables definition from allvalidstdvariable against the EBS variable’s var_prop_meth_scale sheet based on property, method, and scale_value. Any matching combination from the two tables, place the the EBS variable name (name) to the allvalidstdvariable’s EBS_VARNAME column. This will make it easier for IBP or AfricaRice Data Manager migrating study to EBS to adjust or change trait variables (if needed).
Any non-matching BMS rice trait variable definition to EBS can be requested to be added to EBS.
(attached file -TraitVar_for_EBS_upload.xls)

Scripts for extracting data

Extracting all variables

Retrieve all variables of all variable types. Consult view AllValidStdVariables and export contents to CSV

DROP VIEW IF EXISTS AllVariables;  
CREATE VIEW AllVariables AS
	SELECT DISTINCT c1.*,cp.`value` AS variable_type 
	FROM cvterm c1, cvtermprop cp  
	WHERE c1.cv_id=1040 AND c1.cvterm_id=cp.cvterm_id AND cp.type_id=1800
	ORDER BY variable_type;	
	
DROP VIEW IF EXISTS AllScalesWithCatValues;
    CREATE VIEW AllScalesWithCatValues AS
    SELECT cvterm.cv_id, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm.cvterm_id, cvterm.name, cvterm.definition FROM cvterm INNER JOIN cvterm_relationship ON cvterm.cvterm_id = cvterm_relationship.subject_id
WHERE (((cvterm.cv_id)=1030) AND ((cvterm_relationship.type_id)=1105) AND ((cvterm_relationship.object_id)=1130));

DROP VIEW IF EXISTS 12_CheckScalesWithCatValues;
    CREATE VIEW 12_CheckScalesWithCatValues AS
    SELECT AllScalesWithCatValues.cvterm_id, AllScalesWithCatValues.name AS ScaleName, AllScalesWithCatValues.definition AS ScaleDefn, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm_1.cv_id, cvterm_1.name AS CategoryName, cvterm_1.definition AS CategoryDefn FROM AllScalesWithCatValues INNER JOIN (cvterm_relationship INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) ON AllScalesWithCatValues.cvterm_id = cvterm_relationship.subject_id
ORDER BY AllScalesWithCatValues.cvterm_id, cvterm_relationship.type_id, cvterm_1.name;

 DROP VIEW IF EXISTS AllValidStdVariables;
    CREATE VIEW AllValidStdVariables AS
    SELECT DISTINCT cvterm.name AS Variable,a.variable_type,a.is_obsolete,a.is_system,cvterm.cvterm_id AS VariableID, cvterm.definition AS Definition, 
    cvterm_2.name AS Property, cvterm_3.name AS Method, cvterm_4.name AS Scale, 
    cvterm_6.name AS ScaleDataType, cvterm_4.cvterm_id AS ScaleID, cc.CategoryName, cc.CategoryDefn FROM (((((((((((AllVariables a INNER JOIN cvterm ON cvterm.cvterm_id=a.cvterm_id INNER JOIN cvterm_relationship AS cvterm_relationship_1 ON cvterm.cvterm_id = cvterm_relationship_1.subject_id) INNER JOIN cvterm AS cvterm_2 ON cvterm_relationship_1.object_id = cvterm_2.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_2 ON cvterm.cvterm_id = cvterm_relationship_2.subject_id) INNER JOIN cvterm AS cvterm_3 ON cvterm_relationship_2.object_id = cvterm_3.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_3 ON cvterm.cvterm_id = cvterm_relationship_3.subject_id) INNER JOIN cvterm AS cvterm_4 ON cvterm_relationship_3.object_id = cvterm_4.cvterm_id) INNER JOIN cvterm_relationship ON cvterm_2.cvterm_id = cvterm_relationship.subject_id) INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) INNER JOIN cvtermprop ON cvterm.cvterm_id = cvtermprop.cvterm_id) INNER JOIN cvterm AS cvterm_5 ON cvtermprop.value = cvterm_5.name) INNER JOIN cvterm_relationship AS cvterm_relationship_4 ON cvterm_4.cvterm_id = cvterm_relationship_4.subject_id) INNER JOIN cvterm AS cvterm_6 ON cvterm_relationship_4.object_id = cvterm_6.cvterm_id LEFT JOIN 12_CheckScalesWithCatValues cc ON cvterm_4.cvterm_id=cc.cvterm_id
WHERE (((cvterm.is_obsolete)=0) AND ((cvterm_relationship_1.type_id)=1200) AND 
((cvterm_relationship_2.type_id)=1210) AND ((cvterm_relationship_3.type_id)=1220) 
AND ((cvterm.cv_id)=1040) AND ((cvtermprop.type_id)=1800) AND ((cvterm_relationship_4.type_id)=1105)) 
ORDER BY cvterm.name;

Extracting all variables used in all BMS studies

Retrieve all distinct VARIABLE_ID and ALIAS from projectprop table for all non-deleted studies (study details, trial_instance/environment details (-ENVIRONMENT dataset), traits (-PLOTDATA dataset) - see view AllStudyVariables), along with variable definition (Property-Scale-Method) and categorical scale definitions where appropriate . See view AllValidStdVariables and export contents to CSV

DROP VIEW IF EXISTS AllStudyVariables;  
CREATE VIEW AllStudyVariables AS
	SELECT DISTINCT c1.*,pp.alias AS projectprop_alias,cp.`value` AS variable_type FROM cvterm c1, projectprop pp, project p, cvtermprop cp  
	WHERE c1.cv_id=1040 AND c1.cvterm_id=pp.variable_id AND pp.project_id=p.project_id
	AND c1.cvterm_id=cp.cvterm_id AND cp.type_id=1800
	AND p.deleted=0 AND (p.dataset_type_id IN (3,4) OR p.dataset_type_id IS NULL) 
	ORDER BY variable_type;	
	
DROP VIEW IF EXISTS AllScalesWithCatValues;
    CREATE VIEW AllScalesWithCatValues AS
    SELECT cvterm.cv_id, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm.cvterm_id, cvterm.name, cvterm.definition FROM cvterm INNER JOIN cvterm_relationship ON cvterm.cvterm_id = cvterm_relationship.subject_id
WHERE (((cvterm.cv_id)=1030) AND ((cvterm_relationship.type_id)=1105) AND ((cvterm_relationship.object_id)=1130));

DROP VIEW IF EXISTS 12_CheckScalesWithCatValues;
    CREATE VIEW 12_CheckScalesWithCatValues AS
    SELECT AllScalesWithCatValues.cvterm_id, AllScalesWithCatValues.name AS ScaleName, AllScalesWithCatValues.definition AS ScaleDefn, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm_1.cv_id, cvterm_1.name AS CategoryName, cvterm_1.definition AS CategoryDefn FROM AllScalesWithCatValues INNER JOIN (cvterm_relationship INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) ON AllScalesWithCatValues.cvterm_id = cvterm_relationship.subject_id
ORDER BY AllScalesWithCatValues.cvterm_id, cvterm_relationship.type_id, cvterm_1.name;

 DROP VIEW IF EXISTS AllValidStdVariables;
    CREATE VIEW AllValidStdVariables AS
    SELECT DISTINCT a.projectprop_alias,cvterm.name AS Variable,a.variable_type,cvterm.cvterm_id AS VariableID, cvterm.definition AS Definition, 
    cvterm_2.name AS Property, cvterm_3.name AS Method, cvterm_4.name AS Scale, 
    cvterm_6.name AS ScaleDataType, cvterm_4.cvterm_id AS ScaleID, cc.CategoryName, cc.CategoryDefn FROM (((((((((((AllStudyVariables a INNER JOIN cvterm ON cvterm.cvterm_id=a.cvterm_id INNER JOIN cvterm_relationship AS cvterm_relationship_1 ON cvterm.cvterm_id = cvterm_relationship_1.subject_id) INNER JOIN cvterm AS cvterm_2 ON cvterm_relationship_1.object_id = cvterm_2.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_2 ON cvterm.cvterm_id = cvterm_relationship_2.subject_id) INNER JOIN cvterm AS cvterm_3 ON cvterm_relationship_2.object_id = cvterm_3.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_3 ON cvterm.cvterm_id = cvterm_relationship_3.subject_id) INNER JOIN cvterm AS cvterm_4 ON cvterm_relationship_3.object_id = cvterm_4.cvterm_id) INNER JOIN cvterm_relationship ON cvterm_2.cvterm_id = cvterm_relationship.subject_id) INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) INNER JOIN cvtermprop ON cvterm.cvterm_id = cvtermprop.cvterm_id) INNER JOIN cvterm AS cvterm_5 ON cvtermprop.value = cvterm_5.name) INNER JOIN cvterm_relationship AS cvterm_relationship_4 ON cvterm_4.cvterm_id = cvterm_relationship_4.subject_id) INNER JOIN cvterm AS cvterm_6 ON cvterm_relationship_4.object_id = cvterm_6.cvterm_id LEFT JOIN 12_CheckScalesWithCatValues cc ON cvterm_4.cvterm_id=cc.cvterm_id
WHERE (((cvterm.is_obsolete)=0) AND ((cvterm_relationship_1.type_id)=1200) AND 
((cvterm_relationship_2.type_id)=1210) AND ((cvterm_relationship_3.type_id)=1220) 
AND ((cvterm.cv_id)=1040) AND ((cvtermprop.type_id)=1800) AND ((cvterm_relationship_4.type_id)=1105)) 
ORDER BY cvterm.name;

Extracting all variables used for selected studies (filter on project_ids from years 2020-2022 with trait phenotyping datapoints)

DROP VIEW IF EXISTS AllStudyVariables;  
CREATE VIEW AllStudyVariables AS
	SELECT DISTINCT c1.*,pp.alias AS projectprop_alias,cp.`value` AS variable_type FROM cvterm c1, projectprop pp, project p, cvtermprop cp  
	WHERE c1.cv_id=1040 AND c1.cvterm_id=pp.variable_id AND pp.project_id=p.project_id
	AND c1.cvterm_id=cp.cvterm_id AND cp.type_id=1800
	AND p.deleted=0 AND (p.dataset_type_id IN (3,4) OR p.dataset_type_id IS NULL) 
	AND EXISTS (SELECT * FROM project pa WHERE pa.project_id=pp.project_id AND pa.parent_project_id IN (11103	,
11598	,
11604	,
11601	,
11683	,
11485	,
11059	,
11062	,
11056	,
11044	,
11041	,
11047	,
11065	,
11571	,
11068	,
11050	,
11053	,
11525	,
11528
))
	ORDER BY variable_type;	
	
DROP VIEW IF EXISTS AllScalesWithCatValues;
    CREATE VIEW AllScalesWithCatValues AS
    SELECT cvterm.cv_id, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm.cvterm_id, cvterm.name, cvterm.definition FROM cvterm INNER JOIN cvterm_relationship ON cvterm.cvterm_id = cvterm_relationship.subject_id
WHERE (((cvterm.cv_id)=1030) AND ((cvterm_relationship.type_id)=1105) AND ((cvterm_relationship.object_id)=1130));

DROP VIEW IF EXISTS 12_CheckScalesWithCatValues;
    CREATE VIEW 12_CheckScalesWithCatValues AS
    SELECT AllScalesWithCatValues.cvterm_id, AllScalesWithCatValues.name AS ScaleName, AllScalesWithCatValues.definition AS ScaleDefn, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm_1.cv_id, cvterm_1.name AS CategoryName, cvterm_1.definition AS CategoryDefn FROM AllScalesWithCatValues INNER JOIN (cvterm_relationship INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) ON AllScalesWithCatValues.cvterm_id = cvterm_relationship.subject_id
ORDER BY AllScalesWithCatValues.cvterm_id, cvterm_relationship.type_id, cvterm_1.name;

 DROP VIEW IF EXISTS AllValidStdVariables;
    CREATE VIEW AllValidStdVariables AS
    SELECT DISTINCT a.projectprop_alias,cvterm.name AS Variable,a.variable_type,cvterm.cvterm_id AS VariableID, cvterm.definition AS Definition, 
    cvterm_2.name AS Property, cvterm_3.name AS Method, cvterm_4.name AS Scale, 
    cvterm_6.name AS ScaleDataType, cvterm_4.cvterm_id AS ScaleID, cc.CategoryName, cc.CategoryDefn FROM (((((((((((AllStudyVariables a INNER JOIN cvterm ON cvterm.cvterm_id=a.cvterm_id INNER JOIN cvterm_relationship AS cvterm_relationship_1 ON cvterm.cvterm_id = cvterm_relationship_1.subject_id) INNER JOIN cvterm AS cvterm_2 ON cvterm_relationship_1.object_id = cvterm_2.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_2 ON cvterm.cvterm_id = cvterm_relationship_2.subject_id) INNER JOIN cvterm AS cvterm_3 ON cvterm_relationship_2.object_id = cvterm_3.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_3 ON cvterm.cvterm_id = cvterm_relationship_3.subject_id) INNER JOIN cvterm AS cvterm_4 ON cvterm_relationship_3.object_id = cvterm_4.cvterm_id) INNER JOIN cvterm_relationship ON cvterm_2.cvterm_id = cvterm_relationship.subject_id) INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) INNER JOIN cvtermprop ON cvterm.cvterm_id = cvtermprop.cvterm_id) INNER JOIN cvterm AS cvterm_5 ON cvtermprop.value = cvterm_5.name) INNER JOIN cvterm_relationship AS cvterm_relationship_4 ON cvterm_4.cvterm_id = cvterm_relationship_4.subject_id) INNER JOIN cvterm AS cvterm_6 ON cvterm_relationship_4.object_id = cvterm_6.cvterm_id LEFT JOIN 12_CheckScalesWithCatValues cc ON cvterm_4.cvterm_id=cc.cvterm_id
WHERE (((cvterm.is_obsolete)=0) AND ((cvterm_relationship_1.type_id)=1200) AND 
((cvterm_relationship_2.type_id)=1210) AND ((cvterm_relationship_3.type_id)=1220) 
AND ((cvterm.cv_id)=1040) AND ((cvtermprop.type_id)=1800) AND ((cvterm_relationship_4.type_id)=1105)) 
ORDER BY cvterm.name;

Re-creating BMS studies in the EBS

A BMS study “instance”/environment is equivalent to an EBS experiment “occurrence”.

Export BMS studybook in Excel. This will have 2 sheets: Description (or definition of column headers in Observation sheet) and Observation (plot layout and trait data). One Excel file is generated for every instance/environment (TRIAL_INSTANCE). Alternatively, the studybook in CSV containing datasets for all environments may be exported, however there is no Description of what the dataset column headers are.
Export germplasm list in CSV.

Prepare entries list

Open BMS Excel studybook generated in step 1, Get Data from the CSV BMS germplasm list generated in step 2. Name the new sheet as “EBS Seed”. Save.

Navigate to EBS Seed Search :: Additional Search Paramaters. Click Input List.

Search using the EBS GERMPLASM NAME (BMS “DESIGNATION”) or EBS SEED NAME (BMS “GID”).

Add search results into a working list, then save the working list as <Name of study/experiment> entries as type package.

Prepare traits list

Search for the first trait, save in a list called <Name of study/experiment> traits

Navigate to EBS List manager, edit newly created list.

BMS-to-EBS trait mapping “helper” file

Copy the traits from the BMS studybook Description sheet.

Paste under column BMS TRAITS in Lookup EBS label sheet of the BMS-to-EBS trait mapping helper file.

Click on ADD ITEMS, then copy the values under EBS TRAITS from Excel. Paste these into the Input List box. Save.

Create Breeding Trial experiment

Add entries from saved package list

Prepare plot layout file

Design variable mapping

EBS file	BMS file

EBS file	BMS file
Trial	TRIAL_INSTANCE (found in BMS Description sheet)
Rep	REP_NO
Block	BLOCK_NO
Treatment	ENTRY_NO
PlotNum	PLOT_NO
FieldRow	FIELDMAP RANGE
FieldCol	FIELDMAP COLUMN

Define protocol for Traits

Select the site of the occurrence

Once the experiment has been reviewed and finalized, navigate to Experiment Manager.

Generate Location for created experiment

Commit Occurrence for mapped experiment

Export file with trait abbrevs.

Download data collection file for the location via Experiment Manager or Data Collection

Curate data from BMS Observations Excel sheet into EBS data collection file

The EBS label and EBS abbreviation mappings are found in the EBS Abbrev sheet in BMS-to-EBS trait mapping helper file.

Upload file with trait data via Experiment Manager or Data Collection

Quality Control (QC) and commit data

Following trait data upload, a quality control (QC) step is required. Click on the check icon on the left.

The data must be reviewed and corrected where necessary before it can be committed.