Migrating data from BMS to EBS
In 2022, we were engaged in a project with the Excellence in Breeding (EiB) platform to accompany AfricaRice staff as they transitioned from the BMS to the EBS, through our data curation expertise and experience in supporting Francophone users.
- 1 Mapping BMS germplasm model to EBS model
- 2 Steps for migrating AfricaRice BMS germplasm records to EBS Core Breeding (CB) Germplasm
- 3 Mapping BMS traits to EBS traits
- 4 Scripts for extracting data
- 5 Re-creating BMS studies in the EBS
- 5.1 Prepare entries list
- 5.2 Prepare traits list
- 5.3 Create Breeding Trial experiment
- 5.4 Generate Location for created experiment
- 5.5 Commit Occurrence for mapped experiment
- 5.6 Download data collection file for the location via Experiment Manager or Data Collection
- 5.7 Curate data from BMS Observations Excel sheet into EBS data collection file
- 5.8 Upload file with trait data via Experiment Manager or Data Collection
- 5.9 Quality Control (QC) and commit data
Mapping BMS germplasm model to EBS model
AfricaRice BMS germplasm records with GIDs 8000000 or higher have been curated as EBS Seed Names, and needed to be migrated via the database backend due to the absence of a germplasm import user interface in the EBS at the time.
BMS - Germplasm Schema
EBS - Core Breeding Schema
Steps for migrating AfricaRice BMS germplasm records to EBS Core Breeding (CB) Germplasm
BMS “lot” = EBS “package”.
In the EBS, “packages” are program-specific, whereas “lots” are crop-specific (ie. shared across programs) in the BMS. There is a need to create at least one package for each EBS program where the germplasm is used in an experiment.
A DBMS tool/software like MS Access can link the Germplasm and Names table from the BMS via ODBC driver.
Install PostgreSQL to a local computer or a testing server to serve as sandbox or staging server in migrating data to EBS.
Request for the latest Core Breeding schema from the IRRI team with example data for reference and without data for data migration.
Preparation for mapping and transferring BMS germplasm to EBS Germplasm.germplasm table.
Create a Make Table query to extract the non-deleted germplasm, names, and atributs records from the BMS production database starting from GID 8,000,000.
(Starting GID range for AfricaRice BMS = 8,000,000)Combine the germplasm and names table using the GID as a link to create a working table (ARGermplasm) having the column fields NVAL, NTYPE and NSTAT=1 merged into the germplasm.
Add the following field column in the ARGermplasm table and fill in the necessary information:
parentage - can be retrieved from the listdata.grpname (if available) or can be constructed by combining the preferred names of gpid1 and gpid2 for generative germplasm and parentage of gpid1 for derivative germplasm.
generation - germplasm filial generation (F1, F2..F5)
germplasm_state - fixed or not_fixed
germplasm_nytpe, indicate whether its cross_name, derivative_name, line_name or unnamed_cross
germplasm_type - fixed_line or progeny
The EBS CB germplasm table serves as the main germplasm record catalog and only allows unique germplasm names or designations. Since the BMS germplasm recording allows multiple designations, the working table (ARGermplasm) needs to be filtered down to get the unique germplasm/designation.
Create a Make table query (ARGermplasm2), including all fields in the ARGemplasm table and copying only the distinct NVAL value.
Link the CB germplasm table to the MS Access file via ODBC and make a local copy of the table (EBSgermplasm_local).
Add a field column GID to the EBSgermplasm_local to store the BMS GID.
Create an Append Query mapping the columns from ARGermplasm2 to the corresponding columns in the EBSgermplasm_local table, including the GID.
After transferring the germplasm records to EBSgermplasm_local, fill in the EBSgermplasm_local.id column with sequential number (ask the IRRI team for the next available germplasm.id that can be used)
Update the ARGermplasm table by adding a field column for EBS germplasm.id and update the column with values from the EBSgemplasm_local germplasm.id link by the column designation to nval.
Append all records to the Postgres EBS germplasm.germplasm table.
Export EBS germplasm.germplasm to a CSV file for submission to the IRRI team.
Preparation for mapping and transferring BMS Names to EBS Germplasm.germplasm_name table
Link the CB germplasm.germplasm_name table to the MS Access file via ODBC and make a local copy of the table (EBSgermplasm_name_local).
Update the local BMS names table by adding a field column for EBS_germplasm_id and update the column by creating an Update query and linking the GIDs from ARGermplasm and Names table.
Remove all records from the Names table tagged as deleted (NSTAT=9).
Remove duplicate Names record with the same EBS_germplasm_id and nval values.
Map and transfer BMS Names records by mapping the column fields below;
(NAMES) EBS_germplasm_id → (EBSgermplasm_name_local) germplasm_id
nval → name_value
ntype (value of ntype in referencing to the BMS UDFLDS table) → germplasm_name_type
nstat → germplasm_name_status =”active” (if nstat=1), and “standard”.
nval → germplasm_normalized name
EBS userID (assigned by IRRI) → creator_id
”f” → is_voidAfter transferring the Names records to EBSgermplasm_name_local, fill in the EBSgermplasm_name_local.id column with a sequential number (ask the IRRI team for the next available germplasm_name.id that can be used).
Append all records to the Postgres EBS germplasm.germplasm_name table.
Export EBS germplasm.germplasm_name to a CSV file for submission to the IRRI team.
EBS CB Program ID for BMS Programs - CB tenant.program table
For the AfricaRice instance in EBS, the IRRI has already assigned program_id for almost all of the programs/projects in the BMS.
The program_id (id) will be needed to assign seed record entries for each breeding team to be able to germplasm list from EBS.
BMS program name | id | program_code | program_name | program_type | description |
Hybrid | 227 | AR-HRB | AR - Hybrid Rice Breeding | breeding | Hybrid |
MANGROVE | 226 | AR-RLB | AR - Rainfed Lowland | breeding | Rainfed Lowland |
Rainfed Lowland | 226 | AR-RLB | AR - Rainfed Lowland | breeding | Rainfed Lowland |
High Elevation | 224 | AR-HEB | High Elevation Breeding | breeding | High Elevation |
Double Haploid | 223 | AR-DHB | AR -Double Haploid Breeding | breeding | Double Haploid |
Rainfed Upland | 228 | AR-RUB | AR - Rainfed Upland Breeding | breeding | Rainfed Upland |
Grain Quality | 230 | AR-GQ | AR - Grain Quality | breeding | Grain Quality |
Breeding Task Force | 187 | AR | AFRICA Rice | breeding | Africa Rice Center |
Plant Health | 229 | AR - PH | AR - Plant Health | breeding | Plant Health |
Irrigated Lowland | 225 | AR-ILB | AR - Irrigated Lowland Breeding | breeding | Irrigated Lowland |
RICE-FP4 | 232 | AR-PHENO | AR - Phenotyping | breeding | AR - Phenotyping |
Genotypage AR | 222 | AR-MB | AR - Molecular Breeding | breeding | MolecularBreeding |
Creating records for EBS Germplasm.seed table - Record of the genetic entity of germplasm.
From the BMS Rice crop Listdata and Listnms table, extract all non-deleted records to an MS Access working database.
Add three columns “projectid”, “project_abbrev”, and “project_name” in the local Listnms table.
Fill the project_name column with the project_name values from the workbench.project table linked by the project_uuid.
Fill the listnms.projectid with the values from the CB tenant.program table.
Add a column “program_id” in the Listdata table and fill the column with value from the listnms.projectid table linked by “listid”.
Add a numeric field column in the Listdata table for “EBS_germplasm _id” and update with the values from ARgermplasm linked by GID.
Create Make table query (Listdata_seed) to get the distinct records from listdata table with the unique combination of gid, and program_id entries.
Add a column in the Listdata_seed for “seed_id” and fill it with a sequential number starting with the next available germplasm.seed.id from EBS (ask IRRI team).
Add a column in the Listdata_seed for “seed_code” and fill with the project_abbrev+GID (example AR-ILB9000026).
Link the CB germplasm.seed table to the MS Access file via ODBC and make a local copy of the table (EBSgermplasm_seed_local).
Map and transfer Listdata_seed records to the EBS_germplasm_seed_local by mapping the column fields below;
(Listdata_seed) seed_id → (EBSgermplasm_seed_local) id
seed_code → seed_code
GID → seed_name
EBS_germplasm_id → germplasm_id
program_id → program_id
EBS userID → creator_idAppend all records to the Postgres EBS germplasm.seed table.
Export EBS germplasm.seed to a CSV file for submission to the IRRI team.
Creating records for EBS Germplasm.package table - Record of physical container of seeds and its amount. Since AfricaRice has no inventory data in the BMS, we can set the package quantity and unit to 0 grams.
Link the CB germplasm.package table to the MS Access file via ODBC and make a local copy of the table (EBSgermplasm_package_local).
Map and transfer EBSgermplasm_seed_local records to the EBSgermplasm_package_local by mapping the column fields below;
(EBSgermplasm_seed_local) seed_code → (EBSgermplasm_package_local) package_code
package_label
0 → package_quantity
”g” → package_unit
”active” → package_status
id → seed_id
program_id → program_id
creator_id → creator_idFill the EBSgermplasm_package_local.id with a sequential number starting with the next available package.id from EBS (ask IRRI team)
Append all records to the Postgres EBS germplasm.package table
Export EBS germplasm.package to a CSV file for submission to the IRRI team
Mapping BMS traits to EBS traits
Request for the latest export table of the CB master variable, property, method, scale, and scale_value from the IRRI team.
Make an Excel worksheet and combine the variable, property, method, scale, and scale_value (file attached EBS variable.xls - worksheet - var_prop_meth_scale).
Run the AllValidStdVariables script in the AfricaRice BMS to retrieve all trait variables used in all studies.
Import the result from the AllValidStdVariable to an Excel file (file attached allvalidstdvariable.xls).
Inset a column on the allvalidstdvariable sheet for the EBS variable name (EBS_VARNAME).
Check the trait variables definition from allvalidstdvariable against the EBS variable’s var_prop_meth_scale sheet based on property, method, and scale_value. Any matching combination from the two tables, place the the EBS variable name (name) to the allvalidstdvariable’s EBS_VARNAME column. This will make it easier for IBP or AfricaRice Data Manager migrating study to EBS to adjust or change trait variables (if needed).
Any non-matching BMS rice trait variable definition to EBS can be requested to be added to EBS.
(attached file -TraitVar_for_EBS_upload.xls)
Scripts for extracting data
Extracting all variables
Retrieve all variables of all variable types. Consult view AllValidStdVariables and export contents to CSV
DROP VIEW IF EXISTS AllVariables;
CREATE VIEW AllVariables AS
SELECT DISTINCT c1.*,cp.`value` AS variable_type
FROM cvterm c1, cvtermprop cp
WHERE c1.cv_id=1040 AND c1.cvterm_id=cp.cvterm_id AND cp.type_id=1800
ORDER BY variable_type;
DROP VIEW IF EXISTS AllScalesWithCatValues;
CREATE VIEW AllScalesWithCatValues AS
SELECT cvterm.cv_id, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm.cvterm_id, cvterm.name, cvterm.definition FROM cvterm INNER JOIN cvterm_relationship ON cvterm.cvterm_id = cvterm_relationship.subject_id
WHERE (((cvterm.cv_id)=1030) AND ((cvterm_relationship.type_id)=1105) AND ((cvterm_relationship.object_id)=1130));
DROP VIEW IF EXISTS 12_CheckScalesWithCatValues;
CREATE VIEW 12_CheckScalesWithCatValues AS
SELECT AllScalesWithCatValues.cvterm_id, AllScalesWithCatValues.name AS ScaleName, AllScalesWithCatValues.definition AS ScaleDefn, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm_1.cv_id, cvterm_1.name AS CategoryName, cvterm_1.definition AS CategoryDefn FROM AllScalesWithCatValues INNER JOIN (cvterm_relationship INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) ON AllScalesWithCatValues.cvterm_id = cvterm_relationship.subject_id
ORDER BY AllScalesWithCatValues.cvterm_id, cvterm_relationship.type_id, cvterm_1.name;
DROP VIEW IF EXISTS AllValidStdVariables;
CREATE VIEW AllValidStdVariables AS
SELECT DISTINCT cvterm.name AS Variable,a.variable_type,a.is_obsolete,a.is_system,cvterm.cvterm_id AS VariableID, cvterm.definition AS Definition,
cvterm_2.name AS Property, cvterm_3.name AS Method, cvterm_4.name AS Scale,
cvterm_6.name AS ScaleDataType, cvterm_4.cvterm_id AS ScaleID, cc.CategoryName, cc.CategoryDefn FROM (((((((((((AllVariables a INNER JOIN cvterm ON cvterm.cvterm_id=a.cvterm_id INNER JOIN cvterm_relationship AS cvterm_relationship_1 ON cvterm.cvterm_id = cvterm_relationship_1.subject_id) INNER JOIN cvterm AS cvterm_2 ON cvterm_relationship_1.object_id = cvterm_2.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_2 ON cvterm.cvterm_id = cvterm_relationship_2.subject_id) INNER JOIN cvterm AS cvterm_3 ON cvterm_relationship_2.object_id = cvterm_3.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_3 ON cvterm.cvterm_id = cvterm_relationship_3.subject_id) INNER JOIN cvterm AS cvterm_4 ON cvterm_relationship_3.object_id = cvterm_4.cvterm_id) INNER JOIN cvterm_relationship ON cvterm_2.cvterm_id = cvterm_relationship.subject_id) INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) INNER JOIN cvtermprop ON cvterm.cvterm_id = cvtermprop.cvterm_id) INNER JOIN cvterm AS cvterm_5 ON cvtermprop.value = cvterm_5.name) INNER JOIN cvterm_relationship AS cvterm_relationship_4 ON cvterm_4.cvterm_id = cvterm_relationship_4.subject_id) INNER JOIN cvterm AS cvterm_6 ON cvterm_relationship_4.object_id = cvterm_6.cvterm_id LEFT JOIN 12_CheckScalesWithCatValues cc ON cvterm_4.cvterm_id=cc.cvterm_id
WHERE (((cvterm.is_obsolete)=0) AND ((cvterm_relationship_1.type_id)=1200) AND
((cvterm_relationship_2.type_id)=1210) AND ((cvterm_relationship_3.type_id)=1220)
AND ((cvterm.cv_id)=1040) AND ((cvtermprop.type_id)=1800) AND ((cvterm_relationship_4.type_id)=1105))
ORDER BY cvterm.name;
Extracting all variables used in all BMS studies
Retrieve all distinct VARIABLE_ID and ALIAS from projectprop table for all non-deleted studies (study details, trial_instance/environment details (-ENVIRONMENT dataset)
, traits (-PLOTDATA dataset)
- see view AllStudyVariables), along with variable definition (Property-Scale-Method) and categorical scale definitions where appropriate . See view AllValidStdVariables and export contents to CSV
DROP VIEW IF EXISTS AllStudyVariables;
CREATE VIEW AllStudyVariables AS
SELECT DISTINCT c1.*,pp.alias AS projectprop_alias,cp.`value` AS variable_type FROM cvterm c1, projectprop pp, project p, cvtermprop cp
WHERE c1.cv_id=1040 AND c1.cvterm_id=pp.variable_id AND pp.project_id=p.project_id
AND c1.cvterm_id=cp.cvterm_id AND cp.type_id=1800
AND p.deleted=0 AND (p.dataset_type_id IN (3,4) OR p.dataset_type_id IS NULL)
ORDER BY variable_type;
DROP VIEW IF EXISTS AllScalesWithCatValues;
CREATE VIEW AllScalesWithCatValues AS
SELECT cvterm.cv_id, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm.cvterm_id, cvterm.name, cvterm.definition FROM cvterm INNER JOIN cvterm_relationship ON cvterm.cvterm_id = cvterm_relationship.subject_id
WHERE (((cvterm.cv_id)=1030) AND ((cvterm_relationship.type_id)=1105) AND ((cvterm_relationship.object_id)=1130));
DROP VIEW IF EXISTS 12_CheckScalesWithCatValues;
CREATE VIEW 12_CheckScalesWithCatValues AS
SELECT AllScalesWithCatValues.cvterm_id, AllScalesWithCatValues.name AS ScaleName, AllScalesWithCatValues.definition AS ScaleDefn, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm_1.cv_id, cvterm_1.name AS CategoryName, cvterm_1.definition AS CategoryDefn FROM AllScalesWithCatValues INNER JOIN (cvterm_relationship INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) ON AllScalesWithCatValues.cvterm_id = cvterm_relationship.subject_id
ORDER BY AllScalesWithCatValues.cvterm_id, cvterm_relationship.type_id, cvterm_1.name;
DROP VIEW IF EXISTS AllValidStdVariables;
CREATE VIEW AllValidStdVariables AS
SELECT DISTINCT a.projectprop_alias,cvterm.name AS Variable,a.variable_type,cvterm.cvterm_id AS VariableID, cvterm.definition AS Definition,
cvterm_2.name AS Property, cvterm_3.name AS Method, cvterm_4.name AS Scale,
cvterm_6.name AS ScaleDataType, cvterm_4.cvterm_id AS ScaleID, cc.CategoryName, cc.CategoryDefn FROM (((((((((((AllStudyVariables a INNER JOIN cvterm ON cvterm.cvterm_id=a.cvterm_id INNER JOIN cvterm_relationship AS cvterm_relationship_1 ON cvterm.cvterm_id = cvterm_relationship_1.subject_id) INNER JOIN cvterm AS cvterm_2 ON cvterm_relationship_1.object_id = cvterm_2.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_2 ON cvterm.cvterm_id = cvterm_relationship_2.subject_id) INNER JOIN cvterm AS cvterm_3 ON cvterm_relationship_2.object_id = cvterm_3.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_3 ON cvterm.cvterm_id = cvterm_relationship_3.subject_id) INNER JOIN cvterm AS cvterm_4 ON cvterm_relationship_3.object_id = cvterm_4.cvterm_id) INNER JOIN cvterm_relationship ON cvterm_2.cvterm_id = cvterm_relationship.subject_id) INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) INNER JOIN cvtermprop ON cvterm.cvterm_id = cvtermprop.cvterm_id) INNER JOIN cvterm AS cvterm_5 ON cvtermprop.value = cvterm_5.name) INNER JOIN cvterm_relationship AS cvterm_relationship_4 ON cvterm_4.cvterm_id = cvterm_relationship_4.subject_id) INNER JOIN cvterm AS cvterm_6 ON cvterm_relationship_4.object_id = cvterm_6.cvterm_id LEFT JOIN 12_CheckScalesWithCatValues cc ON cvterm_4.cvterm_id=cc.cvterm_id
WHERE (((cvterm.is_obsolete)=0) AND ((cvterm_relationship_1.type_id)=1200) AND
((cvterm_relationship_2.type_id)=1210) AND ((cvterm_relationship_3.type_id)=1220)
AND ((cvterm.cv_id)=1040) AND ((cvtermprop.type_id)=1800) AND ((cvterm_relationship_4.type_id)=1105))
ORDER BY cvterm.name;
Extracting all variables used for selected studies (filter on project_ids from years 2020-2022 with trait phenotyping datapoints)
DROP VIEW IF EXISTS AllStudyVariables;
CREATE VIEW AllStudyVariables AS
SELECT DISTINCT c1.*,pp.alias AS projectprop_alias,cp.`value` AS variable_type FROM cvterm c1, projectprop pp, project p, cvtermprop cp
WHERE c1.cv_id=1040 AND c1.cvterm_id=pp.variable_id AND pp.project_id=p.project_id
AND c1.cvterm_id=cp.cvterm_id AND cp.type_id=1800
AND p.deleted=0 AND (p.dataset_type_id IN (3,4) OR p.dataset_type_id IS NULL)
AND EXISTS (SELECT * FROM project pa WHERE pa.project_id=pp.project_id AND pa.parent_project_id IN (11103 ,
11598 ,
11604 ,
11601 ,
11683 ,
11485 ,
11059 ,
11062 ,
11056 ,
11044 ,
11041 ,
11047 ,
11065 ,
11571 ,
11068 ,
11050 ,
11053 ,
11525 ,
11528
))
ORDER BY variable_type;
DROP VIEW IF EXISTS AllScalesWithCatValues;
CREATE VIEW AllScalesWithCatValues AS
SELECT cvterm.cv_id, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm.cvterm_id, cvterm.name, cvterm.definition FROM cvterm INNER JOIN cvterm_relationship ON cvterm.cvterm_id = cvterm_relationship.subject_id
WHERE (((cvterm.cv_id)=1030) AND ((cvterm_relationship.type_id)=1105) AND ((cvterm_relationship.object_id)=1130));
DROP VIEW IF EXISTS 12_CheckScalesWithCatValues;
CREATE VIEW 12_CheckScalesWithCatValues AS
SELECT AllScalesWithCatValues.cvterm_id, AllScalesWithCatValues.name AS ScaleName, AllScalesWithCatValues.definition AS ScaleDefn, cvterm_relationship.type_id, cvterm_relationship.object_id, cvterm_1.cv_id, cvterm_1.name AS CategoryName, cvterm_1.definition AS CategoryDefn FROM AllScalesWithCatValues INNER JOIN (cvterm_relationship INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) ON AllScalesWithCatValues.cvterm_id = cvterm_relationship.subject_id
ORDER BY AllScalesWithCatValues.cvterm_id, cvterm_relationship.type_id, cvterm_1.name;
DROP VIEW IF EXISTS AllValidStdVariables;
CREATE VIEW AllValidStdVariables AS
SELECT DISTINCT a.projectprop_alias,cvterm.name AS Variable,a.variable_type,cvterm.cvterm_id AS VariableID, cvterm.definition AS Definition,
cvterm_2.name AS Property, cvterm_3.name AS Method, cvterm_4.name AS Scale,
cvterm_6.name AS ScaleDataType, cvterm_4.cvterm_id AS ScaleID, cc.CategoryName, cc.CategoryDefn FROM (((((((((((AllStudyVariables a INNER JOIN cvterm ON cvterm.cvterm_id=a.cvterm_id INNER JOIN cvterm_relationship AS cvterm_relationship_1 ON cvterm.cvterm_id = cvterm_relationship_1.subject_id) INNER JOIN cvterm AS cvterm_2 ON cvterm_relationship_1.object_id = cvterm_2.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_2 ON cvterm.cvterm_id = cvterm_relationship_2.subject_id) INNER JOIN cvterm AS cvterm_3 ON cvterm_relationship_2.object_id = cvterm_3.cvterm_id) INNER JOIN cvterm_relationship AS cvterm_relationship_3 ON cvterm.cvterm_id = cvterm_relationship_3.subject_id) INNER JOIN cvterm AS cvterm_4 ON cvterm_relationship_3.object_id = cvterm_4.cvterm_id) INNER JOIN cvterm_relationship ON cvterm_2.cvterm_id = cvterm_relationship.subject_id) INNER JOIN cvterm AS cvterm_1 ON cvterm_relationship.object_id = cvterm_1.cvterm_id) INNER JOIN cvtermprop ON cvterm.cvterm_id = cvtermprop.cvterm_id) INNER JOIN cvterm AS cvterm_5 ON cvtermprop.value = cvterm_5.name) INNER JOIN cvterm_relationship AS cvterm_relationship_4 ON cvterm_4.cvterm_id = cvterm_relationship_4.subject_id) INNER JOIN cvterm AS cvterm_6 ON cvterm_relationship_4.object_id = cvterm_6.cvterm_id LEFT JOIN 12_CheckScalesWithCatValues cc ON cvterm_4.cvterm_id=cc.cvterm_id
WHERE (((cvterm.is_obsolete)=0) AND ((cvterm_relationship_1.type_id)=1200) AND
((cvterm_relationship_2.type_id)=1210) AND ((cvterm_relationship_3.type_id)=1220)
AND ((cvterm.cv_id)=1040) AND ((cvtermprop.type_id)=1800) AND ((cvterm_relationship_4.type_id)=1105))
ORDER BY cvterm.name;
Re-creating BMS studies in the EBS
A BMS study “instance”/environment is equivalent to an EBS experiment “occurrence”.
Export BMS studybook in Excel. This will have 2 sheets: Description (or definition of column headers in Observation sheet) and Observation (plot layout and trait data). One Excel file is generated for every instance/environment (TRIAL_INSTANCE). Alternatively, the studybook in CSV containing datasets for all environments may be exported, however there is no Description of what the dataset column headers are.
Export germplasm list in CSV.
Prepare entries list
Open BMS Excel studybook generated in step 1, Get Data from the CSV BMS germplasm list generated in step 2. Name the new sheet as “EBS Seed”. Save.
Navigate to EBS Seed Search :: Additional Search Paramaters. Click Input List.
Search using the EBS GERMPLASM NAME (BMS “DESIGNATION”) or EBS SEED NAME (BMS “GID”).
If a germplasm/seed name is not found in the EBS, this needs to be added by the AfricaRice data manager who has access to the EBS Import germplasm module.
Add search results into a working list, then save the working list as <Name of study/experiment> entries as type package.
Prepare traits list
Search for the first trait, save in a list called <Name of study/experiment> traits
It is important to filter on Data Level (contains plot) and Status (equals active) and pick a trait variable that meets these two conditions, otherwise it will not be usable.
Navigate to EBS List manager, edit newly created list.
BMS-to-EBS trait mapping “helper” file
Only trait variables in studies with datapoints for years 2020-2022 were considered for mapping and included in this “helper” file.
Copy the traits from the BMS studybook Description sheet.
Paste under column BMS TRAITS in Lookup EBS label sheet of the BMS-to-EBS trait mapping helper file.
Click on ADD ITEMS, then copy the values under EBS TRAITS from Excel. Paste these into the Input List box. Save.
Create Breeding Trial experiment
Add entries from saved package list
Prepare plot layout file
Design variable mapping
EBS file | BMS file |
---|---|
Trial | TRIAL_INSTANCE (found in BMS Description sheet) |
Rep | REP_NO |
Block | BLOCK_NO |
Treatment | ENTRY_NO |
PlotNum | PLOT_NO |
FieldRow | FIELDMAP RANGE |
FieldCol | FIELDMAP COLUMN |
Define protocol for Traits
Select the site of the occurrence
Once the experiment has been reviewed and finalized, navigate to Experiment Manager.
Generate Location for created experiment
Commit Occurrence for mapped experiment
Export file with trait abbrevs.
The original order of traits in the experiment’s trait protocol list is NOT taken into account. Re-ordering of the traits must be done manually before exporting the EBS data collection file so that it matches the BMS data file.
Download data collection file for the location via Experiment Manager or Data Collection
Curate data from BMS Observations Excel sheet into EBS data collection file
The exported EBS data collection file contains the trait abbreviations, not the trait labels.
The EBS label and EBS abbreviation mappings are found in the EBS Abbrev sheet in BMS-to-EBS trait mapping helper file.
Upload file with trait data via Experiment Manager or Data Collection
Quality Control (QC) and commit data
Following trait data upload, a quality control (QC) step is required. Click on the check icon on the left.
The data must be reviewed and corrected where necessary before it can be committed.