VNR code mapping to RxNorm

Details of the VNR codes in FinnGen R6 mapped to standard RxNorm

VNR codes, also known as Nordic Article Numbers, are codes specific to the Nordic countries. They are 6-digit codes in the ranges 000001-199999 and 370000-599999, and they are assigned to all human medicines, veterinary medicines, herbal medicines, and traditional herbal medicines. Numbers outside these ranges are called National Article Numbers and are used differently depending on the country.
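The range rule above can be expressed as a small check. This is a minimal sketch, not FinnGen code: the function names `is_valid_vnr` and `format_vnr` are hypothetical, and treating non-numeric input as invalid is an assumption.

```python
def is_valid_vnr(code) -> bool:
    """Return True if `code` falls in one of the two VNR ranges quoted above."""
    try:
        number = int(str(code).strip())
    except ValueError:
        # Assumption: anything that is not a plain integer is not a VNR.
        return False
    return 1 <= number <= 199_999 or 370_000 <= number <= 599_999


def format_vnr(code) -> str:
    """Zero-pad a numeric VNR to its six-digit text form."""
    return f"{int(code):06d}"


print(is_valid_vnr(518), format_vnr(518))
```

Codes stored as integers lose their leading zeros, so `format_vnr` is only needed when the six-digit text form has to be shown or compared as a string.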

RxNorm, on the other hand, is a US-specific terminology that provides normalized names for medications and allows linking to many of the drug vocabularies commonly used in the US market.

We have mapped the Nordic country-specific VNR codes to RxNorm in FinnGen R6. Although the initial mapping was performed in R6 and is located in library-green in the finngen_R6 folder, it can be used with any Data Freeze/Release.

The mapping and its readme are located at:

/finngen/library-green/finngen_R6/finngen_R6_medical_codes/fgVNR.tsv

/finngen/library-green/finngen_R6/finngen_R6_medical_codes/fgVNR_readme.txt

How the VNR code to RxNorm mapping was done:

  1. VNR codes originating from different sources within FinnGen were combined into a single table called 'OriginalVNR'.

  2. Additional information on VNR codes with a missing drug name, strength, or ingredient was requested from the Pharmaceutical Information Centre (Lääketietokeskus). This information is stored in the table 'ltklVNR'.

  3. Missing ingredient information for VNR codes was filled in using ATC codes.

  4. Administration routes, dosage forms, and units were created for the codes in both the 'OriginalVNR' and 'ltklVNR' tables.

  5. We processed the source text of Package, Substance, Substance strength, Administration Route, and Dosage Form for both the 'OriginalVNR' and 'ltklVNR' tables.

  6. We used the OHDSI DrugMapping tool to map the parsed VNR code information to standard RxNorm.

More details on steps 1 to 5 can be found in the GitHub repository.
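Steps 1 to 3 can be pictured with the sketch below. It is illustrative only and is not the code from the repository: the merge rule (values from the supplementary table only fill gaps), the function names, and the one-entry `ATC_TO_INGREDIENT` lookup are all assumptions made for the example.

```python
# Hypothetical ATC -> ingredient lookup; a real one would cover the full ATC index.
ATC_TO_INGREDIENT = {"N05AH04": "quetiapine"}


def combine_sources(original_rows, ltkl_rows):
    """Steps 1-2 (assumed rule): merge rows by VNR, letting later tables fill gaps only."""
    combined = {}
    for source_name, rows in (("originalVNR", original_rows), ("ltklVNR", ltkl_rows)):
        for row in rows:
            merged = combined.setdefault(row["VNR"], {"Source": source_name})
            for key, value in row.items():
                if value and not merged.get(key):
                    merged[key] = value
    return combined


def fill_substance_from_atc(row):
    """Step 3: fall back to the ATC code when no substance is recorded."""
    if not row.get("Substance"):
        row["Substance"] = ATC_TO_INGREDIENT.get(row.get("ATC", ""))
    return row


original = [{"VNR": 518, "ATC": "N05AH04", "MedicineName": "SEROQUEL", "Substance": ""}]
ltkl = [{"VNR": 518, "Substance": "", "DosageForm": "film-coated tablet"}]
row = fill_substance_from_atc(combine_sources(original, ltkl)[518])
print(row["Substance"], row["DosageForm"], row["Source"])
```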

The columns of fgVNR.tsv are described in the table below.

| Column Name | Column Type | Description | Example |
| --- | --- | --- | --- |
| VNR | INT64 | Six-digit VNR code. | 518 |
| ATC | STRING | ATC group code. | N05AH04 |
| MedicineName | STRING | Commercial name. | SEROQUEL |
| AdministrationRouteSourceTextFI | STRING | Administration route in text format as in source. | Suun kautta |
| AdministrationRoute | STRING | Valid value for administration route. | Oral use |
| DosageFormSourceTextFI | STRING | Dosage form in text format as in source. | tabletti, kalvopäällysteinen |
| DosageForm | STRING | Valid value for dosage form. | film-coated tablet |
| PackageSourceTextFI | STRING | Package info in text format as in source. | 10 FOL |
| PackageSize | FLOAT64 | Size of package in float format. | 10 |
| PackageFactor | INT64 | Factor of package in float format. | 1 |
| PackageUnit | STRING | A valid unit value. | fol |
| SubstanceSourceTextFI | STRING | List of substances as in source. | quetiapine |
| Substance | STRING | Substance name. One row per substance. | quetiapine |
| SubstanceStrengthTextFI | STRING | Substance's strength in text format as in source. | 25+100+200 mg |
| Strength | STRING | Mapped, fixed, or split substance strength. If none of these applies, the source strength is used. | 100 mg |
| SubstanceStrengthNumenatorValue | FLOAT64 | Substance's strength value in the numerator, in float format. | 100 |
| SubstanceStrengthNumenatorUnit | STRING | A valid unit value. | mg |
| SubstanceStrengthDeominatorValue | FLOAT64 | Substance's strength value in the denominator, in float format. | 1 |
| SubstanceStrengthDeominatorUnit | STRING | A valid unit value. | 1 |
| ValidRange | BOOL | True if the VNR is in the valid range (less than 200000 or between 370000 and 599999). | TRUE |
| Source | STRING | Table from which the code was taken. | "ltklVNR" or "originalVNR" |
| Status | STRING | How well the medicine has been processed. | incomplete_dosageForm |
| VNRnew | STRING | Temporary VNR code created for drugs with a single substance and multiple strength values. The temporary code has the letter a, b, or c appended. | 000518a |
| calculateTotalStrength_message | STRING | How well the strength has been processed. | "correct" or "missmatch" |
| TotalStrength | FLOAT64 | Total strength of the drug: PackageSize * PackageFactor * Dosage. | 10 * 1 * 100 = 1000 |
| TotalStrengthUnit | STRING | Valid unit of the total strength. | mg |
| n_codes | INT64 | Frequency of the VNR code. | 260 |
| Dosage | FLOAT64 | SubstanceStrengthNumenatorValue / SubstanceStrengthDeominatorValue. | 100/1 = 100 |
| DosageUnit | STRING | A valid unit value. | mg |
| MedicineNameFull | STRING | Commercial name, dosage form, and substance strength. | SEROQUEL 25+100+200 mg |
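The two derived columns, Dosage and TotalStrength, follow directly from the formulas in their descriptions. The sketch below reproduces the example values from the table. The function names are hypothetical, and returning `None` for missing inputs is an assumption about how gaps might be handled, not documented behaviour.

```python
def dosage(numerator_value, denominator_value):
    """Dosage = SubstanceStrengthNumenatorValue / SubstanceStrengthDeominatorValue."""
    if numerator_value is None or not denominator_value:
        return None  # assumption: missing or zero denominator leaves Dosage empty
    return numerator_value / denominator_value


def total_strength(package_size, package_factor, dosage_value):
    """TotalStrength = PackageSize * PackageFactor * Dosage."""
    if None in (package_size, package_factor, dosage_value):
        return None  # assumption: any missing input leaves TotalStrength empty
    return package_size * package_factor * dosage_value


d = dosage(100.0, 1.0)
print(d)                           # 100.0
print(total_strength(10.0, 1, d))  # 1000.0
```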

The fgVNR.tsv file was used as the input for the OHDSI DrugMapping tool. For each VNR code, the tool requires substance information, followed by dosage form and drug strength. If no substance information is present, the code is not mapped.
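One way to see how much of the input can be mapped is to count rows by how many of the required fields are filled, in the order the tool needs them. The sketch below is an assumption-laden illustration: it uses the fgVNR.tsv column names VNR, Substance, DosageForm, Dosage, and DosageUnit as stand-ins for the tool's input fields, and `completeness_counts` is a hypothetical helper.

```python
REQUIRED_IN_ORDER = ["VNR", "Substance", "DosageForm", "Dosage", "DosageUnit"]


def is_filled(value) -> bool:
    """Treat None and blank strings as missing."""
    return value is not None and str(value).strip() != ""


def completeness_counts(rows):
    """For k = 1..5, count rows where the first k required fields are all filled."""
    counts = []
    for k in range(1, len(REQUIRED_IN_ORDER) + 1):
        needed = REQUIRED_IN_ORDER[:k]
        n = sum(all(is_filled(row.get(col)) for col in needed) for row in rows)
        counts.append((" + ".join(needed), n))
    return counts


sample_rows = [
    {"VNR": "518", "Substance": "quetiapine", "DosageForm": "film-coated tablet",
     "Dosage": "100", "DosageUnit": "mg"},
    {"VNR": "519", "Substance": "", "DosageForm": "capsule", "Dosage": "5", "DosageUnit": "mg"},
    {"VNR": "520", "Substance": "x", "DosageForm": "tablet", "Dosage": "", "DosageUnit": ""},
]
for label, n in completeness_counts(sample_rows):
    print(f"{label}: {n}")
```

Because the counts are cumulative, each level can only be equal to or smaller than the one before it.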

  • The DrugMapping tool requires a Common Data Model (CDM) database with vocabulary data. To create the CDM, we:

    • Extracted the CDM database schema for version 5.3.2 of the CDM from the OHDSI common data model.

    • Created the CDM database schema in a PostgreSQL server, version 14.2.

    • Modified the CDM v5.3.2 SQL files generated from the OHDSI common data model, because the PostgreSQL server version is newer than 9.

    • Downloaded the vocabulary data from Athena: the default vocabulary list plus additional vocabularies for "Dosage Form".

    • Uploaded the vocabulary data from Athena to the PostgreSQL CDM v5.3.2 database.

  • Once the CDM database was up and running in PostgreSQL, we set up the DrugMapping tool.

    • The total number of input drugs, and how many of them carry enough information for a Clinical Drug mapping, are as follows:

      | Input | Value |
      | --- | --- |
      | Total Drugs | 15,928 |
      | Drugs with non-missing VNR Codes | 15,902 |
      | Unique VNR codes | 14,655 |
      | Drugs with non-missing VNR codes + Ingredient Codes | 13,876 |
      | Drugs with non-missing VNR codes + Ingredient Codes + dosage form | 12,897 |
      | Drugs with non-missing VNR codes + Ingredient Codes + dosage form + dosage value | 12,644 |
      | Drugs with non-missing VNR codes + Ingredient Codes + dosage form + dosage + dosage unit | 12,638 |

  • After we fixed the input file, the DrugMapping tool created three intermediary files:

    • Ingredient Name Translation File

    • Unit Mapping File

    • Dose Form Mapping File

  • All three intermediary files need to be filled in carefully:

    • Ingredient Name Translation File: the simplest to fill in, using the input file.

    • Unit Mapping File

      • Source units had to be mapped to standard units such as 'mg' and 'mL'. Example:

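        | SourceUnit | DrugCount | RecordCount | Factor | TargetUnit | Comment |
        | --- | --- | --- | --- | --- | --- |
        | % | 303 | 1109084 | 0.01 | mg/mg | |
        | IU | 309 | 400416 | 1 | [U] | |
        | U | 25 | 360 | 1 | [U] | |
        | g | 160 | 1614186 | 1000 | mg | |
        | g/l | 8 | 1112 | 1 | mg/mL | |
        | mg | 20905 | 68649900 | 1 | mg | |
        | mg/days | 111 | 54182 | 0,289 | mg/h | |
        | mg/h | 5 | 455 | 1 | mg/h | |
        | milli.IU | 73 | 478234 | 1000000 | [U] | |
        | ml | 5 | 4885 | 1 | mL | |
        | ug/puffs | 6 | 212536 | 0.001 | mg/{actuat} | |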

    • Dose Form Mapping File

      • First, extract all dose forms in the domain "Drug" with concept_class "Dose Form" from all vocabularies in the CDM database.

      • Second, extract the relationships with relationship_id "Source - RxNorm eq" from the CONCEPT_RELATIONSHIP table for all non-standard "Dose Form" concepts from the additional vocabularies.

      • Match the cells in "DoseForm" to the extracted standard dose forms. This matched only 49 of the 147 dose forms.

      • We manually filled in a further 85 dose forms, leaving only 13 low-frequency dose forms unmapped. An example of the filled dose form file is shown below:

        | DoseForm | DrugCount | Priority | concept_id | concept_name | Comments |
        | --- | --- | --- | --- | --- | --- |
        | BASIC CREAM | 11 | | 19082224 | Topical Cream | |
        | BATH ADDITIVE | 1 | | 19082228 | Topical Solution | |
        | BODY LOTION | 1 | | | | |
        | CAPSULE | 279 | 0 | 19082168 | Oral Capsule | Standard |
        | CAPSULE | 279 | 1 | 19021887 | Capsule | Non-Standard |
        | CAPSULE, HARD | 664 | | 19082168 | Oral Capsule | |

  • The result of the DrugMapping tool after carefully filling in all three files is shown below.

    Possible drugs mapped: 12,089 of 12,638 (95.6%)

    | Result | Value |
    | --- | --- |
    | Source drugs mapped to Clinical Drug | 12089 of 14692 (82.283%) |
    | Source drugs mapped to Clinical Drug Form | 562 of 14692 (3.825%) |
    | Source drugs mapped to Clinical Drug Comp | 354 of 14692 (2.409%) |
    | Source drugs mapped to Ingredient | 588 of 14692 (4.002%) |
    | Source drugs mapped Splitted | 74 of 14692 (0.504%) |
    | Source drugs mapped Splitted Incomplete | 3 of 14692 (0.02%) |
    | Source drugs mapped Total | 13670 of 14692 (93.044%) |
    | Source drugs mapped to None | 1022 of 14692 (6.956%) |
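To make the role of the filled Unit Mapping and Dose Form Mapping files concrete, the sketch below applies a few of the example rows shown earlier. It is not how the DrugMapping tool is implemented. It assumes that Factor multiplies the source value to give the value in the target unit, and that a lower Priority number means the candidate is preferred. The dictionaries and function names are hypothetical.

```python
from decimal import Decimal

# A few rows from the Unit Mapping File example: source unit -> (factor, target unit).
UNIT_MAP = {
    "g": (Decimal("1000"), "mg"),
    "mg": (Decimal("1"), "mg"),
    "IU": (Decimal("1"), "[U]"),
    "ug/puffs": (Decimal("0.001"), "mg/{actuat}"),
}

# Rows from the Dose Form Mapping File example: (priority, concept_id, concept_name).
DOSE_FORM_MAP = {
    "BASIC CREAM": [(None, 19082224, "Topical Cream")],
    "BODY LOTION": [],  # left unmapped in the example
    "CAPSULE": [(0, 19082168, "Oral Capsule"), (1, 19021887, "Capsule")],
}


def to_target_unit(value, source_unit):
    """Convert a source value to its target unit (assumed: value * Factor)."""
    factor, target_unit = UNIT_MAP[source_unit]
    return Decimal(str(value)) * factor, target_unit


def preferred_dose_form(source_form):
    """Pick the candidate with the lowest Priority (assumed to mean 'try first')."""
    candidates = DOSE_FORM_MAP.get(source_form.strip().upper(), [])
    if not candidates:
        return None
    return min(candidates, key=lambda c: 0 if c[0] is None else c[0])


print(to_target_unit(0.5, "g"))
print(preferred_dose_form("capsule"))
```

`Decimal` is used instead of `float` so that factors such as 0.001 do not introduce rounding noise in the converted values.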
