Model data
Getting the data
MetaWards uses a large amount of data to describe the network over which the model is run. This data is stored in a separate MetaWardsData repository.
You can download this data in one of two ways:
- Direct download
Download one of the data releases from the releases page. It is best to choose the latest release unless you are trying to reproduce an earlier calculation.
You can download the data as a zip or a tar.gz archive, and can unpack it using either
tar -zxvf MetaWardsData-0.2.0.tar.gz
if you downloaded the 0.2.0 release tar.gz file, or
unzip MetaWardsData-0.2.0.zip
if you downloaded the zip file.
- Git clone
You can clone the data repository using
git clone https://github.com/metawards/MetaWardsData
Finding the data
Once you have downloaded the data you will need to tell metawards where the data is located. This can be done in one of three ways:
- Set the METAWARDSDATA environment variable equal to the full path to the data, e.g.
export METAWARDSDATA=$HOME/MetaWardsData
(assuming you have placed it in your home directory).
- Pass the full path to the data to metawards using the command line argument --repository, e.g.
metawards --repository $HOME/MetaWardsData
(again assuming you have placed it in your home directory).
- Move the data to $HOME/GitHub/MetaWardsData, as this is the default search location, e.g.
mkdir -p $HOME/GitHub
mv $HOME/MetaWardsData $HOME/GitHub/
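As a rough illustration of the three options above, the shell function below prints the first candidate directory that exists. It is a sketch only: `find_metawards_data` is a hypothetical helper that is not part of metawards, and the order in which it checks the candidates (an explicit path, then METAWARDSDATA, then the default location) is an assumption, not documented behaviour.

```shell
# Hypothetical helper, not part of metawards.
# Prints the first existing directory out of: the path given as the first
# argument, $METAWARDSDATA, and the default $HOME/GitHub/MetaWardsData.
# The order of precedence used here is an assumption.
find_metawards_data() {
    for candidate in "${1:-}" "${METAWARDSDATA:-}" "$HOME/GitHub/MetaWardsData"; do
        if [ -n "$candidate" ] && [ -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "MetaWardsData not found" >&2
    return 1
}
```

Calling `find_metawards_data "$HOME/MetaWardsData"` prints that path if the directory exists, and otherwise falls back to the other two locations. It returns a non-zero status if none of them is present.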
Versioning the data
Provenance is very important for MetaWards. It is important to know what inputs were used for a model run so that it is possible to recreate the calculation and thus reproduce the result.
To aid this, MetaWardsData has a small script that you should run after you have downloaded the data. This will version the data by finding the git tag of the data. Run this script using:
cd $METAWARDSDATA
./version
This will create a small file called version.txt, which will look something like this:
{"version": "0.2.0","repository": "https://github.com/metawards/MetaWardsData","branch": "main"}
This file will be read by metawards during a run to embed enough information in the outputs to enable others to download the same MetaWardsData input as you.
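To check which data version a run will record, you can pull the version field out of version.txt. The sketch below assumes the file has the layout shown above. `metawards_data_version` is a hypothetical helper, not part of MetaWardsData, and a proper JSON parser would be more robust than this sed pattern.

```shell
# Hypothetical helper, not part of MetaWardsData.
# Prints the value of the "version" field from the file given as the
# first argument. Assumes the field appears as "version": "<value>".
metawards_data_version() {
    sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}
```

For example, `metawards_data_version "$METAWARDSDATA/version.txt"` would print 0.2.0 for the example file shown above.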
Understanding the data
The MetaWardsData repository contains four types of data:
- model_data
The model_data directory contains the data needed to construct a network of wards and links. Currently the 2011 model is included, in the 2011Data directory. The manifest of files that comprise this model is 2011Data/description.json. This json file is read by metawards to find and load all of the files that are needed for this model.
- diseases
The diseases directory contains the parameters for different diseases. Each disease is described in its own json file, e.g. the parameters for SARS-CoV-2 are in diseases/ncov.json. One of the main purposes of metawards is to perform parameter sweeps to adjust these disease parameters so that a model epidemic can more closely follow what is observed in reality. From there, it should be possible to make more meaningful predictions.
- parameters
The parameters directory contains parameters used to control metawards model runs. These represent “good defaults” for the values of the parameters, and are stored in the MetaWardsData repository to make it easier to version control and track provenance of the parameters used for different jobs. The parameters used for jobs in March 2020 (and correct as of March 29th) are in parameters/march29.json.
- extra_seeds
The extra_seeds directory contains files that can be used to seed new model disease clusters at different wards during a model run. Each file can contain as many additional seeds as needed. The format is three numbers per line, t num loc, where t is the step (day) on which the additional infection will be seeded, num is the number of additional infections, and loc is the index of the ward in the model network in which the infection will occur.
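As an illustration of the extra_seeds format described above, the sketch below checks that every line of a seed file holds three whitespace-separated non-negative integers. `check_seed_file` is a hypothetical helper, not part of metawards. The whitespace separator, the integer-only values and the skipping of blank lines are assumptions made for this example.

```shell
# Hypothetical helper, not part of metawards.
# Checks that each non-blank line of the file given as the first argument
# has the form "t num loc" with three non-negative integers.
# Reports each offending line and returns non-zero if any are found.
check_seed_file() {
    awk '
        NF == 0 { next }    # skip blank lines (assumed to be allowed)
        NF != 3 || $1 !~ /^[0-9]+$/ || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: expected \"t num loc\", got: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Under this reading, a file containing the line `1 5 255` would seed 5 additional infections in the ward with index 255 on day 1.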