No-DB console - part II

Last time we set up our no-db console and even made our first query. We saw a mysterious IDXs object. But before we look at this object, first let's look at another one:

The global objects

The no-db console is based on 3 major objects. Metadata about the databases is stored in the TABLES object, which currently looks like this:

TABLES = {
   'ep_amendments': {'indexes': [{"fn": idx_ams_by_dossier, "name": "ams_by_dossier"},
                                 {"fn": idx_ams_by_mep, "name": "ams_by_mep"}],
                     'key': lambda x: x.get('id')},

   'ep_comagendas': {"indexes": [{"fn": idx_comagenda_by_committee, "name": "comagenda_by_committee"}],
                     'key': lambda x: x.get('id')},

   'ep_com_votes': {'indexes': [{"fn": idx_com_votes_by_dossier, "name": "com_votes_by_dossier"},
                                {"fn": idx_com_votes_by_committee, "name": "com_votes_by_committee"}],
                    'key': lambda x: x.get('_id')},

   'ep_dossiers': {'indexes': [{"fn": idx_active_dossiers, "name": "active_dossiers"},
                               {"fn": idx_dossiers_by_doc, "name": "dossiers_by_doc"},
                               {"fn": idx_dossiers_by_mep, "name": "dossiers_by_mep"},
                               {"fn": idx_dossiers_by_subject, "name": "dossiers_by_subject"},
                               {"fn": idx_dossiers_by_committee, "name": "dossiers_by_committee"},
                               {"fn": idx_subject_map, "name": "subject_map"}],
                   'key': lambda x: x['procedure']['reference']},

   'ep_meps': {'indexes': [{"fn": idx_meps_by_activity, "name": "meps_by_activity"},
                           {"fn": idx_meps_by_country, "name": "meps_by_country"},
                           {"fn": idx_meps_by_group, "name": "meps_by_group"},
                           {"fn": idx_meps_by_committee, "name": "meps_by_committee"},
                           {"fn": idx_meps_by_name, "name": "meps_by_name"}],
               'key': lambda x: x['UserID']},

   'ep_mep_activities': {'indexes': [{"fn": idx_activities_by_dossier, "name": "activities_by_dossier"},],
                         'key': lambda x: x['mep_id']},

   'ep_votes': {'indexes': [{"fn": idx_votes_by_dossier, "name": "votes_by_dossier"}],
                'key': lambda x: x['voteid']},
}

This describes all the "tables" we have. For each "table" it gives a function that returns the primary key of the objects stored in it, and it lists the "indexes" that exist for that "table" together with the functions that calculate them.
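The key functions above come in two flavours, and the difference matters when a record lacks the key field. Here is a small sketch with a made-up record (the record and the names key_soft and key_strict are only for illustration, they are not part of the console):

```python
# Two key-function styles, copied from the TABLES entries above.
key_soft = lambda x: x.get('id')      # style used for 'ep_amendments'
key_strict = lambda x: x['UserID']    # style used for 'ep_meps'

# A made-up record that has neither an 'id' nor a 'UserID' field.
record = {'name': 'toy record'}

# dict.get() returns None for a missing field instead of failing.
soft_result = key_soft(record)

# Plain subscripting raises KeyError for a missing field.
try:
    key_strict(record)
    strict_failed = False
except KeyError:
    strict_failed = True

print(soft_result)     # None
print(strict_failed)   # True
```

So a table keyed with .get() quietly files incomplete records under the key None, while a table keyed with plain subscripting refuses to load them.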

When the no-db server/console starts, it uses this object to load all datasets and initialize their indexes.
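To make the startup step more concrete, here is a minimal sketch of how such a description could drive loading and indexing. Everything in it is an assumption for illustration: the toy table, the toy records, the init helper, and the convention that an index function receives the loaded table as its argument. The real console may do any of this differently.

```python
# Hypothetical index function: maps a group name to the keys of its members.
# Assumption: index functions receive the loaded table as their argument.
def idx_toy_by_group(table):
    index = {}
    for rec in table.values():
        index.setdefault(rec['group'], []).append(rec['UserID'])
    return index

# Same shape as the TABLES object, but with a made-up table.
TOY_TABLES = {
    'toy_meps': {'indexes': [{"fn": idx_toy_by_group, "name": "toy_by_group"}],
                 'key': lambda x: x['UserID']},
}

# Stand-in for the dataset dumps the real console would read.
TOY_RAW = {
    'toy_meps': [
        {'UserID': 1, 'group': 'A'},
        {'UserID': 2, 'group': 'B'},
        {'UserID': 3, 'group': 'A'},
    ],
}

def init(tables, raw):
    """Load every table into a dict keyed by its key function, then build its indexes."""
    dbs, idxs = {}, {}
    for name, meta in tables.items():
        key = meta['key']
        dbs[name] = {key(rec): rec for rec in raw[name]}
        for idx in meta['indexes']:
            idxs[idx['name']] = idx['fn'](dbs[name])
    return dbs, idxs

dbs, idxs = init(TOY_TABLES, TOY_RAW)
print(len(dbs['toy_meps']))      # 3
print(idxs['toy_by_group'])      # {'A': [1, 3], 'B': [2]}
```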

After initialization, each "table" is accessible by name via the global DBS object. Each table is a dictionary whose keys are produced by the key function defined in the TABLES object. Most of the time you will work directly with one of these tables, for example:

len(DBS['ep_dossiers'])

This gives you the number of dossiers available in the dataset.

Similarly, you can count the total number of MEPs like this:

len(DBS['ep_meps'])

During initialization, after a dataset has been loaded into the DBS object, the indexes for this dataset are precalculated and stored in the not-so-mysterious-anymore IDXs object.

The indexes are not necessary for working with the datasets, but they provide prefiltered quick-access lists of objects in the dataset that otherwise would take extra time to prepare.
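The trade-off can be sketched with a toy table. The records, the field names and both helper functions below are made up for illustration and are not taken from the console; the point is only that a scan touches every record on each query, while a precalculated index is built once and then answers with a plain dictionary lookup.

```python
# Made-up table keyed by a toy reference string.
toy_dossiers = {
    'toy/1': {'reference': 'toy/1', 'committee': 'LIBE'},
    'toy/2': {'reference': 'toy/2', 'committee': 'ITRE'},
    'toy/3': {'reference': 'toy/3', 'committee': 'LIBE'},
}

def scan_by_committee(table, committee):
    """Without an index: walk the whole table on every query."""
    return [ref for ref, d in table.items() if d['committee'] == committee]

def build_committee_index(table):
    """With an index: walk the table once and group the keys by committee."""
    index = {}
    for ref, d in table.items():
        index.setdefault(d['committee'], []).append(ref)
    return index

toy_idx = build_committee_index(toy_dossiers)

# Both approaches give the same answer; the index just has it ready.
print(scan_by_committee(toy_dossiers, 'LIBE'))   # ['toy/1', 'toy/3']
print(toy_idx['LIBE'])                           # ['toy/1', 'toy/3']
```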

If you are curious, you can look at the implementation of the indexing functions to get an idea of how to work with the data in general.

"It all feels a bit strange, but it seems to work reliably and comfortably," comments Alois Weishaupt, Principal Bit-Munger.