This dashboard was built to explore the results of the article:
Kozlowski D, Semeshenko V, Molinari A (2021) Latent Dirichlet allocation model for world trade analysis. PLoS ONE 16(2): e0245393.
In the panel
we show the results for the LDA model for different numbers of components, k.
We use this panel to select the optimal value of k and to label each component of the selected model.
In the panel
we show the results for the selected model (k = 30), using the labels assigned previously.
On the tab
Components by country
we show, for a given country or set of countries, the most relevant components of their exports over time.
On the tab
Countries by component
we show, for a given component or set of components,
those countries that have the highest proportion of their exports in that component, by decade.
Analysis of the Components
The characterisation of the model for different values of k, and the subsequent labelling of components, involves the following steps:
1. We first decided how many products to analyse: up to the 10 with the highest share. The more concentrated the component (judging by the cumulative probability), the fewer products are needed for a good characterisation.
2. We then defined a concept that generalises the products with the highest share. For example, for k = 30, in component 1 the first four products are coal, iron, gold and aluminium, which can then be labelled as Minerals.
3. For components where the top 10 products have a cumulative share of less than 30%, we looked at the overall distribution of the component across Lall's groups (see below).
4. After studying all the components of different models (i.e. different values of k), we selected the model that satisfies the following criteria:
a. It is feasible to label most of the k components;
b. Components do not duplicate one another;
c. The distribution of components gives a high cumulative share (more than 30%) for the first 10 products in the majority of the components.
With this analysis we label the components by Group, Subgroup and level of complexity, for the selected model.
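The concentration check used in steps 1, 3 and 4c can be sketched in a few lines. This is a minimal illustration, not the code behind the dashboard: the function names `top_products` and `is_concentrated` are hypothetical, and the shares in the toy component are made up (only the four product names come from the example in step 2).

```python
import numpy as np

def top_products(shares, names, n=10):
    """Return the n products with the highest share and their running cumulative share."""
    shares = np.asarray(shares, dtype=float)
    order = np.argsort(shares)[::-1][:n]      # indices of the n largest shares
    cumulative = np.cumsum(shares[order])     # cumulative share over those products
    return [names[i] for i in order], cumulative

def is_concentrated(shares, n=10, threshold=0.30):
    """True when the top-n products hold more than `threshold` of the component."""
    shares = np.asarray(shares, dtype=float)
    top = np.sort(shares)[::-1][:n]
    return bool(top.sum() > threshold)

# Toy component over six products; the shares are invented for illustration.
names = ["coal", "iron", "gold", "aluminium", "wheat", "cars"]
shares = [0.30, 0.25, 0.20, 0.15, 0.06, 0.04]

top_names, cumulative = top_products(shares, names, n=4)
print(top_names, cumulative)
print(is_concentrated(shares, n=4))
```

A component that fails this check (cumulative share of the top 10 products at or below 30%) is the case that step 3 sends to the Lall groups for characterisation.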
Lall classification of components
The Lall classification divides products by origin (primary products, resource-based manufactures and non-resource-based manufactures) and by technology level (low, medium, high).
For each component, we project the distribution as a weighted average of Lall's groups, using the share of each product in the component.
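One way to read this projection is as a sum of product shares within each Lall group. The sketch below follows that reading; the function name `lall_distribution`, the toy shares and the product-to-group mapping are all hypothetical, and the example assumes every product it contains has a Lall group.

```python
from collections import defaultdict

def lall_distribution(product_shares, lall_group):
    """Aggregate a component's product shares by Lall group.

    product_shares: dict mapping product -> share of the product in the component
    lall_group:     dict mapping product -> Lall group label (assumed complete here)
    """
    totals = defaultdict(float)
    for product, share in product_shares.items():
        totals[lall_group[product]] += share
    total = sum(totals.values())
    # Renormalise so the group shares sum to 1 even if the inputs do not.
    return {group: value / total for group, value in totals.items()}

# Invented shares and an illustrative mapping, not the real SITC/Lall table.
product_shares = {"coal": 0.5, "wheat": 0.3, "paper": 0.2}
lall_group = {
    "coal": "primary products",
    "wheat": "primary products",
    "paper": "resource-based manufactures",
}

distribution = lall_distribution(product_shares, lall_group)
print(distribution)
```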
Given that not all products in SITC are classified by Lall, some of them go to the
If the first 10 products of the component have a cumulative share of less than 30%, the Lall groups allow us to characterise it:
- If the distribution is skewed, the component is still well defined, but includes multiple products of the same type (see, for example, component 4 for k = 30).
- If the distribution is uniform, the component is ill defined (see, for example, component 2 for k = 2).
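The text does not say how skewed and uniform distributions are told apart, and the judgement may well be visual. As an assumption for illustration only, normalised Shannon entropy gives a single number for this: values near 1 indicate a near-uniform distribution over the Lall groups, and values well below 1 indicate a skewed one. The function name and the example distributions are hypothetical.

```python
import math

def normalised_entropy(distribution):
    """Shannon entropy of a probability vector divided by its maximum, log(n).

    Returns a value in [0, 1]: 1 for a uniform distribution, 0 when all mass
    sits in one group. Assumes the entries are non-negative and sum to 1.
    """
    n = len(distribution)
    if n < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in distribution if p > 0)
    return entropy / math.log(n)

# Invented distributions over four groups.
skewed = [0.85, 0.05, 0.05, 0.05]    # most of the mass in one group
uniform = [0.25, 0.25, 0.25, 0.25]   # mass spread evenly

print(normalised_entropy(skewed))
print(normalised_entropy(uniform))
```

Any cut-off between "skewed" and "uniform" on this scale would be a further assumption; the sketch only ranks components by how evenly they spread across groups.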